Batch predictions
You can use a batch predictions job in Amazon Fraud Detector to get predictions for a set of events that do not require real-time scoring. For example, you could create a batch predictions job to perform an offline proof-of-concept, or to retrospectively evaluate the risk of events on an hourly, daily, or weekly basis.
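You can create a batch prediction job by using the Amazon Fraud Detector console or by calling the CreateBatchPredictionJob API operation, for example through the AWS SDK for Python (Boto3).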
How batch predictions work
The CreateBatchPredictionJob API operation uses a specified detector version to make predictions based on data provided in an input CSV file that is located in an Amazon S3 bucket. The API then returns the resulting CSV file to an S3 bucket.
Batch prediction jobs calculate model scores and prediction outcomes in the same way as the GetEventPrediction operation. As with GetEventPrediction, to create a batch predictions job, you first create an event type, optionally train a model, and then create a detector version that evaluates the events in your batch job.
The pricing for event risk scores evaluated by batch prediction jobs is the same as the pricing for scores created by the GetEventPrediction API. For details, see Amazon Fraud Detector pricing.
You can only run one batch prediction job at a time.
Input and output files
The input CSV file should contain a header row whose column names match the event type that is associated with the selected detector version. The maximum size of the input data file is 1 GB, so the number of events that fits in one file depends on the size of each event.
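The header and size constraints above can be checked locally before you upload a file. The following is a minimal sketch, not part of Amazon Fraud Detector: the function name check_input_csv and its expected_headers argument are hypothetical, the 1 GB limit is assumed to mean 1024**3 bytes, and the sketch only checks that each column you expect appears in the header row.

```python
import csv
import os

MAX_INPUT_BYTES = 1024 ** 3  # assumption: "1 GB" interpreted as 1024**3 bytes


def check_input_csv(path, expected_headers):
    """Return a list of problems found in a local batch prediction input file."""
    problems = []

    # Size check against the documented maximum for the input data file.
    size = os.path.getsize(path)
    if size > MAX_INPUT_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_INPUT_BYTES}-byte limit")

    # Read only the first row, which is expected to hold the column names.
    with open(path, newline="") as handle:
        header = next(csv.reader(handle), [])

    missing = [name for name in expected_headers if name not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    return problems
```

An empty list means that neither check failed. Pass the column names that your own event type requires as expected_headers.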
Amazon Fraud Detector creates the output file in the same bucket as the input file, unless you specify a separate location for the output data. The output file contains the original data from the input file and the following appended columns:
- MODEL_SCORES: Details the model scores for the event from each model associated with the selected detector version.
- OUTCOMES: Details the event outcomes as evaluated by the selected detector version and its rules.
- STATUS: Indicates whether the event was evaluated successfully. If the event was not evaluated successfully, this column shows a reason code for the failure.
- RULE_RESULTS: A list of all the rules that matched, based on the rule execution mode.
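Once an output file has been downloaded, the appended STATUS column gives a quick view of how many events were evaluated and how many failed. The sketch below is illustrative only: summarize_status is a hypothetical helper, it assumes the output file has a header row containing a column named exactly STATUS, and it makes no assumption about the specific values or reason codes that appear in that column.

```python
import csv
from collections import Counter


def summarize_status(output_csv_path):
    """Count how many rows carry each value in the appended STATUS column."""
    with open(output_csv_path, newline="") as handle:
        # DictReader uses the header row, so the column is looked up by name.
        return Counter(row["STATUS"] for row in csv.DictReader(handle))
```

The returned Counter maps each distinct STATUS value to its row count, which makes unexpected reason codes easy to spot before you work with the MODEL_SCORES, OUTCOMES, and RULE_RESULTS columns.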
Getting batch predictions
The following steps assume that you have already created an event type, trained a model using that event type (optional), and created a detector version for that event type.
To get a batch prediction
- Sign in to the AWS Management Console and open the Amazon Fraud Detector console at https://console.aws.amazon.com/frauddetector.
- In the left navigation pane of the Amazon Fraud Detector console, choose Batch Predictions, and then choose New batch prediction.
- In Job name, specify a name for your batch prediction job. If you don't specify a name, Amazon Fraud Detector randomly generates a job name.
- In Detector, choose the detector for this batch prediction.
- In Detector version, choose the detector version for this batch prediction. You can choose a detector version in any status. If your detector has a detector version in Active status, that version is automatically selected, but you can change this selection if needed.
- In IAM role, choose or create a role that has read and write access to your input and output Amazon S3 buckets. See Guidance on IAM roles for more information.
  To get batch predictions, the IAM role that calls the CreateBatchPredictionJob operation must have read permissions to your input S3 bucket and write permissions to your output S3 bucket. For more information about bucket permissions, see User policy examples in the Amazon S3 User Guide.
- In Input data location, specify the Amazon S3 location of your input data. If you want the output file in a different S3 bucket, select Separate data location for output and provide the Amazon S3 location for your output data.
- (Optional) Create tags for your batch prediction job.
- Choose Start.
Amazon Fraud Detector creates the batch prediction job, and the job's status is In progress. Batch prediction job processing times vary depending on the number of events and your detector version configuration.
To stop a batch prediction job that is in progress, go to the batch prediction job detail page, choose Actions, and then choose Stop batch prediction. If you stop a batch prediction job, you won't receive any results for the job.
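Outside the console, the same status check and stop action can be scripted. The following is a rough sketch rather than a definitive implementation. It assumes a Boto3 frauddetector client is passed in as client, that the get_batch_prediction_jobs response lists jobs under batchPredictions with a status field, and that the terminal status strings are COMPLETE, FAILED, and CANCELED. The helper names, polling interval, and poll limit are hypothetical.

```python
import time

# Assumption: terminal status strings as reported by the API for batch jobs.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "CANCELED"}


def wait_for_batch_job(client, job_id, poll_seconds=30, max_polls=120):
    """Poll a batch prediction job until it reaches a terminal status."""
    for _ in range(max_polls):
        response = client.get_batch_prediction_jobs(jobId=job_id)
        jobs = response.get("batchPredictions", [])
        if not jobs:
            raise LookupError(f"no batch prediction job named {job_id!r}")
        status = jobs[0]["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_id!r} did not finish after {max_polls} polls")


def stop_batch_job(client, job_id):
    """Request cancellation; a stopped job does not produce results."""
    client.cancel_batch_prediction_job(jobId=job_id)
```

With a real client, usage would look like wait_for_batch_job(boto3.client('frauddetector'), 'sample_batch'). Passing the client in as an argument keeps the helpers easy to exercise with a stub.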
When the batch prediction job's status changes to Complete, you can retrieve the job's output from the designated output Amazon S3 bucket. The output file's name is in the format <batch prediction job name>_<file creation timestamp>_output.csv. For example, the output file from a job named mybatchjob is mybatchjob_1611170650_output.csv.
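When several output files share a bucket, the naming format can be used to match a file back to its job. The sketch below is a hypothetical helper, not an AWS API. It assumes the timestamp portion is a run of digits representing Unix epoch seconds, which is consistent with the example value but is not stated explicitly above.

```python
import re
from datetime import datetime, timezone

# Greedy job-name group so that job names containing underscores still parse.
OUTPUT_NAME = re.compile(r"^(?P<job>.+)_(?P<ts>\d+)_output\.csv$")


def parse_output_name(file_name):
    """Split '<job name>_<timestamp>_output.csv' into (job name, creation time)."""
    match = OUTPUT_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected output file name: {file_name!r}")
    # Assumption: the timestamp is expressed in seconds since the Unix epoch.
    created = datetime.fromtimestamp(int(match["ts"]), tz=timezone.utc)
    return match["job"], created
```

For the example above, the job name part parses as mybatchjob and the timestamp part as 1611170650.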
To search for specific events evaluated by a batch prediction job, in the left navigation pane of the Amazon Fraud Detector console, choose Search past predictions.
To delete a batch prediction job that has completed, go to the batch prediction job detail page, choose Actions, and then choose Delete batch prediction.
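The same cleanup can be sketched with the SDK. This is an illustration under assumptions: client is a Boto3 frauddetector client supplied by the caller, the job status string for a finished job is assumed to be COMPLETE, and delete_if_complete is a hypothetical helper name.

```python
def delete_if_complete(client, job_id):
    """Delete a batch prediction job only if its status is COMPLETE."""
    response = client.get_batch_prediction_jobs(jobId=job_id)
    jobs = response.get("batchPredictions", [])
    if jobs and jobs[0]["status"] == "COMPLETE":
        client.delete_batch_prediction_job(jobId=job_id)
        return True
    # Job not found or not yet complete: leave it untouched.
    return False
```

The function returns True when a delete request was sent and False otherwise, so a caller can decide whether to retry later.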
Guidance on IAM roles
To get batch predictions, the IAM role that calls the CreateBatchPredictionJob operation must have read permissions to your input S3 bucket and write permissions to your output S3 bucket. For more information about bucket permissions, see User policy examples in the Amazon S3 User Guide. On the Amazon Fraud Detector console, you have three options for selecting an IAM role for Batch Predictions:
- Create a role when creating a new batch prediction job.
- Select an existing IAM role that you previously created in the Amazon Fraud Detector console. Make sure to add the s3:PutObject permission to the role before you do this step.
- Enter a custom ARN for a previously created IAM role.
If you receive an error related to your IAM role, verify the following:
- Your Amazon S3 input and output buckets are in the same Region as your detector.
- The IAM role you are using has the s3:GetObject permission for your input S3 bucket and the s3:PutObject permission for your output S3 bucket.
- The IAM role you are using has a trust policy for the service principal frauddetector.amazonaws.com.
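The last two checks above can be made concrete by writing out the policy documents that such a role would carry. The following is a minimal sketch, not an official policy template. The function name and bucket names are placeholders, the statements cover only the s3:GetObject and s3:PutObject permissions and the trust relationship named above, and your setup may require additional permissions.

```python
import json


def build_role_policies(input_bucket, output_bucket):
    """Return (trust_policy, permissions_policy) as plain dictionaries."""
    # Trust policy: lets the Amazon Fraud Detector service assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "frauddetector.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    # Permissions policy: read objects from the input bucket, write to the output bucket.
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": f"arn:aws:s3:::{input_bucket}/*"},
            {"Effect": "Allow", "Action": "s3:PutObject",
             "Resource": f"arn:aws:s3:::{output_bucket}/*"},
        ],
    }
    return trust_policy, permissions_policy


if __name__ == "__main__":
    # Placeholder bucket names; replace with your own.
    trust, permissions = build_role_policies("input-bucket-name", "output-bucket-name")
    print(json.dumps(trust, indent=2))
    print(json.dumps(permissions, indent=2))
```

Comparing the printed JSON with the policies attached to your role is one way to spot a missing permission or a missing trust relationship.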
Get batch fraud predictions using the AWS SDK for Python (Boto3)
The following example shows a sample request for the CreateBatchPredictionJob API. A batch prediction job must include the following existing resources: detector, detector version, and event type name. The following example assumes you have created an event type sample_registration, a detector sample_detector, and a detector version 1.
import boto3

# Create an Amazon Fraud Detector client with your default credentials and Region.
fraudDetector = boto3.client('frauddetector')

# Start a batch prediction job against an existing detector version.
fraudDetector.create_batch_prediction_job(
    jobId='sample_batch',
    inputPath='s3://bucket_name/input_file_name.csv',
    outputPath='s3://bucket_name/',
    eventTypeName='sample_registration',
    detectorName='sample_detector',
    detectorVersion='1',
    iamRoleArn='arn:aws:iam::**:role/service-role/AmazonFraudDetector-DataAccessRole-**'
)