Get and upload example training data - Amazon Fraud Detector

  1. Download the following file, unzip it, and use one of the sample CSV files, which contain fictitious, synthetically generated training data.


    This zip file contains two files of synthetic registration data that you can use to train a model. The registration_data_20K_minimum dataset contains only two variables: ip_address and email_address. The registration_data_20K_full dataset contains additional variables for each event, such as billing_address, phone_number, and user_agent. Both datasets also contain two mandatory fields:

    • EVENT_TIMESTAMP – Defines when the event occurred

    • EVENT_LABEL – Classifies the event as fraudulent or legitimate
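Before uploading, it can help to confirm that a CSV file actually carries the two mandatory fields. The following is a minimal sketch using only the Python standard library. The helper name missing_required_columns and the sample row (including its timestamp format and label value) are hypothetical and are not taken from the real datasets.

```python
import csv
import io

# The two mandatory fields named in this guide.
REQUIRED_COLUMNS = {"EVENT_TIMESTAMP", "EVENT_LABEL"}


def missing_required_columns(csv_file):
    """Return the mandatory column names that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return REQUIRED_COLUMNS - {name.strip() for name in header}


# Hypothetical sample row for illustration only.
sample = io.StringIO(
    "ip_address,email_address,EVENT_TIMESTAMP,EVENT_LABEL\n"
    "192.0.2.10,user@example.com,2020-01-01T00:00:00Z,legit\n"
)
print(missing_required_columns(sample))  # prints set()
```

To check a downloaded file instead, pass an open file handle, for example open("registration_data_20K_minimum.csv", newline=""), assuming the unzipped file carries that name with a .csv extension.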

  2. Create an Amazon S3 bucket:

    1. Sign in to the AWS Management Console and open the Amazon S3 console.

    2. Choose Create bucket, and perform the steps to create your bucket. You must choose an AWS Region where Amazon Fraud Detector is currently available: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Singapore), or Asia Pacific (Sydney).

      For this exercise, the generic name bucket-name is used. Choose your own name for the bucket, because Amazon S3 bucket names must be globally unique.
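If you prefer to create the bucket programmatically rather than in the console, the step above might look like the following sketch. It assumes the boto3 library is installed and AWS credentials are configured. The helper names are hypothetical, and the Region codes are the standard AWS identifiers for the Regions listed above, mapped here as an assumption rather than taken from this guide.

```python
# Region codes assumed to correspond to the Regions listed in this guide.
SUPPORTED_REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-east-2": "US East (Ohio)",
    "us-west-2": "US West (Oregon)",
    "eu-west-1": "Europe (Ireland)",
    "ap-southeast-1": "Asia Pacific (Singapore)",
    "ap-southeast-2": "Asia Pacific (Sydney)",
}


def create_bucket_args(bucket_name, region):
    """Build keyword arguments for the S3 create_bucket call."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"{region} is not in the Region list from this guide")
    args = {"Bucket": bucket_name}
    # us-east-1 is the default location, so the request omits
    # CreateBucketConfiguration there and sets it everywhere else.
    if region != "us-east-1":
        args["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return args


def create_bucket(bucket_name, region):
    """Create the bucket. Requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so create_bucket_args works without boto3

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_args(bucket_name, region))
```

Calling create_bucket("bucket-name", "us-west-2") would fail with the generic name, because bucket names are globally unique. Substitute the name you chose.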

  3. Upload a training data file (that is, one of the .csv files listed previously) to your Amazon S3 bucket.

    Note the Amazon S3 location of your training file (for example, s3://bucketname/path/to/some/object.csv) and your role name. For details about formatting your dataset file, see Preparing data.
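The upload step can also be scripted. The following sketch assumes boto3 is installed and AWS credentials are configured. The helper names s3_uri and upload_training_file are hypothetical. The function returns the S3 location so that you can note it for later steps.

```python
def s3_uri(bucket_name, key):
    """Format the S3 location of an object as s3://bucket/key."""
    return f"s3://{bucket_name}/{key.lstrip('/')}"


def upload_training_file(local_path, bucket_name, key):
    """Upload a local CSV file and return its S3 location.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported lazily so s3_uri works without boto3

    boto3.client("s3").upload_file(local_path, bucket_name, key)
    return s3_uri(bucket_name, key)


print(s3_uri("bucketname", "path/to/some/object.csv"))
# prints s3://bucketname/path/to/some/object.csv
```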