
Variables

Variables represent data elements that you want to use in a fraud prediction. These variables can be taken from the event dataset that you prepared for training your model, from your Amazon Fraud Detector model's risk score outputs, or from Amazon SageMaker AI models. For more information about variables taken from the event dataset, see Get event dataset requirements using the Data models explorer.

The variables you want to use in your fraud prediction must first be created and then added to the event when you create your event type. Each variable you create must be assigned a data type, a default value, and optionally a variable type. Amazon Fraud Detector enriches some of the variables that you provide, such as IP addresses, bank identification numbers (BINs), and phone numbers, to create additional inputs and boost performance for the models that use these variables.

Data types

Variables must have a data type for the data element that the variable represents and can optionally be assigned one of the predefined variable types. For variables that are assigned to a variable type, the data type is pre-selected. Possible data types include the following:

Data type	Description	Default value	Example values
String	Any combination of letters, whole numbers, or both	<empty>	abc, 123, 1D3B
Integer	Positive or negative whole numbers	0	1, -1
Boolean	True or False	False	True, False
DateTime	Date and time specified in the ISO 8601 standard UTC format only	<empty>	2019-11-30T13:01:01Z
Float	Numbers with decimal points	0.0	4.01, 0.10
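
As a small illustration of the DateTime row, the following Python sketch formats a timezone-aware timestamp as the ISO 8601 UTC string shown in the example column. The helper name to_utc_iso8601 is hypothetical and is not part of any Amazon Fraud Detector SDK.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso8601(moment: datetime) -> str:
    """Return an ISO 8601 UTC string such as 2019-11-30T13:01:01Z.

    Hypothetical helper for illustration. Naive datetimes are rejected
    because their offset from UTC is unknown.
    """
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A timestamp recorded at UTC-05:00 is converted to UTC before formatting.
local = datetime(2019, 11, 30, 8, 1, 1, tzinfo=timezone(timedelta(hours=-5)))
print(to_utc_iso8601(local))  # 2019-11-30T13:01:01Z
```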

Default value

Variables must have a default value. When Amazon Fraud Detector generates fraud predictions, this default value is used to run a rule or model if Amazon Fraud Detector doesn't receive a value for a variable. Default values you provide must match the selected data type. In the AWS Console, Amazon Fraud Detector assigns the default value of 0 for integers, false for Booleans, 0.0 for floats, and (empty) for strings. You can set a custom default value for any of these data types.
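
The rule that a default value must match the selected data type can be sketched in Python as follows. This is a minimal illustration, not the service's own validation: the helper names are hypothetical, the sketch covers only the four data types whose console defaults are listed above, and the "EVENT" data source is an assumption. The resulting dictionary is shaped like the keyword arguments of the boto3 create_variable operation, but no AWS call is made here.

```python
# Console defaults described above, keyed by API-style data type names.
CONSOLE_DEFAULTS = {
    "STRING": "",
    "INTEGER": "0",
    "BOOLEAN": "false",
    "FLOAT": "0.0",
}


def default_matches_type(data_type: str, default: str) -> bool:
    """Return True if the default value can be read as the given data type."""
    if data_type == "STRING":
        return True
    if data_type == "BOOLEAN":
        return default.lower() in ("true", "false")
    try:
        if data_type == "INTEGER":
            int(default)
        elif data_type == "FLOAT":
            float(default)
        else:
            return False  # data types not covered by this sketch
    except ValueError:
        return False
    return True


def build_variable_request(name, data_type, default=None, variable_type=None):
    """Assemble keyword arguments shaped like a create_variable request."""
    if default is None:
        default = CONSOLE_DEFAULTS[data_type]
    if not default_matches_type(data_type, default):
        raise ValueError(f"default {default!r} does not match {data_type}")
    request = {
        "name": name,
        "dataType": data_type,
        "dataSource": "EVENT",  # assumption: the value comes from event data
        "defaultValue": default,
    }
    if variable_type is not None:
        request["variableType"] = variable_type
    return request


print(build_variable_request("order_total", "FLOAT", variable_type="NUMERIC"))
```

With a configured boto3 Fraud Detector client, a dictionary like this could be passed as client.create_variable(**request). That step is left out so the sketch runs without AWS credentials.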

Variable types

When you create a variable, you can optionally assign the variable to a variable type. Variable type represents the common data elements that are used to train models and to generate fraud predictions. Only variables with an associated variable type can be used for model training. As part of the model training process, Amazon Fraud Detector uses the variable type associated with the variable to perform variable enrichments, feature engineering, and risk scoring.

Amazon Fraud Detector provides the following predefined variable types that you can assign to your variables.

Category	Variable type	Description	Data type	Example
Session	IP_ADDRESS	The IP address that's collected during the event	String	192.0.2.0 Note: Amazon Fraud Detector enriches this data. For more information, see Geolocation enrichment
	USERAGENT	The user agent that's collected during the event	String	Mozilla 5.0 (Windows NT 10.0, Win64, x64,rv:68.0) Gecko 20100101
	FINGERPRINT	The unique identifier for a device used for the event	String	sadfow987u234
	SESSION_ID	The session ID for the event's active session	String	sid123456789
	ARE_CREDENTIALS_VALID	Indicates if the credentials used for event login are valid	Boolean	True
User	EMAIL_ADDRESS	The email address that's collected during the event	String	abc@domain.com
User	PHONE_NUMBER	The phone number collected during the event	String	+1 555-0100 Note: Amazon Fraud Detector enriches this data. For more information, see Phone number enrichment
Billing	BILLING_NAME	The name that's associated with the billing address	String	John Doe
	BILLING_PHONE	The phone number that's associated with the billing address	String	+1 555-0100 Note: Amazon Fraud Detector enriches this data. For more information, see Phone number enrichment
	BILLING_ADDRESS_L1	The first line of the billing address	String	Any street
	BILLING_ADDRESS_L2	The second line of the billing address	String	Any unit 123
	BILLING_CITY	The city that's in the billing address	String	Any City
	BILLING_STATE	The state or province that's in the billing address	String	Any state or province
	BILLING_COUNTRY	The country that's in the billing address	String	Any country Note: Amazon Fraud Detector enriches this data. For more information, see Geolocation enrichment
	BILLING_ZIP	The postal code that's in the billing address	String	01234 Note: Amazon Fraud Detector enriches this data. For more information, see Geolocation enrichment
Shipping	SHIPPING_NAME	The name that's associated with the shipping address	String	John Doe
	SHIPPING_PHONE	The phone number that's associated with the shipping address	String	+1 555-0100 Note: Amazon Fraud Detector enriches this data. For more information, see Phone number enrichment
	SHIPPING_ADDRESS_L1	The first line of the shipping address	String	123 Any Street
	SHIPPING_ADDRESS_L2	The second line of the shipping address	String	Unit 123
	SHIPPING_CITY	The city that's in the shipping address	String	Any City
	SHIPPING_STATE	The state or province that's in the shipping address	String	Any State
	SHIPPING_COUNTRY	The country that's in the shipping address	String	Any Country Note: Amazon Fraud Detector enriches this data. For more information, see Geolocation enrichment
	SHIPPING_ZIP	The postal code that's in the shipping address	String	01234 Note: Amazon Fraud Detector enriches this data. For more information, see Geolocation enrichment
Payment	ORDER_ID	The unique identifier for the transaction	String	LUX60
	PRICE	The total order price	String	560.00
	CURRENCY_CODE	The ISO 4217 currency code	String	USD
	PAYMENT_TYPE	The payment method that's used for payment during the event	String	Credit card
	AUTH_CODE	The alphanumerical code that's sent by a credit card issuer or issuing bank	String	0000
	AVS	The address verification system (AVS) response code from the card processor	String	Y
Product	PRODUCT_CATEGORY	The product category of order item	String	Kitchen
Custom	NUMERIC	Any variable that can be represented as a real number	Float	1.224
	CATEGORICAL	Any variable that describes categories, segments, or groups	String	Large
	FREE_FORM_TEXT	Any free form text that's captured as part of the event (for example, a customer review or comment)	String	Example of a free form text input

Assigning variable to a variable type

If you plan to use a variable for training your model, it is important that you choose the right variable type to assign to the variable. Incorrect variable type assignment can negatively impact your model performance. It can also become very difficult for you to change the assignment later, especially if multiple models and events have used the variable.

You can assign your variable any one of the pre-defined variable types or one of the custom variable types – FREE_FORM_TEXT, CATEGORICAL, or NUMERIC.

Important notes for assigning variables to the right variable types

If the variable matches one of the predefined variable types, use it. Make sure the variable type corresponds to the variable. For example, if you assign an ip_address variable to the EMAIL_ADDRESS variable type, the ip_address variable will not get enriched with enrichments such as ASN, ISP, geolocation, and risk score. For more information, see Variable enrichments.
If the variable doesn't match any of the predefined variable types, follow the recommendations listed below to assign one of the custom variable types.
Assign the CATEGORICAL variable type to variables that typically do not have natural ordering and can be put into categories, segments, or groups. The dataset you are using to train your model might have ID variables such as merchant_id, campaign_id, or policy_id. These variables represent groups (for example, all customers with the same policy_id represent a group). Variables that have the following data must be assigned the CATEGORICAL variable type:
- Variables that contain data such as customer_ID, segment_ID, color_ID, department_code, or product_ID.
- Variables that contain Boolean data with true, false, or null values.
- Variables that can be put into groups or categories such as company name, product category, card type, or referral medium.
Note
ENTITY_ID is a reserved variable type that Amazon Fraud Detector assigns to the ENTITY_ID variable. The ENTITY_ID variable is the ID of the entity initiating the action you want to evaluate. If you are creating a Transaction Fraud Insights (TFI) model type, you are required to provide the ENTITY_ID variable. You need to decide which variable in your data uniquely identifies the entity initiating the action and pass it on as the ENTITY_ID variable. Assign the CATEGORICAL variable type to all the other IDs in your dataset, if they are present and if you are using them for model training. Examples of other IDs in your dataset that are not an entity are merchant_ID, policy_ID, and campaign_ID.
Assign the FREE_FORM_TEXT variable type to variables that contain a block of text. Examples of FREE_FORM_TEXT variables include user reviews, comments, dates, and referral codes. FREE_FORM_TEXT data contains multiple tokens separated by a delimiter. The delimiter can be any character other than an alphanumeric character or the underscore symbol. For example, user reviews and comments can be separated by a space delimiter, and dates and referral codes can use hyphens as delimiters to separate the prefix, suffix, and intermediate parts. Amazon Fraud Detector uses the delimiters to extract data from FREE_FORM_TEXT variables.
Assign the NUMERIC variable type to variables that are real numbers and have inherent ordering. Examples of NUMERIC variables include day_of_the_week, incident_severity, and customer_rating. Although you can assign the CATEGORICAL variable type to these variables, we strongly recommend that you assign all real number variables with inherent order to the NUMERIC variable type.
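
The delimiter rule described for FREE_FORM_TEXT (any character other than an alphanumeric character or underscore separates tokens) can be illustrated with a short Python sketch. This is not Amazon Fraud Detector's actual tokenizer, whose details are not documented here. It only shows how such a rule splits reviews, dates, and referral codes. The function name split_tokens and the sample strings are hypothetical.

```python
import re

# Any run of characters other than ASCII letters, digits, and underscore
# is treated as a delimiter, following the description above.
DELIMITERS = re.compile(r"[^A-Za-z0-9_]+")


def split_tokens(text: str) -> list:
    """Split free-form text into tokens, dropping empty strings."""
    return [token for token in DELIMITERS.split(text) if token]


print(split_tokens("Great product, fast shipping!"))
# ['Great', 'product', 'fast', 'shipping']
print(split_tokens("2021-12-13"))
# ['2021', '12', '13']
print(split_tokens("REF-SUMMER_21-0042"))
# ['REF', 'SUMMER_21', '0042']
```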

Variable enrichments

Amazon Fraud Detector enriches some of the raw data elements that you provide, such as IP addresses, bank identification numbers (BINs), and phone numbers, to create additional inputs and boost performance for the models that use these data elements. The enrichment helps identify potentially suspicious situations and helps the models capture more fraud.

Phone number enrichment

Amazon Fraud Detector enriches phone number data with additional information that relates to geolocation, the original carrier, and the validity of the phone number. Phone number enrichment is automatically enabled for all models that are trained on or after December 13, 2021 and have a phone number that includes a country code (+xxx). If you included a phone number variable in your model and trained it before December 13, 2021, retrain your model so that it can take advantage of this enrichment.

We highly recommend that you use the following format for phone number variables to ensure that your data is enriched successfully.

Variable	Format	Description
PHONE_NUMBER	The E.164 standard	Make sure to include country code (+xxx) with the phone number.
BILLING_PHONE and SHIPPING_PHONE	The E.164 standard	Make sure to include country code (+xxx) with the phone number.
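
Before sending phone numbers, you might normalize them to the E.164 shape on your side. The following Python sketch strips common separators and checks that the result is a plus sign followed by up to 15 digits with a non-zero first digit. The helper name to_e164 is hypothetical, and the check covers only the shape of the number, not whether the number is assigned or reachable.

```python
import re
from typing import Optional

# E.164 shape: a plus sign, then up to 15 digits, the first of which is not zero.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def to_e164(raw: str) -> Optional[str]:
    """Strip common separators and return the number if it has E.164 shape."""
    compact = re.sub(r"[\s\-().]", "", raw)
    return compact if E164.fullmatch(compact) else None


print(to_e164("+1 555-0100"))  # +15550100
print(to_e164("555-0100"))     # None, because the country code is missing
```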

Geolocation enrichment

Starting on February 8, 2022, Amazon Fraud Detector calculates the physical distance between the IP_ADDRESS, BILLING_ZIP, and SHIPPING_ZIP values that you provide for an event. The calculated distances are used as inputs to your fraud detection model.

To enable geolocation enrichment, your event data must include at least two of the three variables: IP_ADDRESS, BILLING_ZIP, or SHIPPING_ZIP. In addition, each BILLING_ZIP and SHIPPING_ZIP value must have a valid BILLING_COUNTRY code and SHIPPING_COUNTRY code respectively. If you have a model that was trained before February 8, 2022 and it includes these variables, you must retrain the model to enable the geolocation enrichment.

If Amazon Fraud Detector can't determine the location that's associated with the IP_ADDRESS, BILLING_ZIP, or SHIPPING_ZIP values for an event because the data isn't valid, a special placeholder value is used instead. For example, suppose that an event has valid IP_ADDRESS and BILLING_ZIP values, but the SHIPPING_ZIP value isn't valid. In this case, enrichment is done only for IP_ADDRESS -> BILLING_ZIP. The enrichment isn't done for IP_ADDRESS -> SHIPPING_ZIP and BILLING_ZIP -> SHIPPING_ZIP. Instead, the placeholder values are used in their place. Whether or not geolocation enrichment is enabled for your model, the performance of your model doesn't change.
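
The pairing rules above can be sketched in Python to check, before sending an event, which distances could be calculated. This is an illustration under simplifying assumptions: a value counts as usable if it is present and non-empty, and a ZIP value also needs its matching country value. The service's real validity checks are not described here, and the function names are hypothetical.

```python
from itertools import combinations

GEO_VARIABLES = ("IP_ADDRESS", "BILLING_ZIP", "SHIPPING_ZIP")
COUNTRY_FOR_ZIP = {
    "BILLING_ZIP": "BILLING_COUNTRY",
    "SHIPPING_ZIP": "SHIPPING_COUNTRY",
}


def usable_locations(event: dict) -> list:
    """Return geolocation variables that are present and, for ZIPs, paired with a country."""
    usable = []
    for name in GEO_VARIABLES:
        if not event.get(name):
            continue
        country = COUNTRY_FOR_ZIP.get(name)
        if country is not None and not event.get(country):
            continue
        usable.append(name)
    return usable


def distance_pairs(event: dict) -> list:
    """Return the variable pairs for which a distance could be calculated."""
    return list(combinations(usable_locations(event), 2))


event = {
    "IP_ADDRESS": "192.0.2.0",
    "BILLING_ZIP": "01234",
    "BILLING_COUNTRY": "US",
    "SHIPPING_ZIP": "01234",  # no SHIPPING_COUNTRY, so this ZIP is skipped
}
print(distance_pairs(event))  # [('IP_ADDRESS', 'BILLING_ZIP')]
```

An empty result means fewer than two usable location values were found, which corresponds to the case where geolocation enrichment cannot be applied to the event.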

You can opt out of geolocation enrichment by mapping your BILLING_ZIP and SHIPPING_ZIP variables to the custom CATEGORICAL variable type. Changing the variable type doesn't affect your model's performance.

Geolocation variable format

We highly recommend that you use the following format for geolocation variables to ensure that your location data is enriched successfully.

Variable	Format	Description
IP_ADDRESS	IPv4 address	For example - 1.1.1.1
BILLING_ZIP and SHIPPING_ZIP	The ISO 3166-1 alpha-2 postal code for the specified country	For more information, see the Country and territory codes section in this topic.
BILLING_COUNTRY and SHIPPING_COUNTRY	The ISO 3166-1 alpha-2 two-letter standard country code	For more information, see the Country and territory codes section in this topic. Amazon Fraud Detector tries to match all the common variations of a country's name to their ISO 3166-1 two-letter standard country code. However, we cannot guarantee they will be matched correctly.

The following table provides a complete list of the countries and territories that are supported by Amazon Fraud Detector for geolocation enrichment. Each country and territory has an assigned country code (specifically, the ISO 3166-1 alpha-2 two-letter country code) and a postal code.

Postal code format

9 - number
a - letter
[X] - X is optional. For example, Guernsey "GY9[9] 9aa" means both "GY9 9aa" and "GY99 9aa" are valid. Use one format.
[X/XX] - either X or XX can be used. For example, Bermuda "aa[aa/99]" means both "aa aa" and "aa 99" are valid. Use either one of these formats, but do not use both.
Some countries have a fixed prefix. For example, the postal code for Andorra is AD999. This means the postal code must start with the letters AD followed by three numbers.
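
The legend above can be turned into a regular expression for client-side checks. The following Python sketch is one possible reading of the notation, not an official validator: the function names are hypothetical, and spaces are ignored in both the pattern and the postal code, which is an assumption made because the Bermuda example shows a space that the pattern itself does not contain.

```python
import re


def _convert(fragment: str) -> str:
    """Translate legend characters: 9 is a digit, a is a letter, others are literal."""
    out = []
    for ch in fragment:
        if ch == "9":
            out.append(r"\d")
        elif ch == "a":
            out.append("[A-Za-z]")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def pattern_to_regex(pattern: str):
    """Compile a postal code pattern from the legend into a regular expression."""
    parts = []
    # Split on bracket groups while keeping them: "GY9[9]9aa" -> "GY9", "[9]", "9aa".
    for piece in re.split(r"(\[[^\]]*\])", pattern.replace(" ", "")):
        if piece.startswith("["):
            options = piece[1:-1].split("/")
            group = "|".join(_convert(option) for option in options)
            # A single option is optional; several options mean "pick one".
            suffix = "?" if len(options) == 1 else ""
            parts.append(f"(?:{group}){suffix}")
        else:
            parts.append(_convert(piece))
    return re.compile("".join(parts))


def matches(pattern: str, postal_code: str) -> bool:
    """Check a postal code against a legend pattern, ignoring spaces (assumption)."""
    return pattern_to_regex(pattern).fullmatch(postal_code.replace(" ", "")) is not None


print(matches("GY9[9] 9aa", "GY1 1AA"))   # True
print(matches("GY9[9] 9aa", "GY10 1AA"))  # True
print(matches("aa[aa/99]", "HM 12"))      # True
print(matches("AD999", "500"))            # False, because the AD prefix is missing
```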

Code	Name	Postal code
AD	Andorra	AD999
AR	Argentina	9999
AT	Austria	9999
AU	Australia	9999
AZ	Azerbaijan	AZ 9999
BD	Bangladesh	9999
BE	Belgium	9999
BG	Bulgaria	9999
BM	Bermuda	aa[aa/99]
BY	Belarus	999999
CA	Canada	a9a 9a9
CH	Switzerland	9999
CL	Chile	9999999
CO	Colombia	999999
CR	Costa Rica	99999
CY	Cyprus	9999
CZ	Czechia	999 99
DE	Germany	99999
DK	Denmark	9999
DO	Dominican Republic	99999
DZ	Algeria	99999
EE	Estonia	99999
ES	Spain	99999
FI	Finland	99999
FM	Federated States of Micronesia	99999
FO	Faroe Islands	999
FR	France	99999
GB	United Kingdom	a[a]9[a/9] 9aa
GG	Guernsey	GY9[9] 9aa
GL	Greenland	9999
GP	Guadeloupe	99999
GT	Guatemala	99999
GU	Guam	99999
HR	Croatia	99999
HU	Hungary	9999
IE	Ireland	a99[a/9][a/9][a/9][a/9]
IM	Isle of Man	IM9[9]9aa
IN	India	999999
IS	Iceland	999
IT	Italy	99999
JE	Jersey	JE9[9]9aa
JP	Japan	999-9999
KR	Republic of Korea	99999
LI	Liechtenstein	9999
LK	Sri Lanka	99999
LT	Lithuania	99999
LU	Luxembourg	L-9999
LV	Latvia	LV-9999
MC	Monaco	99999
MD	Republic of Moldova	9999
MH	Marshall Islands	99999
MK	North Macedonia	9999
MP	Northern Mariana Islands	99999
MQ	Martinique	99999
MT	Malta	aaa 9999
MX	Mexico	99999
MY	Malaysia	99999
NL	Netherlands	9999 aa
NO	Norway	9999
NZ	New Zealand	9999
PH	Philippines	9999
PK	Pakistan	99999
PL	Poland	99-999
PR	Puerto Rico	99999
PT	Portugal	9999-999
PW	Palau	99999
RE	Reunion	99999
RO	Romania	999999
RU	Russian Federation	999999
SE	Sweden	999 99
SG	Singapore	999999
SI	Slovenia	9999
SK	Slovakia	999 99
SM	San Marino	99999
TH	Thailand	99999
TR	Turkey	99999
UA	Ukraine	99999
US	United States	99999
UY	Uruguay	99999
VI	Virgin Islands, US	99999
WF	Wallis and Futuna	99999
YT	Mayotte	99999
ZA	South Africa	9999

Useragent enrichment

If you create an Account Takeover Insights (ATI) model, you must provide a variable of the USERAGENT variable type in your dataset. This variable contains the browser, device, and OS data of a login event. Amazon Fraud Detector enriches the useragent data with additional information such as user_agent_family, OS_family, and device_family.
