CreateMatchingWorkflow
Creates a matching workflow that defines the configuration for a data processing job.
The workflow name must be unique. To modify an existing workflow, use
UpdateMatchingWorkflow.
Important
For workflows where resolutionType is PROVIDER, incremental
processing is not supported.
Request Syntax
POST /matchingworkflows HTTP/1.1
Content-type: application/json
{
"description": "string",
"incrementalRunConfig": {
"incrementalRunType": "string"
},
"inputSourceConfig": [
{
"applyNormalization": boolean,
"inputSourceARN": "string",
"schemaName": "string"
}
],
"outputSourceConfig": [
{
"applyNormalization": boolean,
"customerProfilesIntegrationConfig": {
"domainArn": "string",
"objectTypeArn": "string"
},
"KMSArn": "string",
"output": [
{
"hashed": boolean,
"name": "string"
}
],
"outputS3Path": "string"
}
],
"resolutionTechniques": {
"providerProperties": {
"intermediateSourceConfiguration": {
"intermediateS3Path": "string"
},
"providerConfiguration": JSON value,
"providerServiceArn": "string"
},
"resolutionType": "string",
"ruleBasedProperties": {
"attributeMatchingModel": "string",
"matchPurpose": "string",
"rules": [
{
"matchingKeys": [ "string" ],
"ruleName": "string"
}
]
},
"ruleConditionProperties": {
"matchingConfig": {
"enableTransitiveMatching": boolean
},
"rules": [
{
"condition": "string",
"ruleName": "string"
}
]
}
},
"roleArn": "string",
"tags": {
"string" : "string"
},
"workflowName": "string"
}
URI Request Parameters
The request does not use any URI parameters.
Request Body
The request accepts the following data in JSON format.
- description
-
A description of the workflow.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 255.
Required: No
- incrementalRunConfig
-
Optional. An object that defines the incremental run type. This object contains only the
incrementalRunType field, which appears as "Automatic" in the console.
Important
For workflows where resolutionType is PROVIDER, incremental processing is not supported.
Type: IncrementalRunConfig object
Required: No
- inputSourceConfig
-
A list of InputSource objects, each of which contains the fields inputSourceARN and schemaName.
Type: Array of InputSource objects
Array Members: Minimum number of 1 item. Maximum number of 20 items.
Required: Yes
- outputSourceConfig
-
A list of OutputSource objects, each of which contains the fields outputS3Path, applyNormalization, KMSArn, and output.
Type: Array of OutputSource objects
Array Members: Fixed number of 1 item.
Required: Yes
- resolutionTechniques
-
An object which defines the resolutionType and the ruleBasedProperties.
Type: ResolutionTechniques object
Required: Yes
- roleArn
-
The Amazon Resource Name (ARN) of the IAM role. AWS Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.
Type: String
Required: Yes
- tags
-
The tags used to organize, track, or control access for this resource.
Type: String to string map
Map Entries: Minimum number of 0 items. Maximum number of 200 items.
Key Length Constraints: Minimum length of 1. Maximum length of 128.
Value Length Constraints: Minimum length of 0. Maximum length of 256.
Required: No
- workflowName
-
The name of the workflow. There can't be multiple MatchingWorkflows with the same name.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern: [a-zA-Z_0-9-]*
Required: Yes
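The constraints listed above can be checked on the client before a request is sent. The following is a minimal Python sketch of such a pre-flight check, not part of the service or any AWS SDK. The function name validate_create_request and the wording of its messages are hypothetical. The limits themselves are taken from the Request Body section.

```python
import re

# Pattern documented for workflowName.
WORKFLOW_NAME_PATTERN = re.compile(r"[a-zA-Z_0-9-]*")


def validate_create_request(body: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means none were found."""
    problems = []

    name = body.get("workflowName")
    if not isinstance(name, str) or not 1 <= len(name) <= 255:
        problems.append("workflowName must be a string of 1 to 255 characters")
    elif not WORKFLOW_NAME_PATTERN.fullmatch(name):
        problems.append("workflowName must match [a-zA-Z_0-9-]*")

    if len(body.get("description", "")) > 255:
        problems.append("description must be at most 255 characters")

    if not 1 <= len(body.get("inputSourceConfig", [])) <= 20:
        problems.append("inputSourceConfig must contain 1 to 20 items")

    if len(body.get("outputSourceConfig", [])) != 1:
        problems.append("outputSourceConfig must contain exactly 1 item")

    techniques = body.get("resolutionTechniques")
    if techniques is None:
        problems.append("resolutionTechniques is required")
    elif techniques.get("resolutionType") == "PROVIDER" and "incrementalRunConfig" in body:
        problems.append("incrementalRunConfig is not supported when resolutionType is PROVIDER")

    if "roleArn" not in body:
        problems.append("roleArn is required")

    tags = body.get("tags", {})
    if len(tags) > 200:
        problems.append("tags must contain at most 200 entries")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128 or len(value) > 256:
            problems.append(f"tag {key!r} violates the key or value length limits")

    return problems
```

A check like this only catches the documented structural limits. The service can still reject a request for other reasons, for example with a ValidationException or ConflictException.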
Response Syntax
HTTP/1.1 200
Content-type: application/json
{
"description": "string",
"incrementalRunConfig": {
"incrementalRunType": "string"
},
"inputSourceConfig": [
{
"applyNormalization": boolean,
"inputSourceARN": "string",
"schemaName": "string"
}
],
"outputSourceConfig": [
{
"applyNormalization": boolean,
"customerProfilesIntegrationConfig": {
"domainArn": "string",
"objectTypeArn": "string"
},
"KMSArn": "string",
"output": [
{
"hashed": boolean,
"name": "string"
}
],
"outputS3Path": "string"
}
],
"resolutionTechniques": {
"providerProperties": {
"intermediateSourceConfiguration": {
"intermediateS3Path": "string"
},
"providerConfiguration": JSON value,
"providerServiceArn": "string"
},
"resolutionType": "string",
"ruleBasedProperties": {
"attributeMatchingModel": "string",
"matchPurpose": "string",
"rules": [
{
"matchingKeys": [ "string" ],
"ruleName": "string"
}
]
},
"ruleConditionProperties": {
"matchingConfig": {
"enableTransitiveMatching": boolean
},
"rules": [
{
"condition": "string",
"ruleName": "string"
}
]
}
},
"roleArn": "string",
"workflowArn": "string",
"workflowName": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- description
-
A description of the workflow.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 255.
- incrementalRunConfig
-
An object which defines an incremental run type and has only incrementalRunType as a field.
Type: IncrementalRunConfig object
- inputSourceConfig
-
A list of InputSource objects, each of which contains the fields inputSourceARN and schemaName.
Type: Array of InputSource objects
Array Members: Minimum number of 1 item. Maximum number of 20 items.
- outputSourceConfig
-
A list of OutputSource objects, each of which contains the fields outputS3Path, applyNormalization, KMSArn, and output.
Type: Array of OutputSource objects
Array Members: Fixed number of 1 item.
- resolutionTechniques
-
An object which defines the resolutionType and the ruleBasedProperties.
Type: ResolutionTechniques object
- roleArn
-
The Amazon Resource Name (ARN) of the IAM role. AWS Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.
Type: String
- workflowArn
-
The ARN (Amazon Resource Name) that AWS Entity Resolution generated for the MatchingWorkflow.
Type: String
Pattern: arn:(aws|aws-us-gov|aws-cn):entityresolution:[a-z]{2}-[a-z]{1,10}-[0-9]:[0-9]{12}:(matchingworkflow/[a-zA-Z_0-9-]{1,255})
- workflowName
-
The name of the workflow.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[a-zA-Z_0-9-]*
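The workflowArn pattern above can be used to check a returned ARN and recover the workflow name from it. The following is a minimal Python sketch under that assumption. The helper name workflow_name_from_arn is hypothetical, and the ARN in the usage note is a made-up value that merely fits the pattern.

```python
import re

# Pattern documented for workflowArn, split across lines for readability.
WORKFLOW_ARN_PATTERN = re.compile(
    r"arn:(aws|aws-us-gov|aws-cn):entityresolution:"
    r"[a-z]{2}-[a-z]{1,10}-[0-9]:[0-9]{12}:"
    r"(matchingworkflow/[a-zA-Z_0-9-]{1,255})"
)


def workflow_name_from_arn(arn: str) -> str:
    """Return the workflow name embedded in a matching workflow ARN."""
    match = WORKFLOW_ARN_PATTERN.fullmatch(arn)
    if match is None:
        raise ValueError(f"not a matching workflow ARN: {arn!r}")
    # Group 2 is "matchingworkflow/<name>"; keep only the part after the slash.
    return match.group(2).split("/", 1)[1]
```

For example, workflow_name_from_arn("arn:aws:entityresolution:us-east-1:123456789012:matchingworkflow/sample") returns "sample".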
Errors
For information about the errors that are common to all actions, see Common Error Types.
- AccessDeniedException
-
You do not have sufficient access to perform this action.
HTTP Status Code: 403
- ConflictException
-
The request couldn't be processed because of a conflict in the current state of the resource. Examples: the workflow already exists, the schema already exists, or the workflow is currently running.
HTTP Status Code: 400
- ExceedsLimitException
-
The request was rejected because it attempted to create resources beyond the current AWS Entity Resolution account limits. The error message describes the limit exceeded.
- quotaName
-
The name of the quota that has been breached.
- quotaValue
-
The current quota value for the customers.
HTTP Status Code: 402
- InternalServerException
-
This exception occurs when there is an internal failure in the AWS Entity Resolution service.
HTTP Status Code: 500
- ThrottlingException
-
The request was denied due to request throttling.
HTTP Status Code: 429
- ValidationException
-
The input fails to satisfy the constraints specified by AWS Entity Resolution.
HTTP Status Code: 400
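A caller usually wants to tell errors that may succeed on a later attempt from errors that need a changed request. The following Python sketch records the status codes listed above and adds a simple retry policy. The choice to retry only ThrottlingException and InternalServerException, and the helper names should_retry and backoff_seconds, are assumptions made for illustration and are not stated by this reference.

```python
# HTTP status codes as listed in the Errors section.
HTTP_STATUS_BY_ERROR = {
    "AccessDeniedException": 403,
    "ConflictException": 400,
    "ExceedsLimitException": 402,
    "InternalServerException": 500,
    "ThrottlingException": 429,
    "ValidationException": 400,
}

# Assumption: only throttling and internal failures are worth retrying unchanged.
RETRYABLE_ERRORS = {"ThrottlingException", "InternalServerException"}


def should_retry(error_name: str, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if the error is retryable and attempts remain (attempt is 0-based)."""
    return error_name in RETRYABLE_ERRORS and attempt < max_attempts


def backoff_seconds(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff delay for a 0-based attempt number, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))
```

ConflictException and ValidationException share status code 400, so a client has to branch on the error name rather than on the status code alone.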
Examples
Example of a rule-based matching workflow with batch (manual) processing
The following example uses the CreateMatchingWorkflow API to create a
rule-based matching workflow with batch processing in AWS Entity Resolution. It sets up a
workflow named "sample" that uses an AWS Glue table as the input source and
configures output for ID, email, and gender fields. The workflow employs rule-based
matching techniques with a single rule ("Rule1") that uses the email field as a
matching key. The request specifies an attribute matching model of "ONE_TO_ONE" and
includes settings to not apply normalization to the input data. Since no
incrementalRunConfig is specified, this workflow will use the default
batch processing mode.
Sample Request
{
"workflowName": "sample",
"inputSourceConfig": [
{
"applyNormalization": false,
"inputSourceARN": "arn:aws:glue:<region>:<accountId>:table/<glueDatabaseName>/<glueTableName>",
"schemaName": "sampleSchemaName"
}
],
"outputSourceConfig": [
{
"outputS3Path": "s3://<bucketName>/prefix",
"output": [
{
"name": "id",
"hashed": false
},
{
"name": "email",
"hashed": false
},
{
"name": "gender",
"hashed": false
}
]
}
],
"resolutionTechniques": {
"resolutionType": "RULE_MATCHING",
"ruleBasedProperties": {
"rules": [
{
"ruleName": "Rule1",
"matchingKeys": [
"email"
]
}
],
"attributeMatchingModel": "ONE_TO_ONE"
}
},
"roleArn": "arn:aws:iam::<accountId>:role/passRoleArn"
}
Example of a rule-based matching workflow with incremental (automatic) processing
The following example uses the CreateMatchingWorkflow API to create a
rule-based matching workflow with incremental processing in AWS Entity Resolution. It
sets up a workflow named "sample" that uses an AWS Glue table as the input
source and configures output for ID, email, and gender fields. The workflow employs
rule-based matching techniques with a single rule ("Rule1") that uses the email field
as a matching key. The request specifies an attribute matching model of "ONE_TO_ONE"
and enables immediate incremental processing. It also includes settings to not apply
normalization to the input data and provides the necessary IAM role
for workflow execution.
Sample Request
{
"workflowName": "sample",
"inputSourceConfig": [
{
"applyNormalization": false,
"inputSourceARN": "arn:aws:glue:<region>:<accountId>:table/<glueDatabaseName>/<glueTableName>",
"schemaName": "sampleSchemaName"
}
],
"outputSourceConfig": [
{
"outputS3Path": "s3://<bucketName>/prefix",
"output": [
{
"name": "id",
"hashed": false
},
{
"name": "email",
"hashed": false
},
{
"name": "gender",
"hashed": false
}
]
}
],
"resolutionTechniques": {
"resolutionType": "RULE_MATCHING",
"ruleBasedProperties": {
"rules": [
{
"ruleName": "Rule1",
"matchingKeys": [
"email"
]
}
],
"attributeMatchingModel": "ONE_TO_ONE"
}
},
"incrementalRunConfig": {
"incrementalRunType": "IMMEDIATE"
},
"roleArn": "arn:aws:iam::<accountId>:role/passRoleArn"
}
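The batch and incremental sample requests above differ only in the presence of incrementalRunConfig. The following Python sketch builds either body from one function. The function name build_rule_based_request and its parameters are hypothetical, and the field values mirror the samples. It produces the JSON body only and does not send the request.

```python
import json


def build_rule_based_request(workflow_name, input_source_arn, schema_name,
                             output_s3_path, role_arn, incremental=False):
    """Build a CreateMatchingWorkflow body for a single-rule, email-keyed workflow."""
    body = {
        "workflowName": workflow_name,
        "inputSourceConfig": [
            {
                "applyNormalization": False,
                "inputSourceARN": input_source_arn,
                "schemaName": schema_name,
            }
        ],
        "outputSourceConfig": [
            {
                "outputS3Path": output_s3_path,
                "output": [
                    {"name": field, "hashed": False}
                    for field in ("id", "email", "gender")
                ],
            }
        ],
        "resolutionTechniques": {
            "resolutionType": "RULE_MATCHING",
            "ruleBasedProperties": {
                "rules": [{"ruleName": "Rule1", "matchingKeys": ["email"]}],
                "attributeMatchingModel": "ONE_TO_ONE",
            },
        },
        "roleArn": role_arn,
    }
    if incremental:
        # Omitting this key leaves the workflow in the default batch mode.
        body["incrementalRunConfig"] = {"incrementalRunType": "IMMEDIATE"}
    return body


def to_json(body):
    """Serialize the body as it would appear in the POST request."""
    return json.dumps(body, indent=2)
```

Calling build_rule_based_request(..., incremental=True) yields the second sample, and leaving incremental at its default yields the first.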
Example of a machine learning-based matching workflow
The following example uses the CreateMatchingWorkflow API to create a
machine learning-based matching workflow in AWS Entity Resolution. It sets up a workflow
named "sample" that uses an AWS Glue table as the input source, configures
output for ID, email, and gender fields, and employs ML-based matching techniques.
The request specifies not to apply normalization to the input data and includes the
necessary IAM role for workflow execution.
Sample Request
{
"workflowName": "sample",
"inputSourceConfig": [
{
"applyNormalization": false,
"inputSourceARN": "arn:aws:glue:<region>:<accountId>:table/<glueDatabaseName>/<glueTableName>",
"schemaName": "sampleSchemaName"
}
],
"outputSourceConfig": [
{
"outputS3Path": "s3://<bucketName>/prefix",
"output": [
{
"name": "id",
"hashed": false
},
{
"name": "email",
"hashed": false
},
{
"name": "gender",
"hashed": false
}
]
}
],
"resolutionTechniques": {
"resolutionType": "ML_MATCHING"
},
"roleArn": "arn:aws:iam::<accountId>:role/passRoleArn"
}
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: