Jobs - AWS Data Exchange User Guide

Jobs

AWS Data Exchange jobs are asynchronous import or export operations used to create or copy assets. A data set owner can import and export, but someone with an entitlement to a data set can only export. You can use the console, AWS CLI, your own REST application, or one of the AWS SDKs to create or copy assets through jobs.

Jobs are deleted 90 days after they are created.

Job properties

Jobs have the following properties:

  • Job ID – An ID generated when the job is created that uniquely identifies the job.

  • Job type – The following job types are supported: import from Amazon S3, import from signed URL, export to Amazon S3, export to signed URL.

  • Amazon Resource Name (ARN) – A unique identifier for AWS resources.

  • Job state – The job state can be: WAITING, IN_PROGRESS, COMPLETED, CANCELLED, ERROR, or TIMED_OUT. When a job is created, it's put in the WAITING state until the job is started.

  • Job details – Details of the operation to be performed by the job, such as export destination details or import source details.

Example Job Resource

{
    "Arn": "arn:aws:dataexchange:us-east-1:123456789012:jobs/6cEXAMPLE818f7c7a23b3d0EXAMPLE1c",
    "Id": "6cEXAMPLE818f7c7a23b3d0EXAMPLE1c",
    "State": "COMPLETED",
    "Type": "IMPORT_ASSETS_FROM_S3",
    "CreatedAt": "2019-10-11T14:12:24.640Z",
    "UpdatedAt": "2019-10-11T14:13:00.804Z",
    "Details": {
        "ImportAssetsFromS3": {
            "AssetSources": [
                {
                    "Bucket": "DOC-EXAMPLE-BUCKET",
                    "Key": "MyKey"
                }
            ],
            "DataSetId": "14EXAMPLE4460dc9b005a0dEXAMPLE2f",
            "RevisionId": "e5EXAMPLE224f879066f999EXAMPLE42"
        }
    }
}
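Because jobs are asynchronous, a caller usually polls the job until it reaches a final state. The following is a minimal sketch, assuming `client` is a boto3 `dataexchange` client (or any object with a compatible `get_job` method). The helper name `wait_for_job`, the polling interval, and the poll limit are illustrative assumptions, and treating COMPLETED, CANCELLED, ERROR, and TIMED_OUT as final states is an interpretation of the state list above.

```python
import time

# States treated as final here (an assumption based on the state list above).
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "ERROR", "TIMED_OUT"}


def wait_for_job(client, job_id, poll_seconds=5.0, max_polls=120):
    """Poll get_job until the job reaches a final state, then return the job."""
    for _ in range(max_polls):
        job = client.get_job(JobId=job_id)
        if job["State"] in TERMINAL_STATES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The returned dictionary has the same shape as the example job resource, so the caller can check `job["State"]` before using the results.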

AWS Regions and jobs

If you import or export an asset to or from an Amazon S3 bucket that is in an AWS Region different from the data set's, your AWS account is charged for the data transfer costs according to Amazon S3 data transfer pricing policies.

Importing assets

There are two ways you can import assets to a revision:

  • From an Amazon S3 bucket that you have permissions to access.

  • By using a signed URL.

Importing assets from an Amazon S3 bucket

When you import from an Amazon S3 bucket, you must create and start a job of type IMPORT_ASSETS_FROM_S3. Provide the details of the import destinations (including the asset ID, revision ID, and data set ID) and the asset sources (Amazon S3). The newly created assets have a name property equal to the original S3 object's key. You can update the assets' name property after they are created. You can import up to 100 assets in a single job.

When importing assets from Amazon S3 to AWS Data Exchange, the IAM permissions you're using must include the ability to write to the AWS Data Exchange service Amazon S3 buckets and to read from the Amazon S3 bucket where your assets are stored. You can import from any Amazon S3 bucket you have permission to access, regardless of ownership. For more information, see Amazon S3 permissions.
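As an illustration of the S3 import flow, the sketch below builds one `create_job` request per batch of at most 100 S3 keys and then creates and starts each job. It assumes `client` is a boto3 `dataexchange` client. The function names and the `batch_size` parameter are hypothetical, and the request shape mirrors the example job resource shown earlier.

```python
def build_import_from_s3_requests(data_set_id, revision_id, bucket, keys, batch_size=100):
    """Return one create_job request per batch of at most batch_size S3 keys."""
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100")
    requests = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        requests.append({
            "Type": "IMPORT_ASSETS_FROM_S3",
            "Details": {
                "ImportAssetsFromS3": {
                    "AssetSources": [{"Bucket": bucket, "Key": key} for key in batch],
                    "DataSetId": data_set_id,
                    "RevisionId": revision_id,
                }
            },
        })
    return requests


def run_import(client, requests):
    """Create and start each job; client is assumed to be a boto3 dataexchange client."""
    job_ids = []
    for request in requests:
        job = client.create_job(**request)   # the job starts in the WAITING state
        client.start_job(JobId=job["Id"])    # the import begins only after start_job
        job_ids.append(job["Id"])
    return job_ids
```

Keeping request construction separate from the API calls makes it possible to inspect or log the payloads before any job is created.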

Importing assets from a signed URL

You can use signed URLs to import assets that are not stored in Amazon S3. Create a job of type IMPORT_ASSET_FROM_SIGNED_URL and provide the asset name and the Base64-encoded MD5 hash of the asset (24 characters). The job's details include a signed URL that you can use to import your file. The signed URL expires one hour after it's created.
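The hash for a signed-URL import can be computed locally before the job is created. The sketch below assumes that the expected value is the Base64 encoding of the 16-byte MD5 digest, which is always 24 characters long. The helper names are hypothetical, and the request shape follows the `ImportAssetFromSignedUrl` job details as exposed by boto3.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the Base64-encoded MD5 digest of data (24 characters)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def build_signed_url_import_request(data_set_id, revision_id, asset_name, data):
    """Build a create_job request for importing a single asset through a signed URL."""
    return {
        "Type": "IMPORT_ASSET_FROM_SIGNED_URL",
        "Details": {
            "ImportAssetFromSignedUrl": {
                "AssetName": asset_name,
                "DataSetId": data_set_id,
                "Md5Hash": md5_base64(data),
                "RevisionId": revision_id,
            }
        },
    }
```

For large files, the digest could instead be accumulated chunk by chunk with `hashlib.md5().update(...)` so that the whole file never has to be held in memory.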

Exporting assets

There are two ways you can export assets from a published revision of a product:

  • To an Amazon S3 bucket that you have permissions to access.

  • By using a signed URL.

Exporting assets to an Amazon S3 bucket

When you export to an Amazon S3 bucket, you must create and start a job of type EXPORT_ASSETS_TO_S3. Provide details of the assets you would like to export and the target destination. By default, the assets are exported to an S3 object using the original asset name as an object key. You can export up to 100 assets in a single job.

Note

For information about exporting an entire revision as a single job, see Exporting revisions.

AWS Data Exchange supports configurable encryption parameters when exporting data sets to Amazon S3. In your export job details, you can specify the Amazon S3 server-side encryption configuration you want to apply to the exported objects. You can choose to use server-side encryption with Amazon S3-Managed Keys (SSE-S3) or server-side encryption with Customer Master Keys (CMKs) stored in AWS Key Management Service (SSE-KMS). For more information, see Protecting data using server-side encryption in the Amazon Simple Storage Service Developer Guide.
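The following sketch shows how an export request with an encryption setting might be assembled. It assumes the boto3 request shape for `ExportAssetsToS3`, in which the `Encryption` type is `AES256` for SSE-S3 or `aws:kms` with a key ARN for SSE-KMS. The function names are hypothetical, and the optional destination `Key` is omitted so that the default object key described above applies.

```python
def build_encryption(kms_key_arn=None):
    """Return an Encryption block: SSE-KMS when a key ARN is given, otherwise SSE-S3."""
    if kms_key_arn:
        return {"Type": "aws:kms", "KmsKeyArn": kms_key_arn}
    return {"Type": "AES256"}


def build_export_assets_request(data_set_id, revision_id, bucket, asset_ids, kms_key_arn=None):
    """Build a create_job request that exports up to 100 assets to one bucket."""
    if not 1 <= len(asset_ids) <= 100:
        raise ValueError("a single job can export between 1 and 100 assets")
    return {
        "Type": "EXPORT_ASSETS_TO_S3",
        "Details": {
            "ExportAssetsToS3": {
                # No "Key" is given, so the original asset name is used as the object key.
                "AssetDestinations": [
                    {"AssetId": asset_id, "Bucket": bucket} for asset_id in asset_ids
                ],
                "DataSetId": data_set_id,
                "RevisionId": revision_id,
                "Encryption": build_encryption(kms_key_arn),
            }
        },
    }
```

The resulting dictionary can be passed to `client.create_job(**request)`, followed by `client.start_job(JobId=...)`.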

Note

When exporting assets to Amazon S3, the IAM permissions you're using must include the ability to read from the AWS Data Exchange service Amazon S3 buckets and to write to the Amazon S3 bucket where your assets are stored. You can export to any Amazon S3 bucket you have permission to access, regardless of ownership. For more information, see Amazon S3 permissions.

Important

We recommend that you consider Amazon S3 security features when exporting data to Amazon S3. See Security best practices for Amazon S3 for general guidelines and best practices.

Important

If the provider has marked a product as containing protected health information (PHI) subject to the Health Insurance Portability and Accountability Act of 1996 (HIPAA), you may not export the product's data sets into your AWS account unless such AWS account is designated as a HIPAA account (as defined in the AWS Business Associate Addendum found in AWS Artifact).

Exporting assets to a signed URL

You can use signed URLs to export assets to destinations other than S3 buckets. Create and start a job of type EXPORT_ASSET_TO_SIGNED_URL and provide the source details. The job's details include a signed URL that you can use to export your file. The signed URL expires one minute after it's created.
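Because the signed URL is short-lived, it helps to retrieve it and start the download in one step. The sketch below assumes `client` is a boto3 `dataexchange` client and that the URL can be read from the job details under `ExportAssetToSignedUrl.SignedUrl` once the job reports COMPLETED. That timing, the function name, and the polling parameters are assumptions.

```python
import time


def export_asset_to_signed_url(client, data_set_id, revision_id, asset_id,
                               poll_seconds=1.0, max_polls=60):
    """Create and start an export job, then return the signed URL from its details."""
    job = client.create_job(
        Type="EXPORT_ASSET_TO_SIGNED_URL",
        Details={
            "ExportAssetToSignedUrl": {
                "AssetId": asset_id,
                "DataSetId": data_set_id,
                "RevisionId": revision_id,
            }
        },
    )
    client.start_job(JobId=job["Id"])
    for _ in range(max_polls):
        current = client.get_job(JobId=job["Id"])
        if current["State"] == "COMPLETED":
            return current["Details"]["ExportAssetToSignedUrl"]["SignedUrl"]
        if current["State"] in ("ERROR", "CANCELLED", "TIMED_OUT"):
            raise RuntimeError(f"export job ended in state {current['State']}")
        time.sleep(poll_seconds)
    raise TimeoutError("export job did not complete in time")
```

The caller should begin the download as soon as the function returns, for example with `urllib.request.urlopen(url)`, rather than storing the URL for later use.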

Exporting revisions

Subscribers can export all assets in a revision of a product to an Amazon S3 bucket that they have permissions to access.

When you export to an Amazon S3 bucket, you must create and start a job of type EXPORT_REVISIONS_TO_S3. Provide details of the revisions you would like to export, the target destinations, and keyPatterns that will determine the key name of assets. The Amazon S3 object key will default to the keyPattern ${Asset.Name}. For more information on keyPatterns, see Key patterns when exporting revisions.
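A revision export request might be assembled as in the sketch below, which assumes the boto3 request shape for `ExportRevisionsToS3` (one `RevisionDestinations` entry per revision, each with a `Bucket`, `KeyPattern`, and `RevisionId`). The function name is hypothetical, and the default pattern is passed explicitly only to make it visible.

```python
def build_export_revisions_request(data_set_id, bucket, revision_ids,
                                   key_pattern="${Asset.Name}"):
    """Build a create_job request that exports whole revisions to one bucket."""
    return {
        "Type": "EXPORT_REVISIONS_TO_S3",
        "Details": {
            "ExportRevisionsToS3": {
                "DataSetId": data_set_id,
                "RevisionDestinations": [
                    {"Bucket": bucket, "KeyPattern": key_pattern, "RevisionId": revision_id}
                    for revision_id in revision_ids
                ],
            }
        },
    }
```

For example, `build_export_revisions_request("my-data-set-id", "DOC-EXAMPLE-BUCKET", ["my-revision-id"], "${Revision.Id}/${Asset.Name}")` produces a request that places each asset under a prefix named after its revision. The IDs in this call are placeholders.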

AWS Data Exchange supports configurable encryption parameters when exporting revisions to Amazon S3. In your export job details, you can specify the Amazon S3 server-side encryption configuration you want to apply to the exported objects. You can choose to use server-side encryption with Amazon S3-Managed Keys (SSE-S3) or server-side encryption with Customer Master Keys (CMKs) stored in AWS Key Management Service (SSE-KMS). For more information, see Protecting data using server-side encryption in the Amazon Simple Storage Service Developer Guide.

Important

If the provider has marked a product as containing protected health information (PHI) subject to the Health Insurance Portability and Accountability Act of 1996 (HIPAA), you may not export the product's data sets into your AWS account unless such AWS account is designated as a HIPAA account (as defined in the AWS Business Associate Addendum found in AWS Artifact).

Key patterns when exporting revisions

When you export a revision, each asset becomes an object in the Amazon S3 bucket. The object names are based on a key pattern that you provide. A key pattern can contain dynamic references that represent asset and revision attributes, and these are resolved automatically during the export. You can use the following dynamic references:

Dynamic reference              Description
${Revision.Id}                 The ID of the revision being exported.
${Revision.CreatedAt}          The date the revision was created.
${Revision.CreatedAt.Year}     The year the revision was created.
${Revision.CreatedAt.Month}    The month the revision was created.
${Revision.CreatedAt.Day}      The day of the month the revision was created.
${Asset.Name}                  The name of the asset.
${Asset.Id}                    The ID of the asset.

You can use these dynamic references to create the key patterns for your asset names. You must include at least one of the two Asset dynamic references.

For example, using ${Revision.Id}/${Asset.Name} as a key pattern will result in Amazon S3 objects that use the revision Id and asset name (separated by a slash) as the object name.

If you export a revision with the ID "testRevisionId" that has two assets named "asset1" and "asset2", the assets are exported to these locations in Amazon S3:

  • <bucket>/testRevisionId/asset1

  • <bucket>/testRevisionId/asset2

Note

Your resulting objects must have unique names. If they have the same names as existing objects in the bucket, your export will overwrite existing objects. If the revision you are exporting has non-unique names (for example, two assets with the same name), the export will fail. The only dynamic reference that is guaranteed to be unique is ${Asset.Id}.
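The key-pattern rules above can be checked locally before an export is started. The sketch below is a rough preview tool, not the service's own substitution logic: it expands only `${Revision.Id}`, `${Asset.Id}`, and `${Asset.Name}`, raises an error for any other reference, and reports keys that would collide. The function names and the `(asset_id, asset_name)` input format are assumptions.

```python
import re
from collections import Counter

# Matches a dynamic reference such as ${Asset.Name}.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def render_key(pattern, values):
    """Replace each ${...} reference in pattern with its value from values."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unsupported dynamic reference: {name}")
        return values[name]
    return _REFERENCE.sub(substitute, pattern)


def preview_keys(pattern, revision_id, assets):
    """Return (keys, duplicates) for a list of (asset_id, asset_name) pairs."""
    if "${Asset.Name}" not in pattern and "${Asset.Id}" not in pattern:
        raise ValueError("pattern must include ${Asset.Name} or ${Asset.Id}")
    keys = [
        render_key(pattern, {
            "Revision.Id": revision_id,
            "Asset.Id": asset_id,
            "Asset.Name": asset_name,
        })
        for asset_id, asset_name in assets
    ]
    duplicates = sorted(key for key, count in Counter(keys).items() if count > 1)
    return keys, duplicates
```

If `duplicates` is not empty, adding `${Asset.Id}` to the pattern is one way to make the generated keys unique.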