Jobs in AWS Data Exchange - AWS Data Exchange User Guide

Jobs in AWS Data Exchange

AWS Data Exchange jobs are asynchronous import or export operations that you can use to create or copy assets. If you're a dataset owner, you can perform both import and export operations. However, someone with an entitlement to a dataset can only perform an export operation. To create or copy assets through jobs, you can use the AWS Management Console, AWS Command Line Interface (AWS CLI), your own REST application, or one of the AWS SDKs.

Jobs are deleted 90 days after they are created.

Job properties

Jobs have the following properties:

  • Job ID – An ID generated when the job is created that uniquely identifies the job.

  • Job type – The following job types are supported:

    • Import from Amazon Simple Storage Service (Amazon S3)

    • Import from signed URL

    • Export to Amazon S3

    • Export to signed URL

  • Amazon Resource Name (ARN) – A unique identifier for AWS resources.

  • Job state – The job states are WAITING, IN_PROGRESS, COMPLETED, CANCELLED, ERROR, or TIMED_OUT. When a job is created, it's in the WAITING state until the job is started.

  • Job details – Details of the operation to be performed by the job, such as export destination details or import source details.
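The state model above suggests a simple create, start, and poll loop. The following is a minimal sketch, not an official implementation. It assumes `client` behaves like the boto3 `dataexchange` client (`create_job`, `start_job`, `get_job`). The helper name `run_job`, the polling interval, and the poll limit are illustrative choices.

```python
import time

# Terminal job states, taken from the list of job states above.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "ERROR", "TIMED_OUT"}


def run_job(client, job_type, details, poll_seconds=5.0, max_polls=120):
    """Create a job, start it, and poll until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("dataexchange").
    Returns the last job resource returned by get_job.
    """
    # A newly created job stays in the WAITING state until it is started.
    job_id = client.create_job(Type=job_type, Details=details)["Id"]
    client.start_job(JobId=job_id)

    for _ in range(max_polls):
        job = client.get_job(JobId=job_id)
        if job["State"] in TERMINAL_STATES:
            return job
        time.sleep(poll_seconds)

    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A caller would typically check that the returned job's `State` is `COMPLETED` before relying on the result, because `CANCELLED`, `ERROR`, and `TIMED_OUT` also end the loop.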

Example job resource

{
  "Arn": "arn:aws:dataexchange:us-east-1:123456789012:jobs/6cEXAMPLE818f7c7a23b3d0EXAMPLE1c",
  "Id": "6cEXAMPLE818f7c7a23b3d0EXAMPLE1c",
  "State": "COMPLETED",
  "Type": "IMPORT_ASSETS_FROM_S3",
  "CreatedAt": "2019-10-11T14:12:24.640Z",
  "UpdatedAt": "2019-10-11T14:13:00.804Z",
  "Details": {
    "ImportAssetsFromS3": {
      "AssetSources": [
        {
          "Bucket": "DOC-EXAMPLE-BUCKET",
          "Key": "MyKey"
        }
      ],
      "DataSetId": "14EXAMPLE4460dc9b005a0dEXAMPLE2f",
      "RevisionId": "e5EXAMPLE224f879066f999EXAMPLE42"
    }
  }
}

AWS Regions and jobs

If you import or export an asset to or from an Amazon S3 bucket that is in an AWS Region that is different than the dataset's Region, your AWS account is charged for the data transfer costs, according to Amazon S3 data transfer pricing policies. If you export assets to a signed URL, your AWS account is charged for data transfer costs from Amazon S3 to the internet according to Amazon S3 pricing policies.

Importing assets

There are two ways you can import assets to a revision:

  • From an Amazon S3 bucket that you have permissions to access

  • By using a signed URL

Importing assets from an Amazon S3 bucket

When you import from an Amazon S3 bucket, you must create and start a job of type IMPORT_ASSETS_FROM_S3. Provide the details of the import destinations (including the asset ID, revision ID, and dataset ID) and the asset sources (Amazon S3). The newly created assets have a name property equal to the original S3 object's key. You can update the assets' name property after they are created. You can import up to 100 assets in a single job.

When you import assets from Amazon S3 to AWS Data Exchange, the AWS Identity and Access Management (IAM) permissions you use must include the ability to write to the AWS Data Exchange service S3 buckets and to read from the S3 bucket where your assets are stored. You can import from any S3 bucket you have permission to access, regardless of ownership. For more information, see Amazon S3 permissions.
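Because a single job accepts at most 100 assets, a larger import has to be split across several jobs. The sketch below builds one `Details` payload per job in the shape shown in the example job resource. The function name `import_from_s3_details` and its parameters are hypothetical, and the code only builds dictionaries; it does not call AWS.

```python
MAX_ASSETS_PER_JOB = 100  # per-job limit stated above


def import_from_s3_details(bucket, keys, data_set_id, revision_id):
    """Return a list of Details dicts, each covering at most 100 S3 keys."""
    batches = []
    for start in range(0, len(keys), MAX_ASSETS_PER_JOB):
        chunk = keys[start:start + MAX_ASSETS_PER_JOB]
        batches.append({
            "ImportAssetsFromS3": {
                # One asset source per S3 object to import.
                "AssetSources": [{"Bucket": bucket, "Key": key} for key in chunk],
                "DataSetId": data_set_id,
                "RevisionId": revision_id,
            }
        })
    return batches
```

Each returned dictionary can then be passed as the `Details` of a separate job of type IMPORT_ASSETS_FROM_S3.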

Importing assets from a signed URL

You can use signed URLs to import assets that are not stored in Amazon S3. Create a job of type IMPORT_ASSET_FROM_SIGNED_URL and provide the asset name and the Base64-encoded MD5 hash of the asset (a 16-byte MD5 digest encodes to 24 characters). The job's details include a signed URL that you can use to import your file. The signed URL expires one minute after it's created.
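The hash for a signed URL import can be computed locally before the job is created. The sketch below uses only the Python standard library. An MD5 digest is 16 raw bytes, and its Base64 text form is 24 characters long. The helper names are illustrative, and the `ImportAssetFromSignedUrl` field names follow the boto3 `create_job` request shape as best understood here; treat them as an assumption to verify against the API reference.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the Base64-encoded MD5 digest of `data`."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")  # 24 characters


def import_from_signed_url_details(data, asset_name, data_set_id, revision_id):
    """Build a Details dict for an IMPORT_ASSET_FROM_SIGNED_URL job."""
    return {
        "ImportAssetFromSignedUrl": {
            "AssetName": asset_name,
            "DataSetId": data_set_id,
            "Md5Hash": md5_base64(data),
            "RevisionId": revision_id,
        }
    }
```

Because the signed URL is short-lived, it helps to read the file and compute the hash first, and to upload immediately after the job returns the URL.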

Exporting assets

There are two ways you can export assets from a published revision of a product:

  • To an Amazon S3 bucket that you have permissions to access

  • By using a signed URL

Exporting assets to an Amazon S3 bucket

When you export to an Amazon S3 bucket, you must create and start a job of type EXPORT_ASSETS_TO_S3. Provide details of the assets you would like to export and the target destination. By default, the assets are exported to an S3 object using the original asset name as an object key. You can export up to 100 assets in a single job.

Note

For information about exporting an entire revision as a single job, see Exporting revisions.

When you export assets to Amazon S3, the IAM permissions you use must include the ability to read from the AWS Data Exchange service S3 buckets and to write to the S3 bucket where your assets are stored. You can export to any Amazon S3 bucket you have permission to access, regardless of ownership. For more information, see Amazon S3 permissions.

AWS Data Exchange supports configurable encryption parameters when exporting datasets to Amazon S3. In your export job details, you can specify the Amazon S3 server-side encryption configuration you want to apply to the exported objects. You can choose to use server-side encryption with Amazon S3-Managed Keys (SSE-S3) or server-side encryption with customer master keys (CMKs) stored in AWS Key Management Service (SSE-KMS). For more information, see Protecting data using server-side encryption in the Amazon Simple Storage Service Developer Guide.
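As an illustration of the encryption options above, the following sketch builds the `Details` payload for an EXPORT_ASSETS_TO_S3 job. It assumes the boto3 `create_job` request shape, in which `Encryption.Type` is `AES256` for SSE-S3 or `aws:kms` for SSE-KMS. The function names and the choice to default to SSE-S3 are illustrative, not part of the service.

```python
MAX_ASSETS_PER_JOB = 100  # per-job limit stated earlier in this section


def encryption_config(kms_key_arn=None):
    """Return SSE-KMS settings when a key ARN is given, otherwise SSE-S3."""
    if kms_key_arn:
        return {"Type": "aws:kms", "KmsKeyArn": kms_key_arn}
    return {"Type": "AES256"}


def export_assets_to_s3_details(asset_ids, bucket, data_set_id, revision_id,
                                kms_key_arn=None):
    """Build a Details dict for an EXPORT_ASSETS_TO_S3 job."""
    if len(asset_ids) > MAX_ASSETS_PER_JOB:
        raise ValueError("a single job can export at most 100 assets")
    return {
        "ExportAssetsToS3": {
            # No explicit object key: the original asset name is used by default.
            "AssetDestinations": [
                {"AssetId": asset_id, "Bucket": bucket} for asset_id in asset_ids
            ],
            "DataSetId": data_set_id,
            "RevisionId": revision_id,
            "Encryption": encryption_config(kms_key_arn),
        }
    }
```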

Important

We recommend that you consider Amazon S3 security features when exporting data to Amazon S3. See Security best practices for Amazon S3 for general guidelines and best practices.

Important

If the provider has marked a product as containing protected health information (PHI) subject to the Health Insurance Portability and Accountability Act of 1996 (HIPAA), you may not export the product's datasets into your AWS account unless such AWS account is designated as a HIPAA account (as defined in the AWS Business Associate Addendum found in AWS Artifact).

Exporting assets to a signed URL

You can use signed URLs to export assets to destinations other than S3 buckets. Create and start a job of type EXPORT_ASSET_TO_SIGNED_URL and provide the source details. The job's details include a signed URL that you can use to export your file. The signed URL expires one minute after it's created.

Exporting revisions

Subscribers can export all assets in a revision of a product to an S3 bucket that they have permissions to access.

When you export to an S3 bucket, you must create and start a job of type EXPORT_REVISIONS_TO_S3. Provide details of the revisions you would like to export, the target destinations, and key patterns that will determine the key name of assets. The Amazon S3 object key defaults to the key pattern ${Asset.Name}. For more information about key patterns, see Key patterns when exporting revisions.

AWS Data Exchange supports configurable encryption parameters when exporting revisions to Amazon S3. In your export job details, you can specify the Amazon S3 server-side encryption configuration you want to apply to the exported objects. You can choose to use server-side encryption with Amazon S3-Managed Keys (SSE-S3) or server-side encryption with customer master keys (CMKs) stored in AWS Key Management Service (SSE-KMS). For more information, see Protecting data using server-side encryption in the Amazon Simple Storage Service Developer Guide.
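A revision export differs from an asset export mainly in that each destination names a revision and a key pattern rather than individual assets. The sketch below builds such a payload, assuming the boto3 `create_job` request shape for `ExportRevisionsToS3`. The function name is hypothetical, and the default argument mirrors the default key pattern described above.

```python
def export_revisions_to_s3_details(data_set_id, revision_ids, bucket,
                                   key_pattern="${Asset.Name}"):
    """Build a Details dict for an EXPORT_REVISIONS_TO_S3 job."""
    return {
        "ExportRevisionsToS3": {
            "DataSetId": data_set_id,
            # One destination entry per revision to export.
            "RevisionDestinations": [
                {
                    "RevisionId": revision_id,
                    "Bucket": bucket,
                    "KeyPattern": key_pattern,
                }
                for revision_id in revision_ids
            ],
        }
    }
```

An `Encryption` entry can be added next to `RevisionDestinations` when server-side encryption settings are needed.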

Important

If the provider has marked a product as containing protected health information (PHI) subject to the Health Insurance Portability and Accountability Act of 1996 (HIPAA), you may not export the product's datasets into your AWS account unless such AWS account is designated as a HIPAA account (as defined in the AWS Business Associate Addendum found in AWS Artifact).

Key patterns when exporting revisions

When you export a revision, each asset becomes an object in the Amazon S3 bucket. The names of the objects are based on a key pattern that you provide. You can use dynamic references that represent asset attributes to create a pattern for the names that are automatically generated during the export. Use the dynamic references shown in the following table.

Dynamic reference              Description
${Revision.Id}                 The Id of the revision being exported.
${Revision.CreatedAt}          The date the revision was created.
${Revision.CreatedAt.Year}     The year the revision was created.
${Revision.CreatedAt.Month}    The month the revision was created.
${Revision.CreatedAt.Day}      The day of the month the revision was created.
${Asset.Name}                  The name of the asset.
${Asset.Id}                    The Id of the asset.

You can use these dynamic references to create the key patterns for your asset names. You must include at least one of the two Asset dynamic references, which are ${Asset.Name} and ${Asset.Id}.

For example, using ${Revision.Id}/${Asset.Name} as a key pattern results in Amazon S3 objects that use the revision Id and asset name (separated by a slash) as the object name.

If you export a revision with the Id testRevisionId that has two assets named asset1 and asset2, then the assets are exported to the following locations in Amazon S3:

  • <bucket>/testRevisionId/asset1

  • <bucket>/testRevisionId/asset2

Note

Your resulting objects must have unique names. If they have the same names as existing objects in the S3 bucket, your export will overwrite existing objects. If the revision you are exporting has non-unique names (for example, two assets with the same name), the export will fail. The only dynamic reference that is unique is ${Asset.Id}.
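To check a key pattern before starting an export, the expansion can be approximated locally. The sketch below is a local simulation of the documented behavior, not the service's implementation. It handles only `${Revision.Id}`, `${Asset.Name}`, and `${Asset.Id}`, because the exact text produced by the `CreatedAt` references is not specified here. The revision and asset inputs are plain dictionaries, and all function names are hypothetical.

```python
import re
from collections import Counter

ASSET_REFS = ("${Asset.Name}", "${Asset.Id}")
_REFERENCE = re.compile(r"\$\{([A-Za-z.]+)\}")


def expand_key_pattern(pattern, revision, asset):
    """Replace dynamic references in `pattern` with revision and asset values."""
    if not any(ref in pattern for ref in ASSET_REFS):
        raise ValueError("pattern must include ${Asset.Name} or ${Asset.Id}")
    values = {
        "Revision.Id": revision["Id"],
        "Asset.Name": asset["Name"],
        "Asset.Id": asset["Id"],
    }

    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise ValueError(f"reference not handled by this sketch: {name}")
        return values[name]

    return _REFERENCE.sub(lookup, pattern)


def duplicate_keys(pattern, revision, assets):
    """Return the object keys that more than one asset would map to."""
    counts = Counter(expand_key_pattern(pattern, revision, a) for a in assets)
    return sorted(key for key, count in counts.items() if count > 1)
```

With the pattern `${Revision.Id}/${Asset.Name}`, a revision Id of testRevisionId, and assets named asset1 and asset2, `expand_key_pattern` yields `testRevisionId/asset1` and `testRevisionId/asset2`, matching the locations listed above. A non-empty result from `duplicate_keys` indicates a pattern that would make the export fail; switching to `${Asset.Id}` removes the collision.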