Importing files into HealthLake data stores
After you create your HealthLake data store, you can import files from an Amazon Simple Storage Service (Amazon S3) bucket. To start an import job, use the HealthLake console or the StartFHIRImportJob API operation. HealthLake accepts input files in newline-delimited JSON (.ndjson) format, where each line is a valid FHIR resource. Use the DescribeFHIRImportJob and ListFHIRImportJobs API operations to describe and list import jobs. For all import jobs, the Amazon S3 bucket must be encrypted with a customer-owned or AWS-owned KMS key. To learn more about creating and using KMS keys, see Creating keys in the AWS Key Management Service Developer Guide.
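Before uploading, it can help to confirm that a file has the expected .ndjson shape. The following is a minimal sketch, not a HealthLake feature: the function name check_ndjson_lines and the sample resources are hypothetical, and the check only confirms that each line parses as a JSON object with a resourceType string. It does not validate the resource against the FHIR specification, and skipping blank lines is a choice made in this sketch.

```python
import json


def check_ndjson_lines(lines):
    """Return (valid_count, problems) for an iterable of NDJSON lines.

    Shape check only: each non-empty line must parse as a JSON object
    that carries a "resourceType" string.
    """
    valid = 0
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # this sketch ignores blank lines
        try:
            resource = json.loads(text)
        except json.JSONDecodeError as err:
            problems.append((number, f"not valid JSON: {err.msg}"))
            continue
        if not isinstance(resource, dict) or not isinstance(
            resource.get("resourceType"), str
        ):
            problems.append((number, "missing resourceType"))
            continue
        valid += 1
    return valid, problems


# Hypothetical sample input: two well-formed lines and two broken ones.
sample = [
    '{"resourceType": "Patient", "id": "example-1"}',
    '{"resourceType": "Observation", "id": "example-2", "status": "final"}',
    '{"id": "no-type"}',
    '{"resourceType": "Patient", ',
]
valid_count, problems = check_ndjson_lines(sample)
print(valid_count, problems)
```

To check a real file, pass an open file handle instead of the sample list, for example check_ndjson_lines(open("patients.ndjson")), where the file name is a placeholder.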
Users can queue import or export jobs for their HealthLake data store. These asynchronous jobs are processed in first-in, first-out (FIFO) order. Users can create, read, update, or delete FHIR resources while an import or export job is in progress.
For each import job, HealthLake generates a manifest.json file that describes both the successes and failures of the job. The output files are organized into two folders, named SUCCESS and FAILURE, and users can navigate to them programmatically. Because an output file may contain sensitive information, users must provide both an output Amazon S3 bucket and an AWS KMS key for encryption.
The following is an example of the output manifest.json file. Use this file as the first step in troubleshooting a failed import job, because it provides details on each file and on what caused the import job to fail.
{
  "inputDataConfig": {
    "s3Uri": "s3://inputS3Bucket/healthlake-input/invalidInput/"
  },
  "outputDataConfig": {
    "s3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/",
    "encryptionKeyID": "arn:aws:kms:us-west-2:123456789012:key/fbbbfee3-20b3-42a5-a99d-c48c655ed545"
  },
  "successOutput": {
    "successOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/SUCCESS/"
  },
  "failureOutput": {
    "failureOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/FAILURE/"
  },
  "numberOfScannedFiles": 1,
  "numberOfFilesImported": 1,
  "sizeOfScannedFilesInMB": 0.023627,
  "sizeOfDataImportedSuccessfullyInMB": 0.011232,
  "numberOfResourcesScanned": 9,
  "numberOfResourcesImportedSuccessfully": 4,
  "numberOfResourcesWithCustomerError": 5,
  "numberOfResourcesWithServerError": 0
}
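As a first troubleshooting step, the counts in the manifest can be read programmatically. The sketch below assumes the manifest has already been downloaded and uses only the field names shown in the example above; the function name summarize_manifest and the needs_review flag are hypothetical names introduced for illustration.

```python
import json


def summarize_manifest(manifest):
    """Pull the resource counts and the FAILURE folder URI out of a parsed manifest."""
    scanned = manifest["numberOfResourcesScanned"]
    imported = manifest["numberOfResourcesImportedSuccessfully"]
    return {
        "scanned": scanned,
        "imported": imported,
        "customer_errors": manifest["numberOfResourcesWithCustomerError"],
        "server_errors": manifest["numberOfResourcesWithServerError"],
        # Hypothetical flag: true when some scanned resources were not imported.
        "needs_review": imported < scanned,
        "failure_uri": manifest["failureOutput"]["failureOutputS3Uri"],
    }


# A trimmed copy of the example manifest, keeping only the fields read above.
MANIFEST_TEXT = """
{
  "failureOutput": {
    "failureOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/FAILURE/"
  },
  "numberOfResourcesScanned": 9,
  "numberOfResourcesImportedSuccessfully": 4,
  "numberOfResourcesWithCustomerError": 5,
  "numberOfResourcesWithServerError": 0
}
"""

summary = summarize_manifest(json.loads(MANIFEST_TEXT))
print(summary)
```

For the example manifest, 4 of the 9 scanned resources were imported and 5 had customer errors, so the FAILURE folder is the place to look next.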
Performing an import
You can start an import job by using either the AWS HealthLake console or the StartFHIRImportJob API operation (start-fhir-import-job in the AWS CLI).
Importing files by using the API operations
Prerequisites
When you use the AWS HealthLake API operations, you must first create an AWS Identity and Access Management (IAM) policy and attach it to an IAM role. To learn more about IAM roles and trust policies, see IAM Policies and Permissions. You must also use a KMS key for encryption. To learn more about using KMS keys, see AWS Key Management Service.
To import files (API), use the following steps.
- Upload your data into an Amazon S3 bucket.
- To start a new import job, use the start-fhir-import-job operation. When you start the job, give HealthLake the name of the Amazon S3 bucket that contains the input files, the KMS key you want to use for encryption, and the output data configuration.
- To learn more about a FHIR import job, use the describe-fhir-import-job operation to get the job's ID, ARN, name, start time, end time, and current status. Use list-fhir-import-jobs to show all import jobs and their statuses.
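The steps above can be sketched with the AWS SDK for Python (boto3), whose healthlake client exposes start_fhir_import_job and describe_fhir_import_job. This is a sketch under stated assumptions, not a definitive implementation: the helper names start_import and wait_for_import are hypothetical, the set of statuses treated as "still running" should be checked against the JobStatus values in the AWS HealthLake API Reference, and the client is passed in as a parameter so the functions can be exercised without AWS access.

```python
import time

# Statuses treated as "still running" in this sketch. This is an assumption;
# confirm it against the JobStatus values in the AWS HealthLake API Reference.
ACTIVE_STATUSES = {"SUBMITTED", "QUEUED", "IN_PROGRESS"}


def start_import(client, datastore_id, input_uri, output_uri,
                 kms_key_id, role_arn, job_name):
    """Start an import job and return its job ID."""
    response = client.start_fhir_import_job(
        JobName=job_name,
        DatastoreId=datastore_id,
        InputDataConfig={"S3Uri": input_uri},
        JobOutputDataConfig={
            "S3Configuration": {"S3Uri": output_uri, "KmsKeyId": kms_key_id}
        },
        DataAccessRoleArn=role_arn,
    )
    return response["JobId"]


def wait_for_import(client, datastore_id, job_id,
                    poll_seconds=30, sleep=time.sleep):
    """Poll describe_fhir_import_job until the job leaves ACTIVE_STATUSES."""
    while True:
        props = client.describe_fhir_import_job(
            DatastoreId=datastore_id, JobId=job_id
        )["ImportJobProperties"]
        if props["JobStatus"] not in ACTIVE_STATUSES:
            return props
        sleep(poll_seconds)


# Example wiring (needs the third-party boto3 package, AWS credentials,
# and real values in place of the placeholders):
#   import boto3
#   client = boto3.client("healthlake")
#   job_id = start_import(client, "<datastore-id>", "s3://inputS3Bucket/...",
#                         "s3://outputS3Bucket/...", "<kms-key-arn>",
#                         "<role-arn>", "my-import-job")
#   print(wait_for_import(client, "<datastore-id>", job_id)["JobStatus"])
```

Passing the client in, instead of creating it inside the functions, keeps the polling logic easy to test with a stand-in object. Once the job finishes, the manifest.json file in the output location is the place to look for per-file details.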
Importing files by using the console
To import files (console), use the following steps.
- Upload your data into an Amazon S3 bucket.
- To start a new import job, identify the Amazon S3 bucket, and either create or identify the IAM role and the KMS key you want to use. To learn more about IAM roles and trust policies, see IAM Roles. To learn more about using KMS keys, see AWS Key Management Service.
- To see the status of your import job, use ListFHIRImportJobs. For more details on the ListFHIRImportJobs API operation, see ListFHIRImportJobs in the AWS HealthLake API Reference.
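Checking job status with ListFHIRImportJobs might look like the following sketch, which uses the list_fhir_import_jobs method of the boto3 healthlake client and follows NextToken across pages. The helper name list_import_jobs is hypothetical, and the client is passed in so the function can be tried with a stand-in object.

```python
def list_import_jobs(client, datastore_id):
    """Yield (job_id, job_name, status) for each import job on a data store."""
    kwargs = {"DatastoreId": datastore_id}
    while True:
        page = client.list_fhir_import_jobs(**kwargs)
        for job in page.get("ImportJobPropertiesList", []):
            yield job["JobId"], job.get("JobName"), job["JobStatus"]
        token = page.get("NextToken")
        if not token:
            break  # no further pages
        kwargs["NextToken"] = token


# Example wiring (needs the third-party boto3 package and AWS credentials):
#   import boto3
#   client = boto3.client("healthlake")
#   for job_id, name, status in list_import_jobs(client, "<datastore-id>"):
#       print(job_id, name, status)
```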
IAM policies for import jobs
The IAM role that calls the AWS HealthLake API operations must have a policy that grants access to the Amazon S3 buckets containing the input files. It must also be assigned a trust relationship that enables HealthLake to assume the role. To learn more about IAM roles and trust policies, see IAM Roles.
The role must have the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketPublicAccessBlock",
        "s3:GetEncryptionConfiguration"
      ],
      "Resource": [
        "arn:aws:s3:::inputS3Bucket",
        "arn:aws:s3:::outputS3Bucket"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::inputS3Bucket/*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::outputS3Bucket/*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "kms:DescribeKey",
        "kms:GenerateDataKey*"
      ],
      "Resource": [
        "arn:aws:kms:us-east-1:012345678910:key/d330e7fc-b56c-4216-a250-f4c43ef46e83"
      ],
      "Effect": "Allow"
    }
  ]
}
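If you manage several buckets or keys, the same policy can be generated from parameters instead of edited by hand. The following is a minimal sketch: the function name build_import_policy is hypothetical, and the bucket names and key ARN are the placeholder values from the policy above.

```python
import json


def build_import_policy(input_bucket, output_bucket, kms_key_arn):
    """Return the import-job permissions policy as a Python dict."""
    input_arn = f"arn:aws:s3:::{input_bucket}"
    output_arn = f"arn:aws:s3:::{output_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # bucket-level checks on both buckets
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketPublicAccessBlock",
                    "s3:GetEncryptionConfiguration",
                ],
                "Resource": [input_arn, output_arn],
            },
            {   # read the input files
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{input_arn}/*"],
            },
            {   # write the job output
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"{output_arn}/*"],
            },
            {   # use the KMS key for encryption
                "Effect": "Allow",
                "Action": ["kms:DescribeKey", "kms:GenerateDataKey*"],
                "Resource": [kms_key_arn],
            },
        ],
    }


policy = build_import_policy(
    "inputS3Bucket",
    "outputS3Bucket",
    "arn:aws:kms:us-east-1:012345678910:key/d330e7fc-b56c-4216-a250-f4c43ef46e83",
)
print(json.dumps(policy, indent=2))
```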
The role must have the following trust relationship.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "healthlake.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "(accountId)"
        },
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:healthlake:(region):(accountId):datastore/fhir/(datastoreId)"
        }
      }
    }
  ]
}
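The (accountId), (region), and (datastoreId) placeholders must be replaced with real values before the trust relationship is used. One way to fill them in is sketched below. The function name build_trust_policy is hypothetical, and the account ID, Region, and data store ID passed to it are illustrative values only.

```python
import json


def build_trust_policy(account_id, region, datastore_id):
    """Return the trust relationship with the placeholders filled in."""
    source_arn = (
        f"arn:aws:healthlake:{region}:{account_id}:datastore/fhir/{datastore_id}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": ["healthlake.amazonaws.com"]},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnEquals": {"aws:SourceArn": source_arn},
                },
            }
        ],
    }


# Illustrative values only; substitute your own account, Region, and data store.
trust = build_trust_policy("123456789012", "us-west-2", "EXAMPLE-DATASTORE-ID")
trust_json = json.dumps(trust, indent=2)
print(trust_json)
```

Building the document as a dict and serializing it with json.dumps avoids hand-editing mistakes such as a missing comma between elements.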