Importing files to a FHIR Data Store - Amazon HealthLake

Importing files to a FHIR Data Store

After your FHIR Data Store has been created, you can import files from an Amazon Simple Storage Service (Amazon S3) bucket. You can use the console to create and manage import jobs, or use the import APIs. Amazon HealthLake accepts input files in newline-delimited JSON (.ndjson) format, where each line consists of a valid FHIR resource. The APIs start, describe, and list import jobs. A customer-owned or AWS-owned KMS key is required for encryption of the Amazon S3 bucket for all import jobs. To learn more about creating and using KMS keys, see Creating keys in the AWS Key Management Service Developer Guide.
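As a rough illustration of the .ndjson input format described above, the following Python sketch writes one FHIR resource per line and checks that each line parses as a JSON object with a resourceType. The function names to_ndjson and check_ndjson are hypothetical helpers, not part of any AWS SDK, and the check covers JSON syntax only; it does not validate resources against the FHIR specification.

```python
import json


def to_ndjson(resources):
    """Serialize FHIR resources (dicts) as newline-delimited JSON, one per line."""
    lines = []
    for resource in resources:
        if "resourceType" not in resource:
            raise ValueError("every FHIR resource needs a resourceType")
        # json.dumps escapes newlines inside strings, so each resource stays on one line.
        lines.append(json.dumps(resource, separators=(",", ":")))
    return "\n".join(lines) + "\n"


def check_ndjson(text):
    """Return (line_number, problem) pairs for lines that are not plausible resources."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # ignore blank lines such as the trailing one
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(obj, dict) or "resourceType" not in obj:
            problems.append((number, "missing resourceType"))
    return problems
```

Running check_ndjson over a file before uploading it to Amazon S3 can catch malformed lines early, although HealthLake remains the authority on whether a resource is accepted.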

Only one import or export job can run at a time for an active Data Store. However, users can create, read, update, or delete FHIR resources while an import job is in progress.

For each import job, HealthLake generates a manifest.json file that describes both the successes and the failures of the job. The job's output files are organized into two folders named SUCCESS and FAILURE, and users can navigate to them programmatically. Because an output file may contain sensitive information, users must provide both an output Amazon S3 bucket and an AWS KMS key for encryption.

The following is an example of the output manifest.json file. We recommend using this file as the first step in troubleshooting a failed import job, because it provides details on each file and on what caused the import job to fail.

{
    "inputDataConfig": {
        "s3Uri": "s3://inputS3Bucket/healthlake-input/invalidInput/"
    },
    "outputDataConfig": {
        "s3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/",
        "encryptionKeyID": "arn:aws:kms:us-west-2:123456789012:key/fbbbfee3-20b3-42a5-a99d-c48c655ed545"
    },
    "successOutput": {
        "successOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/SUCCESS/"
    },
    "failureOutput": {
        "failureOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/FAILURE/"
    },
    "numberOfScannedFiles": 1,
    "numberOfFilesImported": 1,
    "sizeOfScannedFilesInMB": 0.023627,
    "sizeOfDataImportedSuccessfullyInMB": 0.011232,
    "numberOfResourcesScanned": 9,
    "numberOfResourcesImportedSuccessfully": 4,
    "numberOfResourcesWithCustomerError": 5,
    "numberOfResourcesWithServerError": 0
}
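As a starting point for troubleshooting, the manifest above can be summarized with a few lines of Python. This is a minimal sketch that assumes the manifest has already been downloaded and that it carries the field names shown in the example; summarize_manifest is a hypothetical helper, and the sample below is a trimmed copy of that example.

```python
import json

# Trimmed copy of the example manifest above, used only for illustration.
SAMPLE_MANIFEST = """
{
    "failureOutput": {
        "failureOutputS3Uri": "s3://outputS3Bucket/32839038a2f47f17c2fe0f53f0c3a0ba-FHIR_IMPORT-19dd7bb7bcc8ee12a09bf6d322744a3d/FAILURE/"
    },
    "numberOfResourcesScanned": 9,
    "numberOfResourcesImportedSuccessfully": 4,
    "numberOfResourcesWithCustomerError": 5,
    "numberOfResourcesWithServerError": 0
}
"""


def summarize_manifest(manifest):
    """Pull the resource counters out of a parsed manifest.json dict."""
    customer_errors = manifest["numberOfResourcesWithCustomerError"]
    server_errors = manifest["numberOfResourcesWithServerError"]
    return {
        "scanned": manifest["numberOfResourcesScanned"],
        "imported": manifest["numberOfResourcesImportedSuccessfully"],
        "customer_errors": customer_errors,
        "server_errors": server_errors,
        # Any error count above zero means the FAILURE folder is worth inspecting.
        "has_failures": (customer_errors + server_errors) > 0,
        "failure_uri": manifest["failureOutput"]["failureOutputS3Uri"],
    }


summary = summarize_manifest(json.loads(SAMPLE_MANIFEST))
print(summary["imported"], "of", summary["scanned"], "resources imported")
```

For the example manifest, the summary flags failures because five resources were rejected with customer errors, and failure_uri points at the FAILURE folder to inspect next.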

Performing an import

You can start an import job using either the Amazon HealthLake console or the start-fhir-import-job API operation.

Importing files using the APIs

Prerequisites

When you use the Amazon HealthLake APIs, you must first create an AWS Identity and Access Management (IAM) policy and attach it to an IAM role. To learn more about IAM roles and trust policies, see IAM Policies and Permissions. You must also use a KMS key for encryption. To learn more about using KMS keys, see AWS Key Management Service.

To import files (API)

  1. Upload your data into an Amazon S3 bucket.

  2. To start a new import job, use the start-fhir-import-job operation. When you start the job, tell HealthLake the name of the Amazon S3 bucket that contains the input files, the KMS key you want to use for encryption, and the output data configuration.

  3. To learn more about a FHIR import job, use the describe-fhir-import-job operation to get the job's ID, ARN, name, start time, end time, and current status. Use list-fhir-import-jobs to show all import jobs and their statuses.
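The steps above can be sketched in Python with the AWS SDK (boto3). This is an illustrative outline rather than a definitive implementation: build_import_request and run_import are hypothetical helpers, every identifier in the usage comment is a placeholder, and the set of terminal status names reflects my reading of the HealthLake API Reference and should be checked against it. The client is passed in as a parameter so the logic can be exercised without AWS credentials.

```python
import time

# Assumption: statuses after which a job no longer changes state.
TERMINAL_STATUSES = {
    "COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED",
    "CANCEL_COMPLETED", "CANCEL_FAILED",
}


def build_import_request(datastore_id, input_s3_uri, output_s3_uri,
                         kms_key_arn, role_arn, job_name=None):
    """Assemble the keyword arguments for start_fhir_import_job."""
    request = {
        "DatastoreId": datastore_id,
        "InputDataConfig": {"S3Uri": input_s3_uri},
        "JobOutputDataConfig": {
            "S3Configuration": {"S3Uri": output_s3_uri, "KmsKeyId": kms_key_arn}
        },
        "DataAccessRoleArn": role_arn,
    }
    if job_name:
        request["JobName"] = job_name
    return request


def run_import(client, request, poll_seconds=30, sleep=time.sleep):
    """Start an import job, then poll describe_fhir_import_job until it finishes."""
    job_id = client.start_fhir_import_job(**request)["JobId"]
    while True:
        props = client.describe_fhir_import_job(
            DatastoreId=request["DatastoreId"], JobId=job_id
        )["ImportJobProperties"]
        if props["JobStatus"] in TERMINAL_STATUSES:
            return props
        sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   client = boto3.client("healthlake")
#   request = build_import_request(
#       "<datastore-id>", "s3://inputS3Bucket/healthlake-input/",
#       "s3://outputS3Bucket/", "<kms-key-arn>", "<data-access-role-arn>")
#   print(run_import(client, request)["JobStatus"])
```

Polling with a fixed interval keeps the sketch short; a production script would likely add a timeout and handle throttling errors.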

Importing files using the console

To import files (console)

  1. Upload your data into an Amazon S3 bucket.

  2. To start a new import job, identify the Amazon S3 bucket and either create or identify the IAM role and the KMS key you want to use. To learn more about IAM roles and trust policies, see IAM Roles. To learn more about using KMS keys, see AWS Key Management Service.

  3. To see the status of your import job, use ListFHIRImportJobs. For more details on the ListFHIRImportJobs API command, see ListFHIRImportJobs in the Amazon HealthLake API Reference.

IAM policies for import jobs

The IAM role that calls the Amazon HealthLake APIs must have a policy that grants access to the Amazon S3 buckets containing the input files. It must also be assigned a trust relationship that enables HealthLake to assume the role. To learn more about IAM roles and trust policies, see IAM Roles.

The role must have the following policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketPublicAccessBlock",
                "s3:GetEncryptionConfiguration"
            ],
            "Resource": [
                "arn:aws:s3:::inputS3Bucket",
                "arn:aws:s3:::outputS3Bucket"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::inputS3Bucket/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::outputS3Bucket/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "kms:DescribeKey",
                "kms:GenerateDataKey*"
            ],
            "Resource": [
                "arn:aws:kms:us-east-1:012345678910:key/d330e7fc-b56c-4216-a250-f4c43ef46e83"
            ],
            "Effect": "Allow"
        }
    ]
}

The role must have the following trust relationship:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "healthlake.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "(accountId)"
                },
                "ArnEquals": {
                    "aws:SourceArn": "arn:aws:healthlake:(region):(accountId):datastore/fhir/(datastoreId)"
                }
            }
        }
    ]
}
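Because both documents above depend on account-specific values, it can help to generate them rather than edit JSON by hand. The following Python sketch fills in the bucket names, KMS key ARN, account ID, Region, and Data Store ID; build_permissions_policy and build_trust_policy are hypothetical helpers that simply mirror the two documents shown in this section, and the values in the usage lines are placeholders.

```python
import json


def build_permissions_policy(input_bucket, output_bucket, kms_key_arn):
    """Mirror the permissions policy shown above for the given buckets and key."""
    def allow(actions, resources):
        return {"Action": actions, "Resource": resources, "Effect": "Allow"}

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow(["s3:ListBucket", "s3:GetBucketPublicAccessBlock",
                   "s3:GetEncryptionConfiguration"],
                  [f"arn:aws:s3:::{input_bucket}", f"arn:aws:s3:::{output_bucket}"]),
            allow(["s3:GetObject"], [f"arn:aws:s3:::{input_bucket}/*"]),
            allow(["s3:PutObject"], [f"arn:aws:s3:::{output_bucket}/*"]),
            allow(["kms:DescribeKey", "kms:GenerateDataKey*"], [kms_key_arn]),
        ],
    }


def build_trust_policy(account_id, region, datastore_id):
    """Mirror the trust relationship shown above, scoped to one Data Store."""
    source_arn = f"arn:aws:healthlake:{region}:{account_id}:datastore/fhir/{datastore_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": ["healthlake.amazonaws.com"]},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"aws:SourceAccount": account_id},
                "ArnEquals": {"aws:SourceArn": source_arn},
            },
        }],
    }


# Placeholder values for illustration only.
trust = build_trust_policy("123456789012", "us-west-2", "<datastore-id>")
print(json.dumps(trust, indent=4))
```

Serializing with json.dumps also guards against syntax slips such as a missing comma between the Action and Condition elements.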