Getting started with AWS HealthLake

In this chapter, you use the AWS Management Console to set up permissions, create a data store, import resources, and configure an IAM user or role to be a data lake administrator in AWS Lake Formation. The data lake administrator grants access to the Lake Formation resources needed to use Amazon Athena to query a data store.

As an alternative to using the AWS Management Console, you can perform many of the same tasks highlighted in this exercise using the AWS Command Line Interface or the AWS SDKs. Before you use the AWS Command Line Interface or SDKs, download and configure them. See AWS Command Line Interface, AWS SDK for Python, or the AWS SDK for Java for more information.

The sections in this chapter walk you through all the steps required to get started with HealthLake.

Prerequisites: Sign up for AWS

When you sign up for Amazon Web Services (AWS), your AWS account is automatically signed up for all AWS services.

If you are a new AWS customer, you can get started with AWS HealthLake at no charge. For more information, see AWS Free Usage Tier.

If you already have an AWS account, skip to the next section.

To create an AWS account
  1. Open https://portal.aws.amazon.com/billing/signup.

  2. Follow the online instructions.

    Part of the sign-up procedure involves receiving a phone call and entering a verification code on the phone keypad.

    When you sign up for an AWS account, an AWS account root user is created. The root user has access to all AWS services and resources in the account. As a security best practice, assign administrative access to a user, and use only the root user to perform tasks that require root user access.

Record your AWS account ID because you'll need it for the next task.
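If you have already configured the AWS SDK for Python, the account ID can also be read programmatically. The following is a minimal sketch, assuming you pass in a boto3 STS client created with your own credentials; the helper name get_account_id is hypothetical and not part of any AWS library.

```python
def get_account_id(sts_client):
    """Return the AWS account ID for the credentials behind sts_client.

    sts_client is expected to be a boto3 STS client, for example
    boto3.client("sts"). GetCallerIdentity returns the account ID
    under the "Account" key.
    """
    identity = sts_client.get_caller_identity()
    return identity["Account"]


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(get_account_id(boto3.client("sts")))
```

Passing the client in as a parameter keeps the helper easy to test without making a real AWS call.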

Create an IAM user

Services in AWS, such as HealthLake, require that you provide credentials to access them. This allows the service to determine whether you have permissions to access the service's resources.

We strongly recommend that you access AWS using AWS Identity and Access Management (IAM), not the credentials for your AWS account. To use IAM to access AWS, create an IAM user, add the user to an IAM group with administrative permissions, and then grant administrative permissions to the IAM user. You can then access AWS using a special URL and the IAM user's credentials.

The getting started exercises in this guide assume that you have a user with administrator privileges, because you will need to add IAM policies to IAM users and roles.

To create an administrator and sign in to the console
  1. Create an IAM user named AdminUser in your AWS account. For instructions, see Creating Your First IAM User and Administrators Group in the IAM User Guide.

  2. Sign in to the AWS Management Console using a special URL. For more information, see How Users Sign In to Your Account in the IAM User Guide.

An IAM user or role with AdministratorAccess is needed to add an IAM user or role as a data lake administrator in AWS Lake Formation.

For more information about IAM, see the IAM User Guide.

Step 1: Configuring a new IAM user or role to use HealthLake (IAM Administrator)

Persona: IAM Administrator

A user who can create IAM users and roles, and can add data lake administrators.

The steps in this topic must be carried out by an IAM administrator.

To connect your HealthLake data store to Athena, you need to provision an IAM user or role that is both a data lake administrator and a HealthLake administrator. This user or role grants access to the resources in a data store through AWS Lake Formation and has the AmazonHealthLakeFullAccess AWS managed policy attached. Follow these instructions to prepare an IAM user or role that has access to HealthLake and is a data lake administrator in AWS Lake Formation.

Important

An IAM user or role that is a data lake administrator cannot create new data lake administrators. To add additional data lake administrators, you must use an IAM user or role that has been granted AdministratorAccess.

Provision an IAM user or role to be a data lake administrator and a HealthLake administrator
  1. Add the following IAM AWS managed policy to a user or role in your organization: AmazonHealthLakeFullAccess

  2. Grant the IAM user access to AWS Lake Formation.

    • Add the following IAM AWS managed policy to a user or role in your organization: AWSLakeFormationDataAdmin

      Note

      The AWSLakeFormationDataAdmin policy grants access to all AWS Lake Formation resources. We recommend that you always use the minimum permissions required to accomplish your task. For more information, see IAM Best Practices in the IAM User Guide.

  3. Create a service role and add it to a user or role in your organization.

    • Add the following inline policy to a user or role in your organization. To learn more about adding inline policies, see Step 2: Create a service role and add it to an IAM user or role (IAM Administrator).

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "s3:PutObject",
              "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::my-bucket/*"
          },
          {
            "Effect": "Allow",
            "Action": [
              "ram:GetResourceShareInvitations",
              "ram:AcceptResourceShareInvitation",
              "glue:CreateDatabase",
              "glue:DeleteDatabase"
            ],
            "Resource": "*"
          }
        ]
      }

A service role is an IAM role that a service assumes to perform actions on your behalf. An IAM administrator can create, modify, and delete a service role from within IAM. For more information, see Creating a role to delegate permissions to an AWS service in the IAM User Guide.

For more information on the AWSLakeFormationDataAdmin policy, see Lake Formation Personas and IAM Permissions Reference in the AWS Lake Formation Developer Guide.
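The two managed policies named in this step can also be attached with the AWS SDK for Python. The following is a minimal sketch, assuming the target is an IAM role and that you pass in a boto3 IAM client; the helper name attach_admin_policies and the role name in the usage comment are hypothetical. For an IAM user, the equivalent call is attach_user_policy with UserName instead of RoleName.

```python
# ARNs of the AWS managed policies named in this step. AWS managed
# policies use the account-less "aws" partition path shown here.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonHealthLakeFullAccess",
    "arn:aws:iam::aws:policy/AWSLakeFormationDataAdmin",
]


def attach_admin_policies(iam_client, role_name):
    """Attach the HealthLake and Lake Formation managed policies to a role.

    iam_client is expected to be a boto3 IAM client. Returns the list of
    policy ARNs that were attached.
    """
    for policy_arn in MANAGED_POLICY_ARNS:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return list(MANAGED_POLICY_ARNS)


# Usage (requires boto3 and AdministratorAccess credentials):
#   import boto3
#   attach_admin_policies(boto3.client("iam"), "MyHealthLakeAdminRole")
```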

Step 2: Create a service role and add it to an IAM user or role (IAM Administrator)

Persona: IAM Administrator

A user who can create IAM users and roles, and can add data lake administrators.

For HealthLake to integrate with Athena, you need the following service role. This service role allows HealthLake to manage sharing your data store with Athena via AWS Lake Formation.

To embed an inline policy for a service role (IAM Console)
  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.

  2. In the navigation pane, choose Roles.

  3. In the list, choose the name of the role that you want to edit.

  4. Choose the Permissions tab.

  5. Choose Add inline policy.

    Note

    You cannot embed an inline policy in a service-linked role in IAM.

  6. Choose the JSON tab.

  7. Enter the following JSON policy document. For details about the IAM policy language, see IAM JSON Policy Reference in the IAM User Guide.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject"
          ],
          "Resource": "arn:aws:s3:::my-bucket/*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "ram:GetResourceShareInvitations",
            "ram:AcceptResourceShareInvitation",
            "glue:CreateDatabase",
            "glue:DeleteDatabase"
          ],
          "Resource": "*"
        }
      ]
    }
  8. When you are finished, choose Review policy. The Policy Validator reports any syntax errors.

  9. On the Review policy page, enter a Name for the policy that you are creating. Review the policy Summary to see the permissions that are granted by your policy. Then choose Create policy to save your work.

  10. After you create an inline policy, it is automatically embedded in your role.
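The console procedure above can be approximated with the AWS SDK for Python. The following is a minimal sketch, assuming an existing IAM role and a boto3 IAM client; the helper names build_inline_policy and embed_inline_policy are hypothetical, and the bucket, role, and policy names are placeholders you would replace with your own.

```python
import json


def build_inline_policy(bucket_name):
    """Build the policy document from step 7 for the given S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ram:GetResourceShareInvitations",
                    "ram:AcceptResourceShareInvitation",
                    "glue:CreateDatabase",
                    "glue:DeleteDatabase",
                ],
                "Resource": "*",
            },
        ],
    }


def embed_inline_policy(iam_client, role_name, policy_name, bucket_name):
    """Embed the inline policy in an existing role and return the document.

    iam_client is expected to be a boto3 IAM client. PutRolePolicy takes
    the policy document as a JSON string.
    """
    document = build_inline_policy(bucket_name)
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(document),
    )
    return document


# Usage (requires boto3 and IAM administrator credentials):
#   import boto3
#   embed_inline_policy(boto3.client("iam"), "MyRole", "HealthLakeAthena", "my-bucket")
```

Building the document in a separate function lets you inspect or log the policy before it is sent to IAM.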

Step 3: Add a Data Lake Administrator in Lake Formation (IAM Administrator)

Next, the IAM administrator needs to add the user or role created in step 1 as a data lake administrator in Lake Formation.

To add an IAM user or role as a data lake administrator
  1. Open the AWS Lake Formation console: https://console.aws.amazon.com/lakeformation/

    Note

    If this is your first time visiting Lake Formation, a Welcome to Lake Formation dialog box appears asking you to define a Lake Formation administrator.

  2. Assign the new user or role to be an AWS Lake Formation data lake administrator.

    • Option 1: If you received the Welcome to Lake Formation dialog box.

      1. Choose Add other AWS users or roles.

      2. Choose the down arrow (▼).

      3. Choose the HealthLake administrators that you also want to be Lake Formation administrators.

      4. Choose Get started.

    • Option 2: Use the Navigation pane (☰).

      1. Choose the Navigation pane (☰).

      2. Under Permissions, choose Administrative roles and tasks.

      3. In the Data lake administrators section, select Choose administrators.

      4. In the Manage data lake administrators dialog box, choose the down arrow (▼).

      5. Next, select or search for the HealthLake administrator users or roles that you also want to be Lake Formation administrators.

      6. Then, choose Save.

  3. Change the default security settings to be managed by Lake Formation. The HealthLake data store resources need to be managed by Lake Formation, not IAM. To update the settings, see Change the default permission model in the AWS Lake Formation Developer Guide.
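Adding a data lake administrator can also be done with the AWS SDK for Python. The following is a minimal sketch, assuming a boto3 Lake Formation client created with AdministratorAccess credentials; the helper name add_data_lake_admin is hypothetical. Because PutDataLakeSettings replaces the whole settings object, the sketch reads the current settings first and appends to the existing administrator list. It does not change the default permission model described in the last step above.

```python
def add_data_lake_admin(lf_client, principal_arn):
    """Add principal_arn to the Lake Formation data lake administrators.

    lf_client is expected to be a boto3 Lake Formation client. Returns the
    resulting list of administrator ARNs. The call is skipped when the
    principal is already an administrator.
    """
    settings = lf_client.get_data_lake_settings()["DataLakeSettings"]
    admins = settings.setdefault("DataLakeAdmins", [])
    already_admin = any(
        admin.get("DataLakePrincipalIdentifier") == principal_arn
        for admin in admins
    )
    if not already_admin:
        admins.append({"DataLakePrincipalIdentifier": principal_arn})
        # Write back the full settings object so other fields are preserved.
        lf_client.put_data_lake_settings(DataLakeSettings=settings)
    return [admin["DataLakePrincipalIdentifier"] for admin in admins]


# Usage (requires boto3; the ARN below is a placeholder):
#   import boto3
#   add_data_lake_admin(
#       boto3.client("lakeformation"),
#       "arn:aws:iam::111122223333:role/MyHealthLakeAdminRole",
#   )
```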

Step 4: Create a data store (HealthLake Administrator)

Persona: HealthLake Administrator

A user who can create IAM users and roles. Has the AdministratorAccess AWS managed policy. Has all permissions on all Lake Formation resources. Can add data lake administrators. Cannot grant Lake Formation permissions if not also designated a data lake administrator.

This exercise creates a data store and pre-populates it using Synthea data. It uses the IAM user or role you created in step 1. Synthea is preloaded sample data made available by AWS HealthLake.

To create HealthLake data store (AWS Management Console)
  1. Open the HealthLake console at https://console.aws.amazon.com/healthlake/home.

  2. Open the Navigation pane (≡).

  3. Then, choose Data Stores.

  4. Next, choose Create Data Store.

  5. In the Data Store settings section, for Data Store name, specify a name.

  6. (Optional) In the Data Store settings section, for Preload sample data, select the checkbox to preload Synthea data.

  7. In the Data Store encryption section, choose either Use AWS owned key (default) or Choose a different AWS KMS key (advanced).

    Note

    We recommend that customers use a customer managed key for data stores that contain personally identifiable information.

  8. In the Tags - optional section, you can add tags to your data store.

  9. Next, choose Create Data Store.

When your data store is ready, the status changes to Ready.

After you create a data store and populate it with preloaded or imported data, you can start querying your data store using SQL in Amazon Athena. To access your data in Athena, you need to connect your data store. For more information, see Connecting your data store to Amazon Athena.
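The console steps for creating a preloaded data store can be approximated with the AWS SDK for Python. The following is a minimal sketch, assuming a boto3 HealthLake client; the helper name create_preloaded_datastore and its polling parameters are hypothetical. It uses the DatastoreStatus values returned by the API (CREATING and ACTIVE), which may be labeled differently from the status shown in the console, and it leaves encryption and tags at their defaults.

```python
import time


def create_preloaded_datastore(hl_client, name, poll_seconds=30,
                               max_polls=60, sleep=time.sleep):
    """Create an R4 data store preloaded with Synthea data and wait for it.

    hl_client is expected to be a boto3 HealthLake client. Returns the
    DatastoreProperties dict once the data store reports ACTIVE.
    """
    response = hl_client.create_fhir_datastore(
        DatastoreName=name,
        DatastoreTypeVersion="R4",
        PreloadDataConfig={"PreloadDataType": "SYNTHEA"},
    )
    datastore_id = response["DatastoreId"]

    for _ in range(max_polls):
        described = hl_client.describe_fhir_datastore(DatastoreId=datastore_id)
        properties = described["DatastoreProperties"]
        status = properties["DatastoreStatus"]
        if status == "ACTIVE":
            return properties
        if status != "CREATING":
            raise RuntimeError(f"Unexpected data store status: {status}")
        sleep(poll_seconds)

    raise TimeoutError(f"Data store {datastore_id} did not become ACTIVE in time")


# Usage (requires boto3 and HealthLake permissions):
#   import boto3
#   props = create_preloaded_datastore(boto3.client("healthlake"), "my-data-store")
#   print(props["DatastoreEndpoint"])
```

The sleep function is a parameter only so that the polling loop can be exercised without waiting; in normal use the default is sufficient.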

Preloaded data types

Persona: HealthLake Administrator

A user who can create IAM users and roles. Has the AdministratorAccess AWS managed policy. Has all permissions on all Lake Formation resources. Can add data lake administrators. Cannot grant Lake Formation permissions if not also designated a data lake administrator.

HealthLake supports only SYNTHEA as a preloaded data type. Synthea is a synthetic patient generator that models the medical history of model-generated patients. It’s an open-source Git repository that allows HealthLake to generate FHIR R4-compliant resource bundles so that users can test models without using actual patient data.

The following resource types are available in preloaded data stores.

AllergyIntolerance      Location
CarePlan                MedicationAdministration
CareTeam                MedicationRequest
Claim                   Observation
Condition               Organization
Device                  Patient
DiagnosticReport        Practitioner
Encounter               PractitionerRole
ExplanationOfBenefit    Procedure
ImagingStudy            Provenance
Immunization
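When you inspect exported or sample FHIR data, it can help to check which resource types a bundle contains against the list above. The following is a minimal sketch, assuming the input is a FHIR Bundle already parsed into a Python dict; the helper name count_resource_types is hypothetical, and the set uses the FHIR R4 spelling ExplanationOfBenefit.

```python
from collections import Counter

# Resource types listed above as available in preloaded data stores.
PRELOADED_RESOURCE_TYPES = frozenset({
    "AllergyIntolerance", "CarePlan", "CareTeam", "Claim", "Condition",
    "Device", "DiagnosticReport", "Encounter", "ExplanationOfBenefit",
    "ImagingStudy", "Immunization", "Location", "MedicationAdministration",
    "MedicationRequest", "Observation", "Organization", "Patient",
    "Practitioner", "PractitionerRole", "Procedure", "Provenance",
})


def count_resource_types(bundle):
    """Count resources per type in a FHIR Bundle dict.

    Returns (counts, unexpected), where counts is a Counter keyed by
    resourceType and unexpected is a sorted list of types that are not
    in PRELOADED_RESOURCE_TYPES.
    """
    counts = Counter(
        entry["resource"]["resourceType"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    )
    unexpected = sorted(set(counts) - PRELOADED_RESOURCE_TYPES)
    return counts, unexpected
```

For example, a bundle with two Patient entries and one Medication entry yields a count of 2 for Patient and reports Medication as unexpected.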