HealthOmics Analytics

HealthOmics Analytics supports the storage and analysis of genomic variants and annotations.

With the variant and annotation store API operations, you can perform the following actions:

  • Creating and managing variant stores

  • Importing variant data and managing import jobs

  • Creating and managing annotation stores

  • Importing annotation data and managing import jobs

  • Sharing analytics store data with collaborators

  • Tagging AWS resources, such as variant stores and annotation stores
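As an illustration of the first action in the list, the following is a minimal sketch that builds a CreateVariantStore request with the AWS SDK for Python (boto3). The store name, reference ARN, and tag values are hypothetical placeholders, and the helper function names are assumptions introduced for this example only. The request is built separately from the API call so that it can be inspected without AWS credentials.

```python
from typing import Optional


def build_create_variant_store_request(
    name: str, reference_arn: str, tags: Optional[dict] = None
) -> dict:
    """Build keyword arguments for the CreateVariantStore operation."""
    request = {
        "name": name,
        # The genome reference that the store's variants are called against.
        "reference": {"referenceArn": reference_arn},
    }
    if tags:
        request["tags"] = dict(tags)
    return request


def create_variant_store(request: dict) -> dict:
    """Send the request. Requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("omics").create_variant_store(**request)


# Hypothetical values, for illustration only.
example_request = build_create_variant_store_request(
    name="my_variant_store",
    reference_arn=(
        "arn:aws:omics:us-west-2:111122223333:"
        "referenceStore/1234567890/reference/0987654321"
    ),
    tags={"project": "demo"},
)
```

Calling create_variant_store(example_request) would submit the request. Annotation stores follow a similar pattern through the CreateAnnotationStore operation, which takes different parameters.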

Variant stores support data in VCF format, and annotation stores support TSV/CSV and GFF3 formats. After your data is in a HealthOmics Analytics store, access to the data is managed through AWS Lake Formation, and you can query it by using Amazon Athena. Queries must use Athena query engine version 3. To read more about Athena query engine versions, see the Amazon Athena documentation.
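The following sketch shows one way to check the engine version requirement before running a query with boto3. The database name, table name, and Amazon S3 output location are hypothetical, and the helper function names are introduced for this example. The check assumes that the workgroup's effective engine version is reported as a string ending in "version 3", such as "Athena engine version 3".

```python
def is_engine_version_3(work_group_response: dict) -> bool:
    """Return True if a GetWorkGroup response reports engine version 3."""
    engine = (
        work_group_response.get("WorkGroup", {})
        .get("Configuration", {})
        .get("EngineVersion", {})
    )
    effective = engine.get("EffectiveEngineVersion", "")
    return effective.strip().lower().endswith("version 3")


def build_query_request(
    database: str,
    table: str,
    output_location: str,
    work_group: str = "primary",
    limit: int = 10,
) -> dict:
    """Build keyword arguments for the Athena StartQueryExecution operation."""
    return {
        "QueryString": f'SELECT * FROM "{table}" LIMIT {int(limit)}',
        "QueryExecutionContext": {"Database": database},
        "WorkGroup": work_group,
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: dict) -> str:
    """Run the query and return its execution ID. Requires boto3 and credentials."""
    import boto3  # imported here so the helpers above work without boto3

    athena = boto3.client("athena")
    work_group = athena.get_work_group(WorkGroup=request["WorkGroup"])
    if not is_engine_version_3(work_group):
        raise RuntimeError("The workgroup must use Athena engine version 3.")
    return athena.start_query_execution(**request)["QueryExecutionId"]
```

For example, run_query(build_query_request("my_database", "my_variant_table", "s3://amzn-s3-demo-bucket/results/")) would start a query that returns ten rows, assuming those resources exist in your account.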

In the AWS Lake Formation console, view the permissions by choosing Data lake permissions in the primary navigation bar. On the Data permissions page, a table shows the resource types, databases, and ARNs that are related to a shared resource under RAM Resource Share. If you need to accept an AWS Resource Access Manager (AWS RAM) resource share, AWS Lake Formation notifies you in the console.

AWS HealthOmics can implicitly accept the AWS RAM resource shares during store creation. For this to work, the IAM user or role that calls the CreateVariantStore or CreateAnnotationStore API operation must be allowed to perform the following actions:

  • ram:GetResourceShareInvitations - This action allows AWS HealthOmics to find the invitations.

  • ram:AcceptResourceShareInvitation - This action allows AWS HealthOmics to accept the invitation by using a forward access sessions (FAS) token.

Without these permissions, you see an authorization error during store creation.

Here is a sample policy that includes these actions.

{
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "omics:*",
                "ram:AcceptResourceShareInvitation",
                "ram:GetResourceShareInvitations"
            ],
            "Resource": "*"
        }
    ]
}
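To catch a missing permission before store creation fails, you could check a policy document for the two AWS RAM actions. The following is a simplified sketch, not an IAM policy evaluator: it only looks at Allow statements and their Action lists, and it ignores Deny statements, NotAction, Resource, and Condition elements. The function name is an assumption introduced for this example.

```python
from fnmatch import fnmatchcase

REQUIRED_RAM_ACTIONS = (
    "ram:GetResourceShareInvitations",
    "ram:AcceptResourceShareInvitation",
)


def missing_ram_actions(policy: dict) -> list:
    """Return the required RAM actions that no Allow statement covers."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be an object
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a string
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        allowed_patterns.extend(action.lower() for action in actions)

    return [
        required
        for required in REQUIRED_RAM_ACTIONS
        if not any(
            fnmatchcase(required.lower(), pattern) for pattern in allowed_patterns
        )
    ]


sample_policy = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "omics:*",
                "ram:AcceptResourceShareInvitation",
                "ram:GetResourceShareInvitations",
            ],
            "Resource": "*",
        }
    ]
}
```

With the sample policy, missing_ram_actions(sample_policy) returns an empty list. A policy that allows only omics:* returns both AWS RAM actions.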

To make a shared resource that HealthOmics Analytics users can query, the default access controls must be disabled. To learn more about disabling default access controls, see Changing the default security settings for your data lake in the Lake Formation documentation. You can create resource links individually or as a group, so that you can access data in Athena or other AWS services.

Creating resource links in the AWS Lake Formation console and sharing them with HealthOmics Analytics users

  1. Open the AWS Lake Formation console.

  2. In the primary navigation bar, choose Databases.

  3. In the Databases table, choose the name of the HealthOmics Analytics data store.

  4. On the HealthOmics Analytics data store details page, choose Actions (▼).

  5. Choose Create resource link.

  6. Enter a Resource link name.

  7. Choose Create.

  8. The new resource link is now listed under Databases.
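The console steps above describe the documented procedure. As an alternative sketch, a database resource link can also be created programmatically, because a resource link is an AWS Glue Data Catalog database that points to a target database. The example below builds the input for the AWS Glue CreateDatabase operation in boto3. The link name, owner account ID, and shared database name are hypothetical placeholders, and the helper function names are assumptions introduced for this example.

```python
def build_resource_link_request(
    link_name: str, target_catalog_id: str, target_database: str
) -> dict:
    """Build keyword arguments for a Glue CreateDatabase call that creates a link."""
    return {
        "DatabaseInput": {
            "Name": link_name,
            # TargetDatabase makes this database a resource link
            # to a database in another catalog (the owner account).
            "TargetDatabase": {
                "CatalogId": target_catalog_id,
                "DatabaseName": target_database,
            },
        }
    }


def create_resource_link(request: dict) -> None:
    """Create the resource link. Requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("glue").create_database(**request)


# Hypothetical values, for illustration only.
link_request = build_resource_link_request(
    link_name="my_variant_store_link",
    target_catalog_id="111122223333",
    target_database="shared_variant_store_db",
)
```

Calling create_resource_link(link_request) would create the link, which should then appear under Databases in the Lake Formation console.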

Now, the Lake Formation database administrator needs to grant access to this shared resource using Grant on target.

  1. Open the AWS Lake Formation console.

  2. In the primary navigation bar, choose Databases.

  3. On the Databases page, choose the radio button next to the name of the resource link that you previously created.

  4. Next, choose Actions (▼).

  5. Then, choose Grant on target.

  6. On the Grant data permissions page under Principals, choose IAM users or roles.

  7. Under IAM users or roles, use the down arrow (▼) to find the user to whom you want to grant access.

  8. Next, in the LF-Tags or catalog resources card, select the Named data catalog resources option.

  9. Under Tables - optional, use the down arrow (▼) to choose All tables or a table that you previously created.

  10. In the Table permissions card, under Table permissions, choose Describe and Select.

  11. Next, choose Save.
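The grant described in the preceding steps can also be sketched with the Lake Formation GrantPermissions operation in boto3. This example assumes that Grant on target corresponds to granting DESCRIBE and SELECT on the tables of the target (shared) database rather than on the resource link itself. The principal ARN, catalog ID, and database name are hypothetical placeholders, and the helper function names are introduced for this example.

```python
from typing import Optional


def build_grant_request(
    principal_arn: str,
    target_catalog_id: str,
    target_database: str,
    table_name: Optional[str] = None,
) -> dict:
    """Build keyword arguments for the Lake Formation GrantPermissions operation."""
    table = {"CatalogId": target_catalog_id, "DatabaseName": target_database}
    if table_name is None:
        table["TableWildcard"] = {}  # all tables in the database
    else:
        table["Name"] = table_name  # one specific table
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": table},
        "Permissions": ["DESCRIBE", "SELECT"],
    }


def grant_on_target(request: dict) -> None:
    """Grant the permissions. Requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("lakeformation").grant_permissions(**request)


# Hypothetical values, for illustration only.
grant_request = build_grant_request(
    principal_arn="arn:aws:iam::444455556666:user/analyst",
    target_catalog_id="111122223333",
    target_database="shared_variant_store_db",
)
```

Passing table_name limits the grant to one table. Omitting it uses a table wildcard, which matches the All tables choice in the console.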

To view the Lake Formation permissions that you have granted, choose Data lake permissions from the primary navigation pane. The table shows all databases and resource links that you have created.