Step 4: Set up permissions for a Delta Lake table
In this section, you'll learn how to create a Delta Lake table with a symlink manifest file in the AWS Glue Data Catalog, set up data permissions in AWS Lake Formation, and query the data using Amazon Athena.
To create a Delta Lake table
In this step, you’ll run an AWS Glue job that creates a Delta Lake transactional table in the Data Catalog.
- Open the AWS Glue console at https://console.aws.amazon.com/glue/ in the US East (N. Virginia) Region as the data lake administrator user.
- Choose jobs from the left navigation pane.
- Select native-delta-create.
- Under Actions, choose Edit job.
- Under Job details, expand Advanced properties, and check the box next to Use AWS Glue Data Catalog as the Hive metastore to add the table metadata in the AWS Glue Data Catalog. This specifies AWS Glue Data Catalog as the metastore for the Data Catalog resources used in the job and enables Lake Formation permissions to be applied later on the catalog resources.
- Choose Save.
- Choose Run under Actions.
This job creates a Delta Lake table named product in the lfdeltadb database. Verify the product table in the Lake Formation console.
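The console steps above can also be approximated from a script. The following is a minimal sketch using boto3, not the tutorial's official method. It assumes the job name, database, and table from this section. The helper function names are hypothetical, and passing `--enable-glue-datacatalog` as a run argument is an assumed stand-in for selecting the Use AWS Glue Data Catalog as the Hive metastore check box in the console.

```python
"""Sketch: run the native-delta-create job and check that the product table exists."""
import time

REGION = "us-east-1"  # US East (N. Virginia), as used in this tutorial


def build_job_run_request(job_name="native-delta-create"):
    """Build keyword arguments for glue.start_job_run (no AWS call is made here)."""
    return {
        "JobName": job_name,
        # Assumption: this run argument mirrors the console check box for
        # using the Data Catalog as the Hive metastore.
        "Arguments": {"--enable-glue-datacatalog": "true"},
    }


def build_get_table_request(database="lfdeltadb", table="product"):
    """Build keyword arguments for glue.get_table."""
    return {"DatabaseName": database, "Name": table}


def run_job_and_verify(poll_seconds=30):
    """Requires boto3 and credentials for the data lake administrator user."""
    import boto3  # imported lazily so the builders above work without boto3

    glue = boto3.client("glue", region_name=REGION)
    request = build_job_run_request()
    run_id = glue.start_job_run(**request)["JobRunId"]

    # Poll until the job run reaches a terminal state.
    terminal = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}
    state = "STARTING"
    while state not in terminal:
        time.sleep(poll_seconds)
        run = glue.get_job_run(JobName=request["JobName"], RunId=run_id)
        state = run["JobRun"]["JobRunState"]
    if state != "SUCCEEDED":
        raise RuntimeError(f"Job run {run_id} ended in state {state}")

    # Raises EntityNotFoundException if the table was not created.
    table = glue.get_table(**build_get_table_request())["Table"]
    return table["Name"]
```

Calling `run_job_and_verify()` starts the job, waits for it to finish, and returns the table name if the table is present in the Data Catalog.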
To register the data location with Lake Formation
Next, register the Amazon S3 path as the root location of your data lake.
- Open the Lake Formation console at https://console.aws.amazon.com/lakeformation/ as the data lake administrator user.
- In the navigation pane, under Register and ingest, choose Data location.
- On the upper right of the console, choose Register location.
- On the Register location page, enter the following:
  - Amazon S3 path – Choose Browse and select lf-otf-datalake-123456789012. Click the right arrow (>) next to the Amazon S3 root location to navigate to the s3/buckets/lf-otf-datalake-123456789012/transactionaldata/native-delta location.
  - IAM role – Choose LF-OTF-RegisterRole as the IAM role.
- Choose Register location.
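For readers who prefer scripting, the registration above can be sketched with the Lake Formation `register_resource` API in boto3. This is an illustrative sketch only. The bucket, prefix, and role name come from this section. The account ID in the role ARN is an assumption inferred from the bucket name, and the helper names are hypothetical.

```python
"""Sketch: register the Delta Lake S3 location with Lake Formation."""

REGION = "us-east-1"  # US East (N. Virginia)
BUCKET = "lf-otf-datalake-123456789012"
PREFIX = "transactionaldata/native-delta"
# Assumption: the account ID matches the suffix of the bucket name.
ROLE_ARN = "arn:aws:iam::123456789012:role/LF-OTF-RegisterRole"


def s3_location_arn(bucket, prefix=""):
    """Return the S3 ARN for a bucket and optional key prefix."""
    prefix = prefix.strip("/")
    return f"arn:aws:s3:::{bucket}/{prefix}" if prefix else f"arn:aws:s3:::{bucket}"


def build_register_request(bucket=BUCKET, prefix=PREFIX, role_arn=ROLE_ARN):
    """Build keyword arguments for lakeformation.register_resource."""
    return {
        "ResourceArn": s3_location_arn(bucket, prefix),
        # Use the tutorial's custom role instead of the service-linked role.
        "UseServiceLinkedRole": False,
        "RoleArn": role_arn,
    }


def register_location():
    """Requires boto3 and credentials for the data lake administrator user."""
    import boto3  # imported lazily so the builders above work without boto3

    lakeformation = boto3.client("lakeformation", region_name=REGION)
    lakeformation.register_resource(**build_register_request())
```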
To grant data lake permissions on the Delta Lake table
In this step, we'll grant data lake permissions to the business analyst user.
- Under Data lake permissions, choose Grant.
- On the Grant data permissions screen, choose IAM users and roles.
- Choose lf-consumer-analystuser from the drop down.
- Choose Named data catalog resource.
- For Databases, choose lfdeltadb.
- For Tables, choose product.
Next, you can grant column-based access by specifying columns.
- Under Table permissions, choose Select.
- Under Data permissions, choose Column-based access, and then choose Include columns.
- Choose the product_name, price, and category columns.
- Choose Grant.
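The column-level grant above corresponds roughly to a single Lake Formation `grant_permissions` call. The sketch below shows the shape of that request under stated assumptions: the principal is assumed to be an IAM user named lf-consumer-analystuser in account 123456789012 (the ARN is not given in this section), and the helper names are hypothetical.

```python
"""Sketch: grant SELECT on three columns of lfdeltadb.product."""

REGION = "us-east-1"  # US East (N. Virginia)
# Assumption: the principal is an IAM user in the tutorial account.
ANALYST_ARN = "arn:aws:iam::123456789012:user/lf-consumer-analystuser"


def build_grant_request(principal_arn, database, table, columns):
    """Build keyword arguments for lakeformation.grant_permissions."""
    if not columns:
        raise ValueError("Column-based access needs at least one column")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Only these columns are included in the grant.
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def grant_column_access():
    """Requires boto3 and credentials for the data lake administrator user."""
    import boto3  # imported lazily so the builder above works without boto3

    lakeformation = boto3.client("lakeformation", region_name=REGION)
    request = build_grant_request(
        ANALYST_ARN, "lfdeltadb", "product", ["product_name", "price", "category"]
    )
    lakeformation.grant_permissions(**request)
```

With this grant, the analyst can select only the listed columns; other columns of the table are not covered by the permission.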
To query the Delta Lake table using Athena
Now start querying the Delta Lake table you created using Athena. If it is the first time you are running queries in Athena, you need to configure a query result location. For more information, see Specifying a query result location.
- Log out as the data lake administrator user and log in as BusinessAnalystUser in the US East (N. Virginia) Region using the password noted earlier from the AWS CloudFormation output.
- Open the Athena console at https://console.aws.amazon.com/athena/.
- Choose Settings and select Manage.
- In the Location of query result box, enter the path to the bucket that you created in AWS CloudFormation outputs. Copy the value of AthenaQueryResultLocation (s3://lf-otf-tutorial-123456789012/athena-results/) and choose Save.
- Run the following query to preview 10 records stored in the Delta Lake table:
select * from lfdeltadb.product limit 10;
For more information on querying Delta Lake tables, see the Querying Delta Lake tables section in the Amazon Athena User Guide.
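The same preview query can be run outside the console. The following is a rough sketch using the Athena API in boto3, assuming credentials for the business analyst user and the query result location shown in this section. The helper names are hypothetical, and error handling is kept minimal.

```python
"""Sketch: run the preview query through the Athena API and read the rows."""
import time

REGION = "us-east-1"  # US East (N. Virginia)
QUERY = "select * from lfdeltadb.product limit 10;"
OUTPUT_LOCATION = "s3://lf-otf-tutorial-123456789012/athena-results/"


def rows_to_dicts(rows):
    """Convert Athena ResultSet rows (header row first) into a list of dicts."""
    if not rows:
        return []
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    records = []
    for row in rows[1:]:
        # Cells for NULL values have no VarCharValue key.
        values = [cell.get("VarCharValue") for cell in row["Data"]]
        records.append(dict(zip(header, values)))
    return records


def preview_product_table(poll_seconds=2):
    """Requires boto3 and credentials for the business analyst user."""
    import boto3  # imported lazily so rows_to_dicts works without boto3

    athena = boto3.client("athena", region_name=REGION)
    query_id = athena.start_query_execution(
        QueryString=QUERY,
        ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
    )["QueryExecutionId"]

    # Poll until the query leaves the QUEUED and RUNNING states.
    state = "QUEUED"
    while state in ("QUEUED", "RUNNING"):
        time.sleep(poll_seconds)
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {query_id} ended in state {state}")

    result = athena.get_query_results(QueryExecutionId=query_id)
    return rows_to_dicts(result["ResultSet"]["Rows"])
```

Because the grant in the previous step covers only some columns, the returned records should contain only the columns the analyst is permitted to read.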