Step 4: Switch your data stores to the Lake Formation permissions model - AWS Lake Formation

Step 4: Switch your data stores to the Lake Formation permissions model

Upgrade to Lake Formation permissions one data location at a time. To do that, repeat this entire section until you have registered all Amazon Simple Storage Service (Amazon S3) paths that are referenced by your Data Catalog.

Verify Lake Formation permissions

Before registering a location, perform a verification step to ensure that the correct principals have the required Lake Formation permissions, and that no Lake Formation permissions are granted to principals that should not have them. Using the Lake Formation GetEffectivePermissionsForPath API operation, identify the Data Catalog resources that reference the Amazon S3 location, along with the principals that have permissions on those resources.

The following AWS CLI example returns the Data Catalog databases and tables that reference the Amazon S3 bucket products.

aws lakeformation get-effective-permissions-for-path --resource-arn arn:aws:s3:::products --profile datalake_admin

Note the profile option. We recommend that you run the command as a data lake administrator.

The following is an excerpt from the returned results.

{ "PermissionsWithGrantOption": [ "SELECT" ], "Resource": { "TableWithColumns": { "Name": "inventory_product", "ColumnWildcard": {}, "DatabaseName": "inventory" } }, "Permissions": [ "SELECT" ], "Principal": { "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:user/datalake_user1", "DataLakePrincipalType": "IAM_USER" } },...
Important

If your AWS Glue Data Catalog is encrypted, GetEffectivePermissionsForPath returns only databases and tables that were created or modified after Lake Formation general availability.

Secure existing Data Catalog resources

Next, revoke the Super permission from IAMAllowedPrincipals on each table and database that you identified for the location.

Warning

If you have automation in place that creates databases and tables in the Data Catalog, the following steps might cause the automation and downstream extract, transform, and load (ETL) jobs to fail. Proceed only after you have either modified your existing processes or granted explicit Lake Formation permissions to the required principals. For information about Lake Formation permissions, see Lake Formation permissions reference.

To revoke Super from IAMAllowedPrincipals on a table
  1. Open the AWS Lake Formation console at https://console.aws.amazon.com/lakeformation/. Sign in as a data lake administrator.

  2. In the navigation pane, choose Tables.

  3. On the Tables page, select the radio button next to the desired table.

  4. On the Actions menu, choose Revoke.

  5. In the Revoke permissions dialog box, in the IAM users and roles list, scroll down to the Group heading, and choose IAMAllowedPrincipals.

  6. Under Table permissions, ensure that Super is selected, and then choose Revoke.

To revoke Super from IAMAllowedPrincipals on a database
  1. Open the AWS Lake Formation console at https://console.aws.amazon.com/lakeformation/. Sign in as a data lake administrator.

  2. In the navigation pane, choose Databases.

  3. On the Databases page, select the radio button next to the desired database.

  4. On the Actions menu, choose Edit.

  5. On the Edit database page, clear Use only IAM access control for new tables in this database, and then choose Save.

  6. Back on the Databases page, ensure that the database is still selected, and then on the Actions menu, choose Revoke.

  7. In the Revoke permissions dialog box, in the IAM users and roles list, scroll down to the Group heading, and choose IAMAllowedPrincipals.

  8. Under Database permissions, ensure that Super is selected, and then choose Revoke.

Turn on Lake Formation permissions for your Amazon S3 location

Next, register the Amazon S3 location with Lake Formation. To do this, you can use the process described in Adding an Amazon S3 location to your data lake. Or, use the RegisterResource API operation as described in Credential vending APIs.

Note

If a parent location is registered, you don't need to register child locations.

After you finish these steps and test that your users can access their data, you have successfully upgraded to Lake Formation permissions. Continue with the next step, Step 5: Secure new Data Catalog resources.