Amazon EMR
Management Guide

How Access to Data Works in Lake Formation

Lake Formation allows access to data by providing temporary credentials to services such as Amazon EMR. This process is known as credential vending. For more information, see AWS Lake Formation.

When you run a query on data that is protected by Lake Formation security policies, Amazon EMR requests temporary credentials from AWS Lake Formation to access data stored in Amazon S3.

Here is how access to data is granted:

  1. You set up and control user access to resources by using AWS Lake Formation policies. You create these policies by granting and revoking permissions in the Lake Formation console. For example, you can grant access to a database or a table, or grant column-level permissions to users. You specify permissions for tables and columns directly in Lake Formation, instead of specifying them for Amazon S3 buckets and objects. For more information, see Lake Formation Permissions.

  2. When a principal attempts to run a query in Amazon EMR on data from Lake Formation, Amazon EMR requests temporary credentials for data access from AWS Lake Formation.

  3. Lake Formation returns temporary credentials, allowing data access.

  4. Using these temporary credentials, Amazon EMR sends the query request to obtain data from Amazon S3.

  5. Amazon EMR filters and returns the results based on the user permissions defined in Lake Formation.
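
The five steps above can be sketched as a small in-memory simulation. This is a toy model for illustration only and does not call any AWS API: the names ToyLakeFormation, TemporaryCredentials, and run_query, the 15-minute lifetime, and the sample table are all hypothetical, and the sketch does not claim to reflect how Lake Formation or Amazon EMR implement credential vending internally.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredentials:
    """Hypothetical stand-in for the short-lived credentials vended in step 3."""
    principal: str
    table: str
    token: str
    expires_at: datetime


@dataclass
class ToyLakeFormation:
    """Toy policy store: maps (principal, table) to the set of granted columns."""
    grants: dict = field(default_factory=dict)

    def grant(self, principal, table, columns):
        # Step 1: permissions are set on tables and columns, not on S3 objects.
        self.grants.setdefault((principal, table), set()).update(columns)

    def revoke(self, principal, table, columns):
        # Revoking columns that were never granted is a harmless no-op.
        self.grants.get((principal, table), set()).difference_update(columns)

    def allowed_columns(self, principal, table):
        return set(self.grants.get((principal, table), set()))

    def vend_credentials(self, principal, table, ttl_minutes=15):
        # Steps 2-3: vend credentials only if the principal holds some grant.
        # The 15-minute default is an arbitrary value chosen for this sketch.
        if not self.allowed_columns(principal, table):
            raise PermissionError(f"{principal} has no access to {table}")
        expires_at = datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes)
        return TemporaryCredentials(principal, table, secrets.token_hex(8), expires_at)


def run_query(lake_formation, storage, principal, table):
    """Toy query engine playing the role of Amazon EMR."""
    # Steps 2-3: request temporary credentials before touching the data.
    creds = lake_formation.vend_credentials(principal, table)
    if creds.expires_at <= datetime.now(timezone.utc):
        raise PermissionError("temporary credentials have expired")
    # Step 4: read the rows (a dict stands in for Amazon S3 here).
    rows = storage[table]
    # Step 5: return only the columns this principal is permitted to see.
    allowed = lake_formation.allowed_columns(principal, table)
    return [{col: val for col, val in row.items() if col in allowed} for row in rows]


# Example: the analyst is granted two of the three columns.
lake_formation = ToyLakeFormation()
lake_formation.grant("analyst", "sales.orders", ["order_id", "amount"])
storage = {"sales.orders": [{"order_id": 1, "amount": 9.5, "email": "a@example.com"}]}
result = run_query(lake_formation, storage, "analyst", "sales.orders")
```

In this sketch, result contains the order_id and amount values but not email, mirroring the column-level filtering in step 5. A principal with no grant on the table gets a PermissionError at the credential request, before any data is read.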