Lake Formation tag-based access control best practices and considerations - AWS Lake Formation

You can create, maintain, and assign LF-Tags to control access to Data Catalog databases, tables, and columns.

Consider the following best practices when using Lake Formation tag-based access control:

  • All LF-Tags must be predefined before they can be assigned to Data Catalog resources or granted to principals.

    The data lake administrator can delegate tag management tasks by creating LF-Tag creators with the required IAM permissions. Data engineers and analysts decide on the characteristics and relationships for LF-Tags. The LF-Tag creators then create and maintain the LF-Tags in Lake Formation.

  • You can assign multiple LF-Tags to Data Catalog resources. Only one value for a particular key can be assigned to a particular resource.

    For example, you can assign module=Orders, region=West, division=Consumer, and so on to a database, table, or column. You can't assign module=Orders,Customers.

  • You can't assign LF-Tags to resources when you create the resource. You can only add LF-Tags to existing resources.

  • You can grant LF-Tag expressions, not just single LF-Tags, to a principal.

    An LF-Tag expression looks like the following (in pseudo-code).

    module=sales AND division=(consumer OR commercial)

    A principal that is granted this LF-Tag expression can access only Data Catalog resources (databases, tables, and columns) that are assigned module=sales and either division=consumer or division=commercial. If you want the principal to be able to access resources that have module=sales or division=commercial, don't include both in the same grant. Make two grants, one for module=sales and one for division=commercial.

    The simplest LF-Tag expression consists of just one LF-Tag, such as module=sales.

  • A principal that is granted permissions on an LF-Tag with multiple values can access Data Catalog resources that have any of those values. For example, if a user is granted an LF-Tag with key=module and values=orders,customers, the user has access to resources that are assigned either module=orders or module=customers.

  • You need the Grant with LF-Tag expressions permission to grant data permissions on Data Catalog resources by using the LF-TBAC method. The data lake administrator and the LF-Tag creator implicitly receive this permission. A principal that has the Grant with LF-Tag expressions permission can grant data permissions on the resources using:

    • the named resource method

    • the LF-TBAC method, but only using the same LF-Tag expression

      For example, assume that the data lake administrator makes the following grant (in pseudo-code).

      GRANT (SELECT ON TABLES) ON TAGS module=customers, region=west,south TO user1 WITH GRANT OPTION

      In this case, user1 can grant SELECT on tables to other principals by using the LF-TBAC method, but only with the complete LF-Tag expression module=customers, region=west,south.

  • If a principal is granted permissions on a resource with both the LF-TBAC method and the named resource method, the permissions that the principal has on the resource are the union of the permissions granted by both methods.

  • Lake Formation supports granting DESCRIBE and ASSOCIATE on LF-Tags across accounts, and granting permissions on Data Catalog resources across accounts using the LF-TBAC method. In both cases, the principal is an AWS account ID.

    Note

    Lake Formation supports cross-account grants to organizations and organizational units using the LF-TBAC method. To use this capability, you need to update Cross account version settings to Version 3.

    For more information, see Cross-account data sharing in Lake Formation.

  • Data Catalog resources created in one account can only be tagged using LF-Tags created in the same account. LF-Tags created in one account can't be associated with shared resources from another account.

  • Using Lake Formation tag-based access control (LF-TBAC) to grant cross-account access to Data Catalog resources requires additions to the Data Catalog resource policy for your AWS account. For more information, see Prerequisites.

  • LF-Tag keys and LF-Tag values can't exceed 50 characters in length.

  • The maximum number of LF-Tags that can be assigned to a Data Catalog resource is 50.

  • The following limits are soft limits:

    • The maximum number of LF-Tags that can be created is 1000.

    • The maximum number of values that can be defined for an LF-Tag is 1000.

  • LF-Tag keys and values are converted to lowercase when they are stored.

  • Only one value for an LF-Tag can be assigned to a particular resource.

  • If multiple LF-Tags are granted to a principal with a single grant, the principal can access only Data Catalog resources that have all of the LF-Tags.

  • AWS Glue ETL jobs require full table access. A job fails if the AWS Glue ETL role does not have access to all columns in a table. You can apply LF-Tags at the column level, but doing so may cause AWS Glue ETL roles to lose full table access and their jobs to fail.

  • If an LF-Tag expression evaluation results in access to only a subset of table columns, but the Lake Formation permission granted on a match is one of the permissions that require full column access (Alter, Drop, Insert, or Delete), then none of those permissions is granted. Instead, only Describe is granted. If the granted permission is All (Super), then only Select and Describe are granted.

  • Wildcards are not used with LF-Tags. To assign an LF-Tag to all columns of a table, you assign the LF-Tag to the table, and all columns in the table inherit the LF-Tag. To assign an LF-Tag to all tables in a database, you assign the LF-Tag to the database, and all tables in the database inherit that LF-Tag.
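
The LF-Tag expression rules in this list (every key in a single grant must match, any listed value of a key may match, and separate grants act as alternatives) can be modeled in a few lines of Python. This is a minimal sketch for reasoning about grants, not Lake Formation's implementation; the names `matches` and `can_access` and the dictionary representation are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# An LF-Tag expression maps each tag key to the set of acceptable values.
# Keys are combined with AND; the values under one key are combined with OR.
Expression = Dict[str, Set[str]]


def matches(expression: Expression, resource_tags: Dict[str, str]) -> bool:
    """Return True if the resource carries an accepted value for every key."""
    return all(
        resource_tags.get(key) in allowed_values
        for key, allowed_values in expression.items()
    )


def can_access(grants: List[Expression], resource_tags: Dict[str, str]) -> bool:
    """Separate grants are alternatives: one matching grant is enough."""
    return any(matches(expression, resource_tags) for expression in grants)


# module=sales AND division=(consumer OR commercial)
sales_expression = {"module": {"sales"}, "division": {"consumer", "commercial"}}

print(matches(sales_expression, {"module": "sales", "division": "consumer"}))  # True
print(matches(sales_expression, {"module": "sales", "division": "public"}))    # False

# Two single-tag grants model "module=sales OR division=commercial".
two_grants = [{"module": {"sales"}}, {"division": {"commercial"}}]
print(can_access(two_grants, {"module": "hr", "division": "commercial"}))      # True
```

Modeling each grant as its own expression makes the "make two grants" advice concrete: an OR across different keys cannot be written inside one expression, only across grants.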
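
As a rough illustration of the pseudo-code grant to user1 (SELECT on tables tagged module=customers and region=west or south, with the grant option), the sketch below builds the request as a plain dictionary. The field names are intended to follow the parameters of the boto3 Lake Formation `grant_permissions` call. The helper `build_tag_grant`, the account ID, and the user ARN are hypothetical, and nothing is sent to AWS.

```python
from typing import Dict, List


def build_tag_grant(principal_arn: str, expression: Dict[str, List[str]],
                    permissions: List[str], grantable: bool = False) -> dict:
    """Build keyword arguments shaped like a grant_permissions request (LF-TBAC)."""
    request = {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                # One entry per key (AND); TagValues within an entry are alternatives (OR).
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": list(permissions),
    }
    if grantable:
        # WITH GRANT OPTION: the same permissions may be passed on by the grantee.
        request["PermissionsWithGrantOption"] = list(permissions)
    return request


# Hypothetical principal ARN; replace with a real one before use.
request = build_tag_grant(
    "arn:aws:iam::111122223333:user/user1",
    {"module": ["customers"], "region": ["west", "south"]},
    ["SELECT"],
    grantable=True,
)
print(request["Resource"]["LFTagPolicy"]["Expression"])

# Assumption: with boto3 installed and credentials configured, the request
# could be submitted as
#   boto3.client("lakeformation").grant_permissions(**request)
```

Per the rule stated in the list, user1 could then re-grant SELECT by the LF-TBAC method only with this same complete expression.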
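
The assignment constraints listed here (one value per key on a resource, keys and values of at most 50 characters, at most 50 LF-Tags per resource, and lowercase storage) could be checked client-side before calling Lake Formation. The following is a sketch under those stated limits only; `normalize_assignment` is a hypothetical helper, and the service performs its own validation.

```python
from typing import Dict, Iterable, Tuple

MAX_LENGTH = 50             # stated limit on LF-Tag key and value length
MAX_TAGS_PER_RESOURCE = 50  # stated limit on LF-Tags assigned to one resource


def normalize_assignment(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Lowercase (key, value) pairs and enforce the per-resource rules."""
    assignment: Dict[str, str] = {}
    for key, value in pairs:
        key, value = key.lower(), value.lower()
        if len(key) > MAX_LENGTH or len(value) > MAX_LENGTH:
            raise ValueError(f"{key}={value} exceeds {MAX_LENGTH} characters")
        if key in assignment and assignment[key] != value:
            # module=Orders,Customers is not allowed on a single resource.
            raise ValueError(f"{key} already has value {assignment[key]}")
        assignment[key] = value
    if len(assignment) > MAX_TAGS_PER_RESOURCE:
        raise ValueError("too many LF-Tags for one resource")
    return assignment


print(normalize_assignment([("Module", "Orders"), ("Region", "West")]))
# {'module': 'orders', 'region': 'west'}
```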
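
The downgrade rule for partial column access (Alter, Drop, Insert, and Delete collapse to Describe, and All (Super) collapses to Select and Describe) can be expressed as a small function. This is an illustrative sketch, not the service's logic: `effective_permissions` is a hypothetical name, the uppercase strings are assumed spellings of the permissions, and passing other permissions such as Select through unchanged is an assumption the text does not spell out.

```python
from typing import Set

# Permissions the text says require access to every column of the table.
NEEDS_ALL_COLUMNS = {"ALTER", "DROP", "INSERT", "DELETE"}


def effective_permissions(granted: Set[str], full_column_access: bool) -> Set[str]:
    """Apply the partial-column downgrade to a set of granted permissions."""
    if full_column_access:
        return set(granted)
    if "ALL" in granted:
        # All (Super) is reduced to Select and Describe.
        return {"SELECT", "DESCRIBE"}
    # Assumption: permissions not named in the rule are kept as granted.
    effective = granted - NEEDS_ALL_COLUMNS
    if granted & NEEDS_ALL_COLUMNS:
        # The full-column permissions are replaced by Describe only.
        effective.add("DESCRIBE")
    return effective


print(effective_permissions({"INSERT", "DELETE"}, full_column_access=False))
# {'DESCRIBE'}
```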
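
The inheritance rule (tables inherit LF-Tags from their database, and columns inherit from their table) can be pictured as a layered merge. The sketch below assumes that a value assigned directly on a child resource takes precedence over an inherited value for the same key; that precedence is an assumption not stated in this text, and `effective_tags` is a hypothetical helper.

```python
from typing import Dict, Optional


def effective_tags(database_tags: Dict[str, str],
                   table_tags: Optional[Dict[str, str]] = None,
                   column_tags: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge LF-Tags from database to table to column, most specific last."""
    effective = dict(database_tags)     # inherited from the database
    effective.update(table_tags or {})  # assigned on the table
    effective.update(column_tags or {}) # assigned on the column (assumed to win)
    return effective


# Tagging the database is enough for every table and column to carry module=orders.
print(effective_tags({"module": "orders"}, {"region": "west"}))
# {'module': 'orders', 'region': 'west'}
```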