Data consumers - AWS Prescriptive Guidance

Data consumers

Data consumers consume the data from the data producer after the centralized catalog shares it using AWS Lake Formation. The following diagram shows two data consumers in the data lake.

Diagram: The data consumer's role in the reference architecture for this guide.

There are two types of data consumer: application and data-serving. Each type is described below.

Application type

Application data consumers run applications in their own AWS accounts. The applications assume AWS Identity and Access Management (IAM) roles to access the shared data from a data producer and then process it according to their own logic.

Typically, this type of data consumer has prescriptive data requirements to fulfill an application's needs.
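As a minimal sketch of how an application data consumer might assume an IAM role before reading shared data, the following Python example builds an AWS STS `AssumeRole` request and, optionally, turns the returned temporary credentials into a boto3 session. The account ID, role name, session name, and helper function names are hypothetical and are not part of this guide. It assumes that boto3 is installed and that the caller is allowed to assume the role.

```python
from typing import Any, Dict


def build_assume_role_request(account_id: str, role_name: str, session_name: str) -> Dict[str, Any]:
    """Build the keyword arguments for the STS AssumeRole call."""
    return {
        # Hypothetical role in the data consumer account.
        "RoleArn": f"arn:aws:iam::{account_id}:role/{role_name}",
        "RoleSessionName": session_name,
        "DurationSeconds": 3600,
    }


def consumer_session(request: Dict[str, Any]):
    """Assume the role and return a boto3 session that uses its temporary credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    creds = boto3.client("sts").assume_role(**request)["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


# Example values are placeholders, not real resources.
request = build_assume_role_request("111122223333", "app-consumer-role", "orders-app")
```

Calling `consumer_session(request)` would then give the application a session whose clients act with the permissions of the assumed role.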

Data-serving type

Data-serving data consumers are typically meant for individuals (for example, data analysts or data scientists) and applications (for example, a business intelligence application) that don't have their own AWS accounts.

Multiple data-serving data consumers can exist in one organization’s data lake. For example, different lines of business might choose to set up their own data-serving data consumers to help users consume data from the data lake. Each of these data consumers configures its own IAM roles in its AWS account (for example, IAM roles associated with AWS IAM Identity Center). End users in the data consumer account use these roles to access shared data through AWS services (for example, Amazon Athena).

Typically, this type of data consumer has wide-ranging and continuously increasing data requirements.
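To illustrate how an end user in a data-serving data consumer account might query shared data through Amazon Athena, the following Python sketch builds a `StartQueryExecution` request and, optionally, submits it with boto3. The database name, table name, workgroup, and Amazon S3 output location are hypothetical placeholders. The sketch assumes that the shared database is already visible in the data consumer account and that the caller's IAM role has been granted access to it.

```python
from typing import Any, Dict


def build_athena_query_request(database: str, sql: str, workgroup: str, output_s3: str) -> Dict[str, Any]:
    """Build the keyword arguments for the Athena StartQueryExecution call."""
    return {
        "QueryString": sql,
        # Database as it appears in the data consumer account's catalog.
        "QueryExecutionContext": {"Database": database},
        "WorkGroup": workgroup,
        "ResultConfiguration": {"OutputLocation": output_s3},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query and return its execution ID."""
    import boto3  # imported lazily so the request builder works without boto3

    return boto3.client("athena").start_query_execution(**request)["QueryExecutionId"]


# Example values are placeholders, not real resources.
query_request = build_athena_query_request(
    database="sales_shared",
    sql="SELECT * FROM orders LIMIT 10",
    workgroup="primary",
    output_s3="s3://example-athena-results/consumer/",
)
```

Calling `run_query(query_request)` returns only the query execution ID; results are retrieved separately after the query finishes.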

AWS Lake Formation is the primary AWS service that a data consumer uses for cross-account data sharing and for accessing the centralized catalog. After the centralized catalog shares databases, the shared resources are available in Lake Formation in the data consumer account. Data access can then be granted to local IAM principals in the data consumer account, with permission from the data producer if required. The shared data can then be used by AWS services that are integrated with Lake Formation (for example, Amazon Athena and AWS Glue). You can use the following AWS services to access shared data in the data consumer account: