Data producers
A data producer collects, processes, and stores data from their data domain, in addition to monitoring and ensuring the quality of their data assets. The following diagram shows the data producer account as a component of this guide's reference architecture.

Each data producer has a private Data Catalog managed by AWS Lake Formation in their AWS account that is used by their internal data process. Data producers provide the centralized catalog with selective permissions to their data, which means that Lake Formation in the centralized catalog account can access data that the data producer wants to share.
As a result, data producers don't interact directly with data consumers. Instead, the data producer account and its data storage location are abstracted and hidden from the data consumer. This approach reduces costs because a data producer doesn't take on additional overhead as its number of data consumers grows.
A change to the data producer's data location doesn’t impact the data consumer if the new data location is registered by the centralized catalog. If the data producer wants to stop sharing a particular piece of data, they can remove the centralized catalog's permissions. This prevents data consumers from accessing the data and removes the need to manually revoke access for each data consumer.
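The revocation step described above can be sketched in Python as an edit to a policy document. This is a minimal illustration and not the guide's prescribed procedure: the role ARNs, bucket name, and statement IDs are hypothetical, and the function only transforms a policy held in memory. Applying the result to a bucket or key is left out.

```python
import copy

# Hypothetical ARNs used only for illustration.
CATALOG_ROLE = "arn:aws:iam::111122223333:role/central-catalog-role"
INTERNAL_ROLE = "arn:aws:iam::444455556666:role/producer-etl-role"


def revoke_principal(policy, role_arn):
    """Return a copy of `policy` in which `role_arn` no longer appears as an
    AWS principal in any Allow statement. The input is left unchanged."""
    updated = copy.deepcopy(policy)
    kept = []
    for stmt in updated.get("Statement", []):
        principal = stmt.get("Principal")
        if (stmt.get("Effect") != "Allow"
                or not isinstance(principal, dict)
                or "AWS" not in principal):
            kept.append(stmt)  # not an Allow for specific AWS principals
            continue
        aws = principal["AWS"]
        arns = [aws] if isinstance(aws, str) else list(aws)
        remaining = [arn for arn in arns if arn != role_arn]
        if len(remaining) == len(arns):
            kept.append(stmt)  # statement does not mention the role
        elif not remaining and len(principal) == 1:
            continue  # the role was the only principal: drop the statement
        else:
            if remaining:
                principal["AWS"] = remaining[0] if len(remaining) == 1 else remaining
            else:
                del principal["AWS"]
            kept.append(stmt)
    updated["Statement"] = kept
    return updated


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "CatalogRead", "Effect": "Allow",
         "Principal": {"AWS": CATALOG_ROLE},
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-producer-bucket/*"},
        {"Sid": "SharedList", "Effect": "Allow",
         "Principal": {"AWS": [CATALOG_ROLE, INTERNAL_ROLE]},
         "Action": "s3:ListBucket",
         "Resource": "arn:aws:s3:::example-producer-bucket"},
    ],
}

revoked = revoke_principal(example_policy, CATALOG_ROLE)
```

Because data consumers reach the data only through the centralized catalog, this single change on the producer side is what cuts off every consumer at once. The producer's internal role keeps its access.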
By using public and private data catalogs, data producers can choose what to share with data consumers, while independently managing internal data access through a private data catalog.
The following table describes the two AWS services that data producers use to share data with the centralized catalog.
Amazon Simple Storage Service (Amazon S3) | Adjust the bucket policy for S3 buckets to provide data access to the AWS Identity and Access Management (IAM) roles in the centralized catalog. Data producers can also share data stored in other data applications or services by using Amazon S3 as the intermediate data layer. |
AWS Key Management Service (AWS KMS) | Grant the IAM roles in the centralized catalog permissions for the AWS managed keys and the AWS KMS keys that are used to encrypt the shared Amazon S3 data in the data producer accounts. |
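To make the two table rows concrete, the following Python sketch builds the kind of policy statements a data producer might add for the centralized catalog. It is a sketch under assumptions: the role ARN, bucket name, prefix, and statement IDs are hypothetical placeholders, and the action lists (`s3:ListBucket`, `s3:GetObject`, `kms:Decrypt`, `kms:DescribeKey`) are one plausible choice for a read-only share that the guide itself doesn't specify.

```python
import json

# Hypothetical placeholders; replace with values from your own accounts.
CATALOG_ROLE_ARN = "arn:aws:iam::111122223333:role/central-catalog-role"
BUCKET = "example-producer-bucket"


def bucket_policy_statements(role_arn, bucket, prefix=""):
    """Build S3 bucket policy statements that give one IAM role read access
    to objects under `prefix` (the whole bucket when the prefix is empty)."""
    return [
        {
            "Sid": "CentralCatalogListBucket",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
        },
        {
            "Sid": "CentralCatalogReadObjects",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
        },
    ]


def kms_key_policy_statement(role_arn):
    """Build a KMS key policy statement that lets one IAM role decrypt data.
    In a key policy, Resource "*" refers to the key the policy is attached to."""
    return {
        "Sid": "CentralCatalogDecrypt",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ["kms:Decrypt", "kms:DescribeKey"],
        "Resource": "*",
    }


bucket_policy = {
    "Version": "2012-10-17",
    "Statement": bucket_policy_statements(CATALOG_ROLE_ARN, BUCKET, "sales/"),
}
print(json.dumps(bucket_policy, indent=2))
```

Scoping the object statement to a prefix is one way to share only part of a bucket. The statements would be merged into the existing bucket policy and key policy rather than replacing them.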