Data lake design patterns and principles - Best Practices for Building a Data Lake on AWS for Games

This whitepaper is for historical reference only. Some content might be outdated and some links might not be available.

Data lake design patterns and principles

Framework

The following is a high-level framework for building a data lake on AWS.

10,000 foot view

This is a 10,000 foot (high level) view of how analytics systems work with source and destination systems.

5,000 foot view

This is a 5,000 foot (mid-level) view of how analytics systems work with source and destination systems.

Diving deeper into the framework, data streamers, data collectors, data aggregators, and data transformers collect the data from the data producers (sources). Depending on the use case, the data is then consumed for analysis or by downstream consumers, and cataloged into a data lake for governed access.
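The flow described above can be illustrated with a minimal Python sketch. It is not an AWS implementation: the names `Event`, `collect`, `aggregate`, `transform`, and `Catalog`, along with the sample game events, are hypothetical and chosen only to show how the stages might hand data to one another. Streaming is approximated with a generator, and the data lake and its catalog are plain in-memory dictionaries.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """A hypothetical game event emitted by a data producer (source)."""
    player_id: str
    event_type: str
    value: int


def collect(producers):
    """Collector/streamer: yield raw events from every producer in turn."""
    for producer in producers:
        yield from producer


def aggregate(events):
    """Aggregator: sum event values per (player, event type)."""
    totals = defaultdict(int)
    for event in events:
        totals[(event.player_id, event.event_type)] += event.value
    return dict(totals)


def transform(totals):
    """Transformer: reshape the aggregates into flat, analysis-ready rows."""
    return [
        {"player_id": player, "event_type": kind, "total": total}
        for (player, kind), total in sorted(totals.items())
    ]


@dataclass
class Catalog:
    """Stand-in for a data catalog: stores rows and records table metadata."""
    lake: dict = field(default_factory=dict)
    tables: dict = field(default_factory=dict)

    def register(self, name, rows):
        self.lake[name] = rows
        self.tables[name] = {
            "columns": sorted(rows[0]) if rows else [],
            "row_count": len(rows),
        }


# Two assumed producers: a game client and a game server.
game_client = [Event("p1", "purchase", 5), Event("p2", "purchase", 3)]
game_server = [Event("p1", "purchase", 2)]

rows = transform(aggregate(collect([game_client, game_server])))
catalog = Catalog()
catalog.register("player_totals", rows)
```

Downstream consumers would then look up `player_totals` through the catalog rather than reading producer output directly, which is the governed-access idea the framework describes.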

1,000 foot view

This is a 1,000 foot (detailed) view of how analytics systems work with source and destination systems.

Diving deeper into the framework, some AWS services are added as examples to show the data flow. This layout is a common pattern that AWS has observed with its customers.