Configuration notes - Analytics Lens

  1. Leverage serverless compute and analytics services where possible. This removes the undifferentiated heavy lifting of managing distributed and clustered environments. In AWS, services such as Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics, AWS Glue, Amazon S3, Amazon Athena, and Amazon QuickSight are fully managed and do not require you to patch servers, install software, or manage backups.

  2. Leverage Amazon EMR for frameworks outside of the serverless scope. You may find that your speed layer calculations are better suited to a framework such as Apache Flink or Spark Streaming. You can use Amazon EMR to manage the cluster environments for those applications.

  3. Determine the right data to combine at the right time. Not all use cases demand a Lambda Architecture. Work backward from your business requirements for the speed and data freshness that are required for the views in the serving layer.

  4. Create pre-processed views for your end users. For some users, direct access to raw streaming and master data is complicated and overwhelming at best, and inefficient at worst (for example, SQL queries that select all columns without filtering). Creating pre-processed views (also known as materialized views in other domains) simplifies queries and improves performance. Agree with the business on how fresh each view needs to be, and refresh the views accordingly.
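To make notes 3 and 4 more concrete, the following Python sketch shows one way a serving layer might combine a pre-processed batch view with a speed view. It is an illustration only and does not call any AWS service: the `Event` type, the `build_view` and `serve` functions, the `BATCH_CUTOFF` value, and the sample page-view data are all hypothetical names and values chosen for this example. In a real deployment, the aggregation would more likely run in one of the managed services listed above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical raw page-view event (not an AWS data type)."""
    page: str
    timestamp: int  # epoch seconds
    views: int


def build_view(events):
    """Aggregate raw events into a pre-processed view: total views per page."""
    totals = defaultdict(int)
    for event in events:
        totals[event.page] += event.views
    return dict(totals)


def serve(batch_view, speed_view):
    """Merge the batch view with the speed view for the serving layer.

    The batch view covers events before the cutoff; the speed view covers
    events since the cutoff, so the two counts can simply be added.
    """
    merged = dict(batch_view)
    for page, count in speed_view.items():
        merged[page] = merged.get(page, 0) + count
    return merged


# Assumption: the most recent batch run processed everything before this time.
BATCH_CUTOFF = 1_000

events = [
    Event("/home", 900, 3),
    Event("/pricing", 950, 1),
    Event("/home", 1_020, 2),
    Event("/docs", 1_050, 4),
]

# Batch layer: recomputed periodically over the historical data.
batch_view = build_view(e for e in events if e.timestamp < BATCH_CUTOFF)

# Speed layer: covers only the events that arrived after the last batch run.
speed_view = build_view(e for e in events if e.timestamp >= BATCH_CUTOFF)

serving_view = serve(batch_view, speed_view)
print(serving_view)  # {'/home': 5, '/pricing': 1, '/docs': 4}
```

End users query only `serving_view`, never the raw events. How often the batch view is rebuilt, and whether a speed view is needed at all, follows from the freshness requirement identified in note 3: if a view that is a few hours old is acceptable, the batch view alone may be enough.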

For more information and a hands-on tutorial on Lambda Architectures, see the following blog post: "Unite Real-Time and Batch Analytics Using the Big Data Lambda Architecture, Without Servers!"