Architecture overview - Centralized Logging with OpenSearch

Deploying this solution with the default parameters builds the following environment in the AWS Cloud.

Figure: Centralized Logging with OpenSearch architecture

The AWS CloudFormation template deploys this solution into your AWS account and sets up the following components and workflow.

  1. Amazon CloudFront distributes the frontend web UI assets hosted in an Amazon S3 bucket.

  2. An Amazon Cognito user pool or an OpenID Connect (OIDC) provider can be used for authentication.

  3. AWS AppSync provides the backend GraphQL APIs.

  4. Amazon DynamoDB stores the solution-related information as the backend database.

  5. AWS Lambda interacts with other AWS services to run the core logic for managing log pipelines and log agents, and retrieves the information updated in DynamoDB tables.

  6. AWS Step Functions orchestrates the on-demand AWS CloudFormation deployment of a set of predefined stacks for log pipeline management. The log pipeline stacks deploy separate AWS resources and are used to collect and process logs and ingest them into Amazon OpenSearch Service for further analysis and visualization.

  7. A Service Log Pipeline or an Application Log Pipeline is provisioned on demand through the Centralized Logging with OpenSearch console.

  8. AWS Systems Manager and Amazon EventBridge manage the log agents that collect logs from application servers, for example by installing log agents (Fluent Bit) on the servers and monitoring the health status of the agents.

  9. Fluent Bit agents installed on Amazon EC2 instances or Amazon EKS clusters upload log data to the application log pipeline.

  10. Application log pipelines read, parse, and process application logs, then ingest them into Amazon OpenSearch Service domains or Light Engine.

  11. Service log pipelines read, parse, and process AWS service logs, then ingest them into Amazon OpenSearch Service domains or Light Engine.
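
The read, parse, process, and ingest stages in steps 10 and 11 can be illustrated with a minimal sketch. This is not the solution's own pipeline code: the log line format, the `LOG_PATTERN` regex, the helper names, and the `app-logs` index name are all assumptions made for illustration. The only external convention it relies on is the newline-delimited JSON layout accepted by the OpenSearch `_bulk` API.

```python
import json
import re

# Hypothetical line format: "<timestamp> <LEVEL> <message>".
# Real pipelines would use the parser configured for the log type.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def parse_line(line):
    """Parse one raw log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def to_bulk_body(lines, index_name):
    """Build a newline-delimited JSON body in the shape the _bulk API accepts."""
    parts = []
    for line in lines:
        doc = parse_line(line)
        if doc is None:
            continue  # skip lines that fail to parse
        parts.append(json.dumps({"index": {"_index": index_name}}))
        parts.append(json.dumps(doc))
    # The bulk format requires a newline after the final line.
    return "\n".join(parts) + "\n" if parts else ""


sample = [
    "2024-01-01T00:00:00Z INFO service started",
    "malformed-entry",
    "2024-01-01T00:00:05Z ERROR upstream timeout",
]
body = to_bulk_body(sample, "app-logs")
print(body)
```

The sketch drops unparseable lines silently to stay short; a production pipeline would more likely route them to a failure destination for later inspection.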

Note

After deploying the solution, you can use AWS WAF to protect CloudFront or AWS AppSync. Moreover, you can follow this guide to configure your AWS WAF settings to prevent GraphQL schema introspection.
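
As a rough illustration of the kind of AWS WAF rule that can block schema introspection, the sketch below builds a rule in the AWS WAFV2 JSON shape that blocks requests whose body contains `__schema`. It is a sketch under assumptions, not the configuration from the referenced guide: the rule name, priority, and metric name are hypothetical, and `would_match` is only a local approximation of how the byte match would evaluate. Verify the rule against the guide before applying it to a web ACL.

```python
import json


def build_introspection_block_rule(name="BlockGraphQLIntrospection", priority=0):
    """Return a WAFV2-style rule (as a dict) that blocks bodies containing '__schema'.

    The name and priority defaults are hypothetical placeholders.
    """
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                "SearchString": "__schema",
                "FieldToMatch": {"Body": {"OversizeHandling": "CONTINUE"}},
                "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                "PositionalConstraint": "CONTAINS",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


def would_match(rule, request_body):
    """Local approximation of the rule: lowercase the body, then substring match."""
    statement = rule["Statement"]["ByteMatchStatement"]
    return statement["SearchString"] in request_body.lower()


rule = build_introspection_block_rule()
print(json.dumps(rule, indent=2))
```

A plain substring match on `__schema` does not cover every introspection query (for example, `__type` lookups), so treat it as a starting point only.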

This solution supports two types of log pipelines (Service Log Analytics Pipeline and Application Log Analytics Pipeline) and two types of log analytics engines (OpenSearch Engine and Light Engine). Architecture details for the pipelines and Light Engine are described in their respective sections.

AWS Well-Architected pillars

This solution was designed with best practices from the AWS Well-Architected Framework, which helps customers design and operate reliable, secure, efficient, and cost-effective workloads in the cloud.

This section describes how the design principles and best practices of the Well-Architected Framework benefit this solution.

Operational excellence

This section describes how we architected this solution using the principles and best practices of the operational excellence pillar.

  • The solution pushes metrics, logs, and traces to Amazon CloudWatch at various stages to provide observability into the infrastructure, Elastic Load Balancing, the Amazon ECS cluster, Lambda functions, Step Functions workflows, and the rest of the solution components. The solution also creates a CloudWatch dashboard for monitoring each pipeline.
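
One common way for a Lambda function to publish a custom metric to CloudWatch is to write a structured log line in the CloudWatch embedded metric format. The source does not state which mechanism this solution uses, so the sketch below is an assumption offered for illustration; the namespace, the `PipelineId` dimension, and the metric name are hypothetical.

```python
import json
import time


def emf_record(namespace, pipeline_id, metric_name, value,
               unit="Count", timestamp_ms=None):
    """Build one embedded-metric-format record as a dict.

    When such a record is printed as a JSON log line from Lambda,
    CloudWatch can extract the metric from the log event.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # epoch milliseconds
    return {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["PipelineId"]],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Dimension and metric values live at the top level of the record.
        "PipelineId": pipeline_id,
        metric_name: value,
    }


# Hypothetical namespace, pipeline ID, and metric name.
record = emf_record("Example/LogPipeline", "pipeline-123", "ProcessedLogs", 42)
print(json.dumps(record))
```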

Security

This section describes how we architected this solution using the principles and best practices of the security pillar.

  • The web console users are authenticated and authorized with Amazon Cognito or OpenID Connect.

  • All inter-service communications use AWS IAM roles.

  • All roles used by the solution follow least-privilege access. That is, each role contains only the minimum permissions required for the service to function properly.
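
To make the least-privilege idea concrete, the sketch below builds an IAM policy document that grants read access to a single DynamoDB table and includes a small check for wildcard grants. The table ARN, the chosen actions, and both helper names are hypothetical examples, not the policies this solution actually deploys.

```python
def build_table_read_policy(table_arn):
    """Return an IAM policy document scoped to read actions on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
            }
        ],
    }


def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def has_wildcards(policy):
    """Return True if any statement grants a wildcard action or resource."""
    for statement in policy["Statement"]:
        actions = _as_list(statement["Action"])
        resources = _as_list(statement["Resource"])
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
        if any(r == "*" for r in resources):
            return True
    return False


# Hypothetical table ARN used only for illustration.
policy = build_table_read_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
)
print(has_wildcards(policy))
```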

Reliability

This section describes how we architected this solution using the principles and best practices of the reliability pillar.

  • The solution uses AWS serverless services wherever possible (for example, AWS AppSync, Amazon DynamoDB, AWS Lambda, AWS Step Functions, Amazon S3, and Amazon SQS) to provide high availability and recovery from service failure.

  • The solution's configuration management content is stored in Amazon DynamoDB. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability.

Performance efficiency

This section describes how we architected this solution using the principles and best practices of the performance efficiency pillar.

  • The solution can be launched in any Region that supports the AWS services it uses, such as Amazon S3, Amazon ECS, and Elastic Load Balancing.

  • Using serverless architecture removes the need for you to run and maintain physical servers for traditional compute activities.

  • The solution is automatically tested and deployed daily, and solutions architects and subject matter experts review it for areas to experiment with and improve.

Cost optimization

This section describes how we architected this solution using the principles and best practices of the cost optimization pillar.

  • The solution uses Auto Scaling groups so that compute costs scale only with the amount of data ingested and processed.

  • The solution uses serverless services such as Amazon S3, Amazon DynamoDB, and AWS Lambda so that customers are charged only for what they use.

Sustainability

This section describes how we architected this solution using the principles and best practices of the sustainability pillar.

  • The solution's serverless design (using Amazon Kinesis Data Streams, Amazon S3, AWS Lambda) and the use of managed services (such as Amazon ECS) are aimed at reducing carbon footprint compared to the footprint of continually operating on-premises servers.