Domain 4: Troubleshooting and Optimization (18% of the exam content)
Topics
Task 1: Assist in a root cause analysis
Knowledge of:
Logging and monitoring systems
Languages for log queries (for example, Amazon CloudWatch Logs Insights)
Data visualizations
Code analysis tools
Common HTTP error codes
Common exceptions generated by SDKs
Service maps in X-Ray
Skills in:
Debugging code to identify defects
Interpreting application metrics, logs, and traces
Querying logs to find relevant data
Implementing custom metrics (for example, CloudWatch embedded metric format [EMF])
Reviewing application health by using dashboards and insights
Troubleshooting deployment failures by using service output logs
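Example for Task 1 (custom metrics with EMF): the skill "Implementing custom metrics (for example, CloudWatch embedded metric format [EMF])" can be illustrated with a minimal sketch that builds one EMF log line using only the Python standard library. The namespace, metric name, and dimension below are hypothetical, and the helper build_emf_record is an illustration rather than part of any AWS SDK. It assumes an environment such as AWS Lambda, where lines written to standard output are delivered to CloudWatch Logs.

```python
import json
import time


def build_emf_record(namespace, metric_name, value, unit, dimensions):
    """Return one EMF log line (a JSON string) describing a single metric."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since the epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set that lists every dimension key.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The metric value sits at the top level, keyed by the metric name.
        metric_name: value,
    }
    # Dimension values also sit at the top level, keyed by dimension name.
    record.update(dimensions)
    return json.dumps(record)


line = build_emf_record(
    namespace="ExampleApp",         # hypothetical namespace
    metric_name="CheckoutLatency",  # hypothetical metric name
    value=182,
    unit="Milliseconds",
    dimensions={"Service": "checkout"},  # hypothetical dimension
)
print(line)
```

Because the metric travels inside an ordinary log line, the same record remains available to log queries as well as to metric graphs and alarms.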
Task 2: Instrument code for observability
Knowledge of:
Distributed tracing
Differences between logging, monitoring, and observability
Structured logging
Application metrics (for example, custom, embedded, built-in)
Skills in:
Implementing an effective logging strategy to record application behavior and state
Implementing code that emits custom metrics
Adding annotations for tracing services
Implementing notification alerts for specific actions (for example, notifications about quota limits or deployment completions)
Implementing tracing by using services and tools
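Example for Task 2 (structured logging): one possible way to approach "Structured logging" and "Implementing an effective logging strategy to record application behavior and state" is to emit each log entry as a single JSON object. The sketch below uses only the Python standard library. The logger name, the field names, and the convention of passing fields under a "context" key are assumptions made for illustration, not a prescribed AWS format.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed as extra={"context": {...}} are attached to the record.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


stream = io.StringIO()  # stands in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example output to this handler only

# Hypothetical application state recorded alongside the message.
logger.info("order placed", extra={"context": {"order_id": "o-123", "latency_ms": 42}})
print(stream.getvalue().strip())
```

Entries in this shape keep application state in named fields, which makes them easier to filter and aggregate with a log query language than free-form text.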
Task 3: Optimize applications by using services and features
Knowledge of:
Caching
Concurrency
Messaging services (for example, Amazon Simple Queue Service [Amazon SQS], Amazon Simple Notification Service [Amazon SNS])
Skills in:
Profiling application performance
Determining minimum memory and compute power for an application
Using subscription filter policies to optimize messaging
Caching content based on request headers
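Example for Task 3 (subscription filter policies): the skill "Using subscription filter policies to optimize messaging" rests on the idea that a subscriber receives only messages whose attributes match its policy. The sketch below is a simplified local model of that matching rule, not the Amazon SNS implementation. It handles exact string matches only, whereas the real service supports further operators (for example, prefix and numeric matching). The function name and attribute names are hypothetical.

```python
def matches_filter_policy(policy, attributes):
    """Simplified check: every policy key must appear in the message
    attributes with a value listed in the policy (exact match only)."""
    for key, allowed_values in policy.items():
        # A missing attribute yields None, which is never an allowed value here.
        if attributes.get(key) not in allowed_values:
            return False
    return True


# Hypothetical policy: keys are combined with AND, values within a key with OR.
policy = {"event_type": ["order_placed", "order_cancelled"], "region": ["eu"]}

print(matches_filter_policy(policy, {"event_type": "order_placed", "region": "eu"}))   # True
print(matches_filter_policy(policy, {"event_type": "order_shipped", "region": "eu"}))  # False
print(matches_filter_policy(policy, {"event_type": "order_placed"}))                   # False
```

Filtering at the subscription means consumers do not spend compute receiving and discarding messages they have no use for.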