Domain 3: Data Operations and Support (22% of the exam content)

This domain accounts for 22% of the exam content.

Task 3.1: Automate data processing by using services

Knowledge of:

How to maintain and troubleshoot data processing for repeatable business outcomes
API calls for data processing
Which services accept scripting (for example, Amazon EMR, Amazon Redshift, Glue)

Skills in:

Orchestrating data pipelines (for example, Amazon MWAA, Step Functions)
Troubleshooting Amazon managed workflows
Calling SDKs to access Amazon features from code
Using the features of services to process data (for example, Amazon EMR, Amazon Redshift, Glue)
Consuming and maintaining data APIs
Preparing data transformation (for example, Glue DataBrew)
Querying data (for example, Amazon Athena)
Using Lambda to automate data processing
Managing events and schedulers (for example, EventBridge)
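
As an illustration of the "Using Lambda to automate data processing" skill above, the following is a minimal sketch of a handler that reacts to an S3 event notification. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 event notification format; the helper name `extract_s3_objects`, the sample bucket and key, and the placeholder processing step are assumptions made for this example.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # skip records that did not come from S3
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point; the processing step itself is only a placeholder."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        # Hypothetical processing step, e.g. start a job or load the file.
        print(f"would process s3://{bucket}/{key}")
    return {"processed": len(objects)}
```

Keeping the event parsing in its own function makes it possible to test the logic locally with a hand-written event before wiring the function to a real trigger.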

Task 3.2: Analyze data by using services

Knowledge of:

Tradeoffs between provisioned services and serverless services
SQL queries (for example, SELECT statements with multiple qualifiers or JOIN clauses)
How to visualize data for analysis
When and how to apply cleansing techniques
Data aggregation, rolling average, grouping, and pivoting
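
The query patterns listed above (SELECT with multiple qualifiers, JOIN, grouping, rolling average, pivoting) can be sketched as follows. This example uses Python's built-in `sqlite3` module as a local stand-in for a query engine such as Athena, so the SQL dialect may differ in details. The tables, columns, and rows are made up for illustration, and the window function assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER,
    day INTEGER, channel TEXT, amount REAL
);
INSERT INTO customers VALUES (1, 'east'), (2, 'west');
INSERT INTO orders VALUES
    (1, 1, 1, 'web',   10.0),
    (2, 1, 2, 'store', 20.0),
    (3, 2, 2, 'web',   30.0),
    (4, 2, 3, 'web',   40.0),
    (5, 1, 3, 'web',   60.0);
""")

# JOIN plus multiple WHERE qualifiers, grouped by region.
filtered_totals = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.channel = 'web' AND o.amount >= 20
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

# Two-day rolling average over daily totals, using a window function.
rolling_average = conn.execute("""
    SELECT day,
           AVG(total) OVER (
               ORDER BY day ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
           )
    FROM (SELECT day, SUM(amount) AS total FROM orders GROUP BY day)
    ORDER BY day
""").fetchall()

# Pivot: one row per region, one column per sales channel.
pivot = conn.execute("""
    SELECT c.region,
           SUM(CASE WHEN o.channel = 'web'   THEN o.amount ELSE 0.0 END),
           SUM(CASE WHEN o.channel = 'store' THEN o.amount ELSE 0.0 END)
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

print(filtered_totals)
print(rolling_average)
print(pivot)
```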

Skills in:

Visualizing data by using services and tools (for example, Glue DataBrew, Amazon QuickSight)
Verifying and cleaning data (for example, Lambda, Athena, QuickSight, Jupyter Notebooks, Amazon SageMaker Data Wrangler)
Using Athena to query data or to create views
Using Athena notebooks that use Apache Spark to explore data

Task 3.3: Maintain and monitor data pipelines

Knowledge of:

Skills in:

Extracting logs for audits
Deploying logging and monitoring solutions to facilitate auditing and traceability
Using notifications during monitoring to send alerts
Troubleshooting performance issues
Using CloudTrail to track API calls
Troubleshooting and maintaining pipelines (for example, Glue, Amazon EMR)
Using Amazon CloudWatch Logs to log application data (with a focus on configuration and automation)
Analyzing logs with services (for example, Athena, Amazon EMR, Amazon OpenSearch Service, CloudWatch Logs Insights, big data application logs)
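
To make the log-analysis skills above more concrete, here is a small sketch that summarizes a CloudTrail-style log file locally. CloudTrail log files are JSON documents with a top-level `Records` array, and failed calls carry an `errorCode` field. The function name `summarize_cloudtrail` and the sample records are invented for this example; at scale, the same question would more likely be answered with a query service such as Athena or CloudWatch Logs Insights.

```python
import json
from collections import Counter


def summarize_cloudtrail(log_text):
    """Count API calls per (eventSource, eventName) and collect failed calls."""
    records = json.loads(log_text).get("Records", [])
    calls = Counter(
        (record.get("eventSource"), record.get("eventName")) for record in records
    )
    # CloudTrail adds an errorCode field only when the call failed.
    errors = [record for record in records if "errorCode" in record]
    return calls, errors


# Made-up sample data in the shape of a CloudTrail log file.
sample_log = json.dumps({
    "Records": [
        {"eventSource": "glue.amazonaws.com", "eventName": "StartJobRun"},
        {"eventSource": "glue.amazonaws.com", "eventName": "StartJobRun"},
        {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
         "errorCode": "AccessDenied"},
    ]
})

calls, errors = summarize_cloudtrail(sample_log)
for (source, name), count in calls.most_common():
    print(f"{source} {name}: {count}")
print(f"failed calls: {len(errors)}")
```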

Task 3.4: Ensure data quality

Knowledge of:

Skills in:

Running data quality checks while processing the data (for example, checking for empty fields)
Defining data quality rules (for example, Glue DataBrew)
Investigating data consistency (for example, Glue DataBrew)
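
The empty-field check mentioned above can be sketched in plain Python as follows. The function names, the required field list, and the sample rows are assumptions made for illustration; a managed service such as Glue DataBrew expresses comparable rules declaratively rather than in code like this.

```python
def find_empty_fields(rows, required):
    """Return {row_index: [field names]} for rows with missing or blank values."""
    problems = {}
    for index, row in enumerate(rows):
        missing = [
            field for field in required
            if row.get(field) is None or str(row[field]).strip() == ""
        ]
        if missing:
            problems[index] = missing
    return problems


def completeness(rows, required):
    """Fraction of rows in which every required field is populated."""
    if not rows:
        return 1.0
    return 1 - len(find_empty_fields(rows, required)) / len(rows)


# Made-up sample rows.
rows = [
    {"order_id": "A1", "customer": "ann", "amount": 10},
    {"order_id": "A2", "customer": "   ", "amount": 20},
    {"order_id": "", "customer": "bob", "amount": None},
    {"order_id": "A4", "customer": "cy", "amount": 0},
]
required = ["order_id", "customer", "amount"]

print(find_empty_fields(rows, required))
print(completeness(rows, required))
```

Note that a numeric zero is treated as a populated value here; only `None` and blank strings count as empty.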
