Run Athena queries with Step Functions

You can integrate AWS Step Functions with Amazon Athena to start and stop query executions and to get query results. Using Step Functions, you can run ad hoc or scheduled queries against your S3 data lakes and retrieve the results. Athena is serverless, so there is no infrastructure to set up or manage, and you pay only for the queries you run. This page lists the supported Athena APIs and provides an example Task state that starts an Athena query.

Step Functions can control certain AWS services directly from Amazon States Language (ASL). To learn more, see Integrating other services and Passing parameters to a service API in Step Functions.

How the optimized Athena integration differs from the Athena AWS SDK integration

To integrate AWS Step Functions with Amazon Athena, you use the provided Athena service integration APIs.

The service integration APIs are the same as the corresponding Athena APIs. Not all APIs support all integration patterns, as shown in the following table.

Supported Amazon Athena APIs:

API                   Request Response   Run a Job (.sync)
StartQueryExecution   Yes                Yes
StopQueryExecution    Yes                No
GetQueryExecution     Yes                No
GetQueryResults       Yes                No
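
As a rough illustration of the Request Response pattern, the sketch below starts a query without the .sync suffix, so the first state completes as soon as Athena accepts the query, and then looks up the query status with GetQueryExecution. The state names, the query, and the S3 bucket are placeholders chosen for this example. The JSONPath assumes the start call returns QueryExecutionId at the top level of its output, as the Athena StartQueryExecution API does.

```json
{
  "Comment": "Illustrative sketch only: state names, query, and bucket are placeholder assumptions",
  "StartAt": "Start query (Request Response)",
  "States": {
    "Start query (Request Response)": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:startQueryExecution",
      "Parameters": {
        "QueryString": "SELECT 1",
        "WorkGroup": "primary",
        "ResultConfiguration": {
          "OutputLocation": "s3://amzn-s3-demo-bucket/results/"
        }
      },
      "Next": "Check query status"
    },
    "Check query status": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:getQueryExecution",
      "Parameters": {
        "QueryExecutionId.$": "$.QueryExecutionId"
      },
      "End": true
    }
  }
}
```

In practice, a Wait state and a Choice state on $.QueryExecution.Status.State would typically sit between or after these steps so that the workflow polls until the query finishes. The Run a Job (.sync) pattern handles that waiting for you.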

Note

Step Functions enforces a quota on the maximum input or result data size for a task. This limits you to 256 KB of data, as a UTF-8 encoded string, when you send data to, or receive data from, another service. See Quotas related to state machine executions.

The following example shows a Task state that starts an Athena query.

"Start an Athena query": {
  "Type": "Task",
  "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
  "Parameters": {
    "QueryString": "SELECT * FROM \"myDatabase\".\"myTable\" limit 1",
    "WorkGroup": "primary",
    "ResultConfiguration": {
      "OutputLocation": "s3://athenaQueryResult"
    }
  },
  "Next": "Get results of the query"
}
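
The Task state above names a next state, "Get results of the query", without showing it. One possible shape for that state, placed in a complete state machine definition, is sketched below. It assumes that the .sync call's output carries the query details under QueryExecution, so the ID is read from $.QueryExecution.QueryExecutionId. The MaxResults value of 100 is an arbitrary choice for this example, included as one way to keep the returned rows well under the task payload quota.

```json
{
  "Comment": "Illustrative sketch only: the second state and its MaxResults value are assumptions, not part of the original example",
  "StartAt": "Start an Athena query",
  "States": {
    "Start an Athena query": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
      "Parameters": {
        "QueryString": "SELECT * FROM \"myDatabase\".\"myTable\" limit 1",
        "WorkGroup": "primary",
        "ResultConfiguration": {
          "OutputLocation": "s3://athenaQueryResult"
        }
      },
      "Next": "Get results of the query"
    },
    "Get results of the query": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:getQueryResults",
      "Parameters": {
        "QueryExecutionId.$": "$.QueryExecution.QueryExecutionId",
        "MaxResults": 100
      },
      "End": true
    }
  }
}
```

For larger result sets, reading the output file that Athena writes to the S3 OutputLocation is usually a safer route than passing rows through state input and output.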

For information about how to configure IAM permissions when using Step Functions with other AWS services, see How Step Functions generates IAM policies for integrated services.