Menu
Amazon Redshift
Cluster Management Guide (API Version 2012-12-01)

Analyzing Query Execution

The Query Execution Details section of the Query view provides information about the way the query was processed. This section combines data from SVL_QUERY_REPORT, STL_EXPLAIN, and other system views and tables.

The Query Execution Details section has three tabs:

  • Plan. This tab shows the explain plan for the query that is displayed.

    The information on the Plan tab is analogous to running the EXPLAIN command in the database. The EXPLAIN command examines your query text, and returns the query plan. You use this information to evaluate queries, and revise them for efficiency and performance if necessary. The EXPLAIN command doesn’t actually run the query.

    The following example shows a query that returns the top five sellers in San Diego, based on the number of tickets sold in 2008, and the query plan for that query.

    explain select sellerid, username, (firstname ||' '|| lastname) as name, city, sum(qtysold) from sales, date, users where sales.sellerid = users.userid and sales.dateid = date.dateid and year = 2008 and city = 'San Diego' group by sellerid, username, name, city order by 5 desc limit 5;

    For more information about understanding the explain plan, see Analyzing the Explain Plan in the Amazon Redshift Database Developer Guide.

    When you actually run the query (omitting the EXPLAIN command), the engine might find ways to optimize the query performance and change the way it processes the query. The actual performance data for the query is stored in the system views, such as SVL_QUERY_REPORT and SVL_QUERY_SUMMARY.

    The Amazon Redshift console uses a combination of STL_EXPLAIN, SVL_QUERY_REPORT, and other system views and tables to present the actual query performance and compare it to the explain plan for the query. This information appears on the Actual tab. In some cases, you might see that the explain plan and the actual query execution steps differ. In these cases, you might need to perform some operations in the database, such as ANALYZE, to update statistics and make the explain plan more effective.

    Additionally, sometimes the query optimizer breaks complex SQL queries into parts and creates temporary tables with the naming convention volt_tt_guid to process the query more efficiently. In this case, both the explain plan and the actual query execution summary apply to the last statement that was run. You can review previous query IDs to see the explain plan and actual query execution summary for each of the corresponding parts of the query.

    For more information about the difference between the explain plan and system views and logs, see Analyzing the Query Summary in the Amazon Redshift Database Developer Guide.

  • Actual. This tab shows the actual steps and statistics for the query that was executed. This information displays in a textual hierarchy and visual charts for Timeline and Execution time.

    The Timeline view shows the sequence in which the actual steps of the query are executed.

    The Execution time view shows the time taken for every step of the query. It can be used to understand what steps are taking longer to complete.

    Each view shows Avg time and Max time.

    The Avg statistic shows the average execution time for the step across data slices, and the percentage of the total query runtime that represents. The Max statistic shows the longest execution time for the step on any of the data slices, and the skew. The skew is the difference between the average and maximum execution times for the step.

    You might want to investigate a step if two conditions are both true. One condition is that the maximum execution time is consistently more than twice the average execution time over multiple runs of the query. The other condition is that the step also takes a significant amount of time (for example, being one of the top three steps in execution time in a large query).

    Note

    When possible, you should run a query twice to see what its execution details typically are. Compilation adds overhead to the first run of the query that is not present in subsequent runs.

    You can choose any bar in the chart to compare the data estimated from the explain plan with the actual performance of the query, as shown following.

    If the query optimizer posted alerts for the query in the STL_ALERT_EVENT_LOG system table, then the plan nodes associated with the alerts are flagged with an alert icon.

  • Metrics. This tab shows the metrics for the query that was executed.

    The Rows returned metric is the sum of the number of rows produced during each step of the query.

    The Bytes returned metric shows the number of bytes returned for each cluster node.

    The Execution time metric shows the query execution time for each cluster node.

    The Row throughput metric shows the number of rows returned divided by query execution time for each cluster node.

    If a query runs slower than expected, you can use the Metrics tab to troubleshoot the cause. Look at the Row throughput metric. If one of the cluster nodes appears to be have a much higher row throughput relative to the other nodes, the workload is unevenly distributed among the cluster nodes. One possible cause is that your data is unevenly distributed, or skewed, across node slices. for more information, see Identifying Tables with Data Skew or Unsorted Rows.

    If your data is evenly distributed, your query might be filtering for rows that are located mainly on that node. To fix this issue, look at the distribution styles for the tables in the query and see if any improvements can be made. Remember to weigh the performance of this query against the performance of other important queries and the system overall before making any changes. For more information, see Choosing a Data Distribution Style.

Note

The metrics tab is not available for a single-node cluster.

Viewing Query Execution Details Using the Console

Use the following procedure to look at the details of query execution.

To view query execution details

  1. Sign in to the AWS Management Console and open the Amazon Redshift console at https://console.aws.amazon.com/redshift/.

  2. In the navigation pane, choose Clusters.

  3. For Cluster, choose the cluster for which you want to view query execution details.

  4. Choose the Queries tab, and open the query for which you want to view performance data.

  5. Expand the Query Execution Details section and do the following:

    1. On the Plan tab, review the explain plan for the query. In some cases, you might find that your explain plan differs from the actual query execution on the Actual tab. In these cases, you might need to run ANALYZE to update statistics or perform other maintenance on the database to optimize the queries that you run. For more information about query optimization, see Tuning Query Performance in the Amazon Redshift Database Developer Guide.

    2. On the Actual tab, review the performance data associated with each of the plan nodes in the query execution. You can choose an individual plan node in the hierarchy to view performance data associated with that specific plan node. This data includes both the estimated and actual performance data.

    3. On the Metrics tab, review the metrics for each of the cluster nodes.