SVCS_S3QUERY_SUMMARY

Use the SVCS_S3QUERY_SUMMARY view to get a summary of all Redshift Spectrum queries (S3 queries) that have been run on the system. One segment can perform one external table scan.

Note

System views with the prefix SVCS provide details about queries on both the main and concurrency scaling clusters. The views are similar to the views with the prefix SVL except that the SVL views provide information only for queries run on the main cluster.

SVCS_S3QUERY_SUMMARY is visible to all users. Superusers can see all rows; regular users can see only their own data. For more information, see Visibility of data in system tables and views.

For information about SVL_S3QUERY, see SVL_S3QUERY.

Table columns

Column name	Data type	Description
userid	integer	The ID of the user that generated the given entry.
query	integer	The query ID. You can use this value to join various other system tables and views.
xid	bigint	The transaction ID.
pid	integer	The process ID.
segment	integer	The segment number. A query consists of multiple segments, and each segment consists of one or more steps.
step	integer	The query step that ran.
starttime	timestamp	The time in UTC that the Redshift Spectrum query in this segment started running. One segment can have one external table scan.
endtime	timestamp	The time in UTC that the Redshift Spectrum query in this segment completed. One segment can have one external table scan.
elapsed	integer	The length of time that it took the Redshift Spectrum query in this segment to run (in microseconds).
aborted	integer	If a query was stopped by the system or canceled by the user, this column contains `1`. If the query ran to completion, this column contains `0`.
external_table_name	char(136)	The name, in internal format, of the external table that was scanned.
file_format	character(16)	The file format of the external table data.
is_partitioned	char(1)	If true (`t`), this column value indicates that the external table is partitioned.
is_rrscan	char(1)	If true (`t`), this column value indicates that a range-restricted scan was applied.
is_nested	varchar(1)	If true (`t`), this column value indicates that the nested column data type is accessed.
s3_scanned_rows	bigint	The number of rows scanned from Amazon S3 and sent to the Redshift Spectrum layer.
s3_scanned_bytes	bigint	The number of bytes scanned from Amazon S3 and sent to the Redshift Spectrum layer, based on compressed data.
s3query_returned_rows	bigint	The number of rows returned from the Redshift Spectrum layer to the cluster.
s3query_returned_bytes	bigint	The number of bytes returned from the Redshift Spectrum layer to the cluster. A large amount of data returned to Amazon Redshift might affect system performance.
files	integer	The number of files that were processed for this Redshift Spectrum query. A small number of files limits the benefits of parallel processing.
files_max	integer	The maximum number of files processed on one slice.
files_avg	integer	The average number of files processed on one slice.
splits	bigint	The number of splits processed for this segment. With large splittable data files, for example, data files larger than about 512 MB, Redshift Spectrum tries to split the files into multiple S3 requests for parallel processing.
splits_max	integer	The maximum number of splits processed on this slice.
splits_avg	bigint	The average number of splits processed on this slice.
total_split_size	bigint	The total size of all splits processed.
max_split_size	bigint	The maximum split size processed, in bytes.
avg_split_size	bigint	The average split size processed, in bytes.
total_retries	bigint	The total number of retries for the Redshift Spectrum query in this segment.
max_retries	integer	The maximum number of retries for one individual processed file.
max_request_duration	bigint	The maximum duration of an individual file request (in microseconds). Long-running queries might indicate a bottleneck.
avg_request_duration	bigint	The average duration of the file requests (in microseconds).
max_request_parallelism	integer	The maximum number of parallel requests at one slice for this Redshift Spectrum query.
avg_request_parallelism	double precision	The average number of parallel requests at one slice for this Redshift Spectrum query.
total_slowdown_count	bigint	The total number of Amazon S3 requests with a slow down error that occurred during the external table scan.
max_slowdown_count	integer	The maximum number of Amazon S3 requests with a slow down error that occurred during the external table scan on one slice.
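
As an illustration of how these columns can be combined, the following sketch lists the largest completed Redshift Spectrum scans, with the elapsed time converted to seconds and a rough ratio of bytes returned to bytes scanned. The aliases `elapsed_seconds` and `returned_ratio` and the limit of 20 rows are assumptions made for this example. Because s3_scanned_bytes is based on compressed data, treat the ratio as an approximate indicator only.

```sql
-- Sketch only: aliases and the row limit are illustrative, not part of the view.
select query,
       segment,
       elapsed / 1000000.0 as elapsed_seconds,            -- elapsed is reported in microseconds
       files,
       s3_scanned_bytes,
       s3query_returned_bytes,
       s3query_returned_bytes::float
         / nullif(s3_scanned_bytes, 0) as returned_ratio, -- null when no bytes were scanned
       total_retries,
       total_slowdown_count
from svcs_s3query_summary
where aborted = 0                                         -- only scans that ran to completion
order by s3_scanned_bytes desc
limit 20;
```

A high `returned_ratio`, a small value in files, or nonzero retry and slow down counts might each be worth a closer look, according to the column descriptions above.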

Sample query

The following example gets the scan step details for the last query run.


select query, segment, elapsed, s3_scanned_rows, s3_scanned_bytes, s3query_returned_rows, s3query_returned_bytes, files 
from svcs_s3query_summary 
where query = pg_last_query_id() 
order by query,segment;

query | segment | elapsed | s3_scanned_rows | s3_scanned_bytes | s3query_returned_rows | s3query_returned_bytes | files
------+---------+---------+-----------------+------------------+-----------------------+------------------------+------               
 4587 |       2 |   67811 |               0 |                0 |                     0 |                      0 |     0
 4587 |       2 |  591568 |          172462 |         11260097 |                  8513 |                 170260 |     1
 4587 |       2 |  216849 |               0 |                0 |                     0 |                      0 |     0
 4587 |       2 |  216671 |               0 |                0 |                     0 |                      0 |     0
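
Because the view can return several rows for the same query and segment, as in the preceding output, it might be convenient to roll the rows up into one line per query. The following is a minimal sketch of that idea. The aliases (`summary_rows`, `max_elapsed_us`, and so on) are assumptions made for this example, and summing across rows assumes that each row reports a distinct part of the scan.

```sql
-- Sketch only: one summary line for the last query run in this session.
select query,
       count(*)                    as summary_rows,          -- rows reported for the query
       max(elapsed)                as max_elapsed_us,        -- longest row, in microseconds
       sum(s3_scanned_rows)        as total_scanned_rows,
       sum(s3_scanned_bytes)       as total_scanned_bytes,
       sum(s3query_returned_rows)  as total_returned_rows,
       sum(s3query_returned_bytes) as total_returned_bytes,
       sum(files)                  as total_files
from svcs_s3query_summary
where query = pg_last_query_id()
group by query;
```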
