Menu
Amazon Redshift
Database Developer Guide (API Version 2012-12-01)

ANALYZE COMPRESSION

Performs compression analysis and produces a report with the suggested compression encoding for the tables analyzed. For each column, the report includes an estimate of the potential reduction in disk space compared to the current encoding.

Syntax

Copy
ANALYZE COMPRESSION [ [ table_name ] [ ( column_name [, ...] ) ] ] [COMPROWS numrows]

Parameters

table_name

You can analyze compression for specific tables, including temporary tables. You can qualify the table with its schema name. You can optionally specify a table_name to analyze a single table. If you do not specify a table_name, all of the tables in the currently connected database are analyzed. You cannot specify more than one table_name with a single ANALYZE COMPRESSION statement.

column_name

If you specify a table_name, you can also specify one or more columns in the table (as a column-separated list within parentheses).

COMPROWS

Number of rows to be used as the sample size for compression analysis. The analysis is run on rows from each data slice. For example, if you specify COMPROWS 1000000 (1,000,000) and the system contains 4 total slices, no more than 250,000 rows per slice are read and analyzed. If COMPROWS is not specified, the sample size defaults to 100,000 per slice. Values of COMPROWS lower than the default of 100,000 rows per slice are automatically upgraded to the default value. However, compression analysis will not produce recommendations if the amount of data in the table is insufficient to produce a meaningful sample. If the COMPROWS number is greater than the number of rows in the table, the ANALYZE COMPRESSION command still proceeds and runs the compression analysis against all of the available rows.

numrows

Number of rows to be used as the sample size for compression analysis. The accepted range for numrows is a number between 1000 and 1000000000 (1,000,000,000).

Usage Notes

Run ANALYZE COMPRESSION to get recommendations for column encoding schemes, based on a sample of the table's contents. ANALYZE COMPRESSION is an advisory tool and doesn't modify the column encodings of the table. The suggested encoding can be applied by recreating the table, or creating a new table with the same schema. Recreating an uncompressed table with appropriate encoding schemes can significantly reduce its on-disk footprint, saving disk space and improving query performance for IO-bound workloads.

ANALYZE COMPRESSION doesn't consider Runlength Encoding encoding on any column that is designated as a SORTKEY because range-restricted scans might perform poorly when SORTKEY columns are compressed much more highly than other columns.

ANALYZE COMPRESSION acquires an exclusive table lock, which prevents concurrent reads and writes against the table. Only run the ANALYZE COMPRESSION command when the table is idle.

Examples

Analyze the LISTING table only:

Copy
analyze compression listing; Table | Column | Encoding | Est_reduction_pct --------+----------------+----------+------------------ listing | listid | delta32k | 0.00 listing | sellerid | delta32k | 0.00 listing | eventid | delta32k | 0.00 listing | dateid | lzo | 96.94 listing | numtickets | delta | 0.00 listing | priceperticket | lzo | 15.74 listing | totalprice | lzo | 18.06 listing | listtime | lzo | 20.98

Analyze the QTYSOLD, COMMISSION, and SALETIME columns in the SALES table:

Copy
analyze compression sales(qtysold, commission, saletime); Table | Column | Encoding | Est_reduction_pct ------+------------+----------+------------------ sales | salesid | N/A | 0.00 sales | listid | N/A | 0.00 sales | sellerid | N/A | 0.00 sales | buyerid | N/A | 0.00 sales | eventid | N/A | 0.00 sales | dateid | N/A | 0.00 sales | qtysold | lzo | 17.49 sales | pricepaid | N/A | 0.00 sales | commission | delta32k | 0.00 sales | saletime | lzo | 20.99