Amazon Redshift
Database Developer Guide (API Version 2012-12-01)

Document History

The following table describes the important changes since the last release of the Amazon Redshift Database Developer Guide.

API version: 2012-12-01

Latest documentation update: February 9, 2017

For a list of the changes to the Amazon Redshift Cluster Management Guide, see Amazon Redshift Cluster Management Guide Document History.

For more information about new features, including a list of fixes and the associated cluster version numbers for each release, go to the Amazon Redshift forum.

Change Description Date Changed
Server-side encryption with KMS You can now unload data to Amazon S3 using server-side encryption with an AWS Key Management Service key (SSE-KMS). In addition, COPY now transparently loads KMS-encrypted data files from Amazon S3. For more information, see UNLOAD. February 9, 2017
New authorization syntax You can now use the IAM_ROLE, MASTER_SYMMETRIC_KEY, ACCESS_KEY_ID, SECRET_ACCESS_KEY, and TOKEN parameters to provide authorization and access information for COPY, UNLOAD, and CREATE LIBRARY commands. The new authorization syntax provides a more flexible alternative to providing a single string argument to the CREDENTIALS parameter. For more information, see Authorization Parameters. February 9, 2017
Schema limit increase You can now create up to 9,900 schemas per cluster. For more information, see CREATE SCHEMA. February 9, 2017
Default table encoding CREATE TABLE and ALTER TABLE now assign LZO compression encoding to most new columns. Columns defined as sort keys, columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types, and temporary tables are assigned RAW encoding by default. For more information, see ENCODE. February 6, 2017
ZSTD compression encoding Amazon Redshift now supports ZSTD column compression encoding. January 19, 2017
PERCENTILE_CONT and MEDIAN aggregate functions PERCENTILE_CONT and MEDIAN are now available as aggregate functions as well as window functions. January 19, 2017
User-defined function (UDF) User Logging You can use the Python logging module to create user-defined error and warning messages in your UDFs. Following query execution, you can query the SVL_UDF_LOG system view to retrieve logged messages. For more information about user-defined messages, see Logging Errors and Warnings in UDFs December 8, 2016
ANALYZE COMPRESSION estimated reduction The ANALYZE COMPRESSION command now reports an estimate for percentage reduction in disk space for each column. For more information, see ANALYZE COMPRESSION. November 10, 2016
Connection limits You can now set a limit on the number of database connections a user is permitted to have open concurrently. You can also limit the number of concurrent connections for a database. For more information, see CREATE USER and CREATE DATABASE. November 10, 2016
COPY sort order enhancement COPY now automatically adds new rows to the table's sorted region when you load your data in sort key order. For specific requirements to enable this enhancement, see Loading Your Data in Sort Key Order November 10, 2016
CTAS with compression CREATE TABLE AS (CTAS) now automatically assigns compression encodings to new tables based on the column's data type. For more information, see Inheritance of Column and Table Attributes. October 28, 2016
Time stamp with time zone data type Amazon Redshift now supports a time stamp with time zone (TIMESTAMPTZ) data type. Also, several new functions have been added to support the new data type. For more information, see Date and Time Functions. September 29, 2016
Analyze threshold To reduce processing time and improve overall system performance for ANALYZE operations, Amazon Redshift skips analyzing a table if the percentage of rows that have changed since the last ANALYZE command run is lower than the analyze threshold specified by the analyze_threshold_percent parameter. By default, analyze_threshold_percent is 10. August 9, 2016
New STL_RESTARTED_SESSIONS system table When Amazon Redshift restarts a session, STL_RESTARTED_SESSIONS records the new process ID (PID) and the old PID. August 9, 2016
Updated the Date and Time Functions documentation Added a summary of functions with links to the Date and Time Functions, and updated the function references for consistency. June 24, 2016
New columns in STL_CONNECTION_LOG The STL_CONNECTION_LOG system table has two new columns to track SSL connections. If you routinely load audit logs to an Amazon Redshift table, you will need to add the following new columns to the target table: sslcompression and sslexpansion. May 5, 2016
MD5-hash password You can specify the password for a CREATE USER or ALTER USER command by supplying the MD5-hash string of the password and user name. April 21, 2016
New column in STV_TBL_PERM The backup column in the STV_TBL_PERM system view indicates whether the table is included in cluster snapshots. For more information, see BACKUP. April 21, 2016
No-backup tables For tables, such as staging tables, that won't contain critical data, you can specify BACKUP NO in your CREATE TABLE or CREATE TABLE AS statement to prevent Amazon Redshift from including the table in automated or manual snapshots. Using no-backup tables saves processing time when creating snapshots and restoring from snapshots and reduces storage space on Amazon S3. April 7, 2016
VACUUM delete threshold By default, the VACUUM command now reclaims space such that at least 95 percent of the remaining rows are not marked for deletion. As a result, VACUUM usually needs much less time for the delete phase compared to reclaiming 100 percent of deleted rows. You can change the default threshold for a single table by including the TO threshold PERCENT parameter when you run the VACUUM command. April 7, 2016
SVV_TRANSACTIONS system table The SVV_TRANSACTIONS system view records information about transactions that currently hold locks on tables in the database. April 7, 2016
Using IAM roles to access other AWS resources To move data between your cluster and another AWS resource, such as Amazon S3, Amazon DynamoDB, Amazon EMR, or Amazon EC2, your cluster must have permission to access the resource and perform the necessary actions. As a more secure alternative to providing an access key pair with COPY, UNLOAD, or CREATE LIBRARY commands, you can now you specify an IAM role that your cluster uses for authentication and authorization. For more information, see Role-Based Access Control. March 29, 2016
VACUUM sort threshold The VACUUM command now skips the sort phase for any table where more than 95 percent of the table's rows are already sorted. You can change the default sort threshold for a single table by including the TO threshold PERCENT parameter when you run the VACUUM command. March 17, 2016
New columns in STL_CONNECTION_LOG The STL_CONNECTION_LOG system table has three new columns. If you routinely load audit logs to an Amazon Redshift table, you will need to add the following new columns to the target table: sslversion, sslcipher, and mtu. March 17, 2016
UNLOAD with bzip2 compression You now have the option to UNLOAD using bzip2 compression. February 8, 2016
ALTER TABLE APPEND ALTER TABLE APPEND appends rows to a target table by moving data from an existing source table. ALTER TABLE APPEND is usually much faster than a similar CREATE TABLE AS or INSERT INTO operation because data is moved, not duplicated. February 8, 2016
WLM Query Queue Hopping If workload management (WLM) cancels a read-only query, such as a SELECT statement, due to a WLM timeout, WLM attempts to route the query to the next matching queue. For more information, see WLM Query Queue Hopping January 7, 2016
ALTER DEFAULT PRIVILEGES You can use the ALTER DEFAULT PRIVILEGES command to define the default set of access privileges to be applied to objects that are created in the future by the specified user. December 10, 2015
bzip2 file compression The COPY command supports loading data from files that were compressed using bzip2. December 10, 2015
NULLS FIRST and NULLS LAST You can specify whether an ORDER BY clause should rank NULLS first or last in the result set. For more information, see ORDER BY Clause and Window Function Syntax Summary. November 19, 2015
REGION keyword for CREATE LIBRARY If the Amazon S3 bucket that contains the UDF library files does not reside in the same region as your Amazon Redshift cluster, you can use the REGION option to specify the region in which the data is located. For more information, see CREATE LIBRARY. November 19, 2015
User-defined scalar functions (UDFs) You can now create custom user-defined scalar functions to implement non-SQL processing functionality provided either by Amazon Redshift-supported modules in the Python 2.7 Standard Library or your own custom UDFs based on the Python programming language. For more information, see Creating User-Defined Functions. September 11, 2015
Dynamic properties in WLM configuration The WLM configuration parameter now supports applying some properties dynamically. Other properties remain static changes and require that associated clusters be rebooted so that the configuration changes can be applied. For more information, see WLM Dynamic and Static Configuration Properties and Implementing Workload Management. August 3, 2015
LISTAGG function The LISTAGG Function and LISTAGG Window Function return a string created by concatenating a set of column values. July 30, 2015
Deprecated parameter

The max_cursor_result_set_size configuration parameter is deprecated. The size of cursor result sets are constrained based on the cluster's node type. For more information, see Cursor Constraints.

July 24, 2015
Revised COPY command documentation The COPY command reference has been extensively revised to make the material friendlier and more accessible. July 15, 2015
COPY from Avro format The COPY command supports loading data in Avro format from data files on Amazon S3, Amazon EMR, and from remote hosts using SSH. For more information, see AVRO and Copy from Avro Examples. July 8, 2015
STV_STARTUP_RECOVERY_STATE The STV_STARTUP_RECOVERY_STATE system table records the state of tables that are temporarily locked during cluster restart operations. Amazon Redshift places a temporary lock on tables while they are being processed to resolve stale transactions following a cluster restart. May 25, 2015
ORDER BY optional for ranking functions The ORDER BY clause is now optional for certain window ranking functions. For more information, see CUME_DIST Window Function, DENSE_RANK Window Function, RANK Window Function, NTILE Window Function, PERCENT_RANK Window Function, and ROW_NUMBER Window Function. May 25, 2015
Interleaved Sort Keys Interleaved sort keys give equal weight to each column in the sort key. Using interleaved sort keys instead of the default compound keys significantly improves performance for queries that use restrictive predicates on secondary sort columns, especially for large tables. Interleaved sorting also improves overall performance when multiple queries filter on different columns in the same table. For more information, see Choosing Sort Keys and CREATE TABLE. May 11, 2015
Revised Tuning Query Performance topic Tuning Query Performance has been expanded to include new queries for analyzing query performance and more examples. Also, the topic has been revised to be clearer and more complete. Best Practices for Designing Queries has more information about how to write queries to improve performance. March 23, 2015
SVL_QUERY_QUEUE_INFO The SVL_QUERY_QUEUE_INFO view summarizes details for queries that spent time in a WLM query queue or commit queue. February 19, 2015
SVV_TABLE_INFO You can use the SVV_TABLE_INFO view to diagnose and address table design issues that can influence query performance, including issues with compression encoding, distribution keys, sort style, data distribution skew, table size, and statistics. February 19, 2015
UNLOAD uses server-side file encryption The UNLOAD command now automatically uses Amazon S3 server-side encryption (SSE) to encrypt all unload data files. Server-side encryption adds another layer of data security with little or no change in performance. October 31, 2014
CUME_DIST window function The CUME_DIST Window Function calculates the cumulative distribution of a value within a window or partition. October 31, 2014
MONTHS_BETWEEN function The MONTHS_BETWEEN Function determines the number of months between two dates. October 31, 2014
NEXT_DAY function The NEXT_DAY Function returns the date of the first instance of a specified day that is later than the given date. October 31, 2014
PERCENT_RANK window function The PERCENT_RANK Window Function calculates the percent rank of a given row. October 31, 2014
RATIO_TO_REPORT window function The RATIO_TO_REPORT Window Function calculates the ratio of a value to the sum of the values in a window or partition. October 31, 2014
TRANSLATE function The TRANSLATE Function replaces all occurrences of specified characters within a given expression with specified substitutes. October 31, 2014
NVL2 function The NVL2 Expression returns one of two values based on whether a specified expression evaluates to NULL or NOT NULL. October 16, 2014
MEDIAN window function The MEDIAN Window Function calculates the median value for the range of values in a window or partition. October 16, 2014
ON ALL TABLES IN SCHEMA schema_name clause for GRANT and REVOKE commands The GRANT and REVOKE commands have been updated with an ON ALL TABLES IN SCHEMA schema_name clause. This clause allows you to use a single command to change privileges for all tables in a schema. October 16, 2014
IF EXISTS clause for DROP SCHEMA, DROP TABLE, DROP USER, and DROP VIEW commands The DROP SCHEMA, DROP TABLE, DROP USER, and DROP VIEW commands have been updated with an IF EXISTS clause. This clause causes the command to make no changes and return a message rather than terminating with an error if the specified object doesn’t exist. October 16, 2014
IF NOT EXISTS clause for CREATE SCHEMA and CREATE TABLE commands The CREATE SCHEMA and CREATE TABLE commands have been updated with an IF NOT EXISTS clause. This clause causes the command to make no changes and return a message rather than terminating with an error if the specified object already exists. October 16, 2014
COPY support for UTF-16 encoding The COPY command now supports loading from data files that use UTF-16 encoding as well as UTF-8 encoding. For more information, see ENCODING. September 29, 2014
New Workload Management Tutorial Tutorial: Configuring Workload Management (WLM) Queues to Improve Query Processing walks you through the process of configuring Workload Management (WLM) queues to improve query processing and resource allocation. September 25, 2014
AES 128-bit Encryption The COPY command now supports AES 128-bit encryption as well as AES 256-bit encryption when loading from data files encrypted using Amazon S3 client-side encryption. For more information, see Loading Encrypted Data Files from Amazon S3. September 29, 2014
PG_LAST_UNLOAD_COUNT function The PG_LAST_UNLOAD_COUNT function returns the number of rows that were processed in the most recent UNLOAD operation. For more information, see PG_LAST_UNLOAD_COUNT(). September 15, 2014
New Troubleshooting Queries section Troubleshooting Queries provides a quick reference for identifying and addressing some of the most common and most serious issues you are likely to encounter with Amazon Redshift queries. July 7, 2014
New Loading Data tutorial Tutorial: Loading Data from Amazon S3 walks you through the process of loading data into your Amazon Redshift database tables from data files in an Amazon S3 bucket, from beginning to end. July 1, 2014
PERCENTILE_CONT window function PERCENTILE_CONT Window Function is an inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification, and returns an interpolated value that would fall into the given percentile value with respect to the sort specification. June 30, 2014
PERCENTILE_DISC window function PERCENTILE_DISC Window Function is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. June 30, 2014
GREATEST and LEAST functions The GREATEST and LEAST functions return the largest or smallest value from a list of expressions. June 30, 2014
Cross-region COPY The COPY command supports loading data from an Amazon S3 bucket or Amazon DynamoDB table that is located in a different region than the Amazon Redshift cluster. For more information, see REGION in the COPY command reference. June 30, 2014
Best Practices expanded Amazon Redshift Best Practices has been expanded, reorganized, and moved to the top of the navigation hierarchy to make it more discoverable. May 28, 2014
UNLOAD to a single file The UNLOAD command can optionally unload table data serially to a single file on Amazon S3 by adding the PARALLEL OFF option. If the size of the data is greater than the maximum file size of 6.2 GB, UNLOAD creates additional files. May 6, 2014
REGEXP functions The REGEXP_COUNT, REGEXP_INSTR, and REGEXP_REPLACE functions manipulate strings based on regular expression pattern matching. May 6, 2014
New Tutorial The new Tutorial: Tuning Table Design walks you through the steps to optimize the design of your tables, including testing load and query performance before and after tuning. May 2, 2014
COPY from Amazon EMR

The COPY command supports loading data directly from Amazon EMR clusters. For more information, see Loading Data from Amazon EMR.

April 18, 2014
WLM concurrency limit increase

You can now configure workload management (WLM) to run up to 50 queries concurrently in user-defined query queues. This increase gives users more flexibility for managing system performance by modifying WLM configurations. For more information, see Defining Query Queues

April 18, 2014
New configuration parameter to manage cursor size

The max_cursor_result_set_size configuration parameter defines the maximum size of data, in megabytes, that can be returned per cursor result set of a larger query. This parameter value also affects the number of concurrent cursors for the cluster, enabling you to configure a value that increases or decreases the number of cursors for your cluster.

For more information, see DECLARE in this guide and Configure Maximum Size of a Cursor Result Set in the Amazon Redshift Cluster Management Guide.

March 28, 2014
COPY from JSON format The COPY command supports loading data in JSON format from data files on Amazon S3 and from remote hosts using SSH. For more information, see COPY from JSON Format usage notes. March 25, 2014
New system table STL_PLAN_INFO The STL_PLAN_INFO table supplements the EXPLAIN command as another way to look at query plans. March 25, 2014
New function REGEXP_SUBSTR The REGEXP_SUBSTR Function returns the characters extracted from a string by searching for a regular expression pattern. March 25, 2014
New columns for STL_COMMIT_STATS The STL_COMMIT_STATS table has two new columns: numxids and oldestxid. March 6, 2014
COPY from SSH support for gzip and lzop The COPY command supports gzip and lzop compression when loading data through an SSH connection. February 13, 2014
New functions The ROW_NUMBER Window Function returns the number of the current row. The STRTOL Function converts a string expression of a number of the specified base to the equivalent integer value. PG_CANCEL_BACKEND and PG_TERMINATE_BACKEND enable users to cancel queries and session connections. The LAST_DAY function has been added for Oracle compatibility. February 13, 2014
New system table The STL_COMMIT_STATS system table provides metrics related to commit performance, including the timing of the various stages of commit and the number of blocks committed. February 13, 2014
FETCH with single-node clusters When using a cursor on a single-node cluster, the maximum number of rows that can be fetched using the FETCH command is 1000. FETCH FORWARD ALL is not supported for single-node clusters. February 13, 2014
DS_DIST_ALL_INNER redistribution strategy DS_DIST_ALL_INNER in the Explain plan output indicates that the entire inner table was redistributed to a single slice because the outer table uses DISTSTYLE ALL. For more information, see Join Type Examples and Evaluating the Query Plan. January 13, 2014
New system tables for queries Amazon Redshift has added new system tables that customers can use to evaluate query execution for tuning and troubleshooting. For more information, see SVL_COMPILE, STL_SCAN, STL_RETURN, STL_SAVE STL_ALERT_EVENT_LOG. January 13, 2014
Single-node cursors Cursors are now supported for single-node clusters. A single-node cluster can have two cursors open at a time, with a maximum result set of 32 GB. On a single-node cluster, we recommend setting the ODBC Cache Size parameter to 1,000. For more information, see DECLARE. December 13, 2013
ALL distribution style ALL distribution can dramatically shorter execution times for certain types of queries. When a table uses ALL distribution style, a copy of the table is distributed to every node. Because the table is effectively collocated with every other table, no redistribution is needed during query execution. ALL distribution is not appropriate for all tables because it increases storage requirements and load time. For more information, see Choosing a Data Distribution Style. November 11, 2013
COPY from remote hosts In addition to loading tables from data files on Amazon S3 and from Amazon DynamoDB tables, the COPY command can load text data from Amazon EMR clusters, Amazon EC2 instances, and other remote hosts by using SSH connections. Amazon Redshift uses multiple simultaneous SSH connections to read and load data in parallel. For more information, see Loading Data from Remote Hosts. November 11, 2013
WLM Memory Percent Used You can balance workload by designating a specific percentage of memory for each queue in your workload management (WLM) configuration. For more information, see Defining Query Queues. November 11, 2013
APPROXIMATE COUNT(DISTINCT) Queries that use APPROXIMATE COUNT(DISTINCT) execute much faster, with a relative error of about 2%. The APPROXIMATE COUNT(DISTINCT) function uses a HyperLogLog algorithm. For more information, see the COUNT Function. November 11, 2013
New SQL functions to retrieve recent query details Four new SQL functions retrieve details about recent queries and COPY commands. The new functions make it easier to query the system log tables, and in many cases provide necessary details without needing to access the system tables. For more information, see PG_BACKEND_PID(), PG_LAST_COPY_ID(), PG_LAST_COPY_COUNT(), PG_LAST_QUERY_ID(). November 1, 2013
MANIFEST option for UNLOAD The MANIFEST option for the UNLOAD command complements the MANIFEST option for the COPY command. Using the MANIFEST option with UNLOAD automatically creates a manifest file that explicitly lists the data files that were created on Amazon S3 by the unload operation. You can then use the same manifest file with a COPY command to load the data. For more information, see Unloading Data to Amazon S3 and UNLOAD Examples. November 1, 2013
MANIFEST option for COPY You can use the MANIFEST option with the COPY command to explicitly list the data files that will be loaded from Amazon S3. October 18, 2013
System tables for troubleshooting queries Added documentation for system tables that are used to troubleshoot queries. The STL Tables for Logging section now contains documentation for the following system tables: STL_AGGR, STL_BCAST, STL_DIST, STL_DELETE, STL_HASH, STL_HASHJOIN, STL_INSERT, STL_LIMIT, STL_MERGE, STL_MERGEJOIN, STL_NESTLOOP, STL_PARSE, STL_PROJECT, STL_SCAN, STL_SORT, STL_UNIQUE, STL_WINDOW. October 3, 2013
CONVERT_TIMEZONE function The CONVERT_TIMEZONE Function converts a timestamp from one time zone to another, with the option to automatically adjust for daylight savings time. October 3, 2013
SPLIT_PART function The SPLIT_PART Function splits a string on the specified delimiter and returns the part at the specified position. October 3, 2013
STL_USERLOG system table STL_USERLOG records details for changes that occur when a database user is created, altered, or deleted. October 3, 2013
LZO column encoding and LZOP file compression. LZO column compression encoding combines a very high compression ratio with good performance. COPY from Amazon S3 supports loading from files compressed using LZOP compression. September 19, 2013
JSON, Regular Expressions, and Cursors Added support for parsing JSON strings, pattern matching using regular expressions, and using cursors to retrieve large data sets over an ODBC connection. For more information, see JSON Functions, Pattern-Matching Conditions, and DECLARE. September 10, 2013
ACCEPTINVCHAR option for COPY You can successfully load data that contains invalid UTF-8 characters by specifying the ACCEPTINVCHAR option with the COPY command. August 29, 2013
CSV option for COPY The COPY command now supports loading from CSV formatted input files. August 9, 2013
CRC32 The CRC32 Function performs cyclic redundancy checks. August 9, 2013
WLM wildcards Workload management (WLM) supports wildcards for adding user groups and query groups to queues. For more information, see Wildcards. August 1, 2013
WLM timeout To limit the amount of time that queries in a given WLM queue are permitted to use, you can set the WLM timeout value for each queue. For more information, see WLM Timeout. August 1, 2013
New COPY options 'auto' and 'epochsecs' The COPY command performs automatic recognition of date and time formats. New time formats, 'epochsecs' and 'epochmillisecs' enable COPY to load data in epoch format. July 25, 2013
CONVERT_TIMEZONE function The CONVERT_TIMEZONE Function converts a timestamp from one timezone to another. July 25, 2013
FUNC_SHA1 function The FUNC_SHA1 Function converts a string using the SHA1 algorithm. July 15, 2013
max_execution_time To limit the amount of time queries are permitted to use, you can set the max_execution_time parameter as part of the WLM configuration. For more information, see Modifying the WLM Configuration. July 22, 2013
Four-byte UTF-8 characters The VARCHAR data type now supports four-byte UTF-8 characters. Five-byte or longer UTF-8 characters are not supported. For more information, see Storage and Ranges. July 18, 2013
SVL_QERROR The SVL_QERROR system view has been deprecated. July 12, 2013
Revised Document History The Document History page now shows the date the documentation was updated. July 12, 2013
STL_UNLOAD_LOG STL_UNLOAD_LOG records the details for an unload operation. July 5, 2013
JDBC fetch size parameter To avoid client-side out of memory errors when retrieving large data sets using JDBC, you can enable your client to fetch data in batches by setting the JDBC fetch size parameter. For more information, see Setting the JDBC Fetch Size Parameter. June 27, 2013
UNLOAD encrypted files UNLOAD now supports unloading table data to encrypted files on Amazon S3. May 22, 2013
Temporary credentials COPY and UNLOAD now support the use of temporary credentials. April 11, 2013
Added clarifications Clarified and expanded discussions of Designing Tables and Loading Data. February 14, 2013
Added Best Practices Added Best Practices for Designing Tables and Best Practices for Loading Data. February 14, 2013
Clarified password constraints Clarified password constraints for CREATE USER and ALTER USER, various minor revisions. February 14, 2013

New Guide

This is the first release of the Amazon Redshift Developer Guide. February 14, 2013