When users run queries in Amazon Redshift, the queries are routed to query queues.
Each query queue contains a number of slots, which are units of memory and CPU that are
used to process queries. You can configure WLM properties for each query queue to
specify the way that memory is allocated among slots, how queries can be routed to
specific queues at runtime, and when to cancel long-running queries. You can also use
the wlm_query_slot_count parameter, which is separate from the WLM properties, to
temporarily override the memory allocation set in the WLM configuration.
When you create a parameter group, Amazon Redshift has the following query queues by default:
Service classes. Four service classes are reserved for use by the Amazon Redshift service itself. These queues can't be configured or used, even by superusers. They do not appear in the console, but they do appear in the system tables in the database as the first four queues.
A superuser queue. The superuser queue is reserved for superusers only, and
it can't be configured. Use this queue only when you need to run
queries that affect the system or for troubleshooting purposes. For example,
use this queue when you need to cancel a user's long-running query or to add
users to the database. Don't use it to run routine queries. The
queue does not appear in the console, but it does appear in the system tables
in the database as the fifth queue. To run a query in the superuser queue, a
user must be logged in as a superuser and must run the query using the
superuser query group.
A default user queue. The default queue is the queue to which users' queries are routed for processing. By default, this queue is configured to run five queries concurrently, but you can adjust this and other WLM properties for this queue and any other queues that you create.
You can modify the default user queue and add additional query queues if you need to change the way that Amazon Redshift allocates resources for query processing. You can configure the following for each query queue:
Concurrency level
User groups
Query groups
Wildcards for users and groups
WLM memory percent to use
WLM timeout
You can modify these properties by using the Amazon Redshift console, AWS Command Line Interface (AWS CLI), the
AWS Query API, or the AWS Software Development Kit (SDK) libraries. These properties
are specified in the
wlm_json_configuration parameter, which is one of the
parameters in the parameter group that is associated with your cluster. This parameter
contains a JSON-formatted array of objects that correspond to query queues in WLM. Each
object contains name/value pairs that
specify values for each of the properties that you can configure for the queue. For more
information about configuring WLM, go to Configure Workload Management Using the wlm_json_configuration Parameter in
the Amazon Redshift Cluster Management Guide.
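As an illustration, a wlm_json_configuration value might look like the following sketch. The queue names, group names, and numbers here are hypothetical; the property names correspond to the queue properties discussed in this section (concurrency, user groups, query groups, wildcards, memory percent, and timeout).

```
[
  {
    "user_group": ["dba"],
    "user_group_wild_card": 0,
    "query_group": ["report"],
    "query_group_wild_card": 0,
    "query_concurrency": 10,
    "memory_percent_to_use": 40,
    "max_execution_time": 600000
  },
  {
    "query_concurrency": 5
  }
]
```

The last object, which has no user groups or query groups, defines the default queue.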
Queries in a queue run concurrently until they reach the concurrency level defined for that queue. Subsequent queries then wait in the queue. Each queue can be configured to run up to 50 queries concurrently. The maximum total concurrency level for all user-defined queues, not including the reserved Superuser queue, is 50. Amazon Redshift allocates an equal, fixed share of server memory to each queue, and, by default, an equal, fixed share of a queue's memory to each query slot in the queue.
By default, WLM queues have a concurrency level of 5. Your workload might benefit from a higher concurrency level in certain cases, such as the following:
If many small queries are forced to wait for long-running queries, create a separate queue with a higher concurrency level and assign the smaller queries to that queue. A queue with a higher concurrency level has less memory allocated to each query slot, but the smaller queries require less memory.
If you have multiple queries that each access data on a single slice, set up a separate WLM queue to execute those queries concurrently. Amazon Redshift will assign concurrent queries to separate slices, which allows multiple queries to execute in parallel on multiple slices. For example, if a query is a simple aggregate with a predicate on the distribution key, the data for the query will be located on a single slice.
As a best practice, we recommend using a concurrency level of 15 or lower. All of the compute nodes in a cluster, and all of the slices on the nodes, participate in parallel query execution. By increasing concurrency, you increase the contention for system resources and limit the overall throughput.
The memory that is allocated to each queue is divided among the query slots in that queue. The amount of memory available to a query is the memory allocated to the query slot in which the query is running, regardless of the number of queries that are actually running concurrently. A query that can run entirely in memory when the concurrency level is 5 might need to write intermediate results to disk if the concurrency level is increased to 20. The additional disk I/O could degrade performance.
If a specific query needs more memory than is allocated to a single query slot,
you can increase the available memory by increasing the wlm_query_slot_count
parameter. The following example sets
wlm_query_slot_count to 10,
performs a vacuum, and then resets
wlm_query_slot_count to 1.
set wlm_query_slot_count to 10;
vacuum;
set wlm_query_slot_count to 1;
For more information, see Managing How Queries Use Memory.
You can assign a comma-separated list of user groups to a queue. When a member of a listed user group runs a query, that query runs in the corresponding queue. There is no set limit on the number of user groups that can be assigned to a queue.
You can assign a comma-separated list of query groups for each queue. A query group is simply a label. At runtime, you can assign the query group label to a series of queries. Any queries that are assigned to a listed query group will run in the corresponding queue. There is no set limit to the number of query groups that can be assigned to a queue.
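As a sketch of how a query group label is applied at runtime (the label 'report' and the table name are hypothetical), you set the label for the session, run the queries, and then reset it:

```
set query_group to 'report';
select count(*) from sales;
reset query_group;
```

Any queries run between the set and reset statements are routed to whichever queue lists the 'report' query group.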
You can assign user groups and query groups to a queue either individually or by
using wildcards. For example, if you add 'dba_*' to the list
of user groups for a queue, any query that is run by a user with a user name that
begins with 'dba_' is assigned to that queue. Wildcards are
disabled by default. To enable a queue to use wildcards through the AWS Management
Console, select the Enable User Group Wildcards check box or the
Enable Query Group Wildcards check box on the
Workload Management Configuration tab. To enable wildcards
through the API or CLI, set the
query_group_wild_card value or the
user_group_wild_card value to 1.
To specify the amount of available memory that is allocated to a query, you can set the WLM Memory Percent to Use parameter. By default, each user-defined queue is allocated an equal portion of the memory that is available for user-defined queries. For example, if you have four user-defined queues, each queue is allocated 25% of the available memory. The superuser queue has its own allocated memory and cannot be modified. You can assign an integer percentage of memory to each queue, up to a total of 100%. If you attempt to assign more than 100%, the WLM configuration is rejected. If you specify a WLM Memory Percent to Use value for one or more queues but not all of the queues, then the unallocated memory is divided evenly among the remaining queues.
For example, if you configure four queues and assign 30% and 40% to two queues, then 30% is left for the other two queues. The result is that queues are allocated as follows: 30%, 40%, 15%, 15%. If you allocate memory to all of the queues and the total allocation is less than 100%, then the remainder is allocated evenly to all of the queues.
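The allocation arithmetic described above can be sketched as a short function. This is an illustrative model of the rule stated in this section, not an Amazon Redshift API; queue names are hypothetical:

```python
def wlm_memory_allocation(assigned):
    """Model WLM memory allocation across user-defined queues.

    `assigned` maps queue name -> explicit percent, or None if no
    WLM Memory Percent to Use value was specified for that queue.
    Unallocated memory is divided evenly among the queues with no
    explicit value; if every queue has a value and the total is under
    100%, the remainder is split evenly across all queues.
    """
    total_assigned = sum(v for v in assigned.values() if v is not None)
    unassigned = [q for q, v in assigned.items() if v is None]
    if unassigned:
        # Divide the leftover percentage evenly among unassigned queues.
        share = (100 - total_assigned) / len(unassigned)
        return {q: (v if v is not None else share)
                for q, v in assigned.items()}
    # All queues assigned: spread any remainder evenly across all queues.
    remainder = (100 - total_assigned) / len(assigned)
    return {q: v + remainder for q, v in assigned.items()}

# Four queues, two assigned 30% and 40%: the remaining 30% is split
# evenly between the other two queues (15% each).
print(wlm_memory_allocation({"q1": 30, "q2": 40, "q3": None, "q4": None}))
```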
To set the WLM Memory Percent to Use parameter by using the AWS Management
Console, go to the Workload Management Configuration tab and
specify an integer in the WLM Memory Percent to Use field for
each queue. To set the WLM Memory Percent to Use value by using the API or CLI, use the
memory_percent_to_use parameter. For information about
modifying parameter groups, see Amazon Redshift Parameter Groups.
To limit the amount of time that queries in a given WLM queue are permitted to use, you can set the WLM timeout value for each queue. The timeout parameter specifies the amount of time, in milliseconds, that Amazon Redshift will wait for a query in a queue to execute before canceling the query.
The function of WLM timeout is similar to that of the statement_timeout configuration parameter, except that, where the
statement_timeout configuration parameter applies to the
entire cluster, WLM timeout is specific to a single queue in the WLM configuration.
To set the WLM timeout parameter by using the AWS Management Console, go to the
Workload Management Configuration tab and specify a number of
milliseconds in the Timeout field for each queue. Specify
0 or leave the field blank to disable WLM timeout. WLM
timeout is disabled by default.
To set the WLM timeout value by using the API or CLI, use the
max_execution_time parameter. To set the
statement_timeout configuration parameter for a cluster,
modify the Parameter Group configuration for the cluster. For information about
modifying parameter groups, see Amazon Redshift Parameter Groups.
If both WLM timeout (max_execution_time) and
statement_timeout are specified, the shorter timeout is used.
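As a sketch, assuming session-level overrides are permitted on your cluster, statement_timeout can also be set for the current session (the value is in milliseconds):

```
-- Cancel any statement in this session that runs longer than 60 seconds.
set statement_timeout to 60000;
```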
The last queue defined in the WLM configuration is the default queue. You can set the concurrency level and the timeout for the default queue, but it cannot include user groups or query groups. The default queue counts against the limit of eight query queues and the limit of 50 concurrent queries.
To run a query in the Superuser queue, a user must be logged in as a superuser and must run the query within the predefined 'superuser' query group.
To view a list of superusers, query the PG_USER system catalog table.
select * from pg_user where usesuper = 'true';
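Putting the pieces above together, a logged-in superuser routes a query to the Superuser queue by setting the predefined 'superuser' query group; the vacuum statement here is just an illustrative workload:

```
set query_group to 'superuser';
vacuum;
reset query_group;
```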