Guidelines for Working with Tables
- Design For Uniform Data Access Across Items In Your Tables
- Understand Partition Behavior
- Use Burst Capacity Sparingly
- Distribute Write Activity During Data Upload
- Understand Access Patterns for Time Series Data
- Cache Popular Items
- Consider Workload Uniformity When Adjusting Provisioned Throughput
- Test Your Application At Scale
This section covers some best practices for working with tables.
Design For Uniform Data Access Across Items In Your Tables
The optimal usage of a table's provisioned throughput depends on these factors:
The primary key selection.
The workload patterns on individual items.
The primary key uniquely identifies each item in a table. The primary key can be simple (partition key) or composite (partition key and sort key).
When it stores data, DynamoDB divides a table's items into multiple partitions, and distributes the data primarily based upon the partition key value. The provisioned throughput associated with a table is also divided evenly among the partitions, with no sharing of provisioned throughput across partitions.
Total Provisioned Throughput / Partitions = Throughput Per Partition
Consequently, to achieve the full amount of request throughput you have provisioned for a table, keep your workload spread evenly across the partition key values. Distributing requests across partition key values distributes the requests across partitions.
For example, if a table has a very small number of heavily accessed partition key values, possibly even a single very heavily used partition key value, request traffic is concentrated on a small number of partitions – potentially only one partition. If the workload is heavily unbalanced, meaning that it is disproportionately focused on one or a few partitions, the requests will not achieve the overall provisioned throughput level. To get the most out of DynamoDB throughput, create tables where the partition key has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.
This does not mean that you must access all of the partition key values to achieve your throughput level; nor does it mean that the percentage of accessed partition key values needs to be high. However, be aware that when your workload accesses more distinct partition key values, those requests will be spread out across the partitioned space in a manner that better utilizes your allocated throughput level. In general, you will utilize your throughput more efficiently as the ratio of partition key values accessed to the total number of partition key values in a table grows.
Choosing a Partition Key
The following table compares some common partition key schemas for provisioned throughput efficiency:
|Partition key value||Uniformity|
User ID, where the application has many users.
|Status code, where there are only a few possible status codes.||Bad|
|Item creation date, rounded to the nearest time period (e.g. day, hour, minute)||Bad|
|Device ID, where each device accesses data at relatively similar intervals||Good|
|Device ID, where even if there are a lot of devices being tracked, one is by far more popular than all the others.||Bad|
If a single table has only a very small number of partition key values, consider distributing your write operations across more distinct partition key values. In other words, structure the primary key elements to avoid one "hot" (heavily requested) partition key value that slows overall performance.
example, consider a table with a composite primary key. The partition key
represents the item's creation date, rounded to the nearest day. The sort key is an
item identifier. On a given day, say
2014-07-09, all of the
new items will be written to that same partition key value.
If the table will fit entirely into a single partition (taking into consideration growth of your data over time), and if your application's read and write throughput requirements do not exceed the read and write capabilities of a single partition, then your application should not encounter any unexpected throttling as a result of partitioning.
However, if you anticipate scaling beyond a single partition, then you should architect your application so that it can use more of the table's full provisioned throughput.
Randomizing Across Multiple Partition Key Values
One way to increase the write throughput of this application would be to
randomize the writes across multiple partition key values. Choose a random number
from a fixed set (for example,
200) and concatenate it as a suffix to the date. This
will yield partition key values such as
2014-07-09.2 and so on through
2014-07-09.200. Because you are randomizing the partition
key, the writes to the table on each day are spread evenly across all of the
partition key values; this will yield better parallelism and higher overall
To read all of the items for a given day, you would need to obtain all of the
items for each suffix. For example, you would first issue a
Query request for the partition key value
2014-07-09.1, then another
2014-07-09.2, and so on through
2014-07-09.200. Finally, your application would need
to merge the results from all of the
Using a Calculated Value
A randomizing strategy can greatly improve write throughput; however, it is difficult to read a specific item because you don't know which suffix value was used when writing the item. To make it easier to read individual items, you can use a different strategy: Instead of using a random number to distribute the items among partitions, use a number that you are able to calculate based upon something that's intrinsic to the item.
Continuing with our example, suppose that each item has an
OrderId. Before your application writes the item to
the table, it can calculate a partition key suffix based upon the order ID. The
calculation should result in a number between 1 and 200 that is fairly evenly
distributed given any set of names (or user IDs.)
A simple calculation would suffice, such as the product of the ASCII values
for the characters in the order ID, modulo 200 + 1. The partition key value would
then be the date concatenated with the calculation result as a suffix. With this
strategy, the writes are spread evenly across the partition key values, and thus across the
partitions. You can easily perform a
GetItem operation on a
particular item, because you can calculate the partition key value you need when you want
to retrieve a specific
To read all of the items for a given day, you would still need to
Query each of the
N is 1 to 200), and your application would
need to merge all of the results. However, you will avoid having a single "hot"
partition key value taking all of the workload.
Understand Partition Behavior
DynamoDB manages table partitioning for you automatically, adding new partitions as your table grows in size. You can estimate the number of partitions that DynamoDB will create for your table, and compare that estimate against your scale and access patterns. This can help you determine the best table design for your application needs.
The number of partitions in a table is based on the table's storage requirements, and also its provisioned throughput requirements.
Number of partitions required, based solely on a table's size
numPartitionstableSize = tableSizeInBytes / 10 GB
The size of the table is one factor in determining the number of partitions needed. The other factor is the table's provisioned throughput requirements. For any one partition, DynamoDB can allocate a maximum of 3000 read capacity units or 1000 write capacity units.
Number of partitions required, based solely on a table's provisioned read and write throughput settings
numPartitionsthroughput = ( readCapacityUnits / 3000 ) + ( writeCapacityUnits / 1000 )
For example, suppose that you provisioned a table with 1000 read capacity units and 500 write capacity units. In this case, numPartitionsthroughput would be:
( 1000 / 3000 ) + ( 500 / 1000 ) = 0.8333
Therefore, a single partition could accommodate all of the table's provisioned throughput requirements.
However, if you provisioned 1000 read capacity units and 1000 write capacity units, then numPartitionsthroughput would exceed a single partition's throughput capacity:
( 1000 / 3000 ) + ( 1000 / 1000 ) = 1.333
In this case, the table would require two partitions, each with 500 read capacity units and 500 write capacity units.
The total number of partitions allocated by DynamoDB is the larger of the table's size or its provisioned throughput requirements:
Total number of partitions allocated by DynamoDB
numPartitionstotal = MAX ( numPartitionstableSize | numPartitionsthroughput )
A single partition can hold approximately 10 GB of data. If your table size grows beyond 10 GB, DynamoDB will spread your data across additional partitions, and will also distribute your table's read and write throughput accordingly. Therefore, as your table grows in size, less throughput will be provisioned per partition.
Suppose that you have a table that has grown to 500 GB in size. This would mean that the table now occupies approximately 50 partitions:
500 GB / 10 GB = 50 partitions
Now suppose that you have allocated 100,000 read capacity units to the table. You could determine the amount of read capacity per partition as follows:
100,000 read capacity units / 50 partitions = 2000 read capacity units per partition
In the future, these details of partition sizes and throughput allocation per partition may change.
Use Burst Capacity Sparingly
DynamoDB provides some flexibility in the per-partition throughput provisioning: When you are not fully utilizing a partition's throughput, DynamoDB reserves a portion of your unused capacity for later "bursts" of throughput usage. DynamoDB currently reserves up 5 minutes (300 seconds) of unused read and write capacity. During an occasional burst of read or write activity, these extra capacity units can be consumed very quickly — even faster than the per-second provisioned throughput capacity that you've defined for your table. However, do not design your application so that it depends on burst capacity being available at all times: DynamoDB can and does use burst capacity for background maintenance and other tasks without prior notice.
In the future, these details of burst capacity may change.
Distribute Write Activity During Data Upload
There are times when you load data from other data sources into DynamoDB. Typically, DynamoDB partitions your table data on multiple servers. When uploading data to a table, you get better performance if you upload data to all the allocated servers simultaneously. For example, suppose you want to upload user messages to a DynamoDB table. You might design a table that uses a composite primary key in which UserID is the partition key and the MessageID is the sort key. When uploading data from your source, you might tend to read all message items for a specific user and upload these items to DynamoDB as shown in the sequence in the following table.
|U1||... up to 100|
|U2||... up to 200|
The problem in this case is that you are not distributing your write requests to DynamoDB across your partition key values. You are taking one partition key value at a time and uploading all its items, before going to the next partition key value and doing the same. Behind the scenes, DynamoDB is partitioning the data in your tables across multiple servers. To fully utilize all of the throughput capacity that has been provisioned for your tables, you need to distribute your workload across your partition key values. In this case, by directing an uneven amount of upload work toward items all with the same partition key value, you may not be able to fully utilize all of the resources DynamoDB has provisioned for your table. You can distribute your upload work by uploading one item from each partition key value first. Then you repeat the pattern for the next set of sort key values for all the items until you upload all the data as shown in the example upload sequence in the following table:
Every upload in this sequence uses a different partition key value, keeping more DynamoDB servers busy simultaneously and improving your throughput performance.
Understand Access Patterns for Time Series Data
For each table that you create, you specify the throughput requirements. DynamoDB allocates and reserves resources to handle your throughput requirements with sustained low latency. When you design your application and tables, you should consider your application's access pattern to make the most efficient use of your table's resources.
Suppose you design a table to track customer behavior on your site, such as URLs that they click. You might design the table with a composite primary key consisting of Customer ID as the partition key and date/time as the sort key. In this application, customer data grows indefinitely over time; however, the applications might show uneven access pattern across all the items in the table where the latest customer data is more relevant and your application might access the latest items more frequently and as time passes these items are less accessed, eventually the older items are rarely accessed. If this is a known access pattern, you could take it into consideration when designing your table schema. Instead of storing all items in a single table, you could use multiple tables to store these items. For example, you could create tables to store monthly or weekly data. For the table storing data from the latest month or week, where data access rate is high, request higher throughput and for tables storing older data, you could dial down the throughput and save on resources.
You can save on resources by storing "hot" items in one table with higher throughput settings, and "cold" items in another table with lower throughput settings. You can remove old items by simply deleting the tables. You can optionally backup these tables to other storage options such as Amazon Simple Storage Service (Amazon S3). Deleting an entire table is significantly more efficient than removing items one-by-one, which essentially doubles the write throughput as you do as many delete operations as put operations.
Cache Popular Items
Some items in a table might be more popular than others. For example, consider the ProductCatalog table that is described in Creating Tables and Loading Sample Data, and suppose that this table contains millions of different products. Some products might be very popular among customers, so those items would be consistently accessed more frequently than the others. As a result, the distribution of read activity on ProductCatalog would be highly skewed toward those popular items.
One solution would be to cache these reads at the application layer. Caching is a technique that is used in many high-throughput applications, offloading read activity on hot items to the cache rather than to the database. Your application can cache the most popular items in memory, or use a product such as ElastiCache to do the same.
Continuing with the ProductCatalog example, when a customer requests an item from that table, the application would first consult the cache to see if there is a copy of the item there. If so, it is a cache hit; otherwise, it is a cache miss. When there is a cache miss, the application would need to read the item from DynamoDB and store a copy of the item in the cache. Over time, the cache misses would decrease as the cache fills with the most popular items; applications would not need to access DynamoDB at all for these items.
A caching solution can mitigate the skewed read activity for popular items. In addition, since it reduces the amount of read activity against the table, caching can help reduce your overall costs for using DynamoDB.
Consider Workload Uniformity When Adjusting Provisioned Throughput
As the amount of data in your table grows, or as you provision additional read and
write capacity, DynamoDB automatically spreads the data across multiple partitions. If your
application doesn't require as much throughput, you simply decrease it using the
UpdateTable operation, and pay only for the throughput that you
For applications that are designed for use with uniform workloads, DynamoDB's partition allocation activity is not noticeable. A temporary non-uniformity in a workload can generally be absorbed by the bursting allowance, as described in Use Burst Capacity Sparingly. However, if your application must accommodate non-uniform workloads on a regular basis, you should design your table with DynamoDB's partitioning behavior in mind (see Understand Partition Behavior), and be mindful when increasing and decreasing provisioned throughput on that table.
If you reduce the amount of provisioned throughput for your table, DynamoDB will not decrease the number of partitions . Suppose that you created a table with a much larger amount of provisioned throughput than your application actually needed, and then decreased the provisioned throughput later. In this scenario, the provisioned throughput per partition would be less than it would have been if you had initially created the table with less throughput.
For example, consider a situation where you need to bulk-load 20 million items into a DynamoDB table. Assume that each item is 1 KB in size, resulting in 20 GB of data. This bulk-loading task will require a total of 20 million write capacity units. To perform this data load within 30 minutes, you would need to set the provisioned write throughput of the table to 11,000 write capacity units.
The maximum write throughput of a partition is 1000 write capacity units ( (see Understand Partition Behavior); therefore, DynamoDB will create 11 partitions, each with 1000 provisioned write capacity units.
After the bulk data load, your steady-state write throughput requirements might be much lower — for example, suppose that your applications only require 200 writes per second. If you decrease the table's provisioned throughput to this level, each of the 11 partitions will have around 20 write capacity units per second provisioned. This level of per-partition provisioned throughput, combined with DynamoDB's bursting behavior, might be adequate for the application.
However, if an application will require sustained write throughput above 20 writes per second per partition, you should either: (a) design a schema that requires fewer writes per second per partition key value, or (b) design the bulk data load so that it runs at a slower pace and reduces the initial throughput requirement. For example, suppose that it was acceptable to run the bulk import for over 3 hours, instead of just 30 minutes. In this scenario, only 1900 write capacity units per second needs to be provisioned, rather than 11,000. As a result, DynamoDB would create only two partitions for the table.
Test Your Application At Scale
Many tables begin with small amounts of data, but then grow larger as applications perform write activity. This growth can occur gradually, without exceeding the provisioned throughput settings you have defined for the table. As your table grows larger, DynamoDB automatically scales your table out by distributing the data across more partitions. When this occurs, the provisioned throughput that is allocated to each resulting partition is less than that which is allocated for the original partition(s).
Suppose that your application accesses the table's data across all of the partition key values, but in a non-uniform fashion (accessing a small number of partition key values more frequently than others). Your application might perform acceptably when there is not very much data in the table. However, as the table becomes larger, there will be more partitions and less throughput per partition. You might discover that your application is throttled when it attempts to use the same non-uniform access pattern that worked in the past.
To avoid problems with "hot" keys when your table becomes larger, make sure that you test your application design at scale. Consider the ratio of storage to throughput when running at scale, and how DynamoDB will allocate partitions to the table. (For more information, see Understand Partition Behavior.)
If it isn't possible for you to generate a large amount of test data, you can create a table that has very high provisioned throughput settings. This will create a table with many partitions; you can then use UpdateTable to reduce the settings, but keep the same ratio of storage to throughput that you determined for running the application at scale. You now have a table that has the throughput-per-partition ratio that you expect after it grows to scale. Test your application against this table using a realistic workload.
Tables that store time series data can grow in an unbounded manner, and can cause slower application performance over time. With time series data, applications typically read and write the most recent items in the table more frequently than older items. If you can remove older time series data from your real-time table, and archive that data elsewhere, you can maintain a high ratio of throughput per partition.
For best practices with time series data, Understand Access Patterns for Time Series Data.