Working with Tables in DynamoDB
- Specifying the Primary Key
- Specifying Read and Write Requirements for Tables
- Capacity Units Calculations for Various Operations
- Listing and Describing Tables
- Guidelines for Working with Tables
- Working with Tables Using the AWS SDK for Java Document API
- Working with Tables Using the AWS SDK for .NET Low-Level API
- Working with Tables Using the AWS SDK for PHP Low-Level API
When you create a table in Amazon DynamoDB, you must provide a table name, its primary key and your required read and write throughput values. The table name can include characters a-z, A-Z, 0-9, '_' (underscore), '-' (dash), and '.' (dot). Names can be between 3 and 255 characters long. In a relational database, a table has a predefined schema that describes properties such as the table name, primary key, column names, and data types. All records stored in the table must have the same set of columns. DynamoDB is a NoSQL database: Except for the required primary key, a DynamoDB table is schema-less. Individual items in a DynamoDB table can have any number of attributes, although there is a limit of 400 KB on the item size.
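The naming rules above can be checked on the client before you send a request. The following is a minimal sketch, not an official validator; the helper name is_valid_table_name is hypothetical, and it encodes only the character set and length range stated in this section.

```python
import re

# Allowed characters (a-z, A-Z, 0-9, underscore, dash, dot) and the
# 3-255 character length range, as stated in the text above.
_TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the rules described in this section."""
    return _TABLE_NAME_RE.fullmatch(name) is not None
```

For example, is_valid_table_name("Music.Catalog-2024_v1") passes, while a two-character name or a name that contains a space does not.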
Specifying the Primary Key
When you create a table, in addition to the table name, you must specify the primary key of the table. The primary key uniquely identifies each item, so that no two items in the table can have the same primary key.
DynamoDB supports two different kinds of primary keys:
Partition Key – A simple primary key, composed of one attribute, known as the partition key. DynamoDB uses the partition key's value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. No two items in a table can have the same partition key value.
Partition Key and Sort Key – A composite primary key composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key. DynamoDB uses the partition key value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. All items with the same partition key are stored together, in sorted order by sort key value. It is possible for two items to have the same partition key value, but those two items must have different sort key values.
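The two kinds of primary key can be written down as the KeySchema structure that a CreateTable request expects, where the partition key has KeyType HASH and the sort key has KeyType RANGE. The following is a minimal sketch that only builds the structure; the helper name key_schema and the attribute names in the usage note are hypothetical.

```python
from typing import Dict, List, Optional


def key_schema(partition_key: str, sort_key: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a KeySchema list for a simple or composite primary key."""
    # The partition key is always present and is listed first.
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # A sort key is present only for a composite primary key.
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema
```

Calling key_schema("Artist") describes a simple primary key, and key_schema("Artist", "SongTitle") describes a composite one. A real CreateTable request also needs matching attribute definitions and throughput settings, which this sketch leaves out.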
Specifying Read and Write Requirements for Tables
DynamoDB is built to support workloads of any scale with predictable, low-latency response times.
To ensure high availability and low latency responses, DynamoDB requires that you specify your required read and write throughput values when you create a table. DynamoDB uses this information to reserve sufficient hardware resources and appropriately partition your data over multiple servers to meet your throughput requirements. As your application data and access requirements change, you can easily increase or decrease your provisioned throughput using the DynamoDB console or the API.
DynamoDB allocates and reserves resources to handle your throughput requirements with sustained low latency and you pay for the hourly reservation of these resources. However, you pay as you grow and you can easily scale up or down your throughput requirements. For example, you might want to populate a new table with a large amount of data from an existing data store. In this case, you could create the table with a large write throughput setting, and after the initial data upload, you could reduce the write throughput and increase the read throughput to meet your application's requirements.
During table creation, you specify your throughput requirements in terms of the following capacity units. You can also specify these units in an UpdateTable request to increase or decrease the provisioned throughput of an existing table:
Read capacity units – The number of strongly consistent reads per second of items up to 4 KB in size. For example, when you request 10 read capacity units, you are requesting a throughput of 10 strongly consistent reads per second of 4 KB for that table. For eventually consistent reads, one read capacity unit is two reads per second for items up to 4 KB. For more information about read consistency, see Data Read and Consistency Considerations.
Write capacity units – The number of 1 KB writes per second. For example, when you request 10 write capacity units, you are requesting a throughput of 10 writes per second of 1 KB size for that table.
DynamoDB uses these capacity units to provision sufficient resources to provide the requested throughput.
When deciding the capacity units for your table, you must take the following into consideration:
Item size – DynamoDB allocates resources for your table according to the number of read or write capacity units that you specify. These capacity units are based on a data item size of 4 KB per read or 1 KB per write. For example, if the items in your table are 4 KB or smaller, each item read operation will consume one read capacity unit. If your items are larger than 4 KB, each read operation consumes additional capacity units, in which case you can perform fewer database read operations per second than the number of read capacity units you have provisioned. For example, if you request 10 read capacity units throughput for a table, but your items are 8 KB in size, then you will get a maximum of 5 strongly consistent reads per second on that table.
Expected read and write request rates – You must also determine the expected number of read and write operations your application will perform against the table, per second. This, along with the estimated item size, helps you determine the read and write capacity unit values.
Consistency – Read capacity units are based on strongly consistent read operations, which require more effort and consume twice as many database resources as eventually consistent reads. For example, a table that has 10 read capacity units of provisioned throughput would provide either 10 strongly consistent reads per second of 4 KB items, or 20 eventually consistent reads per second of the same items. Whether your application requires strongly or eventually consistent reads is a factor in determining how many read capacity units you need to provision for your table. By default, DynamoDB read operations are eventually consistent. Some of these operations allow you to specify strongly consistent reads.
Local secondary indexes – If you want to create one or more local secondary indexes on a table, you must do so at table creation time. DynamoDB automatically creates and maintains these indexes. Queries against indexes consume provisioned read throughput. If you write to a table, DynamoDB will automatically write data to the indexes when needed, to keep them synchronized with the table. The capacity units consumed by index operations are charged against the table's provisioned throughput. In other words, you only specify provisioned throughput settings for the table, not for each individual index on that table. For more information, see Provisioned Throughput Considerations for Local Secondary Indexes.
These factors help you to determine your application's throughput requirements that you provide when you create a table. You can monitor the performance using CloudWatch metrics, and even configure alarms to notify you in the event you reach a certain threshold of consumed capacity units. The DynamoDB console provides several default metrics that you can review to monitor your table performance and adjust the throughput requirements as needed. For more information, go to DynamoDB Console.
DynamoDB automatically distributes your data across table partitions, which are stored on multiple servers. For optimal throughput, you should distribute read requests as evenly as possible across these partitions. For example, you might provision a table with 1 million read capacity units per second. If you issue 1 million requests for a single item in the table, all of the read activity will be concentrated on a single partition. However, if you spread your requests across all of the items in the table, DynamoDB can access the table partitions in parallel, and allow you to reach your provisioned throughput goal for the table.
For reads, the following table compares some provisioned throughput values for different average item sizes, request rates, and consistency combinations.
|Expected Item Size||Consistency||Desired Reads Per Second||Provisioned Throughput Required|
Item sizes for reads are rounded up to the next 4 KB multiple. For example, an item of 3,500 bytes consumes the same throughput as a 4 KB item.
For writes, the following table compares some provisioned throughput values for different average item sizes and write request rates.
|Expected Item Size||Desired Writes Per Second||Provisioned Throughput Required|
Item sizes for writes are rounded up to the next 1 KB multiple. For example, an item of 500 bytes consumes the same throughput as a 1 KB item.
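The rounding rules for reads and writes described above can be turned into a small estimate of the throughput to provision. The following is a minimal sketch under stated assumptions: the function names are hypothetical, 1 KB is taken to be 1,024 bytes, and every item is assumed to have the same size.

```python
import math

READ_BLOCK_BYTES = 4 * 1024   # reads are rounded up to the next 4 KB (assumes 1 KB = 1,024 bytes)
WRITE_BLOCK_BYTES = 1 * 1024  # writes are rounded up to the next 1 KB


def required_read_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units needed for a steady read rate."""
    blocks_per_read = max(1, math.ceil(item_size_bytes / READ_BLOCK_BYTES))
    units = blocks_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads consume half as much capacity.
        units = math.ceil(units / 2)
    return units


def required_write_units(item_size_bytes, writes_per_second):
    """Estimate the write capacity units needed for a steady write rate."""
    blocks_per_write = max(1, math.ceil(item_size_bytes / WRITE_BLOCK_BYTES))
    return blocks_per_write * writes_per_second
```

With these assumptions, 5 strongly consistent reads per second of 8 KB items require 10 read capacity units, which matches the earlier example of a table with 10 read capacity units supporting at most 5 such reads per second.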
DynamoDB commits resources to your requested read and write capacity units, and, consequently, you are expected to stay within your requested rates. Provisioned throughput also depends on the size of the requested data. If your read or write request rate, combined with the cumulative size of the requested data, exceeds the current reserved capacity, DynamoDB returns an error that indicates that the provisioned throughput level has been exceeded.
Set your provisioned throughput using the ProvisionedThroughput parameter. For information about setting the ProvisionedThroughput parameter, see CreateTable in the Amazon DynamoDB API Reference.
For information about using provisioned throughput, see Guidelines for Working with Tables.
If you expect upcoming spikes in your workload (such as a new product launch) that
will cause your throughput to exceed the current provisioned throughput for your
table, we advise that you use the UpdateTable operation to increase the
ProvisionedThroughput value. For the current maximum
Provisioned Throughput values per table or account, see Limits in DynamoDB.
When you issue an UpdateTable request, the status of the table changes from ACTIVE to UPDATING. The table remains fully available for use while it is UPDATING. During this time, DynamoDB allocates the necessary resources to support the new provisioned throughput levels. When this process is completed, the table status changes from UPDATING to ACTIVE.
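If your application needs to know when a throughput change has taken effect, it can poll the table status. The following is a minimal sketch, assuming the status returns to ACTIVE once the update completes. The helper name wait_until_active is hypothetical, and describe_table stands for any callable that returns a response shaped like the DescribeTable result, with the status under Table and TableStatus.

```python
import time


def wait_until_active(describe_table, table_name, delay_seconds=2.0, max_attempts=30):
    """Poll the table status until it is ACTIVE; return False if it never is."""
    for _ in range(max_attempts):
        response = describe_table(TableName=table_name)
        if response["Table"]["TableStatus"] == "ACTIVE":
            return True
        # The table is still usable while UPDATING; wait before checking again.
        time.sleep(delay_seconds)
    return False
```

The callable is passed in so that the sketch does not depend on a particular SDK. The delay and attempt limit are arbitrary illustrative values, not recommendations.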
Capacity Units Calculations for Various Operations
The capacity units consumed by an operation depend on the following:
Item size
Read consistency (in case of a read operation)
For a table without local secondary indexes, the basic rule is that if your request reads an item of 4 KB or writes an item of 1 KB in size, you consume 1 capacity unit. This section describes how DynamoDB computes the item size for the purpose of determining capacity units consumed by an operation. In the case of a read operation, this section describes the impact of strongly consistent versus eventually consistent reads on the capacity units consumed by the read operation.
Item Size Calculations
For each request that you send, DynamoDB computes the capacity units consumed by that operation. Item size is one of the factors that DynamoDB uses in computing the capacity units consumed. This section describes how DynamoDB determines the size of items involved in an operation.
You can optimize the read capacity consumption by making individual items as small as possible. The easiest way to do so is to minimize the length of the attribute names. You can also reduce item size by storing less frequently accessed attributes in a separate table.
The size of an item is the sum of the lengths of its attribute names and values.
The size of a Null or Boolean attribute value is (length of the attribute name + one byte).
An attribute of type List or Map requires 3 bytes of overhead, regardless of its contents. The size of an empty List or Map is (length of the attribute name + 3 bytes). If the attribute is non-empty, the size is (length of the attribute name + sum (length of attribute values) + 3 bytes).
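The size rules above can be approximated in code when you want a rough idea of how large an item is. The following is a sketch under several assumptions that go beyond this section: the function names are hypothetical, strings are measured as UTF-8 bytes, the keys of a Map are counted like attribute names, and number and set types are not handled at all.

```python
def value_size(value):
    """Approximate size in bytes of one attribute value."""
    if value is None or isinstance(value, bool):
        return 1  # Null and Boolean values take one byte
    if isinstance(value, str):
        return len(value.encode("utf-8"))  # assumption: UTF-8 byte length
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, list):
        return 3 + sum(value_size(v) for v in value)  # 3 bytes of List overhead
    if isinstance(value, dict):
        # Assumption: nested Map keys count toward the size like attribute names.
        return 3 + sum(len(k.encode("utf-8")) + value_size(v) for k, v in value.items())
    raise TypeError("unsupported attribute type: %s" % type(value).__name__)


def item_size(item):
    """Approximate item size: attribute name lengths plus value sizes."""
    return sum(len(name.encode("utf-8")) + value_size(v) for name, v in item.items())
```

Treat the result as an estimate for capacity planning rather than the exact figure that DynamoDB computes.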
DynamoDB reads data in blocks of 4 KB. For GetItem, which reads only one item, DynamoDB rounds the item size up to the next 4 KB. For example, if you get an item of 3.5 KB, DynamoDB rounds the item size to 4 KB. If you get an item of 10 KB, DynamoDB rounds the item size to 12 KB.
DynamoDB writes data in blocks of 1 KB. For PutItem, UpdateItem, and DeleteItem, which write only one item, DynamoDB rounds the item size up to the next 1 KB. For example, if you put or delete an item of 1.6 KB, DynamoDB rounds the item size up to 2 KB.
If you perform a read operation on an item that does not exist, DynamoDB will still consume provisioned read throughput: A strongly consistent read request consumes one read capacity unit, while an eventually consistent read request consumes 0.5 of a read capacity unit.
Most write operations in DynamoDB allow conditional writes, where you specify one or more conditions that must be met in order for the operation to succeed. Even if a conditional write fails, it still consumes provisioned throughput. A failed conditional write of a 1 KB item would consume one write capacity unit; if the item were twice that size, the failed conditional write would consume two write capacity units.
For BatchGetItem, each item in the batch is read separately, so DynamoDB first rounds up the size of each item to the next 4 KB and then calculates the total size. The result is not necessarily the same as the total size of all the items. For example, if BatchGetItem reads a 1.5 KB item and a 6.5 KB item, DynamoDB will calculate the size as 12 KB (4 KB + 8 KB), not 8 KB (1.5 KB + 6.5 KB).
For Query, all items returned are treated as a single read operation. As a result, DynamoDB computes the total size of all items and then rounds up to the next 4 KB boundary. For example, suppose your query returns 10 items whose combined size is 40.8 KB. DynamoDB rounds the item size for the operation to 44 KB. If a query returns 1500 items of 64 bytes each, the cumulative size is 96 KB.
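The difference between rounding each item (BatchGetItem) and rounding the total (Query) is easier to see side by side. The following is a minimal sketch that works in KB, as the examples above do; the function names are hypothetical.

```python
import math


def round_up_to_4kb(size_kb):
    """Round a size in KB up to the next multiple of 4 KB."""
    return math.ceil(size_kb / 4) * 4


def batch_get_read_size_kb(item_sizes_kb):
    """BatchGetItem: round each item up to 4 KB, then add the results."""
    return sum(round_up_to_4kb(size) for size in item_sizes_kb)


def query_read_size_kb(item_sizes_kb):
    """Query: add the item sizes first, then round the total up to 4 KB."""
    return round_up_to_4kb(sum(item_sizes_kb))
```

For a 1.5 KB item and a 6.5 KB item, batch_get_read_size_kb gives 12 and query_read_size_kb gives 8, so reading the same items through a query can count as a smaller total size.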
In the case of a
Scan operation, DynamoDB considers the size of
the items that are evaluated, not the size of the items returned by the scan. For a
scan request, DynamoDB evaluates up to 1 MB of items and returns only the
items that satisfy the scan condition.
In computing the storage used by the table, DynamoDB adds 100 bytes of overhead to each item for indexing purposes. The DescribeTable operation returns a table size that includes this overhead. This overhead is also included when billing you for the storage costs. However, this extra 100 bytes is not used in the capacity unit calculation. For more information about pricing, go to DynamoDB Pricing.
For any operation that returns items, you can request a subset of attributes to retrieve; however, doing so has no impact on the item size calculations. In addition, Query and Scan can return item counts instead of attribute values. Getting the count of items uses the same quantity of read capacity units and is subject to the same item size calculations, because DynamoDB has to read each item in order to increment the count.
The PutItem operation adds an item to the table. If an item with the same primary key exists in the table, the operation replaces the item. For calculating provisioned throughput consumption, the item size that matters is the larger of the two.
For an UpdateItem operation, DynamoDB considers the size of the item as it appears before and after the update. The provisioned throughput consumed reflects the larger of these item sizes. Even if you update just a subset of the item's attributes, UpdateItem will still consume the full amount of provisioned throughput (the larger of the "before" and "after" item sizes).
When you issue a
DeleteItem request, DynamoDB uses the size of
the deleted item to calculate provisioned throughput consumption.
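The write rules for PutItem, UpdateItem, and DeleteItem all come down to taking the larger of the item sizes involved and rounding it up to the next 1 KB. The following is a minimal sketch; the function name is hypothetical, 1 KB is assumed to be 1,024 bytes, and a size of 0 stands for an item that does not exist before or after the operation.

```python
import math


def write_units_consumed(size_before_bytes, size_after_bytes):
    """Write capacity units for a single-item write, based on the larger size."""
    # PutItem and UpdateItem: the larger of the old and new item sizes counts.
    # DeleteItem: pass 0 as the "after" size so the deleted item's size is used.
    billable_bytes = max(size_before_bytes, size_after_bytes)
    return max(1, math.ceil(billable_bytes / 1024))
```

For example, an update that shrinks a 3,000-byte item to 200 bytes still consumes 3 write capacity units under these assumptions, because the size before the update is the larger one.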
Read Operation and Consistency
For a read operation, the preceding calculations assume strongly consistent read requests. For an eventually consistent read request, the operation consumes only half the capacity units. For example, if the total item size is 80 KB, a strongly consistent read consumes 20 capacity units, while an eventually consistent read consumes only 10.
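Once the total size for a read operation is known, the consistency rule can be applied as a final step. The following is a minimal sketch; the function name is hypothetical, and the total size is assumed to be given in KB. It also returns the minimum charge described earlier for reading an item that does not exist.

```python
import math


def consumed_read_units(total_size_kb, strongly_consistent=True):
    """Read capacity units consumed by one read operation."""
    # At least one 4 KB block is charged, even when nothing is found.
    blocks = max(1, math.ceil(total_size_kb / 4))
    # Eventually consistent reads consume half the units.
    return float(blocks) if strongly_consistent else blocks / 2
```

With an 80 KB total, the function returns 20.0 for a strongly consistent read and 10.0 for an eventually consistent read.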
Listing and Describing Tables
To obtain a list of all your tables, use the ListTables operation. A single ListTables call can return a maximum of 100 table names; if you have more than 100 tables, you can request that ListTables return paginated results, so that you can retrieve all of the table names.
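Paginating through ListTables results can be sketched as a loop that feeds the last table name of one page into the next request. The helper name list_all_tables is hypothetical, and list_tables stands for any callable that accepts the Limit and ExclusiveStartTableName request parameters and returns TableNames and, when more results remain, LastEvaluatedTableName.

```python
def list_all_tables(list_tables, page_size=100):
    """Collect every table name by following ListTables pagination."""
    names = []
    kwargs = {"Limit": page_size}
    while True:
        page = list_tables(**kwargs)
        names.extend(page.get("TableNames", []))
        last_name = page.get("LastEvaluatedTableName")
        if last_name is None:
            # No marker in the response means this was the final page.
            return names
        # Start the next page after the last name that was returned.
        kwargs["ExclusiveStartTableName"] = last_name
```

With an SDK client that exposes a list_tables method using these parameter names, you could pass that method directly as the callable.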
To determine the structure of any table, use the DescribeTable operation. The metadata returned by DescribeTable includes the timestamp when the table was created, its key schema, its provisioned throughput settings, its estimated size, and any secondary indexes that are present.
If you issue a
DescribeTable request immediately after a
CreateTable request, DynamoDB might return a
ResourceNotFoundException. This is because
DescribeTable uses an eventually consistent query, and the
metadata for your table might not be available at that moment. Wait for a few
seconds, and then try the
DescribeTable request again.
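The wait-and-retry advice above can be sketched as a small loop. The helper name describe_with_retry is hypothetical; describe_table stands for any callable that performs the DescribeTable request, and not_found_error is whatever exception type your SDK raises for ResourceNotFoundException.

```python
import time


def describe_with_retry(describe_table, table_name, not_found_error,
                        attempts=5, delay_seconds=2.0):
    """Call DescribeTable, retrying while the new table's metadata is unavailable."""
    for attempt in range(attempts):
        try:
            return describe_table(TableName=table_name)
        except not_found_error:
            if attempt == attempts - 1:
                # Give up after the last attempt and surface the error.
                raise
            # Wait a few seconds before trying again, as suggested above.
            time.sleep(delay_seconds)
```

The attempt count and delay are illustrative values only. If the table name is simply wrong, the error is raised again after the final attempt instead of being hidden.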