Working with Tables in DynamoDB
When you create a table in Amazon DynamoDB, you must provide a table name, its primary key, and your required read and write throughput values. The table name can include the characters a-z, A-Z, 0-9, '_' (underscore), '-' (hyphen), and '.' (dot), and must be between 3 and 255 characters long.

In a relational database, a table has a predefined schema that describes properties such as the table name, primary key, column names, and data types, and all records stored in the table must have the same set of columns. DynamoDB is a NoSQL database: except for the required primary key, a DynamoDB table is schema-less. Individual items in a DynamoDB table can have any number of attributes, although there is a limit of 400 KB on the item size.
For best practices, see Guidelines for Working with Tables.
Specifying the Primary Key
When you create a table, in addition to the table name, you must specify the primary key of the table. The primary key uniquely identifies each item, so that no two items in the table can have the same primary key.
DynamoDB supports two different kinds of primary keys:
Partition Key—A simple primary key, composed of one attribute, known as the partition key. DynamoDB uses the partition key's value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. No two items in a table can have the same partition key value.
Partition Key and Sort Key—A composite primary key composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key. DynamoDB uses the partition key value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. All items with the same partition key are stored together, in sorted order by sort key value. It is possible for two items to have the same partition key value, but those two items must have different sort key values.
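As a sketch of how these two key types translate into a CreateTable request, the following shows the parameters for a table with a composite primary key. The table and attribute names here are hypothetical examples; with boto3, the dictionary could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.

```python
# Illustrative CreateTable parameters for a composite primary key.
# "HASH" designates the partition key, "RANGE" the sort key.
create_table_params = {
    "TableName": "Music",  # hypothetical table name
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},      # partition key
        {"AttributeName": "SongTitle", "KeyType": "RANGE"},  # sort key
    ],
    # Only key attributes need to be declared up front; everything
    # else is schema-less.
    "AttributeDefinitions": [
        {"AttributeName": "Artist", "AttributeType": "S"},
        {"AttributeName": "SongTitle", "AttributeType": "S"},
    ],
    "ProvisionedThroughput": {
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5,
    },
}
```

For a simple primary key, the `KeySchema` list would contain only the single `"HASH"` entry.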
Specifying Read and Write Requirements for Tables
DynamoDB is built to support workloads of any scale with predictable, low-latency response times.
To ensure high availability and low latency responses, DynamoDB requires that you specify your required read and write throughput values when you create a table. DynamoDB uses this information to reserve sufficient hardware resources and appropriately partitions your data over multiple servers to meet your throughput requirements. As your application data and access requirements change, you can easily increase or decrease your provisioned throughput using the DynamoDB console or the API.
DynamoDB allocates and reserves resources to handle your throughput requirements with sustained low latency and you pay for the hourly reservation of these resources. However, you pay as you grow and you can easily scale up or down your throughput requirements. For example, you might want to populate a new table with a large amount of data from an existing data store. In this case, you could create the table with a large write throughput setting, and after the initial data upload, you could reduce the write throughput and increase the read throughput to meet your application's requirements.
During table creation, you specify your throughput requirements in terms of the following capacity units. You can also specify these units in an UpdateTable request to increase or decrease the provisioned throughput of an existing table:
Read capacity units – The number of strongly consistent reads per second, for items up to 4 KB in size. For example, when you request 10 read capacity units, you are requesting a throughput of 10 strongly consistent reads per second of 4 KB items for that table. For eventually consistent reads, one read capacity unit provides two reads per second for items up to 4 KB. For more information about read consistency, see Read Consistency.
Write capacity units – The number of 1 KB writes per second. For example, when you request 10 write capacity units, you are requesting a throughput of 10 writes per second of 1 KB items for that table.
DynamoDB uses these capacity units to provision sufficient resources to provide the requested throughput.
When deciding the capacity units for your table, you must take the following into consideration:
Item size – DynamoDB allocates resources for your table according to the number of read or write capacity units that you specify. These capacity units are based on a data item size of 4 KB per read or 1 KB per write. For example, if the items in your table are 4 KB or smaller, each item read operation will consume one read capacity unit. If your items are larger than 4 KB, each read operation consumes additional capacity units, in which case you can perform fewer database read operations per second than the number of read capacity units you have provisioned. For example, if you request 10 read capacity units of throughput for a table, but your items are 8 KB in size, then you will get a maximum of 5 strongly consistent reads per second on that table.
Expected read and write request rates – You must also determine the expected number of read and write operations your application will perform against the table, per second. This, along with the estimated item size helps you to determine the read and write capacity unit values.
Consistency – Read capacity units are based on strongly consistent read operations, which require more effort and consume twice as many database resources as eventually consistent reads. For example, a table that has 10 read capacity units of provisioned throughput would provide either 10 strongly consistent reads per second of 4 KB items, or 20 eventually consistent reads per second of the same items. Whether your application requires strongly or eventually consistent reads is a factor in determining how many read capacity units you need to provision for your table. By default, DynamoDB read operations are eventually consistent. Some of these operations allow you to specify strongly consistent reads.
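The sizing rules above (item size, request rate, and consistency) can be sketched as a small calculator. This is a rough sketch, not an official formula: it assumes 1 KB = 1,024 bytes, and the function names are illustrative.

```python
import math

def read_capacity_units(item_size_bytes, reads_per_second,
                        strongly_consistent=True):
    """One read capacity unit = one strongly consistent read/second
    (or two eventually consistent reads/second) for an item up to 4 KB.
    Items larger than 4 KB consume additional units."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total /= 2  # eventually consistent reads cost half as much
    return total

def write_capacity_units(item_size_bytes, writes_per_second):
    """One write capacity unit = one write/second for an item up to 1 KB."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second
```

For example, `read_capacity_units(8 * 1024, 10)` yields 20, matching the earlier observation that 10 provisioned units support only 5 strongly consistent reads per second of 8 KB items.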
Secondary indexes – Queries against indexes consume provisioned read throughput. DynamoDB supports two kinds of secondary indexes (global secondary indexes and local secondary indexes). To learn more about indexes and provisioned throughput, see Improving Data Access with Secondary Indexes in DynamoDB.
These factors help you to determine your application's throughput requirements that you provide when you create a table. You can monitor the performance using CloudWatch metrics, and even configure alarms to notify you in the event you reach a certain threshold of consumed capacity units. The DynamoDB console provides several default metrics that you can review to monitor your table performance and adjust the throughput requirements as needed. For more information, go to DynamoDB Console.
DynamoDB automatically distributes your data across table partitions, which are stored on multiple servers. For optimal throughput, you should distribute read requests as evenly as possible across these partitions. For example, suppose you provision a table with 1 million read capacity units. If all of your read requests target a single item in the table, all of the read activity will be concentrated on a single partition. However, if you spread your requests across all of the items in the table, DynamoDB can access the table partitions in parallel, allowing you to reach your provisioned throughput goal for the table.
For reads, the following table compares some provisioned throughput values for different average item sizes, request rates, and consistency combinations.
| Expected Item Size | Consistency | Desired Reads Per Second | Provisioned Throughput Required |
| --- | --- | --- | --- |
| 4 KB | Strongly consistent | 80 | 80 |
| 8 KB | Strongly consistent | 80 | 160 |
| 4 KB | Eventually consistent | 80 | 40 |
| 8 KB | Eventually consistent | 80 | 80 |
Item sizes for reads are rounded up to the next 4 KB multiple. For example, an item of 3,500 bytes consumes the same throughput as a 4 KB item.
For writes, the following table compares some provisioned throughput values for different average item sizes and write request rates.
| Expected Item Size | Desired Writes Per Second | Provisioned Throughput Required |
| --- | --- | --- |
| 1 KB | 80 | 80 |
| 2 KB | 80 | 160 |
Item sizes for writes are rounded up to the next 1 KB multiple. For example, an item of 500 bytes consumes the same throughput as a 1 KB item.
DynamoDB commits resources to your requested read and write capacity units, and, consequently, you are expected to stay within your requested rates. Provisioned throughput also depends on the size of the requested data. If your read or write request rate, combined with the cumulative size of the requested data, exceeds the current reserved capacity, DynamoDB returns an error that indicates that the provisioned throughput level has been exceeded.
Set your provisioned throughput using the ProvisionedThroughput parameter. For information about setting the ProvisionedThroughput parameter, see CreateTable in the Amazon DynamoDB API Reference.
For information about using provisioned throughput, see Guidelines for Working with Tables.
If you expect upcoming spikes in your workload (such as a new product launch) that will cause your throughput to exceed the current provisioned throughput for your table, we advise that you use the UpdateTable operation to increase the ProvisionedThroughput value. For the current maximum provisioned throughput values per table or account, see Limits in DynamoDB.
When you issue an UpdateTable request, the status of the table changes from ACTIVE to UPDATING. The table remains fully available for use while it is UPDATING. During this time, DynamoDB allocates the necessary resources to support the new provisioned throughput levels. When this process is complete, the table status changes from UPDATING back to ACTIVE.
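A throughput change of this kind might be sketched as follows. The function takes the client as a parameter so it can be exercised without a live connection; in practice you would pass a boto3 DynamoDB client, and the table name here is a hypothetical example.

```python
def update_provisioned_throughput(client, table_name, read_units, write_units):
    """Request new provisioned throughput via UpdateTable.

    `client` is assumed to expose a boto3-style update_table method.
    While DynamoDB applies the change, the table status is UPDATING,
    but the table remains available for reads and writes.
    """
    return client.update_table(
        TableName=table_name,
        ProvisionedThroughput={
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    )
```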
Capacity Unit Calculations
In DynamoDB, capacity units are defined as follows:
One read capacity unit = one strongly consistent read per second, or two eventually consistent reads per second, for items up to 4 KB in size.
One write capacity unit = one write per second, for items up to 1 KB in size.
The size of an item is the sum of the lengths of its attribute names and values. (To optimize capacity unit consumption, we recommend that you minimize the length of attribute names.)
The size of a Null or Boolean attribute value is (length of the attribute name + one byte).
An attribute of type List or Map requires 3 bytes of overhead, regardless of its contents. The size of an empty List or Map is (length of the attribute name + 3 bytes). If the attribute is non-empty, the size is (length of the attribute name + sum (length of attribute values) + 3 bytes).
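The sizing rules above can be sketched as a small estimator. This is only a rough sketch of the rules stated here: strings are measured as UTF-8 bytes, Number sizing (which is more involved in DynamoDB) is approximated by the length of the value's string form, and per-element overhead inside nested types is not modeled.

```python
def value_size(value):
    """Approximate size in bytes of an attribute value, per the rules above."""
    if value is None or isinstance(value, bool):
        return 1  # Null / Boolean: one byte
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, list):
        return 3 + sum(value_size(v) for v in value)  # List: 3 bytes overhead
    if isinstance(value, dict):
        # Map: 3 bytes overhead, plus each key name and its value
        return 3 + sum(len(k.encode("utf-8")) + value_size(v)
                       for k, v in value.items())
    return len(str(value))  # Numbers: rough approximation

def attribute_size(name, value):
    """Item size sums attribute name lengths plus value sizes."""
    return len(name.encode("utf-8")) + value_size(value)
```

For example, a Boolean attribute named "Flag" is estimated at 5 bytes (4 for the name plus 1 for the value), and an empty List named "Tags" at 7 bytes (4 plus 3 bytes of overhead).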
The following describes how DynamoDB read operations consume read capacity units:
GetItem—reads a single item from a table. For capacity unit calculations, the item size is rounded up to the next 4 KB boundary. For example, if you read an item that is 3.5 KB, DynamoDB rounds the item size to 4 KB. If you read an item of 10 KB, DynamoDB rounds the item size to 12 KB.
BatchGetItem—reads up to 100 items, from one or more tables. DynamoDB processes each item in the batch as an individual GetItem request, so DynamoDB first rounds up the size of each item to the next 4 KB boundary, and then calculates the total size. The result is not necessarily the same as the total size of all the items. For example, if BatchGetItem reads a 1.5 KB item and a 6.5 KB item, DynamoDB will calculate the size as 12 KB (4 KB + 8 KB), not 8 KB (1.5 KB + 6.5 KB).
Query—reads multiple items that have the same partition key value. All of the items returned are treated as a single read operation, where DynamoDB computes the total size of all items and then rounds up to the next 4 KB boundary. For example, suppose your query returns 10 items whose combined size is 40.8 KB. DynamoDB rounds the item size for the operation to 44 KB. If a query returns 1500 items of 64 bytes each, the cumulative size is 96 KB.
Scan—reads all of the items in a table. DynamoDB considers the size of the items that are evaluated, not the size of the items returned by the scan.
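The different rounding behaviors of these read operations can be sketched as follows. This is a sketch of the rules as described above, not an official formula; it assumes 1 KB = 1,024 bytes.

```python
import math

def get_item_units(size_bytes):
    """GetItem: round the single item up to the next 4 KB boundary."""
    return math.ceil(size_bytes / 4096)

def batch_get_units(sizes_bytes):
    """BatchGetItem: round each item up individually, then total."""
    return sum(get_item_units(s) for s in sizes_bytes)

def query_units(sizes_bytes):
    """Query: total all item sizes first, then round up once."""
    return math.ceil(sum(sizes_bytes) / 4096)
```

Note how the order of rounding matters: a BatchGetItem of a 1.5 KB item and a 6.5 KB item costs 3 units (4 KB + 8 KB), while a Query returning the same two items would cost only 2 units (8 KB total, rounded up once).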
If you perform a read operation on an item that does not exist, DynamoDB will still consume provisioned read throughput: A strongly consistent read request consumes one read capacity unit, while an eventually consistent read request consumes 0.5 of a read capacity unit.
For any operation that returns items, you can request a subset of attributes to retrieve; however, doing so has no impact on the item size calculations. In addition, Scan can return item counts instead of attribute values. Getting the count of items uses the same quantity of read capacity units and is subject to the same item size calculations, because DynamoDB has to read each item in order to increment the count.
Read Operations and Read Consistency
The preceding calculations assume strongly consistent read requests. An eventually consistent read request consumes only half the capacity units. For example, if the total item size is 80 KB, an eventually consistent read of it consumes only 10 capacity units.
The following describes how DynamoDB write operations consume write capacity units:
PutItem—writes a single item to a table. If an item with the same primary key exists in the table, the operation replaces the item. For calculating provisioned throughput consumption, the item size that matters is the larger of the two.
UpdateItem—modifies a single item in the table. DynamoDB considers the size of the item as it appears before and after the update. The provisioned throughput consumed reflects the larger of these item sizes. Even if you update just a subset of the item's attributes, UpdateItem will still consume the full amount of provisioned throughput (the larger of the "before" and "after" item sizes).
DeleteItem—removes a single item from a table. The provisioned throughput consumption is based on the size of the deleted item.
BatchWriteItem—writes up to 25 items to one or more tables. DynamoDB processes each item in the batch as an individual PutItem or DeleteItem request (updates are not supported), so DynamoDB first rounds up the size of each item to the next 1 KB boundary, and then calculates the total size. The result is not necessarily the same as the total size of all the items. For example, if BatchWriteItem writes a 500 byte item and a 3.5 KB item, DynamoDB will calculate the size as 5 KB (1 KB + 4 KB), not 4 KB (500 bytes + 3.5 KB).
For item-level write operations such as PutItem and DeleteItem, DynamoDB rounds the item size up to the next 1 KB. For example, if you put or delete an item of 1.6 KB, DynamoDB rounds the item size up to 2 KB.
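These write-side rounding rules can be sketched in the same style as the read-side calculations. Again, this is a sketch under the assumption that 1 KB = 1,024 bytes.

```python
import math

def item_write_units(size_bytes):
    """PutItem / DeleteItem: round the item up to the next 1 KB boundary."""
    return math.ceil(size_bytes / 1024)

def batch_write_units(sizes_bytes):
    """BatchWriteItem: round each item up individually, then total."""
    return sum(item_write_units(s) for s in sizes_bytes)
```

For example, a 1.6 KB item costs 2 write capacity units, and a batch containing a 500 byte item and a 3.5 KB item costs 5 units (1 KB + 4 KB), matching the examples above.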
DynamoDB supports conditional writes, where you specify an expression that must evaluate to true in order for the operation to succeed. If the expression evaluates to false, DynamoDB will still consume write capacity units from the table:
If the item exists, the number of write capacity units consumed depends on the size of the item. (For example, a failed conditional write of a 1 KB item would consume one write capacity unit; if the item were twice that size, the failed conditional write would consume two write capacity units.)
If the item does not exist, DynamoDB will consume one write capacity unit.
Listing and Describing Tables
To obtain a list of all your tables, use the ListTables operation. A single ListTables call can return a maximum of 100 table names; if you have more than 100 tables, you can request that ListTables return paginated results, so that you can retrieve all of the table names.
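The pagination loop can be sketched as follows. The function takes the client as a parameter so it can be exercised without a live connection; with boto3 you would pass a DynamoDB client (boto3 also offers a built-in paginator for ListTables).

```python
def list_all_tables(client):
    """Collect every table name by following ListTables pagination.

    `client` is assumed to expose a boto3-style list_tables method.
    Each response may carry LastEvaluatedTableName; passing it back as
    ExclusiveStartTableName fetches the next page.
    """
    names, kwargs = [], {}
    while True:
        resp = client.list_tables(**kwargs)
        names.extend(resp.get("TableNames", []))
        last = resp.get("LastEvaluatedTableName")
        if last is None:
            return names  # no more pages
        kwargs = {"ExclusiveStartTableName": last}
```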
To determine the structure of any table, use the DescribeTable operation. The metadata returned by DescribeTable includes the timestamp when the table was created, its key schema, its provisioned throughput settings, its estimated size, and any secondary indexes that are present.
If you issue a DescribeTable request immediately after a CreateTable request, DynamoDB might return a ResourceNotFoundException. This is because DescribeTable uses an eventually consistent query, and the metadata for your table might not be available at that moment. Wait a few seconds, and then try the DescribeTable request again.
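A simple retry wrapper for this situation might look like the following sketch. The client is injected so the logic can be exercised without a live connection; with a real boto3 client, the error surfaces as `client.exceptions.ResourceNotFoundException`, which this sketch matches by name.

```python
import time

def describe_table_with_retry(client, table_name, attempts=5, delay=2.0):
    """Retry DescribeTable until the eventually consistent metadata
    becomes available after CreateTable.

    `client` is assumed to expose a boto3-style describe_table method.
    """
    for attempt in range(attempts):
        try:
            return client.describe_table(TableName=table_name)
        except Exception as err:
            # Re-raise anything that is not the not-yet-visible case,
            # or if we are out of attempts.
            if "ResourceNotFound" not in str(err) or attempt == attempts - 1:
                raise
            time.sleep(delay)  # wait a few seconds, then try again
```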
In computing the storage used by a table, DynamoDB adds 100 bytes of overhead to each item for indexing purposes. The DescribeTable operation returns a table size that includes this overhead, and the overhead is also included when billing you for storage costs. However, this extra 100 bytes is not used in capacity unit calculations. For more information about pricing, go to DynamoDB Pricing.