Amazon DynamoDB
Developer Guide (API Version 2012-08-10)

DynamoDB Data Model

Tables, Items, and Attributes

In Amazon DynamoDB, a table is a collection of items and each item is a collection of attributes.

In a relational database, a table has a predefined schema: the table name, the primary key, and the list of its column names and their data types. All records stored in the table must have the same set of columns. In contrast, DynamoDB only requires that a table has a primary key; it does not require you to define all of the attribute names and data types in advance. Individual items in a DynamoDB table can have any number of attributes, although there is a limit of 400 KB on the item size. The size of an item is the sum of the lengths of its attribute names and values (binary and UTF-8 lengths).
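As a rough illustration of the size rule above, the following Python sketch adds up the UTF-8 lengths of attribute names and of String or Binary values. The function name estimate_item_size and the reading of 400 KB as 400 x 1024 bytes are assumptions made for this example, and other data types are deliberately not handled.

```python
# Assumption: 400 KB is read as 400 * 1024 bytes for this sketch.
MAX_ITEM_BYTES = 400 * 1024


def estimate_item_size(item):
    """Sum the UTF-8 length of each attribute name and the length of each value.

    Only str (String) and bytes (Binary) values are handled in this sketch.
    """
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        elif isinstance(value, bytes):
            total += len(value)
        else:
            raise TypeError(f"unsupported value type for attribute {name!r}")
    return total


item = {"ProductName": "Book 101 Title", "ISBN": "111-1111111111"}
size = estimate_item_size(item)
fits = size <= MAX_ITEM_BYTES
```

Note that a non-ASCII character counts for more than one byte, because the UTF-8 length is used rather than the character count.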

Each attribute in an item is a name-value pair. An attribute can be a scalar (single-valued), a JSON document, or a set. For example, consider storing a catalog of products in DynamoDB. You can create a table, ProductCatalog, with the Id attribute as its primary key. The primary key uniquely identifies each item, so that no two products in the table can have the same Id.

ProductCatalog ( Id, ... )

You can store various kinds of product items in the table. The following table shows sample items.

Example items
{ 
   Id = 101                                       
   ProductName = "Book 101 Title"
   ISBN = "111-1111111111"
   Authors = [ "Author 1", "Author 2" ]
   Price = 2
   Dimensions = "8.5 x 11.0 x 0.5"
   PageCount = 500
   InPublication = 1
   ProductCategory = "Book" 
}                                    
{
   Id = 201 
   ProductName = "18-Bicycle 201"
   Description = "201 description"
   BicycleType = "Road"
   Brand = "Brand-Company A"
   Price = 100
   Color = [ "Red", "Black" ]
   ProductCategory = "Bike"
}
{
   Id = 202 
   ProductName = "21-Bicycle 202"
   Description = "202 description"
   BicycleType = "Road"
   Brand = "Brand-Company A"
   Price = 200
   Color = [ "Green", "Black" ]
   ProductCategory = "Bike"
}

In the example, the ProductCatalog table has one book item and two bicycle items. Item 101 is a book with many attributes, including a set of Authors. Items 201 and 202 are bikes, and these items are available in different colors. The Id is the only required attribute.
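The idea that only the primary key is required can be sketched with a small in-memory stand-in for a table. The class SchemalessTable below is hypothetical and is not part of any DynamoDB SDK; it only shows that items with different attribute sets can coexist as long as each carries the key attribute.

```python
class SchemalessTable:
    """Toy in-memory table: only the primary key attribute is required."""

    def __init__(self, key_name):
        self.key_name = key_name
        self.items = {}  # primary key value -> item

    def put_item(self, item):
        if self.key_name not in item:
            raise ValueError(f"missing primary key attribute {self.key_name!r}")
        # Keying the store by the primary key value means no two items can
        # share a key; in this sketch a second put with the same key
        # replaces the first.
        self.items[item[self.key_name]] = dict(item)


catalog = SchemalessTable("Id")
catalog.put_item({"Id": 101, "ProductName": "Book 101 Title", "PageCount": 500})
catalog.put_item({"Id": 201, "ProductName": "18-Bicycle 201", "BicycleType": "Road"})
```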

Primary Key

When you create a table, in addition to the table name, you must specify the primary key of the table. The primary key uniquely identifies each item in the table, so that no two items can have the same key.

DynamoDB supports two different kinds of primary keys:

  • Partition Key – A simple primary key, composed of one attribute known as the partition key. DynamoDB uses the partition key's value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. No two items in a table can have the same partition key value.

  • Partition Key and Sort Key – A composite primary key, composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key. DynamoDB uses the partition key value as input to an internal hash function; the output from the hash function determines the partition where the item will be stored. All items with the same partition key are stored together, in sorted order by sort key value. It is possible for two items to have the same partition key value, but those two items must have different sort key values.

Note

The partition key of an item is also known as its hash attribute. The term hash attribute derives from DynamoDB's use of an internal hash function to evenly distribute data items across partitions, based on their partition key values.

The sort key of an item is also known as its range attribute. The term range attribute derives from the way DynamoDB stores items with the same partition key physically close together, in sorted order by the sort key value.

For more information on how DynamoDB stores and retrieves data, see Item Distribution.

You must define the data type for each primary key attribute: String, Number, or Binary.

Different applications will have different requirements for tables and primary keys. For example, Amazon Web Services maintains several forums (see Discussion Forums). Each forum has many threads of discussion and each thread has many replies. You could potentially model this by creating the following three tables:

Table Name                           Primary Key Type   Partition Key Name   Sort Key Name
Forum ( Name, ... )                  Simple             Name                 -
Thread ( ForumName, Subject, ... )   Composite          ForumName            Subject
Reply ( Id, ReplyDateTime, ... )     Composite          Id                   ReplyDateTime

In this example, both the Thread and Reply tables have composite primary keys (partition key and sort key). For the Thread table, each forum name can have one or more subjects. In this case, ForumName is the partition key and Subject is the sort key.

The Reply table has Id as the partition key and ReplyDateTime as the sort key. The reply Id identifies the thread to which the reply belongs. When you design DynamoDB tables, keep in mind that DynamoDB does not support cross-table joins. For this reason, the Reply table stores both the forum name and the subject in the Id attribute. Given a reply item, you can parse the Id attribute to find the forum name and subject, and then use that information to query the Thread or Forum tables.

This developer guide uses these tables to illustrate DynamoDB functionality. For information about these tables and the sample data stored in them, see Creating Tables and Loading Sample Data.
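Parsing the reply Id might look like the following Python sketch. The text above does not say how the forum name and subject are combined, so the "#" separator, the function name split_reply_id, and the sample value are assumptions made for illustration only.

```python
def split_reply_id(reply_id, separator="#"):
    """Split a reply Id into (forum_name, subject).

    Assumption: the Id has the form "<ForumName>#<Subject>". The separator
    is hypothetical; use whatever convention your data follows.
    """
    forum_name, found, subject = reply_id.partition(separator)
    if not found:
        raise ValueError(f"separator {separator!r} not found in {reply_id!r}")
    return forum_name, subject


forum_name, subject = split_reply_id("Amazon DynamoDB#DynamoDB Thread 1")
```

The two values can then serve as the partition key and sort key of a lookup in the Thread table, which stands in for the join that DynamoDB does not perform.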

Secondary Indexes

When you create a table with a composite primary key (partition key and sort key), you can optionally define one or more secondary indexes on that table. A secondary index lets you query the data in the table using an alternate key, in addition to queries against the primary key.

With the Reply table, you can query data items by Id (partition key) or by Id and ReplyDateTime (partition key and sort key). Now suppose you had an attribute in the table called PostedBy with the user ID of the person who posted each reply. With a secondary index on PostedBy, you could query the data by Id (partition key) and PostedBy (sort key). Such a query would let you retrieve all the replies posted by a particular user in a thread, with maximum efficiency and without having to access any other items.

DynamoDB supports two kinds of secondary indexes:

  • Global secondary index – an index with a partition key and sort key that can be different from those on the table.

  • Local secondary index – an index that has the same partition key as the table, but a different sort key.

You can define up to 5 global secondary indexes and 5 local secondary indexes per table. For more information, see Improving Data Access with Secondary Indexes in DynamoDB.
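The Reply and PostedBy example can be sketched as an in-memory index that keeps the table's partition key (Id) but orders items by a different attribute, which is the shape of a local secondary index. The helper names and the sample replies below are hypothetical, and the code illustrates the concept rather than how DynamoDB implements indexes.

```python
from collections import defaultdict

# Hypothetical sample replies; the Id format is an assumption.
replies = [
    {"Id": "Forum A#Thread 1", "ReplyDateTime": "2015-09-15T19:58:22", "PostedBy": "User B"},
    {"Id": "Forum A#Thread 1", "ReplyDateTime": "2015-09-22T19:58:22", "PostedBy": "User A"},
    {"Id": "Forum A#Thread 1", "ReplyDateTime": "2015-09-29T19:58:22", "PostedBy": "User A"},
    {"Id": "Forum A#Thread 2", "ReplyDateTime": "2015-10-05T19:58:22", "PostedBy": "User A"},
]


def build_index(items, partition_key, sort_key):
    """Group items by partition key and order each group by the alternate sort key."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_key]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[sort_key])
    return index


def query_index(index, sort_key, partition_value, sort_value):
    """Return the items in one partition whose alternate sort key equals sort_value."""
    return [item for item in index.get(partition_value, []) if item[sort_key] == sort_value]


by_poster = build_index(replies, "Id", "PostedBy")
user_a_replies = query_index(by_poster, "PostedBy", "Forum A#Thread 1", "User A")
```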

DynamoDB Data Types

Amazon DynamoDB supports the following data types:

  • Scalar types – Number, String, Binary, Boolean, and Null.

  • Document types – List and Map.

  • Set types – String Set, Number Set, and Binary Set.

For example, in the ProductCatalog table, the Id is a Number type attribute and Authors is a String Set type attribute. Note that primary key attributes must be of type String, Number, or Binary.

The following are descriptions of each data type, along with examples. Note that the examples use JSON syntax.

Scalar Data Types

String

Strings are Unicode with UTF-8 binary encoding. There is no upper limit to the string size when you assign it to an attribute except when the attribute is part of the primary key. For more information, see Limits in DynamoDB. Also, the length of the attribute is constrained by the 400 KB item size limit. Note that the length of the attribute must be greater than zero.

String value comparison is used when returning ordered results in the Query and Scan API actions. Comparison is based on ASCII character code values. For example, "a" is greater than "A", and "aa" is greater than "B". For a list of code values, see http://en.wikipedia.org/wiki/ASCII#ASCII_printable_characters.
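The ordering rule can be checked with a short Python sketch. It compares strings by their UTF-8 bytes, which for ASCII strings is the same as comparing character code values; the helper name sort_strings is an assumption made for this example.

```python
def sort_strings(values):
    """Order strings by their encoded bytes, so uppercase sorts before lowercase."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


ordered = sort_strings(["aa", "B", "a", "A"])
```

Here ordered is ["A", "B", "a", "aa"], because every uppercase ASCII letter has a lower code value than every lowercase letter.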

Example

"Bicycle"

Number

Numbers can have up to 38 digits of precision, and can be positive, negative, or zero.

  • Positive range: 1E-130 to 9.9999999999999999999999999999999999999E+125

  • Negative range: -9.9999999999999999999999999999999999999E+125 to -1E-130

In DynamoDB, numbers are represented as variable length. Leading and trailing zeroes are trimmed.

All numbers are sent to DynamoDB as String types, which maximizes compatibility across languages and libraries. However, DynamoDB handles them as the Number type for mathematical operations.

Note

If number precision is important, you should pass numbers to DynamoDB using strings that you convert from a number type. DynamoDB limits numbers to 38 digits. More than 38 digits will cause an error.
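A client-side check along these lines can be sketched in Python with the standard decimal module. The function name to_number_string is hypothetical, and counting significant digits after dropping trailing zeroes is an assumption about how the 38-digit limit applies.

```python
from decimal import Decimal

MAX_DIGITS = 38  # precision limit stated above


def to_number_string(value):
    """Convert an int, Decimal, or numeric string to the string form sent over the wire."""
    if isinstance(value, float):
        # A float may already have lost precision; start from a string instead.
        raise TypeError("pass a string, int, or Decimal rather than a float")
    number = Decimal(value)  # exact conversion, independent of context precision
    if not number.is_finite():
        raise ValueError("number must be finite")
    # Assumption: trailing zeroes do not count toward the digit limit.
    digits = "".join(str(d) for d in number.as_tuple().digits).rstrip("0")
    if len(digits) > MAX_DIGITS:
        raise ValueError(f"more than {MAX_DIGITS} significant digits")
    return str(number)


wire_value = to_number_string("300")
```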

Example

"300"

Binary

Binary type attributes can store any binary data, for example compressed data, encrypted data, or images. DynamoDB treats each byte of the binary data as unsigned when it compares binary values, for example when evaluating query expressions.

There is no upper limit to the length of the binary value when you assign it to an attribute except when the attribute is part of the primary key. For more information, see Limits in DynamoDB. Also, the length of the attribute is constrained by the 400 KB item size limit. Note that the length of the attribute must be greater than zero.

Client applications must encode binary values in Base64 format. When DynamoDB receives the data from the client, it decodes the data into an unsigned byte array and uses the length of that array as the length of the attribute.
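The encode and decode steps can be sketched with Python's standard base64 module. The payload here is arbitrary sample text, and the variable names are assumptions made for the example.

```python
import base64

payload = b"this text is base64-encoded"

# What the client sends: the Base64 text form of the raw bytes.
encoded = base64.b64encode(payload).decode("ascii")

# What the receiver works with: the decoded bytes, whose length is the
# length of the attribute.
decoded = base64.b64decode(encoded)
attribute_length = len(decoded)

# Python compares bytes objects byte by byte as unsigned values, so 0x80
# sorts after 0x7F, matching the unsigned comparison described above.
high_byte_is_greater = bytes([0x80]) > bytes([0x7F])
```

The attribute length is measured on the decoded bytes (27 here), not on the longer Base64 text.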

The following example is a binary attribute, using Base64-encoded text.

Example

"dGhpcyB0ZXh0IGlzIGJhc2U2NC1lbmNvZGVk"

Boolean

A Boolean type attribute can store either true or false.

Example

true

Null

Null represents an attribute with an unknown or undefined state.

Example

NULL

Document Data Types

DynamoDB supports List and Map data types, which can be nested to represent complex data structures.

  • A List type contains an ordered collection of values.

  • A Map type contains an unordered collection of name-value pairs.

Lists and maps are ideal for storing JSON documents. The List data type is similar to a JSON array, and the Map data type is similar to a JSON object. There are no restrictions on the data types that can be stored in List or Map elements, and the elements do not have to be of the same type.

The following example shows a Map that contains a String, a Number, and a nested List (which itself contains another Map).

Example

{
    Day: "Monday",
    UnreadEmails: 42,
    ItemsOnMyDesk: [
        "Coffee Cup",
        "Telephone",
        {
            Pens: { Quantity : 3},
            Pencils: { Quantity : 2},
            Erasers: { Quantity : 1}
        }
    ]
}

Note

DynamoDB lets you access individual elements within lists and maps, even if those elements are deeply nested. For more information, see Document Paths.
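To make nested access concrete, the Map from the example can be written as a Python dict and walked one step at a time. The helper get_path and its list-of-steps argument are assumptions for this sketch; they are not DynamoDB's document path syntax.

```python
desk = {
    "Day": "Monday",
    "UnreadEmails": 42,
    "ItemsOnMyDesk": [
        "Coffee Cup",
        "Telephone",
        {
            "Pens": {"Quantity": 3},
            "Pencils": {"Quantity": 2},
            "Erasers": {"Quantity": 1},
        },
    ],
}


def get_path(document, steps):
    """Follow map keys (str) and list indexes (int) down into a nested document."""
    current = document
    for step in steps:
        current = current[step]
    return current


pen_quantity = get_path(desk, ["ItemsOnMyDesk", 2, "Pens", "Quantity"])
```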

Set Data Types

DynamoDB also supports types that represent number sets, string sets and binary sets. Attributes such as an Authors attribute in a book item and a Color attribute of a product item are examples of String Set type attributes. Because each of these types is a set, the values in each must be unique. Attribute sets are not ordered; the order of the values returned in a set is not preserved. DynamoDB does not support empty sets.

Examples

["Black", "Green", "Red"]

["42.2", "-19", "7.5", "3.14"]

["U3Vubnk=", "UmFpbnk=", "U25vd3k="]
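The set rules above (unique values, no guaranteed order, no empty sets) can be sketched with a small validating helper. The function make_set is hypothetical and only mirrors the constraints described in this section.

```python
def make_set(values):
    """Build an unordered set, rejecting empty input and duplicate values."""
    values = list(values)
    if not values:
        raise ValueError("empty sets are not supported")
    unique = frozenset(values)
    if len(unique) != len(values):
        raise ValueError("set values must be unique")
    return unique


colors = make_set(["Black", "Green", "Red"])
same_colors = make_set(["Red", "Black", "Green"])
```

Because order is not part of a set's value, colors and same_colors compare equal.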

Item Distribution

DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid state drives (SSDs) and automatically replicated across three facilities within an AWS region. Partition management is handled entirely by DynamoDB—customers never need to manage partitions themselves. If your storage requirements exceed a partition's capacity, DynamoDB allocates additional partitions automatically.

When you create a table, the initial status of the table is CREATING. During this phase, DynamoDB allocates one partition for the table. You can begin writing and reading table data after the table status changes to ACTIVE.

As the amount of data in the table approaches the partition's maximum capacity, DynamoDB allocates another partition to the table, and then distributes the data items among the old partition and the new one. This activity occurs in the background, and is transparent to your applications. The more data you add to the table, the more partitions that DynamoDB will allocate—as many as necessary to store your table's data.

DynamoDB does not deallocate or coalesce partitions. If a table spans multiple partitions, and you delete most of the data (or all of it), the partitions will still be allocated to the table.

Note

The information in this section also pertains to global secondary indexes. Index data is stored separately from table data; otherwise, DynamoDB manages index partitions in the same way that it manages table partitions.

Item Distribution: Partition Key

If your table has a simple primary key (partition key only), DynamoDB stores and retrieves each item based on its partition key value.

To write an item to the table, DynamoDB uses the value of the partition key as input to an internal hash function. The output value from the hash function determines the partition in which the item will be stored.

To read an item from the table, you must specify the partition key value for the item. DynamoDB uses this value as input to its hash function, yielding the partition in which the item can be found.
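The write and read paths described above both reduce to "hash the partition key, then pick a partition". A minimal sketch follows. DynamoDB's hash function is internal and not documented here, so MD5 and the modulo step are stand-ins chosen purely for illustration, and partition_for is a hypothetical name.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key value to a partition number.

    Assumption: MD5 plus modulo stands in for DynamoDB's internal hash function.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partition_count


# The same key value always maps to the same partition, which is what lets
# a read find the item that an earlier write stored.
write_target = partition_for("Dog", 4)
read_target = partition_for("Dog", 4)
```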

The following diagram shows a table named Pets, which spans multiple partitions. The table's primary key is AnimalType. (Only this key attribute is shown.) DynamoDB uses its hash function to determine where to store a new item, in this case based on the hash value of the string Dog. Note that the items are not stored in sorted order; each item's location is determined by the hash value of its partition key.

[Figure: Partition Key]

Note

DynamoDB is optimized for uniform distribution of items across a table's partitions, no matter how many partitions there may be. We recommend that you choose a partition key that can have a large number of distinct values relative to the number of items in the table. For more information, see Guidelines for Working with Tables.

Item Distribution: Partition Key and Sort Key

If the table has a composite primary key (partition key and sort key), DynamoDB calculates the hash value of the partition key in the same way as described in Item Distribution: Partition Key—but it stores all of the items with the same partition key value physically close together, ordered by sort key value.

To write an item to the table, DynamoDB calculates the hash value of the partition key to determine which partition should contain the item. In that partition, there could be several items with the same partition key value, so DynamoDB stores the item among the others with the same partition key, in ascending order by sort key.

To read an item from the table, you must specify its partition key value and sort key value. DynamoDB calculates the partition key's hash value, yielding the partition in which the item can be found.

You can read multiple items from the table in a single operation (Query), provided that the items you want have the same partition key value. DynamoDB will return all of the items with that partition key value. You can optionally apply a condition to the sort key, to return only the items within a certain range of values.

Suppose that the Pets table has a composite primary key consisting of AnimalType (partition key) and Name (sort key). The following diagram shows DynamoDB writing an item with a partition key value of Dog, and a sort key value of Fido.

[Figure: Partition Key and Sort Key]

To read that same item from the Pets table, DynamoDB calculates the hash value of Dog, yielding the partition in which these items are stored. DynamoDB then scans the sort key attribute values until it finds Fido.

To read all of the items with an AnimalType of Dog, you can issue a Query operation without specifying a sort key condition. By default, the items are returned in the order in which they are stored: in ascending order by sort key. (You can optionally request descending order instead.)

To query only some of the Dog items, you can apply a condition to the sort key—for example, only the Dog items where Name is within the range A through K.
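The Pets example can be sketched as an in-memory structure that keeps each partition key's sort keys in ascending order and answers range queries over them. The class CompositeTable, its method names, and the sample pets are hypothetical; the sketch models the behavior described above and is not the DynamoDB API.

```python
import bisect
from collections import defaultdict


class CompositeTable:
    """Toy table with a composite key: items sharing a partition key are kept sorted by sort key."""

    def __init__(self):
        self._sort_keys = defaultdict(list)  # partition key -> sorted sort keys
        self._items = {}                     # (partition key, sort key) -> item

    def put(self, partition_key, sort_key, item):
        if (partition_key, sort_key) not in self._items:
            bisect.insort(self._sort_keys[partition_key], sort_key)
        self._items[(partition_key, sort_key)] = item

    def query(self, partition_key, low=None, high=None, ascending=True):
        """Return items for one partition key, optionally limited to low <= sort key <= high."""
        keys = self._sort_keys.get(partition_key, [])
        start = 0 if low is None else bisect.bisect_left(keys, low)
        stop = len(keys) if high is None else bisect.bisect_right(keys, high)
        selected = [self._items[(partition_key, k)] for k in keys[start:stop]]
        return selected if ascending else selected[::-1]


pets = CompositeTable()
for animal_type, name in [("Dog", "Fido"), ("Dog", "Rex"), ("Dog", "Bella"), ("Cat", "Tom")]:
    pets.put(animal_type, name, {"AnimalType": animal_type, "Name": name})

all_dogs = pets.query("Dog")
dogs_a_to_k = pets.query("Dog", low="A", high="K")
dogs_descending = pets.query("Dog", ascending=False)
```

In this sketch a query never looks at items under a different partition key, and the range condition narrows the result by position in the sorted key list rather than by checking every item.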

Note

In a DynamoDB table, there is no upper limit on the number of distinct sort key values per partition key value. If you needed to store many billions of Dog items in the Pets table, DynamoDB would automatically allocate enough storage to handle this requirement.