Amazon DynamoDB
Developer Guide (API Version 2012-08-10)

Amazon DynamoDB Storage Backend for Titan

The DynamoDB Storage Backend for Titan package is a storage backend for the Titan graph database implemented on top of Amazon DynamoDB. Titan is a scalable graph database optimized for storing and querying graphs. The DynamoDB Storage Backend for Titan package is available on GitHub. The Titan: Distributed Graph Database is also available on GitHub as Titan Version 0.4.4, Titan Version 0.5.4, and Titan Version 1.0.0.

The following sections describe graph databases, some use cases for graph databases, and how to use the Titan: Distributed Graph Database with the DynamoDB Storage Backend.

Working with Graph Databases

A graph database is a store of vertices and directed edges that connect those vertices. Both vertices and edges can have properties stored as key-value pairs.

A graph database uses adjacency lists or matrices for storing edges to allow simple traversal. A graph in a graph database can be traversed along specific edge types, or across the entire graph.

Graph databases can represent how entities relate by using actions, ownership, parentage, and so on. Whenever connections or relationships between entities are at the core of the data you are trying to model, a graph database is a natural choice. Therefore, graph databases are useful for modeling and querying social networks, business relationships, dependencies, shipping movements, and similar items.

You can use edges in a graph database to show typed relationships between entities (also called vertices or nodes). Edges can describe parent-child relationships, actions, product recommendations, purchases, and so on. A relationship, or edge, is a connection between two vertices that always has a start node, end node, type, and direction. An important rule of graph databases is that no broken links are allowed. Every link describes a relationship between two nodes. Deleting a node will delete all its incident relationships (that is, relationships that begin or end in the node that is being deleted).
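As a rough illustration of these rules, the following Python sketch keeps vertices and typed, directed edges in adjacency lists and removes all incident edges when a vertex is deleted. The class and method names (PropertyGraph, add_vertex, add_edge, remove_vertex) and the sample data are hypothetical. This is a toy in-memory model under those assumptions, not Titan's storage format or API.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy in-memory property graph (illustrative only, not Titan's API)."""

    def __init__(self):
        self.vertices = {}                 # vertex id -> properties (key-value pairs)
        self.edges = {}                    # edge id -> (start, end, type, properties)
        self.out_edges = defaultdict(set)  # vertex id -> ids of edges that start there
        self.in_edges = defaultdict(set)   # vertex id -> ids of edges that end there
        self._next_edge_id = 0

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = dict(properties)

    def add_edge(self, start, end, edge_type, **properties):
        # No broken links: both endpoints must already exist.
        if start not in self.vertices or end not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        edge_id = self._next_edge_id
        self._next_edge_id += 1
        self.edges[edge_id] = (start, end, edge_type, dict(properties))
        self.out_edges[start].add(edge_id)
        self.in_edges[end].add(edge_id)
        return edge_id

    def remove_vertex(self, vertex_id):
        if vertex_id not in self.vertices:
            raise KeyError(vertex_id)
        # Deleting a vertex deletes every edge that begins or ends at it.
        incident = self.out_edges[vertex_id] | self.in_edges[vertex_id]
        for edge_id in incident:
            start, end, _, _ = self.edges.pop(edge_id)
            self.out_edges[start].discard(edge_id)
            self.in_edges[end].discard(edge_id)
        del self.out_edges[vertex_id], self.in_edges[vertex_id]
        del self.vertices[vertex_id]


# Hypothetical sample data.
g = PropertyGraph()
g.add_vertex("alice", kind="person")
g.add_vertex("bob", kind="person")
g.add_vertex("chess", kind="hobby")
g.add_edge("alice", "bob", "friend", since=2014)
g.add_edge("bob", "chess", "likes")

g.remove_vertex("bob")     # also removes both edges that touch "bob"
print(sorted(g.vertices))  # ['alice', 'chess']
print(g.edges)             # {}
```

Rejecting edges to missing vertices in add_edge and cascading deletes in remove_vertex are the two places where this sketch enforces the "no broken links" rule.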

The following is an example of a social network graph.

Figure: An example social network graph.

This example models a group of friends and their hobbies as a graph.


Each edge has a direction, indicated by the arrow, but the edge information is still stored in both the "in" and the "out" nodes.

A simple traversal of this graph can tell you what Justin's friends like.
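The traversal described here can be sketched in a few lines of Python. The sketch assumes a hypothetical data set: only Justin appears in this guide, so the friend names, the hobbies, and the edge labels "friend" and "likes" are invented for illustration. Each edge is recorded at both its "out" and its "in" vertex, which lets the graph be walked in either direction.

```python
from collections import defaultdict

# Each edge is recorded twice: under "out" at its start vertex
# and under "in" at its end vertex.
adjacency = defaultdict(lambda: {"out": [], "in": []})


def add_edge(start, edge_type, end):
    adjacency[start]["out"].append((edge_type, end))
    adjacency[end]["in"].append((edge_type, start))


def out_neighbors(vertex, edge_type):
    """Vertices reached by following edges of one type out of `vertex`."""
    return [other for label, other in adjacency[vertex]["out"] if label == edge_type]


def in_neighbors(vertex, edge_type):
    """Vertices that have an edge of one type pointing at `vertex`."""
    return [other for label, other in adjacency[vertex]["in"] if label == edge_type]


def friends_likes(person):
    """Two-hop traversal: person -friend-> friend -likes-> hobby."""
    hobbies = set()
    for friend in out_neighbors(person, "friend"):
        hobbies.update(out_neighbors(friend, "likes"))
    return sorted(hobbies)


# Hypothetical data; only "Justin" comes from the example in this guide.
add_edge("Justin", "friend", "Ana")
add_edge("Justin", "friend", "Ravi")
add_edge("Ana", "likes", "cycling")
add_edge("Ravi", "likes", "chess")
add_edge("Ravi", "likes", "cycling")

print(friends_likes("Justin"))           # ['chess', 'cycling']
print(in_neighbors("cycling", "likes"))  # ['Ana', 'Ravi']
```

Because the "in" list is kept alongside the "out" list, the reverse question (who likes cycling?) needs no scan of the whole graph.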

Titan with the DynamoDB Storage Backend for Titan

Titan has a plugin architecture that allows it to use one of many storage backends for a graph. The DynamoDB Storage Backend for Titan is one of these plugins. The following table adds DynamoDB Storage Backend to the parity matrix available on the Titan Storage Backend Overview page.

Storage Backend Comparison

Name       | Storage Backend Configuration Value | Consistency                    | Availability              | Scalability                 | Replication       | Persistence
-----------+-------------------------------------+--------------------------------+---------------------------+-----------------------------+-------------------+-------------
DynamoDB   |                                     | eventually consistent, managed | highly available, managed | linear scalability, managed | yes, managed      | SSD, managed
Cassandra  | cassandra                           | eventually consistent          | highly available          | linear scalability          | yes               | disk
HBase      | hbase                               | vertex consistent              | failover recovery         | linear scalability          | yes               | disk
BerkeleyDB | berkeleyje                          | ACID                           | single point of failure   | single machine              | HA mode available | disk
Persistit  | persistit                           | ACID                           | single point of failure   | single machine              | none              | disk
InMemory   | inmemory                            | ACID                           | single point of failure   | single machine              | none              | none

Using DynamoDB for graph storage gives you a highly scalable distributed graph database without the burden of managing database clusters. DynamoDB can scale to any size, provides fast, predictable performance, and is highly available, with data automatically replicated across three Availability Zones in an AWS region. Authentication is managed by AWS, and you can keep multiple graphs in a single account and region by prepending a distinct prefix to each graph's Titan tables.
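To show how table prefixes keep several graphs apart in one account and region, here is a minimal Python sketch. It rests on assumptions: the store names graphindex, txlog, and systemlog appear in this guide, while edgestore, titan_ids, system_properties, the underscore separator, and the helper name table_names are hypothetical choices made for the example and are not taken from the backend's documentation.

```python
# Store names: graphindex, txlog and systemlog are mentioned in this guide;
# the remaining names are assumptions made for this example.
TITAN_STORES = (
    "edgestore",
    "graphindex",
    "titan_ids",
    "system_properties",
    "txlog",
    "systemlog",
)


def table_names(prefix, stores=TITAN_STORES, separator="_"):
    """Build one table name per store by prepending the graph's prefix.

    The separator is an assumption; check the backend's documentation
    for the exact naming scheme it uses.
    """
    if not prefix:
        raise ValueError("prefix must be a non-empty string")
    return [f"{prefix}{separator}{store}" for store in stores]


# Two hypothetical graphs sharing one account and region.
social_tables = table_names("social")
shipping_tables = table_names("shipping")

print(social_tables[0])    # social_edgestore
print(shipping_tables[0])  # shipping_edgestore

# Different prefixes yield disjoint sets of table names.
print(set(social_tables).isdisjoint(shipping_tables))  # True
```

The point of the sketch is only that each graph owns its own set of tables, so two graphs never read or write the same table as long as their prefixes differ.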

Titan Features

The DynamoDB Storage Backend for Titan plugin supports Titan versions 0.4.4, 0.5.4, and 1.0.0.

We recommend you use the DynamoDB Storage Backend for Titan 1.0.0, but plugin versions for Titan 0.4.4 and 0.5.4 are still available.

Titan offers the following:

  • Fast traversals and arbitrary traversals along specified edge types

  • Directed, typed edges

  • Stored relationships

Titan 0.4.4 supports the TinkerPop 2.4 stack by implementing the Blueprints API. TinkerPop includes the following components:

  • The Rexster graph server

  • Furnace graph algorithms

  • The Frames object-graph mapper

  • The Gremlin traversal language

  • Pipes dataflows

  • The Blueprints generic graph API

For information on the TinkerPop stack, including Gremlin, Rexster, Furnace, Frames, and Blueprints, go to the TinkerPop home page.

Titan version 0.5.4 provides several important changes and additions:

  • Support for the TinkerPop 2.5 stack

  • Support for vertex partitioning

  • Support for vertex labels

  • User-defined transaction logs

  • Two new system transaction log tables: txlog and systemlog

  • edgeindex and vertexindex are merged into a single graphindex table

The Titan version 0.5.4 features are a superset of the Titan version 0.4.4 features. For more information about Titan changes, see the Titan 0.5.4 Release Notes.

Titan version 1.0.0 provides the following changes and additions:

  • Support for the TinkerPop 3.0 stack

  • Titan-specific TraversalStrategies for TinkerPop

  • Query execution engine optimizations

The Titan version 1.0.0 features are a superset of the Titan version 0.5.4 features. For more information about Titan changes, see the Titan 1.0 Release Notes.

For more information on Titan features, go to the Titan documentation page.

As with other Titan storage backends, you can work with Titan graphs using the Gremlin shell and the Groovy language in addition to using the Java native API or Blueprints API.

The following tables compare the features available in Titan storage backends.

Feature name                     | dynamodb | cassandra                      | berkeleyje | hbase             | persistit (0.4.4 only) | in memory (0.5.4 & 1.0.0)
---------------------------------+----------+--------------------------------+------------+-------------------+------------------------+--------------------------
batchMutation                    | Yes      | Yes                            | No         | Yes               | No                     | No
distributed                      | Yes      | Yes                            | No         | Yes               | No                     | No
keyConsistent                    | Yes      | Yes                            | No         | Depends on config | No                     | Depends on config
keyOrdered                       | No       | Depends on partitioner         | Yes        | Yes               | Yes                    | Yes
localKeyPartition                | No       | Depends on partitioner         | No         | No                | No                     | No
locking                          | Yes      | No                             | Yes        | No                | Yes                    | No
multiQuery                       | Yes      | Yes, except cassandra-embedded | No         | Yes               | No                     | No
orderedScan                      | No       | Depends on partitioner         | Yes        | Yes               | Yes                    | Yes
transactional                    | No       | No                             | Yes        | No                | Yes                    | No
unorderedScan                    | Yes      | Depends on partitioner         | No         | Yes               | No                     | Yes
optimisticLocking                | Yes      | Yes                            | No         | Yes               |                        | Yes
cellTTL (0.5.4 only)             | No       | Yes                            | No         | No                |                        | No
storeTTL (0.5.4 only)            | No       | No                             | No         | Yes               |                        | No
preferredTimestamps (0.5.4 only) | MILLI    |                                |            |                   |                        |
timestamps (0.5.4 only)          | No       | Yes                            | No         | Yes               |                        | No
visibility (0.5.4 only)          | No       | No                             | No         | No                |                        | No

Cassandra Partitioners

keyOrdered / localKeyPartition | No  | Yes
orderedScan                    | No  | Yes
unorderedScan                  | Yes | No
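One way to use the feature comparison is to treat it as data and filter backends by the features an application needs. The Python sketch below encodes four rows of the feature table above (batchMutation, distributed, locking, transactional). The names BACKENDS, FEATURES, and backends_with are hypothetical helpers for this example and are not part of Titan.

```python
# Column order follows the feature table above.
BACKENDS = ("dynamodb", "cassandra", "berkeleyje", "hbase", "persistit", "inmemory")

# A subset of rows copied from the feature table; True means "Yes".
FEATURES = {
    "batchMutation": (True, True, False, True, False, False),
    "distributed":   (True, True, False, True, False, False),
    "locking":       (True, False, True, False, True, False),
    "transactional": (False, False, True, False, True, False),
}


def backends_with(*required):
    """Return the backends that support every feature named in `required`."""
    return [
        backend
        for column, backend in enumerate(BACKENDS)
        if all(FEATURES[feature][column] for feature in required)
    ]


print(backends_with("distributed", "locking"))        # ['dynamodb']
print(backends_with("transactional"))                 # ['berkeleyje', 'persistit']
print(backends_with("batchMutation", "distributed"))  # ['dynamodb', 'cassandra', 'hbase']
```

Rows whose value depends on configuration or on the Cassandra partitioner are left out here because they cannot be reduced to a plain yes or no.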

Next Step