Amazon DynamoDB Storage Backend for Titan
The DynamoDB Storage Backend for Titan is a storage backend for the Titan graph database, implemented on top of Amazon DynamoDB. Titan is a scalable graph database optimized for storing and querying graphs. Both the DynamoDB Storage Backend for Titan package and Titan: Distributed Graph Database (Titan Version 0.4.4, Titan Version 0.5.4, and Titan Version 1.0.0) are available on GitHub.
The following sections describe graph databases, some use cases for graph databases, and how to use the Titan: Distributed Graph Database with the DynamoDB Storage Backend.
Working with Graph Databases
A graph database is a store of vertices and directed edges that connect those vertices. Both vertices and edges can have properties stored as key-value pairs.
A graph database uses adjacency lists or matrices for storing edges to allow simple traversal. A graph in a graph database can be traversed along specific edge types, or across the entire graph.
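The following is a minimal sketch of the adjacency-list idea described above, written in plain Python rather than against the Titan API. The class name `AdjacencyGraph`, the method names, and the sample vertices and edge labels are all hypothetical and are used only for illustration.

```python
from collections import defaultdict, deque


class AdjacencyGraph:
    """Tiny in-memory property graph backed by adjacency lists (illustrative only)."""

    def __init__(self):
        self.vertices = {}                  # vertex id -> property dict
        self.out_edges = defaultdict(list)  # vertex id -> [(label, target id, properties)]

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src].append((label, dst, props))

    def reachable(self, start, label=None):
        """Breadth-first traversal from start.

        If label is given, only edges of that type are followed;
        otherwise the traversal crosses every edge type.
        """
        seen, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for edge_label, target, _ in self.out_edges[current]:
                if (label is None or edge_label == label) and target not in seen:
                    seen.add(target)
                    queue.append(target)
        seen.discard(start)
        return seen


g = AdjacencyGraph()
for vid in ("a", "b", "c", "d"):
    g.add_vertex(vid, name=vid)
g.add_edge("a", "depends_on", "b")
g.add_edge("b", "depends_on", "c")
g.add_edge("a", "owns", "d", since=2015)

all_reachable = g.reachable("a")                  # follows every edge type
dependencies = g.reachable("a", "depends_on")     # follows only depends_on edges
```

Because each vertex keeps its own list of outgoing edges, a traversal only touches the edges of the vertices it visits, which is what makes traversal simple in this layout.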
Graph databases can represent how entities relate by using actions, ownership, parentage, and so on. Whenever connections or relationships between entities are at the core of the data you are trying to model, a graph database is a natural choice. Therefore, graph databases are useful for modeling and querying social networks, business relationships, dependencies, shipping movements, and similar items.
You can use edges in a graph database to show typed relationships between entities (also called vertices or nodes). Edges can describe parent-child relationships, actions, product recommendations, purchases, and so on. A relationship, or edge, is a connection between two vertices that always has a start node, end node, type, and direction. An important rule of graph databases is that no broken links are allowed. Every link describes a relationship between two nodes. Deleting a node will delete all its incident relationships (that is, relationships that begin or end in the node that is being deleted).
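The "no broken links" rule can be sketched in a few lines of Python. This is an illustration of the rule only, not how Titan implements it; the class `EdgeSetGraph` and the sample vertices and edge types are hypothetical.

```python
class EdgeSetGraph:
    """Stores edges as (start, type, end) triples and never leaves a dangling edge."""

    def __init__(self):
        self.vertices = set()
        self.edges = set()  # {(start vertex, edge type, end vertex)}

    def add_vertex(self, v):
        self.vertices.add(v)

    def add_edge(self, start, edge_type, end):
        # An edge must connect two existing vertices.
        for v in (start, end):
            if v not in self.vertices:
                raise KeyError(f"unknown vertex: {v}")
        self.edges.add((start, edge_type, end))

    def remove_vertex(self, v):
        # Deleting a vertex also deletes every edge that begins or ends at it.
        self.vertices.discard(v)
        self.edges = {e for e in self.edges if e[0] != v and e[2] != v}


family = EdgeSetGraph()
for name in ("parent", "child", "toy"):
    family.add_vertex(name)
family.add_edge("parent", "parent_of", "child")
family.add_edge("child", "owns", "toy")

family.remove_vertex("child")  # removes both incident edges as well
```

After `remove_vertex("child")`, the graph still contains the `parent` and `toy` vertices, but no edge refers to the deleted vertex.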
The following is an example of a social network graph.
This example models a group of friends and their hobbies as a graph.
Each edge has a direction, indicated by the arrow, but the edge information is still stored in both the "in" and the "out" nodes.
A simple traversal of this graph can tell you what Justin's friends like.
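As a rough illustration of that traversal, the following Python sketch stores every edge under both its "out" and its "in" vertex and then walks two hops from Justin. Only the name Justin comes from the example above; the other people, the hobbies, and the edge labels `friend` and `likes` are assumptions made up for this sketch, and the code does not use the Titan or Gremlin APIs.

```python
from collections import defaultdict

out_edges = defaultdict(list)  # source vertex -> [(label, target vertex)]
in_edges = defaultdict(list)   # target vertex -> [(label, source vertex)]


def add_edge(src, label, dst):
    """Record a directed edge on both of its endpoints."""
    out_edges[src].append((label, dst))
    in_edges[dst].append((label, src))


def out(vertex, label):
    """Follow outgoing edges of one type."""
    return [target for edge_label, target in out_edges[vertex] if edge_label == label]


def friends_likes(person):
    """Two-hop traversal: person -friend-> friend -likes-> hobby."""
    return sorted({hobby for friend in out(person, "friend") for hobby in out(friend, "likes")})


def who_likes(hobby):
    """Walk 'likes' edges backwards, using the copy stored on the 'in' vertex."""
    return [source for edge_label, source in in_edges[hobby] if edge_label == "likes"]


# Hypothetical data; the original figure is not reproduced here.
add_edge("Justin", "friend", "Anna")
add_edge("Justin", "friend", "Ben")
add_edge("Justin", "likes", "cooking")
add_edge("Anna", "likes", "hiking")
add_edge("Ben", "likes", "chess")
add_edge("Ben", "likes", "hiking")

justins_friends_like = friends_likes("Justin")
hikers = who_likes("hiking")
```

Storing the edge on both endpoints is what lets `who_likes` answer the reverse question without scanning every vertex in the graph.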
Titan with the DynamoDB Storage Backend for Titan
Titan has a plugin architecture that allows it to use one of many storage backends for a graph. The DynamoDB Storage Backend for Titan is one of these plugins. The following table adds DynamoDB Storage Backend to the parity matrix available on the Titan Storage Backend Overview page.
Storage Backend Comparison
|Name||Storage Backend Configuration Value||Consistency||Availability||Scalability||Replication||Persistence|
|DynamoDB||com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager||eventually consistent, managed||highly available, managed||linear scalability, managed||yes, managed||SSD, managed|
|Cassandra||cassandra||eventually consistent||highly available||linear scalability||yes||disk|
|HBase||hbase||vertex consistent||failover recovery||linear scalability||yes||disk|
|BerkeleyDB||berkeleyje||ACID||single point of failure||single machine||HA mode available||disk|
|Persistit||persistit||ACID||single point of failure||single machine||none||disk|
|InMemory||inmemory||ACID||single point of failure||single machine||none||none|
Using DynamoDB for graph storage gives you a highly scalable distributed graph database without the burden of managing database clusters. DynamoDB can scale to any size, provides fast, predictable performance, and is highly available, with automatic data replication across three Availability Zones in an AWS Region. The backend also provides AWS-managed authentication, and it supports multiple graphs in a single account and Region by prepending a prefix to the names of each graph's Titan tables.
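The following Python sketch shows how a per-graph prefix could keep the tables of two graphs apart and how the backend might be selected in a Titan properties file. The `storage.backend` value is taken from the comparison table above. The property key `storage.dynamodb.prefix`, the underscore used to join the prefix to the table name, and the store name `edgestore` are assumptions made for this sketch; check the plugin's own documentation before relying on them.

```python
BACKEND_CLASS = "com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager"


def titan_properties(prefix):
    """Build the settings for one graph as a dict of property key -> value."""
    return {
        "storage.backend": BACKEND_CLASS,   # value from the comparison table
        "storage.dynamodb.prefix": prefix,  # ASSUMED key name for the table prefix
    }


def render_properties(props):
    """Render a dict in Java .properties style, one key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


def table_name(prefix, store):
    """ASSUMED naming scheme: prefix and store name joined by an underscore."""
    return f"{prefix}_{store}"


# Two hypothetical graphs in the same account and Region.
social = titan_properties("social")
shipping = titan_properties("shipping")

social_config = render_properties(social)
social_table = table_name("social", "edgestore")
shipping_table = table_name("shipping", "edgestore")
```

Under these assumptions, the two graphs write to `social_edgestore` and `shipping_edgestore`, so they never share a table even though they use the same backend.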
The DynamoDB Storage Backend for Titan plugin supports Titan versions 0.4.4, 0.5.4, and 1.0.0.
We recommend you use the DynamoDB Storage Backend for Titan 1.0.0, but plugin versions for Titan 0.4.4 and 0.5.4 are still available.
Titan offers the following:
Fast traversals and arbitrary traversals along specified edge types
Directed, typed edges
Titan 0.4.4 supports the TinkerPop 2.4 stack by implementing the Blueprints API. TinkerPop includes the following components:
The Rexster graph server
Furnace graph algorithms
The Frames object-graph mapper
The Gremlin traversal language
The Blueprints generic graph API
For information on the TinkerPop stack, including Gremlin, Rexster, Furnace, Frames, and Blueprints, go to the TinkerPop home page.
Titan version 0.5.4 provides several important changes and additions:
Support for the TinkerPop 2.5 stack
Support for vertex partitioning
Support for vertex labels
User-defined transaction logs
Two new system transaction log tables: txlog and systemlog
The edgeindex and vertexindex tables are merged into a single graphindex table
The Titan version 0.5.4 features are a superset of the Titan version 0.4.4 features. For more information about Titan changes, see the Titan 0.5.4 Release Notes.
Titan version 1.0.0 provides the following changes and additions:
Support for the TinkerPop 3.0 stack
Titan-specific TraversalStrategies for TinkerPop
Query execution engine optimizations
The Titan version 1.0.0 features are a superset of the Titan version 0.5.4 features. For more information about Titan changes, see the Titan 1.0 Release Notes.
For more information on Titan features, go to the Titan documentation page.
As with other Titan storage backends, you can work with Titan graphs by using the Gremlin shell and the Groovy language, in addition to the native Java API or the Blueprints API.
The following tables compare the features available in Titan storage backends.
|Feature name||dynamodb||cassandra||berkeleyje||hbase||persistit (0.4.4 only)||in memory (0.5.4 & 1.0.0)|
|keyConsistent||Yes||Yes||No||Depends on config||No||Depends on config|
|keyOrdered||No||Depends on partitioner||Yes||Yes||Yes||Yes|
|localKeyPartition||No||Depends on partitioner||No||No||No||No|
|multiQuery||Yes||Yes, except cassandra-embedded||No||Yes||No||No|
|orderedScan||No||Depends on partitioner||Yes||Yes||Yes||Yes|
|unorderedScan||Yes||Depends on partitioner||No||Yes||No||Yes|
|cellTTL (0.5.4 only)||No||Yes||No||No||No|
|storeTTL (0.5.4 only)||No||No||No||Yes||No|
|preferredTimestamps (0.5.4 only)||MILLI|
|timestamps (0.5.4 only)||No||Yes||No||Yes||No|
|visibility (0.5.4 only)||No||No||No||No||No|
keyOrdered / localKeyPartition