Amazon Neptune
User Guide (API Version 2017-11-29)

Amazon Neptune Quick Start Using AWS CloudFormation

This section contains steps and other information to help you get started quickly with Amazon Neptune. For general information about Neptune, see What Is Amazon Neptune?.

These instructions use an AWS CloudFormation template to create the required resources for you. For instructions on creating these resources yourself, see Getting Started with Neptune.

Important

The AWS CloudFormation stack that is created by this template creates multiple resources, including resources in Neptune, Amazon Elastic Compute Cloud (Amazon EC2), Amazon Virtual Private Cloud (Amazon VPC), and AWS Identity and Access Management (IAM).

Some of these resources are not free-tier resources. For pricing information, see Amazon Neptune Pricing and Amazon EC2 Pricing. You can delete the stack when you are finished with it to stop any charges.

This AWS CloudFormation stack is intended as a basis for a tutorial for Amazon Neptune. If you use this template for a production environment, we recommend that you use stricter IAM policies and security. For information on securing resources, see Amazon VPC Security and Amazon EC2 Network and Security.

Prerequisites

Before you create an Amazon Neptune cluster, you need to have the following:

  • The required IAM permissions.

  • An EC2 key pair.

IAM Permissions

The following permissions allow you to create resources for the AWS CloudFormation stack:

AWS Managed Policies

  • AWSCloudFormationReadOnlyAccess

  • NeptuneFullAccess

Additional IAM Permissions

The following policy outlines the additional permissions that are required to create and delete this CloudFormation stack.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:GetSSHPublicKey",
        "iam:ListSSHPublicKeys",
        "iam:CreateRole",
        "iam:CreatePolicy",
        "iam:PutRolePolicy",
        "iam:CreateInstanceProfile",
        "iam:AddRoleToInstanceProfile",
        "iam:GetAccountSummary",
        "iam:ListAccountAliases",
        "iam:PassRole",
        "iam:GetRole",
        "cloudformation:*Stack",
        "ec2:DescribeKeyPairs",
        "ec2:*Vpc",
        "ec2:DescribeInternetGateways",
        "ec2:*InternetGateway",
        "ec2:createTags",
        "ec2:*VpcAttribute",
        "ec2:DescribeRouteTables",
        "ec2:*RouteTable",
        "ec2:*Subnet",
        "ec2:*SecurityGroup",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DescribeVpcEndpoints",
        "ec2:*VpcEndpoint",
        "ec2:*SubnetAttribute",
        "ec2:*Route",
        "ec2:*Instances",
        "iam:DeleteRole",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteRolePolicy",
        "iam:DeleteInstanceProfile",
        "ec2:DeleteVpcEndpoints"
      ],
      "Resource": "*"
    }
  ]
}

Note

The following permissions are only required to delete a stack: iam:DeleteRole, iam:RemoveRoleFromInstanceProfile, iam:DeleteRolePolicy, iam:DeleteInstanceProfile, and ec2:DeleteVpcEndpoints.

Also note that ec2:*Vpc grants ec2:DeleteVpc permissions.
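To see why a wildcard such as ec2:*Vpc also covers delete operations, the following Python sketch expands a wildcard against a short list of action names. It is an illustration only: the candidate list is a hand-picked sample rather than the full set of EC2 actions, the expand function is a hypothetical helper, and fnmatch is used as a stand-in for IAM's own wildcard matching.

```python
from fnmatch import fnmatchcase

# Hand-picked sample of EC2 action names (not the complete list).
SAMPLE_ACTIONS = [
    "ec2:CreateVpc",
    "ec2:DeleteVpc",
    "ec2:DescribeVpcs",
    "ec2:ModifyVpcAttribute",
    "ec2:DeleteVpcEndpoints",
]


def expand(pattern, actions=SAMPLE_ACTIONS):
    """Return the sample actions that a wildcard pattern would cover."""
    # Compare in lowercase because IAM action names are not case sensitive.
    return [a for a in actions if fnmatchcase(a.lower(), pattern.lower())]


print(expand("ec2:*Vpc"))           # actions whose name ends in "Vpc"
print(expand("ec2:*VpcAttribute"))  # actions whose name ends in "VpcAttribute"
```

Running a check like this against the actions you care about is one way to spot wildcards that grant more than you intended before you attach a policy.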

EC2 Key Pair

You must have a key pair (and its PEM file) available in the Region where you create the AWS CloudFormation stack. If you need to create a key pair, see Creating a Key Pair Using Amazon EC2 for instructions on creating the pair and downloading the PEM file.

Launch the Amazon Neptune CloudFormation Stack

  1. To launch the Neptune stack in the AWS CloudFormation console, choose one of the Launch Stack buttons in the following table.

    Region                  View   View in Designer   Launch
    US East (N. Virginia)   View   View in Designer
    US East (Ohio)          View   View in Designer
    US West (Oregon)        View   View in Designer
    EU (Ireland)            View   View in Designer
    EU (London)             View   Unavailable
  2. On the Select Template page, choose Next.

  3. On the Specify Details page, choose a key pair for the EC2SSHKeyPairName.

    This key pair is required to access the EC2 instance. Ensure that you have the PEM file for the key pair that you choose.

  4. Choose Next.

  5. On the Options page, choose Next.

  6. On the Review page, select the check box to acknowledge that AWS CloudFormation will create IAM resources. Then choose Create.

Accessing the Neptune Graph

Now that you have an instance, you can log in to your EC2 instance using SSH and connect to the Neptune graph. For information about connecting to an EC2 instance using SSH, see Connect to Your Linux Instance in the Amazon EC2 User Guide for Linux Instances.

If you are using a Linux or macOS command line to connect to the EC2 instance, you can paste the SSH command from the SSHAccess item in the Outputs section of the AWS CloudFormation stack. The PEM file must be in the current directory, and its permissions must be set to 400 (chmod 400 keypair.pem).
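If you prepare the key file from a script rather than by hand, the permission change can be sketched in Python as follows. This is a minimal example that assumes a Linux or macOS system. The restrict_key_file helper is a hypothetical name, and a temporary file stands in for keypair.pem so that the snippet does not touch a real key.

```python
import os
import stat
import tempfile


def restrict_key_file(path):
    """Set permissions to 400 (owner read-only), like chmod 400."""
    os.chmod(path, stat.S_IRUSR)
    # Return only the permission bits of the resulting file mode.
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate on a throwaway file standing in for keypair.pem.
with tempfile.NamedTemporaryFile(suffix=".pem", delete=False) as handle:
    demo_path = handle.name

mode = restrict_key_file(demo_path)
print(oct(mode))
os.remove(demo_path)
```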

After you are connected, see the following sections, which contain information about using the Gremlin and SPARQL endpoints of Neptune.

Gremlin

The Gremlin Console allows you to experiment with TinkerPop graphs and queries in a REPL (read-eval-print loop) environment.

The following tutorial walks you through using Gremlin with Amazon Neptune, including how to add vertices, edges, properties, and more, while highlighting Neptune-specific Gremlin implementation differences.

To access Neptune using the Gremlin Console

  1. Change directories into the unzipped folder.

    cd apache-tinkerpop-gremlin-console-3.3.2
  2. Type the following command to run the Gremlin Console.

    bin/gremlin.sh

    You should see the following output:

             \,,,/
             (o o)
    -----oOOo-(3)-oOOo-----
    plugin activated: tinkerpop.server
    plugin activated: tinkerpop.utilities
    plugin activated: tinkerpop.tinkergraph
    gremlin>

    You are now at the gremlin> prompt. You will type the remaining steps at this prompt.

  3. At the gremlin> prompt, type the following to connect to the Neptune DB instance.

    :remote connect tinkerpop.server conf/neptune-remote.yaml
  4. At the gremlin> prompt, type the following to switch to remote mode. This sends all Gremlin queries to the remote connection.

    :remote console
  5. Add vertex with label and property

    g.addV('person').property('name', 'justin')

    The vertex is assigned a string ID containing a GUID. All vertex IDs are strings in Neptune.

  6. Add a vertex with custom id

    g.addV('person').property(id, '1').property('name', 'marko')

    The id property is not quoted. It is a keyword for the ID of the vertex. The vertex ID here is a string with the number 1 in it.

    Normal property names must be contained in quotation marks.

  7. Change property or add property if it doesn't exist

    g.V('1').property(single, 'name', 'marko')

    Here you are changing the name property for the vertex from the previous step. This removes all existing values from the name property.

    If you do not specify single, the value is instead appended to the name property, unless the property already contains that value.

  8. Add a property, appending the value if the property already has one.

    g.V('1').property('age', 29)

    Neptune uses set cardinality as the default action.

    This command adds the age property with the value 29, but it does not replace any existing values.

    If the age property already had a value, this command appends 29 to the property. For example, if the age property was 27, the new value would be [ 27, 29 ].

  9. Add multiple vertices:

    g.addV('person').property(id, '2').property('name', 'vadas').property('age', 27).next()
    g.addV('software').property(id, '3').property('name', 'lop').property('lang', 'java').next()
    g.addV('person').property(id, '4').property('name', 'josh').property('age', 32).next()
    g.addV('software').property(id, '5').property('name', 'ripple').property('lang', 'java').next()
    g.addV('person').property(id, '6').property('name', 'peter').property('age', 35)

    You can send multiple statements at the same time to Neptune.

    Statements can be separated by a newline ('\n'), spaces (' '), a semicolon ('; '), or nothing (for example: g.addV('person').next()g.V() is valid).

    Note

    The Gremlin Console sends a separate command at every newline ('\n'), so they will each be a separate transaction in that case. This example has all the commands on separate lines for readability. Remove the newline ('\n') characters to send it as a single command via the Gremlin Console.

    All statements other than the last statement must end in a terminating step, such as .next() or .iterate(), or they will not run. The Gremlin Console does not require these terminating steps.

    All statements that are sent together are included in a single transaction and succeed or fail together.

  10. Add edges

    g.V('1').addE('knows').to(g.V('2')).property('weight', 0.5).next()
    g.addE('knows').from(g.V('1')).to(g.V('4')).property('weight', 1.0)

    Here are two different ways to add an edge.

  11. Add the rest of the Modern graph

    g.V('1').addE('created').to(g.V('3')).property('weight', 0.4).next()
    g.V('4').addE('created').to(g.V('5')).property('weight', 1.0).next()
    g.V('4').addE('created').to(g.V('3')).property('weight', 0.4).next()
    g.V('6').addE('created').to(g.V('3')).property('weight', 0.2)
  12. Delete a Vertex

    g.V().has('name', 'justin').drop()

    Removes the vertex with the name property equal to justin.

    Important

    At this point, you have the full Apache TinkerPop Modern graph. The examples in the Traversal section of the TinkerPop documentation use the Modern graph.

  13. Run a Traversal

    g.V().hasLabel('person')

    Returns all person vertices.

  14. Run a Traversal with values (valueMap())

    g.V().has('name', 'marko').out('knows').valueMap()

    Returns key-value pairs for all vertices that marko “knows.”

  15. Multiple labels

    g.addV("Label1::Label2::Label3")

    Neptune supports multiple labels for a vertex. When you set a label, you can specify multiple labels by separating them with the :: delimiter.

    This example adds a vertex with three different labels.

    The hasLabel step matches this vertex with any of those three labels: hasLabel("Label1"), hasLabel("Label2"), and hasLabel("Label3").

    The :: delimiter is reserved for this use only.

    You cannot specify multiple labels in the hasLabel step. For example, hasLabel("Label1::Label2") does not match anything.

  16. Time / date

    g.V().property(single, 'lastUpdate', datetime('2018-01-01T00:00:00'))

    Neptune does not support Java Date. Use the datetime() function instead. datetime() accepts an ISO 8601-compliant datetime string.

    It supports the following formats: YYYY-MM-DD, YYYY-MM-DDTHH:mm, YYYY-MM-DDTHH:mm:SS, YYYY-MM-DDTHH:mm:SSZ

  17. Delete vertices, properties, or edges

    g.V().hasLabel('person').properties('age').drop().iterate()
    g.V('1').drop().iterate()
    g.V().outE().hasLabel('created').drop()

    Here are several drop examples.

    Note

    The .next() step does not work with .drop(). Use .iterate() instead.

  18. When you are finished, type the following to exit the Gremlin Console.

    :exit
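The difference between single and set cardinality described in the property steps above can be pictured with a small Python model. This is a toy sketch for intuition only: the Vertex class is hypothetical and is not part of Neptune or TinkerPop. It only mimics the behavior described in this tutorial, where a property holds a set of values unless single is specified.

```python
from collections import defaultdict


class Vertex:
    """Toy model of a vertex whose properties each hold a set of values."""

    def __init__(self):
        self.props = defaultdict(set)

    def property(self, key, value, single=False):
        if single:
            # single cardinality: discard any existing values first
            self.props[key] = {value}
        else:
            # set cardinality (the default described above): add to the values
            self.props[key].add(value)
        return self


v = Vertex()
v.property("age", 27)
v.property("age", 29)
v.property("age", 29)            # a repeated value is not stored twice
print(sorted(v.props["age"]))    # [27, 29]

v.property("age", 30, single=True)
print(sorted(v.props["age"]))    # [30]
```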

Note

Use a semicolon (;) or a newline character (\n) to separate each statement.

Each traversal preceding the final traversal must end in a terminating step, such as next() or iterate(), to be executed. Only the data from the final traversal is returned.
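When you assemble a multi-statement request in client code, these rules can be applied mechanically. The following Python sketch shows one possible approach. The join_statements helper is hypothetical, and it assumes that a statement ending in .drop() should be terminated with .iterate() and any other unterminated statement with .next(), as this tutorial describes.

```python
TERMINATORS = (".next()", ".iterate()")


def join_statements(statements, separator=";"):
    """Join Gremlin statements into a single request string.

    Every statement except the last gets a terminating step appended
    unless it already ends in one.
    """
    prepared = []
    last_index = len(statements) - 1
    for index, statement in enumerate(statements):
        statement = statement.strip()
        if index != last_index and not statement.endswith(TERMINATORS):
            # .next() does not work after .drop(), so use .iterate() there.
            statement += ".iterate()" if statement.endswith(".drop()") else ".next()"
        prepared.append(statement)
    return separator.join(prepared)


batch = join_statements([
    "g.V().has('name', 'justin').drop()",
    "g.V('1').addE('knows').to(g.V('2')).property('weight', 0.5)",
    "g.V().hasLabel('person')",
])
print(batch)
```

The resulting string contains no newline characters, so it is sent as one command and therefore runs as a single transaction.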

For more information on the Neptune implementation of Gremlin, see Neptune Gremlin Implementation Differences.
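The multiple-label matching rule from the tutorial can be summarized in a few lines of Python. This is a simplified model and not Neptune code: the has_label function is a hypothetical stand-in for the hasLabel step, and it assumes that a vertex label is stored as the labels joined by the :: delimiter.

```python
def has_label(vertex_label, wanted):
    """Model of hasLabel against a '::'-delimited vertex label."""
    # The vertex matches if the wanted label equals any one of its labels.
    return wanted in vertex_label.split("::")


label = "Label1::Label2::Label3"
print(has_label(label, "Label2"))          # True
print(has_label(label, "Label1::Label2"))  # False
```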

RDF / SPARQL

SPARQL is a query language for the Resource Description Framework (RDF), which is a graph data format designed for the web. Amazon Neptune is compatible with SPARQL 1.1. This means that you can connect to a Neptune DB instance and query the graph using the query language described in the SPARQL 1.1 Query Language specification.

A query in SPARQL consists of a SELECT clause to specify the variables to return and a WHERE clause to specify which data to match in the graph. If you are unfamiliar with SPARQL queries, see Writing Simple Queries in the SPARQL 1.1 Query Language.

The HTTP endpoint for SPARQL queries to a Neptune DB instance is http://your-neptune-endpoint:8182/sparql.
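A request to this endpoint can also be assembled programmatically. The following Python sketch builds, but does not send, a form-encoded POST request using only the standard library. The host name is the placeholder from the text, and build_request is a hypothetical helper. The parameter names query and update are those defined by the SPARQL 1.1 Protocol for form-encoded requests.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host taken from the text; replace it with your own endpoint.
ENDPOINT = "http://your-neptune-endpoint:8182/sparql"


def build_request(parameter, sparql):
    """Build (but do not send) a form-encoded POST for the endpoint.

    parameter is "query" for queries or "update" for updates.
    """
    body = urlencode({parameter: sparql}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(ENDPOINT, data=body, headers=headers, method="POST")


request = build_request("query", "select ?s ?p ?o where {?s ?p ?o} limit 10")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it from the EC2 instance: urllib.request.urlopen(request)
```

Unlike a raw curl payload, urlencode percent-encodes characters such as ? and {, which keeps the body valid for any query text.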

To connect to SPARQL

  1. You can get the SPARQL endpoint for your Neptune cluster from the SparqlEndpoint item in the Outputs section of the AWS CloudFormation stack.

  2. Type the following to submit a SPARQL UPDATE using HTTP POST and the curl command.

    curl -X POST --data-binary 'update=INSERT DATA { <http://test.com/s> <http://test.com/p> <http://test.com/o> . }' http://your-neptune-endpoint:8182/sparql

    The preceding example inserts the following triple into the SPARQL default graph: <http://test.com/s> <http://test.com/p> <http://test.com/o>

  3. Type the following to submit a SPARQL QUERY using HTTP POST and the curl command.

    curl -X POST --data-binary 'query=select ?s ?p ?o where {?s ?p ?o} limit 10' http://your-neptune-endpoint:8182/sparql

    The preceding example returns up to 10 of the triples (subject-predicate-object) in the graph by using the ?s ?p ?o query with a limit of 10. To query for something else, replace it with another SPARQL query.

    Note

    The default MIME type of a response is application/sparql-results+json for SELECT and ASK queries.

    The default MIME type of a response is application/n-quads for CONSTRUCT and DESCRIBE queries.

    For a list of all available MIME types, see SPARQL HTTP API.
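A response in the application/sparql-results+json format can be read with any JSON parser. The following Python sketch assumes the structure defined by the SPARQL 1.1 Query Results JSON Format. The sample document is hand-written for illustration, modeled on the triple inserted earlier, and is not captured Neptune output. The rows function is a hypothetical helper.

```python
import json

# Hypothetical response body, hand-written to mirror the inserted triple.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://test.com/s"},
     "p": {"type": "uri", "value": "http://test.com/p"},
     "o": {"type": "uri", "value": "http://test.com/o"}}
  ]}
}
"""


def rows(document):
    """Yield one dict of variable name to value for each result binding."""
    parsed = json.loads(document)
    variables = parsed["head"]["vars"]
    for binding in parsed["results"]["bindings"]:
        # A variable can be unbound in a given row, so fall back to None.
        yield {name: binding.get(name, {}).get("value") for name in variables}


for row in rows(SAMPLE):
    print(row["s"], row["p"], row["o"])
```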

For more information about the Neptune SPARQL REST interface, see SPARQL HTTP API. For more information about Amazon Neptune, see Next Steps.