Set up the Gremlin console to connect to a Neptune DB instance - Amazon Neptune

Set up the Gremlin console to connect to a Neptune DB instance

The Gremlin Console allows you to experiment with TinkerPop graphs and queries in a REPL (read-eval-print loop) environment.

Installing the Gremlin console and connecting to it in the usual way

You can use the Gremlin Console to connect to a remote graph database. The following section walks you through installing and configuring the Gremlin Console to connect remotely to a Neptune DB instance. You must follow these instructions from an Amazon EC2 instance in the same virtual private cloud (VPC) as your Neptune DB instance.

Note

If you have IAM authentication enabled on your Neptune DB cluster, follow the instructions in Connecting to Neptune Using the Gremlin Console with Signature Version 4 Signing to install the Gremlin console rather than the instructions here.

To install the Gremlin Console and connect to Neptune

  1. The Gremlin Console binaries require Java 8. Enter the following to install Java 8 on your EC2 instance.

    sudo yum install java-1.8.0-devel
  2. Enter the following to set Java 8 as the default runtime on your EC2 instance.

    sudo /usr/sbin/alternatives --config java

    When prompted, enter the number for Java 8.

  3. Download the appropriate version of the Gremlin Console from the Apache website. To determine which Gremlin version your Neptune engine version supports, check the engine release page for the engine version you are currently running. For example, for version 3.5.2, you can download the Gremlin Console from the Apache TinkerPop archive onto your EC2 instance like this:

    wget https://archive.apache.org/dist/tinkerpop/3.5.2/apache-tinkerpop-gremlin-console-3.5.2-bin.zip
  4. Unzip the Gremlin Console zip file.

    unzip apache-tinkerpop-gremlin-console-3.5.2-bin.zip
  5. Change directories into the unzipped directory.

    cd apache-tinkerpop-gremlin-console-3.5.2
  6. Install the CA certificate. Gremlin Console requires a certificate to verify the remote certificate.

    1. Download the certificate:

      wget https://www.amazontrust.com/repository/SFSRootCAG2.cer
    2. Create a directory for certificates:

      mkdir /tmp/certs/
    3. Copy Java certificates into the new directory:

      cp jre_path/lib/security/cacerts /tmp/certs/cacerts
    4. Add the Amazon certificate to the repository:

      sudo keytool -importcert \
                   -alias neptune-tests-ca \
                   -keystore /tmp/certs/cacerts \
                   -file /home/ec2-user/apache-tinkerpop-gremlin-console-3.5.2/SFSRootCAG2.cer \
                   -noprompt \
                   -storepass changeit
  7. In the conf subdirectory of the extracted directory, create a file named neptune-remote.yaml with the following text. Replace your-neptune-endpoint with the hostname or IP address of your Neptune DB instance. The square brackets ([ ]) are required.

    Note

    For information about finding the hostname of your Neptune DB instance, see the Connecting to Amazon Neptune Endpoints section.

    hosts: [your-neptune-endpoint]
    port: 8182
    connectionPool: { enableSsl: true, trustStore: /tmp/certs/cacerts }
    serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
                  config: { serializeResultToString: true }}
  8. In a terminal, navigate to the Gremlin Console directory (apache-tinkerpop-gremlin-console-3.5.2), and then enter the following command to run the Gremlin Console.

    bin/gremlin.sh

    You should see the following output:

             \,,,/
             (o o)
    -----oOOo-(3)-oOOo-----
    plugin activated: tinkerpop.server
    plugin activated: tinkerpop.utilities
    plugin activated: tinkerpop.tinkergraph
    gremlin>

    You are now at the gremlin> prompt. You will enter the remaining steps at this prompt.

  9. At the gremlin> prompt, enter the following to connect to the Neptune DB instance.

    :remote connect tinkerpop.server conf/neptune-remote.yaml
  10. At the gremlin> prompt, enter the following to switch to remote mode. This sends all Gremlin queries to the remote connection.

    :remote console
  11. Enter the following to send a query to the Gremlin Graph.

    g.V().limit(1)
  12. When you are finished, enter the following to exit the Gremlin Console.

    :exit
Note

Use a semicolon (;) or a newline character (\n) to separate each statement.

Each traversal preceding the final traversal must end in next() to be executed. Only the data from the final traversal is returned.

For more information on the Neptune implementation of Gremlin, see Gremlin standards compliance in Amazon Neptune.
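
The configuration file from step 7 can also be generated from the command line rather than typed by hand. The following is a minimal sketch, not part of the Gremlin Console itself: write_neptune_remote_yaml is a hypothetical helper name, and the default trust store path assumes that you created /tmp/certs/cacerts as described in step 6.

```shell
#!/usr/bin/env bash
# Sketch: write a neptune-remote.yaml like the one shown in step 7.
# write_neptune_remote_yaml is a hypothetical helper, not a Gremlin Console command.
write_neptune_remote_yaml() {
  local endpoint="$1"                           # hostname or IP address of the Neptune DB instance
  local out_file="$2"                           # for example, conf/neptune-remote.yaml
  local trust_store="${3:-/tmp/certs/cacerts}"  # assumed default, taken from step 6

  if [ -z "$endpoint" ] || [ -z "$out_file" ]; then
    echo "usage: write_neptune_remote_yaml ENDPOINT OUT_FILE [TRUST_STORE]" >&2
    return 1
  fi

  # The square brackets around the endpoint are required.
  cat > "$out_file" <<EOF
hosts: [${endpoint}]
port: 8182
connectionPool: { enableSsl: true, trustStore: ${trust_store} }
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: true }}
EOF
}
```

From the Gremlin Console directory, you might call it as write_neptune_remote_yaml your-neptune-endpoint conf/neptune-remote.yaml, and then continue with step 8.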

An alternate way to connect to the Gremlin console

Drawbacks of the normal connection approach

The most common way to connect to the Gremlin console is the one explained above, using commands like this at the gremlin> prompt:

gremlin> :remote connect tinkerpop.server conf/(file name).yaml
gremlin> :remote console

This works well, and lets you send queries to Neptune. However, it takes the Groovy script engine out of the loop, so Neptune treats all queries as pure Gremlin. This means that the following query forms fail:

gremlin> 1 + 1
gremlin> x = g.V().count()

The closest you can get to using a variable when connected this way is to use the result variable maintained by the console and send the query using :>, like this:

gremlin> :remote console
==>All scripts will now be evaluated locally - type ':remote console' to return to remote mode for Gremlin Server - [krl-1-cluster.cluster-ro-cm9t6tfwbtsr.us-east-1.neptune.amazonaws.com/172.31.19.217:8182]
gremlin> :> g.V().count()
==>4249
gremlin> println(result)
[result{object=4249 class=java.lang.Long}]
gremlin> println(result['object'])
[4249]

A different way to connect

You can also connect the Gremlin console in a different way, which keeps local Groovy evaluation available at the prompt, like this:

gremlin> g = traversal().withRemote('conf/neptune.properties')

Here neptune.properties takes this form:

gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/my-cluster.yaml
gremlin.remote.driver.sourceName=g

The my-cluster.yaml file should look like this:

hosts: [my-cluster-abcdefghijk.us-east-1.neptune.amazonaws.com]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
              config: { serializeResultToString: false } }
connectionPool: { enableSsl: true }
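
Because neptune.properties refers to my-cluster.yaml by path, a mistyped or missing cluster file is an easy mistake to make. The following sketch checks that the file named by gremlin.remote.driver.clusterFile exists before you start the console. check_cluster_file is a hypothetical helper name, and the check assumes that relative paths are resolved from the directory where you run bin/gremlin.sh.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the clusterFile named in a properties file exists.
# check_cluster_file is a hypothetical helper, not a Gremlin Console command.
check_cluster_file() {
  local props_file="$1"   # for example, conf/neptune.properties
  local cluster_file

  if [ ! -f "$props_file" ]; then
    echo "properties file not found: $props_file" >&2
    return 1
  fi

  # Print the value that follows the clusterFile key; keep the first match only.
  cluster_file=$(sed -n 's/^gremlin\.remote\.driver\.clusterFile=//p' "$props_file" | head -n 1)

  if [ -z "$cluster_file" ]; then
    echo "no gremlin.remote.driver.clusterFile entry in $props_file" >&2
    return 1
  fi

  # Assumption: the path is relative to the current working directory.
  if [ ! -f "$cluster_file" ]; then
    echo "cluster file not found: $cluster_file" >&2
    return 1
  fi

  echo "$cluster_file"
}
```

Run from the Gremlin Console directory, check_cluster_file conf/neptune.properties prints the cluster file path on success and returns a nonzero status with a message on standard error otherwise.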

Configuring the Gremlin console connection like that lets you make the following kinds of queries successfully:

gremlin> 1+1
==>2
gremlin> x=g.V().count().next()
==>4249
gremlin> println("The answer was ${x}")
The answer was 4249

You can avoid displaying the result, like this:

gremlin> x=g.V().count().next();[]
gremlin> println(x)
4249

All the usual ways of querying (without the terminal step) continue to work. For example:

gremlin> g.V().count()
==>4249

You can even use the g.io().read() step to load a file with this kind of connection.