The Amazon Neptune alternative query engine (DFE)

Amazon Neptune has a new, alternative query engine known as the DFE that uses DB instance resources such as CPU cores, memory, and I/O more efficiently than the current engine. It is currently available as a lab-mode feature, for development purposes only.

The new DFE runs both SPARQL and Gremlin queries, and supports a wide variety of plan types, including left-deep, bushy, and hybrid ones. Plan operators can invoke both compute operations, which run on a reserved set of compute cores, and I/O operations, each of which runs on its own thread in an I/O thread pool.

The DFE uses pre-generated statistics about your Neptune graph data to make informed decisions about how to structure queries. See DFE statistics for information about how these statistics are generated and how you can manage them.

Neptune chooses the plan type and the number of compute threads automatically, based on the pre-generated statistics and on the resources available in the Neptune head node. For plans that have internal compute parallelism, the order of results is not predetermined.

Enabling and disabling the Neptune DFE

You can enable or disable the DFE at any time in Neptune Lab Mode by setting DFEQueryEngine to enabled or disabled in the neptune_lab_mode parameter of the DB cluster parameter group. Setting DFEQueryEngine to enabled turns on the new query engine, and setting it to disabled reverts to the older engine.
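As a rough illustration of the parameter change described above, the following Python sketch builds the neptune_lab_mode value and passes it to the boto3 Neptune client. The helper names (lab_mode_value, lab_mode_parameters, set_dfe) are hypothetical, and the "pending-reboot" apply method is an assumption about how the parameter takes effect, not something this page states.

```python
def lab_mode_value(features):
    """Build a neptune_lab_mode value such as 'DFEQueryEngine=enabled'."""
    return ", ".join(
        f"{name}={'enabled' if on else 'disabled'}" for name, on in features.items()
    )


def lab_mode_parameters(features):
    """Build the Parameters list for modify_db_cluster_parameter_group."""
    return [
        {
            "ParameterName": "neptune_lab_mode",
            "ParameterValue": lab_mode_value(features),
            # Assumption: the change is applied at the next instance reboot.
            "ApplyMethod": "pending-reboot",
        }
    ]


def set_dfe(group_name, enabled):
    """Apply the setting to a DB cluster parameter group.

    Requires boto3 and AWS credentials; group_name is a placeholder for
    your own custom DB cluster parameter group.
    """
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("neptune")
    return client.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=group_name,
        Parameters=lab_mode_parameters({"DFEQueryEngine": enabled}),
    )
```

Because the call replaces the whole neptune_lab_mode value, it is probably safest to pass every lab-mode feature you rely on in the same dictionary, for example {"Streams": False, "DFEQueryEngine": True}.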

You can also prevent the DFE from executing a specific query by using the useDFE query hint (see Gremlin useDFE query hint for Gremlin, and The useDFE SPARQL query hint for SPARQL).
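The sketch below shows roughly what queries carrying the useDFE hint might look like, built as strings in Python. The hint syntax used here (the Neptune#useDFE side effect for Gremlin and the hint:useDFE triple for SPARQL) is written from memory of the linked hint pages and should be checked against them. The helper names are hypothetical.

```python
# Namespace used by Neptune SPARQL query hints (verify against the hint page).
SPARQL_HINT_PREFIX = (
    "PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>"
)


def gremlin_without_dfe(traversal):
    """Add the useDFE=false side effect to a traversal string starting with 'g.'."""
    if not traversal.startswith("g."):
        raise ValueError("expected a traversal string starting with 'g.'")
    return "g.withSideEffect('Neptune#useDFE', false)." + traversal[2:]


def sparql_select_without_dfe(variables, patterns):
    """Build a SELECT query whose WHERE clause starts with the useDFE=false hint."""
    return (
        f"{SPARQL_HINT_PREFIX}\n"
        f"SELECT {' '.join(variables)} WHERE {{\n"
        f"  hint:Query hint:useDFE false .\n"
        f"  {patterns}\n"
        f"}}"
    )


if __name__ == "__main__":
    print(gremlin_without_dfe("g.V().hasLabel('person').out('knows')"))
    print(sparql_select_without_dfe(["?s", "?o"], "?s ?p ?o ."))
```

Building the queries as plain strings keeps the example independent of any particular Gremlin or SPARQL client library.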

Important

The DFE is currently experimental. It is intended for use in development, and is not recommended for production use.

You can confirm whether or not the DFE is enabled using an Instance Status call, like this:

curl -G https://your-neptune-endpoint:port/status

The status response then specifies whether the DFE is enabled or not:

{
  "status": "healthy",
  "startTime": "Tue Nov 05 22:49:06 UTC 2019",
  "dbEngineVersion": "development",
  "role": "writer",
  "gremlin": {"version":"tinkerpop-3.4.1"},
  "sparql": {"version":"sparql-1.1"},
  "labMode": {
    "Streams": "disabled",
    "ReadWriteConflictDetection": "enabled",
    "DFEQueryEngine": "enabled"
  },
  "rollingBackTrxCount": "5",
  "rollingBackTrxEarliestStartTime": "Fri Jan 10 01:26:21 UTC 2020"
}
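If you want to check this programmatically, a minimal Python sketch might parse the status response and read the labMode.DFEQueryEngine field shown above. The function name dfe_enabled and the abbreviated sample payload are illustrative only.

```python
import json


def dfe_enabled(status_json):
    """Return True if the status response reports DFEQueryEngine as enabled."""
    status = json.loads(status_json)
    # A missing labMode or DFEQueryEngine key is treated as "not enabled".
    return status.get("labMode", {}).get("DFEQueryEngine") == "enabled"


# Abbreviated version of the status response shown above.
SAMPLE_STATUS = """
{
  "status": "healthy",
  "role": "writer",
  "labMode": {
    "Streams": "disabled",
    "ReadWriteConflictDetection": "enabled",
    "DFEQueryEngine": "enabled"
  }
}
"""

if __name__ == "__main__":
    print(dfe_enabled(SAMPLE_STATUS))
```

In practice the input would be the body returned by the /status call rather than a hard-coded string.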

The Gremlin explain and profile results tell you whether a query is being executed by the DFE. See Information contained in a Gremlin explain report for explain and DFE profile reports for profile.

Similarly, SPARQL explain tells you whether a SPARQL query is being executed by the DFE. See Example of SPARQL explain output when the DFE is enabled and DFENode operator for more details.
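One way to automate this check is sketched below in Python: it builds explain requests for both languages and scans the returned text for DFE operator names. The endpoint paths (/gremlin/explain and /sparql with an explain parameter) and the marker names DFEStep and DFENode are assumptions based on the Neptune explain documentation, so treat the text scan as a heuristic rather than a definitive test.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed operator names that indicate DFE execution in explain output.
DFE_MARKERS = ("DFENode", "DFEStep")


def gremlin_explain_request(endpoint, query):
    """Build a POST request for the Gremlin explain endpoint."""
    body = json.dumps({"gremlin": query}).encode("utf-8")
    return Request(
        f"{endpoint}/gremlin/explain",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sparql_explain_request(endpoint, query, mode="dynamic"):
    """Build a POST request for a SPARQL query with explain turned on."""
    body = urlencode({"query": query, "explain": mode}).encode("utf-8")
    return Request(
        f"{endpoint}/sparql",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def mentions_dfe(explain_text):
    """Heuristic: True if the explain output names a DFE operator."""
    return any(marker in explain_text for marker in DFE_MARKERS)
```

Sending a request with urllib.request.urlopen and passing the decoded body to mentions_dfe gives a quick yes/no signal. The explain and profile pages referenced above remain the authoritative description of the output.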

Query constructs supported by the Neptune DFE

Currently, the Neptune DFE supports a subset of SPARQL and Gremlin query constructs.

For SPARQL, this is the subset of conjunctive basic graph patterns.

For Gremlin, it is generally the subset of queries that consist of a chain of traversals that do not contain some of the more complex steps.

You can find out whether one of your queries is being executed in whole or in part by the DFE as follows: