The Amazon Neptune alternative query engine (DFE)

Amazon Neptune has an alternative query engine known as the DFE that uses DB instance resources such as CPU cores, memory, and I/O more efficiently than the original Neptune engine.

Note

Support for openCypher depends on the DFE query engine in Neptune.

The DFE engine first became available in lab mode in Neptune engine release 1.0.3.0. Starting with Neptune engine release 1.0.5.0, it is enabled by default, but only for queries that use query hints and for openCypher support.

Beginning with Neptune engine release 1.1.1.0, the DFE engine is no longer in lab mode and is controlled using the neptune_dfe_query_engine instance parameter in an instance's DB parameter group.

Note

With large data sets, the DFE engine may not run well on t3 instances.

The DFE engine runs SPARQL, Gremlin and openCypher queries, and supports a wide variety of plan types, including left-deep, bushy, and hybrid ones. Plan operators can invoke both compute operations, which run on a reserved set of compute cores, and I/O operations, each of which runs on its own thread in an I/O thread pool.

The DFE uses pre-generated statistics about your Neptune graph data to make informed decisions about how to structure queries. See DFE statistics for information about how these statistics are generated.

The plan type and the number of compute threads are chosen automatically, based on the pre-generated statistics and on the resources available in the Neptune head node. For plans that have internal compute parallelism, the order of results is not predetermined.

Controlling where the Neptune DFE engine is used

By default, the neptune_dfe_query_engine instance parameter of an instance is set to viaQueryHint, which causes the DFE engine to be used only for openCypher queries and for Gremlin and SPARQL queries that explicitly include the useDFE query hint set to true.

You can fully enable the DFE engine so that it is used wherever possible by setting the neptune_dfe_query_engine instance parameter to enabled.
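As a minimal sketch of how that parameter change might be scripted, the Python below builds a ModifyDBParameterGroup request and, optionally, sends it with boto3. The helper names, the group name you pass in, and the choice of the pending-reboot apply method are assumptions made for illustration; they are not taken from the text above, and the call needs a custom (non-default) DB parameter group and valid AWS credentials.

```python
# The two parameter values named in this section.
ALLOWED_MODES = {"viaQueryHint", "enabled"}


def build_dfe_parameter_update(group_name, mode):
    """Return keyword arguments for a ModifyDBParameterGroup request (hypothetical helper)."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported neptune_dfe_query_engine value: {mode!r}")
    return {
        "DBParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "neptune_dfe_query_engine",
                "ParameterValue": mode,
                # Assumption: applying the change at the next reboot is acceptable.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


def apply_dfe_mode(group_name, mode):
    """Send the update with boto3 (hypothetical helper; requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    client = boto3.client("neptune")
    return client.modify_db_parameter_group(
        **build_dfe_parameter_update(group_name, mode)
    )
```

Keeping the request builder separate from the network call makes it possible to review or log the exact parameters before anything is changed.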

You can also disable the DFE for a particular Gremlin or SPARQL query by including the useDFE query hint set to false. This prevents the DFE from executing that query.
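The sketch below illustrates the useDFE query hint for Gremlin only. It assumes the hint is passed as a withSideEffect step named Neptune#useDFE and that the query is submitted as a JSON body to the /gremlin HTTP endpoint; the helper names are hypothetical, and you should confirm the hint syntax against the Neptune query hints reference before relying on it.

```python
import json


def with_use_dfe(traversal, use_dfe):
    """Prefix a Gremlin traversal that starts with 'g.' with the useDFE hint."""
    if not traversal.startswith("g."):
        raise ValueError("expected a traversal that starts with 'g.'")
    flag = "true" if use_dfe else "false"
    # Assumed hint form: g.withSideEffect('Neptune#useDFE', <true|false>)
    return f"g.withSideEffect('Neptune#useDFE', {flag})." + traversal[2:]


def gremlin_http_body(traversal, use_dfe):
    """Build the JSON body for an HTTP POST to the /gremlin endpoint."""
    return json.dumps({"gremlin": with_use_dfe(traversal, use_dfe)})


# Request DFE execution for one query, and opt another query out of it.
print(with_use_dfe("g.V().out()", True))
print(with_use_dfe("g.V().out()", False))
```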

You can determine whether or not the DFE is enabled in an instance using an Instance Status call, like this:

curl -G https://your-neptune-endpoint:port/status

The dfeQueryEngine field of the status response then reports the current DFE setting:

{
  "status": "healthy",
  "startTime": "Wed Dec 29 02:29:24 UTC 2021",
  "dbEngineVersion": "development",
  "role": "writer",
  "dfeQueryEngine": "viaQueryHint",
  "gremlin": {"version": "tinkerpop-3.5.2"},
  "sparql": {"version": "sparql-1.1"},
  "opencypher": {"version": "Neptune-9.0.20190305-1.0"},
  "labMode": {
    "ObjectIndex": "disabled",
    "ReadWriteConflictDetection": "enabled"
  },
  "features": {
    "ResultCache": {"status": "disabled"},
    "IAMAuthentication": "disabled",
    "Streams": "disabled",
    "AuditLog": "disabled"
  },
  "settings": {"clusterQueryTimeoutInMs": "120000"}
}
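If you want to check the setting from a script rather than by reading the response, something like the following Python sketch could be used. The function names and the default port 8182 are assumptions made for illustration, and the request is unsigned, so it only applies when IAM authentication is disabled, as in the sample response.

```python
import json
import urllib.request


def dfe_mode_from_status(status_json):
    """Extract the dfeQueryEngine field from a /status response body."""
    status = json.loads(status_json)
    return status.get("dfeQueryEngine")  # None if the field is absent


def fetch_dfe_mode(endpoint, port=8182, timeout=10):
    """Call the instance status endpoint and return the DFE setting.

    Assumes port 8182 and no IAM authentication (no request signing).
    """
    url = f"https://{endpoint}:{port}/status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return dfe_mode_from_status(resp.read().decode("utf-8"))


# Offline check against a trimmed copy of the sample response.
sample = '{"status":"healthy","role":"writer","dfeQueryEngine":"viaQueryHint"}'
print(dfe_mode_from_status(sample))  # viaQueryHint
```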

The Gremlin explain and profile results tell you whether a query is being executed by the DFE. See Information contained in a Gremlin explain report for explain and DFE profile reports for profile.

Similarly, SPARQL explain tells you whether a SPARQL query is being executed by the DFE. See Example of SPARQL explain output when the DFE is enabled and DFENode operator for more details.
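One rough way to automate this check is to scan the text of an explain or profile report for DFE operator names. The Python sketch below is a heuristic only: DFENode is the operator named above, while DFEStep is an assumed label for Gremlin reports, and the strings passed in at the end are placeholders rather than real Neptune output.

```python
# DFENode is named in the text above; DFEStep is an assumed Gremlin label.
DFE_MARKERS = ("DFENode", "DFEStep")


def mentions_dfe(explain_text):
    """Return True if an explain or profile report names a DFE operator."""
    return any(marker in explain_text for marker in DFE_MARKERS)


# Placeholder strings standing in for report text, not real Neptune output.
print(mentions_dfe("... DFENode ..."))
print(mentions_dfe("... no matching operator ..."))
```

A text match like this cannot tell you how much of the query ran in the DFE, so treat it as a first filter and read the full report for details.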

Query constructs supported by the Neptune DFE

Currently, the Neptune DFE supports a subset of SPARQL and Gremlin query constructs.

For SPARQL, this is the subset of conjunctive basic graph patterns.

For Gremlin, it is generally the subset of queries that consist of a chain of traversals and do not contain some of the more complex steps.

You can find out whether one of your queries is being executed in whole or in part by the DFE as follows: