The Amazon Neptune alternative query engine (DFE)

Amazon Neptune has a new, alternative query engine known as the DFE that uses DB instance resources such as CPU cores, memory, and I/O more efficiently than the current engine. It is currently available as a lab-mode feature, for development purposes only.

Note

The DFE engine first became available in Neptune engine release 1.0.3.0. Starting with engine release 1.0.5.0, it is enabled by default, but only for use with query hints and for openCypher support, because DFEQueryEngine=viaQueryHint became the default lab-mode setting. Support for openCypher in Neptune depends on the DFE engine being enabled.

The new DFE runs both SPARQL and Gremlin queries, and supports a wide variety of plan types, including left-deep, bushy, and hybrid ones. Plan operators can invoke both compute operations, which run on a reserved set of compute cores, and I/O operations, each of which runs on its own thread in an I/O thread pool.

The DFE uses pre-generated statistics about your Neptune graph data to make informed decisions about how to structure queries. See DFE statistics for information about how these statistics are generated and how you can manage them.

Neptune chooses the plan type and the number of compute threads automatically, based on the pre-generated statistics and on the resources available on the Neptune head node. For plans that have internal compute parallelism, the order of results is not predetermined.

Important

The experimental DFE engine is not currently tuned for the t3.medium instance type. While the engine remains in lab mode, enable it only on larger instance sizes.

Enabling and disabling the Neptune DFE

You can enable or disable the DFE at any time in Neptune Lab Mode by setting DFEQueryEngine to enabled, disabled, or viaQueryHint in the neptune_lab_mode parameter in the DB cluster parameter group.

  • Setting DFEQueryEngine to enabled enables the new query engine as well as the older engine, and causes the new engine to be used wherever possible, unless the useDFE query hint is present and set to false.

  • Setting DFEQueryEngine to disabled enables only the older engine.

  • Setting DFEQueryEngine to viaQueryHint (the default) enables both the new query engine and the older engine, but the newer engine is only used when the useDFE query hint is present and set to true.
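As a rough sketch of how one of these settings might be applied programmatically, the snippet below builds a request for the boto3 Neptune client's modify_db_cluster_parameter_group call. The helper names and the parameter group name are hypothetical, and the pending-reboot apply method is an assumption about how this parameter takes effect; check both against your own cluster before relying on them.

```python
VALID_MODES = ("enabled", "disabled", "viaQueryHint")


def build_lab_mode_request(group_name, mode):
    """Build keyword arguments for modify_db_cluster_parameter_group.

    group_name is a hypothetical DB cluster parameter group name;
    mode must be one of the three DFEQueryEngine values listed above.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"DFEQueryEngine must be one of {VALID_MODES}, got {mode!r}")
    return {
        "DBClusterParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "neptune_lab_mode",
                "ParameterValue": f"DFEQueryEngine={mode}",
                # Assumption: the change is applied at the next reboot.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


def apply_lab_mode(group_name, mode):
    """Send the request with boto3 (needs AWS credentials and network access)."""
    import boto3  # third-party package, imported lazily so the builder works without it

    client = boto3.client("neptune")
    return client.modify_db_cluster_parameter_group(
        **build_lab_mode_request(group_name, mode)
    )
```

For example, apply_lab_mode("my-neptune-cluster-params", "viaQueryHint") would request the default setting. The value built here contains only the DFEQueryEngine entry; if the parameter already carries other lab-mode settings, they would likely need to be included in the same value so that they are not overwritten.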

You can also disable the DFE for a specific query with the useDFE query hint (see Gremlin useDFE query hint for Gremlin, and The useDFE SPARQL query hint for SPARQL). Setting the hint to false prevents the DFE from executing that query.
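The way the cluster-level setting and the per-query hint combine can be summarized in a small decision function. This is only an illustrative model of the rules described above, not a Neptune API; the function name and the use of None for an absent hint are assumptions made for the example.

```python
def dfe_eligible(setting, use_dfe_hint=None):
    """Return True if the DFE may be used for a query.

    setting      -- the DFEQueryEngine lab-mode value
    use_dfe_hint -- True or False if the useDFE hint is present, None if absent
    """
    if setting == "disabled":
        # Only the older engine is available; the hint cannot change that.
        return False
    if setting == "enabled":
        # The DFE is used wherever possible unless the hint says false.
        return use_dfe_hint is not False
    if setting == "viaQueryHint":
        # The DFE is used only when the hint is present and set to true.
        return use_dfe_hint is True
    raise ValueError(f"unknown DFEQueryEngine setting: {setting!r}")
```

A True result means only that the DFE is a candidate for the query. Whether it actually runs the query still depends on the query constructs it supports.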

Important

The DFE is currently experimental. It is intended for use in development, and is not recommended for production use.

You can confirm whether or not the DFE is enabled using an Instance Status call, like this:

curl -G https://your-neptune-endpoint:port/status

The status response then specifies whether the DFE is enabled or not:

{
  "status": "healthy",
  "startTime": "Tue Nov 05 22:49:06 UTC 2019",
  "dbEngineVersion": "development",
  "role": "writer",
  "gremlin": {"version": "tinkerpop-3.4.1"},
  "sparql": {"version": "sparql-1.1"},
  "labMode": {
    "Streams": "disabled",
    "ReadWriteConflictDetection": "enabled",
    "DFEQueryEngine": "enabled"
  },
  "rollingBackTrxCount": "5",
  "rollingBackTrxEarliestStartTime": "Fri Jan 10 01:26:21 UTC 2020"
}
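If you want to check the setting from a script rather than by reading the response, a minimal sketch along the following lines could work. It assumes the response has the shape shown above; the function names are hypothetical, and the default port of 8182 is an assumption that you should replace with your own cluster's port.

```python
import json
import urllib.request


def dfe_setting(status):
    """Return the DFEQueryEngine value from a parsed status response, or None if absent."""
    return status.get("labMode", {}).get("DFEQueryEngine")


def fetch_status(endpoint, port=8182):
    """GET the /status endpoint and parse the JSON body.

    endpoint and port are placeholders for your own Neptune endpoint.
    """
    url = f"https://{endpoint}:{port}/status"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Offline example using a trimmed copy of the response shown above.
sample = json.loads(
    '{"status": "healthy", "labMode": {"Streams": "disabled", "DFEQueryEngine": "enabled"}}'
)
print(dfe_setting(sample))  # prints: enabled
```

In practice you would call dfe_setting(fetch_status("your-neptune-endpoint")) from a host that can reach the cluster. This sketch sends an unsigned request, so it would need adapting if the cluster requires signed requests.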

The Gremlin explain and profile results tell you whether a query is being executed by the DFE. See Information contained in a Gremlin explain report for explain and DFE profile reports for profile.

Similarly, SPARQL explain tells you whether a SPARQL query is being executed by the DFE. See Example of SPARQL explain output when the DFE is enabled and DFENode operator for more details.

Query constructs supported by the Neptune DFE

Currently, the Neptune DFE supports a subset of SPARQL and Gremlin query constructs.

For SPARQL, this is the subset of conjunctive basic graph patterns.

For Gremlin, it is generally the subset of queries that consist of a chain of traversals and do not contain some of the more complex steps.

You can find out whether one of your queries is being executed in whole or in part by the DFE as follows: