Gremlin profile API in Neptune - Amazon Neptune

Gremlin profile API in Neptune

The Neptune Gremlin profile API runs a specified Gremlin traversal, collects various metrics about the run, and produces a profile report as output.

It differs from the TinkerPop .profile() step so as to be able to report information specific to the Neptune engine.

The profile report includes the following information about the query plan:

  • The physical operator pipeline

  • The index operations for query execution and serialization

  • The size of the result

The profile API uses an extended version of the HTTP API syntax for query, with /gremlin/profile as the endpoint instead of /gremlin.

Parameters specific to Neptune Gremlin profile

  • profile.resultsboolean, allowed values: TRUE and FALSE, default value: TRUE.

    If true, the query results are gathered and displayed as part of the profile report. If false, only the result count is displayed.

  • profile.chopint, default value: 250.

    If non-zero, causes the results string to be truncated at that number of characters. This does not keep all results from being captured. It simply limits the size of the string in the profile report. If set to zero, the string contains all the results.

  • profile.serializerstring, default value: <null>.

    If non-null, the gathered results are returned in a serialized response message in the format specified by this parameter. The number of index operations necessary to produce that response message is reported along with the size in bytes to be sent to the client.

    Allowed values are <null> or any of the valid MIME type or TinkerPop driver "Serializers" enum values.

    "application/json" or "MIME_JSON" "application/vnd.gremlin-v1.0+json" or "GRAPHSON_V1D0" "application/vnd.gremlin-v2.0+json" or "GRAPHSON_V2D0" "application/vnd.gremlin-v3.0+json" or "GRAPHSON_V3D0" "application/vnd.gremlin-v1.0+gryo" or "GRYO_V1D0" "application/vnd.gremlin-v3.0+gryo" or "GRYO_V3D0" "application/vnd.gremlin-v1.0+gryo-lite" or "GRYO_LITE_V1D0" "application/vnd.graphbinary-v1.0" or "GRAPHBINARY_V1D0"
  • profile.indexOpsboolean, allowed values: TRUE and FALSE, default value: FALSE.

    If true, shows a detailed report of all index operations that took place during query execution and serialization. Warning: This report can be verbose.

Sample output of Neptune Gremlin profile

The following is a sample profile query.

curl -X POST https://your-neptune-endpoint:port/gremlin/profile \ -d '{"gremlin":"g.V().hasLabel(\"airport\") \ .has(\"code\", \"AUS\") \ .emit() \ .repeat(in().simplePath()) \ .times(2) \ .limit(100)", \ "profile.serializer":"application/vnd.gremlin-v3.0+gryo"}'

This query generates the following profile report when executed on the air-routes sample graph from the blog post, Let Me Graph That For You – Part 1 – Air Routes.

******************************************************* Neptune Gremlin Profile ******************************************************* Query String ================== g.V().hasLabel("airport").has("code", "AUS").emit().repeat(in().simplePath()).times(2).limit(100) Original Traversal ================== [GraphStep(vertex,[]), HasStep([~label.eq(airport), code.eq(AUS)]), RepeatStep(emit(true),[VertexStep(IN,vertex), PathFilterStep(simple), RepeatEndStep],until(loops(2))), RangeGlobalStep(0,100)] Optimized Traversal =================== Neptune steps: [ NeptuneGraphQueryStep(Vertex) { JoinGroupNode { PatternNode[(?1, <code>, "AUS", ?) . project ?1 .], {estimatedCardinality=1, indexTime=84, hashJoin=true, joinTime=3, actualTotalOutput=1} PatternNode[(?1, <~label>, ?2=<airport>, <~>) . project ask .], {estimatedCardinality=3374, indexTime=29, hashJoin=true, joinTime=0, actualTotalOutput=61} RepeatNode { Repeat { PatternNode[(?3, ?5, ?1, ?6) . project ?1,?3 . IsEdgeIdFilter(?6) . SimplePathFilter(?1, ?3)) .], {hashJoin=true, estimatedCardinality=50148, indexTime=0, joinTime=3} } Emit { Filter(true) } LoopsCondition { LoopsFilter([?1, ?3],eq(2)) } }, annotations={repeatMode=BFS, emitFirst=true, untilFirst=false, leftVar=?1, rightVar=?3} }, finishers=[limit(100)], annotations={path=[Vertex(?1):GraphStep, Repeat[Vertex(?3):VertexStep]], joinStats=true, optimizationTime=495, maxVarId=7, executionTime=323} }, NeptuneTraverserConverterStep ] Physical Pipeline ================= NeptuneGraphQueryStep |-- StartOp |-- JoinGroupOp |-- SpoolerOp(100) |-- DynamicJoinOp(PatternNode[(?1, <code>, "AUS", ?) . project ?1 .], {estimatedCardinality=1, indexTime=84, hashJoin=true}) |-- SpoolerOp(100) |-- DynamicJoinOp(PatternNode[(?1, <~label>, ?2=<airport>, <~>) . project ask .], {estimatedCardinality=3374, indexTime=29, hashJoin=true}) |-- RepeatOp |-- <upstream input> (Iteration 0) [visited=1, output=1 (until=0, emit=1), next=1] |-- BindingSetQueue (Iteration 1) [visited=61, output=61 (until=0, emit=61), next=61] |-- SpoolerOp(100) |-- DynamicJoinOp(PatternNode[(?3, ?5, ?1, ?6) . project ?1,?3 . IsEdgeIdFilter(?6) . SimplePathFilter(?1, ?3)) .], {hashJoin=true, estimatedCardinality=50148, indexTime=0}) |-- BindingSetQueue (Iteration 2) [visited=38, output=38 (until=38, emit=0), next=0] |-- SpoolerOp(100) |-- DynamicJoinOp(PatternNode[(?3, ?5, ?1, ?6) . project ?1,?3 . IsEdgeIdFilter(?6) . SimplePathFilter(?1, ?3)) .], {hashJoin=true, estimatedCardinality=50148, indexTime=0}) |-- LimitOp(100) Runtime (ms) ============ Query Execution: 392.686 Serialization: 2636.380 Traversal Metrics ================= Step Count Traversers Time (ms) % Dur ------------------------------------------------------------------------------------------------------------- NeptuneGraphQueryStep(Vertex) 100 100 314.162 82.78 NeptuneTraverserConverterStep 100 100 65.333 17.22 >TOTAL - - 379.495 - Repeat Metrics ============== Iteration Visited Output Until Emit Next ------------------------------------------------------ 0 1 1 0 1 1 1 61 61 0 61 61 2 38 38 38 0 0 ------------------------------------------------------ 100 100 38 62 62 Predicates ========== # of predicates: 16 WARNING: reverse traversal with no edge label(s) - .in() / .both() may impact query performance Results ======= Count: 100 Output: [v[3], v[3600], v[3614], v[4], v[5], v[6], v[7], v[8], v[9], v[10], v[11], v[12], v[47], v[49], v[136], v[13], v[15], v[16], v[17], v[18], v[389], v[20], v[21], v[22], v[23], v[24], v[25], v[26], v[27], v[28], v[416], v[29], v[30], v[430], v[31], v[9... Response serializer: GRYO_V3D0 Response size (bytes): 23566 Index Operations ================ Query execution: # of statement index ops: 3 # of unique statement index ops: 3 Duplication ratio: 1.0 # of terms materialized: 0 Serialization: # of statement index ops: 200 # of unique statement index ops: 140 Duplication ratio: 1.43 # of terms materialized: 393

In addition to the query plans returned by a call to Neptune explain, the profile results include runtime statistics around query execution. Each Join operation is tagged with the time it took to perform its join as well as the actual number of solutions that passed through it.

The profile output includes the time taken during the core query execution phase, as well as the serialization phase if the profile.serializer option was specified.

The breakdown of the index operations performed during each phase is also included at the bottom of the profile output.

Note that consecutive runs of the same query may show different results in terms of run-time and index operations because of caching.

For queries using the repeat() step, a breakdown of the frontier on each iteration is available if the repeat() step was pushed down as part of a NeptuneGraphQueryStep.

Differences in profile reports when DFE is enabled

When the Neptune DFE alternative query engine is enabled, profile output is somewhat different:

Optimized Traversal: This section is similar to the one in explain output, but contains additional information. This includes the type of DFE operators that were considered in planning, and the associated worst case and best case cost estimates.

Physical Pipeline: This section captures the operators that will be used to execute the query. DataflowOp abstracts the physical plan that is used by DFE to execute the portion of the plan it is responsible for. The DataflowOp is unfolded in the following section where DFE statistics are listed.

DFEQueryEngine Statistics: This section shows up only when at least part of the query is executed by DFE. It outlines various runtime statistics that are specific to DFE, and contains a detailed breakdown of the time spent in the various parts of the query execution.

DFE configuration: This sub-section of the DFEQueryEngine Statistics section lists the settings used to execute the DFE plan. Among other things, it contains the number of CPU cores used to execute the query.

Statistics & Operator histogram: This sub-section of the DFEQueryEngine Statistics section contains details about the execution time of the DataflowOp plan, and the breakdown of CPU time used by each operator. Below, this there is a table containing detailed runtime statistics of the plan executed by DFE.

The rest of the profile output remains unchanged.

Note

DFE is an experimental feature released under lab mode, so the exact format of the profile output is still subject to change.

Sample profile output when the Neptune Dataflow engine (DFE) is enabled

Given the following query:

g.V("v1").out("created").in("created").id()

The following profile output might be generated:

******************************************************* Neptune Gremlin Profile ******************************************************* Query String ================== g.V("v1").out("created").in("created").id() Original Traversal ================== [GraphStep(vertex,[v1]), VertexStep(OUT,[created],vertex), VertexStep(IN,[created],vertex), IdStep] Optimized Traversal =================== Neptune steps: [ NeptuneGraphQueryStep(VertexId) { JoinGroupNode { DFENode { DFEJoinGroupNode[ children={ DFEJoinGroupNode[ children={ DFEPatternNode[(?1, ?5, ?3, ?6) . project ALL[?1, ?3] subjectFilters=(in(&lt;v1&gt;) . ), predicateFilters=(in(&lt;created&gt;) . ), graphFilters=(!= &lt;~&gt; . ), {rangeCountEstimate=1}, OperatorInfoWithAlternative[ rec=OperatorInfo[ type=INCREMENTAL_HASH_JOIN, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=1.0000,out=18.0000,io=0.0000,comp=0.0000,mem=44], worstCaseCostEstimate=OperatorCostEstimate[in=1.0000,out=42.0000,io=0.0000,comp=0.0000,mem=44]]], alt=OperatorInfo[ type=INCREMENTAL_PIPELINE_JOIN, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=1.0000,out=18.0000,io=0.0001,comp=0.0000,mem=0], worstCaseCostEstimate=OperatorCostEstimate[in=1.0000,out=42.0000,io=0.0001,comp=0.0000,mem=0]]]]] }, , OperatorInfo[ type=NONE, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=0.0000,out=0.0000,io=0.0000,comp=0.0000,mem=0], worstCaseCostEstimate=OperatorCostEstimate[in=0.0000,out=0.0000,io=0.0000,comp=0.0000,mem=0]]]], DFEJoinGroupNode[ children={ DFEPatternNode[(?7, ?9, ?3, ?10) . project ALL[?3, ?7] predicateFilters=(in(&lt;created&gt;) . ), graphFilters=(!= &lt;~&gt; . ), {rangeCountEstimate=3}, OperatorInfoWithAlternative[ rec=OperatorInfo[ type=INCREMENTAL_HASH_JOIN, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=1.0000,out=18.0000,io=0.0000,comp=0.0000,mem=108], worstCaseCostEstimate=OperatorCostEstimate[in=1.0000,out=42.0000,io=0.0000,comp=0.0000,mem=108]]], alt=OperatorInfo[ type=INCREMENTAL_PIPELINE_JOIN, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=1.0000,out=18.0000,io=0.0001,comp=0.0000,mem=0], worstCaseCostEstimate=OperatorCostEstimate[in=1.0000,out=42.0000,io=0.0001,comp=0.0000,mem=0]]]]] }, , OperatorInfo[ type=BUSHY_SYMMETRIC_HASH_JOIN, costEstimates=OperatorCostEstimates[ costEstimate=OperatorCostEstimate[in=36.0000,out=126.0000,io=0.0000,comp=0.0000,mem=1872], worstCaseCostEstimate=OperatorCostEstimate[in=84.0000,out=686.0000,io=0.0000,comp=0.0000,mem=4368]]]] }, ] } }, annotations={path=[Vertex(?1):GraphStep, Vertex(?3):VertexStep, Vertex(?7):VertexStep, VertexId(?7):IdStep], joinStats=true, optimizationTime=2, maxVarId=11, executionTime=13} }, NeptuneTraverserConverterStep ] Physical Pipeline ================= NeptuneGraphQueryStep |-- StartOp |-- JoinGroupOp |-- DataflowOp DFEQueryEngine Statistics ================= ====&gt; DFE execution time (measured by DFEQueryEngine) accepted [micros]=92 ready [micros]=1 running [micros]=3928 finished [micros]=0 ===&gt; DFE execution time (measured in DFGNode) -&gt; executionTime [ms]=10 -&gt; resultReadTime [ms]=0 ===&gt; DFE configuration: solutionChunkSize=5000 ouputQueueSize=20 numComputeCores=3 maxParallelIO=10 numInitialPermits=1 ====&gt; DFE configuration (reported back) numComputeCores=3 maxParallelIO=2 numInitialPermits=1 ===&gt; Statistics & operator histogram ==&gt; Statistics -&gt; 2727 / 2727 micros total elapsed (incl. wait / excl. wait) -&gt; 2727 / 2 millis total elapse (incl. wait / excl. wait) -&gt; 2727 / 0 secs total elapsed (incl. wait / excl. wait) ==&gt; Operator histogram -&gt; 46.57% of total time (excl. wait): pipelineScan (2 instances) -&gt; 12.69% of total time (excl. wait): merge (1 instances) -&gt; 40.56% of total time (excl. wait): symmetricHashJoin (1 instances) -&gt; 0.18% of total time (excl. wait): drain (1 instances) nodeId | out0 | out1 | opName | args | rowsIn | rowsOut | chunksIn | chunksOut | elapsed* | outWait | outBlocked | ratio | rate* [M/s] | rate [M/s] | % ------ | ------ | ---- | ----------------- | ----------------------------------------------------------------------------------------- | ------ | ------- | -------- | --------- | -------- | ------- | ---------- | -------- | ----------- | ---------- | ----- node_0 | node_2 | - | pipelineScan | (?1, ?5, ?3, ?6) ALL [?1, ?3] subjectFilters=true predicateFilters=true graphFilters=true | 0 | 1 | 0 | 1 | 714 | 0 | 0 | Infinity | 0.0014 | 0.0014 | 26.18 node_1 | node_2 | - | pipelineScan | (?7, ?9, ?3, ?10) ALL [?3, ?7] predicateFilters=true graphFilters=true | 0 | 3 | 0 | 1 | 556 | 0 | 0 | Infinity | 0.0054 | 0.0054 | 20.39 node_2 | node_4 | - | symmetricHashJoin | | 4 | 3 | 2 | 2 | 1106 | 0 | 0 | 0.75 | 0.0027 | 0.0027 | 40.56 node_3 | - | - | drain | | 3 | 0 | 1 | 0 | 5 | 0 | 0 | 0.00 | 0.0000 | 0.0000 | 0.18 node_4 | node_3 | - | merge | | 3 | 3 | 2 | 1 | 346 | 0 | 0 | 1.00 | 0.0087 | 0.0087 | 12.69 Runtime (ms) ============ Query Execution: 13.994 Traversal Metrics ================= Step Count Traversers Time (ms) % Dur ------------------------------------------------------------------------------------------------------------- NeptuneGraphQueryStep(VertexId) 3 3 12.928 94.47 NeptuneTraverserConverterStep 3 3 0.756 5.53 &gt;TOTAL - - 13.685 - Predicates ========== # of predicates: 6 Results ======= Count: 3 Output: [v4, v3, v1] Index Operations ================ Query execution: # of statement index ops: 0 # of terms materialized: 2