Neptune Analytics algorithms
Graph algorithms are powerful tools for gaining insights into data. Neptune Analytics provides a set of optimized indatabase implementations of common graph algorithms that are exposed as openCypher procedures. These algorithms analyze inherent aspects of the underlying graph structure, such as connectedness (path finding), relative importance (centrality), and community membership (community detection).
Neptune Analytics natively supports over 25 optimized graph algorithms and variants in the 5 most popular categories that help customers extract insights from their graphs, which are listed in the following table.
Category  Action  Algorithms  Common Uses 

Find the existence, quality, or availability of a path between nodes. 



Determines the absolute or relative importance of a node in the graph. 



Compare the similarities between different graph structures. 



Identify meaningful groups or clusters within graph structures. 



Identify approximate nearest neighbor (ANN) nodes by comparing vector embeddings using the Hierarchical Navigable Small World (HNSW) algorithm. 


Many of these algorithms require interacting with most or all the nodes and edges in a graph, often in an iterative fashion. As a result, they are too computationally expensive to process using normal analytic technologies. Neptune Analytics has built highly optimized implementations that allow them to run over graphs of any size.
Algorithms in Neptune Analytics are integrated naturally into openCypher through the CALL clause, as illustrated below. This lets you combine algorithms naturally with openCypher clauses, functions, and semantics to build very complex queries. For example, you could look for the top 10 most important airports in the USAK region like this:
MATCH (n:airport {region: 'USAK'}) CALL neptune.algo.pageRank(n, {edgeLabels: ['route'], numOfIterations: 10}) YIELD rank RETURN n.code, rank ORDER BY rank DESC LIMIT 10
You can run algorithms in the SDKs using the ExecuteOpenCypherQuery
operation or in boto3 and the AWS CLI using the executequery
command.
If you don't want to use the SDK or CLI, you can use you can use awscurl
to sign your Neptune Analytics requests using signed
using Signature Version 4 (Sig4). For example, you can run a simple breadthfirst search like this:
awscurl X POST H "ContentType: application/xwwwformurlencoded" \ https://
(graphIdentifier)
.(region)
.neptunegraph.amazonaws.com/opencypher \ service neptunegraph \ region(region)
\ d "query=CALL neptune.algo.bfs([\"101\", \"102\"], {edgeLabels: [\"route\"]})"
You could run the same query using the AWS CLI like this:
aws neptunegraph executequery \ graphidentifier ${graphIdentifier} \ querystring 'CALL neptune.algo.bfs(["101", "102"], {edgeLabels: ["route"]})' \ language open_cypher \ /tmp/out.txt
Algorithms having signatures with different kinds of input
in Neptune Analytics are exposed as separate algorithms. Unless otherwise
indicated, the examples here are using the Air Routes
dataset
Neptune Analytics currently supports five main categories of algorithm:

Path finding algorithms – These find the existence, quality, or availability of a path or paths between two or more nodes in the graph. A path in this sense is a set of nodes and connecting edges.
By efficiently determining the optimal route between two nodes, pathfinding algorithms enable you to model realworld systems like roads or social networks as interconnected nodes and edges. Finding the shortest paths between various points is crucial in applications like route planning for GPS systems, logistics optimization, and even in solving complex problems in fields like biology or engineering.

Centrality algorithms – These are used to determine the absolute or relative importance or influence of a node or nodes in the graph.
By identifying the most influential or important nodes within a network, centrality algorithms can provide insights about key players or critical points of interaction. This is valuable in social network analysis, where it helps pinpoint influential individuals, and in transportation networks, where it aids in identifying crucial hubs for efficient routing and resource allocation.

Similarity algorithms – Graph similarity algorithms allow you to compare and analyze the similarities and dissimilarities between different graph structures, which can provide insight into relationships, patterns, and commonalities across diverse datasets. This is invaluable in various fields, such as biology, for comparing molecular structures, such as social networks, for identifying similar communities, and such as recommendation systems, for suggesting similar items based on user preferences.

Clustering or communitydetection algorithms – Communitydetection algorithms can identify meaningful groups or clusters of nodes in a network, revealing hidden patterns and structures that can provide insights into the organization and dynamics of complex systems. This is valuable in social network analysis, and in biology, for identifying functional modules in proteinprotein interaction networks, and more generally for understanding information flow and influence propagation in many different domains.

Vector Similarity Search – Vector similarity algorithms work by using vector based representations of data, a.k.a. embeddings, to answer questions about the data's context and its similarity and connection to other data. This is valuable in applications such as Retrieval Augmented Generation (RAG) applications, knowledge graph backed chatbots, and recommendation engines.
Custom Algorithms
Property graph information
Property Graph Information (graph.pg_info) summarizes some of the basic metrics of the graph, such as the number of vertices, the number of edges, the number of edge properties, the number of vertex properties, the number of edge labels, and the number of vertex labels.
Inputs for graph.pg_info
There are no inputs for graph.pg_info.
Outputs for graph.pg_info
There are two columns in the output relation: the first column is the metric name and the second column is the count.
metric: the metrics that graph.pg_info will return, which include:
numVertices: the number of vertices in the graph.
numEdges: the number of edges in the graph.
numVertexProperties: the number of node properties in the graph.
numEdgeProperties: the number of edge properties in the graph.
numVertexLabels: the number of unique vertex labels in the graph.
numEdgeLabels: the number of unique edge labels in the graph.
count
count: the value of the above metrics.
graph.pg_info query example
## Syntax CALL neptune.graph.pg_info() YIELD metric, count RETURN metric, count
graph.pg_info query integration
# sample query integration CALL neptune.graph.pg_info() YIELD metric, count WHERE metric = 'numVertices' RETURN count
Sample graph.pg_info output
# sample output of graph.pg_info aws neptunegraph executequery \ graphidentifier ${graphIdentifier} \ querystring "CALL neptune.graph.pg_info() YIELD metric, count RETURN metric, count " \ language open_cypher \ /tmp/out.txt cat /tmp/out.txt { "results": [{ "metric": "numVertices", "count": 3748 }, { "metric": "numEdges", "count": 57538 }, { "metric": "numVertexProperties", "count": 42773 }, { "metric": "numEdgeProperties", "count": 50532 }, { "metric": "numVertexLabels", "count": 4 }, { "metric": "numEdgeLabels", "count": 2 }] }