Full text search in Amazon Neptune using Amazon OpenSearch Service

Neptune integrates with Amazon OpenSearch Service (OpenSearch Service) to support full-text search in both Gremlin and SPARQL queries. This feature is available starting in Neptune engine release 1.0.2.1, although we recommend using it with engine release 1.0.4.2 or higher to take advantage of the latest fixes.

Important

When integrating with Amazon OpenSearch Service, Neptune requires Elasticsearch version 7.1 or higher, or any version of OpenSearch prior to version 2.3. Neptune does not currently support integration with OpenSearch 2.3 or later versions.
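The version requirement above can be written as a small pre-flight check. The following is a minimal Python sketch, not part of any Neptune or OpenSearch API: the function names and the engine labels "Elasticsearch" and "OpenSearch" are assumptions made for illustration.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '7.10' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_supported(engine: str, version: str) -> bool:
    """Apply the rule stated above: Elasticsearch 7.1 or higher,
    or any OpenSearch version prior to 2.3.

    The engine labels are hypothetical; adapt them to however your
    tooling reports the engine type.
    """
    parsed = parse_version(version)
    if engine == "Elasticsearch":
        return parsed >= (7, 1)
    if engine == "OpenSearch":
        return parsed < (2, 3)
    raise ValueError(f"Unknown engine: {engine!r}")
```

Comparing integer tuples rather than raw strings avoids ordering "7.10" before "7.9".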

You can use Neptune with an existing OpenSearch Service cluster that has been populated according to the Neptune data model for OpenSearch data. Or, you can create an OpenSearch Service domain linked with Neptune using an AWS CloudFormation stack.

Important

The Neptune-to-OpenSearch replication process described here does not replicate blank nodes.

Querying from an OpenSearch cluster with fine-grained access control (FGAC) enabled

If you have enabled fine-grained access control on your OpenSearch cluster, you need to enable IAM authentication in your Neptune database as well.

The IAM entity (user or role) used to connect to the Neptune database must have permissions for both Neptune and the OpenSearch cluster. This means that your user or role must have an OpenSearch Service policy in place like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account-id:root"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:region:account-id:es-resource-id/*"
    }
  ]
}

See Custom IAM data-access policy statements for Amazon Neptune for more information.
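If you generate this policy from a script, the placeholders can be filled in programmatically. The sketch below only mirrors the policy document shown above. The function name and its parameters are hypothetical, and it does not call any AWS API.

```python
import json
from typing import Any, Dict


def build_access_policy(account_id: str, region: str, es_resource_id: str) -> Dict[str, Any]:
    """Fill in the account-id, region, and es-resource-id placeholders
    of the policy shown above. All three arguments are supplied by the
    caller; nothing is looked up from AWS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "es:*",
                "Resource": f"arn:aws:es:{region}:{account_id}:{es_resource_id}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Example values are made up for illustration.
    policy = build_access_policy("123456789012", "us-east-1", "domain/my-domain")
    print(json.dumps(policy, indent=2))
```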

Using Apache Lucene query syntax in Neptune full-text search queries

OpenSearch supports using Apache Lucene syntax for query_string queries. This is particularly useful for passing multiple filters in a query.

Neptune uses a nested structure for storing properties in an OpenSearch document (see Neptune Full-text search data model). When using Lucene syntax, you need to use full paths to the properties in this nested model.

Here is a Gremlin example:

g.withSideEffect("Neptune#fts.endpoint", "es_endpoint")
 .withSideEffect("Neptune#fts.queryType", "query_string")
 .V()
 .has("*", "Neptune#fts predicates.name.value:\"Jane Austin\" AND entity_type:Book")

Here is a SPARQL example:

PREFIX neptune-fts: <http://aws.amazon.com/neptune/vocab/v01/services/fts#>
SELECT * WHERE {
  SERVICE neptune-fts:search {
    neptune-fts:config neptune-fts:endpoint 'http://localhost:9200' .
    neptune-fts:config neptune-fts:queryType 'query_string' .
    neptune-fts:config neptune-fts:query "predicates.\\*foaf\\*name.value:Ronak AND predicates.\\*foaf\\*surname.value:Sh*" .
    neptune-fts:config neptune-fts:field '*' .
    neptune-fts:config neptune-fts:return ?res .
  }
}
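
Because every property must be addressed by its full path in the nested model, it can help to assemble the Lucene query string in code before passing it to Gremlin or SPARQL. The following Python sketch assumes the predicates.<name>.value layout seen in the Gremlin example above. The helper names are hypothetical, and it does not escape Lucene special characters in unquoted values, so wildcards such as Sh* pass through unchanged.

```python
from typing import Dict, Optional


def property_path(name: str) -> str:
    """Full path to a property value, assuming the predicates.<name>.value
    layout used in the Gremlin example above."""
    return f"predicates.{name}.value"


def lucene_term(path: str, value: str) -> str:
    """Render one path:value clause, quoting values that contain whitespace."""
    if any(ch.isspace() for ch in value):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{path}:"{escaped}"'
    return f"{path}:{value}"


def build_query(property_filters: Dict[str, str],
                entity_type: Optional[str] = None) -> str:
    """Join property filters, and optionally an entity_type filter, with AND."""
    clauses = [lucene_term(property_path(name), value)
               for name, value in property_filters.items()]
    if entity_type is not None:
        clauses.append(lucene_term("entity_type", entity_type))
    return " AND ".join(clauses)
```

For example, build_query({"name": "Jane Austin"}, entity_type="Book") yields the same query text as the Gremlin example, without the "Neptune#fts " prefix that the has() step expects.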