Non-string OpenSearch indexing in Amazon Neptune
Non-string OpenSearch indexing in Amazon Neptune lets you replicate non-string values for predicates to OpenSearch using the stream poller. Every predicate value that can safely be converted to a corresponding OpenSearch mapping or datatype is replicated to OpenSearch.
For non-string indexing to be enabled on a new stack, the Enable Non-String Indexing flag in the AWS CloudFormation template must be set to true. This is the default setting. To update an existing stack to support non-string indexing, see Updating an existing stack below.
It is best not to enable non-string indexing on engine versions earlier than 1.0.4.2.
OpenSearch queries that use regular expressions for field names fail with an error when the expression matches multiple fields, some of which contain string values and others non-string values. Full-text search queries in Neptune of that type fail in the same way.
When sorting by a non-string field, append ".value" to the field name to differentiate it from a string field.
Contents
- Updating an existing Neptune full-text search stack to support non-string indexing
- Filtering what fields are indexed in Neptune full-text search
- Mapping of SPARQL and Gremlin datatypes to OpenSearch
- Validation of data mappings
- Sample non-string OpenSearch queries in Neptune
- 1. Get all vertices with age greater than 30 and name starting with "Si"
- 2. Get all nodes with age between 10 and 50 and a name with a fuzzy match with "Ronka"
- 3. Get all nodes with a timestamp that falls within the last 25 days
- 4. Get all nodes with a timestamp that falls within a given year and month
Updating an existing Neptune full-text search stack to support non-string indexing
If you are already using Neptune full-text search, here are the steps you need to take to support non-string indexing:
-
Stop the stream poller Lambda function. This ensures that no new updates are copied during the export. Do this by disabling the CloudWatch Events rule that invokes the Lambda function:
In the AWS Management Console, navigate to CloudWatch.
Select Rules.
Choose the rule with the Lambda stream poller name.
Select Disable to temporarily disable the rule.
-
Delete the current Neptune index in OpenSearch. Use the following curl command to delete the amazon_neptune index from your OpenSearch cluster:
curl -X DELETE "your OpenSearch endpoint/amazon_neptune"
-
Start a one-time export from Neptune to OpenSearch. It is best to set up a new OpenSearch stack at this point, so that the poller that performs the export picks up the new artifacts. Follow the steps listed on GitHub to start the one-time export of your Neptune data into OpenSearch.
-
Update the Lambda artifacts for the existing stream poller. After the export of Neptune data to OpenSearch has completed successfully, take the following steps:
In the AWS Management Console, navigate to AWS CloudFormation.
Choose the main parent AWS CloudFormation stack.
Select the Update option for that stack.
From the options, select Replace current template.
For the template source, select Amazon S3 URL.
For the Amazon S3 URL, enter:
https://aws-neptune-customer-samples.s3.amazonaws.com/neptune-stream/neptune_to_elastic_search.json
Choose Next without changing any of the AWS CloudFormation parameters.
Select Update stack. AWS CloudFormation will replace the Lambda code artifacts for the stream poller with the latest artifacts.
-
Start the stream poller again. Do this by enabling the appropriate CloudWatch rule:
In the AWS Management Console, navigate to CloudWatch.
Select Rules.
Choose the rule with the Lambda stream poller name.
Select Enable.
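If you prefer to script the parts of this procedure that do not need the console, the following is a minimal Python sketch of pausing the poller, deleting the index, and resuming the poller. It rests on several assumptions that are not part of the steps above: the function names are hypothetical, events_client is assumed to be a boto3 CloudWatch Events client (boto3.client("events")), and rule_name is whatever name your stack gave the rule that invokes the stream poller. The one-time export and the AWS CloudFormation update are still performed as described above.

```python
import urllib.request

# Index name taken from the curl command in the steps above.
NEPTUNE_INDEX = "amazon_neptune"


def pause_poller(events_client, rule_name):
    """Disable the rule that invokes the stream poller Lambda function.

    events_client is assumed to be boto3.client("events"); rule_name is
    a placeholder for the rule name in your own stack.
    """
    events_client.disable_rule(Name=rule_name)


def resume_poller(events_client, rule_name):
    """Re-enable the same rule after the stack update has finished."""
    events_client.enable_rule(Name=rule_name)


def build_delete_index_request(opensearch_endpoint, index=NEPTUNE_INDEX):
    """Build the equivalent of: curl -X DELETE "<endpoint>/<index>".

    The request is only constructed here, not sent, so it can be
    inspected before anything is deleted.
    """
    url = opensearch_endpoint.rstrip("/") + "/" + index
    return urllib.request.Request(url, method="DELETE")
```

To send the delete request, pass the returned object to urllib.request.urlopen. Like the plain curl command, this sends an unsigned request, so it is likely to work only where the OpenSearch domain accepts unsigned requests from your network location.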