Amazon Neptune
User Guide (API Version 2017-11-29)

Loader Command

Loads data from an Amazon S3 bucket into a Neptune DB instance.

To load data, you must send an HTTP POST request to the http://your-neptune-endpoint:8182/loader endpoint. The parameters for the loader request can be sent in the POST body or as URL-encoded parameters.

Important

The MIME type must be application/json.

The S3 bucket must be in the same AWS Region as the cluster.
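As a minimal sketch of the request described above, the following Python builds (but does not send) a POST request with a JSON body and the application/json MIME type. The helper name build_loader_request and the role ARN are hypothetical placeholders; only the endpoint path, port, and parameter names come from this page.

```python
import json
import urllib.request


def build_loader_request(endpoint, params):
    """Return an unsent POST request for the loader endpoint (hypothetical helper)."""
    url = f"http://{endpoint}:8182/loader"
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # The loader requires the application/json MIME type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_loader_request(
    "your-neptune-endpoint",
    {
        "source": "s3://bucket-name/object-key-name",
        "format": "csv",
        # Placeholder ARN, not a real role.
        "iamRoleArn": "arn:aws:iam::123456789012:role/ExampleNeptuneLoadRole",
        "region": "us-east-1",
        "failOnError": "FALSE",
    },
)
# To submit the request from a host that can reach the cluster:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any load is started.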

Request Syntax

{
  "source" : "string",
  "format" : "string",
  "iamRoleArn" : "string",
  "mode" : "NEW|RESUME|AUTO",
  "region" : "us-east-1",
  "failOnError" : "string",
  "parserConfiguration" : {
    "baseUri" : "http://base-uri-string",
    "namedGraphUri" : "http://named-graph-string"
  }
}

Request Parameters

source

An Amazon S3 URI.

The source parameter accepts an Amazon S3 URI that points to either a single file or a folder. If you specify a folder, Neptune loads every data file in the folder.

The folder may contain multiple vertex files and multiple edge files.

The URI can be in any of the following formats.

  • s3://bucket_name/object-key-name

  • https://s3.amazonaws.com/bucket_name/object-key-name

  • https://s3-us-east-1.amazonaws.com/bucket_name/object-key-name
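The three URI styles above all identify a bucket and an object key. The sketch below shows one way to split them on the client side, for example to check the bucket name before submitting a load. The function name split_s3_source is hypothetical, and this is not how Neptune itself parses the value.

```python
from urllib.parse import urlparse


def split_s3_source(source):
    """Return (bucket, key) for the URI styles listed above (illustrative only)."""
    parsed = urlparse(source)
    if parsed.scheme == "s3":
        # s3://bucket_name/object-key-name
        return parsed.netloc, parsed.path.lstrip("/")
    host = parsed.netloc
    if parsed.scheme == "https" and host.startswith("s3") and host.endswith(".amazonaws.com"):
        # https://s3.amazonaws.com/bucket_name/object-key-name
        # https://s3-us-east-1.amazonaws.com/bucket_name/object-key-name
        bucket, _, key = parsed.path.lstrip("/").partition("/")
        return bucket, key
    raise ValueError(f"unrecognized S3 URI: {source!r}")


print(split_s3_source("s3://bucket_name/object-key-name"))
print(split_s3_source("https://s3-us-east-1.amazonaws.com/bucket_name/folder/file.csv"))
```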

format

The format of the data. For more information about data formats for the Neptune Loader command, see Loading Data into Neptune.

Allowed values: csv (Gremlin); ntriples, nquads, rdfxml, turtle (RDF)

iamRoleArn

The Amazon Resource Name (ARN) for an IAM role to be assumed by the Neptune DB instance for access to the S3 bucket. For information about creating a role that has access to Amazon S3 and then associating it with a Neptune cluster, see Prerequisites: IAM Role and Amazon S3 Access.

region

The region parameter must match the region of the cluster and the S3 bucket.

Amazon Neptune is available in the following regions:

  • us-east-1 - US East (N. Virginia)

  • us-east-2 - US East (Ohio)

  • us-west-2 - US West (Oregon)

  • eu-west-1 - EU (Ireland)

  • eu-west-2 - EU (London)

  • eu-central-1 - EU (Frankfurt)

  • ap-southeast-1 - Asia Pacific (Singapore)

mode

Load job mode.

RESUME mode determines whether there is a previous load for the source and resumes that load if one is found. If no previous load is found, the load is aborted. The loader does not reload files that completed successfully in a previous load; it only attempts to process the files that failed. If you dropped previously loaded data from your Neptune cluster, that data is not reloaded in this mode. In the special case where all previous loads from the same source completed successfully, the new load completes with nothing reloaded.

NEW mode creates a new load request regardless of any previous loads. This mode may be used to reload all the data from a source after dropping the previously loaded data from your Neptune cluster or to load new data available at the same source.

AUTO mode determines whether there is a previous load with the same source and, if one is found, resumes it just like RESUME mode. If no previous load is found, the loader loads data from the source just like NEW mode.

Default: AUTO

Allowed values: NEW, RESUME, AUTO.
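The three modes can be summarized as a small decision procedure. The following is a toy model of the behavior described above, not Neptune's implementation. The function plan_load and the shape of the previous argument (a mapping from file name to whether it completed) are assumptions made for illustration.

```python
def plan_load(mode, source_files, previous=None):
    """Return the files a load would process under the given mode (toy model).

    previous is None when no earlier load exists for the source; otherwise it
    maps each file name to True (completed) or False (failed).
    """
    if mode not in ("NEW", "RESUME", "AUTO"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if mode == "AUTO":
        # AUTO behaves like RESUME when a previous load exists, else like NEW.
        mode = "RESUME" if previous is not None else "NEW"
    if mode == "NEW":
        # NEW ignores any previous load and processes everything at the source.
        return list(source_files)
    if previous is None:
        # RESUME with nothing to resume: the load is aborted.
        raise LookupError("no previous load found for this source")
    # RESUME only retries files that failed earlier.
    return [name for name, completed in previous.items() if not completed]


earlier = {"vertices.csv": True, "edges.csv": False}
print(plan_load("RESUME", ["vertices.csv", "edges.csv"], earlier))
print(plan_load("NEW", ["vertices.csv", "edges.csv"], earlier))
```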

failOnError

Flag to toggle a complete stop on an error.

When set to FALSE, the bulk loader attempts to load all the data in the location specified and skips any entries with errors.

When set to TRUE, the bulk loader aborts as soon as it encounters an error. Data loaded up to that point persists.

Default: TRUE

Allowed values: TRUE, FALSE
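To make the difference between the two settings concrete, here is a toy simulation, assuming each entry is represented as a (value, is_valid) pair. The function load_entries is hypothetical and only models the skip-versus-abort behavior described above.

```python
def load_entries(entries, fail_on_error=True):
    """Simulate the failOnError flag; return (loaded, skipped, aborted)."""
    loaded, skipped = [], []
    for value, is_valid in entries:
        if is_valid:
            loaded.append(value)
        elif fail_on_error:
            # TRUE: stop at the first error; what was loaded so far persists.
            return loaded, skipped, True
        else:
            # FALSE: skip the bad entry and keep going.
            skipped.append(value)
    return loaded, skipped, False


entries = [("v1", True), ("bad-row", False), ("v2", True)]
print(load_entries(entries, fail_on_error=True))
print(load_entries(entries, fail_on_error=False))
```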

parserConfiguration

An optional object with additional parser configuration values. Each child parameter is also optional.

Name | Example Value | Description
namedGraphUri | http://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph | The default graph for all RDF formats when no graph is specified (for non-quads formats and N-Quads entries with no graph). Default is http://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph
baseUri | http://aws.amazon.com/neptune/default | The base URI for RDF/XML and Turtle formats. Default is http://aws.amazon.com/neptune/default

For more information, see SPARQL Default Graph and Named Graphs.
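Because each child parameter is optional, the values in effect are the documented defaults overlaid with whatever the request supplies. A minimal sketch of that merge follows; the function name effective_parser_configuration is hypothetical, and the merge is an illustration of the defaults rather than loader code.

```python
DEFAULT_PARSER_CONFIGURATION = {
    "namedGraphUri": "http://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph",
    "baseUri": "http://aws.amazon.com/neptune/default",
}


def effective_parser_configuration(parser_configuration=None):
    """Overlay the supplied child parameters on the documented defaults."""
    return {**DEFAULT_PARSER_CONFIGURATION, **(parser_configuration or {})}


# Only baseUri is overridden; namedGraphUri keeps its default.
print(effective_parser_configuration({"baseUri": "http://base-uri-string"}))
```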

[deprecated] accessKey

The iamRoleArn parameter is recommended instead. For information about creating a role that has access to Amazon S3 and then associating it with a Neptune cluster, see Prerequisites: IAM Role and Amazon S3 Access.

An access key ID of an IAM role with access to the S3 bucket and data files.

For more information, see Access keys (access key ID and secret access key).

[deprecated] secretKey

The iamRoleArn parameter is recommended instead. For information about creating a role that has access to Amazon S3 and then associating it with a Neptune cluster, see Prerequisites: IAM Role and Amazon S3 Access.

For more information, see Access keys (access key ID and secret access key).
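Several request parameters accept only a fixed set of values. The sketch below checks a parameter dictionary against the allowed values listed on this page before a request is sent. The function check_loader_params is hypothetical; it does not check which parameters are required, because this page does not state that.

```python
ALLOWED_VALUES = {
    "format": {"csv", "ntriples", "nquads", "rdfxml", "turtle"},
    "mode": {"NEW", "RESUME", "AUTO"},
    "failOnError": {"TRUE", "FALSE"},
}


def check_loader_params(params):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        if name in params and params[name] not in allowed:
            problems.append(
                f"{name}: {params[name]!r} is not one of {sorted(allowed)}"
            )
    return problems


print(check_loader_params({"format": "csv", "mode": "AUTO", "failOnError": "FALSE"}))
print(check_loader_params({"format": "json", "mode": "RESTART"}))
```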

Response Syntax

{
  "status" : "200 OK",
  "payload" : {
    "loadId" : "guid_as_string"
  }
}

200 OK

A load job that starts successfully returns a 200 code.

Errors

When an error occurs, a JSON object is returned in the body of the response. The message object contains a description of the error.

Error 400

Syntax errors return an HTTP 400 Bad Request error. The message describes the error.

Error 500

A valid request that cannot be processed returns an HTTP 500 Internal Server Error. The message describes the error.
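A client might turn an error response into a one-line summary as sketched below. This assumes the description is available under a top-level "message" key in the JSON body, which is an interpretation of the text above rather than a documented schema. The function summarize_loader_error is hypothetical.

```python
import json

ERROR_KINDS = {
    400: "bad request",
    500: "internal server error",
}


def summarize_loader_error(status_code, body):
    """Return a short description built from the status code and error body."""
    try:
        # Assumed key name; falls back to the raw body if it is absent.
        message = json.loads(body).get("message", body)
    except (ValueError, AttributeError):
        # Body is not JSON, or is JSON but not an object.
        message = body
    kind = ERROR_KINDS.get(status_code, "unexpected status")
    return f"{status_code} {kind}: {message}"


print(summarize_loader_error(400, '{"message": "S3 bucket not found for source"}'))
```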

Loader Error Messages

The following are possible error messages from the loader with a description of the error.

Max concurrent load limit breached (HTTP 400)

Only one load job can run at a time.

Couldn't find the AWS credential for iam_role_arn (HTTP 400)

The credentials were not found. Verify the supplied credentials against the IAM console or AWS CLI output.

S3 bucket not found for source (HTTP 400)

The S3 bucket does not exist. Check the name of the bucket.

The source source-uri does not exist/not reachable (HTTP 400)

No matching files were found in the S3 bucket.

Unable to connect to S3 endpoint. Provided source = source-uri and region = aws-region (HTTP 400)

Unable to connect to Amazon S3. Region must match the cluster region. Ensure that you have a VPC endpoint. For information about creating a VPC endpoint, see Amazon S3 VPC Endpoint.

Bucket is not in provided region (aws-region) (HTTP 400)

The bucket must be in the same AWS Region as your Neptune DB instance.

Unable to perform S3 list operation (HTTP 400)

The IAM user or role provided does not have List permissions on the bucket or the folder. Check the policy and/or the access control list (ACL) on the bucket.

Start new load operation not permitted on a read-replica instance (HTTP 405)

Loading is a write operation. Retry the load on the read/write cluster endpoint.

Failed to start load because of unknown error from S3 (HTTP 500)

Amazon S3 returned an unknown error. Contact AWS Support.

Invalid S3 access key (HTTP 400)

Access key is invalid. Check the provided credentials.

Invalid S3 secret key (HTTP 400)

Secret key is invalid. Check the provided credentials.

Examples

Example Request

The following is a request sent via HTTP POST using the curl command. It loads a file in the Neptune CSV format. For more information, see Gremlin Load Data Format.

curl -X POST \
    -H 'Content-Type: application/json' \
    http://your-neptune-endpoint:8182/loader -d '
    {
      "source" : "s3://bucket-name/object-key-name",
      "format" : "csv",
      "accessKey" : "access-key-id",
      "secretKey" : "secret-key",
      "region" : "region",
      "failOnError" : "FALSE"
    }'

Example Response

{
  "status" : "200 OK",
  "payload" : {
    "loadId" : "ef478d76-d9da-4d94-8ff1-08d9d4863aa5"
  }
}
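The loadId in the response identifies the load job, so a client will usually want to pull it out of the payload. The following is a minimal sketch that parses the example response shown above; the function name extract_load_id is hypothetical.

```python
import json


def extract_load_id(response_body):
    """Return the loadId from a successful loader response body."""
    response = json.loads(response_body)
    if response.get("status") != "200 OK":
        raise RuntimeError(f"load was not started: {response!r}")
    return response["payload"]["loadId"]


example = (
    '{ "status" : "200 OK", '
    '"payload" : { "loadId" : "ef478d76-d9da-4d94-8ff1-08d9d4863aa5" } }'
)
print(extract_load_id(example))
```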