Load Data Formats - Amazon Neptune

Load Data Formats

The Amazon Neptune Load API supports loading data in a variety of formats.

Property-graph load formats

Data loaded in either of the following property-graph formats can then be queried using both Gremlin and openCypher: the Gremlin load data format (CSV) and the openCypher load data format (CSV).

RDF load formats

To load Resource Description Framework (RDF) data that you query using SPARQL, you can use one of the following standard formats as specified by the World Wide Web Consortium (W3C): N-Triples, N-Quads, RDF/XML, and Turtle.
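As a rough illustration of how a format is selected when calling the Load API, the sketch below builds a JSON request body in Python. The field names (source, format, iamRoleArn, region) and the format values are taken from the Neptune loader command reference as best understood, not from this section, so treat them as assumptions and check them against that reference. The function name, bucket, and role ARN are hypothetical.

```python
import json

# Format identifiers assumed from the Neptune loader command reference.
SUPPORTED_FORMATS = {"csv", "opencypher", "ntriples", "nquads", "rdfxml", "turtle"}


def build_load_request(source, data_format, iam_role_arn, region):
    """Return a JSON body for a bulk-load request (hypothetical helper)."""
    if data_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported load format: {data_format!r}")
    return json.dumps(
        {
            "source": source,            # for example, an Amazon S3 URI
            "format": data_format,       # one of SUPPORTED_FORMATS
            "iamRoleArn": iam_role_arn,  # role that Neptune assumes to read the data
            "region": region,
        }
    )
```

A call such as build_load_request("s3://example-bucket/data/", "nquads", "arn:aws:iam::123456789012:role/ExampleRole", "us-east-1") returns the body as a string; sending it to the loader endpoint is left out here.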

Load data must use UTF-8 encoding

Important

All load-data files must be encoded in UTF-8. If a file is not UTF-8 encoded, Neptune still attempts to load it as UTF-8.
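Because Neptune attempts to read every file as UTF-8 regardless of its actual encoding, it can help to check files before loading them. The following is a minimal sketch of such a pre-load check in Python; the function name is hypothetical and the check runs entirely on the client, independent of Neptune.

```python
import codecs


def is_valid_utf8(path, chunk_size=1 << 20):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    # An incremental decoder handles multi-byte sequences split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                decoder.decode(chunk)
        # final=True raises if the file ends in the middle of a sequence.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

Reading in chunks keeps memory use flat for large load files. A file that fails the check would need to be re-encoded before it is loaded.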

For N-Quads and N-Triples data that includes Unicode characters, \uxxxx escape sequences are supported. However, Neptune does not support normalization. If a value is present that requires normalization, it will not match byte-for-byte during querying. For more information about normalization, see the Normalization page on Unicode.org.
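To make the normalization issue concrete, the sketch below shows two spellings of the same visible string that differ at the byte level, and one way to bring them to a common form before loading. Choosing NFC is an assumption made for this example; the text above only states that Neptune does not normalize, so whichever form you pick would need to be applied consistently to both the load data and the values used in queries.

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent


def same_bytes(a, b):
    """Compare two strings by their UTF-8 bytes, with no normalization."""
    return a.encode("utf-8") == b.encode("utf-8")


def to_nfc(value):
    """Normalize a string to NFC (an assumed choice for this sketch)."""
    return unicodedata.normalize("NFC", value)
```

Here same_bytes(composed, decomposed) is False even though both render as the same word, while same_bytes(to_nfc(composed), to_nfc(decomposed)) is True.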

If your data is not in a supported format, you must convert it before you load it.

A tool for converting GraphML to the Neptune CSV format is available in the GraphML2CSV project on GitHub.

Compression support for load-data files

Neptune supports compression of individual files in gzip or bzip2 format.

The compressed file must have a .gz or .bz2 extension, and must be a single text file encoded in UTF-8 format. You can load multiple files, but each one must be a separate .gz, .bz2, or uncompressed text file. Archive files with extensions such as .tar, .tar.gz, and .tgz are not supported.
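The rules above can be sketched as a small client-side helper: one function classifies a file name by extension and rejects archive names, and another compresses a single text file to gzip. Both function names are hypothetical, and the archive list contains only the extensions named above, so other archive types would need to be added by hand.

```python
import gzip
import shutil
from pathlib import Path

# Archive extensions that the text above lists as unsupported.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz")


def compression_kind(filename):
    """Return 'gzip', 'bzip2', or 'none'; raise ValueError for archive names."""
    name = filename.lower()
    if name.endswith(ARCHIVE_SUFFIXES):
        raise ValueError(f"archive files are not supported: {filename}")
    if name.endswith(".gz"):
        return "gzip"
    if name.endswith(".bz2"):
        return "bzip2"
    return "none"


def gzip_single_file(src, dst=None):
    """Compress one file to <name>.gz and return the destination path."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The archive check runs before the .gz check so that a name like dump.tar.gz is rejected instead of being treated as a plain gzip file.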

The following sections describe the formats in more detail.