Using RDF data
Neptune Analytics supports importing RDF data using the n-triples format. With this support it is possible to load CSV and n-triples data files into the same graph. The handling of RDF values is described below, including how RDF data is interpreted as LPG concepts and can be queried using openCypher.
Handling of RDF values
The handling of RDF-specific values that don't have a direct equivalent in LPG is described here.
IRIs
Values of type IRI, like <http://example.com/Alice>, are stored as such. IRIs and Strings are distinct data types.
Calling the openCypher function TOSTRING() on an IRI returns a string containing the IRI wrapped inside <>. For example, if x is the IRI <http://example.com/Alice>, then TOSTRING(x) returns "<http://example.com/Alice>". When serializing openCypher query results in JSON format, IRI values are included as strings in this same format.
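The distinction between IRIs and strings can be modeled outside the database. The following Python sketch is not Neptune Analytics code: the Iri class and the to_string helper are hypothetical names that only mimic the behavior described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iri:
    """Hypothetical stand-in for an IRI value; not a Neptune Analytics type."""
    value: str


def to_string(x):
    """Mimic the documented TOSTRING() result: an IRI is wrapped in <>."""
    if isinstance(x, Iri):
        return f"<{x.value}>"
    return str(x)


alice = Iri("http://example.com/Alice")
print(to_string(alice))                     # <http://example.com/Alice>
print(alice == "http://example.com/Alice")  # False: an IRI is not a string
```

The same wrapped form is what the text above says appears in JSON query results for IRI values.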
Language-tagged literals
Values like "Hallo"@de are treated as follows:
- When used as input for openCypher string functions, like trim(), a language-tagged string is treated as a simple string; so trim("Hallo"@de) is equivalent to trim("Hallo").
- When used in comparison operations, like x = y or x <> y or x < y or ORDER BY, a language-tagged literal is “greater than” (and thus “not equal to”) the corresponding simple string: "Hallo" < "Hallo"@de.
Calling a function such as TOSTRING() on a language-tagged literal returns that literal as a string without the language tag. For example, if x is the value "Hallo"@de, then TOSTRING(x) returns "Hallo". When serializing openCypher query results in JSON format, language-tagged literals are also serialized as strings without an associated language tag.
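The rules for language-tagged literals can be illustrated with a small Python model. This is a sketch under stated assumptions, not Neptune Analytics code: the Literal class and the helper functions are hypothetical, and the sort key assumes that values order by their text first, which goes beyond what is stated above. Only the relation between a simple string and its language-tagged counterpart is taken from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical model of a string value with an optional language tag."""
    text: str
    lang: Optional[str] = None


def sort_key(v: Literal):
    # Assumption: order by text first; for equal text, the simple string
    # (no tag) sorts before the language-tagged literal.
    return (v.text, v.lang is not None, v.lang or "")


def to_string(v: Literal) -> str:
    # Mirrors the documented TOSTRING() result: the language tag is dropped.
    return v.text


def trim(v: Literal) -> str:
    # String functions see only the text, so the tag plays no role.
    return v.text.strip()


plain = Literal("Hallo")
tagged = Literal("Hallo", "de")

print(plain == tagged)                     # False
print(sort_key(plain) < sort_key(tagged))  # True
print(to_string(tagged))                   # Hallo
print(trim(tagged) == trim(plain))         # True
```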
Blank nodes
Blank nodes in n-triples data files are replaced with globally unique IRIs at import time.
Loading RDF datasets that contain blank nodes is supported, but those blank nodes are represented as IRIs in the graph. When loading n-triples files, the parameter blankNodeHandling needs to be specified, with the value convertToIri.
The generated IRI for a blank node has the format:
<http://aws.amazon.com/neptune/vocab/v01/BNode/scope#id>
In these IRIs, scope is a unique identifier for the blank node scope, and id is the blank node identifier in the file. For example, for a blank node _:b123 the generated IRI could be <http://aws.amazon.com/neptune/vocab/v01/BNode/737c0b5386448f78#b123>.
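As a rough illustration of this format, the following Python sketch assembles such an IRI from a scope and a blank node label. The function name bnode_iri is hypothetical, and the scope is passed in by hand here, whereas in practice it is generated by Neptune Analytics.

```python
BNODE_PREFIX = "http://aws.amazon.com/neptune/vocab/v01/BNode/"


def bnode_iri(scope: str, label: str) -> str:
    """Build the IRI for blank node `label` (e.g. "_:b123") within `scope`."""
    if not label.startswith("_:"):
        raise ValueError(f"not a blank node label: {label!r}")
    # Drop the "_:" marker; the remainder is the id part after the "#".
    return f"<{BNODE_PREFIX}{scope}#{label[2:]}>"


print(bnode_iri("737c0b5386448f78", "_:b123"))
# <http://aws.amazon.com/neptune/vocab/v01/BNode/737c0b5386448f78#b123>
```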
The blank node scope (e.g. 737c0b5386448f78) is generated by Neptune Analytics and designates one file within one load operation. This means that when two different n-triples files reference the same blank node identifier, like _:b123, two IRIs are generated, one for each file. All references to _:b123 within the first file will end up as references to the first IRI, like <http://aws.amazon.com/neptune/vocab/v01/BNode/1001#b123>, and all references within the second file will end up referring to another IRI, like <http://aws.amazon.com/neptune/vocab/v01/BNode/1002#b123>.
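The effect of per-file scoping can be simulated with a short Python sketch. It is an approximation for illustration only: convert_blank_nodes is a hypothetical helper, the scopes 1001 and 1002 are supplied by hand, and the regular expression is a simplified label pattern that does not cover the full n-triples grammar (it would also match a label-like sequence inside a string literal).

```python
import re

BNODE_PREFIX = "http://aws.amazon.com/neptune/vocab/v01/BNode/"
# Simplified blank node label pattern (assumption, not the full grammar).
BNODE_LABEL = re.compile(r"_:([A-Za-z0-9]+)")


def convert_blank_nodes(ntriples: str, scope: str) -> str:
    """Replace each blank node label with an IRI in the given file scope."""
    return BNODE_LABEL.sub(
        lambda m: f"<{BNODE_PREFIX}{scope}#{m.group(1)}>", ntriples
    )


# Two files that both use the label _:b123.
file_a = '_:b123 <http://example.com/name> "Alice" .'
file_b = '_:b123 <http://example.com/name> "Bob" .'

print(convert_blank_nodes(file_a, "1001"))
print(convert_blank_nodes(file_b, "1002"))
```

Because each file gets its own scope, the two occurrences of _:b123 end up as different IRIs and are not merged into one node.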