Using the additionalParams object to tune the Neptune ML export of model-training data - Amazon Neptune

When you use the Neptune-Export service or the neptune-export command-line utility, you can pass parameters in the additionalParams field to guide the creation of a training data configuration file.

The export process cannot automatically infer which node and edge properties should serve as the machine-learning class labels for training. It also cannot automatically infer the best feature encoding for numeric, categorical, and text properties. You therefore need to supply hints in the additionalParams object, either to specify these things or to override the default encoding.

The general structure of the additionalParams object looks either like this:

    "additionalParams": {
      "neptune_ml": {
        "version": "v2.0",
        "targets": [ (an array of node and edge class label targets) ],
        "features": [ (an array of node feature hints) ]
      }
    }

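As a minimal sketch of the single-configuration layout, the following Python builds the object as a dictionary and serializes it to JSON. The specific target and feature entries (the "Movie", "genre", "Person", and "age" names and their option values) are illustrative assumptions, not values defined in this section.

```python
import json

# Single-configuration layout: "targets" and "features" sit directly
# under "neptune_ml". The entries below are hypothetical examples.
additional_params = {
    "neptune_ml": {
        "version": "v2.0",
        "targets": [
            # Assumed example: a node class label target
            {"node": "Movie", "property": "genre", "type": "classification"},
        ],
        "features": [
            # Assumed example: a hint overriding the encoding of a numeric property
            {
                "node": "Person",
                "property": "age",
                "type": "bucket_numerical",
                "range": [1, 100],
                "bucket_cnt": 10,
            },
        ],
    }
}

# Serialize the fragment so it can be pasted into an export request
additional_params_json = json.dumps({"additionalParams": additional_params}, indent=2)
print(additional_params_json)
```
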
Or it can look like this, with multiple export configurations:

    "additionalParams" : {
      "neptune_ml" : {
        "version": "v2.0",
        "jobs": [
          {
            "name" : "(training data configuration name)",
            "targets": [ (an array of node and edge class label targets) ],
            "features": [ (an array of node feature hints) ]
          },
          {
            "name" : "(another training data configuration name)",
            "targets": [ (an array of node and edge class label targets) ],
            "features": [ (an array of node feature hints) ]
          }
        ]
      }
    }

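The multi-configuration layout can be assembled programmatically. The sketch below assumes a hypothetical helper, build_jobs_params, which is not part of any Neptune tooling; it wraps named configurations in a jobs array with the shape shown above. The target entries in the usage lines are placeholders for illustration.

```python
import json

def build_jobs_params(configs):
    """Wrap named configurations in the multi-job "jobs" layout.

    configs maps a configuration name to a dict that may hold
    "targets" and "features" lists (hypothetical helper, for illustration).
    """
    jobs = []
    for name, cfg in configs.items():
        jobs.append({
            "name": name,
            "targets": list(cfg.get("targets", [])),
            "features": list(cfg.get("features", [])),
        })
    return {"neptune_ml": {"version": "v2.0", "jobs": jobs}}

# Usage with two hypothetical configurations
example = build_jobs_params({
    "job-number-1": {
        "targets": [{"node": "Movie", "property": "genre", "type": "classification"}],
    },
    "job-number-2": {
        "targets": [{"node": "Person", "property": "age", "type": "regression"}],
    },
})
print(json.dumps({"additionalParams": example}, indent=2))
```
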
Top-level elements in the neptune_ml field in additionalParams

The version element in neptune_ml

Specifies the version of the training data configuration to generate.

(Optional) Type: string. Default: "v2.0".

If you do include version, set it to v2.0.
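A client-side check before submitting an export request might look like the following sketch. It only mirrors the rule stated above (version is optional and should be v2.0 when present). The resolve_version function is a hypothetical helper, and the sketch makes no claim about how the export service itself validates the field.

```python
SUPPORTED_VERSION = "v2.0"

def resolve_version(neptune_ml):
    """Return the version to use, falling back to the documented default."""
    version = neptune_ml.get("version", SUPPORTED_VERSION)
    if version != SUPPORTED_VERSION:
        # Local guard only; this is not the service's own error handling
        raise ValueError(
            f"unexpected version {version!r}; expected {SUPPORTED_VERSION!r}"
        )
    return version

print(resolve_version({}))                   # version omitted, default applies
print(resolve_version({"version": "v2.0"}))  # version given explicitly
```
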

The jobs field in neptune_ml

Contains an array of training-data configuration objects. Each object defines a data processing job and contains:

  • name   –   The name of the training data configuration to be created.

    For example, a training data configuration with the name "job-number-1" results in a training data configuration file named job-number-1.json.

  • targets   –   A JSON array of node and edge class label targets that represent the machine-learning class labels for training purposes. See The targets field in neptune_ml.

  • features   –   A JSON array of node property features. See The features field in neptune_ml.
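To illustrate the naming rule above, this sketch lists the configuration file names that a jobs array would be expected to produce, by appending .json to each job name. The config_file_names function is a hypothetical helper for illustration only.

```python
def config_file_names(neptune_ml):
    """List the configuration file name implied by each job's "name"."""
    return [f"{job['name']}.json" for job in neptune_ml.get("jobs", [])]

# Usage with two hypothetical jobs; targets and features left empty for brevity
neptune_ml = {
    "version": "v2.0",
    "jobs": [
        {"name": "job-number-1", "targets": [], "features": []},
        {"name": "job-number-2", "targets": [], "features": []},
    ],
}
print(config_file_names(neptune_ml))
```
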