Creating a dataset from one .csv file - Amazon Lookout for Equipment

If you've uploaded a single .csv file that contains all of the sensor data for the asset, use the following schema to create a dataset from that file.

{
  "Components": [
    {
      "ComponentName": "AssetName",
      "Columns": [
        { "Name": "Timestamp", "Type": "DATETIME" },
        { "Name": "Sensor1", "Type": "DOUBLE" },
        { "Name": "Sensor2", "Type": "DOUBLE" },
        { "Name": "Sensor3", "Type": "DOUBLE" },
        { "Name": "Sensor4", "Type": "DOUBLE" }
      ]
    }
  ]
}

The "ComponentName" is the portion of the Amazon S3 object key prefix that identifies the .csv file containing the sensor data for your asset. For example, when you specify the value of "ComponentName" as "AssetName", you access the file at s3://DOC-EXAMPLE-BUCKET/FacilityName/AssetName/AssetName.csv.

You list the columns of your dataset in the "Columns" array. The name of each column in your .csv file must match the corresponding "Name" in the schema. For the column containing the timestamp data, you must specify the value of "Type" as "DATETIME". For the columns containing data from sensors, you must specify the value of "Type" as "DOUBLE".
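The naming rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the service or the SDK: the helper names build_schema and expected_s3_uri are hypothetical, and the sketch assumes that the first column of the .csv file is the timestamp and that every other column is a sensor reading.

```python
import csv
import io
import json


def build_schema(component_name, header):
    """Build a single-component schema dict from a .csv header row.

    Assumption (not stated by the service): the first column is the
    timestamp and every remaining column is a numeric sensor reading.
    """
    if not header:
        raise ValueError("header row is empty")
    timestamp, *sensors = header
    columns = [{"Name": timestamp, "Type": "DATETIME"}]
    columns += [{"Name": name, "Type": "DOUBLE"} for name in sensors]
    return {"Components": [{"ComponentName": component_name, "Columns": columns}]}


def expected_s3_uri(bucket, facility, component_name):
    """Return the object location that the text describes for one component."""
    return f"s3://{bucket}/{facility}/{component_name}/{component_name}.csv"


# Hypothetical header line standing in for the first line of AssetName.csv.
sample_csv = "Timestamp,Sensor1,Sensor2,Sensor3,Sensor4\n"
header = next(csv.reader(io.StringIO(sample_csv)))

schema = build_schema("AssetName", header)
print(json.dumps(schema, indent=2))
print(expected_s3_uri("DOC-EXAMPLE-BUCKET", "FacilityName", "AssetName"))
```

Because the schema is derived from the header row itself, the column names in the schema match the column names in the file by construction.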

You can use a schema to create a dataset for your .csv files in the Amazon Lookout for Equipment console, but we recommend using the API. The following example code uses the AWS SDK for Python (Boto3) to create a dataset.

import boto3
import json
import pprint
from botocore.config import Config

config = Config(
    region_name='Region'  # Choose a valid AWS Region.
)

lookoutequipment = boto3.client(service_name="lookoutequipment", config=config)

dataset_schema = {
    "Components": [
        {
            "ComponentName": "AssetName",
            "Columns": [
                {"Name": "Timestamp", "Type": "DATETIME"},
                {"Name": "Sensor1", "Type": "DOUBLE"},
                {"Name": "Sensor2", "Type": "DOUBLE"},
                {"Name": "Sensor3", "Type": "DOUBLE"},
                {"Name": "Sensor4", "Type": "DOUBLE"},
            ]
        }
    ]
}

dataset_name = "dataset-name"
data_schema = {
    'InlineDataSchema': json.dumps(dataset_schema),
}

create_dataset_response = lookoutequipment.create_dataset(
    DatasetName=dataset_name,
    DatasetSchema=data_schema
)

pp = pprint.PrettyPrinter(depth=4)
pp.pprint(create_dataset_response)
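One detail of the request that is easy to miss is that "InlineDataSchema" carries the schema as a JSON string, not as a nested dictionary. The following sketch shows one way to check that locally before calling the service. The wrapper create_dataset_from_schema and the StubClient class are hypothetical names introduced for this illustration. The stub only records the arguments it receives, and its return value is a placeholder, not a real service response.

```python
import json


def create_dataset_from_schema(client, dataset_name, schema):
    """Serialize the schema dict and pass it to client.create_dataset."""
    return client.create_dataset(
        DatasetName=dataset_name,
        DatasetSchema={"InlineDataSchema": json.dumps(schema)},
    )


class StubClient:
    """Hypothetical stand-in for the Boto3 client. It records calls only."""

    def __init__(self):
        self.calls = []

    def create_dataset(self, **kwargs):
        self.calls.append(kwargs)
        # Placeholder return value; a real response comes from the service.
        return {"DatasetName": kwargs["DatasetName"]}


schema = {
    "Components": [
        {
            "ComponentName": "AssetName",
            "Columns": [
                {"Name": "Timestamp", "Type": "DATETIME"},
                {"Name": "Sensor1", "Type": "DOUBLE"},
            ],
        }
    ]
}

stub = StubClient()
response = create_dataset_from_schema(stub, "dataset-name", schema)

# The inline schema is sent as a string that parses back to the same dict.
sent = stub.calls[0]["DatasetSchema"]["InlineDataSchema"]
print(type(sent).__name__)
print(json.loads(sent) == schema)
```

To send a real request, you would pass the Boto3 client in place of the stub.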

Next step

Ingesting a dataset