Evaluating sensor grades - Amazon Lookout for Equipment

Evaluating sensor grades

Use sensor grades to troubleshoot exactly why you're getting error codes, and to decide whether to remove some sensors from your dataset.

Even if your ingestion job succeeds as a whole, and all your individual files also ingest successfully, you may decide not to use all the data from your sensors.

For each sensor, Lookout for Equipment tallies the issues that arise during ingestion. Based on those issues, Lookout for Equipment assigns the sensor a grade.

Important

This page is about evaluating the quality of the data coming from specific sensors. You can also read about why the ingestion of an entire job can fail, and about why the ingestion of a particular file can fail.

Sensor grades
  • High

    No validation errors were detected in the data during ingestion. Data from sensors in this category is considered the most reliable for model training and evaluation.

  • Medium

    One or more potentially harmful validation errors were detected in the data during ingestion. Data from sensors in this category is considered less reliable for model training and evaluation.

  • Low

    One or more significant validation errors were detected in the data during ingestion. There's a high probability that training a model on data from sensors in this category will result in poor model performance.

Individual sensor errors
| Error | Explanation | Data quality | Action taken by Lookout for Equipment | Action recommended for customer |
|---|---|---|---|---|
| No data found | No data is present for this sensor. | Low | Lookout for Equipment cannot use data from this sensor. | Do not use this sensor. |
| Insufficient data | Fewer than 14 days of data were provided. | Low | Lookout for Equipment cannot use data from this sensor. | This sensor cannot be used. |
| Monotonic values detected | Data only goes up, only goes down, or remains virtually static. | Low | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Review this sensor and update the sensor if necessary. We recommend that you do not use monotonic sensors. |
| Large data gaps detected | Data has at least one gap longer than 30 days. | Medium | Lookout for Equipment will forward fill all the missing values. | Review the missing values and update the sensor if necessary. The data gaps may cause false alerts. |
| Multiple operating modes detected | Data shifts between ranges. | Medium | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Multiple operating modes add variability. Ensure all normal modes of operation are present in both the training dataset and the evaluation dataset. |
| Missing values detected | The proportion of missing values is above 10%. | Medium | If used, the missing values will be forward filled. | Review the missing values and update the sensor if necessary. |
| Categorical values detected | This sensor has N <= 10 distinct values. | Medium | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Review the categorical values and update the sensor if necessary. Categorical values may lead to a higher number of false positive alerts. |
| Constant values detected | The value does not change over time. | Medium | This sensor can be used, but it is not likely to add value. | |
| Non-numerical values detected | Non-numerical data is present in this sensor. | Medium | The unsupported data will be removed and treated as missing values, then forward filled. | Review the non-numerical data and update the sensor if necessary. |
| Duplicate timestamps detected | There are two or more rows that have the exact same timestamp. | Medium | The last encountered data point will be ingested, and the remaining duplicates will be omitted. | Review the duplicate timestamps and update the sensor if necessary. |
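
If you want to screen a sensor's data against the thresholds in this table before you ingest it, a local pre-check can help. The following Python sketch is only an approximation and is not how Lookout for Equipment implements its validation. The function name `precheck_sensor` and the input shape (a list of `(timestamp, value)` pairs, with `None` for a missing reading) are assumptions made for illustration. The sketch treats any value that is not a Python number as non-numerical, ignores the "virtually static" case of the monotonic check, and does not attempt to detect multiple operating modes.

```python
from datetime import datetime, timedelta
from numbers import Real


def precheck_sensor(readings):
    """Return the table error names that `readings` would likely trigger.

    `readings` is a list of (datetime, value) pairs; value is None when missing.
    Thresholds (14 days, 30 days, 10%, 10 distinct values) come from the table.
    """
    present = [(t, v) for t, v in readings if v is not None]
    if not present:
        return ["No data found"]

    errors = []
    times = sorted(t for t, _ in present)
    if times[-1] - times[0] < timedelta(days=14):
        errors.append("Insufficient data")
    if any(b - a > timedelta(days=30) for a, b in zip(times, times[1:])):
        errors.append("Large data gaps detected")
    all_times = [t for t, _ in readings]
    if len(set(all_times)) < len(all_times):
        errors.append("Duplicate timestamps detected")
    if (len(readings) - len(present)) / len(readings) > 0.10:
        errors.append("Missing values detected")

    # Keep numeric values in time order; bool is excluded on purpose.
    ordered = sorted(present, key=lambda pair: pair[0])
    numeric = [v for _, v in ordered
               if isinstance(v, Real) and not isinstance(v, bool)]
    if len(numeric) < len(present):
        errors.append("Non-numerical values detected")

    distinct = set(numeric)
    if len(distinct) == 1:
        errors.append("Constant values detected")
    elif numeric:
        if len(distinct) <= 10:
            errors.append("Categorical values detected")
        steps = list(zip(numeric, numeric[1:]))
        if all(a <= b for a, b in steps) or all(a >= b for a, b in steps):
            errors.append("Monotonic values detected")
    return errors
```

A sensor that returns an empty list here has passed only these simplified checks. The grade that Lookout for Equipment reports after ingestion remains the authoritative result.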

Choosing the best sensors for your project

Use this information to decide which sensors are right for your project.

A high-grade sensor, from the point of view of Lookout for Equipment, is a sensor that did not trigger any errors in the table above. However, just because it's eligible to contribute doesn't mean it should. For example, suppose that the sensor is not actually attached to the asset that you're trying to monitor.

Suppose that the sensor is attached, instead, to the leg of the table that the asset sits on. The sensor might collect data related to vibration or heat, and the data it collects may not trigger any of the errors in the table above. But that doesn't mean that the data is actually useful. The data the sensor is collecting may not be relevant to the operation of your asset. Even if the data is relevant, another sensor, nearby but better positioned, may already be collecting the most useful data for that part of the asset. The fact that data from a particular sensor doesn't trigger any of the errors above doesn't mean that the sensor ought to be selected for your model.

A medium-grade sensor collects data that triggers at least one error from the table above. But that doesn't necessarily mean that you shouldn't use that sensor in your model. For example, your sensor may have been labeled as medium-grade because it duplicated a timestamp once over the course of 14 days.

Based on your knowledge of the asset and how the data was collected, you may decide that Lookout for Equipment's method of remediation (ingesting the last record encountered for a duplicated timestamp and omitting the rest, as described in the table above) is appropriate and productive. On the other hand, after receiving the alert, you may review the data, find many duplicate timestamps, and decide that the duplications indicate a problem with how the data was collected. You may then decide not to use data from that sensor in your model.
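
To judge whether duplicate timestamps are a one-off or a symptom of a collection problem, it can help to count them and to preview what the data looks like once they are collapsed. The sketch below is illustrative only. It follows the table's description that the last encountered data point is kept, and the function names `duplicate_counts` and `keep_last_per_timestamp` are hypothetical helpers, not part of any Lookout for Equipment API.

```python
from collections import Counter


def duplicate_counts(rows):
    """Map each duplicated timestamp to the number of rows that share it."""
    counts = Counter(timestamp for timestamp, _ in rows)
    return {t: n for t, n in counts.items() if n > 1}


def keep_last_per_timestamp(rows):
    """Collapse duplicates, keeping the last row encountered per timestamp."""
    latest = {}
    for timestamp, value in rows:
        latest[timestamp] = value  # a later row overwrites an earlier one
    return sorted(latest.items())


rows = [
    ("2024-01-01T00:00:00", 1.0),
    ("2024-01-01T00:01:00", 2.0),
    ("2024-01-01T00:01:00", 2.5),  # same timestamp as the previous row
]
duplicates = duplicate_counts(rows)
cleaned = keep_last_per_timestamp(rows)
```

A single entry in `duplicates` over a long period probably needs no action, while many entries suggest reviewing how the sensor data was exported.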

Data from a low-grade sensor contains a problem that may interfere with the accuracy of your model. We recommend that you do not include sensors with low-grade data when building your model. However, you may still choose to do so.
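
The guidance in this section can be summarized as a simple triage step. The following Python sketch assumes that a sensor's grade is the worst data-quality level among the errors it triggered. That is an inference from the grade definitions on this page, not a documented algorithm. The names `ERROR_QUALITY`, `grade`, and `triage`, and the decision labels, are hypothetical.

```python
# Data-quality level for each error, copied from the table on this page.
ERROR_QUALITY = {
    "No data found": "Low",
    "Insufficient data": "Low",
    "Monotonic values detected": "Low",
    "Large data gaps detected": "Medium",
    "Multiple operating modes detected": "Medium",
    "Missing values detected": "Medium",
    "Categorical values detected": "Medium",
    "Constant values detected": "Medium",
    "Non-numerical values detected": "Medium",
    "Duplicate timestamps detected": "Medium",
}

# High-grade sensors are only candidates: relevance still needs your judgment.
DECISION = {"High": "candidate", "Medium": "review", "Low": "exclude"}


def grade(errors):
    """Assumed rule: no errors is High, any Low-level error is Low, else Medium."""
    if not errors:
        return "High"
    if any(ERROR_QUALITY[error] == "Low" for error in errors):
        return "Low"
    return "Medium"


def triage(sensor_errors):
    """Map each sensor name to a (grade, suggested decision) pair."""
    result = {}
    for name, errors in sensor_errors.items():
        sensor_grade = grade(errors)
        result[name] = (sensor_grade, DECISION[sensor_grade])
    return result
```

The "exclude" label reflects the recommendation above rather than a hard rule, because you may still choose to include a low-grade sensor.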
