Databases - IoT Lens


An IoT application typically uses multiple databases, each selected for attributes such as how often data is written to it, how often data is read from it, and how the data is structured and queried. Consider these additional criteria when selecting a database offering:

  • Volume of data and retention period.

  • Intrinsic data organization and structure.

  • Users and applications consuming the data (either raw or processed) and their geographical location/dispersion.

  • Advanced analytics needs, such as machine learning or real-time visualizations.

  • Data synchronization across other teams, organizations, and business units.

  • Security of the data at the row, table, and database levels.

  • Interactions with other related data-driven events such as enterprise applications, drill-through dashboards, or systems of interaction.

AWS has several database offerings that support IoT solutions. For structured data, use Amazon Aurora, a highly scalable relational interface to organizational data. For semi-structured data that requires low-latency queries and will be used by multiple consumers, use Amazon DynamoDB, a fully managed, multi-region, multi-master database that provides consistent single-digit millisecond latency and offers built-in security, backup and restore, and in-memory caching.
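As an illustration of the semi-structured case, the following minimal sketch shapes one device reading as a DynamoDB item. The table schema (partition key `device_id`, sort key `ts`), the function names, and the sample values are assumptions made for this example and do not come from the lens itself. The write helper imports boto3 lazily and needs AWS credentials and an existing table, so it is defined but not called here.

```python
import json
from decimal import Decimal


def build_telemetry_item(device_id: str, epoch_ms: int, reading: dict) -> dict:
    """Shape one semi-structured reading for a DynamoDB table.

    Assumed (hypothetical) schema: partition key `device_id`,
    sort key `ts` in epoch milliseconds, free-form `reading` map.
    """
    # The boto3 Table resource rejects Python floats, so round-trip the
    # payload through JSON to turn every float into a Decimal.
    safe_reading = json.loads(json.dumps(reading), parse_float=Decimal)
    return {"device_id": device_id, "ts": epoch_ms, "reading": safe_reading}


def write_item(table_name: str, item: dict) -> None:
    """Write the item with boto3 (requires AWS credentials and a real table)."""
    import boto3  # imported lazily so the shaping code runs without the SDK

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)


item = build_telemetry_item(
    "sensor-001", 1700000000000, {"temp_c": 21.5, "ok": True}
)
print(item)
```

Keying on device ID and timestamp is one common choice for per-device lookups. Other access patterns, such as querying across devices by location, may call for a different key design or a secondary index.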

For storing raw, unformatted event data, use AWS IoT Analytics. AWS IoT Analytics filters, transforms, and enriches IoT data before storing it in a time series data store for analysis. Use Amazon SageMaker to build, train, and deploy machine learning models based on your IoT data, both in the cloud and at the edge, using AWS IoT services such as AWS IoT Greengrass Machine Learning Inference. Consider storing your formatted time series data in a data warehouse solution such as Amazon Redshift. Unformatted data can be imported into Amazon Redshift through Amazon S3 and Amazon Kinesis Data Firehose. By archiving unformatted data in a scalable, managed data storage solution, you can begin to gain business insights, explore your data, and identify trends and patterns over time.
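The Kinesis Data Firehose import path can be sketched as follows. This is a minimal example under stated assumptions: events arrive as Python dictionaries, they are serialized as newline-delimited JSON, and the helper names and the delivery stream name are hypothetical. The batching respects the 500-record limit of `PutRecordBatch`; it does not enforce the per-call size limit or retry failed records, which a production sender would need to handle (for example by checking `FailedPutCount` in the response).

```python
import json
from typing import Iterable, Iterator, List

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_batches(events: Iterable[dict]) -> Iterator[List[dict]]:
    """Group events into lists of Firehose records, MAX_BATCH at a time."""
    batch: List[dict] = []
    for event in events:
        # One compact JSON object per line keeps records separable after
        # Firehose concatenates them into objects in Amazon S3.
        line = json.dumps(event, separators=(",", ":")) + "\n"
        batch.append({"Data": line.encode("utf-8")})
        if len(batch) == MAX_BATCH:
            yield batch
            batch = []
    if batch:
        yield batch


def send_events(stream_name: str, events: Iterable[dict]) -> None:
    """Send all events to a delivery stream (requires AWS credentials)."""
    import boto3  # imported lazily so the batching code runs without the SDK

    firehose = boto3.client("firehose")
    for batch in to_firehose_batches(events):
        firehose.put_record_batch(DeliveryStreamName=stream_name, Records=batch)


batches = list(to_firehose_batches({"n": i} for i in range(1001)))
print([len(b) for b in batches])
```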

In addition to storing and using the historical trends of your IoT data, you must have a system that stores the current state of each device and lets you query the current state of all of your devices. This supports internal analytics and customer-facing views into your IoT data.

The AWS IoT Device Shadow service is an effective mechanism to store a virtual representation of your device in the cloud. A device shadow is best suited for managing the current state of each device. In addition, for internal teams that need to query against the shadow for operational needs, use the managed capabilities of Fleet Indexing, which provides a searchable index incorporating your IoT registry and shadow metadata. If you need to provide index-based searching or filtering to a large number of external users, such as for a consumer application, dynamically archive the shadow state using a combination of the AWS IoT rules engine, Kinesis Data Firehose, and Amazon Elasticsearch Service to store your data in a format that allows fine-grained query access for external users.
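The shadow and Fleet Indexing interaction described above can be sketched as follows. The helper names, the thing name, and the `mode` and `status` fields are hypothetical. The query builder emits only simple `field:value` terms joined with `AND` and does not escape values. The search helper assumes Fleet Indexing is already enabled with shadow data included in the index, and it needs AWS credentials, so it is defined but not called here.

```python
import json
from typing import List


def shadow_update_topic(thing_name: str) -> str:
    """MQTT topic a device publishes to in order to update its classic shadow."""
    return f"$aws/things/{thing_name}/shadow/update"


def reported_state_payload(state: dict) -> str:
    """Shadow update document carrying the device's reported state."""
    return json.dumps({"state": {"reported": state}})


def fleet_query(**reported_fields: str) -> str:
    """Build a Fleet Indexing query that matches reported shadow fields."""
    terms = [
        f"shadow.reported.{name}:{value}"
        for name, value in sorted(reported_fields.items())
    ]
    return " AND ".join(terms)


def search_things(query: str) -> List[dict]:
    """Run the query against the fleet index (requires AWS credentials)."""
    import boto3  # imported lazily so the helpers run without the SDK

    return boto3.client("iot").search_index(queryString=query)["things"]


topic = shadow_update_topic("pump-7")
payload = reported_state_payload({"mode": "eco", "status": "idle"})
query = fleet_query(mode="eco", status="idle")
print(topic, payload, query, sep="\n")
```

This covers the internal, operational query path only. The external-user path through the rules engine, Kinesis Data Firehose, and Amazon Elasticsearch Service involves service configuration that is not shown here.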

IOTPERF 4. How do you select the database for your IoT device state?