Cost-effective resources - IoT Lens

Cost-effective resources

Given the scale of devices and data that can be generated by an IoT application, using the appropriate AWS services for your system is key to cost savings. In addition to the overall cost for your IoT solution, IoT architects often look at connectivity through the lens of bill of materials (BOM) costs. For BOM calculations, you must predict and monitor what the long-term costs will be for managing the connectivity to your IoT application throughout the lifetime of that device. Leveraging AWS services will help you calculate initial BOM costs, make use of cost-effective services that are event driven, and update your architecture to continue to lower your overall lifetime cost for connectivity.

The most straightforward way to increase the cost-effectiveness of your resources is to group IoT events into batches and process data collectively. By processing events in groups, you lower the compute time spent on each individual message. Aggregation can help you save on compute resources and enables solutions in which data is compressed and archived before being persisted. This strategy decreases the overall storage footprint without losing data or compromising your ability to query it.

IOTCOST 01. How do you choose cost-efficient tools for data aggregation of your IoT workloads?

AWS IoT is best suited for streaming data for either immediate consumption or historical analyses. There are several ways to batch data from AWS IoT Core to other AWS services, and the differentiating factor is whether you batch raw data as is or enrich the data and then batch it. Enriching, transforming, and filtering IoT telemetry data during (or immediately after) ingestion is best performed by creating an AWS IoT rule that sends the data to Amazon Kinesis Data Streams, Amazon Data Firehose, AWS IoT Analytics, or Amazon Simple Queue Service (Amazon SQS). These services allow you to process multiple data events at once.

When dealing with raw device data from this batch pipeline, you can use AWS IoT Analytics and Amazon Data Firehose to transfer data to S3 buckets and Amazon Redshift. To lower storage costs in Amazon S3, an application can use lifecycle policies that archive data to lower cost storage, such as Amazon S3 Glacier.

Raw data from devices can also be processed at the edge using AWS IoT Greengrass, which eliminates the need to send all the data to the cloud for storage and processing. This can result in lower network cost and lower cost in cloud services. You can dynamically change or update the processing logic, as well as the frequency of transmission, using AWS IoT Greengrass because the logic is not hardcoded and can be adjusted as needed by the use case. This gives you added flexibility for cost optimization.

In addition, observe the following general practice recommendations:

The methods and tools used to acquire, validate, categorize, and store data impact the overall cost of your application. Focusing on tools that automatically vary in scale and cost with demand, and that support your data with a minimum of administrative overhead, can help you achieve the lowest cost for your application. By considering the data pipeline from origination to archival, you can make informed decisions and examine tradeoffs among technical and business requirements to identify the most effective solution.

Best practice IOTCOST_1.1 – Use a data lake for raw telemetry data

A data lake brings different data sources together and provides a common management framework for browsing, viewing, and extracting the sources. An effective data lake enables IoT cost management by storing data in the right format for the right use case. With a data lake, storage and interaction characteristics can be aligned to a specific dataset format and required interfaces.

Recommendation IOTCOST_1.1.1 – Categorize telemetry types and map to storage capabilities

Best practice IOTCOST_1.2 – Provide a self-service interface for end users to search, extract, manage, and update IoT data

With inexpensive cloud computing resources, pay-as-you-go pricing, and strong identity and encryption controls, your organization should allow groups to define and share data models in the format that makes the most sense for them. Self-service interfaces will encourage experimentation and speed up change by removing barriers for teams to gain access to the data they need to make decisions.

Recommendation IOTCOST_1.2.1 – Use an architecture that allows various end users to easily find, obtain, enhance, and share data

Recommendation IOTCOST_1.2.2 – Use a subscriber model, which allows teams to subscribe to data feeds and receive notification of updates, to reduce the need for active polling and re-synching with data sources

Best practice IOTCOST_1.3 – Track and manage the utilization of data sources

Applications and users evolve over time, and IoT solutions can generate large volumes of data quickly. As your application matures, it’s important for cost management of your IoT workload to verify that the data you collect is still being used. Consistent tracking and review of data utilization provides an objective basis for cost optimization decisions.

Recommendation IOTCOST_1.3.1 – Track and manage the utilization of data sources to identify hot and cold spots to assess value of data

Best practice IOTCOST_1.4 – Aggregate data at the edge where possible

Data aggregation is an architectural decision that can have impacts on data fidelity. Aggregations should be thoroughly reviewed with engineering and architectural teams before implementation. If the device can aggregate data before sending for processing using methods such as combining messages or removing duplicate or repeating values, that can reduce the amount of processing, the number of associated resources, and associated expense.

Recommendation IOTCOST_1.4.1 – Examine device telemetry for opportunities to batch and aggregate data

  • A common mechanism includes combining multiple status updates to a final status, or combining a series of measurements generated by the device into a single message.

  • For example, 10 KB of device telemetry data might be packaged as one 10-KB message, two 5-KB messages, or 10 1-KB messages. Each packaging format has implications outside of cost such as network traffic (10 1-KB messages will each add their own header messaging as opposed to a single 10-KB message with one header) and the impact of a lost or delayed message. Optimizing message size should consider how a lost message impacts the functional or non-functional characteristics of the system.
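The aggregation methods described above (removing repeating values, then combining a series of measurements into a single message) can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `(timestamp, value)` reading format, the function names, and the JSON payload layout are all assumptions made for the example.

```python
import json
from typing import Iterable, List, Tuple

# Hypothetical reading format: (unix timestamp in seconds, measured value).
Reading = Tuple[int, float]


def drop_repeats(readings: Iterable[Reading]) -> List[Reading]:
    """Keep a reading only when its value differs from the last kept value."""
    kept: List[Reading] = []
    for timestamp, value in readings:
        if not kept or kept[-1][1] != value:
            kept.append((timestamp, value))
    return kept


def pack_message(device_id: str, readings: List[Reading]) -> bytes:
    """Combine many readings into one compact JSON payload."""
    body = {"d": device_id, "r": [[ts, value] for ts, value in readings]}
    return json.dumps(body, separators=(",", ":")).encode("utf-8")


readings = [
    (1700000000, 21.5), (1700000001, 21.5), (1700000002, 21.5),
    (1700000003, 21.6), (1700000004, 21.6), (1700000005, 21.5),
]
compact = drop_repeats(readings)
payload = pack_message("sensor-1", compact)
print(f"{len(readings)} readings -> {len(compact)} kept, "
      f"{len(payload)} bytes in one message")
```

Dropping repeated values trades data fidelity for size, because the receiver can no longer tell how often an unchanged value was sampled. That tradeoff is the kind of decision to review with engineering and architectural teams before implementation.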

Recommendation IOTCOST_1.4.2 – Use cost calculators to model different approaches for message size and count
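Before turning to a pricing calculator, a rough model can help frame the comparison. The sketch below evaluates the 10-KB example packaged as one, two, or ten messages. Every number in it is an assumption for illustration: the 5-KB metering increment, the 50-byte per-message header, and the price of 1.00 per million metered units are placeholders, so check the current pricing for the services you use.

```python
import math
from typing import Dict, List


def billed_units(message_bytes: int, increment_bytes: int) -> int:
    """Metered units for one message, rounded up to the metering increment."""
    return max(1, math.ceil(message_bytes / increment_bytes))


def compare_packaging(
    total_bytes: int,
    message_counts: List[int],
    header_bytes: int,
    increment_bytes: int,
    price_per_million_units: float,
) -> List[Dict[str, float]]:
    """Model billed units, bytes on the wire, and cost for each packaging."""
    rows = []
    for count in message_counts:
        per_message_bytes = math.ceil(total_bytes / count)
        units = count * billed_units(per_message_bytes, increment_bytes)
        rows.append({
            "messages": count,
            "billed_units": units,
            # Each message carries its own header overhead.
            "wire_bytes": total_bytes + count * header_bytes,
            "cost": units / 1_000_000 * price_per_million_units,
        })
    return rows


rows = compare_packaging(
    total_bytes=10 * 1024,         # 10 KB of telemetry
    message_counts=[1, 2, 10],     # one 10-KB, two 5-KB, ten 1-KB messages
    header_bytes=50,               # assumed per-message overhead
    increment_bytes=5 * 1024,      # assumed metering increment
    price_per_million_units=1.00,  # hypothetical price
)
for row in rows:
    print(row)
```

Under these assumptions, one 10-KB message and two 5-KB messages produce the same number of billed units, while ten 1-KB messages produce five times as many and add the most header overhead. The model ignores the impact of a lost or delayed message, which still needs to be weighed separately.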

IOTCOST 02. How do you optimize the cost of raw telemetry data?

Raw telemetry is an original source for analytics but can also be a major component of cost. Analyze the flow and usage of your telemetry to identify the right service and handling process required. Select storage and processing mechanisms that match the needs of your specific telemetry case.

Best practice IOTCOST_2.1 – Use lifecycle policies to archive your data

When selecting an automated lifecycle policy for data, there are tradeoffs to consider. For example, do you want to optimize for speed to market or cost? In some cases, it's best to optimize for speed—going to market quickly, shipping new features, or meeting a deadline—rather than investing in upfront cost optimization. Use your organization’s data classification strategies to define a lifecycle policy to take raw telemetry measurements through various services. Setting milestones by time sets expectations and encourages aggregation and production of data over mere collection.

Recommendation IOTCOST_2.1.1 – Evaluate your organization’s data retention and handling requirements and configure your AWS services to support them

  • Check your organization’s data management policy for requirements on retention, deletion, and encryption and align your retention policies and tools with those guidelines.

  • S3 Lifecycle policies or S3 Intelligent-Tiering can move the data to the most cost-effective Amazon S3 storage class or Amazon S3 Glacier for long-term storage.
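As an illustration of the lifecycle approach above, the following sketch builds an S3 Lifecycle configuration that moves raw telemetry to an infrequent-access class, then to S3 Glacier, and finally expires it. The prefix, rule ID, and day counts are hypothetical and should come from your organization's data management policy. The function only builds the configuration dictionary; applying it with boto3 is shown in a comment and assumes a bucket name and credentials that are not part of this example.

```python
from typing import Any, Dict


def build_lifecycle_configuration(
    prefix: str,
    infrequent_after_days: int,
    archive_after_days: int,
    expire_after_days: int,
) -> Dict[str, Any]:
    """Build an S3 Lifecycle configuration for raw telemetry objects."""
    if not 0 < infrequent_after_days < archive_after_days < expire_after_days:
        raise ValueError("day thresholds must be positive and increasing")
    return {
        "Rules": [
            {
                "ID": "archive-raw-telemetry",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": infrequent_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": archive_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = build_lifecycle_configuration(
    prefix="raw/telemetry/",   # hypothetical key prefix
    infrequent_after_days=30,
    archive_after_days=90,
    expire_after_days=730,
)
print(config)

# To apply it (requires boto3, AWS credentials, and a real bucket name):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="amzn-s3-demo-bucket", LifecycleConfiguration=config)
```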

Best practice IOTCOST_2.2 – Evaluate storage characteristics for your use case and align with the right services

Not all data needs to be stored in the same way, and data storage needs change through a data item’s lifecycle. A growing fleet of devices can exponentially scale its messaging rate and device operation traffic. This scaling of message volumes can also mean an increase in storage costs.

Recommendation IOTCOST_2.2.1 – Evaluate velocity and the volume of data coming from IoT devices when selecting storage services

  • For data at high scale of devices, time, or other characteristics—Consider a data warehouse such as Amazon Redshift or Amazon S3 with Amazon Athena. The data partitioning and tiering features of AWS storage services can help reduce storage costs.

  • For data at lower scale of time, devices, or other characteristics—Consider Amazon DynamoDB, Amazon OpenSearch Service (OpenSearch Service), or Amazon Aurora for short-term historical data. Use your data lifecycle policies to optimize what is kept in the short-term storage.

Best practice IOTCOST_2.3 – Store raw archival data on cost effective services

Using the right storage solution for a specific data type will align costs with usage.

Recommendation IOTCOST_2.3.1 – Use an object store for archival storage

  • Use an object store, such as Amazon S3, for raw archival storage. Object stores treat each stored object as immutable and are often more efficient and cost-effective than block storage, especially for data that doesn’t require editing.

  • Avoid costs by using a schema-on-read service, such as Amazon Athena, to query the data in its native form. Using Athena can help reduce the need for large-scale storage arrays or always-on databases to read raw archival data.

IOTCOST 03. How do you optimize the cost of interactions between devices and your IoT cloud solution?

Interactions to and from devices can be a significant driver of your workload’s overall cost. Understanding and optimizing the interactions between devices and your cloud solution can be a significant factor in cost management.

Best practice IOTCOST_3.1 – Select services to optimize cost

Understand how services use and charge for messaging, as well as operating modes that offer cost benefits. Understanding service billing characteristics can help you identify ways to optimize messaging, which could result in considerable cost savings at scale.

Recommendation IOTCOST_3.1.1 – Select services to optimize cost

  • Review your IoT architecture to find communication patterns and scenarios that could map to service discount features.

  • With AWS IoT Core Rules Engine Basic Ingest, you can publish directly to a rule without messaging charges.

  • Use your registry of things only for primarily immutable data, such as serialNumber.

  • For your device’s shadow, minimize the frequency of reads and writes to reduce the number of metered operations and your operating costs.

Best practice IOTCOST_3.2 – Implement and configure telemetry to reduce data transfer costs

Matching the precision of telemetry data, such as the number of decimal places, to the precision of the required calculation can help address both the overall message size and the precision of calculations.

Recommendation IOTCOST_3.2.1 – Reduce string lengths and decimal precision where feasible

  • For example, strings sent regularly such as "POWER" or "CHARGE" could be reduced to the strings "P" or "C". Similarly, decimal values such as “21.25” or “71.86” could be reduced to “21” or “72” if the additional precision is not required for the application. This is common in room temperature readings where precision beyond a whole number is rarely required. Across many millions of messages, the savings from removing a few characters can make a significant difference in message size and cost.
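The example above can be sketched in a few lines. The field names `state` and `temperature` and the code table are assumptions made for illustration. The sketch shortens the string values and rounds the reading to a whole number, then compares the serialized sizes of the two payloads.

```python
import json

# Hypothetical mapping from verbose state strings to one-letter codes.
STATE_CODES = {"POWER": "P", "CHARGE": "C"}


def compact_reading(state: str, temperature: float) -> dict:
    """Shorten the state string and drop decimal precision that is not needed."""
    return {
        "state": STATE_CODES.get(state, state),  # unknown states pass through
        "temperature": round(temperature),       # whole degrees only
    }


def encoded_size(message: dict) -> int:
    """Size in bytes of the message as compact JSON."""
    return len(json.dumps(message, separators=(",", ":")).encode("utf-8"))


verbose = {"state": "CHARGE", "temperature": 71.86}
compact = compact_reading("CHARGE", 71.86)

print(verbose, encoded_size(verbose), "bytes")
print(compact, encoded_size(compact), "bytes")
```

The receiving application needs the same code table to expand the values again, so the mapping becomes part of the contract between the device and the cloud.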

Best practice IOTCOST_3.3 – Use shadow only for slow changing data

Shadow is used in IoT applications as a persistence mechanism of device state. The shadow maintains data that remains consistent across multiple points in time. Device shadow operations can be billed and metered differently than publish/subscribe messages. Reducing the shadow update frequency from the device can reduce the number of billed operations while maintaining an acceptable level of data freshness.
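One way to reduce shadow update frequency on the device is to gate updates on both elapsed time and the size of the change. The sketch below is illustrative only: the `ShadowUpdateGate` class, the 60-second minimum interval, and the 1.0 change threshold are hypothetical, and the actual shadow update call is left out.

```python
from typing import List, Optional, Tuple


class ShadowUpdateGate:
    """Decides whether a new reported value is worth a shadow update."""

    def __init__(self, min_interval_s: float, min_change: float) -> None:
        self.min_interval_s = min_interval_s
        self.min_change = min_change
        self._last_value: Optional[float] = None
        self._last_time: Optional[float] = None

    def should_update(self, value: float, now: float) -> bool:
        """Allow the first update, then only meaningful and spaced-out changes."""
        if self._last_value is None or self._last_time is None:
            allowed = True
        else:
            changed = abs(value - self._last_value) >= self.min_change
            waited = now - self._last_time >= self.min_interval_s
            allowed = changed and waited
        if allowed:
            self._last_value, self._last_time = value, now
        return allowed


gate = ShadowUpdateGate(min_interval_s=60, min_change=1.0)
# (seconds since start, reported value)
samples: List[Tuple[float, float]] = [
    (0, 20.0), (10, 20.2), (30, 21.5), (70, 21.6), (130, 19.0),
]
decisions = [gate.should_update(value, now) for now, value in samples]
print(decisions)
```

In this run, three of the five samples would result in a shadow update. Tuning the two thresholds is how you balance billed operations against data freshness.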

Recommendation IOTCOST_3.3.1 – Avoid using shadows as a guaranteed-delivery mechanism or for continuously fluctuating data

  • As a workload scales up, the cost of frequent shadow updates could exceed the value of the data.

  • Consider MQTT Last Will and Testament (LWT) as a mitigation for the risk of loss of device communication instead of using shadow.

  • Use the AWS Pricing Calculator to compare device shadow interactions with telemetry messages and understand the cost implications.

Best practice IOTCOST_3.4 – Group and tag IoT devices and messages for cost allocation

An IoT billing group enables you to tag devices by categories related to your IoT application. You should create tags that represent business categories, such as cost centers. Visibility into devices and messages by category makes cost dimensions easier to understand and visualize.

Recommendation IOTCOST_3.4.1 – Use AWS IoT Core Billing Groups to tag IoT devices for cost allocation

  • Add tracking elements to messages and devices, such as product model and serial number, to help trace their source.

  • Ensure that your system can group and organize data by both technical and business entities.

Best practice IOTCOST_3.5 – Implement and configure device messaging to reduce data transfer costs

Charges from different cloud and data transport providers can vary based on the specifics of message size and frequency. IoT workloads can cross multiple communication layers, such as cellular networks, and each layer can have its own metering and pricing standards.

Recommendation IOTCOST_3.5.1 – Evaluate tradeoffs between message size and number of messages

  • Frequency optimization is performed with payload optimization to both accurately assess the network load and identify adequate trade-offs between frequency and payload size.

  • For example, your devices might send one message per second. If you could aggregate those messages on the device and send five observations in a single message every five seconds, that would cut your message count by 80 percent and could substantially reduce your cost.
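The one-message-per-second example above can be sketched as a small batching helper. The function name and the sample data are assumptions made for the illustration, and publishing the batches is left out.

```python
from typing import Iterable, Iterator, List


def batch_observations(
    observations: Iterable[float], batch_size: int
) -> Iterator[List[float]]:
    """Group consecutive observations into batches of up to batch_size."""
    batch: List[float] = []
    for observation in observations:
        batch.append(observation)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Send any leftover observations rather than dropping them.
        yield batch


# One observation per second for one minute (hypothetical values).
one_minute = [20.0 + i * 0.01 for i in range(60)]
messages = list(batch_observations(one_minute, batch_size=5))
print(f"{len(one_minute)} observations -> {len(messages)} messages")
```

Here 60 observations become 12 messages. Whether the cost falls by the same proportion depends on how each provider meters message size, so the combined payload is worth checking against the relevant metering limits. Batching also delays delivery of the earliest observation in each batch by up to the batch window.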

Recommendation IOTCOST_3.5.2 – Evaluate cost of streaming services versus IoT messaging services

  • Use the AWS Pricing Calculator to compare the cost of using services such as Amazon Kinesis and Amazon API Gateway to offload components of your IoT workload.