Write data (inserts and upserts) - Amazon Timestream

Write data (inserts and upserts)

Writing batches of records

You can use the following code snippets to write data into an Amazon Timestream table. Writing data in batches helps to optimize the cost of writes. See Calculating the number of writes for more information.

def write_records(self): print("Writing records") current_time = self._current_milli_time() dimensions = [ {'Name': 'region', 'Value': 'us-east-1'}, {'Name': 'az', 'Value': 'az1'}, {'Name': 'hostname', 'Value': 'host1'} ] cpu_utilization = { 'Dimensions': dimensions, 'MeasureName': 'cpu_utilization', 'MeasureValue': '13.5', 'MeasureValueType': 'DOUBLE', 'Time': current_time } memory_utilization = { 'Dimensions': dimensions, 'MeasureName': 'memory_utilization', 'MeasureValue': '40', 'MeasureValueType': 'DOUBLE', 'Time': current_time } records = [cpu_utilization, memory_utilization] try: result = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=records, CommonAttributes={}) print("WriteRecords Status: [%s]" % result['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: print("RejectedRecords: ", err) for rr in err.response["RejectedRecords"]: print("Rejected Index " + str(rr["RecordIndex"]) + ": " + rr["Reason"]) print("Other records were written successfully. ") except Exception as err: print("Error:", err)

Writing batches of records with common attributes

If your time series data has measures and/or dimensions that are common across many data points, you can also use the following optimized version of the writeRecords API to insert data into Timestream. Using common attributes with batching can further optimize the cost of writes as described in Calculating the number of writes.

def write_records_with_common_attributes(self): print("Writing records extracting common attributes") current_time = self._current_milli_time() dimensions = [ {'Name': 'region', 'Value': 'us-east-1'}, {'Name': 'az', 'Value': 'az1'}, {'Name': 'hostname', 'Value': 'host1'} ] common_attributes = { 'Dimensions': dimensions, 'MeasureValueType': 'DOUBLE', 'Time': current_time } cpu_utilization = { 'MeasureName': 'cpu_utilization', 'MeasureValue': '13.5' } memory_utilization = { 'MeasureName': 'memory_utilization', 'MeasureValue': '40' } records = [cpu_utilization, memory_utilization] try: result = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=records, CommonAttributes=common_attributes) print("WriteRecords Status: [%s]" % result['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: print("RejectedRecords: ", err) for rr in err.response["RejectedRecords"]: print("Rejected Index " + str(rr["RecordIndex"]) + ": " + rr["Reason"]) print("Other records were written successfully. ") except Exception as err: print("Error:", err)

Upserting records

While the default writes in Amazon Timestream follow the first writer wins semantics, where data is stored as append only and duplicate records are rejected, there are applications that require the ability to write data into Amazon Timestream using the last writer wins semantics, where the record with the highest version is stored in the system. There are also applications that require the ability to update existing records. To address these scenarios, Amazon Timestream provides the ability to upsert data. Upsert is an operation that inserts a record in to the system when the record does not exist or updates the record, when one exists.

You can upsert records by including the Version in record definition while sending a WriteRecords request. Amazon Timestream will store the record with the record with highest Version. The code sample below shows how you can upsert data:

def write_records_with_upsert(self): print("Writing records with upsert") current_time = self._current_milli_time() # To achieve upsert (last writer wins) semantic, one example is to use current time as the version if you are writing directly from the data source version = int(self._current_milli_time()) dimensions = [ {'Name': 'region', 'Value': 'us-east-1'}, {'Name': 'az', 'Value': 'az1'}, {'Name': 'hostname', 'Value': 'host1'} ] common_attributes = { 'Dimensions': dimensions, 'MeasureValueType': 'DOUBLE', 'Time': current_time, 'Version': version } cpu_utilization = { 'MeasureName': 'cpu_utilization', 'MeasureValue': '13.5' } memory_utilization = { 'MeasureName': 'memory_utilization', 'MeasureValue': '40' } records = [cpu_utilization, memory_utilization] # write records for first time try: result = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=records, CommonAttributes=common_attributes) print("WriteRecords Status for first time: [%s]" % result['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: self._print_rejected_records_exceptions(err) except Exception as err: print("Error:", err) # Successfully retry same writeRecordsRequest with same records and versions, because writeRecords API is idempotent. try: result = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=records, CommonAttributes=common_attributes) print("WriteRecords Status for retry: [%s]" % result['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: self._print_rejected_records_exceptions(err) except Exception as err: print("Error:", err) # upsert with lower version, this would fail because a higher version is required to update the measure value. version -= 1 common_attributes["Version"] = version cpu_utilization["MeasureValue"] = '14.5' memory_utilization["MeasureValue"] = '50' upsertedRecords = [cpu_utilization, memory_utilization] try: upsertedResult = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=upsertedRecords, CommonAttributes=common_attributes) print("WriteRecords Status for upsert with lower version: [%s]" % upsertedResult['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: self._print_rejected_records_exceptions(err) except Exception as err: print("Error:", err) # upsert with higher version as new data in generated version = int(self._current_milli_time()) common_attributes["Version"] = version try: upsertedResult = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=upsertedRecords, CommonAttributes=common_attributes) print("WriteRecords Upsert Status: [%s]" % upsertedResult['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: self._print_rejected_records_exceptions(err) except Exception as err: print("Error:", err)

Handling write failures

If your application receives a RejectedRecordsException when attempting to write records to Timestream, you can parse the rejected records to learn more about the write failures as shown below.

try: result = self.client.write_records(DatabaseName=Constant.DATABASE_NAME, TableName=Constant.TABLE_NAME, Records=records, CommonAttributes=common_attributes) print("WriteRecords Status: [%s]" % result['ResponseMetadata']['HTTPStatusCode']) except self.client.exceptions.RejectedRecordsException as err: print("RejectedRecords: ", err) for rr in err.response["RejectedRecords"]: print("Rejected Index " + str(rr["RecordIndex"]) + ": " + rr["Reason"]) print("Other records were written successfully. ") except Exception as err: print("Error:", err)