Testing the solution - Disaster Recovery for AWS IoT

Testing the solution

Use the following steps to test the solution after the CloudFormation stacks have been launched.

Device replication

  • Create a device in the primary Region.

  • Verify that the device has been created in the secondary Region with its certificate and policy attached. You can use the AWS CLI, the AWS Management Console, or the tools provided with the solution.

  • Send and receive messages in both Regions to verify that the device works correctly in either Region. You can use a publish/subscribe client of your choice or the iot-dr-pubsub.py tool.
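
The verification in the second step can also be scripted. The following is a minimal sketch, not part of the solution's tools. It assumes that boto3 is installed and that credentials are configured. The function name replication_status is hypothetical. The function takes an AWS IoT client for the Region to check and reports whether the thing exists, has a principal (typically a certificate) attached, and whether a policy is attached to one of those principals.

```python
def replication_status(iot, thing_name):
    """Report whether a thing, an attached principal, and a policy exist.

    `iot` is assumed to be a boto3 IoT client for the Region to check,
    for example boto3.client("iot", region_name="us-west-2").
    """
    status = {"thing": False, "certificate": False, "policy": False}
    try:
        iot.describe_thing(thingName=thing_name)
    except iot.exceptions.ResourceNotFoundException:
        return status  # the thing has not been replicated (yet)
    status["thing"] = True

    # Principals attached to a thing are typically certificate ARNs.
    principals = iot.list_thing_principals(thingName=thing_name)["principals"]
    status["certificate"] = bool(principals)

    # One principal with at least one attached policy is enough.
    for principal in principals:
        policies = iot.list_attached_policies(target=principal)["policies"]
        if policies:
            status["policy"] = True
            break
    return status
```

Call the function once with a client for the secondary Region after you create the device in the primary Region. Because replication is asynchronous, a result with "thing" set to False shortly after creation might only mean that replication has not finished.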

Device shadow replication

Test device shadow replication with the iot-dr-shadow-cmp.py tool provided or use your own method.
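
If you prefer your own method, the comparison can be sketched as follows. This is an illustration under assumptions, not the implementation of iot-dr-shadow-cmp.py. It assumes boto3 "iot-data" clients for both Regions and compares the "state" section of the classic shadow document. The function names read_shadow_state and shadows_match are hypothetical.

```python
import json


def read_shadow_state(iot_data, thing_name):
    """Return the "state" section of a thing's classic shadow, or None.

    `iot_data` is assumed to be a boto3 "iot-data" client for one Region.
    """
    try:
        response = iot_data.get_thing_shadow(thingName=thing_name)
    except iot_data.exceptions.ResourceNotFoundException:
        return None  # no shadow in this Region (yet)
    # The payload is a stream that contains the shadow document as JSON.
    document = json.loads(response["payload"].read())
    return document.get("state")


def shadows_match(primary, secondary, thing_name):
    """True if both Regions hold a shadow with an identical state."""
    primary_state = read_shadow_state(primary, thing_name)
    secondary_state = read_shadow_state(secondary, thing_name)
    return primary_state is not None and primary_state == secondary_state
```

Comparing only the "state" section is a design choice of this sketch. Metadata such as timestamps and version numbers can legitimately differ between Regions.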

Sample walkthrough

This sample walkthrough uses the tools provided by the solution. Several variables come from the toolsrc file in the tools folder.

Copy the tools from your S3 bucket in the primary Region to your environment. You can find the S3 URL for tools in the Outputs section of the main stack under ToolsS3Url.

    # Get the tools (replace ToolsS3Url with the value from the stack outputs)
    mkdir tools
    cd tools
    aws s3 sync ToolsS3Url .

    # Make the scripts executable
    chmod +x *.sh
    chmod +x *.py

    # Get the Amazon Root CA
    curl https://www.amazontrust.com/repository/AmazonRootCA1.pem -o root.ca.pem

    # Source the environment variables
    . toolsrc

    # Assign the thing name to be used to a shell variable
    THING_NAME=dr-walkthrough

    # Create the device in the primary Region
    AWS_DEFAULT_REGION=$PRIMARY_REGION ./create-device.sh $THING_NAME

    # List the device in the primary and secondary Regions
    AWS_DEFAULT_REGION=$PRIMARY_REGION ./list-thing.py $THING_NAME
    AWS_DEFAULT_REGION=$SECONDARY_REGION ./list-thing.py $THING_NAME

    # Pub/sub in the primary Region
    ./iot-dr-pubsub.py --endpoint $IOT_ENDPOINT_PRIMARY --cert $THING_NAME.certificate.pem --key $THING_NAME.private.key --root-ca root.ca.pem --client-id $THING_NAME --topic dr/$THING_NAME --count 2 --interval 1

    # Pub/sub in the secondary Region
    ./iot-dr-pubsub.py --endpoint $IOT_ENDPOINT_SECONDARY --cert $THING_NAME.certificate.pem --key $THING_NAME.private.key --root-ca root.ca.pem --client-id $THING_NAME --topic dr/$THING_NAME --count 2 --interval 1

You can also use the MQTT test client in the AWS IoT Core console in the primary and secondary Regions to verify message arrival. Subscribe to dr/$THING_NAME in each Region. Replace $THING_NAME with the name of your device.

Shadow replication tool

You can use the iot-dr-shadow-cmp.py tool to test whether device shadows are replicated from the primary to the secondary Region. Replicating a shadow from one Region to the other can take up to approximately 10 seconds. The tool tries to look up the shadow in the secondary Region immediately after the shadow has been created in the primary Region. This attempt fails if the shadow has not been replicated yet. The tool has retry logic and tries to get the shadow again after two to 10 seconds, depending on the number of retry attempts. You might see messages that include retry information, such as the following example:

compare_shadow: n: 2 thing_name: 69a14446-dc16-4f29-9db7-1e46875b5c02: no shadow payload, retrying in 4 secs.

After the shadows in both Regions have been compared successfully, the tool prints messages similar to the following example:

compare_shadow: i: 4 thing_name: 524a0e2d-7cdb-4def-9341-a5e22553df07 shadows match: temperature: 36 temperature_secondary: 36

To create five shadows and compare them, run the following command:

./iot-dr-shadow-cmp.py --primary-region $PRIMARY_REGION --secondary-region $SECONDARY_REGION --num-tests 5
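
The retry behavior described in this section can be sketched as follows. The exact schedule that the tool uses is not documented here. This sketch assumes a delay of two seconds per attempt, capped at 10 seconds, which is consistent with the sample message (attempt 2, retrying in 4 secs) but is an assumption. The names retry_delay and get_with_retry are hypothetical.

```python
import time


def retry_delay(attempt, step=2, cap=10):
    """Seconds to wait after the given 1-based attempt (assumed schedule)."""
    return min(step * attempt, cap)


def get_with_retry(fetch, max_attempts=5, sleep=time.sleep):
    """Call fetch() until it returns a payload or the attempts run out.

    `fetch` is any zero-argument callable that returns None while the
    shadow is not yet available in the secondary Region.
    """
    for attempt in range(1, max_attempts + 1):
        payload = fetch()
        if payload is not None:
            return payload
        if attempt < max_attempts:
            delay = retry_delay(attempt)
            print(f"attempt {attempt}: no shadow payload, retrying in {delay} secs.")
            sleep(delay)
    return None  # the shadow never appeared within max_attempts
```

The sleep function is passed in as a parameter so that the loop can be exercised in a test without waiting.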

Failover testing

To test failover with Amazon Route 53, you can use the iot-dr-pubsub.py tool. You must create a CNAME with a traffic policy before you start the pub/sub sample with your endpoint in DR mode (--dr-mode).

./iot-dr-pubsub.py --endpoint iot-dr-us.example.com --cert $THING_NAME.certificate.pem --key $THING_NAME.private.key --root-ca root.ca.pem --client-id $THING_NAME --topic dr/$THING_NAME --count 0 --interval 5 --use-cname --dr-mode

You can invalidate the Route 53 health check in the primary Region by modifying the query string for the health check so that the health check fails. Use the following steps to modify the query string:

  1. Sign in to the Amazon Route 53 management console.

  2. Select Health checks.

  3. Select the check box for the health check for your primary Region.

  4. Edit the health check.

  5. Delete the last character from the value for the Path field. Record this character so that you can revert the path to the original.

  6. Choose Save.



Figure 5: Disaster Recovery for AWS IoT Solution Make a health check fail
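
The console steps above can also be approximated with the Route 53 API. The following is a minimal sketch, assuming a boto3 "route53" client and a health check that has a path configured. The function names are hypothetical. The first function removes the last character of the path and returns the original path so that you can restore it later.

```python
def break_health_check_path(route53, health_check_id):
    """Drop the last character of the health check path.

    `route53` is assumed to be a boto3 Route 53 client, for example
    boto3.client("route53"). Returns the original path for later restore.
    """
    health_check = route53.get_health_check(HealthCheckId=health_check_id)
    original_path = health_check["HealthCheck"]["HealthCheckConfig"]["ResourcePath"]
    if len(original_path) < 2:
        raise ValueError("path is too short to shorten safely")
    route53.update_health_check(
        HealthCheckId=health_check_id,
        ResourcePath=original_path[:-1],
    )
    return original_path


def restore_health_check_path(route53, health_check_id, original_path):
    """Write the recorded original path back to the health check."""
    route53.update_health_check(
        HealthCheckId=health_check_id,
        ResourcePath=original_path,
    )
```

Store the returned path somewhere durable before you continue with the test. It plays the same role as the character that you record in step 5.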

After a few minutes, the health check status changes to Unhealthy.

When the health check in the primary Region changes to Unhealthy, Amazon Route 53 resolves your CNAME automatically to the secondary Region. This causes the script iot-dr-pubsub.py to detect a Region switch and reconnect to the secondary Region. You will then see messages similar to the following:

    [INFO]: Thread-4-iot-dr-pubsub.py:143-dr_endpoint_verifier: REGION SWITCH detected: ENDPOINT_NAME-ats.iot.us-east-1.amazonaws.com -> ENDPOINT_NAME-ats.iot.us-west-2.amazonaws.com
    [INFO]: Thread-4-iot-dr-pubsub.py:145-dr_endpoint_verifier: teminating current MQTT_CONNECTION
    [INFO]: Thread-4-iot-dr-pubsub.py:147-dr_endpoint_verifier: disconnect_future result: <Future at 0x7f7db1bd9240 state=pending>
    [INFO]: Thread-4-iot-dr-pubsub.py:150-dr_endpoint_verifier: initiating new MQTT_CONNECTION to iot_endpoint: ENDPOINT_NAME-ats.iot.us-west-2.amazonaws.com
    [INFO]: Thread-4-iot-dr-pubsub.py:205-connection_start: Connecting to ENDPOINT_NAME-ats.iot.us-west-2.amazonaws.com with client ID 'dr-walkthrough'...
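
To illustrate what a Region switch means at the endpoint level, the following sketch extracts the Region from an AWS IoT endpoint name and compares two endpoints. It does not show how iot-dr-pubsub.py detects the switch. The function names are hypothetical, and the pattern only covers endpoints that end in amazonaws.com, as in the log output above.

```python
import re

# Matches the Region in names such as NAME-ats.iot.us-east-1.amazonaws.com
_REGION_PATTERN = re.compile(r"\.iot\.([a-z0-9-]+)\.amazonaws\.com$")


def endpoint_region(endpoint):
    """Return the Region part of an AWS IoT endpoint name, or None."""
    match = _REGION_PATTERN.search(endpoint)
    return match.group(1) if match else None


def region_switch(previous_endpoint, current_endpoint):
    """Return (old_region, new_region) if the Region changed, else None."""
    previous = endpoint_region(previous_endpoint)
    current = endpoint_region(current_endpoint)
    if previous and current and previous != current:
        return previous, current
    return None
```

For the two endpoints in the first log line, region_switch returns ("us-east-1", "us-west-2"). A client in DR mode would then close its current MQTT connection and connect to the new endpoint.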

You can use the MQTT test client in the AWS IoT console to verify which Region the messages are being published to.

To fail back, restore the character that you deleted from the Path field. After the health check status in the primary Region returns to Healthy, the script switches back to the primary Region.


Figure 6: Disaster Recovery for AWS IoT Route 53 health checks

Automated tests

The Python script `test-dr-deployment.py` in the `source/tools` folder provides an end-to-end test to check whether the deployed solution is working. The script tests the following:

  • Device replication from the primary to the secondary Region

  • Pub/sub to newly created things in the primary and secondary Regions

  • Shadow replication

  • Replication of thing deletion