Use Systems Manager SSM documents with AWS FIS - AWS Fault Injection Simulator

AWS FIS supports custom fault types through the AWS Systems Manager SSM Agent and the AWS FIS action aws:ssm:send-command. AWS FIS also provides pre-configured Systems Manager SSM documents (SSM documents) for common fault injection actions. These are public AWS documents whose names begin with the AWSFIS- prefix.

SSM Agent is Amazon software that can be installed and configured on Amazon EC2 instances, on-premises servers, or virtual machines (VMs). This makes it possible for Systems Manager to manage these resources. The agent processes requests from Systems Manager, and then runs them as specified in the request. You can include your own SSM document to inject custom faults, or reference one of the public Amazon-owned documents.

Requirements

For actions that require SSM Agent to run the action on the target, you must ensure the following:

Use the aws:ssm:send-command action

An SSM document defines the actions that Systems Manager performs on your managed instances. Systems Manager includes a number of pre-configured documents, or you can create your own. For more information about creating your own SSM document, see Creating Systems Manager documents in the AWS Systems Manager User Guide. For more information about SSM documents in general, see AWS Systems Manager documents in the AWS Systems Manager User Guide.

AWS FIS provides pre-configured SSM documents. You can view the pre-configured SSM documents under Documents in the AWS Systems Manager console: https://console.aws.amazon.com/systems-manager/documents. You can also choose from a selection of pre-configured documents in the AWS FIS console. For more information, see Pre-configured AWS FIS SSM documents.

To use an SSM document in your AWS FIS experiments, you can use the aws:ssm:send-command action. This action fetches and runs the specified SSM document on your target instances.

When you use the aws:ssm:send-command action in your experiment template, you must specify additional parameters for the action, including the following:

  • documentArn – Required. The Amazon Resource Name (ARN) of the SSM document.

  • documentParameters – Conditional. The required and optional parameters that the SSM document accepts. The format is a JSON object with keys that are strings and values that are either strings or arrays of strings.

  • documentVersion – Optional. The version of the SSM document to run.
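As a minimal sketch of how these parameters fit together, the following Python snippet builds one action entry for an experiment template. The action name cpuStress, the target name myInstances, the Region in the ARN, and the duration value are illustrative assumptions that do not come from this page; check the AWS FIS actions reference for the full parameter list. The sketch serializes documentParameters to a JSON string because action parameter values in a template are strings.

```python
import json


def build_send_command_action(document_arn, document_parameters,
                              duration, target_name, document_version=None):
    """Build one aws:ssm:send-command action entry for an experiment template."""
    parameters = {
        "documentArn": document_arn,
        # Serialized to a JSON string because action parameter values are strings.
        "documentParameters": json.dumps(document_parameters),
        # Assumption: an ISO 8601 duration such as PT2M; not described on this page.
        "duration": duration,
    }
    if document_version is not None:
        parameters["documentVersion"] = document_version
    return {
        "actionId": "aws:ssm:send-command",
        "parameters": parameters,
        # Assumption: "myInstances" must match a target defined elsewhere in the template.
        "targets": {"Instances": target_name},
    }


action = build_send_command_action(
    document_arn="arn:aws:ssm:us-east-1::document/AWSFIS-Run-CPU-Stress",
    document_parameters={"DurationSeconds": "60", "InstallDependencies": "True"},
    duration="PT2M",
    target_name="myInstances",
)
print(json.dumps({"cpuStress": action}, indent=2))
```

The optional documentVersion key is added only when a version is supplied, which mirrors its Optional status in the list above.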

You can view the information for an SSM document (including the parameters for the document) by using the Systems Manager console or the command line.

To view information about an SSM document using the console

  1. Open the AWS Systems Manager console at https://console.aws.amazon.com/systems-manager/.

  2. In the navigation pane, choose Documents.

  3. Select the document, and choose the Details tab.

To view information about an SSM document using the command line

Use the SSM describe-document command.
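For example, the command could be assembled and run from a script. This is a sketch that assumes the AWS CLI is installed and configured with credentials. The summarize_parameters helper is hypothetical, and the sample response is abbreviated illustrative data shaped like describe-document output, not real output.

```python
import json
import subprocess


def describe_document_command(name):
    """Return the AWS CLI argument list for `aws ssm describe-document`."""
    return ["aws", "ssm", "describe-document", "--name", name, "--output", "json"]


def summarize_parameters(response):
    """Map each document parameter name to its default value (None if absent)."""
    parameters = response.get("Document", {}).get("Parameters", [])
    return {p["Name"]: p.get("DefaultValue") for p in parameters}


def describe_document(name):
    """Run the CLI and parse its JSON output. Requires the AWS CLI and credentials."""
    result = subprocess.run(describe_document_command(name),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Illustrative, abbreviated response; real output contains more fields.
sample = {"Document": {"Name": "AWSFIS-Run-CPU-Stress", "Parameters": [
    {"Name": "DurationSeconds", "Type": "String"},
    {"Name": "CPU", "Type": "String", "DefaultValue": "0"},
]}}
print(summarize_parameters(sample))
```

A parameter with no default in the summary is a reasonable hint that it is required, though the document description remains the authoritative source.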

Pre-configured AWS FIS SSM documents

You can use pre-configured AWS FIS SSM documents with the aws:ssm:send-command action in your experiment templates.

Requirements

  • The pre-configured SSM documents provided by AWS FIS are supported only on Amazon Linux and Ubuntu. On other Linux systems and Windows, you can use the aws:ssm:send-command action to run your own SSM document.

  • The pre-configured SSM documents provided by AWS FIS are supported only on EC2 instances. They are not supported on other types of managed nodes, such as on-premises servers.

AWSFIS-Run-CPU-Stress

Runs CPU stress on an instance using the stress-ng tool. Uses the AWSFIS-Run-CPU-Stress SSM document and the following document parameters:

  • DurationSeconds – Required. The duration of the CPU stress test, in seconds.

  • CPU – Optional. The number of CPU stressors to use. The default is 0, which uses all CPU stressors.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependency is stress-ng.

The following is an example.

{"DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-IO-Stress

Runs IO stress on an instance using the stress-ng tool. Uses the AWSFIS-Run-IO-Stress SSM document and the following document parameters:

  • DurationSeconds – Required. The duration of the IO stress test, in seconds.

  • Workers – Optional. The number of workers that perform a mix of sequential, random, and memory-mapped read/write operations, forced synchronizing, and cache dropping. Multiple child processes perform different I/O operations on the same file. The default is 1.

  • Percent – Optional. The percentage of free space on the file system to use during the IO stress test. The default is 80%.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependency is stress-ng.

The following is an example.

{"Workers":"1", "Percent":"80", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Kill-Process

Stops the specified process on the instance using the killall command. Uses the AWSFIS-Run-Kill-Process SSM document with the following document parameters:

  • ProcessName – Required. The name of the process to stop.

  • Signal – Optional. The signal to send along with the command. The possible values are SIGTERM (which the receiver can choose to ignore) and SIGKILL (which cannot be ignored). The default is SIGTERM.

The following is an example.

{"ProcessName":"myapplication", "Signal":"SIGTERM"}

AWSFIS-Run-Memory-Stress

Runs memory stress on an instance using the stress-ng tool. Uses the AWSFIS-Run-Memory-Stress SSM document with the following document parameters:

  • DurationSeconds – Required. The duration of the memory stress test, in seconds.

  • Workers – Optional. The number of virtual memory stressors. The default is 1.

  • Percent – Required. The percentage of virtual memory to use during the memory stress test.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependency is stress-ng.

The following is an example.

{"Percent":"80", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Network-Blackhole-Port

Drops inbound or outbound traffic for the protocol and port using the iptables tool. Uses the AWSFIS-Run-Network-Blackhole-Port SSM document with the following document parameters:

  • Protocol – Required. The protocol. The possible values are tcp and udp.

  • Port – Required. The port number.

  • TrafficType – Optional. The type of traffic. The possible values are ingress and egress. The default is ingress.

  • DurationSeconds – Required. The duration of the network blackhole test, in seconds.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependencies are iptables and atd.
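The allowed values listed above can be checked before a template is created. The following is a sketch only: the function name is hypothetical, the 1 to 65535 port range is an assumption based on standard TCP/UDP port numbering, and treating "False" as the opposite of "True" for InstallDependencies is also an assumption that this page does not state.

```python
def blackhole_port_parameters(protocol, port, duration_seconds,
                              traffic_type="ingress", install_dependencies=True):
    """Validate values and return document parameters as a dict of strings."""
    if protocol not in ("tcp", "udp"):
        raise ValueError("Protocol must be tcp or udp")
    if traffic_type not in ("ingress", "egress"):
        raise ValueError("TrafficType must be ingress or egress")
    if not 1 <= int(port) <= 65535:  # assumed valid port range
        raise ValueError("Port must be between 1 and 65535")
    if int(duration_seconds) <= 0:  # assumed: a duration must be positive
        raise ValueError("DurationSeconds must be positive")
    return {
        "Protocol": protocol,
        "Port": str(port),
        "TrafficType": traffic_type,
        "DurationSeconds": str(duration_seconds),
        # Assumed: "False" disables dependency installation.
        "InstallDependencies": "True" if install_dependencies else "False",
    }


params = blackhole_port_parameters("tcp", 8080, 60, traffic_type="egress")
print(params)
```

All values are returned as strings so that the result can be serialized directly into documentParameters.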

The following is an example.

{"Protocol":"tcp", "Port":"8080", "TrafficType":"egress", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Network-Latency

Adds latency to the network interface using the tc tool. Uses the AWSFIS-Run-Network-Latency SSM document with the following document parameters:

  • Interface – Optional. The network interface. The default is eth0.

  • DelayMilliseconds – Optional. The delay, in milliseconds. The default is 200.

  • DurationSeconds – Required. The duration of the network latency test, in seconds.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependencies are tc and atd.

The following is an example.

{"DelayMilliseconds":"200", "Interface":"eth0", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Network-Latency-Sources

Adds latency and jitter to the network interface using the tc tool for traffic to or from specific sources. Uses the AWSFIS-Run-Network-Latency-Sources SSM document with the following document parameters:

  • Interface – Optional. The network interface. The default is eth0.

  • DelayMilliseconds – Optional. The delay, in milliseconds. The default is 200.

  • JitterMilliseconds – Optional. The jitter, in milliseconds. The default is 10.

  • Sources – Required. The sources, separated by commas. The possible values are: an IPv4 address, an IPv4 CIDR, a domain name, DYNAMODB, and S3.

  • TrafficType – Optional. The type of traffic. The possible values are ingress and egress. The default is ingress.

  • DurationSeconds – Required. The duration of the network latency test, in seconds.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependencies are tc, atd, and jq.
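Because Sources mixes several kinds of values in one comma-separated string, it can help to classify each entry before running an experiment. The sketch below uses Python's ipaddress module for this. The function names and the simplified domain-name pattern are assumptions for illustration, and the SSM document itself may accept or reject values differently.

```python
import ipaddress
import re

# Simplified hostname pattern; an assumption for illustration only.
_DOMAIN = re.compile(
    r"^(?=.{1,253}$)([A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,63}$"
)


def classify_source(entry):
    """Classify one Sources entry, or raise ValueError if it matches no kind."""
    if entry in ("DYNAMODB", "S3"):
        return "service"
    try:
        ipaddress.IPv4Address(entry)
        return "ipv4-address"
    except ValueError:
        pass
    if "/" in entry:
        try:
            ipaddress.IPv4Network(entry, strict=False)
            return "ipv4-cidr"
        except ValueError:
            pass
    if _DOMAIN.match(entry):
        return "domain"
    raise ValueError(f"Unsupported source: {entry!r}")


def classify_sources(sources):
    """Split a comma-separated Sources value and classify every entry."""
    return [(entry, classify_source(entry)) for entry in sources.split(",")]


print(classify_sources("S3,www.example.com,72.21.198.67"))
```

Entries are not trimmed, so a value such as "S3, www.example.com" is rejected here; whether the SSM document tolerates spaces is not stated on this page.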

The following is an example.

{"DelayMilliseconds":"200", "JitterMilliseconds":"15", "Sources":"S3,www.example.com,72.21.198.67", "Interface":"eth0", "TrafficType":"egress", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Network-Packet-Loss

Adds packet loss to the network interface using the tc tool. Uses the AWSFIS-Run-Network-Packet-Loss SSM document with the following document parameters:

  • Interface – Optional. The network interface. The default is eth0.

  • LossPercent – Optional. The percentage of packet loss. The default is 7%.

  • DurationSeconds – Required. The duration of the network packet loss test, in seconds.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependencies are tc and atd.

The following is an example.

{"LossPercent":"15", "Interface":"eth0", "DurationSeconds":"60", "InstallDependencies":"True"}

AWSFIS-Run-Network-Packet-Loss-Sources

Adds packet loss to the network interface using the tc tool for traffic to or from specific sources. Uses the AWSFIS-Run-Network-Packet-Loss-Sources SSM document with the following document parameters:

  • Interface – Optional. The network interface. The default is eth0.

  • LossPercent – Optional. The percentage of packet loss. The default is 7%.

  • Sources – Required. The sources, separated by commas. The possible values are: an IPv4 address, an IPv4 CIDR, a domain name, DYNAMODB, and S3.

  • TrafficType – Optional. The type of traffic. The possible values are ingress and egress. The default is ingress.

  • DurationSeconds – Required. The duration of the network packet loss test, in seconds.

  • InstallDependencies – Optional. If the value is True, Systems Manager installs the required dependencies on the target instances if they are not already installed. The default is True. The dependencies are tc, atd, and jq.

The following is an example.

{"LossPercent":"15", "Sources":"S3,www.example.com,72.21.198.67", "Interface":"eth0", "TrafficType":"egress", "DurationSeconds":"60", "InstallDependencies":"True"}
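The required parameters listed for each pre-configured document can be collected into a lookup table and checked before an experiment template is created. The following is a sketch: the table is transcribed from the parameter lists on this page, and the missing_required helper is a hypothetical name, not part of any AWS SDK.

```python
# Required parameters per document, as listed on this page.
REQUIRED_PARAMETERS = {
    "AWSFIS-Run-CPU-Stress": {"DurationSeconds"},
    "AWSFIS-Run-IO-Stress": {"DurationSeconds"},
    "AWSFIS-Run-Kill-Process": {"ProcessName"},
    "AWSFIS-Run-Memory-Stress": {"DurationSeconds", "Percent"},
    "AWSFIS-Run-Network-Blackhole-Port": {"Protocol", "Port", "DurationSeconds"},
    "AWSFIS-Run-Network-Latency": {"DurationSeconds"},
    "AWSFIS-Run-Network-Latency-Sources": {"Sources", "DurationSeconds"},
    "AWSFIS-Run-Network-Packet-Loss": {"DurationSeconds"},
    "AWSFIS-Run-Network-Packet-Loss-Sources": {"Sources", "DurationSeconds"},
}


def missing_required(document_name, document_parameters):
    """Return the sorted required parameter names absent from document_parameters."""
    required = REQUIRED_PARAMETERS[document_name]
    return sorted(required - set(document_parameters))


# The memory stress document requires Percent as well as DurationSeconds.
print(missing_required("AWSFIS-Run-Memory-Stress", {"DurationSeconds": "60"}))
```

An unknown document name raises KeyError, which is deliberate: the table covers only the pre-configured AWSFIS- documents described here.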