Monitoring AWS DataSync activity with Amazon CloudWatch
You can monitor AWS DataSync using Amazon CloudWatch, which collects and processes raw data from DataSync into readable, near real-time metrics. These statistics are retained for a period of 15 months.
By default, DataSync metrics data is automatically sent to CloudWatch in 5-minute intervals. For more information, see What is Amazon CloudWatch? in the Amazon CloudWatch User Guide.
Amazon CloudWatch metrics for DataSync
Amazon CloudWatch provides metrics that you can use to get information about DataSync performance and troubleshoot issues. You can see CloudWatch metrics for DataSync by using the following tools:
- CloudWatch console
- CloudWatch CLI
- CloudWatch API
- DataSync console (task execution page)
For information, see Using Amazon CloudWatch metrics in the Amazon CloudWatch User Guide.
DataSync metrics use the AWS/DataSync namespace and provide metrics for the following dimensions:
- AgentId – The unique ID of the agent.
- TaskId – The unique ID of the task. It takes the form of task-01234567890abcdef.
The AWS/DataSync namespace includes the following metrics.
Metric | Description |
---|---|
BytesCompressed | The physical number of bytes transferred over the network after compression was applied. In most cases, this number is less than BytesTransferred. Unit: Bytes |
BytesPreparedDestination | The total number of bytes of data that are prepared at the destination location. Unit: Bytes |
BytesPreparedSource | The total number of bytes of data that are prepared at the source location. Unit: Bytes |
BytesTransferred | The total number of bytes that are involved in the transfer. For the number of bytes sent over the network, see BytesCompressed. Unit: Bytes |
BytesVerifiedDestination | The total number of bytes of data that are verified at the destination location. Unit: Bytes |
BytesVerifiedSource | The total number of bytes of data that are verified at the source location. Unit: Bytes |
BytesWritten | The total logical size of all files that have been transferred to the destination location. Unit: Bytes |
FilesPreparedDestination | The total number of files that are prepared at the destination location. Unit: Count |
FilesPreparedSource | The total number of files that are prepared at the source location. Unit: Count |
FilesTransferred | The actual number of files or metadata that were transferred over the network. This value is calculated and updated on an ongoing basis during the TRANSFERRING phase. If failures occur during a transfer, this value can be less than EstimatedFilesToTransfer. Unit: Count |
FilesVerifiedDestination | The total number of files that are verified at the destination location. Unit: Count |
FilesVerifiedSource | The total number of files that are verified at the source location. Unit: Count |
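As a minimal sketch of reading one of these metrics programmatically, the following Python example builds the parameters for the CloudWatch GetMetricStatistics operation and passes them to boto3. It assumes that boto3 is installed and that credentials are configured. The task ID, the one-hour window, the BytesTransferred metric, and the choice of the Sum statistic are illustrative assumptions rather than values prescribed by this guide.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(task_id, metric_name="BytesTransferred", hours=1, now=None):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    task_id is a placeholder such as 'task-01234567890abcdef'.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DataSync",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "TaskId", "Value": task_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # matches the default 5-minute reporting interval
        "Statistics": ["Sum"],  # assumption: Sum suits byte and file counters
    }


def fetch_datapoints(task_id, **kwargs):
    """Call CloudWatch and return datapoints sorted by timestamp.

    Requires boto3 and AWS credentials.
    """
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("cloudwatch")
    response = client.get_metric_statistics(**build_metric_query(task_id, **kwargs))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Keeping the parameter builder separate from the network call makes it possible to inspect the query without an AWS account.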
Amazon EventBridge events for DataSync
Amazon EventBridge events describe changes in DataSync resources. You can set up rules to match these events and route them to one or more target functions or streams. Events are emitted on a best-effort basis.
DataSync transfer events
The following EventBridge events are available for DataSync transfers.
Agent state changes | |
---|---|
Event | Description |
Online | The agent is configured properly and is available to use. This status is the normal running status for an agent. |
Offline | The agent's VM is turned off or the agent is in an unhealthy state and has been out of contact with the service for 5 minutes or longer. When the issue that caused the unhealthy state is resolved, the agent returns to ONLINE status. |
Location state changes | |
Event | Description |
Adding | DataSync is adding a location. |
Available | The location is created and is available to use. |
Task state changes | |
Event | Description |
Available | The task was created and is ready to start. |
Running | The task is in progress and functioning properly. |
Unavailable | The task isn't configured properly and can't be used. You may see this when an agent associated with the task goes offline. |
Queued | Another task is running and using the same agent. DataSync runs tasks in series (first in, first out). |
Task execution state changes | |
Event | Description |
Queueing | DataSync is waiting for another task that's using the same agent to finish. |
Launching | DataSync is initializing the task execution. |
Preparing | DataSync is determining which files need to be transferred. |
Transferring | DataSync is performing the actual transfer of your data. |
Verifying | DataSync performs a full data and metadata integrity verification to ensure that the data in your destination is an exact copy of your source. |
Success | The transfer is successful. |
Error | The transfer failed. |
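The following Python sketch shows one way to build an EventBridge event pattern that matches failed task executions, together with a simplified local matcher for trying the pattern against a sample event. The source value aws.datasync is the standard source for DataSync events. The detail-type string, the State field name, and the uppercase ERROR value are assumptions that you should confirm against a real event from your account. The matcher covers only exact-value matching, not the full EventBridge pattern syntax.

```python
import json


def build_event_pattern(detail_type, states=None):
    """Return an EventBridge event pattern dict for DataSync events."""
    pattern = {"source": ["aws.datasync"], "detail-type": [detail_type]}
    if states:
        pattern["detail"] = {"State": list(states)}
    return pattern


def matches(pattern, event):
    """Simplified local check: every pattern key must match the event.

    Handles only exact-value lists and nested dicts, not prefix,
    numeric, or anything-but matching.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Assumed detail-type string and detail field name; confirm with a real event.
pattern = build_event_pattern("DataSync Task Execution State Change", ["ERROR"])
print(json.dumps(pattern, indent=2))
```

The printed JSON can be pasted into an EventBridge rule as its event pattern once the assumed strings have been verified.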
DataSync Discovery events
The following EventBridge events are available for DataSync Discovery.
Storage system state changes | |
---|---|
Event | Description |
Storage System Connectivity Status Change | The connection between your DataSync agent and on-premises storage system changed. For details, see your CloudWatch logs. |
Discovery job state changes | |
Event | Description |
Discovery Job State Change | The status of your discovery job changed. For more information, see discovery job statuses. |
Discovery Job Expiration Soon | Your discovery job expires soon. This includes any information the discovery job collected about your on-premises storage system. Before the job expires, you can export collected data by using the DescribeStorageSystemResources and DescribeStorageSystemResourceMetrics operations. |
Allowing DataSync to upload logs to CloudWatch log groups
DataSync requires sufficient permissions to send logs to your CloudWatch log group. When you create a task using the console, DataSync can automatically create an IAM resource policy with the correct permissions for you.
The following example is a resource policy that grants these permissions.
{
    "Statement": [
        {
            "Sid": "DataSyncLogsToCloudWatchLogs",
            "Effect": "Allow",
            "Action": [
                "logs:PutLogEvents",
                "logs:CreateLogStream"
            ],
            "Principal": {
                "Service": "datasync.amazonaws.com"
            },
            "Condition": {
                "ArnLike": {
                    "aws:SourceArn": [
                        "arn:aws:datasync:region:account-id:task/*"
                    ]
                },
                "StringEquals": {
                    "aws:SourceAccount": "account-id"
                }
            },
            "Resource": "arn:aws:logs:region:account-id:log-group:*:*"
        }
    ],
    "Version": "2012-10-17"
}
The policy uses condition statements to ensure that only DataSync tasks from the specified account have access to the specified CloudWatch log group. We recommend using the aws:SourceArn and aws:SourceAccount global condition context keys in these condition statements to protect against the confused deputy problem. For more information, see Cross-service confused deputy prevention.
To specify the DataSync task or tasks, replace region with the Region code for the AWS Region where the tasks are located and replace account-id with the AWS account ID of the account that contains the tasks. To specify the CloudWatch log group, replace the same values. You can also modify the Resource statement to target specific log groups. For more information about using SourceArn and SourceAccount, see Global condition keys in the IAM User Guide.
To apply the policy, save this policy statement to a file on your local computer. Then run the following AWS CLI command to apply the resource policy:
aws logs put-resource-policy --policy-name trustDataSync --policy-document file://full-path-to-policy-file
Note
Run this command using the same AWS account and AWS Region where you activated your DataSync agent.
For information, see Working with log groups and log streams in the Amazon CloudWatch Logs User Guide.
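If you prefer to generate and apply the policy from a script, the following Python sketch fills in the region and account-id placeholders and calls the CloudWatch Logs PutResourcePolicy operation through boto3. It assumes that boto3 is installed and that credentials are configured. The function names are hypothetical, and the policy name trustDataSync is taken from the CLI example above.

```python
import json


def build_log_policy(region, account_id):
    """Return the resource policy as a JSON string for one account and Region."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DataSyncLogsToCloudWatchLogs",
                "Effect": "Allow",
                "Action": ["logs:PutLogEvents", "logs:CreateLogStream"],
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Condition": {
                    "ArnLike": {
                        "aws:SourceArn": [f"arn:aws:datasync:{region}:{account_id}:task/*"]
                    },
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
                "Resource": f"arn:aws:logs:{region}:{account_id}:log-group:*:*",
            }
        ],
    }
    return json.dumps(policy)


def apply_log_policy(region, account_id, policy_name="trustDataSync"):
    """Apply the policy with boto3; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_log_policy works without boto3

    logs = boto3.client("logs", region_name=region)
    return logs.put_resource_policy(
        policyName=policy_name,
        policyDocument=build_log_policy(region, account_id),
    )
```

Generating the document in code avoids hand-editing the placeholders in several places, at the cost of one more script to maintain.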
Monitoring your DataSync task from the command line
You can track your DataSync tasks with the AWS Command Line Interface or the standard Unix watch utility.
Monitoring your task by using the AWS CLI
To monitor the status of your DataSync task with the CLI, use the describe-task-execution command.
aws datasync describe-task-execution \
  --task-execution-arn 'arn:aws:datasync:region:account-id:task/task-id/execution/task-execution-id'
This command returns information about a task execution similar to the following.
{
    "BytesCompressed": 0,
    "BytesTransferred": 0,
    "BytesWritten": 0,
    "EstimatedFilesToTransfer": 0,
    "EstimatedBytesToTransfer": 0,
    "FilesTransferred": 0,
    "Options": {
        "VerifyMode": "POINT_IN_TIME_CONSISTENT",
        "Atime": "BEST_EFFORT",
        "Mtime": "PRESERVE",
        "Uid": "INT_VALUE",
        "Gid": "INT_VALUE",
        "PreserveDevices": "NONE",
        "PosixPermissions": "PRESERVE",
        "PreserveDeletedFiles": "PRESERVE",
        "OverwriteMode": "NEVER",
        "TaskQueueing": "ENABLED"
    },
    "Result": {
        "PrepareDuration": 4355,
        "PrepareStatus": "Ok",
        "TransferDuration": 5889,
        "TransferStatus": "Ok",
        "VerifyDuration": 4538,
        "VerifyStatus": "Pending"
    },
    "StartTime": 1532658526.949,
    "Status": "VERIFYING",
    "TaskExecutionArn": "arn:aws:datasync:us-east-1:112233445566:task/task-08de6e6697796f026/execution/exec-04ce9d516d69bd52f"
}
If the task execution succeeds, the value of Status changes to SUCCESS. If the task execution fails, the result includes error codes that can help you troubleshoot issues. For information about the error codes, see TaskExecutionResultDetail in the DataSync API Reference.
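To wait for a task execution to finish from a script rather than rerunning the command by hand, you could poll the same operation. The following Python sketch is one possible approach. The describe argument is any callable that accepts TaskExecutionArn, such as the describe_task_execution method of a boto3 DataSync client. The set of terminal statuses (SUCCESS and ERROR), the polling interval, and the poll limit are assumptions chosen for illustration.

```python
import time

# Assumption: these two statuses mean the execution has finished.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def wait_for_execution(describe, task_execution_arn, interval=5.0,
                       max_polls=720, sleep=time.sleep):
    """Poll until the execution reaches a terminal status.

    Returns the last response, or raises TimeoutError after max_polls.
    """
    for _ in range(max_polls):
        response = describe(TaskExecutionArn=task_execution_arn)
        if response["Status"] in TERMINAL_STATUSES:
            return response
        sleep(interval)
    raise TimeoutError(f"{task_execution_arn} did not finish after {max_polls} polls")
```

With boto3 installed, a call might look like wait_for_execution(boto3.client("datasync").describe_task_execution, arn), where arn is your task execution ARN. Passing the callable in keeps the loop easy to test with a stub.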
Monitoring your task by using the watch utility
To monitor the progress of your task in real time from the command line, you can use the standard Unix watch utility. Task execution duration values are measured in milliseconds.
The watch utility doesn't recognize the DataSync alias. The following example shows how to invoke the CLI directly.
# pass '-n 1' to update every second and '-d' to highlight differences
$ watch -n 1 -d \
  "aws datasync describe-task-execution --task-execution-arn 'arn:aws:datasync:region:account-id:task/task-id/execution/task-execution-id'"
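Because the duration values are reported in milliseconds, a small helper can make the output easier to read at a glance. The following Python sketch condenses a describe-task-execution response into one line. The function name and the output format are hypothetical, and the sample values are copied from the example response shown earlier.

```python
def summarize_execution(execution):
    """Return a one-line progress summary for a describe-task-execution response."""
    result = execution.get("Result", {})
    # Duration values are reported in milliseconds; convert to seconds.
    phases = {
        name: result.get(f"{name}Duration", 0) / 1000
        for name in ("Prepare", "Transfer", "Verify")
    }
    estimated = execution.get("EstimatedBytesToTransfer", 0)
    transferred = execution.get("BytesTransferred", 0)
    # Avoid dividing by zero before an estimate is available.
    percent = 100 * transferred / estimated if estimated else 0.0
    timings = ", ".join(
        f"{name.lower()} {seconds:.3f}s" for name, seconds in phases.items()
    )
    return f"{execution.get('Status', 'UNKNOWN')}: {percent:.1f}% of bytes, {timings}"


sample = {
    "BytesTransferred": 0,
    "EstimatedBytesToTransfer": 0,
    "Result": {"PrepareDuration": 4355, "TransferDuration": 5889, "VerifyDuration": 4538},
    "Status": "VERIFYING",
}
print(summarize_execution(sample))
# prints: VERIFYING: 0.0% of bytes, prepare 4.355s, transfer 5.889s, verify 4.538s
```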