Monitoring AWS DataSync activity with Amazon CloudWatch

You can monitor AWS DataSync by using Amazon CloudWatch, which collects and processes raw data from DataSync into readable, near real-time metrics. These statistics are retained for 15 months.

By default, DataSync metrics data is automatically sent to CloudWatch in 5-minute intervals. For more information, see What is Amazon CloudWatch? in the Amazon CloudWatch User Guide.

CloudWatch metrics for DataSync

Amazon CloudWatch provides metrics that you can use to get information about DataSync performance and to troubleshoot issues. To see CloudWatch metrics for DataSync, you can use the following tools:

  • The CloudWatch console

  • The CloudWatch CLI

  • The CloudWatch API

  • The DataSync console (on the task execution's details page)

For more information, see Using Amazon CloudWatch metrics in the Amazon CloudWatch User Guide.

DataSync metrics use the aws/datasync namespace and provide metrics for the following dimensions:

  • AgentId – The unique ID of the agent.

  • TaskId – The unique ID of the task. It takes the form of task-01234567890abcdef.
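As a minimal sketch of how the namespace and the TaskId dimension fit together, the following Python builds the request parameters for the CloudWatch GetMetricStatistics operation. The helper names, the one-hour window, and the Sum statistic are illustrative assumptions, and the namespace is written here as AWS/DataSync on the assumption that this is how CloudWatch lists it; confirm the exact spelling in your account. The optional fetch_datapoints helper assumes the third-party boto3 package and configured credentials.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CloudWatch lists the namespace as "AWS/DataSync"; the text above
# writes it as aws/datasync. Confirm the exact spelling in your account.
NAMESPACE = "AWS/DataSync"


def build_metric_query(task_id, metric_name="BytesTransferred",
                       hours=1, period=300, statistic="Sum"):
    """Return keyword arguments for CloudWatch GetMetricStatistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": NAMESPACE,
        "MetricName": metric_name,
        "Dimensions": [{"Name": "TaskId", "Value": task_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds; 300 matches the 5-minute reporting interval
        "Statistics": [statistic],  # assumption: pick the statistic you need
    }


def fetch_datapoints(query):
    """Send the query with boto3 (third-party, imported only when called)."""
    import boto3

    response = boto3.client("cloudwatch").get_metric_statistics(**query)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Calling build_metric_query("task-01234567890abcdef") returns a dictionary that can be passed to fetch_datapoints; building the parameters separately keeps them easy to inspect before any request is sent.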

The aws/datasync namespace includes the following metrics.

Metric Description

BytesCompressed

The physical number of bytes transferred over the network after compression was applied. In most cases, this number is less than BytesTransferred unless the data isn't compressible.

Unit: Bytes

BytesPreparedDestination

The total number of bytes of data that are prepared at the destination location.

Unit: Bytes

BytesPreparedSource

The total number of bytes of data that are prepared at the source location.

Unit: Bytes

BytesTransferred

The total number of bytes that are involved in the transfer. For the number of bytes sent over the network, see BytesCompressed.

Unit: Bytes

BytesVerifiedDestination

The total number of bytes of data that are verified at the destination location.

Unit: Bytes

BytesVerifiedSource

The total number of bytes of data that are verified at the source location.

Unit: Bytes

BytesWritten

The total logical size of all files, objects, and directories that are transferred to the destination location.

Unit: Bytes

FilesPreparedDestination

The total number of files, objects, and directories that are prepared at the destination location.

Unit: Count

FilesPreparedSource

The total number of files, objects, and directories that are prepared at the source location.

Unit: Count

FilesTransferred

The actual number of files, objects, directories, and metadata that are transferred over the network. This value is calculated and updated on an ongoing basis during the transferring phase of your task execution. It's updated periodically when each piece of data is read from the source location and sent over the network.

If failures occur during a transfer, this value can be less than EstimatedFilesToTransfer. This value can also be greater than EstimatedFilesToTransfer in some cases. This value is implementation-specific for some location types, so don't use it as an indicator for a correct transfer total or to monitor your task execution.

Unit: Count

FilesVerifiedDestination

The total number of files, objects, and directories that are verified at the destination location.

Unit: Count

FilesVerifiedSource

The total number of files, objects, and directories that are verified at the source location.

Unit: Count
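Because BytesCompressed reports what was sent over the network and BytesTransferred reports the bytes involved in the transfer, the two can be combined into a rough compression-savings figure. The following is a minimal sketch; the function name and the choice to clamp the result at zero are assumptions made for illustration, not part of DataSync.

```python
def compression_savings(bytes_transferred, bytes_compressed):
    """Return the fraction of bytes saved on the network by compression.

    The result is between 0.0 (no savings) and 1.0. Inputs are the values of
    the BytesTransferred and BytesCompressed metrics for the same period.
    """
    if bytes_transferred <= 0:
        return 0.0  # nothing transferred, so there is nothing to compare
    saved = 1 - bytes_compressed / bytes_transferred
    # Data that isn't compressible can yield zero or negative savings.
    return max(saved, 0.0)
```

For example, 1000 bytes involved in a transfer with 250 bytes sent over the network corresponds to savings of 0.75.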

Allowing DataSync to upload logs to CloudWatch log groups

DataSync requires sufficient permissions to send logs to a CloudWatch log group. When you create a task by using the console, DataSync can often create an AWS Identity and Access Management (IAM) resource policy with the correct permissions for you.

If you want to use an existing CloudWatch log group or if you want to create your tasks programmatically, you must create this IAM resource policy yourself.

The following example is a resource policy that grants these permissions.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DataSyncLogsToCloudWatchLogs",
            "Effect": "Allow",
            "Action": [
                "logs:PutLogEvents",
                "logs:CreateLogStream"
            ],
            "Principal": {
                "Service": "datasync.amazonaws.com"
            },
            "Condition": {
                "ArnLike": {
                    "aws:SourceArn": [
                        "arn:aws:datasync:region:account-id:task/*"
                    ]
                },
                "StringEquals": {
                    "aws:SourceAccount": "account-id"
                }
            },
            "Resource": "arn:aws:logs:region:account-id:log-group:*:*"
        }
    ]
}

The policy uses Condition statements to help ensure that only DataSync tasks from the specified account have access to the specified CloudWatch log group. We recommend using the aws:SourceArn and aws:SourceAccount global condition context keys in these Condition statements to protect against the confused deputy problem. For more information, see Cross-service confused deputy prevention.

To specify the DataSync task or tasks, replace region with the Region code for the AWS Region where the tasks are located (for example, us-west-2), and replace account-id with the AWS account ID of the account that contains the tasks. To specify the CloudWatch log group, replace the same values in the Resource ARN. You can also modify the Resource statement to target specific log groups. For more information about using aws:SourceArn and aws:SourceAccount, see Global condition keys in the IAM User Guide.
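The placeholder substitution described above can be sketched in Python as follows. The function name and the optional log_group_name argument are hypothetical; the policy body mirrors the example resource policy, and the account ID in the usage note is a made-up value.

```python
import json


def build_datasync_log_policy(region, account_id, log_group_name="*"):
    """Return the example resource policy with region and account-id filled in.

    log_group_name defaults to "*" (all log groups, as in the example policy);
    pass a specific name to narrow the Resource statement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DataSyncLogsToCloudWatchLogs",
                "Effect": "Allow",
                "Action": ["logs:PutLogEvents", "logs:CreateLogStream"],
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Condition": {
                    "ArnLike": {
                        "aws:SourceArn": [
                            f"arn:aws:datasync:{region}:{account_id}:task/*"
                        ]
                    },
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
                "Resource": (
                    f"arn:aws:logs:{region}:{account_id}"
                    f":log-group:{log_group_name}:*"
                ),
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; replace them with your own Region and account ID.
    policy = build_datasync_log_policy("us-west-2", "111122223333")
    print(json.dumps(policy, indent=4))
```

Running the script prints a policy document that you can save to a file; generating it from one function keeps the Region and account ID consistent across the Condition and Resource elements.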

To apply the policy, save this policy statement to a file on your local computer. Then run the following AWS CLI command to apply the resource policy. To use this example command, replace full-path-to-policy-file with the path to the file that contains your policy statement.

aws logs put-resource-policy --policy-name trust-datasync --policy-document file://full-path-to-policy-file

Note

Run this command by using the same AWS account and AWS Region where you activated your DataSync agent.

For more information, see Working with log groups and log streams in the Amazon CloudWatch Logs User Guide.
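If you script this step, one option is to assemble the same put-resource-policy command in Python and run it with the AWS CLI. This is a sketch only: the helper names are hypothetical, and apply_policy assumes that the AWS CLI is installed and configured for the correct account and Region.

```python
import subprocess
from pathlib import Path


def put_resource_policy_command(policy_file, policy_name="trust-datasync"):
    """Build the argument list for the put-resource-policy command."""
    absolute = Path(policy_file).resolve()  # the CLI expects a full path
    return [
        "aws", "logs", "put-resource-policy",
        "--policy-name", policy_name,
        "--policy-document", f"file://{absolute}",
    ]


def apply_policy(policy_file):
    """Run the command; raises CalledProcessError if the CLI reports failure."""
    subprocess.run(put_resource_policy_command(policy_file), check=True)
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when the policy file path contains spaces.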

Configuring logging for your DataSync transfer task

You can publish details about your DataSync transfer task to a CloudWatch log group.

Before you begin

DataSync needs permission to upload logs to a CloudWatch log group. You can set up this permission through an IAM resource policy in either of the following ways:

  • When you create your task by using the console, DataSync can create a log group and the associated resource policy for you. DataSync can also apply this resource policy for you.

  • If you want to use an existing log group, see the example resource policy in Allowing DataSync to upload logs to CloudWatch log groups.

The following instructions describe how to configure CloudWatch logging when creating a task. You can also configure CloudWatch logging when editing a task.

  1. Open the AWS DataSync console at https://console.aws.amazon.com/datasync/.

  2. In the left navigation pane, expand Data transfer, then choose Tasks, and then choose Create task.

  3. Configure your task's source and destination locations.

    For more information, see Where can I transfer my data with AWS DataSync?

  4. On the Configure settings page, give your task a name, configure your task execution, configure your data transfer, and set a schedule. Optionally, add tags and configure a task report.

  5. Scroll down to the Logging section. For Log level, choose one of the following options:

    • Log basic information such as transfer errors – Publish logs with only basic information (such as transfer errors).

    • Log all transferred objects and files – Publish logs for all files or objects that your DataSync task transfers and performs data-integrity checks on.

    • Do not send logs to CloudWatch

  6. For CloudWatch log group, specify a log group that DataSync has permission to upload logs to by doing one of the following:

    • Choose Autogenerate to automatically create a log group that allows DataSync to upload logs to it.

    • Choose an existing log group in your current AWS Region.

      If you choose an existing log group, make sure that you have a resource policy that allows DataSync to upload logs to the log group.
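For tasks created programmatically, the console choices in steps 5 and 6 can be sketched as request parameters. The mapping below assumes that the three console log levels correspond to the API LogLevel values BASIC, TRANSFER, and OFF, and that the log group is passed in the CloudWatchLogGroupArn parameter. The short keys ("basic", "all", "none") and the function name are hypothetical labels introduced for this example.

```python
# Assumed mapping from the console log-level choices to API LogLevel values.
LOG_LEVELS = {
    "basic": "BASIC",    # Log basic information such as transfer errors
    "all": "TRANSFER",   # Log all transferred objects and files
    "none": "OFF",       # Do not send logs to CloudWatch
}


def build_logging_args(choice, log_group_arn=None):
    """Return the logging-related keyword arguments for a task request."""
    if choice not in LOG_LEVELS:
        raise ValueError(f"unknown log level choice: {choice!r}")
    level = LOG_LEVELS[choice]
    args = {"Options": {"LogLevel": level}}
    if level != "OFF":
        if not log_group_arn:
            raise ValueError("a log group ARN is required when logging is on")
        args["CloudWatchLogGroupArn"] = log_group_arn
    return args
```

The returned dictionary is meant to be merged with the other parameters of the request that creates or updates the task, such as the source and destination location ARNs. The log group that you reference still needs a resource policy that allows DataSync to upload logs to it.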

You can configure CloudWatch logging for your task by using the CloudWatchLogGroupArn parameter with any of the following operations: