AWS Data Pipeline is no longer available to new customers. Existing customers of AWS Data Pipeline can continue to use the service as normal.
Viewing Pipeline Logs
Pipeline-level logging is enabled at pipeline creation by specifying an Amazon S3 location, either in the console or through the pipelineLogUri field of the default object when using the SDK or CLI. Within that URI, the logs for each pipeline follow this directory structure:
pipelineId
     -componentName
          -instanceId
               -attemptId
For the pipeline df-00123456ABC7DEF8HIJK, the directory structure looks like:
df-00123456ABC7DEF8HIJK
     -ActivityId_fXNzc
          -@ActivityId_fXNzc_2014-05-01T00:00:00
               -@ActivityId_fXNzc_2014-05-01T00:00:00_Attempt=1
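To illustrate how the four levels combine, here is a minimal sketch that joins them into a single Amazon S3 prefix. It assumes each level is an ordinary key segment separated by "/" and that the leading "-" in the listing is only tree decoration. The function name attempt_log_prefix and the bucket amzn-s3-demo-bucket are hypothetical.

```shell
# Hypothetical helper: build the S3 prefix that holds one attempt's logs.
# Assumes each directory level is a key segment separated by "/".
attempt_log_prefix() {
  log_uri="${1%/}"   # drop one trailing slash from the pipelineLogUri, if any
  printf '%s/%s/%s/%s/%s/\n' "$log_uri" "$2" "$3" "$4" "$5"
}

# Example using the IDs shown above and a placeholder bucket name.
attempt_log_prefix "s3://amzn-s3-demo-bucket/logs/" \
  "df-00123456ABC7DEF8HIJK" \
  "ActivityId_fXNzc" \
  "@ActivityId_fXNzc_2014-05-01T00:00:00" \
  "@ActivityId_fXNzc_2014-05-01T00:00:00_Attempt=1"
```

The printed prefix could then be passed to a command such as aws s3 ls to list the objects stored for that attempt.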
For ShellCommandActivity, the stderr and stdout logs associated with the activity are stored in the directory for each attempt.
For resources such as EmrCluster, where an emrLogUri is set, that value takes precedence. Otherwise, resources (including TaskRunner logs for those resources) follow the pipeline logging structure described above.
To view logs for a given pipeline run:
First, retrieve the exact ObjectId by calling query-objects. For example:
aws datapipeline query-objects --pipeline-id <pipeline-id> --sphere ATTEMPT --region ap-northeast-1
query-objects is a paginated CLI and may return a pagination token if there are more executions for the given pipeline-id. You can use the token to go through all the attempts until you find the expected object. For example, a returned ObjectId would look like: @TableBackupActivity_2023-05-020T18:05:18_Attempt=1
Then, using the ObjectId, retrieve the log location:
aws datapipeline describe-objects --pipeline-id <pipeline-id> --object-ids <object-id> --query "pipelineObjects[].fields[?key=='@logLocation'].stringValue"
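The two steps above can be chained in a short script. The following is a sketch only: the function name list_attempt_log_locations is hypothetical, and it assumes a configured AWS CLI and relies on the CLI's default behavior of following pagination tokens automatically, so no token handling is shown.

```shell
# Sketch: print "<object-id><TAB><@logLocation>" for every attempt in a pipeline.
# Arguments: pipeline ID, region. The function name is hypothetical.
list_attempt_log_locations() {
  pipeline_id="$1"
  region="$2"
  # Step 1: list attempt object IDs, one per line.
  aws datapipeline query-objects \
      --pipeline-id "$pipeline_id" --sphere ATTEMPT --region "$region" \
      --query 'ids[]' --output text </dev/null |
  tr '\t' '\n' |
  while read -r object_id; do
    # Skip blank lines and the "None" placeholder that text output may print.
    case "$object_id" in ''|None) continue ;; esac
    # Step 2: look up the log location recorded for this attempt.
    location=$(aws datapipeline describe-objects \
      --pipeline-id "$pipeline_id" --region "$region" \
      --object-ids "$object_id" \
      --query "pipelineObjects[].fields[?key=='@logLocation'].stringValue" \
      --output text </dev/null)
    printf '%s\t%s\n' "$object_id" "$location"
  done
}

# Usage (not run here): list_attempt_log_locations <pipeline-id> ap-northeast-1
```

This makes one describe-objects call per attempt, which keeps the sketch simple but may be slow for pipelines with many executions.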
Error message of a failed activity
To get the error message, first get the ObjectId using query-objects. After retrieving the failed ObjectId, use the describe-objects CLI to get the actual error message.
aws datapipeline describe-objects --region ap-northeast-1 --pipeline-id <pipeline-id> --object-ids <object-id> --query "pipelineObjects[].fields[?key=='errorMessage'].stringValue"
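As a sketch of how the lookup of failed objects might be narrowed, the script below passes a selector to query-objects so that only attempts whose @status is FAILED are returned, then prints each error message. The --objects-query option and the selector JSON shape are based on the AWS CLI and API references rather than on this page, so treat them as assumptions to verify. The function name failed_attempt_errors is hypothetical.

```shell
# Sketch: print "<object-id><TAB><errorMessage>" for each FAILED attempt.
# Arguments: pipeline ID, region. The function name is hypothetical.
failed_attempt_errors() {
  pipeline_id="$1"
  region="$2"
  # Assumed selector: keep only objects whose @status field equals FAILED.
  selector='{"selectors":[{"fieldName":"@status","operator":{"type":"EQ","values":["FAILED"]}}]}'
  aws datapipeline query-objects \
      --pipeline-id "$pipeline_id" --sphere ATTEMPT --region "$region" \
      --objects-query "$selector" \
      --query 'ids[]' --output text </dev/null |
  tr '\t' '\n' |
  while read -r object_id; do
    # Skip blank lines and the "None" placeholder that text output may print.
    case "$object_id" in ''|None) continue ;; esac
    message=$(aws datapipeline describe-objects \
      --pipeline-id "$pipeline_id" --region "$region" \
      --object-ids "$object_id" \
      --query "pipelineObjects[].fields[?key=='errorMessage'].stringValue" \
      --output text </dev/null)
    printf '%s\t%s\n' "$object_id" "$message"
  done
}

# Usage (not run here): failed_attempt_errors <pipeline-id> ap-northeast-1
```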
Cancel, rerun, or mark an object as finished
Use the set-status CLI to cancel a running object, rerun a failed object, or mark a running object as finished.
First, get the object ID using the query-objects CLI. For example:
aws datapipeline query-objects --pipeline-id <pipeline-id> --sphere INSTANCE --region ap-northeast-1
Then use the set-status CLI to change the status of the desired object. For example:
aws datapipeline set-status --pipeline-id <pipeline-id> --region ap-northeast-1 --status TRY_CANCEL --object-ids <object-id>
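A small wrapper can guard against mistyped status values before calling set-status. This is a sketch: the function name set_instance_status is hypothetical. Only TRY_CANCEL appears in the example above; RERUN and MARK_FINISHED are the instance statuses listed in the SetStatus API reference and are assumed here to correspond to the rerun and mark-as-finished actions.

```shell
# Sketch: validate the requested status, then call set-status.
# Arguments: pipeline ID, region, status, object ID. Name is hypothetical.
set_instance_status() {
  pipeline_id="$1"
  region="$2"
  status="$3"
  object_id="$4"
  case "$status" in
    TRY_CANCEL|RERUN|MARK_FINISHED) ;;   # assumed set of instance statuses
    *)
      echo "unsupported status: $status" >&2
      return 2
      ;;
  esac
  aws datapipeline set-status \
    --pipeline-id "$pipeline_id" --region "$region" \
    --status "$status" --object-ids "$object_id"
}

# Usage (not run here): set_instance_status <pipeline-id> ap-northeast-1 RERUN <object-id>
```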