Managing workflows for post-upload processing - AWS Transfer Family

AWS Transfer Family supports managed workflows for file processing, enabling you to create, automate, and monitor file transfers over SFTP, FTPS, and FTP. Using this feature, you can securely and cost-effectively meet your compliance requirements for business-to-business (B2B) file exchanges by coordinating all the steps required for file processing, while benefiting from end-to-end auditing and visibility.

Managing your post-processing workflows using Transfer Family:

  • is easy to set up and customize

  • is auditable

  • supports a library of commonly used operations that you can use individually or chain together

  • supports searching and viewing the real-time status of in-progress and completed workflows

First, set up your workflow by defining a sequence of action steps, such as copying, tagging, or your own custom step, based on your requirements. Next, map the workflow to one of your AWS Transfer Family managed file servers so that, when a file arrives, the actions specified in the workflow are evaluated and triggered in real time.

Using the AWS Transfer Family console, you can search for and view the real-time status of in-progress workflows. You can view completed workflows in CloudWatch logs.

Create a workflow

When you're creating workflows, keep in mind that the maximum number of workflows per account is 10, and the maximum number of steps per workflow is 8.

Create a workflow

  1. Open the AWS Transfer Family console at https://console.aws.amazon.com/transfer/.

  2. In the left navigation pane, choose Workflows.

  3. On the Workflows page, choose Create workflow.

  4. On the Provide workflow description page, enter a description. This description appears on the Workflows page.

  5. On the Configure nominal steps page, choose Add step. Add one or more steps.

    1. Choose a step type. Choose from the available options.

    2. Choose Next, then configure parameters for the step.

    3. Choose Next, then review the details for the step.

    4. Choose OK to add the step and continue.

    5. Continue adding steps as needed. The maximum number of steps in a workflow is 8.

    6. After you have added all the necessary steps, choose Next to continue to Configure Exception Handlers.

      Note

      We recommend that you set up exception handlers, with steps to execute when your workflow fails, so that you are informed of failures in real time.

  6. You configure exception handlers by adding steps, in the same manner as described previously. If a file causes any step to throw an exception, your exception handlers are invoked one by one.

  7. Review the configuration, and choose Create workflow.
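The console procedure above can also be approximated programmatically. The following is a minimal sketch, assuming the boto3 Transfer Family `create_workflow` operation with its `Description`, `Steps`, and `OnExceptionSteps` parameters. The helper name `build_create_workflow_request`, the bucket name, and the tag values are hypothetical, and applying the 8-step limit to the nominal steps only is an assumption.

```python
MAX_STEPS_PER_WORKFLOW = 8  # limit stated in this guide


def build_create_workflow_request(description, steps, exception_steps=()):
    """Return keyword arguments for a Transfer Family create_workflow call."""
    if not steps:
        raise ValueError("a workflow needs at least one nominal step")
    if len(steps) > MAX_STEPS_PER_WORKFLOW:
        raise ValueError(
            f"a workflow can have at most {MAX_STEPS_PER_WORKFLOW} steps"
        )
    request = {"Description": description, "Steps": list(steps)}
    if exception_steps:
        # Exception handlers are ordinary steps that run when a nominal step fails.
        request["OnExceptionSteps"] = list(exception_steps)
    return request


# Nominal step: copy each uploaded file to a shared prefix (hypothetical bucket).
copy_step = {
    "Type": "COPY",
    "CopyStepDetails": {
        "Name": "copyToShared",
        "DestinationFileLocation": {
            "S3FileLocation": {"Bucket": "example-bucket", "Key": "shared_files/"}
        },
    },
}

# Exception handler: tag the file so that failures are easy to find later.
tag_on_failure = {
    "Type": "TAG",
    "TagStepDetails": {
        "Name": "tagFailed",
        "Tags": [{"Key": "status", "Value": "failed"}],
    },
}

request = build_create_workflow_request(
    "Copy uploads to shared_files", [copy_step], [tag_on_failure]
)
```

With credentials configured, passing the result to `boto3.client("transfer").create_workflow(**request)` would submit it; the response is expected to contain the new `WorkflowId`.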

Configure and execute a workflow

Configure Transfer Family to run a workflow on uploaded files

  1. Open the AWS Transfer Family console at https://console.aws.amazon.com/transfer/.

  2. In the left navigation pane, choose Servers. Choose the server that you want to use for your workflow.

  3. On the details page for the server, scroll down to the Additional details section, and then choose Edit.

    Note

    By default, servers do not have any associated workflows. You use the Additional details section to associate a workflow with the selected server.

  4. On the Edit additional details page, in the Post-upload file-processing section, select a workflow to be run on all uploads.

    Note

    If you do not already have a workflow, use the Create a Workflow link to create one.

    1. Select the workflow ID to use.

    2. Choose an execution role. This is the role that Managed File Transfer Workflows assumes when executing the workflow's steps. For details, see Construct an execution role for workflows.

    3. Choose Save.
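The same association can be sketched in code. The example below assumes the boto3 Transfer Family `update_server` operation and its `WorkflowDetails` parameter with an `OnUpload` list. The helper name, the `RecordingClient` stand-in, and the role ARN are hypothetical and exist only so that the sketch runs without AWS access.

```python
def attach_workflow_to_server(client, server_id, workflow_id, execution_role_arn):
    """Associate a workflow with a server so that it runs on every upload."""
    return client.update_server(
        ServerId=server_id,
        WorkflowDetails={
            "OnUpload": [
                {"WorkflowId": workflow_id, "ExecutionRole": execution_role_arn}
            ]
        },
    )


class RecordingClient:
    """Hypothetical stand-in for boto3.client("transfer"); it only records calls."""

    def __init__(self):
        self.calls = []

    def update_server(self, **kwargs):
        self.calls.append(kwargs)
        return {"ServerId": kwargs["ServerId"]}


client = RecordingClient()
result = attach_workflow_to_server(
    client,
    "s-1234567890abcdef0",
    "w-1234567890abcdef0",
    "arn:aws:iam::111122223333:role/example-workflow-role",  # hypothetical role
)
```

In real use, `client` would be `boto3.client("transfer")` rather than the recording stand-in.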

Execute a workflow

To execute a workflow, you upload a file to a Transfer Family server that you configured with an associated workflow.

    # Execute a workflow
    > sftp bob@s-1234567890abcdef0.server.transfer.us-east-1.amazonaws.com
    Connected to s-1234567890abcdef0.server.transfer.us-east-1.amazonaws.com.
    sftp> put doc1.pdf
    Uploading doc1.pdf to /my-cool-bucket/home/users/bob/doc1.pdf
    doc1.pdf                          100% 5013KB 601.0KB/s   00:08
    sftp> exit
    >

After your file has been uploaded, the action defined is performed on your file. For example, if your workflow contains a copy step, the file is copied to the location that you defined in that step. You can use Amazon CloudWatch Logs to track steps that executed and their status after your execution completes.

View workflow details

You can view details pertaining to previously created workflows or to workflow executions. To view these details, you can use the console or the AWS Command Line Interface (AWS CLI).

Console

View workflow details

  1. Open the AWS Transfer Family console at https://console.aws.amazon.com/transfer/.

  2. In the left navigation pane, choose Workflows.

  3. On the Workflows page, choose a workflow.

    The workflow details page opens.

CLI

To view the workflow details, use the describe-workflow CLI command, as shown in the following example.

    # View Workflow details
    > aws transfer describe-workflow --workflow-id w-1234567890abcdef0
    {
        "Workflow": {
            "Arn": "arn:aws:transfer:us-east-1:111122223333:workflow/w-1234567890abcdef0",
            "WorkflowId": "w-1234567890abcdef0",
            "Name": "Copy file to shared_files",
            "Steps": [
                {
                    "Type": "COPY",
                    "CopyStepDetails": {
                        "Name": "Copy to shared",
                        "DestinationFileLocation": {
                            "S3FileLocation": {
                                "Bucket": "my-cool-bucket",
                                "Key": "home/shared_files/"
                            }
                        }
                    }
                }
            ],
            "OnException": {}
        }
    }
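For scripted inspection, the CLI call above has a boto3 counterpart. The following sketch assumes the `describe_workflow` operation returns a `Workflow` object whose `Steps` entries carry a `Type` and a type-specific details key (for example `CopyStepDetails`). The helper name and the `StubClient` class are hypothetical; the stub returns a trimmed copy of the sample response so that the example runs offline.

```python
def summarize_workflow(client, workflow_id):
    """Return (step type, step name) pairs for a workflow's nominal steps."""
    workflow = client.describe_workflow(WorkflowId=workflow_id)["Workflow"]
    summary = []
    for step in workflow.get("Steps", []):
        step_type = step["Type"]
        # "COPY" -> "CopyStepDetails", "TAG" -> "TagStepDetails", and so on.
        details = step.get(f"{step_type.capitalize()}StepDetails", {})
        summary.append((step_type, details.get("Name")))
    return summary


class StubClient:
    """Hypothetical stand-in for boto3.client("transfer")."""

    def describe_workflow(self, WorkflowId):
        return {
            "Workflow": {
                "WorkflowId": WorkflowId,
                "Steps": [
                    {"Type": "COPY", "CopyStepDetails": {"Name": "Copy to shared"}}
                ],
            }
        }


summary = summarize_workflow(StubClient(), "w-1234567890abcdef0")
```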

Create a custom workflow step

With a custom workflow step, you must configure the Lambda function to call the SendWorkflowStepState API to notify the execution that the step completed with either a success or a failure. The response returned from the Lambda function cannot update the state of the step. If the Lambda function fails or times out, the step fails, and you see StepErrored in your CloudWatch logs. If the Lambda function is part of a nominal step and it reports a failure through the API, fails, or times out, the workflow continues with the exception handler steps and does not execute any remaining nominal steps.

When you call the SendWorkflowStepState API, you must send the following parameters:

    {
        "ExecutionId": "string",
        "Status": "string",
        "Token": "string",
        "WorkflowId": "string"
    }

You can extract the ExecutionId, Token, and WorkflowId from the input event that is passed when the Lambda function executes (examples are shown below). The status can be either SUCCESS or FAILURE.
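The extraction described above can be sketched as a small helper. It assumes the event layout used by the example events in this guide (`token` at the top level, `workflowId` and `executionId` under `serviceMetadata.executionDetails`). The helper name `build_step_state_params` is hypothetical, and the sample event is abbreviated.

```python
VALID_STATUSES = ("SUCCESS", "FAILURE")


def build_step_state_params(event, status):
    """Build the SendWorkflowStepState parameters from a custom-step event."""
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}, got {status!r}")
    details = event["serviceMetadata"]["executionDetails"]
    return {
        "WorkflowId": details["workflowId"],
        "ExecutionId": details["executionId"],
        "Token": event["token"],
        "Status": status,
    }


# Abbreviated sample event; a real event also carries transfer and file details.
sample_event = {
    "token": "MzI0Nzc4ZDktMGRmMi00MjFhLTgxMjUtYWZmZmRmODNkYjc0",
    "serviceMetadata": {
        "executionDetails": {
            "workflowId": "w-1234567890example",
            "executionId": "abcd1234-aa11-bb22-cc33-abcdef123456",
        }
    },
}

params = build_step_state_params(sample_event, "SUCCESS")
```

Inside a Lambda function, the resulting dictionary could be passed as keyword arguments to the boto3 `send_workflow_step_state` call.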

To call the SendWorkflowStepState API from your Lambda function, you need to use the latest version of the AWS SDK. To do so, you can deploy the Lambda function with its dependencies by using a ZIP archive (see custom package in the AWS Lambda Developer Guide), or you can create a Lambda layer that integrates the latest version of the AWS SDK into the Lambda function.

Furthermore, the Lambda execution role must have permissions to call the SendWorkflowStepState API, otherwise it fails. You can add the following policy to your Lambda execution role:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "transfer:SendWorkflowStepState",
                "Resource": "*"
            }
        ]
    }

Example Lambda function for a custom workflow step

The following Lambda function extracts the execution details from the event and calls the SendWorkflowStepState API to report the status of the step (SUCCESS or FAILURE) to the workflow. Before your function calls the SendWorkflowStepState API, you can have it take an action based on your workflow logic.

    import json
    import boto3

    transfer = boto3.client('transfer')

    def lambda_handler(event, context):
        print(json.dumps(event))

        # Call the SendWorkflowStepState API to notify the workflow of the
        # outcome of this step. Set Status to 'SUCCESS' or 'FAILURE'.
        response = transfer.send_workflow_step_state(
            WorkflowId=event['serviceMetadata']['executionDetails']['workflowId'],
            ExecutionId=event['serviceMetadata']['executionDetails']['executionId'],
            Token=event['token'],
            Status='SUCCESS'
        )

        print(json.dumps(response))

        return {
            'statusCode': 200,
            'body': json.dumps(response)
        }

Example custom step

The following examples show the event that is passed to the Lambda function for a custom step. One example is for a Transfer Family server where the domain is configured with Amazon S3, and the other is for a server where the domain uses Amazon EFS.

Custom step Amazon S3 domain

    {
        "token": "MzI0Nzc4ZDktMGRmMi00MjFhLTgxMjUtYWZmZmRmODNkYjc0",
        "serviceMetadata": {
            "executionDetails": {
                "workflowId": "w-1234567890example",
                "executionId": "abcd1234-aa11-bb22-cc33-abcdef123456"
            },
            "transferDetails": {
                "sessionId": "36688ff5d2deda8c",
                "userName": "myuser",
                "serverId": "s-example1234567890"
            }
        },
        "fileLocation": {
            "domain": "S3",
            "bucket": "mybucket",
            "key": "path/to/mykey",
            "eTag": "d8e8fca2dc0f896fd7cb4cb0031ba249",
            "versionId": null
        }
    }

Custom step Amazon EFS domain

    {
        "token": "MTg0N2Y3N2UtNWI5Ny00ZmZlLTk5YTgtZTU3YzViYjllNmZm",
        "serviceMetadata": {
            "executionDetails": {
                "workflowId": "w-1234567890example",
                "executionId": "abcd1234-aa11-bb22-cc33-abcdef123456"
            },
            "transferDetails": {
                "sessionId": "36688ff5d2deda8c",
                "userName": "myuser",
                "serverId": "s-example1234567890"
            }
        },
        "fileLocation": {
            "domain": "EFS",
            "fileSystemId": "fs-1234567",
            "path": "/path/to/myfile"
        }
    }
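Because the `fileLocation` object differs between the two domains, a custom step usually needs to branch on `domain` before it touches the file. The following sketch shows one way to do that, assuming only the fields that appear in the two example events; the helper name `locate_file` and the tuple return shape are hypothetical.

```python
def locate_file(event):
    """Return (domain, container, path) for the file that triggered the step.

    For S3 the container is the bucket and the path is the object key;
    for EFS the container is the file system ID and the path is the file path.
    """
    location = event["fileLocation"]
    domain = location["domain"]
    if domain == "S3":
        return (domain, location["bucket"], location["key"])
    if domain == "EFS":
        return (domain, location["fileSystemId"], location["path"])
    raise ValueError(f"unsupported storage domain: {domain!r}")


s3_event = {
    "fileLocation": {"domain": "S3", "bucket": "mybucket", "key": "path/to/mykey"}
}
efs_event = {
    "fileLocation": {
        "domain": "EFS",
        "fileSystemId": "fs-1234567",
        "path": "/path/to/myfile",
    }
}

s3_location = locate_file(s3_event)
efs_location = locate_file(efs_event)
```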

Create workflow procedure

To create a custom workflow step, you specify the Amazon Resource Name (ARN) for the Lambda function. With a custom workflow step, you can write your own code in AWS Lambda to process your incoming files. After incoming files are uploaded to Amazon S3 or Amazon EFS, AWS Transfer Family invokes a Lambda function that you provide using a custom step.

Create a custom workflow step

  1. Open the AWS Transfer Family console at https://console.aws.amazon.com/transfer/.

  2. In the left navigation pane, choose Workflows.

  3. On the Workflows page, choose Create workflow.

  4. Enter a description for the workflow, and then choose Add step in the Nominal steps section.

  5. On the Choose step type page, choose Custom Lambda, and then choose Next.

  6. On the Configure parameters page, enter a name for the step, the Amazon Resource Name (ARN) for the Lambda function to execute, and a timeout value, in minutes. The maximum value for the timeout is 30.

    Note

    To complete this step, you must have a working Lambda function ready.

  7. Choose Next, and then choose Create step.

The following sample AWS Lambda code takes the workflow event data and sends it to Amazon EventBridge, so that you can integrate the workflow event with your other applications.

    import json
    import boto3

    def lambda_handler(event, context):
        print(event)

        client = boto3.client('events')

        # The ID of the Transfer Family server that received the upload.
        server_id = event['serviceMetadata']['transferDetails']['serverId']

        # Relay the full workflow event to EventBridge as the event detail.
        response = client.put_events(
            Entries=[
                {
                    'Source': 'dev.transfer',
                    'Resources': [
                        server_id,
                    ],
                    'DetailType': 'Event message relayed from an AWS Transfer Workflow customstep',
                    'Detail': json.dumps(event)
                },
            ]
        )

        return {
            'statusCode': 200,
            'body': json.dumps(response)
        }

Exception handling for a workflow

You specify the error-handling steps for a workflow in the same manner as you specify the workflow steps themselves. If any errors occur during the workflow execution, the error-handling steps that you specified are executed. For information about troubleshooting workflows, see Troubleshoot workflow-related errors using CloudWatch.

Choose pre-defined steps

Copy, tag, and delete steps are pre-defined actions for post-upload file-processing. For example, you can create a workflow step to copy your uploaded file to another location. You can create this step for servers that use Amazon S3.

Note

Currently, copying and tagging are supported only on Amazon S3.

If your server uses Amazon S3, you must provide the bucket name and a key for the destination of the copy. The key can be either a path name or a file name. Whether the key is treated as a path name or a file name is determined by whether you end the key with the forward slash (/) character.

If the final character is /, your file is copied to the folder, and its name does not change. If the final character is alphanumeric, your uploaded file is renamed to the key value. In this case, if a file with that name already exists, the existing file is overwritten with the new one. For example, if your key value is shared-files/, your uploaded files are copied to the shared-files folder. If your key value is shared-files/today, every file you upload is copied to a file named today in the shared-files folder, and each succeeding file overwrites the previous one.
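The trailing-slash rule can be illustrated with a few lines of code. This is only a model of the behavior described above, not the service's implementation: the function name `resolve_copy_destination` is hypothetical, and treating every key that does not end in `/` as a file name is an assumption.

```python
import posixpath


def resolve_copy_destination(destination_key, uploaded_key):
    """Model where a copy step places a file, given the step's destination key."""
    if destination_key.endswith("/"):
        # Folder-style key: the file keeps its own name inside that folder.
        return destination_key + posixpath.basename(uploaded_key)
    # File-style key: every upload is written to this exact key,
    # so each new file overwrites the previous one.
    return destination_key


into_folder = resolve_copy_destination("shared-files/", "home/users/bob/doc1.pdf")
renamed = resolve_copy_destination("shared-files/today", "home/users/bob/doc1.pdf")
```

Here `into_folder` is `shared-files/doc1.pdf`, while `renamed` is `shared-files/today` regardless of the uploaded file's name.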

Example copy step

For example, consider the following workflow steps file:

    [
        {
            "Type": "COPY",
            "CopyStepDetails": {
                "Name": "copyToShared",
                "DestinationFileLocation": {
                    "S3FileLocation": {
                        "Bucket": "bob-bucket",
                        "Key": "shared_files/"
                    }
                }
            }
        }
    ]

After you create a workflow from this file and attach it to a Transfer Family server, all uploaded files are copied to the /bob-bucket/shared_files folder.

Parametrize the destination folder

You can also parametrize the destination by username. To do this, you include the ${transfer:UserName} variable in the Key. In this case, the DestinationFileLocation code looks similar to this:

    "DestinationFileLocation": {
        "S3FileLocation": {
            "Bucket": "main-bucket",
            "Key": "${transfer:UserName}/processed/"
        }
    }

In this example, the destination folder for each authorized user is main-bucket/username/processed/. For example, files uploaded by a user named bob-usa are copied into main-bucket/bob-usa/processed/.
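To preview which key a given user's files would land under, the substitution can be modeled locally. Transfer Family performs the real substitution itself; the function name `expand_destination_key` below is hypothetical and only mirrors the behavior described above.

```python
USERNAME_VARIABLE = "${transfer:UserName}"


def expand_destination_key(key_template, user_name):
    """Replace the username variable in a copy-step key with an actual user name."""
    return key_template.replace(USERNAME_VARIABLE, user_name)


expanded = expand_destination_key("${transfer:UserName}/processed/", "bob-usa")
```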

CloudWatch logging for a workflow

Amazon CloudWatch monitors your AWS resources and the applications that you run in the AWS Cloud in real time. You can use CloudWatch to collect and track metrics, which are variables that you can measure for your workflows. CloudWatch provides consolidated auditing and logging for workflow progress and results.

View Amazon CloudWatch logs for workflows

  1. Open the Amazon CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  2. In the left navigation pane, choose Logs, and then choose Log groups.

  3. On the Log groups page, on the navigation bar, choose the correct Region for your Transfer Family server.

  4. Choose the log group that corresponds to your server.

    For example, if your server ID is s-1234567890abcdef0, your log group is /aws/transfer/s-1234567890abcdef0.

  5. On the log page for your server, the most recent log streams are displayed. There are two kinds of log streams for the user that you are exploring: one for each SFTP session, and one for the workflow that is being executed for your server. The format of the workflow log stream name is username.workflowID.uniqueStreamSuffix.

    For example, if your user is bob-usa, you have the following log streams:

    bob-usa-east.1234567890abcdef0
    bob.w-abcdef01234567890.021345abcdef6789

    Note

    The 16-digit alphanumeric identifiers listed in this example are fictitious; the values that you see in Amazon CloudWatch are different.

The Log events page for bob-usa-east.1234567890abcdef0 displays the details for the user session, and the bob.w-abcdef01234567890.021345abcdef6789 log stream contains the details for the workflow.
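The naming conventions above lend themselves to small helpers when you script log lookups. The sketch below derives the log group name from a server ID and splits a workflow log stream name into its three parts. Both function names are hypothetical, and splitting from the right (so that a username containing dots stays intact) is an assumption about how names are composed.

```python
from typing import NamedTuple


class WorkflowLogStream(NamedTuple):
    username: str
    workflow_id: str
    suffix: str


def log_group_for_server(server_id):
    """Return the CloudWatch log group name for a Transfer Family server."""
    return f"/aws/transfer/{server_id}"


def parse_workflow_log_stream(stream_name):
    """Split 'username.workflowID.uniqueStreamSuffix' into its parts."""
    parts = stream_name.rsplit(".", 2)
    if len(parts) != 3 or not parts[1].startswith("w-"):
        raise ValueError(f"not a workflow log stream name: {stream_name!r}")
    return WorkflowLogStream(*parts)


group = log_group_for_server("s-1234567890abcdef0")
stream = parse_workflow_log_stream("bob.w-abcdef01234567890.021345abcdef6789")
```

A session stream name such as `bob-usa-east.1234567890abcdef0` has no workflow ID segment, so the parser rejects it with a `ValueError`.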

The following is a sample log stream for bob.w-abcdef01234567890.021345abcdef6789, based on a workflow (w-abcdef01234567890) that contains a copy step.

    {"type":"ExecutionStarted","details":
      {"input":
        {"initialFileLocation":{"bucket":"my-bucket","key":"bob/workflowSteps2.json","versionId":"version-id","etag":"etag-id"} } },
      "workflowId":"w-abcdef01234567890","executionId":"execution-id",
      "transferDetails":{"serverId":"s-server-id","username":"bob","sessionId":"session-id"} }

    {"type":"StepStarted","details":
      {"input":
        {"fileLocation":
          {"backingStore":"S3","bucket":"my-bucket","key":"bob/workflowSteps2.json","versionId":"version-id","etag":"etag-id"} },
        "stepType":"COPY","stepName":"copyToShared"},
      "workflowId":"w-abcdef01234567890","executionId":"execution-id",
      "transferDetails":{"serverId":"s-server-id","username":"bob","sessionId":"session-id"} }

    {"type":"StepCompleted","details":
      {"output":{},"stepType":"COPY","stepName":"copyToShared"},
      "workflowId":"w-abcdef01234567890","executionId":"execution-id",
      "transferDetails":{"serverId":"s-server-id","username":"bob","sessionId":"session-id"} }

    {"type":"ExecutionCompleted","details": {},
      "workflowId":"w-abcdef01234567890","executionId":"execution-id",
      "transferDetails":{"serverId":"s-server-id","username":"bob","sessionId":"session-id"} }
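Each log event in the stream is a JSON object, so a workflow's progress can be summarized with a short script. The sketch below assumes that every message is a complete JSON document with a `type` field and, for step events, a `stepName` under `details`, as in the sample above. The function name `summarize_execution` is hypothetical, and the sample messages are abbreviated.

```python
import json


def summarize_execution(log_messages):
    """Return (event type, step name) pairs; step name is None for non-step events."""
    summary = []
    for message in log_messages:
        event = json.loads(message)
        step_name = event.get("details", {}).get("stepName")
        summary.append((event["type"], step_name))
    return summary


# Abbreviated versions of the sample log events shown above.
messages = [
    '{"type":"ExecutionStarted","details":{"input":{}},"workflowId":"w-abcdef01234567890"}',
    '{"type":"StepStarted","details":{"stepType":"COPY","stepName":"copyToShared"},"workflowId":"w-abcdef01234567890"}',
    '{"type":"StepCompleted","details":{"output":{},"stepType":"COPY","stepName":"copyToShared"},"workflowId":"w-abcdef01234567890"}',
    '{"type":"ExecutionCompleted","details":{},"workflowId":"w-abcdef01234567890"}',
]

timeline = summarize_execution(messages)
```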

Construct an execution role for workflows

When you add a workflow to a server, you need to select an execution role. The server uses this role when it executes the workflow. If the role does not have the proper permissions, Transfer Family cannot run the workflow. This section describes one possible set of permissions that can be used to execute a workflow.

  • Create a new role, and add the AWS managed policy, AWSTransferFullAccess, to the role.

  • Create another policy with the following permissions, and attach it to your role.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObjectVersion",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:GetBucketLocation",
                    "s3:GetObjectVersion",
                    "s3:PutObjectTagging"
                ],
                "Resource": "*"
            }
        ]
    }

Save this role and specify it as the execution role when you add a workflow to a server.
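The policy above grants its actions on every resource. If your workflows only touch one bucket, you might prefer to narrow it. The following sketch generates a policy with the same actions, scoped to a single bucket; it is an assumption that one bucket is sufficient for your workflow, and the function name, the statement IDs, and the bucket name are hypothetical.

```python
import json

# Actions that apply to the bucket itself.
BUCKET_ACTIONS = ["s3:ListBucket", "s3:GetBucketLocation"]

# Actions that apply to objects inside the bucket.
OBJECT_ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObjectVersion",
    "s3:DeleteObject",
    "s3:GetObjectVersion",
    "s3:PutObjectTagging",
]


def scoped_workflow_policy(bucket):
    """Build an execution-role policy limited to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BucketLevelAccess",
                "Effect": "Allow",
                "Action": BUCKET_ACTIONS,
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ObjectLevelAccess",
                "Effect": "Allow",
                "Action": OBJECT_ACTIONS,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = scoped_workflow_policy("example-bucket")
policy_json = json.dumps(policy, indent=4)
```

The resulting `policy_json` string can be pasted into the IAM policy editor in place of the broader policy.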

Managed workflows restrictions and limitations

Restrictions

The following restrictions currently apply to post-upload processing workflows for AWS Transfer Family.

  • Cross-account and cross-Region Lambda functions are not supported.

  • For a copy step, the source and destination Amazon S3 buckets must be in the same Region.

  • Only asynchronous custom steps are supported.

  • Custom step timeouts are approximate. That is, it may take slightly longer to time out than specified. Additionally, the workflow is dependent upon the Lambda function. Therefore, if the function is delayed during execution, the workflow is not aware of the delay.

  • The only supported outputs from a custom step are Success or Failure.

  • Copying across storage services and Regions is not supported. You can, however, copy across accounts, provided that your AWS Identity and Access Management (IAM) policies are correctly configured.

  • If you exceed your throttling limit, Transfer Family doesn't add workflow operations to the queue.

  • Workflows are triggered upon successful file upload (end of the STOR data stream for FTP, or CLOSE after write for SFTP) and not on partial upload, such as an abrupt disconnect without closing.

  • Workflows are not triggered for files that have a size of 0. Files with a size greater than 0 do trigger the associated workflow.

Limitations

Additionally, the following functional limits apply to workflows for Transfer Family:

  • The number of workflows per account is limited to 10.

  • The maximum timeout for custom steps is 30 minutes.

  • The maximum number of steps in a workflow is 8.

  • The maximum number of tags per workflow is 50.