
Amazon S3

The following are the requirements and connection instructions for using Amazon Simple Storage Service (Amazon S3) with Amazon AppFlow.

Note

You can use Amazon S3 as a source or a destination.

Requirements

  • Your S3 buckets must be in the same AWS Region as your console and flow.

  • If you use Amazon S3 as a source, every source file in the chosen S3 bucket must be in CSV format, with a header row of comma-separated field names as the first line of each file. Before you set up the flow, ensure that the source location contains at least one file in this format. You must place the CSV files inside a folder in the S3 bucket.

  • Each source file should not exceed 125 MB in size. However, you can upload multiple CSV files in the source location, and Amazon AppFlow will read from all of them to transfer data over a single flow run. You can check for any applicable destination data transfer limits in Quotas for Amazon AppFlow.

  • Amazon AppFlow does not support cross-account access to S3 buckets in order to prevent unauthorized access and potential security concerns.
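The source-file requirements above can be illustrated with a minimal sketch. The records, field names, and object key here are hypothetical; the point is the header row in the first line and the folder in the key.

```python
import csv
import io

# Hypothetical records to transfer; the field names become the required header row.
records = [
    {"id": "1", "name": "Alice", "email": "alice@example.com"},
    {"id": "2", "name": "Bob", "email": "bob@example.com"},
]

# Build the CSV in memory: the first line is the comma-separated list of field names.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "email"])
writer.writeheader()
writer.writerows(records)
csv_body = buf.getvalue()

# The file must live under a folder (prefix) inside the bucket, for example:
object_key = "source-data/contacts.csv"  # "source-data/" is the required folder
```

You could then upload `csv_body` to that key, for example with the boto3 S3 client's `put_object` method, before creating the flow.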

Connection instructions

To use Amazon S3 as a source or destination while creating a flow

  1. Open the Amazon AppFlow console at https://console.aws.amazon.com/appflow/.

  2. Choose Create flow.

  3. For Flow details, enter a name and description for the flow.

  4. (Optional) To use a customer managed CMK instead of the default AWS managed CMK, choose Data encryption, Customize encryption settings and then choose an existing CMK or create a new one.

  5. (Optional) To add a tag, choose Tags, Add tag and then enter the key name and value.

  6. Choose Next.

  7. Choose Amazon S3 from the Source name or Destination name dropdown list.

  8. Under Bucket details, select the S3 bucket that you're retrieving from or adding to. You can also specify a prefix, which is equivalent to a folder within the S3 bucket: the folder where your source files are located, or the folder where records will be written at the destination.

Now that you are connected to your S3 bucket, you can continue with the flow creation steps as described in Getting started with Amazon AppFlow.

Tip

If you aren’t connected successfully, ensure that you have followed the instructions in the Requirements section above.

Notes

  • When you use Amazon S3 as a source, you can run schedule-triggered flows at a maximum frequency of one flow run per minute.

  • When you use Amazon S3 as a destination, the following additional settings are available.


Data format preference

  • You can specify your preferred file format for the transferred records.

  • The following options are currently available: JSON (default), CSV, or Parquet.

Note

If you choose Parquet as the format for your destination file in Amazon S3, the option to aggregate all records into one file per flow run is not available. When you choose Parquet, Amazon AppFlow writes the output as strings and does not declare the data types as defined by the source.
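To illustrate the format choice, the following sketch renders the same record in two of the available formats, JSON (the default) and CSV. The record and field names are made up, and the exact output that Amazon AppFlow writes may differ.

```python
import csv
import io
import json

# A hypothetical transferred record.
record = {"id": "1", "name": "Alice"}

# JSON rendering (the default destination format).
json_line = json.dumps(record)

# CSV rendering: header row of field names, then the record.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name"])
writer.writeheader()
writer.writerow(record)
csv_text = buf.getvalue()
```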

Data transfer preference

  • You can choose between aggregation and no aggregation of records.

  • By default, Amazon AppFlow transfers data into multiple files per flow run.

  • Alternatively, you can choose to aggregate all transferred data into one file per flow run.

Filename preference

  • You can choose to add a timestamp to the filename.

  • Your filename will end with the file creation timestamp in YYYY-MM-DDThh:mm:ss format.

  • The creation date is in UTC time.
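The timestamp suffix described above can be reproduced with Python's `strftime`; this is illustrative only, and the flow name is hypothetical.

```python
from datetime import datetime, timezone

# Reproduce the UTC file-creation timestamp suffix (YYYY-MM-DDThh:mm:ss)
# using a fixed example time.
created_at = datetime(2023, 5, 1, 12, 30, 45, tzinfo=timezone.utc)
suffix = created_at.strftime("%Y-%m-%dT%H:%M:%S")

# Hypothetical resulting filename ending with the timestamp.
filename = f"example-flow-{suffix}"
```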

Folder structure preference

  • You can choose to place the file in a timestamped folder.

  • You can choose your preferred level of granularity (year, month, week, day, or minute).

  • The granularity that you choose determines the naming format of the folder.

  • The timestamp is in UTC time.
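As a rough sketch of how granularity affects the folder name, the following derives a UTC-timestamped prefix for several granularity levels. The exact folder-name format that Amazon AppFlow uses is not specified here, so these patterns are illustrative assumptions.

```python
from datetime import datetime, timezone

# Assumed (illustrative) timestamp patterns per granularity level.
fmt_by_granularity = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "minute": "%Y-%m-%dT%H:%M",
}

# Fixed example time in UTC.
now = datetime(2023, 5, 1, 12, 30, tzinfo=timezone.utc)
folders = {g: now.strftime(f) for g, f in fmt_by_granularity.items()}
```

Coarser granularity groups more flow runs into the same folder; minute-level granularity gives nearly one folder per run.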
