Raw layer S3 bucket - AWS Prescriptive Guidance

Raw layer S3 bucket

The raw data layer contains ingested data that has not been transformed and is in its original file format (for example, JSON or CSV). This data is typically organized by data source and the date that it was ingested into the raw data layer's Amazon Simple Storage Service (Amazon S3) bucket.

The following table provides the naming format, a description of each component of the name, and an example name for the S3 bucket in your raw data layer.

Naming format

s3://companyname-raw-awsregion-awsaccount|uniqid-env/source/source_region/table/year=yyyy/month=mm/day=dd/table_<yearmonthday>.avro|csv

Description

  • companyname – The organization’s name (optional).

  • awsregion – The AWS Region (for example, us-east-1 or sa-east-1).

  • awsaccount|uniqid – The unique identifier or AWS account ID.

  • env – The deployment environment (for example, dev, test, or prod).

  • source – The source or content (for example, MySQL database, ecommerce, or SAP).

  • source_region – The region of the data source (for example, us or asia).

  • table – The source table (for example, tb_customer, tb_transactions, or tb_products).

Example

s3://anycompany-raw-useast1-12345-dev/socialmedia/us/tb_products/year=2021/month=03/day=01/products_20210301.csv
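
The naming format above can be assembled programmatically. The following is a minimal sketch in Python, not part of the guidance itself: the function name raw_layer_uri and its parameter names are hypothetical, and it assumes that the ingestion date drives both the year/month/day partitions and the <yearmonthday> suffix of the file name.

```python
from datetime import date


def raw_layer_uri(
    company: str,
    aws_region: str,
    account_or_id: str,
    env: str,
    source: str,
    source_region: str,
    table: str,
    ingest_date: date,
    extension: str = "csv",
) -> str:
    """Build a raw-layer S3 URI that follows the naming format above.

    Hypothetical helper for illustration only.
    """
    # Bucket: companyname-raw-awsregion-awsaccount|uniqid-env.
    # The company name is optional, so an empty value is skipped.
    bucket = "-".join(
        part for part in (company, "raw", aws_region, account_or_id, env) if part
    )
    # Date partitions: year=yyyy/month=mm/day=dd (zero-padded month and day).
    partitions = (
        f"year={ingest_date:%Y}/month={ingest_date:%m}/day={ingest_date:%d}"
    )
    # Object name: table_<yearmonthday>.<extension>
    filename = f"{table}_{ingest_date:%Y%m%d}.{extension}"
    return f"s3://{bucket}/{source}/{source_region}/{table}/{partitions}/{filename}"


uri = raw_layer_uri(
    "anycompany", "useast1", "12345", "dev",
    "socialmedia", "us", "tb_products", date(2021, 3, 1),
)
print(uri)
```

Because this sketch applies the table_<yearmonthday> pattern literally, it names the object tb_products_20210301.csv, whereas the example in the table uses the shorter products_20210301.csv. Whichever convention you choose, apply it consistently across sources.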