Amazon Simple Storage Service
Developer Guide (API Version 2006-03-01)

Amazon S3 Storage Inventory

Amazon S3 inventory is one of the tools Amazon S3 provides to help manage your storage. You can simplify and speed up business workflows and big data jobs using the Amazon S3 inventory, which provides a scheduled alternative to the Amazon S3 synchronous List API operation. Amazon S3 inventory provides a comma-separated values (CSV) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix (that is, objects that have names that begin with a common string).

You can configure what object metadata to include in the inventory, whether to list all object versions or only current versions, where to store the inventory list flat-file output, and whether to generate the inventory on a daily or weekly basis. You can have multiple inventory lists configured for a bucket. For information about pricing, see Amazon S3 Pricing.

How Do I Set Up Amazon S3 Inventory?

This section describes the inventory source and destination buckets, and then provides the steps for setting up an inventory.

Amazon S3 Inventory Source and Destination Buckets

The bucket that the inventory lists the objects for is called the source bucket. The bucket where the inventory list flat file is stored is called the destination bucket.

The Source Bucket

The Amazon S3 inventory lists the objects that are stored in the source bucket. You can get inventory lists for an entire bucket or filtered by (object key name) prefix.

The source bucket:

  • Contains the objects that are listed in the inventory.

  • Contains the configuration for the inventory.

The Destination Bucket

Amazon S3 inventory list flat files are written to the destination bucket. You can specify a destination (object key name) prefix in the inventory configuration to group all the inventory list files in a common location within the destination bucket.

The destination bucket:

  • Contains the inventory flat file lists.

  • Contains the manifest.json file that lists all the flat file inventory lists that are stored in the destination bucket. For more information, see What is an Inventory Manifest?.

  • Must have a bucket policy to give Amazon S3 permission to verify ownership of the bucket and permission to write files to the bucket.

  • Must be in the same region as the source bucket.

  • Can be the same as the source bucket.

  • Can be owned by a different AWS account than the account that owns the source bucket.

Setting Up Amazon S3 Inventory

Amazon S3 inventory helps you manage your storage by creating lists of the objects in an S3 bucket on a defined schedule. The inventory lists are published to flat files in a destination bucket.

To set up Amazon S3 inventory for an S3 bucket:

  1. Add a bucket policy for the destination bucket.

    You must create a bucket policy on the destination bucket to grant permissions to Amazon S3 to write objects to the bucket in the defined location. For an example policy, see Granting Permissions for Amazon S3 Inventory and Amazon S3 Analytics.

  2. Configure an inventory to list the objects in a source bucket and publish the list to a destination bucket.

    When you configure an inventory list for a source bucket, you specify the destination bucket where you want the list to be stored, and whether to generate the list daily or weekly. You can also configure what object metadata to include and whether to list all object versions or only current versions. You can configure multiple inventory lists for a bucket.
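The two steps above can be sketched in Python as plain data structures. This is a minimal sketch, not the authoritative policy or configuration: the function names, the statement ID, and the bucket and account values are hypothetical, the policy shape follows the usual pattern of allowing the Amazon S3 service principal to write objects that originate from the source bucket, and the configuration keys are assumed to mirror the inventory configuration accepted by the AWS SDKs. Check both against the referenced example policy and the API reference before using them.

```python
import json


def build_destination_policy(source_bucket, destination_bucket, source_account_id):
    """Return a destination-bucket policy (as a dict) for inventory delivery.

    Assumed shape: a single Allow statement for the Amazon S3 service
    principal, restricted to writes coming from the given source bucket
    and source account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InventoryExamplePolicy",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{destination_bucket}/*"],
                "Condition": {
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{source_bucket}"},
                    "StringEquals": {
                        "aws:SourceAccount": source_account_id,
                        "s3:x-amz-acl": "bucket-owner-full-control",
                    },
                },
            }
        ],
    }


def build_inventory_configuration(config_id, destination_bucket, frequency="Daily",
                                  all_versions=False, destination_prefix=None):
    """Return an inventory configuration dict (assumed SDK-style key names)."""
    if frequency not in ("Daily", "Weekly"):
        raise ValueError("frequency must be 'Daily' or 'Weekly'")
    s3_destination = {"Bucket": f"arn:aws:s3:::{destination_bucket}", "Format": "CSV"}
    if destination_prefix:
        s3_destination["Prefix"] = destination_prefix
    return {
        "Id": config_id,
        "IsEnabled": True,
        "Destination": {"S3BucketDestination": s3_destination},
        "IncludedObjectVersions": "All" if all_versions else "Current",
        "OptionalFields": ["Size", "LastModifiedDate", "ETag", "StorageClass"],
        "Schedule": {"Frequency": frequency},
    }


if __name__ == "__main__":
    # All names and the account ID below are placeholders.
    policy = build_destination_policy(
        "example-source-bucket", "example-inventory-destination-bucket", "111122223333")
    print(json.dumps(policy, indent=2))
    print(json.dumps(build_inventory_configuration(
        "example-config", "example-inventory-destination-bucket", "Weekly"), indent=2))
```

The functions only build dictionaries, so they can be inspected or serialized before anything is sent to Amazon S3. With an SDK such as boto3, the policy would typically be serialized with json.dumps and applied to the destination bucket, and the configuration passed to the inventory configuration call for the source bucket.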

What's Included in an Amazon S3 Inventory?

An inventory list flat file contains a list of the objects in the source bucket and metadata for each object. The inventory lists are stored in the destination bucket as a comma-separated values (CSV) file compressed with GZIP.

The inventory list contains a list of the objects in an S3 bucket and the following metadata for each listed object:

  • Bucket name – The name of the bucket that the inventory is for.

  • Key name – Object key name (or key) that uniquely identifies the object in the bucket.

  • Version ID – Object version ID. When you enable versioning on a bucket, Amazon S3 assigns a version number to objects added to the bucket. For more information, see Object Versioning. (This field is not included if the list is only for the current version of objects.)

  • IsLatest – Set to True if the object is the current version of the object. (This field is not included if the list is only for the current version of objects.)

  • Size – Object size in bytes.

  • Last modified date – Object creation date or the last modified date, whichever is later.

  • ETag – The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted.

  • Storage class – Storage class used for storing the object. For more information, see Storage Classes.

  • Multipart upload flag – Set to True if the object was uploaded as a multipart upload. For more information, see Multipart Upload Overview.

  • Delete marker – Set to True if the object is a delete marker. For more information, see Object Versioning. (This field is not included if the list is only for the current version of objects.)

  • Replication status – Set to PENDING, COMPLETED, FAILED, or REPLICA. For more information, see How to Find Replication Status of an Object.

The following is an example inventory list opened in a spreadsheet application. The heading row is only to help clarify the example; it is not included in the actual list.

We recommend that you create a lifecycle policy that deletes old inventory lists. For more information, see Object Lifecycle Management.
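Because the list files have no heading row, a reader has to supply the column names itself. The following is a minimal sketch of reading one GZIP-compressed CSV list file in Python. The column order is an assumption taken from the fileSchema string in the manifest example in this guide, for a list that includes all object versions; the function name and the sample row are made up for illustration.

```python
import csv
import gzip
import io

# Assumed column order for an all-versions inventory list. In practice,
# take the order from the fileSchema value in manifest.json.
ALL_VERSIONS_COLUMNS = [
    "Bucket", "Key", "VersionId", "IsLatest", "IsDeleteMarker", "Size",
    "LastModifiedDate", "ETag", "StorageClass", "MultipartUploaded",
    "ReplicationStatus",
]


def read_inventory_rows(gz_bytes, columns=ALL_VERSIONS_COLUMNS):
    """Yield one dict per listed object from a gzip-compressed CSV list file."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8", newline="") as handle:
        for values in csv.reader(handle):
            yield dict(zip(columns, values))


# A made-up single-row list file, built in memory so the sketch runs as is.
sample_line = (
    '"example-source-bucket","photos/cat.jpg","v1","true","false","1024",'
    '"2016-11-06T21:32:00.000Z","f11166069f1990abeb9c97ace9cdfabc",'
    '"STANDARD","false",""\n'
)
sample_gz = gzip.compress(sample_line.encode("utf-8"))

for row in read_inventory_rows(sample_gz):
    # Every value is read as text; convert Size explicitly when needed.
    print(row["Key"], int(row["Size"]), row["StorageClass"])
```

Passing the column list as a parameter keeps the reader usable for lists that cover only current versions, which omit the version-related fields.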

Inventory Consistency

Not all of your objects might appear in each inventory list. The inventory list provides eventual consistency for PUTs of both new objects and overwrites, and for DELETEs. Inventory lists are a rolling snapshot of bucket items, which are eventually consistent (that is, the list might not include recently added or deleted objects).

To validate the state of the object before taking action on it, we recommend that you perform a HEAD Object REST API request to retrieve metadata for the object, or check the object's properties in the Amazon S3 console. You can also check object metadata with the AWS CLI or the AWS SDKs. For more information, see HEAD Object in the Amazon Simple Storage Service API Reference.
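The check described above can be sketched as a small helper that compares an inventory row with the live object's metadata. This is an illustrative sketch: the function name is hypothetical, and the client is assumed to expose a head_object(Bucket=..., Key=...) call that returns a mapping with "ETag" and "ContentLength" entries, as a boto3 S3 client does. A missing object is expected to surface as an error raised by that call, which this sketch does not catch.

```python
def matches_inventory(s3_client, bucket, key, inventory_etag, inventory_size):
    """Return True if the live object still matches the inventory row.

    s3_client is assumed to behave like a boto3 S3 client. The ETag in a
    HEAD response is wrapped in double quotes, so quotes are stripped from
    both sides before comparing.
    """
    head = s3_client.head_object(Bucket=bucket, Key=key)
    live_etag = head["ETag"].strip('"')
    same_etag = live_etag == inventory_etag.strip('"')
    same_size = head["ContentLength"] == int(inventory_size)
    return same_etag and same_size
```

Taking the client as a parameter keeps the helper independent of any particular SDK setup and makes it easy to exercise with a stub object.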

Where are Inventory Lists Located?

The manifest files are published to the following location in the destination bucket when an inventory list is published.

destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.json
destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.checksum

  • destination-prefix is the (object key name) prefix set in the inventory configuration, which can be used to group all the inventory list files in a common location within the destination bucket.

  • source-bucket is the source bucket that the inventory list is for. It is added to prevent collisions when multiple inventory reports from different source buckets are sent to the same destination bucket.

  • config-ID is added to prevent collisions with multiple inventory reports from the same source bucket that are sent to the same destination bucket.

  • YYYY-MM-DDTHH-MMZ is the date and time when the inventory list is generated. For example, 2016-11-06T21-32Z.

  • manifest.json is the manifest file.

  • manifest.checksum is the MD5 of the content of the manifest.json file.
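As an illustration of the key layout described above, the following sketch assembles the two manifest keys from their parts. The function name and the example values are hypothetical, and the sketch assumes a non-empty destination prefix and a timezone-aware timestamp.

```python
from datetime import datetime, timezone


def manifest_keys(destination_prefix, source_bucket, config_id, generated_at):
    """Return the (manifest.json, manifest.checksum) keys for one inventory run.

    generated_at must be a timezone-aware datetime; it is converted to UTC
    and rendered in the YYYY-MM-DDTHH-MMZ form used in the key.
    """
    stamp = generated_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%MZ")
    base = f"{destination_prefix.rstrip('/')}/{source_bucket}/{config_id}/{stamp}"
    return f"{base}/manifest.json", f"{base}/manifest.checksum"


if __name__ == "__main__":
    json_key, checksum_key = manifest_keys(
        "Inventory", "example-source-bucket", "example-config",
        datetime(2016, 11, 6, 21, 32, tzinfo=timezone.utc))
    print(json_key)
    print(checksum_key)
```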

The inventory lists are published to the following location in the destination bucket on a daily or weekly basis.

destination-prefix/source-bucket/data/example-file-name.csv.gz
...
destination-prefix/source-bucket/data/example-file-name-1.csv.gz

  • destination-prefix is the (object key name) prefix set in the inventory configuration, which can be used to group all the inventory list files in a common location within the destination bucket.

  • source-bucket is the source bucket that the inventory list is for. It is added to prevent collisions when multiple inventory reports from different source buckets are sent to the same destination bucket.

  • example-file-name.csv.gz is one of the inventory list files.

What is an Inventory Manifest?

The manifest provides metadata and other basic information about an inventory list including source bucket name, destination bucket name, version of the inventory list, format and schema of the inventory list flat files, and the actual list of the inventory list files.

There is one manifest for each inventory, which is contained in a manifest.json file that lists the inventory lists that are in the destination bucket. Whenever a manifest.json file is written, it is accompanied by a manifest.checksum file that is the MD5 of the content of the manifest.json file. The manifest is updated whenever a new list is written to the destination bucket.

The following is an example of a manifest.json file.

{
    "sourceBucket": "example-source-bucket",
    "destinationBucket": "example-inventory-destination-bucket",
    "version": "2016-11-30",
    "fileFormat": "CSV",
    "fileSchema": "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, Size, LastModifiedDate, ETag, StorageClass, MultipartUploaded, ReplicationStatus",
    "files": [
        {
            "key": "Inventory/example-source-bucket/2016-11-06T21-32Z/files/939c6d46-85a9-4ba8-87bd-9db705a579ce.csv.gz",
            "size": 2147483647,
            "MD5checksum": "f11166069f1990abeb9c97ace9cdfabc",
            "inventoriedRecord": 58050695
        }
    ]
}
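A consumer might verify the manifest against its checksum file before trusting the file list. The sketch below shows one way to do that in Python. It assumes that manifest.checksum holds the MD5 digest as hexadecimal text, which is not stated explicitly here, and the function names are hypothetical. The field names (fileSchema, files, key, size, MD5checksum) come from the example manifest above.

```python
import hashlib
import json


def load_manifest(manifest_bytes, checksum_text):
    """Parse manifest.json after checking it against manifest.checksum.

    Assumption: checksum_text is the hex-encoded MD5 of the manifest bytes,
    possibly followed by whitespace.
    """
    actual = hashlib.md5(manifest_bytes).hexdigest()
    if actual != checksum_text.strip().lower():
        raise ValueError("manifest.json does not match manifest.checksum")
    return json.loads(manifest_bytes)


def schema_columns(manifest):
    """Split the fileSchema string into an ordered list of column names."""
    return [name.strip() for name in manifest["fileSchema"].split(",")]


def list_inventory_files(manifest):
    """Return (key, size, MD5checksum) for each inventory list file."""
    return [(entry["key"], entry["size"], entry["MD5checksum"])
            for entry in manifest["files"]]
```

The column list from schema_columns can then be used to label the values in each CSV list file, since those files carry no heading row.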

How Do I Know When an Inventory is Complete?

You can set up an Amazon S3 event notification to receive notice when the manifest checksum file is created, which indicates that an inventory list has been added to the destination bucket. The manifest is an up-to-date list of all the inventory lists at the destination location.

Amazon S3 can publish events to an Amazon Simple Notification Service (Amazon SNS) topic, an Amazon Simple Queue Service (Amazon SQS) queue, or an AWS Lambda function. For more information, see Configuring Amazon S3 Event Notifications.

The following notification configuration specifies that all manifest.checksum files newly added to the destination bucket are processed by the AWS Lambda function cloud-function-list-write.

<NotificationConfiguration>
  <CloudFunctionConfiguration>
    <Id>1</Id>
    <Filter>
      <S3Key>
        <FilterRule>
          <Name>prefix</Name>
          <Value>destination-prefix/source-bucket</Value>
        </FilterRule>
        <FilterRule>
          <Name>suffix</Name>
          <Value>checksum</Value>
        </FilterRule>
      </S3Key>
    </Filter>
    <CloudFunction>arn:aws:lambda:us-west-2:222233334444:cloud-function-list-write</CloudFunction>
    <Event>s3:ObjectCreated:*</Event>
  </CloudFunctionConfiguration>
</NotificationConfiguration>

For more information, see Using AWS Lambda with Amazon S3 in the AWS Lambda Developer Guide.
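On the receiving side, a Lambda function subscribed to these events could locate the manifest that each new checksum file belongs to. The following is a hedged sketch of such a handler in Python. It assumes the standard Amazon S3 event structure (a Records list with the bucket name and a URL-encoded object key), and what the function does with the result is left open.

```python
from urllib.parse import unquote_plus

CHECKSUM_NAME = "manifest.checksum"


def lambda_handler(event, context):
    """Return the manifest.json location for each manifest.checksum event."""
    manifests = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(CHECKSUM_NAME):
            manifest_key = key[: -len(CHECKSUM_NAME)] + "manifest.json"
            manifests.append({"bucket": bucket, "key": manifest_key})
    return manifests
```

Deriving the manifest.json key from the checksum key relies on the two files being written side by side under the same timestamped prefix, as described earlier in this section.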

The following are the REST operations used for storage inventory.