Managing collection-level document compression

Amazon DocumentDB collection-level document compression allows you to lower storage and IO costs by compressing the documents in your collections. You enable document compression at the collection level, and you can measure the resulting storage gains through compression metrics such as the storage size of compressed documents and the compression status. Amazon DocumentDB uses the LZ4 compression algorithm to compress documents.

Guidelines

The following guidelines apply to collection-level document compression:

  • Document compression is disabled by default.

  • Document compression cannot be applied to existing collections.

  • Document compression is only supported on Amazon DocumentDB version 5.0 and higher.

  • Amazon DocumentDB only compresses documents with a size of 2 KB and larger.

Enabling document compression

Enable document compression when you create a collection on Amazon DocumentDB by using the db.createCollection() method:

db.createCollection( "sample_collection", { storageEngine: { documentDB: { compression: { enable: <true | false> } } } } )
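As a concrete illustration of the syntax above, the following minimal sketch builds the options document with compression turned on. The helper name compressionOptions is hypothetical and exists only for this example. The collection name is the sample name used on this page, and the createCollection call is left as a comment because it assumes a mongo shell session where db is defined.

```javascript
// Hypothetical helper (not part of any API): builds the options document
// that db.createCollection() expects for collection-level compression.
function compressionOptions(enable) {
  return {
    storageEngine: {
      documentDB: {
        compression: { enable: enable }
      }
    }
  };
}

// Assumed usage in the mongo shell, where `db` is defined:
// db.createCollection("sample_collection", compressionOptions(true));
```

Passing false to the helper produces the options for an explicitly uncompressed collection, which matches the default behavior.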

Monitoring document compression

You can check whether a collection is compressed and calculate its compression ratio as follows.

View compression statistics by running the db.printCollectionStats() or db.collection.stats() command from the mongo shell. The output shows you the original size and compressed size that you can compare to analyze the storage gains from document compression. In this example, statistics for a collection named “sample_collection” are shown:

db.sample_collection.stats(1024*1024)
{
    "ns" : "test.sample_collection",
    "count" : 1000000,
    "size" : 3906.3,
    "avgObjSize" : 4096,
    "storageSize" : 1953.1,
    "compression" : {
        "enabled" : true,
        "threshold" : 2032
    }
    ...
}

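  • size - The original size of the collection, before compression. The unit of measure is bytes unless a scale factor is passed to stats(); the example above passes 1024*1024, so the value is reported in megabytes.

  • avgObjSize - The average document size before compression, rounded to one decimal place. The unit of measure is bytes.

  • storageSize - The storage size of the collection after compression. The unit of measure is bytes unless a scale factor is passed to stats(); in the example above, the value is reported in megabytes.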

  • enabled - Indicates if compression is enabled or disabled.

To calculate the actual compression ratio, divide the collection size by the storage size (size/storageSize). For the example above, the calculation is 3906.3/1953.1, which is approximately a 2:1 compression ratio.
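The calculation above can be sketched as a small function that takes the document returned by stats(). The function name compressionRatio is hypothetical, and the sample object copies only the two fields the calculation needs from the example output.

```javascript
// Hypothetical helper: computes size/storageSize from a stats() document.
function compressionRatio(stats) {
  if (!(stats.storageSize > 0)) {
    throw new Error("storageSize must be a positive number");
  }
  return stats.size / stats.storageSize;
}

// Values taken from the example output on this page.
const exampleStats = { size: 3906.3, storageSize: 1953.1 };
console.log(compressionRatio(exampleStats).toFixed(2) + ":1"); // prints "2.00:1"

// Assumed usage in the mongo shell, where `db` is defined:
// compressionRatio(db.sample_collection.stats());
```

Because both fields are scaled by the same factor, the ratio is the same whether or not a scale argument is passed to stats().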

Managing existing collections

While you cannot compress an existing collection, you can convert its documents between uncompressed and compressed formats. To store existing uncompressed documents in compressed format, copy the documents to a compression-enabled collection. To convert compressed documents to uncompressed format, copy the documents to a compression-disabled collection.
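One way to perform the copy described above is sketched below. This is an illustration under stated assumptions, not a prescribed migration procedure. The helper name copyDocuments, the batch size, and the target collection name sample_collection_compressed are all hypothetical. The sketch assumes a mongo shell session in which the cursor returned by find() supports forEach() and the target collection supports insertMany().

```javascript
// Hypothetical helper: copies every document from `source` to `target`
// in batches and returns the number of documents copied.
function copyDocuments(source, target, batchSize) {
  let batch = [];
  let copied = 0;
  source.find().forEach(function (doc) {
    batch.push(doc);
    if (batch.length === batchSize) {
      target.insertMany(batch);
      copied += batch.length;
      batch = [];
    }
  });
  // Flush the final partial batch, if any.
  if (batch.length > 0) {
    target.insertMany(batch);
    copied += batch.length;
  }
  return copied;
}

// Assumed usage in the mongo shell, where `db` is defined:
// db.createCollection("sample_collection_compressed",
//   { storageEngine: { documentDB: { compression: { enable: true } } } });
// copyDocuments(db.sample_collection, db.sample_collection_compressed, 1000);
```

The same helper works in the other direction: create the target collection with compression disabled and copy the documents into it. Indexes on the source collection are not copied by this sketch and would need to be recreated on the target.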