计算校验和 - Amazon S3 Glacier

如果您不熟悉 Amazon Simple Storage Service (Amazon S3) 中的归档存储功能,建议您先详细了解 Amazon S3 中的 S3 Glacier 存储类、S3 Glacier 即时检索S3 Glacier 灵活检索S3 Glacier 深度归档。有关更多信息,请参阅 Amazon S3 用户指南中的 S3 Glacier 存储类和用于存档对象的存储类。

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

计算校验和

上传档案时,您必须包括 x-amz-sha256-tree-hashx-amz-content-sha256 标头。x-amz-sha256-tree-hash 标头是您的请求正文中有效负载的校验和。此主题描述了如何计算 x-amz-sha256-tree-hash 标头。x-amz-content-sha256 标头是整个有效负载的哈希,并且是授权所必需的项目。有关更多信息,请参阅 流式处理 API 的示例签名计算

您的请求的有效负载可以是:

  • 整个档案 — 在单一请求中使用上传档案 API 上传档案时,您可以在请求正文中发送整个档案。在这种情况下,您必须包括整个档案的校验和。

  • 档案段 — 使用分段上传 API 分段上传档案时,您可以在请求正文中只发送档案的一段。在这种情况下,您可以包括档案段的校验和。上传所有段后,您可以发送完成分段上传请求,该请求必须包括整个档案的校验和。

有效负载的校验和为 SHA256 树形哈希。它被称为树形哈希是因为,在计算校验和的过程中,您会计算 SHA256 哈希值树。根部的哈希值为整个档案的校验和。

注意

此部分描述了一种计算 SHA256 树形哈希的方法。但是,只要能得出相同的结果,您可以使用任何方法。

您可以按以下方法计算 SHA256 树形哈希:

  1. 针对有效负载数据的每个 1MB 区块,计算 SHA256 哈希。数据的最后一个区块可以小于 1MB。例如,如果您要上传一个 3.2MB 的档案,则您需要计算数据的前三个 1MB 区块中每一个区块的 SHA256 哈希值,然后计算剩余 0.2MB 数据的 SHA256 哈希。这些哈希值构成了树的叶节点。

  2. 构建树的下一层。

    1. 连接两个连续子节点的哈希值,然后计算连接的哈希值的 SHA256 哈希。此连接和 SHA256 哈希的生成会产生这两个子节点的父节点。

    2. 如果只剩下一个子节点,您可以将该哈希值提升到树的下一层。

  3. 重复步骤 2,直到结果树具有根为止。树根提供了整个档案的哈希,相应的子树根提供了分段上传中的段的哈希。

树形哈希示例 1:在单一请求中上传档案

在单一请求中使用上传档案 API(请参阅“上传档案(发布档案)”)上传档案时,请求的有效负载会包括整个档案。因此,您必须在 x-amz-sha256-tree-hash 请求标头中包括整个档案的树形哈希。假设您要上传一个 6.5MB 的档案。下图说明了创建档案的 SHA256 哈希的流程。您读取档案并为每个 1MB 区块计算 SHA256 哈希。此外,您还要为剩余的 0.5MB 数据计算哈希,然后按前面的步骤所述构建树。

该图显示了在单个请求中上传档案的树形哈希示例。

树形哈希示例 2:使用分段上传来上传档案

在使用分段上传来上传档案时计算树形哈希的流程与在单一请求中上传档案时相同。唯一的区别是,在分段上传中,您在每个请求中只(使用 上传段(设置上传 ID) API)上传档案的一段,因此,您在 x-amz-sha256-tree-hash 请求标头中只提供该段的校验和。但是,上传所有段后,您必须发送完成分段上传(请参阅“完成分段上传(发布上传 ID)”)请求,并在 x-amz-sha256-tree-hash 请求标头中包含整个档案的树形哈希。

该图显示了使用分段上传档案的树形哈希示例。

计算文件的树形哈希

此处显示的算法是出于演示目的而选择的。您可以根据实施情况的需要而优化代码。如果您使用 Amazon SDK 对 Amazon S3 Glacier(S3 Glacier)进行编程,则系统会为您完成树形哈希计算,您只需提供文件引用。

例 1:Java 示例

以下示例显示了如何使用 Java 计算文件的 SHA256 树形哈希。您可以通过提供文件位置作为参数来运行此示例,也可以直接从您的代码使用 TreeHashExample.computeSHA256TreeHash 方法。

import java.io.File; import java.io.FileInputStream; import java.io.IOException; import java.security.MessageDigest; import java.security.NoSuchAlgorithmException; public class TreeHashExample { static final int ONE_MB = 1024 * 1024; /** * Compute the Hex representation of the SHA-256 tree hash for the specified * File * * @param args * args[0]: a file to compute a SHA-256 tree hash for */ public static void main(String[] args) { if (args.length < 1) { System.err.println("Missing required filename argument"); System.exit(-1); } File inputFile = new File(args[0]); try { byte[] treeHash = computeSHA256TreeHash(inputFile); System.out.printf("SHA-256 Tree Hash = %s\n", toHex(treeHash)); } catch (IOException ioe) { System.err.format("Exception when reading from file %s: %s", inputFile, ioe.getMessage()); System.exit(-1); } catch (NoSuchAlgorithmException nsae) { System.err.format("Cannot locate MessageDigest algorithm for SHA-256: %s", nsae.getMessage()); System.exit(-1); } } /** * Computes the SHA-256 tree hash for the given file * * @param inputFile * a File to compute the SHA-256 tree hash for * @return a byte[] containing the SHA-256 tree hash * @throws IOException * Thrown if there's an issue reading the input file * @throws NoSuchAlgorithmException */ public static byte[] computeSHA256TreeHash(File inputFile) throws IOException, NoSuchAlgorithmException { byte[][] chunkSHA256Hashes = getChunkSHA256Hashes(inputFile); return computeSHA256TreeHash(chunkSHA256Hashes); } /** * Computes a SHA256 checksum for each 1 MB chunk of the input file. This * includes the checksum for the last chunk even if it is smaller than 1 MB. * * @param file * A file to compute checksums on * @return a byte[][] containing the checksums of each 1 MB chunk * @throws IOException * Thrown if there's an IOException when reading the file * @throws NoSuchAlgorithmException * Thrown if SHA-256 MessageDigest can't be found */ public static byte[][] getChunkSHA256Hashes(File file) throws IOException, NoSuchAlgorithmException { MessageDigest md = MessageDigest.getInstance("SHA-256"); long numChunks = file.length() / ONE_MB; if (file.length() % ONE_MB > 0) { numChunks++; } if (numChunks == 0) { return new byte[][] { md.digest() }; } byte[][] chunkSHA256Hashes = new byte[(int) numChunks][]; FileInputStream fileStream = null; try { fileStream = new FileInputStream(file); byte[] buff = new byte[ONE_MB]; int bytesRead; int idx = 0; int offset = 0; while ((bytesRead = fileStream.read(buff, offset, ONE_MB)) > 0) { md.reset(); md.update(buff, 0, bytesRead); chunkSHA256Hashes[idx++] = md.digest(); offset += bytesRead; } return chunkSHA256Hashes; } finally { if (fileStream != null) { try { fileStream.close(); } catch (IOException ioe) { System.err.printf("Exception while closing %s.\n %s", file.getName(), ioe.getMessage()); } } } } /** * Computes the SHA-256 tree hash for the passed array of 1 MB chunk * checksums. * * This method uses a pair of arrays to iteratively compute the tree hash * level by level. Each iteration takes two adjacent elements from the * previous level source array, computes the SHA-256 hash on their * concatenated value and places the result in the next level's destination * array. At the end of an iteration, the destination array becomes the * source array for the next level. * * @param chunkSHA256Hashes * An array of SHA-256 checksums * @return A byte[] containing the SHA-256 tree hash for the input chunks * @throws NoSuchAlgorithmException * Thrown if SHA-256 MessageDigest can't be found */ public static byte[] computeSHA256TreeHash(byte[][] chunkSHA256Hashes) throws NoSuchAlgorithmException { MessageDigest md = MessageDigest.getInstance("SHA-256"); byte[][] prevLvlHashes = chunkSHA256Hashes; while (prevLvlHashes.length > 1) { int len = prevLvlHashes.length / 2; if (prevLvlHashes.length % 2 != 0) { len++; } byte[][] currLvlHashes = new byte[len][]; int j = 0; for (int i = 0; i < prevLvlHashes.length; i = i + 2, j++) { // If there are at least two elements remaining if (prevLvlHashes.length - i > 1) { // Calculate a digest of the concatenated nodes md.reset(); md.update(prevLvlHashes[i]); md.update(prevLvlHashes[i + 1]); currLvlHashes[j] = md.digest(); } else { // Take care of remaining odd chunk currLvlHashes[j] = prevLvlHashes[i]; } } prevLvlHashes = currLvlHashes; } return prevLvlHashes[0]; } /** * Returns the hexadecimal representation of the input byte array * * @param data * a byte[] to convert to Hex characters * @return A String containing Hex characters */ public static String toHex(byte[] data) { StringBuilder sb = new StringBuilder(data.length * 2); for (int i = 0; i < data.length; i++) { String hex = Integer.toHexString(data[i] & 0xFF); if (hex.length() == 1) { // Append leading zero. sb.append("0"); } sb.append(hex); } return sb.toString().toLowerCase(); } }
例 2:C# .NET 示例

以下示例显示了如何计算文件的 SHA256 树形哈希。您可以通过提供文件位置作为参数来运行此示例。

using System; using System.IO; using System.Security.Cryptography; namespace ExampleTreeHash { class Program { static int ONE_MB = 1024 * 1024; /** * Compute the Hex representation of the SHA-256 tree hash for the * specified file * * @param args * args[0]: a file to compute a SHA-256 tree hash for */ public static void Main(string[] args) { if (args.Length < 1) { Console.WriteLine("Missing required filename argument"); Environment.Exit(-1); } FileStream inputFile = File.Open(args[0], FileMode.Open, FileAccess.Read); try { byte[] treeHash = ComputeSHA256TreeHash(inputFile); Console.WriteLine("SHA-256 Tree Hash = {0}", BitConverter.ToString(treeHash).Replace("-", "").ToLower()); Console.ReadLine(); Environment.Exit(-1); } catch (IOException ioe) { Console.WriteLine("Exception when reading from file {0}: {1}", inputFile, ioe.Message); Console.ReadLine(); Environment.Exit(-1); } catch (Exception e) { Console.WriteLine("Cannot locate MessageDigest algorithm for SHA-256: {0}", e.Message); Console.WriteLine(e.GetType()); Console.ReadLine(); Environment.Exit(-1); } Console.ReadLine(); } /** * Computes the SHA-256 tree hash for the given file * * @param inputFile * A file to compute the SHA-256 tree hash for * @return a byte[] containing the SHA-256 tree hash */ public static byte[] ComputeSHA256TreeHash(FileStream inputFile) { byte[][] chunkSHA256Hashes = GetChunkSHA256Hashes(inputFile); return ComputeSHA256TreeHash(chunkSHA256Hashes); } /** * Computes a SHA256 checksum for each 1 MB chunk of the input file. This * includes the checksum for the last chunk even if it is smaller than 1 MB. * * @param file * A file to compute checksums on * @return a byte[][] containing the checksums of each 1MB chunk */ public static byte[][] GetChunkSHA256Hashes(FileStream file) { long numChunks = file.Length / ONE_MB; if (file.Length % ONE_MB > 0) { numChunks++; } if (numChunks == 0) { return new byte[][] { CalculateSHA256Hash(null, 0) }; } byte[][] chunkSHA256Hashes = new byte[(int)numChunks][]; try { byte[] buff = new byte[ONE_MB]; int bytesRead; int idx = 0; while ((bytesRead = file.Read(buff, 0, ONE_MB)) > 0) { chunkSHA256Hashes[idx++] = CalculateSHA256Hash(buff, bytesRead); } return chunkSHA256Hashes; } finally { if (file != null) { try { file.Close(); } catch (IOException ioe) { throw ioe; } } } } /** * Computes the SHA-256 tree hash for the passed array of 1MB chunk * checksums. * * This method uses a pair of arrays to iteratively compute the tree hash * level by level. Each iteration takes two adjacent elements from the * previous level source array, computes the SHA-256 hash on their * concatenated value and places the result in the next level's destination * array. At the end of an iteration, the destination array becomes the * source array for the next level. * * @param chunkSHA256Hashes * An array of SHA-256 checksums * @return A byte[] containing the SHA-256 tree hash for the input chunks */ public static byte[] ComputeSHA256TreeHash(byte[][] chunkSHA256Hashes) { byte[][] prevLvlHashes = chunkSHA256Hashes; while (prevLvlHashes.GetLength(0) > 1) { int len = prevLvlHashes.GetLength(0) / 2; if (prevLvlHashes.GetLength(0) % 2 != 0) { len++; } byte[][] currLvlHashes = new byte[len][]; int j = 0; for (int i = 0; i < prevLvlHashes.GetLength(0); i = i + 2, j++) { // If there are at least two elements remaining if (prevLvlHashes.GetLength(0) - i > 1) { // Calculate a digest of the concatenated nodes byte[] firstPart = prevLvlHashes[i]; byte[] secondPart = prevLvlHashes[i + 1]; byte[] concatenation = new byte[firstPart.Length + secondPart.Length]; System.Buffer.BlockCopy(firstPart, 0, concatenation, 0, firstPart.Length); System.Buffer.BlockCopy(secondPart, 0, concatenation, firstPart.Length, secondPart.Length); currLvlHashes[j] = CalculateSHA256Hash(concatenation, concatenation.Length); } else { // Take care of remaining odd chunk currLvlHashes[j] = prevLvlHashes[i]; } } prevLvlHashes = currLvlHashes; } return prevLvlHashes[0]; } public static byte[] CalculateSHA256Hash(byte[] inputBytes, int count) { SHA256 sha256 = System.Security.Cryptography.SHA256.Create(); byte[] hash = sha256.ComputeHash(inputBytes, 0, count); return hash; } } }