Uploading an object using multipart upload - Amazon Simple Storage Service

Uploading an object using multipart upload

You can use multipart upload to programmatically upload a single object to Amazon S3.

For more information, see the following sections.

The AWS SDKs expose a high-level API, called TransferManager, that simplifies multipart uploads. For more information, see Uploading and copying objects using multipart upload.

You can upload data from a file or a stream. You can also set advanced options, such as the part size you want to use for the multipart upload, or the number of concurrent threads to use when uploading the parts. You can also set optional object properties, the storage class, or the access control list (ACL). You use the PutObjectRequest and TransferManagerConfiguration classes to set these advanced options.

When possible, TransferManager attempts to use multiple threads to upload multiple parts of a single upload at once. When dealing with large content sizes and high bandwidth, this can significantly increase throughput.
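
To make the parallelism concrete, the sketch below shows one way an upload can be split into independently uploadable parts. This is illustrative pseudologic, not SDK code: `plan_parts` and its 8 MiB default are invented for this example; only the 5 MiB floor reflects Amazon S3's actual minimum part size (for every part except the last).

```python
# Illustrative sketch (not SDK code) of splitting an object into parts that
# a transfer manager could hand to separate upload threads. The 5 MiB floor
# matches Amazon S3's minimum size for every part except the last.
MIN_PART_SIZE = 5 * 1024 * 1024


def plan_parts(content_length, part_size=8 * 1024 * 1024):
    """Return (part_number, offset, size) tuples covering the object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < content_length:
        size = min(part_size, content_length - offset)
        parts.append((part_number, offset, size))
        offset += size
        part_number += 1
    return parts


# A 20 MiB object with 8 MiB parts yields two full parts plus a 4 MiB tail,
# each of which could be uploaded on its own thread.
print(plan_parts(20 * 1024 * 1024))
```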

In addition to file-upload functionality, the TransferManager class lets you stop an in-progress multipart upload. After you start an upload, it is considered in progress until you complete or stop it. The TransferManager stops all in-progress multipart uploads on a specified bucket that were initiated before a specified date and time.
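
The "stop everything initiated before a date" behavior boils down to listing a bucket's in-progress uploads and selecting those older than a cutoff. A minimal SDK-agnostic Python sketch of that selection step follows; the dict shape loosely mirrors entries in an S3 ListMultipartUploads response, but `uploads_to_abort` and the sample records are invented for illustration.

```python
from datetime import datetime, timezone


# SDK-agnostic sketch of the selection step behind "abort all multipart
# uploads initiated before a date": list the bucket's in-progress uploads,
# keep those whose Initiated timestamp is older than the cutoff. The real
# abort would then be issued per (Key, UploadId) pair.
def uploads_to_abort(in_progress_uploads, cutoff):
    """Return (Key, UploadId) pairs for uploads initiated before the cutoff."""
    return [(u['Key'], u['UploadId'])
            for u in in_progress_uploads
            if u['Initiated'] < cutoff]


# Hypothetical sample data standing in for a ListMultipartUploads result.
uploads = [
    {'Key': 'logs/a.gz', 'UploadId': 'id-1',
     'Initiated': datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {'Key': 'logs/b.gz', 'UploadId': 'id-2',
     'Initiated': datetime(2023, 6, 1, tzinfo=timezone.utc)},
]
cutoff = datetime(2023, 3, 1, tzinfo=timezone.utc)
print(uploads_to_abort(uploads, cutoff))
```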

If you need to pause and resume multipart uploads, vary part sizes during the upload, or don't know the size of the data in advance, use the low-level API. For more information about multipart uploads, including the additional functionality offered by the low-level API methods, see Using the AWS SDKs (low-level API).

Java

The following example uploads an object using the high-level multipart upload Java API (the TransferManager class). For instructions on creating and testing a working sample, see Testing the Amazon S3 Java Code Examples.

import com.amazonaws.AmazonServiceException;
import com.amazonaws.SdkClientException;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

import java.io.File;

public class HighLevelMultipartUpload {

    public static void main(String[] args) throws Exception {
        Regions clientRegion = Regions.DEFAULT_REGION;
        String bucketName = "*** Bucket name ***";
        String keyName = "*** Object key ***";
        String filePath = "*** Path for file to upload ***";

        try {
            AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                    .withRegion(clientRegion)
                    .withCredentials(new ProfileCredentialsProvider())
                    .build();
            TransferManager tm = TransferManagerBuilder.standard()
                    .withS3Client(s3Client)
                    .build();

            // TransferManager processes all transfers asynchronously,
            // so this call returns immediately.
            Upload upload = tm.upload(bucketName, keyName, new File(filePath));
            System.out.println("Object upload started");

            // Optionally, wait for the upload to finish before continuing.
            upload.waitForCompletion();
            System.out.println("Object upload complete");
        } catch (AmazonServiceException e) {
            // The call was transmitted successfully, but Amazon S3 couldn't process
            // it, so it returned an error response.
            e.printStackTrace();
        } catch (SdkClientException e) {
            // Amazon S3 couldn't be contacted for a response, or the client
            // couldn't parse the response from Amazon S3.
            e.printStackTrace();
        }
    }
}
.NET

To upload a file to an S3 bucket, use the TransferUtility class. When uploading data from a file, if you don't provide the object's key name, the API uses the file name as the key name. When uploading data from a stream, you must provide the object's key name.

To set advanced upload options, such as the part size, the number of threads when uploading the parts concurrently, metadata, the storage class, or the ACL, use the TransferUtilityUploadRequest class.
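
A chosen part size (the C# example below uses 6291456 bytes, i.e. 6 MB) has to fit within Amazon S3's multipart limits: each part except the last must be at least 5 MiB, no part may exceed 5 GiB, and a single upload may use at most 10,000 parts. The following Python sketch of that check is illustrative; `validate_part_size` is an invented helper, not an SDK function.

```python
import math

MiB = 1024 * 1024
GiB = 1024 * MiB


# Illustrative check (not SDK code) of Amazon S3's multipart limits:
# parts of 5 MiB to 5 GiB (the last part may be smaller), and at most
# 10,000 parts per upload.
def validate_part_size(object_size, part_size):
    if part_size < 5 * MiB:
        return False, "part size below the 5 MiB minimum"
    if part_size > 5 * GiB:
        return False, "part size above the 5 GiB maximum"
    if math.ceil(object_size / part_size) > 10_000:
        return False, "object would need more than 10,000 parts"
    return True, "ok"


# 6 MB parts work for a 1 GiB object; a 5 TiB object would need a larger
# part size to stay under the 10,000-part limit.
print(validate_part_size(1 * GiB, 6 * MiB))
print(validate_part_size(5 * 1024 * GiB, 6 * MiB))
```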

The following C# example uploads a file to an Amazon S3 bucket in multiple parts. It shows how to use various TransferUtility.Upload overloads to upload a file. Each successive call to upload replaces the previous upload. For information about the example's compatibility with a specific version of the AWS SDK for .NET, and for instructions on creating and testing a working sample, see Running the Amazon S3 .NET Code Examples.

using Amazon;
using Amazon.S3;
using Amazon.S3.Transfer;
using System;
using System.IO;
using System.Threading.Tasks;

namespace Amazon.DocSamples.S3
{
    class UploadFileMPUHighLevelAPITest
    {
        private const string bucketName = "*** provide bucket name ***";
        private const string keyName = "*** provide a name for the uploaded object ***";
        private const string filePath = "*** provide the full path name of the file to upload ***";
        // Specify your bucket region (an example region is shown).
        private static readonly RegionEndpoint bucketRegion = RegionEndpoint.USWest2;
        private static IAmazonS3 s3Client;

        public static void Main()
        {
            s3Client = new AmazonS3Client(bucketRegion);
            UploadFileAsync().Wait();
        }

        private static async Task UploadFileAsync()
        {
            try
            {
                var fileTransferUtility = new TransferUtility(s3Client);

                // Option 1. Upload a file. The file name is used as the object key name.
                await fileTransferUtility.UploadAsync(filePath, bucketName);
                Console.WriteLine("Upload 1 completed");

                // Option 2. Specify object key name explicitly.
                await fileTransferUtility.UploadAsync(filePath, bucketName, keyName);
                Console.WriteLine("Upload 2 completed");

                // Option 3. Upload data from a type of System.IO.Stream.
                using (var fileToUpload =
                    new FileStream(filePath, FileMode.Open, FileAccess.Read))
                {
                    await fileTransferUtility.UploadAsync(fileToUpload, bucketName, keyName);
                }
                Console.WriteLine("Upload 3 completed");

                // Option 4. Specify advanced settings.
                var fileTransferUtilityRequest = new TransferUtilityUploadRequest
                {
                    BucketName = bucketName,
                    FilePath = filePath,
                    StorageClass = S3StorageClass.StandardInfrequentAccess,
                    PartSize = 6291456, // 6 MB.
                    Key = keyName,
                    CannedACL = S3CannedACL.PublicRead
                };
                fileTransferUtilityRequest.Metadata.Add("param1", "Value1");
                fileTransferUtilityRequest.Metadata.Add("param2", "Value2");

                await fileTransferUtility.UploadAsync(fileTransferUtilityRequest);
                Console.WriteLine("Upload 4 completed");
            }
            catch (AmazonS3Exception e)
            {
                Console.WriteLine("Error encountered on server. Message:'{0}' when writing an object", e.Message);
            }
            catch (Exception e)
            {
                Console.WriteLine("Unknown encountered on server. Message:'{0}' when writing an object", e.Message);
            }
        }
    }
}
PHP

This topic explains how to use the high-level Aws\S3\MultipartUploader class from the AWS SDK for PHP for multipart file uploads. It assumes that you are already following the instructions for Using the AWS SDK for PHP and Running PHP Examples and have the AWS SDK for PHP properly installed.

The following PHP code example uploads a file to an Amazon S3 bucket. The example demonstrates how to set parameters for the MultipartUploader object.

For information about running the PHP examples in this guide, see Running PHP Examples.

require 'vendor/autoload.php';

use Aws\Exception\MultipartUploadException;
use Aws\S3\MultipartUploader;
use Aws\S3\S3Client;

$bucket = '*** Your Bucket Name ***';
$keyname = '*** Your Object Key ***';

$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1'
]);

// Prepare the upload parameters.
$uploader = new MultipartUploader($s3, '/path/to/large/file.zip', [
    'bucket' => $bucket,
    'key'    => $keyname
]);

// Perform the upload.
try {
    $result = $uploader->upload();
    echo "Upload complete: {$result['ObjectURL']}" . PHP_EOL;
} catch (MultipartUploadException $e) {
    echo $e->getMessage() . PHP_EOL;
}
Python

The following example uploads an object using the high-level multipart upload Python API (the TransferManager class).

""" Use Boto 3 managed file transfers to manage multipart uploads to and downloads from an Amazon S3 bucket. When the file to transfer is larger than the specified threshold, the transfer manager automatically uses multipart uploads or downloads. This demonstration shows how to use several of the available transfer manager settings and reports thread usage and time to transfer. """ import sys import threading import boto3 from boto3.s3.transfer import TransferConfig MB = 1024 * 1024 s3 = boto3.resource('s3') class TransferCallback: """ Handle callbacks from the transfer manager. The transfer manager periodically calls the __call__ method throughout the upload and download process so that it can take action, such as displaying progress to the user and collecting data about the transfer. """ def __init__(self, target_size): self._target_size = target_size self._total_transferred = 0 self._lock = threading.Lock() self.thread_info = {} def __call__(self, bytes_transferred): """ The callback method that is called by the transfer manager. Display progress during file transfer and collect per-thread transfer data. This method can be called by multiple threads, so shared instance data is protected by a thread lock. """ thread = threading.current_thread() with self._lock: self._total_transferred += bytes_transferred if thread.ident not in self.thread_info.keys(): self.thread_info[thread.ident] = bytes_transferred else: self.thread_info[thread.ident] += bytes_transferred target = self._target_size * MB sys.stdout.write( f"\r{self._total_transferred} of {target} transferred " f"({(self._total_transferred / target) * 100:.2f}%).") sys.stdout.flush() def upload_with_default_configuration(local_file_path, bucket_name, object_key, file_size_mb): """ Upload a file from a local folder to an Amazon S3 bucket, using the default configuration. 
""" transfer_callback = TransferCallback(file_size_mb) s3.Bucket(bucket_name).upload_file( local_file_path, object_key, Callback=transfer_callback) return transfer_callback.thread_info def upload_with_chunksize_and_meta(local_file_path, bucket_name, object_key, file_size_mb, metadata=None): """ Upload a file from a local folder to an Amazon S3 bucket, setting a multipart chunk size and adding metadata to the Amazon S3 object. The multipart chunk size controls the size of the chunks of data that are sent in the request. A smaller chunk size typically results in the transfer manager using more threads for the upload. The metadata is a set of key-value pairs that are stored with the object in Amazon S3. """ transfer_callback = TransferCallback(file_size_mb) config = TransferConfig(multipart_chunksize=1 * MB) extra_args = {'Metadata': metadata} if metadata else None s3.Bucket(bucket_name).upload_file( local_file_path, object_key, Config=config, ExtraArgs=extra_args, Callback=transfer_callback) return transfer_callback.thread_info def upload_with_high_threshold(local_file_path, bucket_name, object_key, file_size_mb): """ Upload a file from a local folder to an Amazon S3 bucket, setting a multipart threshold larger than the size of the file. Setting a multipart threshold larger than the size of the file results in the transfer manager sending the file as a standard upload instead of a multipart upload. """ transfer_callback = TransferCallback(file_size_mb) config = TransferConfig(multipart_threshold=file_size_mb * 2 * MB) s3.Bucket(bucket_name).upload_file( local_file_path, object_key, Config=config, Callback=transfer_callback) return transfer_callback.thread_info def upload_with_sse(local_file_path, bucket_name, object_key, file_size_mb, sse_key=None): """ Upload a file from a local folder to an Amazon S3 bucket, adding server-side encryption with customer-provided encryption keys to the object. 
When this kind of encryption is specified, Amazon S3 encrypts the object at rest and allows downloads only when the expected encryption key is provided in the download request. """ transfer_callback = TransferCallback(file_size_mb) if sse_key: extra_args = { 'SSECustomerAlgorithm': 'AES256', 'SSECustomerKey': sse_key} else: extra_args = None s3.Bucket(bucket_name).upload_file( local_file_path, object_key, ExtraArgs=extra_args, Callback=transfer_callback) return transfer_callback.thread_info def download_with_default_configuration(bucket_name, object_key, download_file_path, file_size_mb): """ Download a file from an Amazon S3 bucket to a local folder, using the default configuration. """ transfer_callback = TransferCallback(file_size_mb) s3.Bucket(bucket_name).Object(object_key).download_file( download_file_path, Callback=transfer_callback) return transfer_callback.thread_info def download_with_single_thread(bucket_name, object_key, download_file_path, file_size_mb): """ Download a file from an Amazon S3 bucket to a local folder, using a single thread. """ transfer_callback = TransferCallback(file_size_mb) config = TransferConfig(use_threads=False) s3.Bucket(bucket_name).Object(object_key).download_file( download_file_path, Config=config, Callback=transfer_callback) return transfer_callback.thread_info def download_with_high_threshold(bucket_name, object_key, download_file_path, file_size_mb): """ Download a file from an Amazon S3 bucket to a local folder, setting a multipart threshold larger than the size of the file. Setting a multipart threshold larger than the size of the file results in the transfer manager sending the file as a standard download instead of a multipart download. 
""" transfer_callback = TransferCallback(file_size_mb) config = TransferConfig(multipart_threshold=file_size_mb * 2 * MB) s3.Bucket(bucket_name).Object(object_key).download_file( download_file_path, Config=config, Callback=transfer_callback) return transfer_callback.thread_info def download_with_sse(bucket_name, object_key, download_file_path, file_size_mb, sse_key): """ Download a file from an Amazon S3 bucket to a local folder, adding a customer-provided encryption key to the request. When this kind of encryption is specified, Amazon S3 encrypts the object at rest and allows downloads only when the expected encryption key is provided in the download request. """ transfer_callback = TransferCallback(file_size_mb) if sse_key: extra_args = { 'SSECustomerAlgorithm': 'AES256', 'SSECustomerKey': sse_key} else: extra_args = None s3.Bucket(bucket_name).Object(object_key).download_file( download_file_path, ExtraArgs=extra_args, Callback=transfer_callback) return transfer_callback.thread_info

The AWS SDKs expose a low-level API for multipart upload that closely resembles the Amazon S3 REST API (see Uploading and copying objects using multipart upload). Use the low-level API when you need to pause and resume multipart uploads, vary part sizes during the upload, or don't know the size of the upload data in advance. When you don't have these requirements, use the high-level API (see Using the AWS SDKs (high-level API)).

Java

The following example shows how to use the low-level Java classes to upload a file. It performs the following steps:

  • Initiates a multipart upload by using the AmazonS3Client.initiateMultipartUpload() method, passing in an InitiateMultipartUploadRequest object.

  • Saves the upload ID that the AmazonS3Client.initiateMultipartUpload() method returns. You provide this upload ID for each subsequent multipart upload operation.

  • Uploads the parts of the object. For each part, you call the AmazonS3Client.uploadPart() method. You provide part upload information by using an UploadPartRequest object.

  • For each part, saves the ETag from the response of the AmazonS3Client.uploadPart() method in a list. You use the ETag values to complete the multipart upload.

  • Calls the AmazonS3Client.completeMultipartUpload() method to complete the multipart upload.
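
The only state the steps above carry between calls is the upload ID and the ordered list of part-number/ETag pairs, which are finally assembled into the structure that a CompleteMultipartUpload request expects. The SDK-agnostic Python sketch below shows that bookkeeping in isolation; `MultipartBookkeeper` and the fake ETag strings are invented for illustration.

```python
# SDK-agnostic sketch of the low-level bookkeeping described above: keep the
# upload ID plus each part's (PartNumber, ETag), then build the payload that
# a CompleteMultipartUpload request expects. The ETag strings here stand in
# for values a real UploadPart response would return.
class MultipartBookkeeper:
    def __init__(self, upload_id):
        self.upload_id = upload_id   # returned by the initiate call
        self.part_etags = []         # filled in as each part is uploaded

    def record_part(self, part_number, etag):
        self.part_etags.append({'PartNumber': part_number, 'ETag': etag})

    def complete_payload(self):
        # Parts must be listed in ascending part-number order.
        return {'Parts': sorted(self.part_etags,
                                key=lambda p: p['PartNumber'])}


bk = MultipartBookkeeper('example-upload-id')
bk.record_part(2, '"etag-2"')   # parts may finish out of order
bk.record_part(1, '"etag-1"')
print(bk.complete_payload())
```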

For instructions on creating and testing a working sample, see Testing the Amazon S3 Java Code Examples.

import com.amazonaws.AmazonServiceException;
import com.amazonaws.SdkClientException;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LowLevelMultipartUpload {

    public static void main(String[] args) throws IOException {
        Regions clientRegion = Regions.DEFAULT_REGION;
        String bucketName = "*** Bucket name ***";
        String keyName = "*** Key name ***";
        String filePath = "*** Path to file to upload ***";

        File file = new File(filePath);
        long contentLength = file.length();
        long partSize = 5 * 1024 * 1024; // Set part size to 5 MB.

        try {
            AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                    .withRegion(clientRegion)
                    .withCredentials(new ProfileCredentialsProvider())
                    .build();

            // Create a list of ETag objects. You retrieve ETags for each object part uploaded,
            // then, after each individual part has been uploaded, pass the list of ETags to
            // the request to complete the upload.
            List<PartETag> partETags = new ArrayList<PartETag>();

            // Initiate the multipart upload.
            InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(bucketName, keyName);
            InitiateMultipartUploadResult initResponse = s3Client.initiateMultipartUpload(initRequest);

            // Upload the file parts.
            long filePosition = 0;
            for (int i = 1; filePosition < contentLength; i++) {
                // Because the last part could be less than 5 MB, adjust the part size as needed.
                partSize = Math.min(partSize, (contentLength - filePosition));

                // Create the request to upload a part.
                UploadPartRequest uploadRequest = new UploadPartRequest()
                        .withBucketName(bucketName)
                        .withKey(keyName)
                        .withUploadId(initResponse.getUploadId())
                        .withPartNumber(i)
                        .withFileOffset(filePosition)
                        .withFile(file)
                        .withPartSize(partSize);

                // Upload the part and add the response's ETag to our list.
                UploadPartResult uploadResult = s3Client.uploadPart(uploadRequest);
                partETags.add(uploadResult.getPartETag());

                filePosition += partSize;
            }

            // Complete the multipart upload.
            CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(bucketName, keyName,
                    initResponse.getUploadId(), partETags);
            s3Client.completeMultipartUpload(compRequest);
        } catch (AmazonServiceException e) {
            // The call was transmitted successfully, but Amazon S3 couldn't process
            // it, so it returned an error response.
            e.printStackTrace();
        } catch (SdkClientException e) {
            // Amazon S3 couldn't be contacted for a response, or the client
            // couldn't parse the response from Amazon S3.
            e.printStackTrace();
        }
    }
}
.NET

The following C# example shows how to use the low-level AWS SDK for .NET multipart upload API to upload a file to an S3 bucket. For information about Amazon S3 multipart uploads, see Uploading and copying objects using multipart upload.

Note

When you use the AWS SDK for .NET API to upload large objects, a timeout might occur while data is being written to the request stream. You can set an explicit timeout by using the UploadPartRequest.

The following C# example uploads a file to an S3 bucket using the low-level multipart upload API. For information about the example's compatibility with a specific version of the AWS SDK for .NET, and for instructions on creating and testing a working sample, see Running the Amazon S3 .NET Code Examples.

using Amazon;
using Amazon.Runtime;
using Amazon.S3;
using Amazon.S3.Model;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

namespace Amazon.DocSamples.S3
{
    class UploadFileMPULowLevelAPITest
    {
        private const string bucketName = "*** provide bucket name ***";
        private const string keyName = "*** provide a name for the uploaded object ***";
        private const string filePath = "*** provide the full path name of the file to upload ***";
        // Specify your bucket region (an example region is shown).
        private static readonly RegionEndpoint bucketRegion = RegionEndpoint.USWest2;
        private static IAmazonS3 s3Client;

        public static void Main()
        {
            s3Client = new AmazonS3Client(bucketRegion);
            Console.WriteLine("Uploading an object");
            UploadObjectAsync().Wait();
        }

        private static async Task UploadObjectAsync()
        {
            // Create list to store upload part responses.
            List<UploadPartResponse> uploadResponses = new List<UploadPartResponse>();

            // Setup information required to initiate the multipart upload.
            InitiateMultipartUploadRequest initiateRequest = new InitiateMultipartUploadRequest
            {
                BucketName = bucketName,
                Key = keyName
            };

            // Initiate the upload.
            InitiateMultipartUploadResponse initResponse =
                await s3Client.InitiateMultipartUploadAsync(initiateRequest);

            // Upload parts.
            long contentLength = new FileInfo(filePath).Length;
            long partSize = 5 * (long)Math.Pow(2, 20); // 5 MB

            try
            {
                Console.WriteLine("Uploading parts");

                long filePosition = 0;
                for (int i = 1; filePosition < contentLength; i++)
                {
                    UploadPartRequest uploadRequest = new UploadPartRequest
                    {
                        BucketName = bucketName,
                        Key = keyName,
                        UploadId = initResponse.UploadId,
                        PartNumber = i,
                        PartSize = partSize,
                        FilePosition = filePosition,
                        FilePath = filePath
                    };

                    // Track upload progress.
                    uploadRequest.StreamTransferProgress +=
                        new EventHandler<StreamTransferProgressArgs>(UploadPartProgressEventCallback);

                    // Upload a part and add the response to our list.
                    uploadResponses.Add(await s3Client.UploadPartAsync(uploadRequest));

                    filePosition += partSize;
                }

                // Setup to complete the upload.
                CompleteMultipartUploadRequest completeRequest = new CompleteMultipartUploadRequest
                {
                    BucketName = bucketName,
                    Key = keyName,
                    UploadId = initResponse.UploadId
                };
                completeRequest.AddPartETags(uploadResponses);

                // Complete the upload.
                CompleteMultipartUploadResponse completeUploadResponse =
                    await s3Client.CompleteMultipartUploadAsync(completeRequest);
            }
            catch (Exception exception)
            {
                Console.WriteLine("An AmazonS3Exception was thrown: {0}", exception.Message);

                // Abort the upload.
                AbortMultipartUploadRequest abortMPURequest = new AbortMultipartUploadRequest
                {
                    BucketName = bucketName,
                    Key = keyName,
                    UploadId = initResponse.UploadId
                };
                await s3Client.AbortMultipartUploadAsync(abortMPURequest);
            }
        }

        public static void UploadPartProgressEventCallback(object sender, StreamTransferProgressArgs e)
        {
            // Process event.
            Console.WriteLine("{0}/{1}", e.TransferredBytes, e.TotalBytes);
        }
    }
}
PHP

This topic shows how to use the low-level uploadPart method from version 3 of the AWS SDK for PHP to upload a file in multiple parts. It assumes that you are already following the instructions for Using the AWS SDK for PHP and Running PHP Examples and have the AWS SDK for PHP properly installed.

The following PHP example uploads a file to an Amazon S3 bucket using the low-level PHP API multipart upload. For information about running the PHP examples in this guide, see Running PHP Examples.

require 'vendor/autoload.php';

use Aws\S3\Exception\S3Exception;
use Aws\S3\S3Client;

$bucket = '*** Your Bucket Name ***';
$keyname = '*** Your Object Key ***';
$filename = '*** Path to and Name of the File to Upload ***';

$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1'
]);

$result = $s3->createMultipartUpload([
    'Bucket'       => $bucket,
    'Key'          => $keyname,
    'StorageClass' => 'REDUCED_REDUNDANCY',
    'Metadata'     => [
        'param1' => 'value 1',
        'param2' => 'value 2',
        'param3' => 'value 3'
    ]
]);
$uploadId = $result['UploadId'];

// Upload the file in parts.
try {
    $file = fopen($filename, 'r');
    $parts = ['Parts' => []];
    $partNumber = 1;
    while (!feof($file)) {
        $result = $s3->uploadPart([
            'Bucket'     => $bucket,
            'Key'        => $keyname,
            'UploadId'   => $uploadId,
            'PartNumber' => $partNumber,
            'Body'       => fread($file, 5 * 1024 * 1024),
        ]);
        $parts['Parts'][$partNumber] = [
            'PartNumber' => $partNumber,
            'ETag'       => $result['ETag'],
        ];
        echo "Uploaded part {$partNumber} of {$filename}." . PHP_EOL;
        $partNumber++;
    }
    fclose($file);
} catch (S3Exception $e) {
    $result = $s3->abortMultipartUpload([
        'Bucket'   => $bucket,
        'Key'      => $keyname,
        'UploadId' => $uploadId
    ]);

    echo "Upload of {$filename} failed." . PHP_EOL;
    exit(1); // Don't try to complete an aborted upload.
}

// Complete the multipart upload.
$result = $s3->completeMultipartUpload([
    'Bucket'          => $bucket,
    'Key'             => $keyname,
    'UploadId'        => $uploadId,
    'MultipartUpload' => $parts,
]);
$url = $result['Location'];
echo "Uploaded {$filename} to {$url}." . PHP_EOL;

The AWS SDK for Ruby version 3 supports Amazon S3 multipart uploads in two ways. For the first option, you can use managed file uploads. For more information, see Uploading Files to Amazon S3 in the AWS Developer Blog. Managed file uploads are the recommended method for uploading files to a bucket. They provide the following benefits:

  • Manage multipart uploads for objects larger than 15 MB.

  • Correctly open files in binary mode to avoid encoding issues.

  • Use multiple threads for uploading parts of large objects in parallel.

Alternatively, you can use the following multipart upload client operations directly:

For more information, see Using the AWS SDK for Ruby - Version 3.

The following sections in the Amazon Simple Storage Service API Reference describe the REST API for multipart upload.

The following sections in the AWS Command Line Interface (AWS CLI) describe the operations for multipart upload.

You can also use the REST API to make your own REST requests, or you can use one of the AWS SDKs. For more information about the REST API, see Using the REST API. For more information about the SDKs, see Uploading an object using multipart upload.