S3 multipart upload isn’t just about splitting files; it’s a way to make uploads resilient to network interruptions and unlock parallel transfer speeds you’d never get otherwise.

Let’s see it in action. Imagine uploading a massive 10GB video file. Without multipart, a connection that drops halfway through means starting over. With multipart, we break the file into 100MB chunks. If one chunk fails, only that chunk needs to be retransmitted, not the whole 10GB.
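The chunk arithmetic is easy to check before starting. Here is a minimal sketch in plain shell arithmetic; the function names part_count and last_part_size are made up for illustration, and the sizes assume binary units (10 GiB, 100 MiB), which is what a part size of 104857600 bytes corresponds to.

```shell
# Number of parts needed for a file of SIZE bytes at PART_SIZE bytes each
# (integer ceiling division). Hypothetical helper, not an AWS CLI command.
part_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Size in bytes of the final part, which is usually shorter than the rest.
last_part_size() {
    remainder=$(( $1 % $2 ))
    if [ "$remainder" -eq 0 ]; then echo "$2"; else echo "$remainder"; fi
}

part_count 10737418240 104857600      # 10 GiB in 100 MiB parts -> 103
last_part_size 10737418240 104857600  # -> 41943040 (40 MiB)
part_count 10000000000 100000000      # decimal 10 GB / 100 MB -> 100
```

With decimal units the file splits into exactly 100 parts; with binary units there are 103, the last one being 40 MiB.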

Here’s an aws s3api example to get us started. We’ll initiate a multipart upload for a file named large_video.mp4 to a bucket called my-video-bucket.

aws s3api create-multipart-upload \
    --bucket my-video-bucket \
    --key large_video.mp4 \
    --content-type "video/mp4"

This command returns an UploadId. This ID is the key to tracking all the individual parts of our upload. Let’s say the UploadId is VzV1XzYyZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6.
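In a script you will probably want the UploadId in a variable rather than copied by hand. One way to do that, sketched here with a hypothetical wrapper named start_upload, is to use the CLI's global --query and --output text options to print only that field.

```shell
# Start a multipart upload and print only its UploadId.
# start_upload is a hypothetical helper name.
# Usage: upload_id=$(start_upload BUCKET KEY CONTENT_TYPE)
start_upload() {
    aws s3api create-multipart-upload \
        --bucket "$1" --key "$2" --content-type "$3" \
        --query UploadId --output text
}
```

For example: upload_id=$(start_upload my-video-bucket large_video.mp4 video/mp4)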

Now, we can upload parts. We’ll upload the first 100MB chunk.

aws s3api upload-part \
    --bucket my-video-bucket \
    --key large_video.mp4 \
    --upload-id VzV1XzYyZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6 \
    --part-number 1 \
    --body file-part-1.mp4 \
    --content-length 104857600

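Notice --part-number 1 and --body file-part-1.mp4. Each part gets a number from 1 to 10,000, and S3 assembles the parts in ascending order of that number. The --content-length is optional here, since the CLI can derive it from the body file, but if you pass it, it must match the exact size of the part. Every part except the last must be at least 5 MiB. This command returns an ETag, which identifies the content of the uploaded part (typically an MD5 digest of the part’s data). We need to collect these ETags along with their part numbers.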
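Doing this by hand for every part gets tedious, so here is a rough sketch of a loop that splits the file and uploads each piece. The function name upload_parts and the manifest format (one line per part: part number, a tab, then the ETag) are assumptions made for illustration. It uses the standard split utility and the same upload-part call shown above, and it stops at the first failed part.

```shell
# Split FILE into PART_SIZE-byte pieces, upload each one, and record
# "part-number<TAB>ETag" lines in MANIFEST for the completion step.
# upload_parts and the manifest layout are hypothetical, not part of the CLI.
# Usage: upload_parts BUCKET KEY UPLOAD_ID FILE PART_SIZE_BYTES MANIFEST
upload_parts() {
    bucket=$1 key=$2 upload_id=$3 file=$4 part_size=$5 manifest=$6
    workdir=$(mktemp -d) || return 1
    # split names the pieces part.aaa, part.aab, ... so the glob below
    # returns them in upload order. Three letters cover 10,000 parts.
    split -b "$part_size" -a 3 "$file" "$workdir/part." || return 1
    : > "$manifest"
    n=1
    for piece in "$workdir"/part.*; do
        # --query ETag --output text prints just the ETag, quotes included.
        etag=$(aws s3api upload-part \
            --bucket "$bucket" --key "$key" --upload-id "$upload_id" \
            --part-number "$n" --body "$piece" \
            --query ETag --output text) || return 1
        printf '%s\t%s\n' "$n" "$etag" >> "$manifest"
        n=$((n + 1))
    done
    rm -rf "$workdir"
}
```

For example: upload_parts my-video-bucket large_video.mp4 "$upload_id" large_video.mp4 104857600 parts.tsv. Note that splitting this way needs temporary disk space equal to the file size.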

After uploading all parts (say, up to part number 100), we need to tell S3 to assemble them.

aws s3api complete-multipart-upload \
    --bucket my-video-bucket \
    --key large_video.mp4 \
    --upload-id VzV1XzYyZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6ZlV6 \
    --multipart-upload '{"Parts": [{"PartNumber": 1, "ETag": "a1b2c3d4e5f67890abcdef"}, {"PartNumber": 2, "ETag": "b2c3d4e5f67890abcdefa1"}, ...]}'

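The --multipart-upload argument takes a JSON structure containing an array of all the parts, each with its PartNumber and the ETag we received when uploading that specific part. The parts must be listed in ascending PartNumber order. S3 checks each ETag against the part it stored and rejects the request if one does not match, which is how it confirms that it is assembling the parts you intended, in the right order.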
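With a hundred parts, typing that JSON inline is error-prone. As a sketch, suppose you have kept a manifest file with one line per part: the part number, a tab, then the ETag. Both the manifest format and the function name build_parts_json are hypothetical. The function below turns the manifest into the expected structure using only sort and awk, and strips the literal double quotes S3 puts around ETags so the output matches the form used in the command above.

```shell
# Turn a manifest of "part-number<TAB>ETag" lines into the JSON document
# that --multipart-upload expects, with parts in ascending order.
# build_parts_json and the manifest layout are hypothetical.
# Usage: build_parts_json MANIFEST > parts.json
build_parts_json() {
    sort -n "$1" | awk -F '\t' '
        BEGIN { printf "{\"Parts\": [" }
        {
            etag = $2
            gsub(/"/, "", etag)   # drop the quotes wrapped around the ETag
            printf "%s{\"PartNumber\": %d, \"ETag\": \"%s\"}", (NR > 1 ? ", " : ""), $1, etag
        }
        END { print "]}" }
    '
}
```

The result can then be passed from a file instead of inline, using the CLI's file:// prefix: --multipart-upload file://parts.json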

The fundamental problem multipart upload solves is the inherent unreliability of wide-area networks for large data transfers. A single, monolithic upload is fragile. By breaking it into smaller, independently verifiable pieces, we gain several advantages:

  • Resilience: Network hiccups only affect individual parts, which can be retried without re-uploading the entire file.
  • Parallelism: You can upload multiple parts concurrently, dramatically speeding up transfers. This is especially powerful when you have multiple network paths or high bandwidth.
  • Performance for Large Files: S3 has an internal limit for single PUT operations (5GB). Multipart upload bypasses this by allowing uploads up to 5TB.
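The resilience point can be sketched as a small retry wrapper around a single command, so that a failed part is retried on its own. The name retry and the RETRY_DELAY variable are made up for this example, and the doubling delay is just one reasonable backoff choice.

```shell
# Run a command up to MAX_TRIES times, doubling the pause after each failure.
# retry and RETRY_DELAY are hypothetical names used for illustration.
# Usage: retry MAX_TRIES COMMAND [ARGS...]
retry() {
    max_tries=$1; shift
    attempt=1
    delay=${RETRY_DELAY:-1}   # initial pause in seconds
    until "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}
```

For example, prefixing an upload-part command with retry 5 re-sends only that part on failure and leaves every other part untouched.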

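The ETag is more than a label: it ties each part number to the exact data S3 received. When you call complete-multipart-upload, S3 checks each ETag you supply against the part it stored, rejects the request if any do not match, and then concatenates the parts in order. Note that the ETag of the completed object is not an MD5 of the whole file; it usually ends in a suffix such as -100 that reflects the number of parts.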

Most people think about multipart upload in terms of breaking up a single file. What they often miss is that the parallelism isn’t limited to your side of the connection. Each part is an independent request, so S3 can service many of them at once rather than funneling everything through one stream. You can configure your AWS SDK or the high-level aws s3 commands to use a specific number of concurrent requests for uploading parts; with aws s3api, as in the examples above, you run the upload-part commands in parallel yourself.
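For the high-level commands, concurrency and part size are controlled by the CLI settings max_concurrent_requests and multipart_chunksize (the defaults are 10 requests and 8MB). A minimal sketch follows, wrapped in a hypothetical function named tuned_cp; the values 20 and 100MB are illustrative, not recommendations. These settings affect aws s3 commands only, not aws s3api.

```shell
# Raise the high-level transfer settings, then let `aws s3 cp` handle
# splitting, parallel part uploads, and completion on its own.
# tuned_cp is a hypothetical name. The settings are written to the default
# profile and stay in effect for later commands.
# Usage: tuned_cp FILE S3_URI [CONCURRENCY] [CHUNK_SIZE]
tuned_cp() {
    src=$1 dest=$2 concurrency=${3:-20} chunk=${4:-100MB}
    aws configure set default.s3.max_concurrent_requests "$concurrency" &&
    aws configure set default.s3.multipart_chunksize "$chunk" &&
    aws s3 cp "$src" "$dest"
}
```

For example: tuned_cp large_video.mp4 s3://my-video-bucket/large_video.mp4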

The next thing you’ll likely encounter is managing incomplete multipart uploads: parts from an upload that was never completed or aborted stay in the bucket and incur storage costs until they are cleaned up.
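As a starting point, here is a rough sketch that lists the in-progress uploads in a bucket and aborts each one, using the list-multipart-uploads and abort-multipart-upload commands. The function name abort_incomplete_uploads is made up, and the loop assumes object keys contain no tab characters. Be careful: it aborts every upload it finds, including any that are still legitimately running.

```shell
# Abort every in-progress multipart upload in BUCKET.
# abort_incomplete_uploads is a hypothetical helper name.
# Usage: abort_incomplete_uploads BUCKET
abort_incomplete_uploads() {
    bucket=$1
    # Text output gives one "Key<TAB>UploadId" line per upload, or the
    # single word "None" when there is nothing to list.
    aws s3api list-multipart-uploads --bucket "$bucket" \
        --query 'Uploads[].[Key, UploadId]' --output text |
    while IFS="$(printf '\t')" read -r key upload_id; do
        [ -n "$upload_id" ] || continue   # skips the "None" line
        aws s3api abort-multipart-upload \
            --bucket "$bucket" --key "$key" --upload-id "$upload_id" < /dev/null
    done
}
```

For ongoing cleanup, a bucket lifecycle rule with the AbortIncompleteMultipartUpload action can do the same thing automatically after a set number of days.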
