Who this is for
Backend engineers who need to accept files from clients (images, PDFs, data exports) and serve large downloads efficiently (reports, media, backups) without timeouts or memory blowups.
Prerequisites
- Comfort with HTTP basics (methods, headers, status codes)
- Familiarity with your server framework’s routing and middleware
- Basic understanding of filesystems or object storage (local disk, S3-like)
Why this matters
- Real tasks: Profile photo uploads, CSV imports, log archives, media streaming, report downloads.
- Bad handling causes outages: memory spikes, slow requests, timeouts, corrupted files, and security risks.
- Great handling improves UX: resumable uploads, predictable limits, fast streaming downloads.
Concept explained simply
Uploads are data flowing from client to server; streaming is how you move data without loading it all into memory.
Mental model
- Think “pipes and buckets.” The client pours data into a pipe; your server connects that pipe directly to storage. Avoid filling the bucket (RAM) when the pipe can drain to disk/storage.
- Always control the valve: enforce size limits, allowed types, and timeouts.
Core concepts and safe defaults
1) Choosing an upload method
- multipart/form-data: Best for browser forms. Send fields + files together.
- Binary body (PUT/POST): Simple when uploading a single file; metadata via headers or query.
- Signed/direct upload: Client uploads to object storage using a pre-signed URL; your API issues the URL and validates the result afterwards (see the sketch below).
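Here is a minimal sketch of the signed/direct upload flow in TypeScript, assuming an S3-compatible bucket and the AWS SDK v3; the bucket name, region, and key prefix are placeholders. The API issues a short-lived PUT URL, and the client uploads straight to storage with it.

import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
import { randomUUID } from "node:crypto";

const s3 = new S3Client({ region: "us-east-1" });

// Your API calls this and returns { url, key } to the client; the client then PUTs the file to url.
export async function issueUploadUrl(contentType: string): Promise<{ url: string; key: string }> {
  const key = `incoming/${randomUUID()}`;   // server-generated object key, never the client's file name
  const command = new PutObjectCommand({
    Bucket: "uploads-bucket",               // placeholder bucket name
    Key: key,
    ContentType: contentType,               // the client must upload with this exact Content-Type
  });
  const url = await getSignedUrl(s3, command, { expiresIn: 300 }); // URL expires after 5 minutes
  return { url, key };
}

After the client reports completion, the API can HEAD the object to confirm its size and type before recording it.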
2) Memory vs disk
- Do not buffer whole files in RAM. Stream directly to disk or object storage (see the sketch after this list).
- Set strict limits: max file size, request timeout, and concurrency.
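As a concrete illustration of streaming instead of buffering, here is a minimal Node.js/TypeScript sketch that accepts a raw PUT body, pipes it to a temp file, and aborts once a size cap is exceeded. The cap, port, and temp path are illustrative choices.

import { createServer } from "node:http";
import { createWriteStream } from "node:fs";
import { unlink } from "node:fs/promises";
import { pipeline } from "node:stream/promises";
import { Transform } from "node:stream";
import { randomUUID } from "node:crypto";

const MAX_BYTES = 10 * 1024 * 1024; // 10 MB cap

createServer(async (req, res) => {
  if (req.method !== "PUT") { res.writeHead(405).end(); return; }

  const tempPath = `/tmp/upload-${randomUUID()}`;
  let received = 0;

  // Pass-through that counts bytes and fails the pipeline when the cap is exceeded.
  const limiter = new Transform({
    transform(chunk, _enc, cb) {
      received += chunk.length;
      if (received > MAX_BYTES) cb(new Error("too large"));
      else cb(null, chunk);
    },
  });

  try {
    await pipeline(req, limiter, createWriteStream(tempPath)); // backpressure handled by pipeline
    res.writeHead(201, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ size: received }));
  } catch {
    await unlink(tempPath).catch(() => {});  // always remove the partial file
    // Note: pipeline destroys the request stream on error, so some clients may see a reset instead of the 413.
    res.writeHead(413).end();
  }
}).listen(3000);

Memory stays flat regardless of file size because only small chunks are in flight at any moment.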
3) Validation and security
- Allowlist MIME types and verify the declared type against magic bytes when possible (see the sketch after this list).
- Normalize file names; do not trust client-sent paths.
- Scan files if your org requires it. Quarantine until scan passes.
- Strip metadata you don’t need; never execute uploaded files.
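Checking magic bytes takes only a few lines. A minimal TypeScript sketch, assuming the file has already been written to a temp path and only JPEG and PNG are allowed; extend the signature map for other types.

import { open } from "node:fs/promises";

// Standard file signatures (magic bytes) for the allowed types.
const SIGNATURES: Record<string, number[]> = {
  "image/jpeg": [0xff, 0xd8, 0xff],
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
};

export async function matchesDeclaredType(path: string, declaredMime: string): Promise<boolean> {
  const expected = SIGNATURES[declaredMime];
  if (!expected) return false;                        // type is not on the allowlist

  const file = await open(path, "r");
  try {
    const header = Buffer.alloc(expected.length);
    await file.read(header, 0, expected.length, 0);   // read only the first few bytes
    return expected.every((byte, i) => header[i] === byte);
  } finally {
    await file.close();
  }
}

Reject the upload with 415 when this returns false, even if the Content-Type header looked fine.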
4) Streaming downloads
- Set Content-Type and Content-Disposition so clients know whether to render the file or save it, and under what name.
- Stream file contents; do not read entire file into memory.
- Support Range requests for large media and resumable downloads (a sketch of single-range handling follows this list).
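A minimal TypeScript sketch of single-range handling with plain Node.js; the file path and MIME type are placeholders, and multi-range or suffix-range requests simply fall back to a full 200 response.

import { createServer, IncomingMessage, ServerResponse } from "node:http";
import { createReadStream, statSync } from "node:fs";

function serveFile(req: IncomingMessage, res: ServerResponse, path: string, mime: string) {
  const size = statSync(path).size;
  const match = /^bytes=(\d+)-(\d*)$/.exec(req.headers.range ?? "");

  if (!match) {
    // No Range header (or a form we don't handle): stream the whole file.
    res.writeHead(200, { "Content-Type": mime, "Content-Length": size, "Accept-Ranges": "bytes" });
    createReadStream(path).pipe(res);
    return;
  }

  const start = Number(match[1]);
  const end = Math.min(match[2] ? Number(match[2]) : size - 1, size - 1);
  if (start > end || start >= size) {
    res.writeHead(416, { "Content-Range": `bytes */${size}` });
    res.end();
    return;
  }

  res.writeHead(206, {
    "Content-Type": mime,
    "Content-Range": `bytes ${start}-${end}/${size}`,
    "Content-Length": end - start + 1,
    "Accept-Ranges": "bytes",
  });
  createReadStream(path, { start, end }).pipe(res);   // start and end are inclusive byte offsets
}

createServer((req, res) => serveFile(req, res, "/data/big-video.mp4", "video/mp4")).listen(3000);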
5) Error handling
- Return clear 4xx for validation failures (size/type).
- On partial writes, clean up temp files/chunks.
- Use idempotent identifiers to retry safely (upload IDs, chunk numbers).
Worked examples
Example 1: Multipart single-file upload with limits
Goal: Accept an image under 10 MB, store it, and return JSON metadata.
# Client cURL
curl -F "avatar=@me.jpg" \
-F "user_id=123" \
http://localhost:3000/upload/avatar
# Server behavior
- Enforce: max size 10 MB, types: image/jpeg, image/png
- Stream to temp file, then move to permanent storage
- Respond: { id, originalName, size, mime, url } (a server sketch follows the checklist below)
What to check
- Reject invalid types with 415.
- Reject oversize with 413.
- Ensure file name is sanitized (e.g., random ID + extension).
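One possible server side for this example, sketched in TypeScript with Express and the multer middleware; the storage directory and route names are assumptions, and for brevity the sketch writes into the storage directory directly rather than a temp file.

import express from "express";
import multer from "multer";
import { randomUUID } from "node:crypto";
import { extname } from "node:path";

const ALLOWED = new Set(["image/jpeg", "image/png"]);

const upload = multer({
  storage: multer.diskStorage({
    destination: "/var/uploads",                                   // illustrative storage path
    filename: (_req, file, cb) => cb(null, randomUUID() + extname(file.originalname)),
  }),
  limits: { fileSize: 10 * 1024 * 1024, files: 1 },                // 10 MB, one file per request
  fileFilter: (_req, file, cb) => cb(null, ALLOWED.has(file.mimetype)),
});

const app = express();

app.post("/upload/avatar", upload.single("avatar"), (req, res) => {
  if (!req.file) return res.status(415).json({ error: "unsupported type" });
  res.json({
    id: req.file.filename,
    originalName: req.file.originalname,
    size: req.file.size,
    mime: req.file.mimetype,
    url: `/files/${req.file.filename}`,
  });
});

// Oversize uploads surface as an error with code LIMIT_FILE_SIZE; translate it to 413.
app.use((err: any, _req: express.Request, res: express.Response, _next: express.NextFunction) => {
  if (err?.code === "LIMIT_FILE_SIZE") return res.status(413).json({ error: "file too large" });
  res.status(400).json({ error: "upload failed" });
});

app.listen(3000);

Pair this with a magic-byte check on the stored file before treating the declared type as trusted.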
Example 2: Streaming download with Content-Disposition
Goal: Serve a large report without loading it into RAM.
# Client cURL
curl -OJ http://localhost:3000/reports/2024-annual.pdf
# Server behavior
- Set Content-Type: application/pdf
- Set Content-Disposition: attachment; filename="2024-annual.pdf"
- Stream the file in chunks from a file handle; do not read it fully into memory (a sketch follows the performance tips)
Performance tips
- Use OS-level streaming APIs.
- Disable response compression for already-compressed files (pdf/zip).
- Close stream on client disconnect; log partial transfers.
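A minimal sketch of this download path in plain Node.js/TypeScript; the report path is a placeholder. pipeline streams the file in chunks and rejects if the client disconnects, which is where you would log the partial transfer.

import { createServer } from "node:http";
import { createReadStream } from "node:fs";
import { stat } from "node:fs/promises";
import { pipeline } from "node:stream/promises";

createServer(async (req, res) => {
  const path = "/data/reports/2024-annual.pdf";        // illustrative file location
  const { size } = await stat(path);

  res.writeHead(200, {
    "Content-Type": "application/pdf",
    "Content-Length": size,
    "Content-Disposition": 'attachment; filename="2024-annual.pdf"',
  });

  try {
    await pipeline(createReadStream(path), res);       // chunked streaming with backpressure
  } catch (err) {
    console.warn("download aborted:", (err as Error).message); // client disconnect or read failure
  }
}).listen(3000);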
Example 3: Basic resumable chunked upload
Goal: Upload large files in chunks with retry on failures.
# 1) Initialize
POST /uploads/init
{ "filename": "video.mp4", "size": 4000000000, "mime": "video/mp4" }
-> { "upload_id": "abc123", "chunk_size": 5242880 }
# 2) Send chunks
PUT /uploads/abc123/chunk/1 (body: bytes 0..chunk_size-1)
PUT /uploads/abc123/chunk/2 (body: next bytes)
...
# 3) Complete
POST /uploads/abc123/complete
-> server verifies all chunks, assembles the final file, and returns its id/url (sketched after the integrity notes below)
Integrity and safety
- Store per-chunk checksums; verify on receipt.
- Track received chunk indices to allow retries.
- Finalize atomically and clean up orphaned chunks via a background job.
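A sketch of the complete step in TypeScript, under the assumption that chunks were stored as numbered files next to the upload and that the client-supplied checksums are available in order; adapt paths and storage to your setup.

import { createHash } from "node:crypto";
import { createReadStream, createWriteStream } from "node:fs";
import { rm } from "node:fs/promises";
import { pipeline } from "node:stream/promises";

async function sha256(path: string): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of createReadStream(path)) hash.update(chunk); // hash while streaming
  return hash.digest("hex");
}

// expectedChecksums[i] is the checksum recorded when chunk i + 1 was received.
export async function completeUpload(uploadId: string, expectedChecksums: string[]): Promise<string> {
  const finalPath = `/var/uploads/${uploadId}.bin`;
  await rm(finalPath, { force: true });                       // discard any earlier partial assembly

  for (let i = 1; i <= expectedChecksums.length; i++) {
    const chunkPath = `/var/uploads/${uploadId}.chunk.${i}`;
    if ((await sha256(chunkPath)) !== expectedChecksums[i - 1]) {
      await rm(finalPath, { force: true });                   // never leave a half-assembled file
      throw new Error(`chunk ${i} failed checksum verification`);
    }
    // Append the verified chunk to the final file.
    await pipeline(createReadStream(chunkPath), createWriteStream(finalPath, { flags: "a" }));
  }

  // All chunks verified and assembled; remove the now-redundant chunk files.
  await Promise.all(
    expectedChecksums.map((_, i) => rm(`/var/uploads/${uploadId}.chunk.${i + 1}`, { force: true }))
  );
  return finalPath;
}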
Implementation steps
- Define limits: max file size, allowed types, request timeout, and per-user rate limits.
- Choose storage: local disk (dev) or object storage for production. Generate safe server-side file names.
- Wire streaming: connect request stream to a write stream; handle backpressure and errors.
- Validate: check Content-Type, verify magic bytes if critical, and scan if required.
- Respond with metadata: id, size, mime, hash, and a retrieval URL (see the hashing sketch after this list).
- For downloads: set headers and stream; optionally support Range requests.
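For the metadata step, the hash and size can be computed in the same pass that writes the file, so nothing is read twice. A small TypeScript sketch; the storage path and id scheme are illustrative.

import { createHash, randomUUID } from "node:crypto";
import { createWriteStream } from "node:fs";
import { Transform, type Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

export async function storeStream(source: Readable, mime: string) {
  const id = randomUUID();
  const path = `/var/uploads/${id}`;
  const hash = createHash("sha256");
  let size = 0;

  // Pass-through that updates the hash and counts bytes as data flows to disk.
  const tap = new Transform({
    transform(chunk, _enc, cb) {
      hash.update(chunk);
      size += chunk.length;
      cb(null, chunk);
    },
  });

  await pipeline(source, tap, createWriteStream(path));
  return { id, size, mime, hash: hash.digest("hex"), url: `/files/${id}` };
}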
Common mistakes and self-check
- Buffering entire file in memory. Self-check: Does memory grow with file size? If yes, switch to streaming.
- Trusting client file names. Self-check: Ensure server generates names; no path segments in stored names.
- No size/type limits. Self-check: Try a 2 GB file or an executable; the request should be rejected.
- Forgetting to clean up partial files. Self-check: Simulate client cancel; temp files should be removed.
- Missing Range support for large media. Self-check: Send Range header; server should return 206 with correct Content-Range.
Practical projects
- Profile Photo Service: Single-file image upload with 5 MB limit and automatic resizing.
- CSV Importer: Upload CSV, store, queue a background parser, expose status and result download.
- Media Downloader: Range-enabled streaming endpoint serving large video files.
- Resumable Uploader: Init/chunk/complete flow with chunk SHA-256 verification.
Exercises
Exercise 1 — Safe single-file upload
Implement an endpoint that accepts one image via multipart/form-data with:
- Max size: 10 MB
- Allowed types: image/jpeg, image/png
- Stream to disk/storage; sanitize file name; return JSON metadata
Acceptance checklist
- Oversize files return 413
- Disallowed types return 415
- Response includes id, size, mime
- No memory spikes during upload
Exercise 2 — Streaming download with Range
Implement GET /files/:id to stream a stored file with:
- Content-Type from stored metadata
- Content-Disposition: attachment; filename="original.ext"
- Support Range requests, responding with 206 when present
Acceptance checklist
- Large files do not increase RAM usage
- Range 0-999 returns 1000 bytes and correct headers
- Invalid ranges return 416
Mini tasks
- Add a SHA-256 hash to your upload response; verify it on download.
- Implement automatic cleanup of temp files older than 24 hours.
- Add a per-user daily upload quota that returns a clear 429 error when exceeded.
Learning path
- Start: Single-file multipart uploads with strict limits.
- Next: Streaming downloads with proper headers and Range.
- Then: Resumable chunked uploads with integrity checks.
- Advanced: Direct-to-storage signed uploads and virus scanning queues.
Next steps
- Harden your validations: MIME allowlist + magic byte check.
- Add observability: log size, duration, and error categories.
- Benchmark under load; tune timeouts and concurrency.