Deep Dive - How Chunked Transfer Encoding Works
Chunked transfer encoding is a key HTTP/1.1 feature that allows servers to stream data incrementally without knowing the total size of the response upfront. It’s particularly useful in streaming APIs, live updates, and large or dynamically generated responses.
In this post, we’ll practically explore how chunked transfer encoding works using the backend we developed in my previous blog post on Streaming APIs with FastAPI and Next.js — Part 2.
🤔 What is Chunked Transfer Encoding?
Chunked transfer encoding splits an HTTP response body into a series of chunks, each prefixed with its size in bytes, written in hexadecimal. It allows servers to start sending response data immediately, without having to calculate the full content length beforehand.
When the Transfer-Encoding: chunked header is present, the client receives data incrementally and knows the response has ended when a zero-length chunk appears.
💻 Hands-on Example
Let’s use the FastAPI backend we built.
make start-backend
Let’s hit the /stream endpoint with curl to see how chunked transfer encoding works in practice.
curl -i --raw http://localhost:8000/stream
- The -i flag includes the response headers in the output.
- The --raw flag disables curl’s automatic decoding, revealing the raw chunked encoding.
Expected output:
curl -i --raw localhost:8000/stream
HTTP/1.1 200 OK
date: Mon, 31 Mar 2025 09:51:47 GMT
server: uvicorn
content-type: text/plain; charset=utf-8
Transfer-Encoding: chunked
1f
Waiting for new log entries...
1f
Waiting for new log entries...
1f
Waiting for new log entries...
1f
Waiting for new log entries...
1f
Waiting for new log entries...
30
Simulated log entry at Mon Mar 31 20:21:53 2025
0
Here’s a diagram to help visualize the chunked transfer encoding:
Here’s what’s happening:
- Each chunk starts with its length in hexadecimal (1f = 31 bytes).
- The data follows on the next line. Both the size line and the data end with a CRLF (\r\n), after which the next chunk begins.
- The chunk with size 30 carries a simulated log entry (hexadecimal 30 = 48 bytes).
- The response ends with a zero-length chunk (0).
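To make the client's view concrete, here is a minimal sketch of a decoder that walks a complete raw body the way the bullets above describe. The function name decode_chunked is made up for this post, and the sketch assumes the whole body is already in memory and ignores trailers, so treat it as an illustration rather than a production parser.

```python
def decode_chunked(raw: bytes) -> list[bytes]:
    """Split a complete chunked body into its data chunks (illustrative only)."""
    chunks = []
    pos = 0
    while True:
        # The size line runs up to the next CRLF and holds a hexadecimal number.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0]  # drop any chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # zero-length chunk: the body is complete (trailers ignored here)
        chunks.append(raw[pos:pos + size])
        pos += size
        # Every data block must be followed by its own CRLF.
        if raw[pos:pos + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        pos += 2
    return chunks


raw_body = (
    b"1f\r\nWaiting for new log entries...\n\r\n"
    b"30\r\nSimulated log entry at Mon Mar 31 20:21:53 2025\n\r\n"
    b"0\r\n\r\n"
)
for chunk in decode_chunked(raw_body):
    print(len(chunk), chunk)
```

The two sizes printed are 31 and 48, matching the 1f and 30 prefixes in the curl output.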
Note: This aligns with the techniques demonstrated in my previous blog series Streaming APIs with FastAPI and Next.js (Part 1) and Part 2.
🛠️ Step-by-Step Breakdown of Chunking
We’ll use index.py from that backend. Here’s exactly what’s happening under the hood:
1. Generator (yield):
- Every time yield is executed, the Starlette framework (used internally by FastAPI) receives a new piece of data to stream to the client.
- Each yielded data segment corresponds directly to one HTTP chunk.
For example, the yielded line:
yield "Waiting for new log entries...\n"
is packaged into one HTTP chunk.
2. Starlette’s StreamingResponse Handling:
- The StreamingResponse from Starlette wraps the async generator.
- Starlette doesn’t wait until the generator finishes (which might be infinite). Instead, it immediately pushes each yielded chunk to the underlying ASGI server, typically Uvicorn.
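Steps 1 and 2 can be pictured at the ASGI level. The sketch below is not Starlette's actual code; it is a bare ASGI app, written under the assumption that a streaming response boils down to one http.response.body message per yielded item with more_body set to True, followed by a final empty message with more_body set to False. The names log_generator, stream_app, and collect_messages are invented for the example, and the generator is kept finite so the script terminates.

```python
import asyncio


async def log_generator():
    # Stand-in for the blog's generator; finite so the example ends.
    for _ in range(2):
        yield "Waiting for new log entries...\n"
        await asyncio.sleep(0)  # hand control back to the event loop


async def stream_app(scope, receive, send):
    """Bare ASGI app that forwards each yielded item as its own body message."""
    await send({
        "type": "http.response.start",
        "status": 200,
        # No Content-Length header: the total size is unknown up front.
        "headers": [(b"content-type", b"text/plain; charset=utf-8")],
    })
    async for item in log_generator():
        await send({
            "type": "http.response.body",
            "body": item.encode("utf-8"),
            "more_body": True,  # more data will follow
        })
    # An empty final message tells the server that the body is complete.
    await send({"type": "http.response.body", "body": b"", "more_body": False})


async def collect_messages():
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    await stream_app({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(collect_messages())
for message in messages:
    print(message["type"], message.get("more_body"), message.get("body"))
```

Because the app never declares a Content-Length, an HTTP/1.1 server has to frame the body some other way, and that is where chunked encoding comes in.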
3. Uvicorn’s Chunk Formatting:
Uvicorn (the ASGI server you’re using) receives the yielded chunk from Starlette and formats it according to the HTTP/1.1 chunked transfer encoding specification.
Each chunk is transmitted as follows:
<chunk-size in hexadecimal>\r\n
<chunk-data>\r\n
Here’s how one of your actual data chunks might look:
1f\r\n
Waiting for new log entries...\n\r\n
Here 1f = 31 bytes, the exact length of "Waiting for new log entries...\n".
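The framing rule is small enough to write out by hand. This is a sketch of the wire format only, not Uvicorn's implementation, and frame_chunk is a hypothetical helper name:

```python
def frame_chunk(data: bytes) -> bytes:
    """Wrap one payload as <size in hex>CRLF<data>CRLF."""
    size_line = f"{len(data):x}".encode("ascii")  # lowercase hex, no "0x" prefix
    return size_line + b"\r\n" + data + b"\r\n"


waiting = frame_chunk(b"Waiting for new log entries...\n")
entry = frame_chunk(b"Simulated log entry at Mon Mar 31 20:21:53 2025\n")
print(waiting)
print(entry)
```

One detail worth noticing: framing an empty payload with this helper produces 0\r\n\r\n, which is exactly the sequence that ends a chunked body. That is why an empty piece of data can't be sent as a regular chunk.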
4. Continuous Chunk Transmission:
- Uvicorn immediately sends each formatted chunk down the TCP connection.
- Your client (like curl) receives each chunk as soon as it’s sent, which allows incremental processing.
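Most HTTP clients hide the chunk framing entirely, which is the opposite of what curl --raw shows. The sketch below uses only the Python standard library: a throwaway local server (ChunkedHandler, a stand-in for the FastAPI backend, not the real one) writes three chunks by hand, and http.client reads them back as plain lines with the size prefixes already stripped. The port is picked by the OS, and read_stream is a name invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

LINES = [b"Waiting for new log entries...\n"] * 3


class ChunkedHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked encoding is an HTTP/1.1 feature

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for line in LINES:
            # Hand-written chunk framing: hex size, CRLF, data, CRLF.
            self.wfile.write(b"%x\r\n%s\r\n" % (len(line), line))
            self.wfile.flush()
        self.wfile.write(b"0\r\n\r\n")  # terminating zero-length chunk

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def read_stream():
    server = HTTPServer(("127.0.0.1", 0), ChunkedHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("GET", "/stream")
        response = conn.getresponse()
        encoding = response.getheader("Transfer-Encoding")
        # Iterating the response yields decoded lines, without chunk sizes.
        received = [line for line in response]
        return encoding, received
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


encoding, received = read_stream()
print(encoding, received)
```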
5. Ending the Stream:
- When your generator completes, Uvicorn sends a special zero-length chunk (0\r\n\r\n) to indicate that transmission has ended. If the connection is cut off before that point, no terminating chunk arrives, and the client can tell the response was incomplete.
Example final chunk:
0\r\n
\r\n
🙋 What about HTTP/2 and HTTP/3?
The short answer: HTTP/2+ does not use chunked encoding at all. In fact, the HTTP/2 specification explicitly forbids the use of the Transfer-Encoding: chunked header; if a client incorrectly tries to send it, it’s considered a protocol error.
Instead, HTTP/2 uses a binary framing layer: the body travels in length-prefixed DATA frames, and a flag on the last frame marks the end of the stream. The same framing allows multiplexing multiple streams over a single connection. This means that chunked transfer encoding is not necessary in HTTP/2 and HTTP/3, as the protocol itself handles streaming.
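For contrast, here is a rough sketch of how a single HTTP/2 DATA frame is laid out: a 9-byte header (24-bit payload length, 8-bit type, 8-bit flags, 31-bit stream identifier) followed by the payload. The helper build_data_frame is hypothetical and leaves out padding, flow control, and everything else a real HTTP/2 stack handles, so it only illustrates where the length and the end-of-stream signal live.

```python
DATA_FRAME_TYPE = 0x0   # frame type for DATA
END_STREAM_FLAG = 0x1   # set on the last frame of a stream


def build_data_frame(stream_id: int, payload: bytes, end_stream: bool = False) -> bytes:
    """Serialize one unpadded HTTP/2 DATA frame (illustrative only)."""
    if len(payload) >= 2 ** 24:
        raise ValueError("payload does not fit in the 24-bit length field")
    flags = END_STREAM_FLAG if end_stream else 0
    header = (
        len(payload).to_bytes(3, "big")                 # 24-bit length
        + bytes([DATA_FRAME_TYPE, flags])               # type and flags
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")   # reserved bit cleared
    )
    return header + payload


frame = build_data_frame(1, b"hello", end_stream=True)
print(frame.hex())
```

The length sits in the frame header as a binary number rather than as hex text, and the end of the body is a flag bit instead of a zero-length chunk.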
Key Takeaways
Through these practical examples, you’ve seen firsthand how chunked transfer encoding enables incremental streaming of data:
- Responses are sent as a series of chunks, each with a defined size.
- The end of data transmission is indicated by a zero-length chunk.
- Tools like curl, Python frameworks like FastAPI, and browser developer tools help visualize and debug chunked encoding.
Understanding this helps you build better streaming APIs and debug complex HTTP interactions effectively.
Happy Streaming! 🚀