May 24, 2025
You might know that WebSocket has message ordering and message delivery guarantees. But why is that the case? And under what conditions do those guarantees hold? I was recently discussing WebSocket behavior with another engineer and realized I didn’t have a solid foundation here, so I did some digging to understand it better…
Yes, WebSocket is an application-level protocol (layer 7 in the OSI model, a la HTTP) running over TCP at the transport level (layer 4). So you get the guarantees of TCP, namely: reliable delivery and reliable ordering of messages.
Each TCP segment has a sequence number header, ordering each segment monotonically. The receiver ACKs segment offsets back to the sender. If the sender doesn't get an ACK in time, it will automatically resend the unacknowledged segments - or the receiver can request them again.
In the case of corrupted segments, there is a checksum header that lets the receiver know to re-request a segment. Otherwise, the receiver sends ACKs to confirm receipt of segments, allowing the sender to track successful delivery.
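As a toy illustration of what those sequence numbers buy us - segments can arrive in any order, and the receiver still reconstructs the original stream. The field names here are illustrative, not actual TCP header fields:

```typescript
// Toy sketch of in-order reassembly: sort buffered segments by
// sequence number before releasing their payloads.
interface Segment {
  seq: number
  payload: string
}

function reassemble(received: Segment[]): string {
  return received
    .slice() // don't mutate the input buffer
    .sort((a, b) => a.seq - b.seq)
    .map(s => s.payload)
    .join("")
}

// The segments arrived out of order, but the stream comes out correct.
console.log(reassemble([
  { seq: 2, payload: "lo" },
  { seq: 1, payload: "hel" },
])) // "hello"
```

Real TCP does this with a sliding window over byte offsets rather than whole segments, but the principle is the same.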
We know WebSocket message order is guaranteed through TCP, but things can get out of order quite easily. Let’s go through a case.
Primeagen is a popular programming creator (I follow him!)
Here’s the code that Primeagen is referring to (simplified for clarity):
ws.onmessage = async (message) => {
const blob = message.data
const arrBuff = await blob.arrayBuffer()
console.log(arrBuff) // messages are logged out of order
}
Primeagen knows WebSocket is on TCP and has ordering guarantees. He noticed that the messages were logging out of order and initially suspected a JavaScript bug — perhaps something wrong with how the engine handled binary WebSocket frames.
So what’s actually going wrong here?
ws.onmessage = async (message) => {
const blob = message.data
const arrBuff = await blob.arrayBuffer() // we're async now folks
console.log(arrBuff)
}
The await pauses execution of the message handler, yielding control back to the event loop. .arrayBuffer() takes a variable amount of time depending on the size of the message blob. So this isn't a JavaScript engine bug - it's an application-level issue: the handling of each WebSocket message is reordered based on how long its .arrayBuffer() call takes to resolve.
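We can reproduce the effect without a real WebSocket at all. In this sketch, decode stands in for blob.arrayBuffer() (both names and delays are made up for illustration): three messages "arrive" in order, but each handler awaits a decode step of a different duration, so they complete out of order:

```typescript
const completed: number[] = []

// Stands in for blob.arrayBuffer(): resolution time varies with payload size.
function decode(id: number, delayMs: number): Promise<number> {
  return new Promise(resolve => setTimeout(() => resolve(id), delayMs))
}

async function handleMessage(id: number, delayMs: number): Promise<void> {
  completed.push(await decode(id, delayMs))
}

async function main(): Promise<void> {
  // Messages arrive in order 1, 2, 3, but each await yields to the
  // event loop, so faster decodes finish first - just like onmessage above.
  await Promise.all([
    handleMessage(1, 30),
    handleMessage(2, 10),
    handleMessage(3, 20),
  ])
  console.log(completed) // [2, 3, 1]: completion order, not arrival order
}

await main()
```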
I wrote up two options. In my implementation, my WebSocket server will send clients 10 messages with a randomized processing time of 1-5 seconds. I have a simple timeout promise on the client side to approximate processing time.
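For reference, the server-side messages might be generated like this (a sketch - the actual WebSocket server and send loop are omitted, and the Message shape is inferred from the client code below):

```typescript
interface Message {
  id: number
  number: number // randomized "processing time" in seconds
}

// Build the 10 messages the server sends; each carries a random 1-5s cost.
function makeMessages(count: number): Message[] {
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    number: 1 + Math.floor(Math.random() * 5),
  }))
}

const batch = makeMessages(10)
// Each message would then go out as socket.send(JSON.stringify(message)).
```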
async function resolveIn(seconds: number): Promise<number> {
return new Promise<number>(resolve => {
setTimeout(() => {
resolve(seconds)
}, seconds * 1000) // convert seconds to milliseconds
})
}
This design has a message queue with a simple processing flag (no proper mutex needed in single-threaded JS).
const messageQueue: Message[] = []
let processingQueue = false
async function processQueue() {
if (processingQueue) {
// a batch is already in flight; check back on the next tick
setTimeout(() => processQueue(), 0)
return
}
processingQueue = true
// take up to 5 messages off the front of the queue
const toProcess = messageQueue.splice(0, 5)
// process the batch in parallel; Promise.all preserves input order
const messages = await Promise.all(toProcess.map(async (message) => {
return {
id: message.id,
number: await resolveIn(message.number)
}
}))
// results come back in the same order the messages were queued
for (const message of messages) {
console.log(`Handled message ${message.id} with ${message.number}s client processing`)
}
processingQueue = false
}
socket.onmessage = (message) => {
const data = JSON.parse(message.data)
messageQueue.push(data)
processQueue()
}
This implementation puts messages on a queue. Messages are then batch processed 5 at a time, using Promise.all to preserve result ordering. This design allows for parallel processing, but ensures the final results are handled in the correct order.
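The key property being relied on here is that Promise.all resolves to its results in input order, regardless of which promise settles first. A minimal standalone check:

```typescript
async function demo(): Promise<string[]> {
  // The first promise takes longer, but its result still lands in slot 0.
  return Promise.all([
    new Promise<string>(resolve => setTimeout(() => resolve("slow"), 30)),
    new Promise<string>(resolve => setTimeout(() => resolve("fast"), 10)),
  ])
}

const results = await demo()
console.log(results) // ["slow", "fast"]: input order, not completion order
```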
const generator = messageGenerator()
await generator.next() // prime the generator to its first yield
async function* messageGenerator(): AsyncGenerator<void, void, Message> {
while (true) {
const message = yield
if (message) {
const number = await resolveIn(message.number)
console.log(`Handled message ${message.id} with ${number}s client processing`)
}
}
}
socket.onmessage = async (message) => {
const data = JSON.parse(message.data)
await generator.next(data)
}
We can also use a generator to process messages one at a time. This is slower than batching since nothing runs in parallel, but strict sequential handling may be required when each message must be fully processed before the next one begins.
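One subtlety worth calling out: if onmessage fires again before the previous generator.next() has settled, async generators queue the next() calls internally, so messages are still handled one at a time in arrival order. A sketch demonstrating that (delays chosen so the second message would "finish" faster if it could jump the queue):

```typescript
const handled: number[] = []

async function* worker(): AsyncGenerator<void, void, number> {
  while (true) {
    const n = yield
    if (n) {
      // Later messages "process" faster, but still wait their turn.
      await new Promise(resolve => setTimeout(resolve, 30 - n * 10))
      handled.push(n)
    }
  }
}

const g = worker()
await g.next() // prime to the first yield

// Fire without awaiting, like back-to-back onmessage events.
await Promise.all([g.next(1), g.next(2)])
console.log(handled) // [1, 2]: arrival order, even though 2 finishes faster
```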
Debugging complex systems is hard, especially when you’re dealing with layered abstractions like WebSocket over TCP. It’s easy to blame the protocol, the network, or the language runtime, when the real issue is how your application code handles concurrency. Always go one level deeper. In this case, even with TCP’s ordering guarantees, it’s still possible to break that order yourself.
In an upcoming post, I’ll talk about how to handle unreliable connections and state resync with WebSockets — stay tuned!