Real-time communication is everywhere: live chatbots, data streams, and instant messaging. WebSockets are a powerful enabler of this, but when should you use them? How do they work, and how do they differ from traditional HTTP requests?
This article was inspired by a recent system design interview (“design a real-time messaging app”) where I stumbled through some concepts. Now that I’ve dug deeper, I’d like to share what I’ve learned so you can avoid the same mistakes.
In this article, we’ll explore how WebSockets fit into the bigger picture of client-server communication. We’ll discuss what they do well, where they fall short, and, yes, how to design a real-time messaging app.
At its core, client-server communication is the exchange of data between two entities: a client and a server.
The client requests data, and the server processes those requests and returns a response. These roles aren’t exclusive: services can act as both a client and a server simultaneously, depending on the context.
Before diving into the details of WebSockets, let’s take a step back and explore the bigger picture of client-server communication methods.
1. Short polling
Short polling is the simplest, most familiar approach.
The client repeatedly sends HTTP requests to the server at regular intervals (e.g., every few seconds) to check for new data. Each request is independent, and the flow is one-directional in the sense that only the client initiates it (client → server).
This method is easy to set up but can waste resources if the server rarely has fresh data. Use it for less time-sensitive applications where occasional polling is sufficient.
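A minimal sketch of the polling loop described above. The `fetch` callable stands in for an HTTP request, and `fake_fetch`, the interval, and the attempt limit are illustrative assumptions rather than part of any real API.

```python
import time
from typing import Callable, Optional


def short_poll(
    fetch: Callable[[], Optional[str]],
    interval_s: float,
    max_attempts: int,
) -> Optional[str]:
    """Call fetch() at a fixed interval until it returns data or attempts run out."""
    for attempt in range(max_attempts):
        data = fetch()  # one independent request per attempt
        if data is not None:
            return data
        if attempt < max_attempts - 1:
            time.sleep(interval_s)  # wait before asking again
    return None


# Simulated server (hypothetical): no data for the first two requests.
_responses = iter([None, None, "new message"])
calls = []


def fake_fetch() -> Optional[str]:
    calls.append(1)
    return next(_responses)


result = short_poll(fake_fetch, interval_s=0.01, max_attempts=5)
print(result, len(calls))
```

Two of the three requests here came back empty, which is the wasted work that makes short polling a poor fit when data changes rarely.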
2. Long polling
Long polling is an improvement over short polling, designed to reduce the number of unnecessary requests. Instead of responding immediately to a client request, the server keeps the connection open until new data is available. Once the server has data, it sends the response, and the client immediately establishes a new connection.
Long polling is also stateless and one-directional (client → server).
A common example is a ride-hailing app, where the client waits for a match or booking update.
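To make the "hold the request open" idea concrete, here is a small in-process simulation. `LongPollServer` is a hypothetical stand-in for a real HTTP server: a blocking queue read plays the role of the held connection, and the timeout mimics the server giving up so the client can ask again.

```python
import queue
import threading
import time
from typing import Optional


class LongPollServer:
    """Toy stand-in for a server that holds requests open until data exists."""

    def __init__(self) -> None:
        self._updates: "queue.Queue[str]" = queue.Queue()

    def publish(self, update: str) -> None:
        self._updates.put(update)

    def handle_request(self, timeout_s: float) -> Optional[str]:
        # Block until data is available or the request times out.
        try:
            return self._updates.get(timeout=timeout_s)
        except queue.Empty:
            return None  # the client would simply issue a new request


server = LongPollServer()
# The update arrives 50 ms after the client starts waiting.
threading.Timer(0.05, server.publish, args=["driver matched"]).start()

start = time.monotonic()
update = server.handle_request(timeout_s=2.0)
waited = time.monotonic() - start
print(update)
```

The client gets the update as soon as it is published, without having sent any empty requests in between.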
3. Webhooks
Webhooks flip the script by making the server the initiator. The server sends HTTP POST requests to a client-defined endpoint whenever specific events occur.
Each request is independent and doesn’t rely on a persistent connection. Webhooks are also one-directional (server to client).
Webhooks are widely used for asynchronous notifications, especially when integrating with third-party services. For example, payment systems use webhooks to notify clients when the status of a transaction changes.
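As a rough illustration, the server side of a webhook boils down to building a POST request with an event payload. The endpoint URL, event type, and field names below are made up for the example; real providers define their own payload formats.

```python
import json
import urllib.request


def build_webhook_request(endpoint_url: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) the HTTP POST a server would make for one event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical client-registered endpoint and payment event.
req = build_webhook_request(
    "https://client.example.com/webhooks/payments",
    {"type": "payment.succeeded", "transaction_id": "txn_123"},
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```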
4. Server-Sent Events (SSE)
SSE is a native HTTP-based event streaming protocol that allows servers to push real-time updates to clients over a single, persistent connection.
SSE works using the EventSource API, making it simple to implement in modern web applications. It’s one-directional (server to client) and ideal for situations where the client only needs to receive updates.
SSE is well-suited for applications like trading platforms or live sports updates, where the server pushes data like stock prices or scores in real time. The client doesn’t need to send data back to the server in these scenarios.
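On the wire, an SSE stream is plain text: each event is a set of `field: value` lines ending with a blank line. The sketch below formats and parses a single event. It is a simplified illustration covering only the `event` and `data` fields, and the helper names are my own.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialise one event in the text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload becomes one "data:" line per line of text.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"  # blank line terminates the event


def parse_sse(block: str) -> dict:
    """Parse one event block back into its name and data (simplified)."""
    event = "message"  # default event type when none is given
    data_lines = []
    for line in block.strip("\n").split("\n"):
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data_lines.append(value)
    return {"event": event, "data": "\n".join(data_lines)}


wire = format_sse("AAPL 189.20", event="price")
parsed = parse_sse(wire)
print(repr(wire), parsed)
```

In a browser, the EventSource API does this parsing for you and dispatches each event to a listener.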
However what about two-way communication?
All the methods above focus on one-directional flow. For true two-way, real-time exchanges, we need a different approach. That’s where WebSockets shine.
Let’s dive in.
WebSockets enable real-time, bidirectional communication, making them ideal for applications like chat apps, live notifications, and online gaming. Unlike the traditional HTTP request-response model, WebSockets create a persistent connection, where both client and server can send messages independently without waiting for a request.
The connection starts as a regular HTTP request and is upgraded to a WebSocket connection via a handshake.
Once established, it uses a single TCP connection, operating on the same ports as HTTP and HTTPS (80 and 443). Messages sent over WebSockets are small and lightweight, making them efficient for low-latency, high-interactivity use cases.
WebSocket connections follow a specific URI format: ws:// for regular connections and wss:// for secure, encrypted connections.
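A small sketch of how a client might interpret these URIs, falling back to port 80 for ws:// and 443 for wss:// when no port is given. The function name is an assumption for illustration.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ws": 80, "wss": 443}


def websocket_endpoint(uri: str) -> Tuple[Optional[str], int, bool]:
    """Return (host, port, uses_tls) for a ws:// or wss:// URI."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URI: {uri!r}")
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.scheme == "wss"


print(websocket_endpoint("wss://server.example.com/chat"))
print(websocket_endpoint("ws://localhost:8080/chat"))
```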
What’s a handshake?
A handshake is the process of initialising a connection between two systems. For WebSockets, it starts with an HTTP GET request from the client, asking for a protocol upgrade. This ensures compatibility with HTTP infrastructure before transitioning to a persistent WebSocket connection.
1. Client sends a request, with headers that look like:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Upgrade signals the request to switch protocols.
Sec-WebSocket-Key is a randomly generated, base64-encoded string used for handshake verification.
Sec-WebSocket-Protocol (optional) lists the subprotocols the client supports, allowing the server to pick one.
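As a sketch of the client side, the request can be assembled as plain text, with the key generated as 16 random bytes encoded in base64. The function name and the choice to omit the optional Origin and Sec-WebSocket-Protocol headers are my own simplifications.

```python
import base64
import os
from typing import Tuple


def build_upgrade_request(host: str, path: str) -> Tuple[str, str]:
    """Return the raw handshake request and the key that was generated for it."""
    # 16 random bytes, base64-encoded, gives a 24-character key.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"  # blank line ends the header section
    )
    return request, key


request, key = build_upgrade_request("server.example.com", "/chat")
print(request)
```

The client keeps the key so it can later check the server's response against it.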
2. Server responds to the request
If the server supports WebSockets and agrees to the upgrade, it responds with a 101 Switching Protocols status. Example headers:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat
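Sec-WebSocket-Accept is the base64-encoded SHA-1 hash of the client’s Sec-WebSocket-Key concatenated with a fixed GUID. This lets the client confirm that the server actually understood the WebSocket handshake, rather than replying with an unrelated or cached HTTP response.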
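The computation is short enough to show in full. This sketch uses the fixed GUID defined in RFC 6455 and the sample key from the request headers above; the function name is my own.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket specification (RFC 6455).
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


accept = compute_accept("dGhlIHNhbXBsZSBub25jZQ==")
print(accept)  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The result matches the value in the example response, which is how the client verifies it.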
3. Handshake validation
With the 101 Switching Protocols response, the WebSocket connection is successfully established and both client and server can start exchanging messages in real time.
This connection will remain open until it’s explicitly closed by either party.
If any status code other than 101 is returned, the WebSocket handshake fails and the client has to end the connection.
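Here is a rough sketch of the check a client might run on the raw response. It is deliberately simplified (no subprotocol or extension checks), and the function names are illustrative.

```python
from typing import Dict, Tuple


def parse_response(raw: str) -> Tuple[int, Dict[str, str]]:
    """Split a raw HTTP response head into a status code and lowercase headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def handshake_succeeded(raw: str, expected_accept: str) -> bool:
    """True only for a 101 response carrying the expected upgrade headers."""
    status, headers = parse_response(raw)
    return (
        status == 101
        and headers.get("upgrade", "").lower() == "websocket"
        and "upgrade" in headers.get("connection", "").lower()
        and headers.get("sec-websocket-accept") == expected_accept
    )


ok_response = (
    "HTTP/1.1 101 Switching Protocols\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=\r\n"
    "\r\n"
)
bad_response = "HTTP/1.1 400 Bad Request\r\n\r\n"

print(handshake_succeeded(ok_response, "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="))
print(handshake_succeeded(bad_response, "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="))
```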
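Here’s a summary: the client sends an HTTP GET request with an Upgrade header, the server replies with 101 Switching Protocols, and from then on both sides exchange messages over the same TCP connection.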
We’ve talked about how WebSockets enable real-time, bidirectional communication, but that’s still a fairly abstract term. Let’s nail down some real examples.
WebSockets are widely used in real-time collaboration tools and chat applications, such as Excalidraw, Telegram, WhatsApp, Google Docs, Google Maps and the live chat section during a YouTube or TikTok live stream.
1. Having a fallback strategy if connections are terminated
WebSockets don’t automatically recover if the connection is terminated due to network issues, server crashes, or other failures. The client must explicitly detect the disconnection and attempt to re-establish the connection.
Long polling is often used as a backup while the client tries to re-establish the WebSocket connection.
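One common way to handle the retry side is to reconnect with exponential backoff, so a struggling server isn't hammered with attempts. This is a sketch under assumptions: `connect` stands in for whatever opens the WebSocket, and the delay values are arbitrary.

```python
import time
from typing import Callable, List


def backoff_delays(base_s: float, cap_s: float, attempts: int) -> List[float]:
    """Delays that double after each failed attempt, up to a cap."""
    return [min(cap_s, base_s * 2 ** n) for n in range(attempts)]


def reconnect(
    connect: Callable[[], bool],
    delays: List[float],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry connect() until it succeeds; return the attempt number, or -1."""
    for attempt, delay in enumerate(delays, start=1):
        if connect():
            return attempt
        sleep(delay)  # wait longer after each failure
    return -1


# Simulated connection (hypothetical) that fails twice, then succeeds.
outcomes = iter([False, False, True])
slept: List[float] = []

delays = backoff_delays(base_s=0.5, cap_s=4.0, attempts=5)
attempts_used = reconnect(lambda: next(outcomes), delays, sleep=slept.append)
print(delays, attempts_used, slept)
```

Real clients usually add random jitter to the delays so that many clients don't all reconnect at the same moment.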
2. Not optimised for streaming audio and video data
WebSockets are designed for sending small, structured messages. For streaming large media data, a technology like WebRTC is better suited.
3. WebSockets are stateful, hence horizontal scaling is not trivial
WebSockets are stateful, meaning the server must maintain an active connection for every client. This makes horizontal scaling more complex compared to stateless HTTP, where any server can handle a client request without maintaining persistent state.
To route messages between clients connected to different servers, you’ll need an additional pub/sub layer.
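The idea can be sketched in-process. Below, `Broker` is a hypothetical in-memory stand-in for a shared pub/sub service, and each `ChatServer` only knows about its own connected clients. Per-client lists stand in for real WebSocket connections.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """In-memory stand-in for a pub/sub service shared by all servers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> None:
        for callback in self._subscribers[channel]:
            callback(message)


class ChatServer:
    """One server instance holding only its own clients' connections."""

    def __init__(self, broker: Broker, channel: str = "chat") -> None:
        self.broker = broker
        self.channel = channel
        self.inboxes: Dict[str, List[str]] = {}  # client id -> delivered messages
        broker.subscribe(channel, self._deliver_locally)

    def connect(self, client_id: str) -> None:
        self.inboxes[client_id] = []

    def send(self, message: str) -> None:
        # Publish to the broker instead of delivering directly.
        self.broker.publish(self.channel, message)

    def _deliver_locally(self, message: str) -> None:
        for inbox in self.inboxes.values():
            inbox.append(message)


broker = Broker()
server_a, server_b = ChatServer(broker), ChatServer(broker)
server_a.connect("alice")
server_b.connect("bob")
server_a.send("hi from alice")
print(server_b.inboxes["bob"])
```

Bob receives the message even though he is connected to a different server than Alice, because both servers subscribe to the same channel. In this simplified version the sender's own server also delivers the message back to Alice.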
Now let’s see how this is applied in system design. I’ve covered both the simple (unscalable) solution and a horizontally scaled one.