System Design Interview: Design a Chat Application Like WhatsApp

Design a real-time chat system for 500 million users — WebSockets vs polling, Kafka message queue, Cassandra storage, presence service and push notifications.

EzyCoders Admin, January 11, 2026

Design a Chat Application

Designing a real-time chat system like WhatsApp is a common senior-level system design question. It tests your knowledge of WebSockets, message queues, distributed storage, and scalability.

Requirements Clarification

Functional: 1-to-1 messaging, group chats (up to 256 members), message delivery status (sent/delivered/read), media sharing, online/offline status

Non-Functional: 500 million users, 100 billion messages/day, <100ms message delivery latency, messages stored for 5 years, end-to-end encryption

Back-of-Envelope

Messages per second = 100B / (24 × 3600) ≈ 1.16M msg/s average (peak will be higher)
Storage per message = 100 bytes avg
Daily storage       = 100B messages × 100 bytes = 10 TB/day
5-year storage      = 10 TB × 365 × 5 ≈ 18 PB (need distributed storage)

Connections: 500M users, 20% active = 100M concurrent WebSocket connections
Servers needed: assume 1 server handles ~65K connections (a conservative planning figure)
WebSocket servers = 100M / 65K ≈ 1,540 servers
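The arithmetic above is easy to get wrong by a factor of a thousand, so it can help to script it. The following is a minimal sketch that recomputes the estimates from the stated requirements; the constant and function names are illustrative, and decimal units (1 TB = 10^12 bytes) are assumed.

```python
import math

# Inputs taken from the requirements and estimates above.
MESSAGES_PER_DAY = 100_000_000_000   # 100 billion
AVG_MESSAGE_BYTES = 100
TOTAL_USERS = 500_000_000
ACTIVE_PERCENT = 20
CONNECTIONS_PER_SERVER = 65_000
RETENTION_YEARS = 5

SECONDS_PER_DAY = 24 * 3600
TB = 10 ** 12   # decimal terabyte (assumption)
PB = 10 ** 15   # decimal petabyte (assumption)


def estimate() -> dict:
    """Recompute the back-of-envelope numbers from the inputs above."""
    daily_bytes = MESSAGES_PER_DAY * AVG_MESSAGE_BYTES
    concurrent = TOTAL_USERS * ACTIVE_PERCENT // 100
    return {
        "avg_msgs_per_sec": MESSAGES_PER_DAY / SECONDS_PER_DAY,
        "daily_storage_tb": daily_bytes / TB,
        "retention_storage_pb": daily_bytes * 365 * RETENTION_YEARS / PB,
        "concurrent_connections": concurrent,
        "websocket_servers": math.ceil(concurrent / CONNECTIONS_PER_SERVER),
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.2f}")
```

Changing a single input, for example the per-server connection count, shows quickly how sensitive the server count is to that assumption.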

High-Level Architecture

Client (Mobile/Web)
    |
    | WebSocket (persistent connection)
    |
Load Balancer (L7 — sticky sessions by user ID)
    |
Chat Servers (stateful — maintain WebSocket connections)
    |
    ├── Message Queue (Apache Kafka)
    |       |
    |       └── Message Processor Service
    |               ├── Store to Cassandra (messages)
    |               ├── Push Notification Service (FCM/APNs)
    |               └── Update delivery status
    |
    ├── Presence Service (Redis pub/sub — online/offline)
    |
    └── Media Service
            ├── S3 / CDN (store images, videos)
            └── Return pre-signed URLs to clients
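The diagram says the load balancer keeps sessions sticky by user ID but does not say how. One common option is consistent hashing, sketched below as an assumption rather than as the design's stated mechanism. The class name `ConsistentHashRing`, the `vnodes` parameter and the server names are all hypothetical.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps user IDs to chat servers so the same user always lands on the same server."""

    def __init__(self, servers, vnodes: int = 100):
        # Each server is placed on the ring many times (virtual nodes)
        # so that load spreads more evenly.
        ring = []
        for server in servers:
            for i in range(vnodes):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def server_for(self, user_id: str) -> str:
        # Walk clockwise to the first virtual node after the user's position,
        # wrapping around at the end of the ring.
        idx = bisect.bisect_right(self._hashes, self._hash(user_id)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = ConsistentHashRing([f"chat-{i}" for i in range(5)])
    print(ring.server_for("user-42"))
```

The reason to prefer this over `hash(user_id) % server_count` is that removing or adding one server only remaps the users that were assigned to it, so most WebSocket connections survive a fleet change.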

Message Delivery Flow

import json
import time
import uuid

# kafka, cassandra, presence, internal_rpc and push stand for async client
# wrappers that the application wires up elsewhere.

# 1. Sender sends message via WebSocket
class ChatServer:
    async def handle_message(self, ws, message: dict):
        # Validate and enrich message
        msg = {
            'id':          str(uuid.uuid1()),   # time-based UUID, fits a TIMEUUID column
            'chat_id':     message['chat_id'],  # required by the INSERT in step 4
            'sender_id':   message['sender_id'],
            'receiver_id': message['receiver_id'],
            'content':     message['content'],
            'timestamp':   time.time(),
            'status':      'sent',
        }

        # 2. Publish to Kafka immediately (fast, non-blocking)
        await kafka.produce('messages', key=msg['receiver_id'], value=msg)

        # 3. ACK back to sender: message accepted
        await ws.send(json.dumps({'type': 'ack', 'msg_id': msg['id']}))

# 4. Message Processor (Kafka consumer)
async def process_message(msg):
    # Store in Cassandra (column names match the messages table;
    # the send time is carried inside the time-based message_id)
    await cassandra.execute(
        'INSERT INTO messages (chat_id, message_id, sender_id, content, status) VALUES (?,?,?,?,?)',
        [msg['chat_id'], msg['id'], msg['sender_id'], msg['content'], msg['status']]
    )

    # 5. Deliver to receiver if online
    receiver_server = await presence.get_server(msg['receiver_id'])
    if receiver_server:
        # Forward to the server holding receiver's WebSocket
        await internal_rpc.deliver(receiver_server, msg)
    else:
        # Receiver offline: send push notification
        await push.notify(msg['receiver_id'], msg)
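Step 5 depends on a presence lookup that answers "which chat server, if any, holds this user's connection?". The following is a minimal in-memory sketch of such a lookup, assuming entries expire unless the chat server refreshes them with a heartbeat. The class name `PresenceRegistry`, its `heartbeat` and `disconnect` methods and the 30-second TTL are illustrative assumptions; the design above uses Redis for this, and the sketch is synchronous for brevity whereas the flow above awaits the call.

```python
import time


class PresenceRegistry:
    """In-memory stand-in for the presence lookup used in step 5."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # user_id -> (server_id, expires_at)

    def heartbeat(self, user_id: str, server_id: str) -> None:
        # Called by the chat server on connect and periodically afterwards.
        self._entries[user_id] = (server_id, self._clock() + self._ttl)

    def disconnect(self, user_id: str) -> None:
        # Called when the WebSocket closes cleanly.
        self._entries.pop(user_id, None)

    def get_server(self, user_id: str):
        # Returns the server holding the user's connection, or None if offline.
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        server_id, expires_at = entry
        if self._clock() >= expires_at:
            # Missed heartbeats: treat the user as offline.
            del self._entries[user_id]
            return None
        return server_id


if __name__ == "__main__":
    registry = PresenceRegistry()
    registry.heartbeat("alice", "chat-7")
    print(registry.get_server("alice"))   # chat-7
    print(registry.get_server("bob"))     # None
```

The expiry matters because mobile clients often drop off the network without closing the socket; without a TTL the processor would keep forwarding messages to a server that no longer holds the connection instead of falling back to a push notification.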

Database Choice

-- Cassandra: perfect for chat messages
-- Partition by chat_id, cluster by timestamp (reverse — latest first)
CREATE TABLE messages (
    chat_id    UUID,
    message_id TIMEUUID,  -- time-based UUID ensures ordering
    sender_id  UUID,
    content    TEXT,
    media_url  TEXT,
    status     TEXT,      -- sent/delivered/read
    PRIMARY KEY (chat_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC)
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};
-- Time-window compaction: efficient for time-series data
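The schema relies on `message_id` being a time-based (version 1) UUID, which Cassandra orders by its embedded timestamp. The sketch below shows how that timestamp is packed into the UUID and why sorting must use it rather than the raw 128-bit value, whose leading bits are the low part of the timestamp. The helper names are hypothetical, and the fixed `clock_seq` and `node` values are placeholders; a real generator would vary them to avoid collisions.

```python
import uuid

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns ticks.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_from_unix_ms(unix_ms: int, clock_seq: int = 0, node: int = 0) -> uuid.UUID:
    """Build a version 1 UUID whose embedded timestamp is the given Unix time in ms."""
    ticks = unix_ms * 10_000 + UUID_EPOCH_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000          # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80       # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def unix_ms_from_timeuuid(u: uuid.UUID) -> int:
    """Recover the Unix time in ms from a version 1 UUID."""
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


def newest_first(ids):
    """Order message IDs the way the DESC clustering order returns them."""
    return sorted(ids, key=lambda u: u.time, reverse=True)


if __name__ == "__main__":
    ids = [timeuuid_from_unix_ms(ms) for ms in (1_700_000_000_000, 1_700_000_060_000)]
    for u in newest_first(ids):
        print(u, unix_ms_from_timeuuid(u))
```

Because the send time can be recovered from the ID, the table does not need a separate timestamp column for ordering.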

Q: Why WebSockets instead of HTTP polling?

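HTTP polling sends a request every N seconds even when there are no messages, which wastes bandwidth and server capacity. Long polling holds the connection open until data arrives, which is better but still pays full HTTP request and response overhead for every message. WebSockets maintain a persistent bidirectional connection with minimal framing overhead: 2 bytes for a small server-to-client frame (6 bytes client-to-server, because client frames carry a 4-byte masking key), versus hundreds of bytes of HTTP headers.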
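To put rough numbers on the overhead comparison, the sketch below computes the WebSocket frame header size from the framing rules in RFC 6455 and compares it with polling. The 500-byte HTTP header figure and the function names are assumptions chosen for illustration; real header sizes vary with cookies and other request metadata.

```python
# Assumed combined size of HTTP request and response headers for one poll.
ASSUMED_HTTP_HEADER_BYTES = 500


def ws_frame_overhead(payload_len: int, masked: bool) -> int:
    """Header bytes for one WebSocket frame, per the RFC 6455 framing rules."""
    if payload_len <= 125:
        header = 2            # length fits in the 7-bit field
    elif payload_len <= 0xFFFF:
        header = 4            # 2 extra bytes for a 16-bit length
    else:
        header = 10           # 8 extra bytes for a 64-bit length
    # Client-to-server frames carry a 4-byte masking key.
    return header + (4 if masked else 0)


def polling_overhead_per_hour(interval_seconds: int) -> int:
    """Header bytes spent per hour by a client that polls at a fixed interval."""
    requests = 3600 // interval_seconds
    return requests * ASSUMED_HTTP_HEADER_BYTES


if __name__ == "__main__":
    print("100-byte message, server to client:", ws_frame_overhead(100, masked=False))
    print("100-byte message, client to server:", ws_frame_overhead(100, masked=True))
    print("polling every 5 s, bytes/hour:", polling_overhead_per_hour(5))
```

Under these assumptions an idle client polling every 5 seconds spends 360,000 header bytes per hour without receiving anything, while an idle WebSocket only pays for occasional keepalive frames.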


Q: Why Cassandra for messages instead of MySQL?

Cassandra is optimized for append-heavy, time-series write workloads, which is exactly what chat produces. It scales horizontally across datacenters with no single point of failure. A single MySQL node becomes hard to operate past a few TB, and growing beyond that requires complex sharding. Cassandra's partition key (chat_id) keeps all messages for a conversation co-located on the same replica nodes, so loading a conversation reads from one partition.
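The co-location point can be pictured with a toy model: one sorted list per chat_id, kept newest first, so that loading the latest messages of a conversation is a slice of a single partition. This is only an in-memory illustration of the access pattern, not how Cassandra stores data internally; the class name `PartitionedMessageStore` and its methods are hypothetical.

```python
import bisect
from collections import defaultdict


class PartitionedMessageStore:
    """Toy model of 'partition by chat_id, cluster by time descending'."""

    def __init__(self):
        # chat_id -> list of (-timestamp, content), kept sorted ascending,
        # which means newest message first.
        self._partitions = defaultdict(list)

    def insert(self, chat_id: str, timestamp: int, content: str) -> None:
        # Only the partition for this chat_id is touched.
        bisect.insort(self._partitions[chat_id], (-timestamp, content))

    def latest(self, chat_id: str, limit: int):
        # Reading the newest messages is a prefix of one partition.
        rows = self._partitions.get(chat_id, [])
        return [(-neg_ts, content) for neg_ts, content in rows[:limit]]


if __name__ == "__main__":
    store = PartitionedMessageStore()
    store.insert("chat-a", 100, "hi")
    store.insert("chat-a", 300, "how are you?")
    store.insert("chat-a", 200, "hello")
    print(store.latest("chat-a", 2))
```

A query for another conversation never scans `chat-a`'s rows, which is the property the partition key provides at cluster scale.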

