@chgeuer
Last active February 23, 2026 18:29

WhatsApp Gateway — Architecture Design Specification

AMQP 1.0 as Local IPC Protocol

Version: 0.1.0-draft
Status: Draft
Author: Christian
Date: February 2026


1. Introduction

1.1 Purpose

This document describes the architecture of the WhatsApp Gateway, an Elixir application that maintains a single authenticated WhatsApp Web session and exposes WhatsApp functionality to multiple local consumer processes over AMQP 1.0.

1.2 Problem Statement

WhatsApp limits each account to four concurrent linked device sessions. Multiple local applications — bots, automation scripts, monitoring tools — need independent access to a single WhatsApp account. Each application registering as its own device would quickly exhaust the session limit and create conflicting state.

1.3 Solution Overview

The WhatsApp Gateway acts as a single linked device, holding one WhatsApp Web session. It re-exposes the full WhatsApp experience to local consumers over AMQP 1.0, acting as a peer-to-peer AMQP server on localhost. Consumer applications connect as AMQP clients, subscribe to conversations via links, send messages, and query metadata — all multiplexed over a single TCP connection per consumer.

1.4 Why AMQP 1.0

AMQP 1.0 was chosen over raw WebSockets, gRPC, or a custom protocol for the following reasons:

  • Session multiplexing with independent flow control. Each WhatsApp conversation maps to an AMQP session. A noisy group chat cannot starve a critical 1:1 conversation because each session manages its own transfer window.
  • Credit-based flow control per link. Each consumer explicitly grants credit to the gateway per link, preventing unbounded memory growth from unconsumed messages.
  • Well-specified subscribe/unsubscribe semantics. Link attach and detach map directly to “I want to see this chat” and “I’m done with this chat” — no custom subscription protocol required.
  • Request/reply built into the protocol. Querying message history, group membership, or contact metadata uses standard AMQP request/reply over dedicated links, eliminating the need for a separate RPC mechanism.
  • Dynamic link creation by the server. The gateway can push new sessions and links to the client when relevant activity occurs, such as a tracked contact posting in a new group.
  • Existing implementation. A production-grade AMQP 1.0 client and server library already exists in Elixir, eliminating the need to build transport, framing, flow control, or session management from scratch.

2. System Context

┌──────────────────────────────────────────────────────────┐
│                      Local Machine                        │
│                                                           │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐    │
│  │  Bot A   │ │  Bot B   │ │  Bot C   │ │ WA Web UI│    │
│  │ (AMQP    │ │ (AMQP    │ │ (AMQP    │ │ (LiveView│    │
│  │  Client) │ │  Client) │ │  Client) │ │  + AMQP) │    │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘    │
│       │ AMQP 1.0   │ AMQP 1.0  │ AMQP 1.0   │          │
│       └────────────┬┴───────────┴─────────────┘          │
│                    │                                      │
│           ┌────────┴────────┐                             │
│           │  WhatsApp       │                             │
│           │  Gateway        │                             │
│           │                 │                             │
│           │  ┌───────────┐  │                             │
│           │  │ AMQP 1.0  │  │                             │
│           │  │ Server    │  │                             │
│           │  └─────┬─────┘  │                             │
│           │        │        │                             │
│           │  ┌─────┴─────┐  │  ┌─────────────────────┐   │
│           │  │ WA Proto  │  │  │  SQLite + Media      │   │
│           │  │ (Noise +  │──┼──│  Store               │   │
│           │  │ WebSocket)│  │  │  /var/lib/wa-gateway/ │   │
│           │  └─────┬─────┘  │  └─────────────────────┘   │
│           └────────┼────────┘                             │
│                    │                                      │
└────────────────────┼──────────────────────────────────────┘
                     │ TLS/WebSocket
                     ▼
             WhatsApp Servers

2.1 Components

| Component | Role |
| --- | --- |
| WhatsApp Gateway | Maintains the WhatsApp Web session. Runs an AMQP 1.0 server on localhost. Persists all messages and media to a local SQLite database and content-addressed media store. Translates between WhatsApp’s internal protocol and AMQP. |
| Consumer (Bot) | Any local process that connects to the gateway via AMQP 1.0. Subscribes to conversations, sends messages, queries metadata. |
| WA Web UI | Optional Phoenix LiveView application that acts as an AMQP consumer, providing a WhatsApp Web-like interface in the browser. Has no special privileges; it uses the same AMQP interface as any other consumer. |
| SQLite + Media Store | Local persistence layer. SQLite stores all message records, contacts, groups, and chat metadata. Media files are stored content-addressed by SHA-256 hash on the local filesystem. Together, these form the canonical archive for the account. |
| WhatsApp Servers | Meta’s backend infrastructure. The gateway connects as a linked device using the Noise protocol over WebSocket. |

3. AMQP 1.0 Mapping

This section defines how WhatsApp concepts map to AMQP 1.0 primitives.

3.1 Connection

Each consumer application establishes exactly one AMQP connection to the gateway. The connection carries authentication (e.g., a shared secret or local token) and capability negotiation via AMQP connection properties.

Consumer A ──── 1 AMQP Connection ────► Gateway
Consumer B ──── 1 AMQP Connection ────► Gateway

Connection properties advertised by the gateway:

| Property | Description |
| --- | --- |
| wa:account-jid | The JID of the authenticated WhatsApp account |
| wa:device-id | The linked device identifier |
| wa:server-version | Gateway software version |
| wa:capabilities | Bitmask of supported features |

3.2 Sessions — Conversation Contexts

Each AMQP session represents an independent conversation context. Sessions provide multiplexed, flow-controlled channels within a single TCP connection.

| WhatsApp Concept | AMQP Session |
| --- | --- |
| 1:1 chat with a contact | Session identified by contact JID |
| Group chat | Session identified by group JID |
| Broadcast list | Session identified by broadcast JID |
| Gateway control plane | Dedicated $gateway session |
| Presence/status | Dedicated $presence session |

Each session maintains its own transfer window, so a high-traffic group cannot interfere with delivery of messages from a priority 1:1 chat.

3.2.1 Session Lifecycle

Client-initiated: The consumer begins a session (AMQP BEGIN) for a specific conversation it wants to interact with.

Consumer                          Gateway
   │                                 │
   │─── BEGIN (jid="wife@s.whatsapp.net") ──►│
   │                                 │
   │◄── BEGIN ───────────────────────│
   │    (session established)        │

Server-initiated: The gateway may open a session toward the consumer when relevant activity is detected. For example, if the consumer has a session for a specific contact and that contact posts in a group, the gateway can push a new session for that group to the consumer.

Consumer                          Gateway
   │                                 │
   │  (has session for wife@...)     │
   │                                 │
   │     wife posts in group chat    │
   │                                 │
   │◄── BEGIN (jid="family-group@g.us") ──│
   │    (gateway pushes new session) │
   │                                 │
   │─── BEGIN ───────────────────────►│
   │    (consumer accepts or not)    │

The consumer can choose to accept the session (attach links, start receiving) or immediately end it if uninterested.

3.3 Links — Directional Data Streams

Within each session, AMQP links provide directional message flow. Links are the subscribe/unsubscribe primitive.

3.3.1 Standard Links per Chat Session

Each chat session exposes a standard set of links:

| Link Name | Direction | Purpose |
| --- | --- | --- |
| messages | Gateway → Consumer | Incoming messages in this chat |
| send | Consumer → Gateway | Outgoing messages to this chat |
| receipts | Gateway → Consumer | Delivery receipts, read receipts, played receipts |
| typing | Gateway → Consumer | Typing indicators, recording indicators |
| history | Bidirectional (request/reply) | Query past messages |
| meta | Bidirectional (request/reply) | Chat metadata (group info, contact info, settings) |

Chat Session: wife@s.whatsapp.net
┌─────────────────────────────────────────────┐
│                                             │
│  ◄── messages (incoming texts, media, etc.) │
│  ──► send (outgoing texts, media, etc.)     │
│  ◄── receipts (read, delivered, played)     │
│  ◄── typing (composing, recording)          │
│  ◄─► history (request/reply)                │
│  ◄─► meta (request/reply)                   │
│                                             │
└─────────────────────────────────────────────┘

3.3.2 Link Attach as Subscription

Attaching a link is the subscription mechanism. A consumer does not need to attach all links — it can attach only messages if it doesn’t care about typing indicators, or only meta if it just wants to query group membership.

Consumer                          Gateway
   │                                 │
   │─── ATTACH (name="messages",    │
   │     role=receiver)  ───────────►│
   │                                 │
   │◄── ATTACH ─────────────────────│
   │    (link established)           │
   │                                 │
   │─── FLOW (credit=100) ─────────►│
   │    (consumer grants credit)     │
   │                                 │
   │◄── TRANSFER (message) ─────────│
   │◄── TRANSFER (message) ─────────│
   │    ...                          │

3.3.3 Link Detach as Unsubscription

Detaching a link cleanly unsubscribes from that data stream without tearing down the session. The consumer can re-attach later.

3.4 Transfers — Message Encoding

WhatsApp messages are carried as AMQP transfer frames. The AMQP message structure maps as follows:

3.4.1 Message Properties (AMQP Properties Section)

| AMQP Property | WhatsApp Mapping |
| --- | --- |
| message-id | WhatsApp message ID |
| correlation-id | Quoted/replied-to message ID (if reply) |
| to | Recipient JID |
| reply-to | Sender JID |
| content-type | MIME type of the message body |
| creation-time | WhatsApp server timestamp |
| group-id | Conversation JID (for routing) |

3.4.2 Application Properties

| Key | Description |
| --- | --- |
| wa:message-type | text, image, video, audio, document, sticker, location, contact, reaction, poll, edit, revoke |
| wa:participant | Sender JID within a group |
| wa:push-name | Sender’s push name |
| wa:quoted-id | ID of the quoted message |
| wa:quoted-participant | Sender of the quoted message |
| wa:is-forwarded | Boolean |
| wa:forward-score | Forwarding count |
| wa:ephemeral-duration | Disappearing message TTL in seconds |
| wa:broadcast | Whether sent to a broadcast list |
| wa:edit-version | Edit revision number |
| wa:media-url | Original CDN URL (for reference; may be expired) |
| wa:media-key | Decryption key for media |
| wa:media-sha256 | SHA-256 hash of decrypted media (content address in local store) |
| wa:media-status | available, downloading, failed, expired-remote |
| wa:thumbnail | Base64-encoded thumbnail |

3.4.3 Message Body

The AMQP message body contains the primary content:

  • Text messages: UTF-8 string in an AMQP data section, content-type: text/plain
  • Media messages: Binary payload in an AMQP data section with the appropriate MIME type, served from the gateway’s local media store. The gateway always serves decrypted media directly — consumers never need to interact with WhatsApp’s CDN or handle media decryption. For large media, the body may instead contain a local reference (content-addressed SHA-256 path) that the consumer can fetch out-of-band, controlled by a per-consumer media_mode configuration.
  • Complex types (location, contact, poll): Structured data encoded as CBOR or JSON in an AMQP data section, content-type: application/cbor or application/json

3.5 Flow Control

AMQP 1.0 flow control is applied at two levels: window-based at the session level and credit-based at the link level.

3.5.1 Session-Level Flow Control

Each session has an independent transfer window. The consumer can grant a large window to priority conversations and a small window to noisy groups.

3.5.2 Link-Level Flow Control

Each link uses credit-based flow. The consumer explicitly grants credit (number of transfers it’s willing to accept) on each link. This prevents a consumer that only processes text messages from being overwhelmed by a stream of typing indicators it hasn’t consumed.

Consumer                          Gateway
   │                                 │
   │─── FLOW (credit=10) ──────────►│  Consumer can handle 10 messages
   │                                 │
   │◄── TRANSFER ───────────────────│  credit: 9 remaining
   │◄── TRANSFER ───────────────────│  credit: 8 remaining
   │    ...                          │
   │◄── TRANSFER ───────────────────│  credit: 1 remaining
   │                                 │
   │─── FLOW (credit=10) ──────────►│  Consumer replenishes

3.6 Delivery Semantics

The gateway operates as a persistent proxy — it persists all messages and media locally while providing real-time fan-out to connected consumers.

| Aspect | Behavior |
| --- | --- |
| Durability | All messages are persisted to SQLite on arrival. All media is downloaded and stored locally on arrival. Consumers that reconnect after being offline can catch up via the history link. |
| Settlement | Pre-settled (fire-and-forget) for most real-time streams. Unsettled with acknowledgment for outbound sends (so the consumer knows when WhatsApp accepted the message). |
| Replay | Available via the history request/reply link, which queries the local SQLite database. Consumers track their own high-water mark (last seen message timestamp or ID) and request replay from that point on reconnection. |
| Ordering | Messages are delivered in the order received from WhatsApp, per link. Cross-session ordering is not guaranteed. |
| Media | Always available from local storage. The gateway downloads and decrypts media eagerly on arrival, eliminating dependency on WhatsApp’s CDN after initial receipt. See Section 9: Persistence & Media Architecture. |
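
The replay behavior in the table can be illustrated with a consumer-side high-water mark. The sketch below is an assumption-laden example: the module `Replay`, the message map shape, and the choice of a `{timestamp, id}` pair as the mark are all hypothetical, and the real history link would query SQLite rather than filter an in-memory list.

```elixir
defmodule Replay do
  @moduledoc "Hypothetical high-water-mark filter for catching up after a reconnect."

  @doc "Returns messages strictly after the mark, ordered by {timestamp, id}."
  def since(messages, {mark_ts, mark_id}) do
    messages
    |> Enum.filter(fn m -> {m.timestamp, m.id} > {mark_ts, mark_id} end)
    |> Enum.sort_by(fn m -> {m.timestamp, m.id} end)
  end

  @doc "Advances the mark to the last replayed message, or keeps it if nothing was replayed."
  def advance(mark, []), do: mark

  def advance(_mark, messages) do
    last = List.last(messages)
    {last.timestamp, last.id}
  end
end
```

Pairing the timestamp with the message ID is a design choice for this sketch: several messages can share one timestamp, and a timestamp-only mark would either skip or repeat them.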

4. Gateway Control Plane

4.1 The $gateway Session

Every consumer connection has access to a special $gateway session that provides control-plane functionality independent of any specific conversation.

4.1.1 Links

| Link Name | Direction | Purpose |
| --- | --- | --- |
| status | Gateway → Consumer | Connection state changes (connected, reconnecting, disconnected, logged-out) |
| events | Gateway → Consumer | Account-level events (new chat created, contact joined WhatsApp, chat archived/unarchived) |
| command | Bidirectional (request/reply) | Administrative commands |
| query | Bidirectional (request/reply) | Data queries |

4.1.2 Queries via the query Link

Queries use AMQP request/reply. The consumer sends a transfer with reply-to set to its receiving link address, and the gateway responds on that link.

| Query | Description |
| --- | --- |
| list-chats | Returns all chats with last message timestamp, unread count |
| list-contacts | Returns all contacts with JID, push name, profile picture URL |
| list-groups | Returns all groups with JID, subject, participants |
| get-profile | Returns profile info for a given JID |
| get-profile-picture | Returns profile picture for a given JID |
| search-messages | Full-text search across conversations (backed by SQLite FTS5) |
| get-status | Returns the current WhatsApp connection state |
| get-media | Retrieve a media file by SHA-256 hash from the local store |
| get-media-status | Returns availability status of a media file (available, expired-remote) |
| get-storage-stats | Returns media store size, message count, oldest/newest message timestamps |
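
A consumer issuing several queries at once needs to match each reply to its request. The sketch below shows one way to do that bookkeeping, following the common AMQP convention that a reply's correlation-id echoes the request's message-id. The module `QueryCorrelation`, the map keys, and the use of `subject` to carry the query name are assumptions made for illustration; the specification does not define how the query name is encoded.

```elixir
defmodule QueryCorrelation do
  @moduledoc "Hypothetical request/reply bookkeeping for the query link."

  @doc "Creates state for a consumer whose replies arrive at `reply_to`."
  def new(reply_to), do: %{reply_to: reply_to, next_id: 1, pending: %{}}

  @doc "Builds a request message and remembers who is waiting for its reply."
  def request(state, query, args, waiter) do
    id = "q-#{state.next_id}"

    message = %{
      message_id: id,
      reply_to: state.reply_to,
      subject: query,
      body: args
    }

    pending = Map.put(state.pending, id, waiter)
    {%{state | next_id: state.next_id + 1, pending: pending}, message}
  end

  @doc "Matches a reply to its waiter; unknown or duplicate replies yield :unknown."
  def reply(state, %{correlation_id: id, body: body}) do
    case Map.pop(state.pending, id) do
      {nil, _pending} -> {state, :unknown}
      {waiter, pending} -> {%{state | pending: pending}, {:deliver, waiter, body}}
    end
  end
end
```

Removing the entry from `pending` on the first reply means a duplicate reply is reported as `:unknown` instead of being delivered twice.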

4.1.3 Commands via the command Link

| Command | Description |
| --- | --- |
| create-group | Create a new group with specified participants |
| update-group-subject | Change group name |
| update-group-description | Change group description |
| add-participants | Add members to a group |
| remove-participants | Remove members from a group |
| leave-group | Leave a group |
| archive-chat | Archive a chat |
| mute-chat | Mute notifications for a chat |
| set-presence | Set own presence (available/unavailable) |
| update-profile-name | Update own profile display name |

4.2 The $presence Session

A dedicated session for presence and status updates across all contacts.

| Link Name | Direction | Purpose |
| --- | --- | --- |
| updates | Gateway → Consumer | Real-time presence changes (online, offline, last seen) |
| subscribe | Consumer → Gateway | Request presence tracking for specific JIDs |

5. Server-Initiated Session Promotion

One of the more powerful aspects of this architecture is the gateway’s ability to dynamically promote sessions to the consumer based on activity patterns.

5.1 Tracked Contact Activity

When a consumer establishes a session for a specific contact (e.g., wife@s.whatsapp.net), the gateway can optionally track that contact’s activity across all conversations. If the contact posts in a group that the consumer hasn’t subscribed to, the gateway initiates a new session for that group.

Consumer                                    Gateway
   │                                           │
   │  Session: wife@s.whatsapp.net             │
   │  (messages link attached)                 │
   │                                           │
   │         wife sends message in             │
   │         family-group@g.us                 │
   │                                           │
   │◄── BEGIN (properties: {                   │
   │       "jid": "family-group@g.us",         │
   │       "wa:promotion-reason": "tracked-contact", │
   │       "wa:triggered-by": "wife@s.whatsapp.net"  │
   │     })                                    │
   │                                           │
   │─── BEGIN (accept) ───────────────────────►│
   │                                           │
   │─── ATTACH (name="messages") ─────────────►│
   │                                           │

5.2 Consumer Configuration

Consumers control promotion behavior through connection properties or $gateway commands:

| Setting | Description |
| --- | --- |
| wa:auto-promote | none, tracked-contacts, all |
| wa:promote-filter | JID pattern filter for promotions |
| wa:max-sessions | Maximum number of sessions the gateway should maintain for this consumer |
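
How the three settings might combine into a single promotion decision is sketched below. Everything here is an assumption for illustration: the module `Promotion`, the shape of the settings and state maps, the glob-style syntax for `wa:promote-filter`, and the rule that a contact counts as tracked when the consumer holds a session for its JID.

```elixir
defmodule Promotion do
  @moduledoc "Hypothetical decision function for server-initiated session promotion."

  @doc "Returns true if the gateway should push a session for `chat_jid`."
  def promote?(settings, state, chat_jid, sender_jid) do
    under_limit?(settings, state) and
      not MapSet.member?(state.open_sessions, chat_jid) and
      mode_allows?(settings.auto_promote, state, sender_jid) and
      filter_allows?(settings[:promote_filter], chat_jid)
  end

  defp under_limit?(%{max_sessions: max}, state) when is_integer(max),
    do: MapSet.size(state.open_sessions) < max

  defp under_limit?(_settings, _state), do: true

  defp mode_allows?(:none, _state, _sender), do: false
  defp mode_allows?(:all, _state, _sender), do: true

  # Assumption: a contact is "tracked" when the consumer has a session for its JID.
  defp mode_allows?(:tracked_contacts, state, sender),
    do: MapSet.member?(state.open_sessions, sender)

  defp filter_allows?(nil, _jid), do: true

  defp filter_allows?(pattern, jid) do
    # Translate a glob such as "*@g.us" into an anchored regular expression.
    source =
      pattern
      |> Regex.escape()
      |> String.replace("\\*", ".*")

    Regex.match?(Regex.compile!("^" <> source <> "$"), jid)
  end
end
```

Checking the session limit first keeps the cheap test ahead of the regular-expression match.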

6. Message Types Reference

6.1 Outbound Message Types (Consumer → Gateway via send link)

| wa:message-type | Body Content | Notes |
| --- | --- | --- |
| text | UTF-8 string | May include mentions via application properties |
| image | Binary image data or URL reference | Requires content-type |
| video | Binary video data or URL reference | Requires content-type |
| audio | Binary audio data or URL reference | Set wa:ptt: true for voice note |
| document | Binary document data or URL reference | Set wa:filename |
| sticker | Binary WebP data | Must be 512x512 |
| location | JSON | { lat, lng, name?, address? } |
| contact | vCard string | content-type: text/vcard |
| reaction | UTF-8 emoji | Set wa:reaction-target to target message ID |
| reply | Any of the above | Set correlation-id to quoted message ID |
| edit | UTF-8 replacement text | Set wa:edit-target to original message ID |
| revoke | Empty | Set wa:revoke-target to message ID |

6.2 Receipt Types (Gateway → Consumer via receipts link)

| wa:receipt-type | Description |
| --- | --- |
| delivery | Message delivered to recipient’s device |
| read | Message read by recipient |
| played | Voice note or video played by recipient |
| read-self | You read a message (from another device) |

7. Security

7.1 Transport

The gateway listens on localhost only. AMQP connections are not exposed to the network. TLS is optional for localhost but may be enabled for defense-in-depth.

7.2 Authentication

Consumers authenticate using AMQP SASL. Supported mechanisms:

| Mechanism | Use Case |
| --- | --- |
| PLAIN | Simple shared secret per consumer (sufficient for localhost) |
| EXTERNAL | Unix socket peer credentials or mTLS client certificates |

7.3 Authorization

Each consumer is assigned a permission set that controls:

  • Which conversations it may open sessions for (JID allowlist/denylist)
  • Whether it may send messages or only receive
  • Whether it receives media payloads or only metadata
  • Whether it may issue administrative commands (group management, profile changes)
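
The permission dimensions listed above could be checked with a single function per action. The following is a sketch under assumptions: the module `Permissions`, its field names, and the action tuples are hypothetical, since the specification names only what is controlled, not how it is represented.

```elixir
defmodule Permissions do
  @moduledoc "Hypothetical per-consumer permission set."

  defstruct allow_jids: :all,
            deny_jids: [],
            can_send: false,
            receives_media: true,
            can_admin: false

  @doc "Checks one action against the permission set; returns :ok or {:error, reason}."
  def authorize(perms, {:open_session, jid}) do
    cond do
      jid in perms.deny_jids -> {:error, :jid_denied}
      perms.allow_jids == :all -> :ok
      jid in perms.allow_jids -> :ok
      true -> {:error, :jid_not_allowed}
    end
  end

  def authorize(perms, {:send, jid}) do
    # Sending requires access to the conversation as well as the send right.
    with :ok <- authorize(perms, {:open_session, jid}) do
      if perms.can_send, do: :ok, else: {:error, :read_only}
    end
  end

  def authorize(perms, {:command, _name}) do
    if perms.can_admin, do: :ok, else: {:error, :not_admin}
  end
end

# Usage: a read-only bot limited to one conversation.
bot = struct(Permissions, allow_jids: ["wife@s.whatsapp.net"])
IO.inspect(Permissions.authorize(bot, {:send, "wife@s.whatsapp.net"}))
```

In this sketch the denylist is evaluated before the allowlist, so an explicit deny always wins.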

8. Gateway Internal Architecture

8.1 Process Topology (OTP)

                    ┌──────────────────────┐
                    │   Application Sup     │
                    └──────────┬───────────┘
          ┌────────────┬───────┼───────┬────────────┐
          ▼            ▼       ▼       ▼            ▼
┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ WA.Connection│ │ AMQP     │ │ Chat     │ │ Store    │ │ Media    │
│ Supervisor   │ │ Listener │ │ Registry │ │ (Ecto/  │ │ Ingester │
│              │ │ Sup      │ │          │ │ SQLite) │ │          │
└──────┬───────┘ └────┬─────┘ └──────────┘ └──────────┘ └──────────┘
       │              │
┌──────┴──────┐ ┌─────┴───────┐
│ WA.Socket   │ │ Per-Consumer│
│ (GenServer) │ │ Connection  │
│             │ │ (GenServer) │
│ WA.Proto    │ └─────────────┘
│ (Noise/WS)  │
└─────────────┘

| Process | Responsibility |
| --- | --- |
| WA.Socket | Maintains the WebSocket connection to WhatsApp. Handles Noise protocol handshake, encryption, keepalives, reconnection. |
| WA.Proto | Encodes/decodes WhatsApp’s binary protocol. Translates between WhatsApp protobuf messages and internal Elixir structs. |
| Chat.Registry | Maps JIDs to chat processes. Uses Elixir’s Registry for efficient lookup. |
| Chat (per-JID) | Manages state for a single conversation. Tracks which consumers are subscribed, fans out incoming messages. |
| Store | Ecto repository backed by SQLite (via ecto_sqlite3). Persists messages, contacts, groups, chat metadata. Provides full-text search via FTS5. Writes are serialized so that only one writer is active at a time. |
| Media.Ingester | Downloads, decrypts, and stores media files on arrival. Writes content-addressed files to the local media store. Runs as a pool of workers to handle concurrent downloads without blocking message processing. |
| AMQP.Listener | Accepts incoming AMQP connections on localhost. Spawns per-consumer connection processes. |
| Consumer Connection | Manages one AMQP connection. Handles session/link lifecycle. Routes transfers between AMQP links and chat processes. |

8.2 Message Flow: Incoming

WhatsApp Servers
       │
       ▼
  WA.Socket (receives encrypted frame)
       │
       ▼
  WA.Proto (decodes protobuf → Elixir struct)
       │
       ├──► Store (persist message record to SQLite)
       │
       ├──► Media.Ingester (if media: download, decrypt, store by SHA-256)
       │
       ▼
  Chat.Registry (lookup by JID)
       │
       ▼
  Chat process (fans out to subscribed consumers)
       │
       ├──► Consumer A connection → AMQP TRANSFER on messages link
       ├──► Consumer B connection → AMQP TRANSFER on messages link
       └──► Consumer C connection → (no messages link attached, skipped)

Message persistence happens before fan-out to consumers, and media ingestion is enqueued at the same point. This ensures that no data is lost even if no consumers are connected.

Media ingestion itself is asynchronous: the message record is persisted immediately with media_status: :downloading, and the media binary is downloaded in the background. If a consumer requests the media before the download completes, the gateway can either block briefly or return the thumbnail with a downloading status.
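
The final fan-out step of the incoming flow can be sketched as a per-JID GenServer that tracks which links each consumer connection has attached. This is a minimal illustration, not the gateway's implementation: the module `ChatFanout`, the message tuple `{:transfer, jid, :messages, message}`, and the use of plain `send/2` in place of a real AMQP TRANSFER are all assumptions.

```elixir
defmodule ChatFanout do
  @moduledoc "Hypothetical per-JID chat process that fans out incoming messages."
  use GenServer

  def start_link(jid), do: GenServer.start_link(__MODULE__, jid)

  @doc "Records that the consumer connection `pid` attached `link` (for example :messages)."
  def subscribe(chat, pid, link), do: GenServer.call(chat, {:subscribe, pid, link})

  @doc "Delivers an already persisted message to every consumer with a messages link."
  def incoming(chat, message), do: GenServer.cast(chat, {:incoming, message})

  @impl true
  def init(jid), do: {:ok, %{jid: jid, subscribers: %{}}}

  @impl true
  def handle_call({:subscribe, pid, link}, _from, state) do
    # Monitor each consumer once so its subscriptions are dropped when it exits.
    if not Map.has_key?(state.subscribers, pid), do: Process.monitor(pid)
    links = state.subscribers |> Map.get(pid, MapSet.new()) |> MapSet.put(link)
    {:reply, :ok, %{state | subscribers: Map.put(state.subscribers, pid, links)}}
  end

  @impl true
  def handle_cast({:incoming, message}, state) do
    # Consumers without a messages link are skipped, as in the diagram.
    for {pid, links} <- state.subscribers, MapSet.member?(links, :messages) do
      send(pid, {:transfer, state.jid, :messages, message})
    end

    {:noreply, state}
  end

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
    {:noreply, %{state | subscribers: Map.delete(state.subscribers, pid)}}
  end
end
```

Monitoring the consumer process lets the chat process forget a crashed connection without an explicit detach.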

8.3 Message Flow: Outgoing

Consumer A
       │
       ▼
  AMQP TRANSFER on send link
       │
       ▼
  Consumer Connection (validates, maps to internal struct)
       │
       ▼
  Chat process (optional: dedup, rate limit)
       │
       ▼
  WA.Proto (encodes to protobuf)
       │
       ▼
  WA.Socket (encrypts, sends)
       │
       ▼
  WhatsApp Servers
       │
       ▼
  (receipt arrives back via incoming flow)
       │
       ▼
  Consumer Connection → AMQP DISPOSITION (settled)

9. Persistence & Media Architecture

9.1 Design Rationale

The gateway maintains a local SQLite database and content-addressed media store, making it the canonical archive for the WhatsApp account. This design is driven by a key constraint of WhatsApp’s infrastructure: media files stored on WhatsApp’s CDN expire after approximately 30 days. After expiry, the CDN returns a 404, and the only recovery path is a re-upload request to another device that still has the file locally — a mechanism that is fragile and depends on the phone not having cleared its storage.

By downloading and storing all media eagerly on arrival, the gateway eliminates this expiry problem entirely. Once the gateway is running, it becomes the most complete media archive of the account — more complete than the phone (which auto-purges to free storage) or any other linked device (which only downloads media on demand when the user views it).

9.2 SQLite Database

The gateway uses SQLite (via Ecto with ecto_sqlite3) with WAL mode enabled for concurrent read access. The single-writer constraint is naturally satisfied by the gateway’s architecture: messages arrive sequentially from the WhatsApp WebSocket connection.

9.2.1 Schema Overview

| Table | Purpose | Key Columns |
| --- | --- | --- |
| messages | All messages across all conversations | id (WA message ID), chat_jid, sender_jid, timestamp, type, body_text, media_sha256, media_status, protobuf_raw |
| contacts | Contact directory | jid, push_name, notify_name, profile_picture_url, is_business |
| groups | Group metadata | jid, subject, description, created_at, creator_jid |
| group_participants | Group membership | group_jid, participant_jid, role (admin, superadmin, member) |
| chats | Per-conversation state | jid, last_message_timestamp, unread_count, is_archived, is_muted, mute_until |
| media_files | Media file registry | sha256 (primary key), mime_type, file_size, local_path, original_cdn_url, media_key, downloaded_at |
| messages_fts | Full-text search (FTS5 virtual table) | body_text, linked to messages.rowid |

9.2.2 Key Design Decisions

  • protobuf_raw column: Each message stores the original WhatsApp protobuf binary. This ensures no information is lost during the translation to the relational schema, and allows re-parsing if the schema evolves.
  • FTS5 for search: The search-messages query on the $gateway session is backed by SQLite’s FTS5 full-text search engine, providing fast search across all conversations.
  • WAL mode: Enables concurrent reads from multiple consumer connection processes while the single writer (message ingestion) is active.
  • No ORM overhead for writes: Hot-path message inserts use raw Exqlite for minimal latency. Ecto is used for complex reads and queries.

9.3 Content-Addressed Media Store

Media files are stored on the local filesystem, addressed by the SHA-256 hash of the decrypted content.

/var/lib/wa-gateway/media/
├── ab/
│   └── ab3f7c8e9d...  (first 2 chars as directory prefix)
├── cd/
│   └── cd91a2b4e7...
└── ...

9.3.1 Why Content-Addressed

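  • Natural deduplication. The same image forwarded across 10 group chats is stored exactly once. The SHA-256 hash is already present in WhatsApp’s message protobuf (fileHash / fileSha256), so the content address is known before any download; the gateway computes the hash itself only to verify the decrypted file (step 5 of the ingestion pipeline).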
  • Immutable. Once written, a media file is never modified. This simplifies caching, backup, and integrity verification.
  • Simple lookup. Given a message’s media_sha256, the file path is deterministic. No database join is required to locate the file.
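
The deterministic lookup can be sketched as a pure function from digest to path. The module `MediaPath` is hypothetical, and the use of lowercase hex for the digest is an assumption based on the directory listing above; only `:crypto.hash/2`, `Base.encode16/2`, and `Path.join/1` come from the standard libraries.

```elixir
defmodule MediaPath do
  @moduledoc "Hypothetical helper mapping a SHA-256 digest to its path in the media store."

  @root "/var/lib/wa-gateway/media"

  @doc "Builds the store path for a 64-character hex digest."
  def for_hex(<<prefix::binary-size(2), _rest::binary-size(62)>> = hex, root \\ @root) do
    # The first two characters become the directory prefix.
    Path.join([root, prefix, hex])
  end

  @doc "Computes the content address (lowercase hex SHA-256) of a decrypted binary."
  def address(binary) when is_binary(binary) do
    :crypto.hash(:sha256, binary) |> Base.encode16(case: :lower)
  end
end

# Usage: the path depends only on the content.
"hello" |> MediaPath.address() |> MediaPath.for_hex() |> IO.puts()
```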

9.3.2 Media Ingestion Pipeline

When the gateway receives a media message from WhatsApp:

  1. Persist message record to SQLite with media_status: :downloading and the CDN reference (directPath, mediaKey, fileSha256).
  2. Check local store — if a file with this SHA-256 already exists (dedup hit), set media_status: :available and skip download.
  3. Download the encrypted .enc file from WhatsApp’s CDN (mmg.whatsapp.net + directPath).
  4. Decrypt using AES-256-CBC with the mediaKey (key derivation via HKDF-SHA256, per WhatsApp’s encryption spec).
  5. Verify the SHA-256 of the decrypted content matches fileSha256.
  6. Write to the content-addressed store at /var/lib/wa-gateway/media/{sha256[0:2]}/{sha256}.
  7. Update the message record to media_status: :available and populate local_path in the media_files table.

If the download or decryption fails (CDN error, network issue), the message is marked media_status: :failed with a retry timestamp. A background process periodically retries failed downloads.
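
Steps 2 and 5 to 7 of the pipeline can be sketched as follows. Download and decryption are deliberately left out: they are injected as a function (`fetch_decrypted`, a hypothetical name) because their details depend on WhatsApp's protocol. The module `MediaIngest`, the return tuples, and the write-then-rename approach are assumptions of this sketch, not the gateway's confirmed behavior.

```elixir
defmodule MediaIngest do
  @moduledoc "Hypothetical sketch of dedup check, verification, and content-addressed write."

  @doc "Returns {:available, path} or {:failed, reason} for one media reference."
  def ingest(root, expected_hex, fetch_decrypted) when is_function(fetch_decrypted, 0) do
    path = Path.join([root, String.slice(expected_hex, 0, 2), expected_hex])

    if File.exists?(path) do
      # Step 2: dedup hit, so nothing is downloaded.
      {:available, path}
    else
      with {:ok, plaintext} <- fetch_decrypted.(),
           :ok <- verify(plaintext, expected_hex),
           :ok <- File.mkdir_p(Path.dirname(path)),
           # Write under a temporary name, then rename, so readers never see a partial file.
           tmp = path <> ".tmp",
           :ok <- File.write(tmp, plaintext),
           :ok <- File.rename(tmp, path) do
        {:available, path}
      else
        {:error, reason} -> {:failed, reason}
      end
    end
  end

  # Step 5: the decrypted bytes must hash to the digest carried in the message.
  defp verify(plaintext, expected_hex) do
    actual = :crypto.hash(:sha256, plaintext) |> Base.encode16(case: :lower)
    if actual == expected_hex, do: :ok, else: {:error, :sha256_mismatch}
  end
end
```

A caller would map `{:failed, reason}` to media_status: :failed and schedule the retry described above.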

9.3.3 WhatsApp CDN URL Structure

WhatsApp media CDN URLs come in two formats:

Legacy clientUrl:

https://mmg-fna.whatsapp.net/d/f/<opaque-base64-token>.enc

Current directPath:

/v/t62.7118-24/<numeric-id>_<numeric-id>_<numeric-id>_n.enc?ccb=...&oh=<auth-hash>&oe=<expiry-hex>&_nc_hot=<timestamp>

The oe parameter is a hex-encoded expiration timestamp. The oh parameter is an authentication hash. Neither format contains any device-identifying information — URLs are opaque content-addressed blob references. This is a consequence of WhatsApp’s end-to-end encryption design: the CDN serves encrypted blobs that WhatsApp’s servers cannot decrypt, so there is no reason to associate device identity with the URL.

CDN expiry: Media files are retained on WhatsApp’s CDN for approximately 30 days. After this period, the URL returns a 404. This is the primary motivation for the gateway’s eager download-on-arrival strategy.
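
Reading the oe parameter from a directPath might look like the sketch below. It assumes that oe is a hex-encoded Unix timestamp in seconds, which the description above implies but does not state; the module `CdnUrl` and the sample path in the usage line are hypothetical.

```elixir
defmodule CdnUrl do
  @moduledoc "Hypothetical helper that reads the `oe` expiry from a directPath."

  @doc "Returns {:ok, DateTime} for the expiry, or :error if `oe` is absent or malformed."
  def expiry(direct_path) do
    with %URI{query: query} when is_binary(query) <- URI.parse(direct_path),
         %{"oe" => hex} <- URI.decode_query(query),
         {seconds, ""} <- Integer.parse(hex, 16),
         {:ok, datetime} <- DateTime.from_unix(seconds) do
      {:ok, datetime}
    else
      _ -> :error
    end
  end

  @doc "True when the expiry is known and has passed; an unknown expiry is treated as not expired."
  def expired?(direct_path, now \\ DateTime.utc_now()) do
    case expiry(direct_path) do
      {:ok, at} -> DateTime.compare(now, at) != :lt
      :error -> false
    end
  end
end

# Usage with a made-up path; only the oe parameter matters here.
IO.inspect(CdnUrl.expiry("/v/t62.7118-24/1_2_3_n.enc?ccb=11-4&oh=abc&oe=65F00000"))
```

Even with a parsed expiry, an actual 404 from the CDN remains the authoritative signal that the media is gone.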

9.3.4 Media Delivery to Consumers

When a consumer receives a message via the messages link or requests media via the history link:

| media_mode | Behavior |
| --- | --- |
| :inline | The full decrypted binary is included in the AMQP message body. Simple for consumers, but high bandwidth for large files. |
| :reference | The message body contains only text/metadata. Media is referenced by wa:media-sha256 in application properties. The consumer retrieves the binary separately via the get-media query on the $gateway session. |
| :none | Only the thumbnail (from the protobuf) is included. The consumer receives metadata but no media payload. |

The media_mode is configurable per consumer via connection properties.

9.4 Gateway as Canonical Media Node

Over time, the gateway becomes the most reliable media archive in the WhatsApp device mesh:

| Device | Media Completeness | Reason |
| --- | --- | --- |
| Phone | Partial | Auto-purges old media to free storage. User may manually delete files. |
| WhatsApp Web | Ephemeral | Downloads on demand for viewing. Browser cache is cleared regularly. |
| Other linked devices | Partial | Only download media the user actively views. |
| Gateway | Complete (from first boot) | Downloads every media file eagerly on arrival. Never purges unless explicitly configured to. |

The gateway can also participate in WhatsApp’s re-upload protocol. When another device (including the phone) requests a re-upload of expired media, the gateway — if it has the file locally — can re-encrypt the media with a fresh mediaKey, upload it to WhatsApp’s CDN, and return the new URL and key. This effectively makes the gateway a media server for the user’s other devices. Implementation of the re-upload responder is optional but valuable for account resilience.

9.5 First Boot & Historical Backfill

When the gateway connects as a linked device for the first time, WhatsApp syncs message history — the protobuf message records including metadata, text, thumbnails, and media references (CDN URLs, media keys, file hashes). However, the actual media binaries are not pushed during history sync.

9.5.1 Backfill Strategy

  1. Persist all synced message records to SQLite immediately, including media references.
  2. Attempt eager download of all referenced media during the initial sync window. Media younger than ~30 days will likely still be available on the CDN.
  3. Mark expired media as media_status: :expired_remote for any CDN URL that returns a 404. The thumbnail (which is inline in the protobuf) is still available.
  4. On-demand re-upload (optional): If a consumer later requests expired media via the history link, the gateway can attempt a re-upload request to the phone. This is fragile — it requires the phone to still have the file locally — but provides a last-resort recovery path.

After the initial backfill, CDN expiry no longer affects new messages: every subsequent message is downloaded within seconds of arrival, well before the ~30-day CDN window closes.
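
The classification in step 3 can be sketched as a small pure function. This is a minimal illustration, not the gateway's implementation: only :expired_remote appears in this specification, while the module name, the :available and :retry_pending statuses, and the `{:ok, http_status}` result shape are assumptions made for the example.

```elixir
defmodule BackfillSketch do
  @moduledoc """
  Hypothetical sketch: maps the outcome of one CDN download attempt to a
  media_status value. Only :expired_remote comes from the specification;
  :available and :retry_pending are illustrative names.
  """

  # A successful download: the file can be written to the media store.
  def classify({:ok, 200}), do: :available

  # A 404 means the CDN no longer has the file; the inline thumbnail remains.
  def classify({:ok, 404}), do: :expired_remote

  # Anything else (5xx responses, timeouts, ...) is treated as transient here.
  def classify(_other), do: :retry_pending

  # Tallies the statuses for a batch of download results.
  def summarize(results) do
    results
    |> Enum.map(&classify/1)
    |> Enum.frequencies()
  end
end
```

Treating every non-404 failure as transient keeps expired media distinguishable from media that simply has not been fetched yet, which matters when deciding whether a re-upload request to the phone is worth attempting.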

9.5.2 Backfill Status Tracking

The gateway exposes backfill progress on the $gateway/status link:

| Event | Description |
| --- | --- |
| backfill:started | History sync has begun. Includes estimated message count. |
| backfill:progress | Periodic progress update. Includes messages processed, media downloaded, media expired. |
| backfill:complete | History sync finished. Summary of total messages, media available, media expired. |
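
As an illustration of how the counters behind these events might be tracked, the sketch below accumulates progress in a struct and renders it as an event payload. The module name, struct fields, and payload keys are hypothetical; the specification fixes only the event names and the kinds of counts they carry.

```elixir
defmodule BackfillProgress do
  # Hypothetical counters; field names are assumptions for this sketch.
  defstruct messages_processed: 0, media_downloaded: 0, media_expired: 0

  # Each call records one observation made during history sync.
  def record(%__MODULE__{} = p, :message),
    do: %{p | messages_processed: p.messages_processed + 1}

  def record(%__MODULE__{} = p, :media_downloaded),
    do: %{p | media_downloaded: p.media_downloaded + 1}

  def record(%__MODULE__{} = p, :media_expired),
    do: %{p | media_expired: p.media_expired + 1}

  # Renders the current counters as a payload for the $gateway/status link.
  def event(%__MODULE__{} = p, name)
      when name in ["backfill:progress", "backfill:complete"] do
    %{
      "event" => name,
      "messages_processed" => p.messages_processed,
      "media_downloaded" => p.media_downloaded,
      "media_expired" => p.media_expired
    }
  end
end
```

A consumer could then compare `media_downloaded` and `media_expired` in the final backfill:complete payload to judge how much historical media was recoverable.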

9.6 Storage Management

9.6.1 Backup

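SQLite database backup is a single file copy while the gateway is stopped, or VACUUM INTO for a consistent snapshot of a live database (in WAL mode, copying only the main database file of a running gateway can miss recent writes that still sit in the -wal file). The content-addressed media store can be backed up with any file-level tool (rsync, restic, etc.). Together, these form a complete, portable archive of the WhatsApp account.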
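
A snapshot via VACUUM INTO could be scripted roughly as follows. This is a sketch under assumptions: the module and function names are hypothetical, and it shells out to the sqlite3 command-line tool, which must be installed (System.cmd raises if the executable is missing). SQLite expects the destination file not to exist yet.

```elixir
defmodule BackupSketch do
  # Builds the SQL statement. Single quotes in the path are doubled,
  # following SQL string-literal escaping rules.
  def vacuum_into_sql(dest_path) do
    "VACUUM INTO '" <> String.replace(dest_path, "'", "''") <> "';"
  end

  # Runs the statement through the sqlite3 CLI (assumed to be on PATH).
  # Returns :ok on exit status 0, otherwise {:error, {status, output}}.
  def snapshot(db_path, dest_path) do
    case System.cmd("sqlite3", [db_path, vacuum_into_sql(dest_path)], stderr_to_stdout: true) do
      {_out, 0} -> :ok
      {out, status} -> {:error, {status, out}}
    end
  end
end
```

For example, `BackupSketch.snapshot("/var/lib/wa-gateway/gateway.db", "/backup/gateway-snapshot.db")` would write a compacted copy that can be archived alongside the media directory; the backup path here is illustrative.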

9.6.2 Retention Policy

By default, the gateway retains everything indefinitely. Optional retention policies can be configured:

| Policy | Description |
| --- | --- |
| media_max_age | Delete local media files older than N days. Message records and thumbnails are preserved. |
| media_max_size | Cap total media store size. Oldest files are evicted first (LRU). |
| message_max_age | Delete message records older than N days. |

Retention policies are applied by a background process and never affect real-time ingestion.
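
The size cap can be illustrated with a small selection function that picks which files to evict. This is a sketch only: the module name and the file map shape (`path`, `size`, `stored_at`) are assumptions, and it evicts by storage time, oldest first, as the policy description states. A strict LRU policy would sort by last access time instead.

```elixir
defmodule RetentionSketch do
  # files: list of maps like %{path: "...", size: bytes, stored_at: unix_seconds}
  # max_bytes: nil means "keep forever", so nothing is evicted.
  def evict_for_size(_files, nil), do: []

  def evict_for_size(files, max_bytes) when is_integer(max_bytes) do
    total = files |> Enum.map(& &1.size) |> Enum.sum()

    files
    # Oldest files come first in the eviction order.
    |> Enum.sort_by(& &1.stored_at)
    |> Enum.reduce_while({total, []}, fn file, {remaining, evicted} ->
      if remaining <= max_bytes do
        # The store already fits under the cap; stop evicting.
        {:halt, {remaining, evicted}}
      else
        {:cont, {remaining - file.size, [file | evicted]}}
      end
    end)
    |> elem(1)
    |> Enum.reverse()
  end
end
```

Returning the list of candidates rather than deleting inside the function keeps the decision testable and lets the background process delete files at its own pace, separate from real-time ingestion.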


10. Configuration

10.1 Gateway Configuration

config :wa_gateway,
  # AMQP server
  amqp_host: "127.0.0.1",
  amqp_port: 5672,
  amqp_tls: false,

  # WhatsApp connection
  wa_auth_state_path: "/var/lib/wa-gateway/auth",
  wa_reconnect_interval: 5_000,
  wa_keepalive_interval: 30_000,

  # Persistence
  database_path: "/var/lib/wa-gateway/gateway.db",
  database_wal_mode: true,

  # Media store
  media_store_path: "/var/lib/wa-gateway/media",
  media_download_on_arrival: true,
  media_download_pool_size: 4,
  media_retry_interval: 60_000,
  media_retry_max_attempts: 5,

  # Retention (nil = keep forever)
  media_max_age_days: nil,
  media_max_size_bytes: nil,
  message_max_age_days: nil,

  # First boot backfill
  backfill_media_download: true,
  backfill_media_concurrency: 8,

  # Re-upload responder (serve media to other devices)
  reupload_responder_enabled: false,

  # Consumer defaults
  default_media_mode: :reference,  # :inline | :reference | :none
  default_max_sessions: 50,
  default_auto_promote: :none
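
A startup check over these settings might look like the sketch below. The module is hypothetical and covers only a few keys: it accepts the three media modes listed in the comment above, and treats each retention setting as either nil or a positive integer.

```elixir
defmodule ConfigCheck do
  @media_modes [:inline, :reference, :none]

  # Takes the keyword list of gateway settings.
  # Returns :ok, or {:error, keys} listing the settings that failed validation.
  def validate(opts) when is_list(opts) do
    errors =
      []
      |> check(opts[:default_media_mode] in @media_modes, :default_media_mode)
      |> check(nil_or_pos_int?(opts[:media_max_age_days]), :media_max_age_days)
      |> check(nil_or_pos_int?(opts[:media_max_size_bytes]), :media_max_size_bytes)
      |> check(nil_or_pos_int?(opts[:message_max_age_days]), :message_max_age_days)

    if errors == [], do: :ok, else: {:error, Enum.reverse(errors)}
  end

  defp check(errors, true, _key), do: errors
  defp check(errors, false, key), do: [key | errors]

  # nil means "keep forever"; otherwise a positive integer is required.
  defp nil_or_pos_int?(nil), do: true
  defp nil_or_pos_int?(n), do: is_integer(n) and n > 0
end
```

At boot, the gateway could call `ConfigCheck.validate(Application.get_all_env(:wa_gateway))` and refuse to start on an error tuple, so a mistyped media mode surfaces immediately rather than at the first consumer connection.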

11. Error Handling

11.1 WhatsApp Connection Errors

| Scenario | Gateway Behavior | Consumer Impact |
| --- | --- | --- |
| Temporary disconnect | Automatic reconnection with exponential backoff. Messages arriving during reconnection are not lost; they will be delivered by WhatsApp on reconnection and persisted normally. | $gateway/status link sends reconnecting. Real-time fan-out pauses; consumers can query history for any messages they missed. |
| Session expired | Re-authentication required (QR code scan). | $gateway/status link sends logged-out. All sessions remain open but message delivery pauses. Existing persisted data remains available for queries. |
| Rate limited | Gateway backs off per WhatsApp’s signals. | Outbound send links receive AMQP rejected disposition with wa:rate-limited error. |
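
The reconnection delay for a temporary disconnect could be computed as in the sketch below. The doubling schedule and the 5-minute cap are assumptions for illustration; only the 5_000 ms base (wa_reconnect_interval) comes from the configuration in section 10.1. Jitter is omitted to keep the example deterministic.

```elixir
defmodule ReconnectBackoff do
  # attempt: zero-based count of consecutive failed reconnects.
  # base_ms: first delay (defaults to the wa_reconnect_interval value).
  # max_ms: upper bound on the delay (assumed cap, not from the spec).
  def delay_ms(attempt, base_ms \\ 5_000, max_ms \\ 300_000)
      when is_integer(attempt) and attempt >= 0 do
    min(base_ms * Integer.pow(2, attempt), max_ms)
  end
end
```

With these defaults the delays run 5 s, 10 s, 20 s, 40 s and so on until they reach the cap. A production version would typically add random jitter so that several restarted gateways do not reconnect in lockstep.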

11.2 AMQP Errors

| Scenario | Behavior |
| --- | --- |
| Consumer disconnects | All sessions and links torn down. Chat processes remove consumer from fan-out. |
| Consumer exceeds credit | Not possible: credit-based flow prevents this by design. |
| Invalid JID on session open | Session BEGIN rejected with amqp:not-found error. |
| Unauthorized conversation | Session BEGIN rejected with amqp:unauthorized-access error. |
| Malformed message on send | Transfer rejected with amqp:decode-error. |

12. Future Considerations

  • End-to-end encryption relay. Currently, the gateway decrypts WhatsApp messages before re-transmitting over AMQP. A future mode could relay encrypted payloads with key material, allowing consumers to decrypt independently.
  • Multi-account support. Running multiple WhatsApp sessions on a single gateway, each exposed as a separate AMQP virtual host or connection property namespace. Each account would have its own SQLite database and media store.
  • WebSocket bridge. A thin WebSocket-to-AMQP bridge for consumers that cannot run an AMQP client (browser-based dashboards, etc.). The Phoenix LiveView web client could use this bridge, or connect via a native Elixir AMQP client within the same BEAM node.
  • Metrics and observability. Expose Prometheus metrics for connection health, message throughput per chat, consumer lag, flow control backpressure events, media store size, and SQLite performance.
  • Encrypted-at-rest media store. Encrypt the local media store with a user-provided key, so that the on-disk files are not readable without the gateway running. The SQLite database could similarly be encrypted using SQLCipher.
  • Re-upload responder. Full implementation of WhatsApp’s re-upload protocol, allowing the gateway to serve expired media back to the phone and other linked devices on demand. This makes the gateway a resilience node for the entire device mesh.