How it works

From payload to delivery report in under 500ms

A deep, honest look at every layer of the platform. No "magic" — just solid engineering, production-proven components and design decisions we can explain.

Architecture

End-to-end non-blocking pipeline

Every layer is designed to sustain billions of messages per month, with a delivery guarantee, exponential-backoff retries through a DLQ, Redis-backed distributed rate limiting and full per-message observability.

Your Backend

App / CRM / ERP

API Gateway

REST · Webhooks · SDKs

RabbitMQ

Prioritized queues

Parallel Workers

Non-blocking pipeline

Meta Cloud API

WhatsApp Business

Webhooks

Callbacks · Status

Throughput

Fully non-blocking pipeline. Queues prioritized by tenant and message type.

Resilience

Exponential retry, DLQ, periodic reconciliation with the Meta Cloud API.

Observability

Every message traceable end-to-end — logs, metrics, tracing and alerts.

Message flow

8 steps from backend to WhatsApp

Step 01

Ingestion

POST /v1/messages hits the edge

The gateway validates the schema and the token signature, deduplicates by idempotency_key and routes the message to the right queue. The call is fully non-blocking: we return 202 in under 40ms with a persisted message_id.

API Gateway · JWT + scopes · Redis (idempotency) · OpenAPI 3.1 validation
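
As a minimal sketch of the dedupe-then-acknowledge step, the Python below uses an in-memory dictionary in place of Redis. The `IdempotencyStore` class, the `ingest` function, the `to` field and the `duplicate` flag are illustrative assumptions, not the platform's actual API; only `idempotency_key`, `message_id` and the 202 status come from the description above.

```python
from __future__ import annotations

import time
import uuid
from dataclasses import dataclass, field


@dataclass
class IdempotencyStore:
    """In-memory stand-in for Redis: idempotency_key -> (message_id, expiry)."""

    ttl_seconds: float = 7 * 24 * 3600  # 7-day window, as listed in the guarantees
    _entries: dict = field(default_factory=dict)

    def claim(self, key: str, now: float | None = None) -> tuple[str, bool]:
        """Return (message_id, is_new); a repeated key yields the original id."""
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0], False
        message_id = str(uuid.uuid4())
        self._entries[key] = (message_id, now + self.ttl_seconds)
        return message_id, True


def ingest(store: IdempotencyStore, payload: dict) -> tuple[int, dict]:
    """Hypothetical handler for POST /v1/messages: validate, dedupe, acknowledge."""
    key = payload.get("idempotency_key")
    if not key or "to" not in payload:  # "to" is an assumed required field
        return 400, {"error": "idempotency_key and to are required"}
    message_id, is_new = store.claim(key)
    # A real gateway would enqueue the message here, and only when is_new is True.
    return 202, {"message_id": message_id, "duplicate": not is_new}
```

Sending the same payload twice returns the same `message_id` both times, so a client can retry a timed-out request without risking a second send.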

Step 02

Queueing

RabbitMQ prioritized per tenant + type

Messages go into separate queues by priority (critical, high, default, low) and by tenant, so a heavy marketing batch never delays an OTP. A distributed rate limit applies per template and per number.

RabbitMQ with lazy queues · Prioritized quorum queues · Redis-backed rate-limit
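
The effect of separate queues per tenant and priority can be sketched without a broker. The example below keeps one in-memory deque per (tenant, priority) pair and always serves the highest priority first. `TenantQueues` and its methods are hypothetical names, and the strict-priority consumer is an assumption; it does not model RabbitMQ itself or fairness between tenants.

```python
from collections import deque

# Priority names come from the text; their order here is highest first.
PRIORITIES = ("critical", "high", "default", "low")


class TenantQueues:
    """One FIFO queue per (tenant, priority), drained in strict priority order."""

    def __init__(self):
        self._queues = {}  # (tenant, priority) -> deque of message bodies

    def publish(self, tenant, priority, body):
        if priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {priority}")
        self._queues.setdefault((tenant, priority), deque()).append(body)

    def next_message(self):
        """Return (tenant, priority, body) for the most urgent message, or None."""
        for priority in PRIORITIES:
            for (tenant, queue_priority), queue in self._queues.items():
                if queue_priority == priority and queue:
                    return tenant, priority, queue.popleft()
        return None


queues = TenantQueues()
for i in range(3):
    queues.publish("shop-a", "low", f"promo-{i}")
queues.publish("bank-b", "critical", "otp-123456")
first = queues.next_message()  # the OTP, even though it was published last
```

Because the OTP sits in its own queue, the three marketing messages published earlier never stand in front of it.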

Step 03

Processing

Parallel workers consuming in a loop

Each worker grabs a batch, hydrates template variables, applies segmentation rules and dispatches to the Meta Cloud API. Auto-scaling pool with backpressure — we don't choke the downstream API even at peak.

Node.js workers · Adaptive backpressure · Meta connection pool
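
One simple way to picture backpressure is a cap on in-flight calls to the downstream API. The sketch below uses an `asyncio.Semaphore` with a fixed limit; the platform describes adaptive backpressure on Node.js workers, so this is a simplified stand-in in Python rather than the real mechanism. `dispatch_all`, `send` and `max_in_flight` are hypothetical names.

```python
import asyncio


async def dispatch_all(messages, send, max_in_flight=4):
    """Send every message while keeping at most max_in_flight calls in flight."""
    gate = asyncio.Semaphore(max_in_flight)

    async def worker(message):
        # Workers wait here when the downstream API is already saturated.
        async with gate:
            return await send(message)

    # gather returns results in the same order as the input messages.
    return await asyncio.gather(*(worker(m) for m in messages))
```

A caller passes any coroutine function as `send`, for example a stub that sleeps for a few milliseconds, and runs the batch with `asyncio.run(dispatch_all(batch, send, max_in_flight=2))`.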

Step 04

Meta Cloud API

Delivery to WhatsApp Business

Meta returns an id and then signals status (queued → sent → delivered → read). On errors, we classify each failure as retryable or terminal and apply the matching policy.

Meta Graph API v19+ · Aggressive timeout (2s) · Per-number circuit breaker
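
The retryable/terminal split and the per-number circuit breaker could look roughly like the sketch below. The set of retryable HTTP status codes, the failure threshold and the cooldown are all assumptions chosen for illustration, and `classify` and `CircuitBreaker` are hypothetical names; the page does not document the actual rules.

```python
import time
from collections import defaultdict

# Assumption: a common convention, not Meta's documented error taxonomy.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def classify(status_code):
    """Map an HTTP status to 'ok', 'retryable' or 'terminal'."""
    if 200 <= status_code < 300:
        return "ok"
    return "retryable" if status_code in RETRYABLE_STATUS else "terminal"


class CircuitBreaker:
    """Opens after `threshold` consecutive retryable failures, then cools down."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allow(self):
        """True when a call may go out (closed, or half-open after the cooldown)."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown

    def record(self, outcome):
        if outcome == "retryable":
            self._failures += 1
            if self._failures >= self.threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed probe
        else:
            # 'ok' and 'terminal' both show the API is reachable, so the breaker closes.
            self._failures = 0
            self._opened_at = None


# One breaker per sending number, created on first use.
breakers = defaultdict(CircuitBreaker)
```

Keeping one breaker per number means a single misbehaving sender can be paused without slowing down the others.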

Step 05

Retry + DLQ

Failure isn't the end — it starts reconciliation

Retryable errors go to a dead-letter queue with exponential backoff (2s → 4s → 8s → 16s → 32s → 1min → 5min → 30min). After 7 attempts, the message enters continuous reconciliation with Meta every 6h for 7 days.

Dead-letter queue · Exponential backoff · Reconciliation cron
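
The retry schedule above can be expressed as a small lookup. The delays are copied from the text; how the listed delays map onto the attempt limit is not spelled out, so this sketch simply walks the list and hands the message to reconciliation once it is exhausted. That hand-off rule and the names `next_action` and `reconciliation_checks` are assumptions.

```python
# Delays copied from the schedule in the text, converted to seconds.
BACKOFF_SECONDS = [2, 4, 8, 16, 32, 60, 300, 1800]
RECONCILE_EVERY_SECONDS = 6 * 3600      # every 6h
RECONCILE_FOR_SECONDS = 7 * 24 * 3600   # for 7 days


def next_action(failed_attempts):
    """Decide what follows the Nth failed attempt (1-based).

    Returns ('retry', delay_seconds) while delays remain in the schedule,
    otherwise ('reconcile', interval_seconds).
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be >= 1")
    if failed_attempts <= len(BACKOFF_SECONDS):
        return "retry", BACKOFF_SECONDS[failed_attempts - 1]
    return "reconcile", RECONCILE_EVERY_SECONDS


def reconciliation_checks():
    """Number of reconciliation passes that fit in the 7-day window."""
    return RECONCILE_FOR_SECONDS // RECONCILE_EVERY_SECONDS
```

With a 6-hour interval over 7 days, a message that exhausts its retries is checked against Meta 28 more times before it is given up on.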

Step 06

Persistence

MongoDB as the source of truth

Every state transition is persisted with timestamp, Meta metadata, billable cost and tracing id. Compound indexes on tenant + status + created_at enable millisecond queries across billions of records.

Sharded MongoDB · TTL for logs · Compound indexes
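
To make the append-only transition log concrete, here is a small in-memory sketch. It is not MongoDB code: `Transition`, `append_transition` and `find` are hypothetical, the index is shown only as the list of (field, direction) pairs that MongoDB uses to describe a compound index, and the descending order on `created_at` is an assumption. Failure statuses are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone

STATUS_ORDER = ("queued", "sent", "delivered", "read")  # sequence from Step 04

# Assumed index shape: tenant + status + created_at, newest first.
COMPOUND_INDEX = [("tenant_id", 1), ("status", 1), ("created_at", -1)]


@dataclass(frozen=True)
class Transition:
    tenant_id: str
    message_id: str
    status: str
    created_at: datetime
    trace_id: str
    cost: float = 0.0  # billable cost attached to this transition


def append_transition(log, tenant_id, message_id, status, trace_id, cost=0.0, now=None):
    """Append a new record; earlier records are never modified."""
    last = next((t for t in reversed(log) if t.message_id == message_id), None)
    if last is not None and STATUS_ORDER.index(status) <= STATUS_ORDER.index(last.status):
        raise ValueError(f"cannot move {message_id} from {last.status} to {status}")
    created_at = now or datetime.now(timezone.utc)
    entry = Transition(tenant_id, message_id, status, created_at, trace_id, cost)
    log.append(entry)
    return entry


def find(log, tenant_id, status):
    """Filter on the index prefix (tenant_id, status) and return newest first."""
    rows = [t for t in log if t.tenant_id == tenant_id and t.status == status]
    return sorted(rows, key=lambda t: t.created_at, reverse=True)
```

The `find` helper filters and sorts on the same three fields, in the same order, as the compound index, which is the access pattern such an index is meant to serve.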

Step 07

Outbound webhooks

Your backend receives every event

Each transition triggers an HMAC SHA-256 signed webhook to your endpoint. Automatic redelivery with backoff up to 24h, ordering by message_id, and manual replay from the dashboard for the last 30 days.

HMAC SHA-256 · Redelivery worker · Dashboard replay
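
On the receiving side, checking an HMAC SHA-256 signature takes a few lines with Python's standard library. The sketch assumes the signature is the hex digest of the raw request body computed with a shared secret; the exact header name, encoding and signed content are not specified on this page, so treat those details as assumptions. `sign` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC SHA-256 of the raw body (encoding is an assumption)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare in constant time so the check does not leak timing information."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


secret = b"whsec_example"  # placeholder secret for the example
body = b'{"message_id":"m1","status":"delivered"}'
signature = sign(secret, body)
```

Always verify against the raw bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.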

Step 08

Observability

Every message traceable end-to-end

The entire pipeline produces structured logs, metrics (Prometheus) and tracing spans (OpenTelemetry). A single trace_id takes you from ingestion to outbound webhook. Public Grafana dashboards per tenant.

OpenTelemetry · Prometheus + Grafana · Structured JSON logs
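
The idea of following one trace_id across stages can be shown with plain structured JSON logs. This sketch does not use the OpenTelemetry SDK; `new_trace_id`, `log_event`, `stages_for` and the field names other than `trace_id` and `message_id` are illustrative assumptions.

```python
import json
import uuid
from datetime import datetime, timezone


def new_trace_id():
    """32 hex characters, the same width as a W3C/OpenTelemetry trace id."""
    return uuid.uuid4().hex


def log_event(trace_id, stage, message_id, **fields):
    """Serialize one pipeline event as a single JSON log line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "trace_id": trace_id,
        "stage": stage,
        "message_id": message_id,
        **fields,
    }
    return json.dumps(record, sort_keys=True)


def stages_for(lines, trace_id):
    """Rebuild the path of one message by filtering log lines on trace_id."""
    records = (json.loads(line) for line in lines)
    return [r["stage"] for r in records if r["trace_id"] == trace_id]


trace = new_trace_id()
other = new_trace_id()
lines = [
    log_event(trace, "ingestion", "m1", http_status=202),
    log_event(other, "ingestion", "m2", http_status=202),
    log_event(trace, "queueing", "m1", priority="critical"),
    log_event(trace, "outbound_webhook", "m1", attempt=1),
]
path = stages_for(lines, trace)
```

Filtering on the shared identifier reconstructs the path of a single message even when log lines from many messages are interleaved.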

Pillars

Three architectural choices that hold everything up

Native multi-tenant

Each workspace has its own queues, rate-limits and MongoDB indexes. Real isolation — not logical-only via WHERE tenant_id.

Security in layers

Scoped tokens, zero-downtime rotation, PII encrypted at rest (AES-256) and in transit (TLS 1.3), immutable audit log.

Total observability

One trace_id takes you from the client API to the outbound webhook. Public dashboards per tenant. No secrets about what happens inside the platform.

Operational guarantees

What CCX puts in writing

Beyond the SLA, a list of commitments written into Enterprise contracts — and honored by default on every other plan.

  • Guaranteed 7-day idempotency on every endpoint
  • Zero messages sent twice due to an internal failure
  • DLQ retries up to 7 attempts + 7 days of reconciliation
  • Webhooks with automatic 24h redelivery and manual replay
  • Public SLA with financial credits when we fall short
  • Real-time status page and public postmortems within 48h
  • LGPD, GDPR, SOC 2 Type II audit-ready with a pre-approved DPA
  • Full data export on demand (CSV, JSONL, Parquet)

Read this far? You're ready to test.

Spin up a sandbox in 2 minutes and send your first approved template. No card, no lock-in.
