Outbox

Insert the message into an "outbox" table in the same local transaction as the business write, and let a separate relay ship it to the broker. This trades a little latency for at-least-once delivery of every committed business event; paired with idempotent consumers, each event takes effect exactly once.
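
To make the "same local transaction" part concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres or MySQL, and every name in it (the orders and outbox tables, their columns, init_schema, place_order, the "order.placed" topic) is an assumption made for illustration, not something this pattern prescribes.

```python
import json
import sqlite3
import uuid


def init_schema(conn):
    """Create a hypothetical business table and the outbox table next to it."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS orders (
            id          TEXT PRIMARY KEY,
            total_cents INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id           INTEGER PRIMARY KEY AUTOINCREMENT,
            event_id     TEXT NOT NULL UNIQUE,
            topic        TEXT NOT NULL,
            payload      TEXT NOT NULL,
            created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            published_at TEXT              -- NULL until the relay ships it
        );
    """)


def place_order(conn, order_id, total_cents):
    """Write the order and its event atomically; return the event id."""
    event_id = str(uuid.uuid4())
    payload = json.dumps({"order_id": order_id, "total_cents": total_cents})
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so either both rows exist afterwards or neither does.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_id, topic, payload) VALUES (?, ?, ?)",
            (event_id, "order.placed", payload),
        )
    return event_id
```

Note that nothing here talks to the broker. The request path only touches the database, and that is where the atomicity comes from.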

When to reach for it

  • You write to a database AND publish an event in the same operation, and you've already been bitten by one happening without the other
  • Your broker can't join a transaction with your database (Kafka, SQS, or RabbitMQ alongside Postgres: in practice, always)
  • You need an audit trail of what was published, when, and whether downstream accepted it
  • You're building event-driven integration and need a defensible "we did emit that event" when a downstream team swears they never got it
  • You're already on Postgres or MySQL, so adding the outbox table itself costs almost nothing

What it actually costs

You're now running a relay (a poller, a CDC tap such as Debezium, or a worker) that has to be deployed, monitored, and kept from becoming a single point of failure. The relay can re-publish after a crash, so consumers must be idempotent anyway. The outbox table grows; without a retention policy and an index strategy, your INSERTs and the relay's polling queries slow down silently. New failure surface: when the relay stops, your business writes still succeed but no events flow, and you find out from a stale dashboard.
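
A rough sketch of those three costs in one place: a polling relay, a consumer that tolerates re-delivery, and a retention job. It again uses sqlite3 as a stand-in database. The schema, the function names (relay_once, purge_published), the IdempotentConsumer class, and the 7-day retention default are all assumptions for illustration; a real consumer would keep its seen-set in durable storage rather than in memory.

```python
import sqlite3

# Hypothetical outbox schema; column names are illustrative.
OUTBOX_DDL = """
CREATE TABLE IF NOT EXISTS outbox (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id     TEXT NOT NULL UNIQUE,
    topic        TEXT NOT NULL,
    payload      TEXT NOT NULL,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    published_at TEXT
);
"""


def relay_once(conn, publish, batch_size=100):
    """Publish one batch of unpublished rows in insertion order; return the count."""
    rows = conn.execute(
        "SELECT id, event_id, topic, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, event_id, topic, payload in rows:
        publish(event_id, topic, payload)  # if this raises, the row stays unpublished
        # A crash between publish() and this UPDATE means the row is sent
        # again on the next run: that gap is why delivery is at-least-once.
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = CURRENT_TIMESTAMP WHERE id = ?",
                (row_id,),
            )
        sent += 1
    return sent


class IdempotentConsumer:
    """Drops events whose event_id it has already handled (in-memory sketch)."""

    def __init__(self):
        self.seen = set()
        self.handled = []

    def handle(self, event_id, topic, payload):
        if event_id in self.seen:
            return False  # duplicate delivery, ignored
        self.seen.add(event_id)
        self.handled.append((topic, payload))
        return True


def purge_published(conn, older_than_days=7):
    """Retention: delete rows that were published more than N days ago."""
    with conn:
        cur = conn.execute(
            "DELETE FROM outbox WHERE published_at IS NOT NULL "
            "AND published_at < datetime('now', ?)",
            (f"-{older_than_days} days",),
        )
    return cur.rowcount
```

In this sketch the relay would be called in a loop or from a scheduler, with consumer.handle (or a real broker client's send call) passed in as publish. On Postgres, a partial index on the unpublished rows is one way to keep the polling query cheap as the table grows.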

The failure mode nobody mentions

Silent backlog. A single poison-pill row (oversized payload, schema-incompatible event) jams the relay's batch and it retries forever. The outbox grows by 10k rows an hour, replication lag climbs, and nobody notices until a customer-facing query gets slow. Always run a dead-letter sink for the relay itself, and alert on outbox table size — not just relay liveness.
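
One way to keep a poison pill from jamming the batch, sketched under assumptions: an attempts column on the outbox, a separate outbox_dead_letter table, a MAX_ATTEMPTS limit of 5, and a backlog_stats helper you would feed into your alerting. None of these names come from a specific library, and sqlite3 again stands in for the real database.

```python
import sqlite3

# Hypothetical schema: the outbox gains an attempts counter, and failed
# rows are parked in a separate dead-letter table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS outbox (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id     TEXT NOT NULL UNIQUE,
    topic        TEXT NOT NULL,
    payload      TEXT NOT NULL,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    published_at TEXT,
    attempts     INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS outbox_dead_letter (
    event_id  TEXT PRIMARY KEY,
    topic     TEXT NOT NULL,
    payload   TEXT NOT NULL,
    error     TEXT NOT NULL,
    failed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""

MAX_ATTEMPTS = 5  # assumed limit; tune for your broker and payloads


def relay_with_dead_letter(conn, publish, batch_size=100):
    """Publish a batch; park rows that keep failing instead of retrying forever."""
    rows = conn.execute(
        "SELECT id, event_id, topic, payload, attempts FROM outbox "
        "WHERE published_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, event_id, topic, payload, attempts in rows:
        try:
            publish(event_id, topic, payload)
        except Exception as exc:
            with conn:
                if attempts + 1 >= MAX_ATTEMPTS:
                    # Move the row aside atomically so it stops blocking the queue.
                    conn.execute(
                        "INSERT INTO outbox_dead_letter (event_id, topic, payload, error) "
                        "VALUES (?, ?, ?, ?)",
                        (event_id, topic, payload, repr(exc)),
                    )
                    conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
                else:
                    conn.execute(
                        "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?",
                        (row_id,),
                    )
            continue  # move on so one bad row cannot stall the rest of the batch
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = CURRENT_TIMESTAMP WHERE id = ?",
                (row_id,),
            )
        sent += 1
    return sent


def backlog_stats(conn):
    """Numbers to alert on: pending row count and the oldest pending timestamp."""
    pending, oldest = conn.execute(
        "SELECT COUNT(*), MIN(created_at) FROM outbox WHERE published_at IS NULL"
    ).fetchone()
    return {"pending": pending, "oldest_created_at": oldest}
```

Two trade-offs in this sketch are worth naming. Skipping past a failed row gives up strict ordering within the batch, which may or may not be acceptable for your events. It also treats every exception the same way, so a brief broker outage would push healthy rows toward the dead-letter table; a real relay would likely separate transient errors from permanent ones before counting an attempt.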

When not to use it

A fire-and-forget signal where loss is acceptable (a metric, a non-critical analytics event) — publish directly and skip the moving parts.