Everything that happens without a caller happens here. The billing-worker is a single Node process hosting a set of BullMQ workers and repeatable jobs, all over the shared Redis.

The queue topology

One pattern, everywhere

Every async concern in Duro follows the same scan → enqueue → process shape. Learn it once and you understand the billing engine, the recovery engine, and webhook delivery.

Why this shape, every time:
  • The scan is cheap and short. It selects ids and enqueues. It never holds a transaction open while charging a gateway.
  • The work is isolated and retryable. Each item is its own job — it can fail, back off, and retry without affecting its neighbours. BullMQ’s attempts + exponential backoff handle transient failures for free.
  • jobId deduplicates. If a scan fires again before a previously enqueued item has finished, the duplicate collapses: BullMQ does not add a job whose jobId already exists in the queue. This gives idempotency at the queue layer.
  • Both modes, always. A scanner has no “current mode,” so it sweeps test and live. This is the only place that loops both — request-scoped code is always single-mode.
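The shape above can be sketched in a few lines. This is a minimal illustration, not Duro's actual code: the Subscription fields, the jobId scheme, and the attempts/backoff values are all assumptions. The job descriptors mirror the { name, data, opts } entries that BullMQ's Queue.addBulk accepts, and an in-memory Map stands in for the queue's jobId dedup so the sketch runs without Redis.

```typescript
// Hypothetical shapes: the real Duro schema is not shown in this document.
type Mode = "test" | "live";

interface Subscription {
  id: string;
  mode: Mode;
  status: "active" | "trialing" | "canceled";
  currentPeriodEnd: Date;
}

// Mirrors the { name, data, opts } entries accepted by BullMQ's Queue.addBulk.
interface RenewJob {
  name: "renew";
  data: { subscriptionId: string; mode: Mode };
  opts: {
    jobId: string;
    attempts: number;
    backoff: { type: "exponential"; delay: number };
  };
}

// A scanner has no current mode, so it sweeps both.
const MODES: Mode[] = ["test", "live"];

// Scan step: select due ids in both modes and describe one job per item.
// No gateway call happens here, so the scan stays cheap and short.
function scanRenewals(subs: Subscription[], now: Date): RenewJob[] {
  const jobs: RenewJob[] = [];
  for (const mode of MODES) {
    for (const sub of subs) {
      const due =
        sub.mode === mode &&
        (sub.status === "active" || sub.status === "trialing") &&
        sub.currentPeriodEnd.getTime() <= now.getTime();
      if (!due) continue;
      jobs.push({
        name: "renew",
        data: { subscriptionId: sub.id, mode },
        opts: {
          // Assumed id scheme: deterministic per item and billing period.
          // BullMQ custom ids must not contain ":", so "_" is the separator.
          jobId: `renew_${mode}_${sub.id}_${sub.currentPeriodEnd.getTime()}`,
          attempts: 5, // assumed value
          backoff: { type: "exponential", delay: 1000 }, // assumed value
        },
      });
    }
  }
  return jobs;
}

// In-memory stand-in for queue-level dedup: a job whose jobId is already
// pending is skipped. Returns how many jobs were actually added.
function enqueueAll(pending: Map<string, RenewJob>, jobs: RenewJob[]): number {
  let added = 0;
  for (const job of jobs) {
    if (pending.has(job.opts.jobId)) continue;
    pending.set(job.opts.jobId, job);
    added += 1;
  }
  return added;
}
```

Running the scan twice over the same rows and passing both results to enqueueAll adds each item once; the second pass adds nothing. Against a real queue, the array returned by scanRenewals could be handed to addBulk as-is.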

The scanners

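Scanner  | Interval | Finds                                                  | Enqueues
---------|----------|--------------------------------------------------------|---------
billing  | 60s      | active/trialing subs with currentPeriodEnd ≤ now       | renew
reminder | 60m      | active subs renewing within 60 days                    | remind
dunning  | 60s      | scheduled schedules with nextAttemptAt ≤ now           | process
webhook  | 15s      | undispatched events + due deliveries                   | deliver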
The reminder scan runs hourly rather than every minute — renewal reminders aren’t time-critical to the second, and a 60-minute cadence keeps the re-evaluation churn low. The webhook scan runs fastest (15s) because delivery latency is the thing merchants feel.
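To make the cadence trade-off concrete, the table can be written as a typed registry. This is a sketch under assumptions: the names ScannerSpec, SCANNERS, repeatOptions, and scansPerHour are hypothetical, and only the intervals and job names come from the table. The repeat: { every } shape is the BullMQ option for a fixed-interval repeatable job.

```typescript
// The four scanners from the table, with intervals normalised to milliseconds.
type ScannerName = "billing" | "reminder" | "dunning" | "webhook";

interface ScannerSpec {
  everyMs: number;  // scan interval
  enqueues: string; // name of the job each found item becomes
}

const SECOND = 1000;
const MINUTE = 60 * SECOND;

const SCANNERS: Record<ScannerName, ScannerSpec> = {
  billing: { everyMs: 60 * SECOND, enqueues: "renew" },
  reminder: { everyMs: 60 * MINUTE, enqueues: "remind" },
  dunning: { everyMs: 60 * SECOND, enqueues: "process" },
  webhook: { everyMs: 15 * SECOND, enqueues: "deliver" },
};

// Job options for registering a scanner as a fixed-interval repeatable job.
function repeatOptions(name: ScannerName): { repeat: { every: number } } {
  return { repeat: { every: SCANNERS[name].everyMs } };
}

// How many times a scanner re-evaluates its query per hour.
function scansPerHour(name: ScannerName): number {
  return (60 * MINUTE) / SCANNERS[name].everyMs;
}
```

With these intervals the reminder scanner runs once an hour, billing and dunning run 60 times, and the webhook scanner runs 240 times.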

Email as a queue, not an await

Sending email is IO that can be slow and flaky, so it’s never awaited in a request. Producers enqueue an EmailJobData and return; the dedicated email worker drains the queue and talks to SMTP. The welcome/KYC emails (core-api), receipts and dunning reminders (worker), and recovery notices all flow through this one queue. See the email system →
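A rough sketch of the producer/worker split follows. The EmailJobData fields, the template names, and the "send" job name are assumptions, since the real payload is not shown here. EmailQueueLike covers only the (name, data) part of BullMQ's Queue.add, and MemoryEmailQueue is an in-memory stand-in so the example runs without Redis or SMTP.

```typescript
// Hypothetical payload: the real EmailJobData fields are not shown here.
interface EmailJobData {
  to: string;
  template: "welcome" | "receipt" | "dunning-reminder" | "recovery-notice";
  params: Record<string, string>;
}

// Minimal queue surface, matching the (name, data) part of BullMQ's Queue.add.
interface EmailQueueLike {
  add(name: string, data: EmailJobData): Promise<unknown>;
}

// In-memory stand-in so the sketch runs without Redis.
class MemoryEmailQueue implements EmailQueueLike {
  readonly jobs: EmailJobData[] = [];
  add(_name: string, data: EmailJobData): Promise<unknown> {
    this.jobs.push(data);
    return Promise.resolve();
  }
}

// Producer side: enqueue and return. No SMTP call on the request path.
async function requestWelcomeEmail(
  queue: EmailQueueLike,
  to: string,
  name: string
): Promise<void> {
  await queue.add("send", { to, template: "welcome", params: { name } });
}

// Worker side: drain queued jobs through an injected sender
// (the SMTP client in production, a fake in tests).
async function drain(
  queue: MemoryEmailQueue,
  send: (job: EmailJobData) => Promise<void>
): Promise<number> {
  let sent = 0;
  while (queue.jobs.length > 0) {
    const job = queue.jobs.shift() as EmailJobData;
    await send(job);
    sent += 1;
  }
  return sent;
}
```

The request handler only pays for the enqueue. A slow or failing SMTP server affects the worker's retries, not the response time of the request that triggered the email.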

Graceful shutdown

On SIGTERM/SIGINT the worker closes the HTTP health server, then awaits every BullMQ worker and queue close, disconnects Redis and Postgres, and exits. A 25-second hard backstop forces the exit so a hung connection can’t wedge a deploy.
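The shutdown sequence might look roughly like the sketch below. It is an illustration under assumptions: the Closable wrappers around the health server, Redis, and Postgres are hypothetical, as are the exit codes. BullMQ's Worker and Queue do expose an async close(), which is why a single interface can cover them. The exit function is injected so the sequence can be exercised without ending the process.

```typescript
// Anything with an async close(). BullMQ's Worker and Queue both fit;
// the other resources are assumed to be wrapped to match.
interface Closable {
  close(): Promise<void>;
}

interface ShutdownDeps {
  healthServer: Closable; // hypothetical wrapper around the HTTP health server
  workers: Closable[];
  queues: Closable[];
  redis: Closable; // hypothetical wrapper around the Redis connection
  postgres: Closable; // hypothetical wrapper around the Postgres client
  exit: (code: number) => void; // process.exit in production
}

const HARD_BACKSTOP_MS = 25 * 1000;

async function shutdown(
  deps: ShutdownDeps,
  backstopMs: number = HARD_BACKSTOP_MS
): Promise<void> {
  // Hard backstop: force an exit if any close() below never settles.
  const timer = setTimeout(() => deps.exit(1), backstopMs);
  try {
    // 1. Stop answering health checks.
    await deps.healthServer.close();
    // 2. Let every worker and queue finish closing.
    await Promise.all([...deps.workers, ...deps.queues].map((c) => c.close()));
    // 3. Drop the shared connections last, once nothing is using them.
    await deps.redis.close();
    await deps.postgres.close();
    deps.exit(0);
  } catch (err) {
    // A failed close still ends the process, with a non-zero code (assumed).
    deps.exit(1);
  } finally {
    clearTimeout(timer);
  }
}
```

In the worker's entry point this would be wired to both signals, for example with process.on("SIGTERM", ...) and process.on("SIGINT", ...), passing process.exit as the exit function.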