Error Handling and Retries
This page describes what happens when webhook delivery fails — how the queue behaves, what states it can enter, and how to recover.
Queue Architecture
Each integration or OAuth client has its own dedicated queue. When an event fires, a message is appended to that queue and delivered in strict FIFO order. This means:
- Messages are always delivered in the order they were received.
- If one delivery fails, all subsequent messages for that integration are held until the queue is unblocked.
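The head-of-line blocking described above can be illustrated with a small in-memory sketch. The `drain` function and the `deliver` callback are hypothetical names used only for illustration; they are not part of the platform.

```python
from collections import deque
from typing import Callable, Deque, List


def drain(queue: Deque[str], deliver: Callable[[str], bool]) -> List[str]:
    """Deliver messages in FIFO order and stop at the first failure.

    Returns the messages that were delivered. Anything left in `queue`
    (including the failed message) stays there, still in order.
    """
    delivered: List[str] = []
    while queue:
        message = queue[0]          # always look at the oldest message
        if not deliver(message):    # a failed delivery blocks everything behind it
            break
        delivered.append(queue.popleft())
    return delivered


# Hypothetical run: the second event fails, so the third is held back.
pending = deque(["order.created", "order.updated", "order.shipped"])
sent = drain(pending, deliver=lambda m: m != "order.updated")
```

In this run only the first event is delivered; the failed event and the one behind it remain queued in their original order.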
Queue States
A queue has two independent properties: status and state.
status — delivery health
| Value | Description |
|---|---|
| HEALTHY | The last delivery attempt succeeded. Messages flow normally. |
| BLOCKED | The last delivery attempt failed. Delivery is paused for this queue. |
state — processing lock
| Value | Description |
|---|---|
| READY | The queue is idle. The next message can be picked up. |
| LOCKED | A message is currently being delivered. The queue will not pick up another message until the in-flight attempt resolves. |
The queue is in LOCKED state while a message is in flight (i.e., the HTTP POST has been sent but no result has been recorded yet). Once the result is received — success or error — the state returns to READY.
What Triggers a Failure
A delivery attempt is considered failed if any of the following occur:
- The HTTP response status code is not in the 2xx range (i.e., outside 200–299).
- The HTTP request throws an exception (connection refused, timeout, DNS failure, TLS error, etc.).
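As a rough sketch of the two failure criteria above, the function below classifies a single attempt. `attempt_delivery` and the `post` callable are hypothetical stand-ins for whatever performs the HTTP POST; the error message format is also an assumption.

```python
from typing import Callable, Optional, Tuple


def attempt_delivery(post: Callable[[], int]) -> Tuple[bool, Optional[str]]:
    """Run one delivery attempt and classify the outcome.

    `post` is a hypothetical callable that performs the HTTP POST and
    returns the response status code, or raises on a transport error.
    Returns (succeeded, error_message).
    """
    try:
        status_code = post()
    except Exception as exc:  # connection refused, timeout, DNS, TLS, ...
        return False, f"{type(exc).__name__}: {exc}"
    if 200 <= status_code <= 299:
        return True, None
    # Any status outside 200-299 counts as a failure, including redirects.
    return False, f"unexpected HTTP status {status_code}"
```

Note that a 3xx response is treated as a failure here, since only the 2xx range counts as success.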
On failure, QueueProcessor.error() is called. This:
- Records lastTriedAt and lastErrorMessage on the failed message.
- Sets the queue status to BLOCKED and state to READY.
- Emits an internal Kafka event to applications.message.failed.
No automatic retry is scheduled. The queue stays BLOCKED until an operator or support team manually resolves the situation (see Recovery Procedures below).
What Happens When Your Endpoint Is Down
- Message enqueued: the domain event fires and the message is persisted in gb_integration_queue_message. New events continue to be accepted.
- Delivery attempted: QueueProcessor.send() tries to POST to your endPoint.
- Failure recorded: the message is updated with the error details and the queue is marked BLOCKED.
- Delivery paused: no further messages are delivered to that endpoint. Subsequent events pile up in the queue.
- Your webhook's health field, visible via the GraphQL API, reflects FAILING.
New events are still accepted and stored in the queue while it is blocked. No events are lost; they are simply waiting for the queue to be unblocked.
How Blocked Status Works
When status = BLOCKED, QueueProcessor.next() refuses to pick up the next message unless forced. The pseudologic is:
    if queue.status == BLOCKED or queue.state == LOCKED:
        return  # do nothing
Only a force = true call (issued internally after a successful delivery) can advance past a blocked queue. This ensures that a known-bad endpoint does not receive a flood of retries.
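The guard can be sketched with the force flag made explicit. This is an illustration under a stated assumption, not the actual QueueProcessor.next() code: it assumes that force skips only the BLOCKED check and never overrides LOCKED, which the text above does not confirm. The `Queue` class and `may_pick_next` function are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    HEALTHY = "HEALTHY"
    BLOCKED = "BLOCKED"


class State(Enum):
    READY = "READY"
    LOCKED = "LOCKED"


@dataclass
class Queue:
    status: Status = Status.HEALTHY
    state: State = State.READY


def may_pick_next(queue: Queue, force: bool = False) -> bool:
    """Return True if the next message may be picked up.

    Assumption: `force` skips only the BLOCKED check; a LOCKED queue
    is never advanced because a delivery is already in flight.
    """
    if queue.state is State.LOCKED:
        return False
    if queue.status is Status.BLOCKED and not force:
        return False
    return True
```

For example, `may_pick_next(Queue(status=Status.BLOCKED))` is False, while the same call with `force=True` is True.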
Recovery Procedures
Because retries are not automatic, recovery requires a manual action:
- Fix your endpoint: ensure the URL is reachable, returns 2xx responses, and can handle the full payload within the timeout window.
- Contact support: ask the HappyColis support team to unblock the queue. They will reset status to HEALTHY and trigger QueueProcessor.next() to resume delivery from the oldest pending message.
- After unblocking: the queue replays messages in FIFO order, starting from the message that originally failed (it is still in the queue with its error metadata).
Delivery Confirmation
A delivery is confirmed successful when:
- QueueProcessor.send() receives an HTTP response with a 2xx status code.
- QueueProcessor.success() deletes the message from gb_integration_queue_message.
- The queue status is reset to HEALTHY and state to READY.
- An internal Kafka event is emitted to applications.message.processed.
Message Retention
Messages remain in gb_integration_queue_message until they are successfully delivered. There is no automatic expiry — if a queue stays blocked indefinitely, its messages are retained indefinitely. This is by design to prevent data loss.
Payload Signing and Delivery
Every HTTP POST includes an HMAC-SHA256 signature computed over the full JSON body using the application's clientSecret. If your endpoint rejects the request because of a signature mismatch, the queue will be marked BLOCKED just like any other failure.
See Signature Verification for details on how to verify the signature on your side.
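To avoid blocking your own queue with signature rejections, verify against the raw request bytes rather than a re-serialized copy of the JSON. The sketch below assumes the signature arrives hex-encoded; the encoding and the way the signature is transmitted are not specified on this page, so treat `sign_body` and `is_valid_signature` as hypothetical helpers and check Signature Verification for the actual format.

```python
import hashlib
import hmac


def sign_body(body: bytes, client_secret: str) -> str:
    """Return the hex-encoded HMAC-SHA256 of the raw request body.

    Assumption: hex encoding. The real service may use another encoding.
    """
    return hmac.new(client_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_valid_signature(body: bytes, client_secret: str, received: str) -> bool:
    """Compare signatures in constant time to avoid timing leaks."""
    expected = sign_body(body, client_secret)
    return hmac.compare_digest(expected, received)
```

Even a single extra byte of whitespace in the body changes the digest, which is why the comparison should use the bytes exactly as received.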
Summary
| Situation | Effect |
|---|---|
| 2xx response received | Message deleted, queue stays HEALTHY, next message dispatched |
| Non-2xx response received | Queue set to BLOCKED, error recorded, no further delivery |
| Network/connection exception | Queue set to BLOCKED, error recorded, no further delivery |
| Queue is BLOCKED | New events accepted and queued, but not delivered |
| Queue is LOCKED | No new dispatch until in-flight attempt resolves |
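The first three rows of the table can be read as a single transition applied when an in-flight attempt resolves. The sketch below is a simplified model with hypothetical names (`QueueSnapshot`, `record_outcome`); for brevity it stores the error on the queue, whereas the real system records it on the failed message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueSnapshot:
    status: str = "HEALTHY"   # HEALTHY or BLOCKED
    state: str = "LOCKED"     # LOCKED while an attempt is in flight
    last_error: Optional[str] = None


def record_outcome(queue: QueueSnapshot, error: Optional[str]) -> bool:
    """Apply the result of an in-flight attempt to the queue.

    `error` is None for a 2xx response, otherwise a description of the
    non-2xx status or exception. Returns True if the next message
    should be dispatched.
    """
    queue.state = "READY"                 # the attempt has resolved either way
    if error is None:
        queue.status = "HEALTHY"
        queue.last_error = None
        return True                       # message deleted, continue with next
    queue.status = "BLOCKED"
    queue.last_error = error              # kept for diagnosis; no retry scheduled
    return False
```

In both branches the state returns to READY; only the status decides whether delivery continues.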
See Also
- Overview — system architecture
- Signature Verification — HMAC signature verification
- Subscription Management — webhook health states