
Error Handling and Retries

This page describes what happens when webhook delivery fails — how the queue behaves, what states it can enter, and how to recover.

Queue Architecture

Each integration or OAuth client has its own dedicated queue. When an event fires, a message is appended to that queue and delivered in strict FIFO order. This means:

  • Messages are always delivered in the order they were received.
  • If one delivery fails, all subsequent messages for that integration are held until the queue is unblocked.

Queue States

A queue has two independent properties: status and state.

status — delivery health

Value | Description
HEALTHY | The last delivery attempt succeeded. Messages flow normally.
BLOCKED | The last delivery attempt failed. Delivery is paused for this queue.

state — processing lock

Value | Description
READY | The queue is idle. The next message can be picked up.
LOCKED | A message is currently being delivered. The queue will not pick up another message until the in-flight attempt resolves.

The queue is in LOCKED state while a message is in flight (i.e., the HTTP POST has been sent but no result has been recorded yet). Once the result is received — success or error — the state returns to READY.
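
As an illustration only, the two properties can be modelled as independent enums on a queue object. The class and method names below (IntegrationQueue, lock, resolve) are hypothetical and are not taken from the HappyColis codebase; the sketch only mirrors the transitions described above.

```python
from dataclasses import dataclass
from enum import Enum


class QueueStatus(Enum):
    HEALTHY = "HEALTHY"  # last delivery attempt succeeded
    BLOCKED = "BLOCKED"  # last delivery attempt failed


class QueueState(Enum):
    READY = "READY"    # idle, the next message can be picked up
    LOCKED = "LOCKED"  # a delivery is in flight


@dataclass
class IntegrationQueue:
    """Hypothetical in-memory model of one per-integration queue."""

    status: QueueStatus = QueueStatus.HEALTHY
    state: QueueState = QueueState.READY

    def lock(self) -> None:
        """Mark a delivery as in flight (POST sent, no result recorded yet)."""
        self.state = QueueState.LOCKED

    def resolve(self, succeeded: bool) -> None:
        """Record the result of the in-flight attempt and release the lock."""
        self.status = QueueStatus.HEALTHY if succeeded else QueueStatus.BLOCKED
        self.state = QueueState.READY
```

Note that a failed attempt leaves the queue BLOCKED but READY: the lock is released even though delivery stays paused.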

What Triggers a Failure

A delivery attempt is considered failed if any of the following occur:

  • The HTTP response status code is not in the 2xx range (i.e., outside 200–299).
  • The HTTP request throws an exception (connection refused, timeout, DNS failure, TLS error, etc.).

On failure, QueueProcessor.error() is called. This:

  1. Records lastTriedAt and lastErrorMessage on the failed message.
  2. Sets the queue status to BLOCKED and state to READY.
  3. Emits an internal Kafka event to applications.message.failed.

No automatic retry is scheduled. The queue stays BLOCKED until an operator or support team manually resolves the situation (see Recovery Procedures below).
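
The failure rule and its documented effects can be sketched as follows. This is a simplified illustration on plain dictionaries, not the actual QueueProcessor code; the helper names is_failed_attempt and record_failure are assumptions, and the Kafka event is left out.

```python
from datetime import datetime, timezone
from typing import Optional


def is_failed_attempt(status_code: Optional[int],
                      exc: Optional[BaseException]) -> bool:
    """Return True if the attempt raised, or the status is outside 200-299."""
    if exc is not None:
        return True
    return status_code is None or not (200 <= status_code <= 299)


def record_failure(queue: dict, message: dict, error_message: str) -> None:
    """Mimic the documented effects of QueueProcessor.error()."""
    message["lastTriedAt"] = datetime.now(timezone.utc).isoformat()
    message["lastErrorMessage"] = error_message
    queue["status"] = "BLOCKED"
    queue["state"] = "READY"
    # The real system also emits a Kafka event to applications.message.failed;
    # that side effect is omitted from this sketch.
```

Under this rule a redirect such as 301 counts as a failure just like a 500, because only 2xx responses are treated as success.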

What Happens When Your Endpoint Is Down

  1. Message enqueued: the domain event fires and the message is persisted in gb_integration_queue_message. New events continue to be accepted.
  2. Delivery attempted: QueueProcessor.send() tries to POST to your endPoint.
  3. Failure recorded: the message is updated with the error details and the queue is marked BLOCKED.
  4. Delivery paused: no further messages are delivered to that endpoint. Subsequent events pile up in the queue.
  5. Health updated: your webhook's health field, visible via the GraphQL API, reports FAILING.

New events are still accepted and stored in the queue while it is blocked. No events are lost; they are simply waiting for the queue to be unblocked.

How Blocked Status Works

When status = BLOCKED, QueueProcessor.next() refuses to pick up the next message unless forced. The pseudologic is:

if queue.status == BLOCKED or queue.state == LOCKED:
    return  # do nothing

Only a force = true call (issued internally after a successful delivery) can advance past a blocked queue. This ensures that a known-bad endpoint does not receive a flood of retries.
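
The guard combined with the force flag might look like the sketch below. The function name should_dispatch is hypothetical, and the assumption that force bypasses both the BLOCKED and the LOCKED check is ours; the text above only confirms that a forced call can advance past a blocked queue.

```python
def should_dispatch(status: str, state: str, force: bool = False) -> bool:
    """Decide whether the next message may be picked up.

    Assumption: a forced call skips the guard entirely. The real
    QueueProcessor.next() may treat LOCKED differently when forced.
    """
    if force:
        return True
    return status != "BLOCKED" and state != "LOCKED"
```

With this shape, an unforced call on a blocked queue is a no-op, which is what keeps a known-bad endpoint from being hit repeatedly.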

Recovery Procedures

Because retries are not automatic, recovery requires a manual action:

  1. Fix your endpoint — ensure the URL is reachable, returns 2xx responses, and can handle the full payload within the timeout window.
  2. Contact support — ask the HappyColis support team to unblock the queue. They will reset status to HEALTHY and trigger QueueProcessor.next() to resume delivery from the oldest pending message.
  3. After unblocking — the queue replays messages in FIFO order, starting from the message that originally failed (it is still in the queue with its error metadata).
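
To make the blocking and replay behaviour concrete, here is a small in-memory simulation. It is a sketch under stated assumptions: SimulatedQueue, drain, and unblock are hypothetical names, unblock stands in for the manual support action, and the fake deliver function replaces the real HTTP POST.

```python
from collections import deque
from typing import Callable, Deque, List


class SimulatedQueue:
    """Toy model of one integration queue with head-of-line blocking."""

    def __init__(self, deliver: Callable[[str], int]) -> None:
        self.messages: Deque[str] = deque()
        self.status = "HEALTHY"
        self.deliver = deliver

    def enqueue(self, message: str) -> None:
        # New events are always accepted, even while the queue is blocked.
        self.messages.append(message)

    def drain(self) -> None:
        # Deliver from the head until the queue is empty or an attempt fails.
        while self.messages and self.status == "HEALTHY":
            code = self.deliver(self.messages[0])
            if 200 <= code <= 299:
                self.messages.popleft()  # confirmed: remove the message
            else:
                self.status = "BLOCKED"  # the failed message stays at the head

    def unblock(self) -> None:
        # Stand-in for support resetting the status and resuming delivery.
        self.status = "HEALTHY"
        self.drain()


endpoint = {"up": False}
received: List[str] = []


def deliver(message: str) -> int:
    """Fake endpoint: 503 while down, 200 once it is back up."""
    if not endpoint["up"]:
        return 503
    received.append(message)
    return 200


q = SimulatedQueue(deliver)
q.enqueue("a")
q.drain()             # "a" fails, queue becomes BLOCKED
q.enqueue("b")
q.enqueue("c")
q.drain()             # no-op while blocked
endpoint["up"] = True
q.unblock()           # replays "a", then "b", then "c"
print(received)       # ['a', 'b', 'c']
```

The message that originally failed is delivered first after unblocking, so your endpoint sees events in the same order they were produced.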

Delivery Confirmation

A delivery is confirmed successful when:

  • QueueProcessor.send() receives an HTTP response with a 2xx status code.
  • QueueProcessor.success() deletes the message from gb_integration_queue_message.
  • The queue status is reset to HEALTHY and state to READY.
  • An internal Kafka event is emitted to applications.message.processed.

Message Retention

Messages remain in gb_integration_queue_message until they are successfully delivered. There is no automatic expiry — if a queue stays blocked indefinitely, its messages are retained indefinitely. This is by design to prevent data loss.

Payload Signing and Delivery

Every HTTP POST includes an HMAC-SHA256 signature computed over the full JSON body using the application's clientSecret. If your endpoint rejects the request because of a signature mismatch, the queue will be marked BLOCKED just like any other failure.

See Signature Verification for details on how to verify the signature on your side.
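
As a rough sketch of the receiving side, the check can be done with Python's standard hmac module. The function name verify_signature is hypothetical, and the sketch assumes the signature arrives as a lowercase hex digest; how the signature is transmitted and encoded is not specified on this page, so confirm both against Signature Verification.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, client_secret: str,
                     received_signature: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time.

    Assumption: the signature is hex-encoded. Adjust the encoding if the
    Signature Verification page specifies another format (e.g. Base64).
    """
    expected = hmac.new(
        client_secret.encode("utf-8"), raw_body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

Verify against the raw request bytes rather than a re-serialized JSON object: any change in whitespace or key order produces a different digest, which would cause your endpoint to reject valid deliveries and block the queue.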

Summary

Situation | Effect
2xx response received | Message deleted, queue stays HEALTHY, next message dispatched
Non-2xx response received | Queue set to BLOCKED, error recorded, no further delivery
Network/connection exception | Queue set to BLOCKED, error recorded, no further delivery
Queue is BLOCKED | New events accepted and queued, but not delivered
Queue is LOCKED | No new dispatch until in-flight attempt resolves

See Also

HappyColis API Documentation