# Rate Limiting
The HappyColis API enforces rate limits to ensure fair usage and platform stability. This page describes the rate limiting policy, the response headers to monitor, and how to implement resilient retry logic in your integration.
## Rate Limit Headers
Every API response includes the following headers so you can track your current usage:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum number of requests allowed in the current window |
| `X-RateLimit-Remaining` | Number of requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the current window resets |
| `Retry-After` | Seconds to wait before retrying (only present on 429 responses) |
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 842
X-RateLimit-Reset: 1705321200
```

## Default Limits
| Tier | Requests per window | Window duration |
|---|---|---|
| Standard | 1,000 | 1 minute |
| Burst | 100 | 1 second |
Rate limits are applied per access token (i.e. per organization + client combination). Heavy batch operations count as multiple requests based on query complexity.
Contact HappyColis support if your integration requires higher limits.
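To stay under both windows client-side, you can track the timestamps of recent requests and block before sending whenever either window is full. A minimal sketch (the class name and interface are illustrative, not part of any HappyColis SDK):

```python
import time
from collections import deque


class DualWindowThrottle:
    """Client-side throttle that respects both the standard (per-minute)
    and burst (per-second) windows before a request is sent."""

    def __init__(self, per_minute=1000, per_second=100):
        self.limits = [(per_minute, 60.0), (per_second, 1.0)]
        self.sent = deque()  # timestamps of recently sent requests

    def acquire(self):
        """Block until a request may be sent, then record it."""
        while True:
            now = time.monotonic()
            # Drop timestamps older than the longest window.
            while self.sent and now - self.sent[0] >= 60.0:
                self.sent.popleft()
            waits = []
            for limit, window in self.limits:
                in_window = [t for t in self.sent if now - t < window]
                if len(in_window) >= limit:
                    # Wait until the oldest request leaves this window.
                    waits.append(in_window[0] + window - now)
            if not waits:
                self.sent.append(now)
                return
            time.sleep(max(waits))
```

Call `acquire()` before each API request; it returns immediately while both windows have capacity and sleeps only when one is exhausted.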
## 429 Too Many Requests
When you exceed the rate limit, the API returns HTTP 429:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705321200
```

```json
{
  "statusCode": 429,
  "message": "Too many requests. Please retry after 30 seconds."
}
```

## Handling Rate Limits
### Exponential Backoff
When you receive a 429 response, wait for the duration specified in Retry-After before retrying. If Retry-After is absent, use exponential backoff with jitter.
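When `Retry-After` is missing, the wait for attempt *n* can be drawn with "full jitter", i.e. uniformly from `[0, min(cap, base * 2**n)]`. A minimal sketch (the function name and defaults are illustrative):

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter: the wait for attempt n is
    drawn uniformly from [0, min(cap, base * 2**n)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Randomizing the full interval, rather than adding a small jitter on top, spreads retries from many clients evenly and avoids synchronized retry storms.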
```javascript
async function gqlRequestWithRetry(accessToken, query, variables = {}, maxRetries = 5) {
  let attempt = 0;
  while (attempt <= maxRetries) {
    const response = await fetch('https://api-v3.happycolis.com/graphql', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${accessToken}`,
      },
      body: JSON.stringify({ query, variables }),
    });

    if (response.status === 429) {
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
      if (attempt === maxRetries) {
        throw new Error(`Rate limit exceeded after ${maxRetries} retries`);
      }
      // Add jitter: wait retryAfter seconds ± 10%
      const jitter = retryAfter * 0.1 * (Math.random() * 2 - 1);
      const waitMs = (retryAfter + jitter) * 1000;
      console.warn(`Rate limited. Retrying in ${Math.round(waitMs / 1000)}s (attempt ${attempt + 1}/${maxRetries})`);
      await new Promise(resolve => setTimeout(resolve, waitMs));
      attempt++;
      continue;
    }

    const result = await response.json();
    if (result.errors?.length) throw new Error(result.errors[0].message);
    return result.data;
  }
}
```

```python
import time
import random

import requests


def gql_request_with_retry(access_token, query, variables=None, max_retries=5):
    attempt = 0
    while attempt <= max_retries:
        response = requests.post(
            'https://api-v3.happycolis.com/graphql',
            headers={
                'Authorization': f'Bearer {access_token}',
                'Content-Type': 'application/json',
            },
            json={'query': query, 'variables': variables or {}},
        )

        if response.status_code == 429:
            if attempt == max_retries:
                raise Exception(f'Rate limit exceeded after {max_retries} retries')
            retry_after = int(response.headers.get('Retry-After', 60))
            # Add ±10% jitter
            jitter = retry_after * 0.1 * (random.random() * 2 - 1)
            wait = retry_after + jitter
            print(f'Rate limited. Retrying in {wait:.1f}s (attempt {attempt + 1}/{max_retries})')
            time.sleep(wait)
            attempt += 1
            continue

        result = response.json()
        if 'errors' in result and result['errors']:
            raise Exception(result['errors'][0]['message'])
        return result['data']
```

## Best Practices
### Monitor Headers Proactively
Check X-RateLimit-Remaining on every response. When remaining falls below a threshold (e.g. 10% of limit), slow down your request rate before hitting the limit.
```javascript
function checkRateLimitHeaders(response) {
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '0', 10);
  const limit = parseInt(response.headers.get('X-RateLimit-Limit') || '1000', 10);
  const resetAt = parseInt(response.headers.get('X-RateLimit-Reset') || '0', 10);

  if (remaining < limit * 0.1) {
    const waitMs = Math.max(0, resetAt * 1000 - Date.now());
    console.warn(`Approaching rate limit (${remaining}/${limit} remaining). Reset in ${Math.round(waitMs / 1000)}s`);
  }

  return { remaining, limit, resetAt };
}
```

### Batch Requests
Use GraphQL's ability to request multiple fields in a single query to minimize the number of API calls:
```graphql
# Fetch multiple resources in one request
query GetOrderAndShipment($orderId: String!, $shipmentId: String!) {
  order(id: $orderId) {
    id
    state
    lines { sku quantity }
  }
  shipment(shipmentId: $shipmentId) {
    id
    state
    lastEvent
  }
}
```

### Distribute Load Over Time
For bulk operations (e.g. syncing a catalogue), spread requests across time rather than sending them all at once, either by pacing requests at a fixed rate or by using a queue with a controlled concurrency limit:
```javascript
async function batchProcess(items, accessToken, processFn, ratePerSecond = 10) {
  const delayMs = 1000 / ratePerSecond;
  const results = [];

  for (const item of items) {
    results.push(await processFn(accessToken, item));
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }

  return results;
}
```

### Cache Read Results
Cache frequently-accessed data like product catalogues, location lists, and stock references with a reasonable TTL. Use webhook subscriptions to invalidate the cache when the underlying data changes, rather than polling.
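A minimal in-memory sketch of this pattern (the `TTLCache` class and its defaults are illustrative; a production integration might use Redis or another shared cache):

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry TTL and explicit
    invalidation (e.g. called from a webhook handler)."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def invalidate(self, key):
        # Call this from your webhook handler when the resource changes.
        self._store.pop(key, None)
```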