Summary

  • Rate limits are per user and apply across all of your concurrent WebSocket connections.
  • We use an Exponential Moving Average (EMA) model to smooth bursts rather than a simple per-second counter.
  • There are two independent buckets per user:
    • General bucket for orders, queries, subscriptions, etc.
    • Cancel bucket for cancel-related messages.
  • Each message type has an internal weight. Heavier actions (e.g., placing orders) consume more budget than light ones (e.g., subscribing).
Limits and weights may change over time to protect system stability. Always handle RateLimited responses gracefully rather than assuming fixed throughput.

Message weights

Current values:
Message                          Weight
add_order                        2.0
cancel_order                     2.0
modify_order                     2.0
get_order                        0.5
get_user_orders                  0.5
cancel_all_orders                2.0
get_user_trades                  0.5
subscribe                        0.1
unsubscribe                      0.1
get_user_leverage                0.1
get_available_leverage_levels    0.1
set_user_leverage                0.1
cancel_stop_order                0.1
modify_stop_order                0.1
transfer_balance                 0.1
cancel_on_disconnect             0.1
Example: If max_load = 5.0, an add_order (2.0) uses ~40% of your budget (2.0 / 5.0). You can sustainably place ~2–3 orders/sec (5.0 / 2.0 = 2.5).
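
The arithmetic in the example can be sketched in a few lines. This is only an illustration: MAX_LOAD = 5.0 is the hypothetical value from the example (not a published limit), it is assumed to be expressed in weight units per second, and the function names are made up for this sketch.

```python
# Weights copied from the table above (subset).
WEIGHTS = {"add_order": 2.0, "get_order": 0.5, "subscribe": 0.1}

# Hypothetical threshold from the example; assumed to be weight units per second.
MAX_LOAD = 5.0


def budget_share(message_type: str, max_load: float = MAX_LOAD) -> float:
    """Fraction of the budget consumed by one message of this type."""
    return WEIGHTS[message_type] / max_load


def sustainable_rate(message_type: str, max_load: float = MAX_LOAD) -> float:
    """Messages per second whose weighted rate equals max_load."""
    return max_load / WEIGHTS[message_type]


print(budget_share("add_order"))      # 0.4
print(sustainable_rate("add_order"))  # 2.5
print(sustainable_rate("get_order"))  # 10.0
```

Mixed traffic shares the same budget, so in practice the weighted sum of all message types is what has to stay under the threshold.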

What’s limited

These limits apply to WebSocket inbound messages after successful authentication. HTTP endpoints and the initial WebSocket handshake may be subject to separate controls.

How it works

We maintain an EMA of your weighted message rate. Each incoming message adds its weight; between messages the EMA decays automatically.
  • If your EMA rises above the allowed threshold for a bucket, the next message for that bucket is rejected with a RateLimited error.
  • The error includes a retry-after hint so your client knows when it is safe to try again.
Because cancels use a separate bucket, you can often cancel even when your general bucket is momentarily saturated (subject to the cancel bucket’s own limits).
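
To make the mechanism concrete, here is a minimal sketch of one way such a limiter could behave. It is not the server's actual implementation: the exponential decay formula, the time constant tau, the max_load values, and the class name EmaBucket are all assumptions chosen for illustration.

```python
import math


class EmaBucket:
    """Illustrative decaying load tracker (assumed model, not the real one)."""

    def __init__(self, max_load: float, tau: float = 1.0):
        self.max_load = max_load  # assumed threshold, in weight units
        self.tau = tau            # assumed decay time constant, in seconds
        self.load = 0.0
        self.last = None          # timestamp of the previous message

    def _decay(self, now: float) -> None:
        # Between messages the tracked load decays exponentially.
        if self.last is not None:
            self.load *= math.exp(-(now - self.last) / self.tau)
        self.last = now

    def offer(self, weight: float, now: float):
        """Return (accepted, retry_after_seconds)."""
        self._decay(now)
        if self.load > self.max_load:
            # Time for the load to decay back down to the threshold.
            return False, self.tau * math.log(self.load / self.max_load)
        self.load += weight
        return True, 0.0


# Two independent buckets per user, with hypothetical thresholds.
buckets = {"general": EmaBucket(max_load=5.0), "cancel": EmaBucket(max_load=5.0)}

# Four add_order messages (weight 2.0) arriving at the same instant.
burst = [buckets["general"].offer(2.0, now=0.0) for _ in range(4)]
print([accepted for accepted, _ in burst])  # [True, True, True, False]

# The cancel bucket is tracked separately, so a cancel still goes through.
print(buckets["cancel"].offer(2.0, now=0.0)[0])  # True
```

In this sketch the rejected message does not add to the load; whether the real limiter charges rejected messages is not stated here.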

Per-user, cross-connection

If you open multiple WebSocket connections, their traffic aggregates into the same per-user budgets (one general, one cancel). Opening more sockets does not increase your effective allowance.
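
The aggregation can be pictured as budgets keyed by user and bucket rather than by connection. The sketch below is a deliberately simplified illustration (no decay, hypothetical names such as SharedBudget and connection_id), not a description of the server internals.

```python
from collections import defaultdict


class SharedBudget:
    """Accumulates weight per (user, bucket); the connection is not part of the key."""

    def __init__(self):
        self._load = defaultdict(float)

    def record(self, user_id: str, bucket: str, weight: float, connection_id: str) -> float:
        # connection_id is accepted only to show that it is ignored:
        # traffic from every socket lands in the same per-user entry.
        self._load[(user_id, bucket)] += weight
        return self._load[(user_id, bucket)]


budget = SharedBudget()
budget.record("user-1", "general", 2.0, connection_id="ws-a")
total = budget.record("user-1", "general", 2.0, connection_id="ws-b")
print(total)  # 4.0
```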

Error you may see

When a message is throttled, you’ll receive a unicast error on your WebSocket:
{
  "type": "Err",
  "error_code": "RateLimited",
  "message": "Rate limit exceeded, retry after N seconds",
  "incoming_message": {
    /* your original request */
  }
}
Use the retry-after value as guidance for when to resend.
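
A client can pull the delay out of the error message along these lines. This sketch assumes the message text always follows the "retry after N seconds" pattern shown above, with N a plain number; no separate numeric field is documented here, and the helper name retry_after_seconds is made up for this example.

```python
import json
import re

# Assumed pattern, based only on the sample message above.
RETRY_AFTER = re.compile(r"retry after (\d+(?:\.\d+)?) seconds?")


def retry_after_seconds(raw: str):
    """Return the suggested delay in seconds for a RateLimited error, else None."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        return None
    if msg.get("type") != "Err" or msg.get("error_code") != "RateLimited":
        return None
    match = RETRY_AFTER.search(msg.get("message", ""))
    return float(match.group(1)) if match else None


sample = json.dumps({
    "type": "Err",
    "error_code": "RateLimited",
    "message": "Rate limit exceeded, retry after 3 seconds",
    "incoming_message": {"type": "add_order"},  # hypothetical original request
})
print(retry_after_seconds(sample))  # 3.0
```

If the pattern does not match, the function returns None, so callers may want a conservative default delay for that case.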

Client best practices

  1. Backoff with jitter
    On RateLimited, wait the suggested number of seconds plus a small random jitter (e.g., 50–200 ms) before retrying. This avoids thundering herds.
  2. Coalesce & batch
    Prefer fewer, purposeful messages over many tiny ones (especially modifies).
  3. Respect cancel bucket
    Cancels have their own allowance to help you unwind risk quickly. It’s separate, not unlimited, so avoid bursty cancel storms.
  4. Stagger across connections
    If you run multiple connections, stagger bursts. They share the same per-user budgets.
  5. Idempotency
    Use client order IDs where supported so a retry doesn’t create duplicates if the first attempt actually succeeded server-side.
  6. Subscription hygiene
    Subscriptions are light but not free. Avoid repeatedly subscribing/unsubscribing in short intervals.
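
Practice 1 (backoff with jitter) might look like the following minimal sketch. The send callable, its return convention (None on success, retry-after seconds when rate-limited), and the max_attempts default are hypothetical; only the 50–200 ms jitter range comes from the text above.

```python
import random
import time


def backoff_delay(retry_after_s: float, jitter_ms=(50, 200), rng=random) -> float:
    """Suggested wait plus a small random jitter, in seconds."""
    low, high = jitter_ms
    return retry_after_s + rng.uniform(low, high) / 1000.0


def send_with_retry(send, message, max_attempts: int = 5, sleep=time.sleep) -> bool:
    """Call send(message) until it succeeds or attempts run out.

    send is a hypothetical callable returning None on success, or the
    retry-after value in seconds when the server answered RateLimited.
    """
    for attempt in range(max_attempts):
        retry_after = send(message)
        if retry_after is None:
            return True
        if attempt < max_attempts - 1:
            sleep(backoff_delay(retry_after))
    return False
```

Passing sleep as a parameter keeps the helper easy to test; an asyncio client would await asyncio.sleep with the same delay instead.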

FAQs

Is the limit per IP or per API key?
Per user. Multiple API keys tied to the same user share the same budgets.

Do reads count against the limit?
Yes, but they are lighter than writes.

Can I burst?
Short bursts are tolerated due to EMA smoothing. Sustained rates above the threshold will be throttled.

Can I request higher limits?
If you have a production use case that requires more throughput, contact us to discuss dedicated quotas.

Change policy

We may adjust rate-limit thresholds and message weights from time to time to maintain platform reliability. Such changes do not require client updates, provided your integration handles RateLimited responses and performs backoff as described above.