# API Rate Limits

> This page explains how request throttling works on our **WebSocket trading API**. If you’re writing a client, read this to avoid `RateLimited` errors.

## Summary

* Rate limits are **per user** and apply across **all** of your concurrent WebSocket connections.
* The limit is **12,000 weight units per 60 seconds**.
* We use an **Exponential Moving Average (EMA)** model to smooth bursts rather than a simple per-second counter.
* There are **two independent buckets** per user:
  * **General bucket** for orders, queries, subscriptions, etc.
  * **Cancel bucket** for cancel-related messages.
* Each message type has an internal **weight**. Heavier actions (e.g., placing orders) consume more budget than light ones (e.g., subscribing).

> Limits and weights may change over time to protect system stability. Always handle `RateLimited` responses gracefully rather than assuming fixed throughput.

***

## Message weights

Current values:

| Message                          | Weight |
| -------------------------------- | :----: |
| add\_order                       |   1.0  |
| cancel\_order                    |   1.0  |
| modify\_order                    |   1.0  |
| get\_order                       |   2.0  |
| get\_user\_orders                |   5.0  |
| cancel\_all\_orders              |   2.0  |
| get\_user\_trades                |   0.5  |
| subscribe                        |   0.1  |
| unsubscribe                      |   0.1  |
| get\_user\_leverage              |   0.1  |
| get\_available\_leverage\_levels |   0.1  |
| set\_user\_leverage              |   0.1  |
| cancel\_stop\_order              |   1.0  |
| modify\_stop\_order              |   1.0  |
| cancel\_on\_disconnect           |   0.1  |

The current limit is **12,000 units per 60 seconds**, which averages to 200 units/sec. An `add_order` has weight 1.0, so you can sustainably place \~200 orders/sec, provided no other messages draw on the same budget.
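
To check whether a planned traffic mix fits the sustained budget, the arithmetic above can be applied to each message type. The following is a minimal sketch: the weights are copied from the table (and may change), while `units_per_second` and the sample `plan` are hypothetical names used only for illustration. It does not model the split between the general and cancel buckets.

```python
from typing import Mapping

# Weights copied from the table above; treat them as subject to change.
WEIGHTS = {
    "add_order": 1.0,
    "cancel_order": 1.0,
    "modify_order": 1.0,
    "get_order": 2.0,
    "get_user_orders": 5.0,
    "cancel_all_orders": 2.0,
    "get_user_trades": 0.5,
    "subscribe": 0.1,
    "unsubscribe": 0.1,
    "get_user_leverage": 0.1,
    "get_available_leverage_levels": 0.1,
    "set_user_leverage": 0.1,
    "cancel_stop_order": 1.0,
    "modify_stop_order": 1.0,
    "cancel_on_disconnect": 0.1,
}

LIMIT_UNITS = 12_000
WINDOW_SECONDS = 60
SUSTAINED_UNITS_PER_SEC = LIMIT_UNITS / WINDOW_SECONDS  # 200.0


def units_per_second(messages_per_second: Mapping[str, float]) -> float:
    """Sum weight * rate over a planned mix of message types."""
    return sum(WEIGHTS[name] * rate for name, rate in messages_per_second.items())


# Hypothetical plan: 100 new orders, 50 modifies, and 2 order-list queries per second.
plan = {"add_order": 100, "modify_order": 50, "get_user_orders": 2}
load = units_per_second(plan)  # 100*1.0 + 50*1.0 + 2*5.0 = 160.0
print(load, load <= SUSTAINED_UNITS_PER_SEC)
```

Note how two `get_user_orders` calls per second cost as much as ten `add_order` messages, so polling queries can take a meaningful share of the budget.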

***

## What’s limited

These limits apply to **WebSocket inbound messages** after successful authentication. HTTP endpoints and the initial WebSocket handshake may be subject to separate controls.

***

## How it works

We maintain an **EMA of your weighted message rate**. Each incoming message adds its weight; between messages the EMA decays automatically.

* If your EMA rises above the allowed threshold for a bucket, the next message for that bucket is rejected with a `RateLimited` error.
* The error includes a **retry-after** hint so your client knows when it is safe to try again.

Because cancels use a **separate bucket**, you can often cancel even when your general bucket is momentarily saturated (subject to the cancel bucket’s own limits).
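
A client can keep its own rough estimate of each bucket to avoid sending messages that are likely to be rejected. The sketch below is illustrative only: the server's exact EMA formula and time constant are not documented here, so the exponentially decaying rate, the 60-second time constant, and the 200 units/sec threshold are all assumptions, and `EmaRateEstimator` is a hypothetical name.

```python
import math
from typing import Optional


class EmaRateEstimator:
    """Client-side estimate of a weighted message rate in units/sec.

    ASSUMPTION: a standard exponentially decaying rate with time constant
    `tau_seconds`. The server's real formula may differ.
    """

    def __init__(self, tau_seconds: float = 60.0, threshold: float = 200.0):
        self.tau = tau_seconds
        self.threshold = threshold
        self.rate = 0.0
        self.last_time: Optional[float] = None

    def _decay_to(self, now: float) -> None:
        # Between messages the estimate decays exponentially.
        if self.last_time is not None:
            elapsed = max(0.0, now - self.last_time)
            self.rate *= math.exp(-elapsed / self.tau)
        self.last_time = now

    def record(self, weight: float, now: float) -> None:
        # Each message adds its weight, scaled so a steady stream of
        # r units/sec converges to an estimate of roughly r.
        self._decay_to(now)
        self.rate += weight / self.tau

    def is_throttled(self, now: float) -> bool:
        self._decay_to(now)
        return self.rate > self.threshold


# One estimator per bucket, mirroring the two independent buckets.
general = EmaRateEstimator()
cancel = EmaRateEstimator()

# A burst of 1,000 unit-weight messages within one second.
for i in range(1000):
    general.record(1.0, now=i * 0.001)
print(general.is_throttled(now=1.0))  # False: the smoothed rate stays far below 200
```

Pass timestamps from a monotonic clock such as `time.monotonic()` in real use. Treat the estimate as a hint; the server's `RateLimited` response remains authoritative.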

***

## Per-user, cross-connection

If you open multiple WebSocket connections, their traffic **aggregates** into the same per-user budgets (one general, one cancel). Opening more sockets does **not** increase your effective allowance.
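
Because budgets are per user, any client-side accounting should be shared by all connections rather than kept per socket. The following is a minimal sketch under that assumption; `PerUserUsage`, the connection IDs, and the bucket names are hypothetical.

```python
import threading
from collections import defaultdict


class PerUserUsage:
    """Tallies weight per bucket across every connection of one user."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_bucket = defaultdict(float)
        self._by_connection = defaultdict(float)

    def record(self, connection_id: str, bucket: str, weight: float) -> None:
        # The lock lets several connection threads share one tally safely.
        with self._lock:
            self._by_bucket[bucket] += weight
            self._by_connection[connection_id] += weight

    def total(self, bucket: str) -> float:
        with self._lock:
            return self._by_bucket.get(bucket, 0.0)


usage = PerUserUsage()
for connection_id in ("ws-1", "ws-2"):
    for _ in range(100):
        usage.record(connection_id, "general", 1.0)  # add_order weight

# Both sockets draw on the same general budget.
print(usage.total("general"))  # 200.0
```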

***

## Error you may see

When a message is throttled, you’ll receive a unicast error on your WebSocket:

```json
{
  "type": "Err",
  "error_code": "RateLimited",
  "message": "Rate limit exceeded, retry after N seconds",
  "incoming_message": {
    /* your original request */
  }
}
```

Use the number of seconds given in the `message` field (`retry after N seconds`) as guidance for when to resend.
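
One way to extract the hint is to parse it out of the `message` string. The sketch below assumes the message text matches the example above (`retry after N seconds`). The function name `retry_after_seconds` and the fallback wait of 1 second are assumptions, not part of the API.

```python
import json
import re
from typing import Optional

# ASSUMPTION: the hint appears in `message` as "retry after N seconds".
_RETRY_AFTER = re.compile(r"retry after (\d+(?:\.\d+)?) seconds?", re.IGNORECASE)

# ASSUMPTION: wait used when the hint cannot be parsed.
DEFAULT_WAIT_SECONDS = 1.0


def retry_after_seconds(raw_frame: str) -> Optional[float]:
    """Return the suggested wait for a RateLimited error frame, else None."""
    frame = json.loads(raw_frame)
    if not isinstance(frame, dict):
        return None
    if frame.get("type") != "Err" or frame.get("error_code") != "RateLimited":
        return None
    match = _RETRY_AFTER.search(str(frame.get("message", "")))
    return float(match.group(1)) if match else DEFAULT_WAIT_SECONDS


sample = json.dumps({
    "type": "Err",
    "error_code": "RateLimited",
    "message": "Rate limit exceeded, retry after 3 seconds",
    "incoming_message": {},
})
print(retry_after_seconds(sample))  # 3.0
```

Matching on `error_code` first keeps the parser from misreading other error types whose text happens to contain similar wording.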

***

## Client best practices

1. **Backoff with jitter**\
   On `RateLimited`, wait the suggested number of seconds **plus a small random jitter** (e.g., 50–200 ms) before retrying. This avoids thundering herds.

2. **Coalesce & batch**\
   Prefer fewer, purposeful messages over many tiny ones (especially modifies).

3. **Respect cancel bucket**\
   Cancels have their own allowance to help you unwind risk quickly. It is separate but not unlimited, so avoid bursty cancel storms.

4. **Stagger across connections**\
   If you run multiple connections, **stagger** bursts. They share the same per-user budgets.

5. **Idempotency**\
   Use client order IDs where supported so a retry doesn’t create duplicates if the first attempt actually succeeded server-side.

6. **Subscription hygiene**\
   Subscriptions are light but not free. Avoid repeatedly subscribing/unsubscribing in short intervals.
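
The backoff-with-jitter practice from item 1 can be sketched as follows. This is an illustration under stated assumptions: `send` is a hypothetical callable that returns `None` on success or the server's retry-after value in seconds when the message was rate limited, and the cap of five attempts is arbitrary. When retrying order messages, reuse the same client order ID where supported, as item 5 recommends.

```python
import random
import time
from typing import Callable, Optional

JITTER_RANGE_SECONDS = (0.05, 0.2)  # 50 to 200 ms, as suggested above


def backoff_delay(retry_after: float) -> float:
    """Suggested wait plus a small random jitter."""
    low, high = JITTER_RANGE_SECONDS
    return retry_after + random.uniform(low, high)


def send_with_backoff(
    send: Callable[[], Optional[float]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` until it succeeds or attempts run out.

    `send` (hypothetical) returns None on success, or the retry-after
    seconds reported by a RateLimited error.
    """
    for attempt in range(max_attempts):
        retry_after = send()
        if retry_after is None:
            return True
        if attempt < max_attempts - 1:
            sleep(backoff_delay(retry_after))
    return False


# Usage with a stub that is rate limited once, then succeeds.
responses = iter([1.0, None])
waits = []
ok = send_with_backoff(lambda: next(responses), sleep=waits.append)
print(ok, len(waits))  # True 1
```

Injecting `sleep` keeps the function testable and makes it easy to swap in an async-friendly wait in an event-loop client.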

***

## FAQs

**Is the limit per IP or per API key?**\
Per **user**. Multiple API keys tied to the same user share the same budgets.

**Do reads count against the limit?**\
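Yes. Each read has its own weight, and some are heavier than writes: per the table above, `get_order` costs 2.0 and `get_user_orders` costs 5.0, versus 1.0 for `add_order`. Only `get_user_trades` (0.5) and the leverage queries (0.1) are lighter.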

**Can I burst?**\
Short bursts are tolerated due to EMA smoothing. Sustained rates above the threshold will be throttled.

**Can I request higher limits?**\
If you have a production use case that requires more throughput, contact us to discuss dedicated quotas.

***

## Change policy

We may adjust rate-limit thresholds and message weights from time to time to maintain platform reliability. Such changes do not require client updates, provided your integration handles `RateLimited` responses and performs backoff as described above.
