View Your Limits
Interactive playground
Rate Limit Logs
See which requests hit limits
Default Limits
Text Models
Text models are grouped into tiers based on size. Each model card on the Models page displays its tier badge.

| Tier | Requests/min | Tokens/min |
|---|---|---|
| XS | 500 | 1,000,000 |
| S | 75 | 750,000 |
| M | 50 | 750,000 |
| L | 20 | 500,000 |
Which models are in each tier?
| Tier | Models |
|---|---|
| XS | qwen3-4b, llama-3.2-3b |
| S | mistral-31-24b, venice-uncensored |
| M | llama-3.3-70b, qwen3-next-80b, google-gemma-3-27b-it |
| L | qwen3-235b-a22b-instruct-2507, qwen3-235b-a22b-thinking-2507, deepseek-ai-DeepSeek-R1, grok-41-fast, kimi-k2-thinking, gemini-3-pro-preview, hermes-3-llama-3.1-405b, qwen3-coder-480b-a35b-instruct, zai-org-glm-4.7, openai-gpt-oss-120b |

Other Models
| Type | Requests/min |
|---|---|
| Image | 20 |
| Audio | 60 |
| Embedding | 500 |
| Video (queue) | 40 |
| Video (retrieve) | 120 |
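To stay under these limits on the client side, requests can be budgeted before they are sent. The following is a minimal sketch, not part of the API: `MinuteBudget` is a hypothetical helper, and it assumes limits are counted over a sliding 60-second window, which this page does not specify.

```python
from collections import deque


class MinuteBudget:
    """Client-side guard for a requests/min and tokens/min limit.

    Assumption: usage is counted over a sliding window of `window` seconds.
    """

    def __init__(self, requests_per_min, tokens_per_min, window=60.0):
        self.requests_per_min = requests_per_min
        self.tokens_per_min = tokens_per_min
        self.window = window
        self._events = deque()  # (timestamp, tokens) for each admitted request
        self._tokens_used = 0

    def _expire(self, now):
        # Drop requests that have left the window and release their tokens.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._tokens_used -= tokens

    def try_acquire(self, tokens, now):
        """Return True and record the request if it fits in both budgets."""
        self._expire(now)
        if len(self._events) >= self.requests_per_min:
            return False
        if self._tokens_used + tokens > self.tokens_per_min:
            return False
        self._events.append((now, tokens))
        self._tokens_used += tokens
        return True
```

For example, `MinuteBudget(20, 500_000)` mirrors the default L tier. Call `try_acquire(estimated_tokens, time.time())` before each request and wait when it returns `False`.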
Handling Errors
Failed requests (500, 503, 429) should be retried with exponential backoff. For 429 errors specifically, check the x-ratelimit-reset-requests header for the exact Unix timestamp when you can retry. Many HTTP libraries provide retry mechanisms that can be configured to handle this.
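The retry policy described above can be sketched as follows. This is an illustration under assumptions, not an official client: `retry_delay` and `call_with_retries` are hypothetical names, `send` is any callable you supply that returns `(status, headers, body)`, header names are assumed to be lowercase dictionary keys, and the base delay and cap are arbitrary choices.

```python
import time

RETRYABLE = {429, 500, 503}


def retry_delay(status, headers, attempt, now, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if not retryable."""
    if status not in RETRYABLE:
        return None
    if status == 429:
        reset = headers.get("x-ratelimit-reset-requests")
        if reset is not None:
            try:
                # The header holds a Unix timestamp; wait until then.
                return max(0.0, float(reset) - now)
            except ValueError:
                pass  # unparseable header: fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep, now=time.time):
    """Call `send()` until it succeeds or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        delay = retry_delay(status, headers, attempt, now())
        if delay is None or attempt == max_attempts - 1:
            return status, body
        sleep(delay)
```

In production code, adding random jitter to the backoff delay helps avoid many clients retrying at the same instant.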
Abuse Protection
If you generate more than 20 failed requests in 30 seconds, the API will block further requests for 30 seconds.
Response Headers
Every response includes these headers:

| Header | Description |
|---|---|
| x-ratelimit-limit-requests | Max requests allowed in current window |
| x-ratelimit-remaining-requests | Requests remaining in current window |
| x-ratelimit-reset-requests | Unix timestamp when window resets |
| x-ratelimit-limit-tokens | Max tokens allowed per minute |
| x-ratelimit-remaining-tokens | Tokens remaining in current minute |
| x-ratelimit-reset-tokens | Seconds until token limit resets |
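Note that the two reset headers use different units: the request reset is a Unix timestamp, while the token reset is a number of seconds. A minimal sketch of reading these headers is shown below. `RateLimitInfo`, `parse_rate_limit_headers`, and `seconds_until_available` are hypothetical names, and the code assumes every header value is a plain numeric string.

```python
from dataclasses import dataclass


@dataclass
class RateLimitInfo:
    limit_requests: int
    remaining_requests: int
    reset_requests_at: float  # Unix timestamp
    limit_tokens: int
    remaining_tokens: int
    reset_tokens_in: float    # seconds from now


def parse_rate_limit_headers(headers):
    """Build a RateLimitInfo from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit_requests=int(h["x-ratelimit-limit-requests"]),
        remaining_requests=int(h["x-ratelimit-remaining-requests"]),
        reset_requests_at=float(h["x-ratelimit-reset-requests"]),
        limit_tokens=int(h["x-ratelimit-limit-tokens"]),
        remaining_tokens=int(h["x-ratelimit-remaining-tokens"]),
        reset_tokens_in=float(h["x-ratelimit-reset-tokens"]),
    )


def seconds_until_available(info, now):
    """Return how long to wait before sending again (0.0 if not exhausted)."""
    waits = [0.0]
    if info.remaining_requests <= 0:
        waits.append(info.reset_requests_at - now)
    if info.remaining_tokens <= 0:
        waits.append(info.reset_tokens_in)
    return max(waits)
```

Checking `seconds_until_available` after each response lets a client pause before it hits a 429, rather than reacting afterwards.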
Partner Tier
Partners get significantly higher rate limits:

| Tier | Requests/min | Tokens/min |
|---|---|---|
| XS | 500 | 2,000,000 |
| S | 150 | 1,500,000 |
| M | 100 | 1,500,000 |
| L | 60 | 1,000,000 |

| Type | Requests/min |
|---|---|
| Image | 60 |
| Audio | 120 |
| Embedding | 500 |