This page describes the request and token rate limits for the Venice API.
Rate limits apply to users who have purchased API credits or staked VVV to gain Diem.

Text models:

Model | Model ID | Req / Min | Req / Day | Tokens / Min |
---|---|---|---|---|
Llama 3.2 3B | llama-3.2-3b | 500 | 288,000 | 1,000,000 |
Qwen 3 4B | qwen3-4b | 500 | 288,000 | 1,000,000 |
DeepSeek Coder V2 | deepseek-coder-v2-lite | 75 | 54,000 | 750,000 |
Qwen 2.5 Coder 32B | qwen-2.5-coder-32b | 75 | 54,000 | 750,000 |
Qwen 2.5 QwQ 32B | qwen-2.5-qwq-32b | 75 | 54,000 | 750,000 |
Dolphin 72B | dolphin-2.9.2-qwen2-72b | 50 | 36,000 | 750,000 |
Llama 3.3 70B | llama-3.3-70b | 50 | 36,000 | 750,000 |
Mistral Small 3.1 24B | mistral-31-24b | 50 | 36,000 | 750,000 |
Qwen 2.5 VL 72B | qwen-2.5-vl | 50 | 36,000 | 750,000 |
Qwen 3 235B | qwen3-235b | 50 | 36,000 | 750,000 |
Llama 3.1 405B | llama-3.1-405b | 20 | 15,000 | 750,000 |
DeepSeek R1 671B | deepseek-r1-671b | 15 | 10,000 | 200,000 |

Image models:

Model | Model ID | Req / Min | Req / Day |
---|---|---|---|
Flux | flux-dev / flux-dev-uncensored | 20 | 14,400 |
All others | All | 20 | 28,800 |

Audio models:

Model | Model ID | Req / Min | Req / Day |
---|---|---|---|
All Audio Models | All | 60 | 86,400 |
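
The per-minute request limits in the tables above can be respected on the client side before a request is ever sent. The following is a minimal sketch, not part of the Venice API: it assumes requests are counted over a 60-second sliding window (this page only states that explicitly for tokens), and the `SlidingWindowLimiter` class and `REQ_PER_MIN` mapping are hypothetical names introduced for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return the seconds to wait before another request fits (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before a new one fits.
        return self.window - (now - self.sent[0])

    def record(self):
        """Call once for every request actually sent."""
        self.sent.append(self.clock())


# Req / Min values copied from the text model table for a few model IDs.
REQ_PER_MIN = {
    "llama-3.3-70b": 50,
    "llama-3.1-405b": 20,
    "deepseek-r1-671b": 15,
}

limiter = SlidingWindowLimiter(REQ_PER_MIN["deepseek-r1-671b"])
```

Before each call, sleep for `limiter.wait_time()` seconds and then call `limiter.record()`. The limiter only tracks requests sent by the current process, so the server-side headers remain the authoritative source when several clients share one API key.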

You can monitor your API usage and remaining quota by inspecting the following response headers:
Header | Description |
---|---|
x-ratelimit-limit-requests | The number of requests you’ve made in the current evaluation period. |
x-ratelimit-remaining-requests | The remaining requests you can make in the current evaluation period. |
x-ratelimit-reset-requests | The Unix timestamp at which the request rate limit will reset. |
x-ratelimit-limit-tokens | The total number of tokens (prompt + completion) used within a 1-minute sliding window. |
x-ratelimit-remaining-tokens | The remaining number of total tokens that can be used during the evaluation period. |
x-ratelimit-reset-tokens | The time, in seconds, until the token rate limit resets. |
x-venice-balance-diem | The user’s Diem balance before the request has been processed. |
x-venice-balance-usd | The user’s USD balance before the request has been processed. |
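
As a rough illustration of how these headers might be used, the sketch below turns a response's header mapping into a small summary and a suggested wait time. It is not an official client: `parse_rate_limit_headers` is a hypothetical helper, and it assumes that `x-ratelimit-reset-requests` is expressed in seconds since the Unix epoch and that all values parse as numbers.

```python
import time


def parse_rate_limit_headers(headers, now=None):
    """Summarize the rate-limit headers from a response's header mapping."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    def num(name):
        value = h.get(name)
        return float(value) if value is not None else None

    remaining_requests = num("x-ratelimit-remaining-requests")
    remaining_tokens = num("x-ratelimit-remaining-tokens")
    reset_requests = num("x-ratelimit-reset-requests")  # assumed: Unix seconds
    reset_tokens = num("x-ratelimit-reset-tokens")      # seconds from now

    # Collect a wait for every budget that is already exhausted.
    waits = []
    if remaining_requests == 0 and reset_requests is not None:
        waits.append(max(0.0, reset_requests - now))
    if remaining_tokens == 0 and reset_tokens is not None:
        waits.append(max(0.0, reset_tokens))

    return {
        "remaining_requests": remaining_requests,
        "remaining_tokens": remaining_tokens,
        "wait_seconds": max(waits, default=0.0),
        "balance_diem": num("x-venice-balance-diem"),
        "balance_usd": num("x-venice-balance-usd"),
    }
```

A caller could pass the header mapping of any HTTP response object (for example, `response.headers` from an HTTP client library) and sleep for `wait_seconds` before retrying. Note that the two balance headers report the balance before the request was processed, so they lag by one request.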