Rate limits vary by model and tier. You can check your exact limits anytime:
curl https://api.venice.ai/api/v1/api_keys/rate_limits \
  -H "Authorization: Bearer $VENICE_API_KEY"

Default Limits

Text Models

Text models are grouped into tiers based on size. Each model card on the Models page displays its tier badge.
Tier    Requests/min    Tokens/min
XS      500             1,000,000
S       75              750,000
M       50              750,000
L       20              500,000

Models by tier:
XS: qwen3-4b, llama-3.2-3b
S: mistral-31-24b, venice-uncensored
M: llama-3.3-70b, qwen3-next-80b, google-gemma-3-27b-it
L: qwen3-235b-a22b-instruct-2507, qwen3-235b-a22b-thinking-2507, deepseek-ai-DeepSeek-R1, grok-41-fast, kimi-k2-thinking, gemini-3-pro-preview, hermes-3-llama-3.1-405b, qwen3-coder-480b-a35b-instruct, zai-org-glm-4.6, openai-gpt-oss-120b
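
To see what the default text-model limits mean in practice, the sketch below converts a tier's requests-per-minute cap into a minimum spacing between requests and checks a planned workload against the token cap. The names TierLimit, DEFAULT_TEXT_LIMITS, min_request_interval, and fits_token_budget are hypothetical helpers, not part of the API. Even spacing is only one conservative pacing strategy and is an assumption here, not something the API requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimit:
    requests_per_min: int
    tokens_per_min: int


# Default text-model limits, copied from the tier limits listed above.
DEFAULT_TEXT_LIMITS = {
    "XS": TierLimit(500, 1_000_000),
    "S": TierLimit(75, 750_000),
    "M": TierLimit(50, 750_000),
    "L": TierLimit(20, 500_000),
}


def min_request_interval(tier: str) -> float:
    """Seconds between evenly spaced requests that stay at or under the request cap."""
    return 60.0 / DEFAULT_TEXT_LIMITS[tier].requests_per_min


def fits_token_budget(tier: str, tokens_per_request: int, requests_per_min: int) -> bool:
    """True if the planned workload stays within both per-minute caps for the tier."""
    limit = DEFAULT_TEXT_LIMITS[tier]
    within_requests = requests_per_min <= limit.requests_per_min
    within_tokens = tokens_per_request * requests_per_min <= limit.tokens_per_min
    return within_requests and within_tokens
```

For example, min_request_interval("L") gives 3.0 seconds, since 60 / 20 = 3. Your actual limits may differ from these defaults, so prefer the values reported by the rate_limits endpoint.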

Other Models

Type                Requests/min
Image               20
Audio               60
Embedding           500
Video (queue)       20
Video (retrieve)    120

Handling Errors

Retry failed requests (status 429, 500, or 503) with exponential backoff. For 429 responses specifically, read the x-ratelimit-reset-requests header, which gives the Unix timestamp at which the request window resets and you can retry. Many HTTP libraries offer built-in or configurable retry mechanisms that can handle this for you.
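
The retry policy described above could be sketched roughly as follows. This is a minimal illustration, not an official client: retry_delay and its parameters (base, cap, now) are hypothetical, the headers argument is assumed to be a plain dict with lowercase keys, and the random "full jitter" is a common addition that the original text does not mention.

```python
import random
import time

RETRYABLE_STATUSES = {429, 500, 503}


def retry_delay(status, headers, attempt, now=None, base=1.0, cap=60.0):
    """Return the number of seconds to wait before retrying, or None if not retryable.

    attempt is the zero-based count of retries already made.
    """
    if status not in RETRYABLE_STATUSES:
        return None

    if status == 429:
        # The reset header is documented as a Unix timestamp for the request window.
        reset = headers.get("x-ratelimit-reset-requests")
        if reset is not None:
            try:
                current = time.time() if now is None else now
                return max(0.0, float(reset) - current)
            except ValueError:
                pass  # Unparseable header: fall through to plain backoff.

    # Exponential backoff: base * 2^attempt, capped, with full jitter (assumption).
    backoff = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, backoff)
```

A caller would sleep for the returned delay before resending and give up after a fixed number of attempts, so that repeated failures do not trigger the abuse protection.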

Abuse Protection

If you generate more than 20 failed requests in 30 seconds, the API will block further requests for 30 seconds:
Too many failed attempts (> 20) resulting in a non-success status code. Please wait 30s and try again.
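
One way to avoid tripping this block is to count recent failures on the client and pause before crossing the threshold. The sketch below is an assumption about how such a guard might look. FailureGuard and its methods are hypothetical names, and the injectable clock exists only to make the class easy to test.

```python
import time
from collections import deque


class FailureGuard:
    """Sliding-window counter of failed requests, tracked on the client side."""

    def __init__(self, max_failures=20, window=30.0, clock=time.monotonic):
        self._max_failures = max_failures
        self._window = window
        self._clock = clock
        self._failures = deque()  # timestamps of recent failures, oldest first

    def record_failure(self):
        """Call once for every response with a non-success status code."""
        self._failures.append(self._clock())

    def _prune(self):
        # Drop failures that have aged out of the window.
        cutoff = self._clock() - self._window
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()

    def should_pause(self):
        """True once max_failures failures fall inside the window.

        The API blocks after MORE than 20 failures, so pausing at 20
        stops the client one failure short of the block.
        """
        self._prune()
        return len(self._failures) >= self._max_failures
```

Before each request, check should_pause() and wait if it returns True. After each failed response, call record_failure().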

Response Headers

Every response includes these headers:
Header                            Description
x-ratelimit-limit-requests        Max requests allowed in current window
x-ratelimit-remaining-requests    Requests remaining in current window
x-ratelimit-reset-requests        Unix timestamp when window resets
x-ratelimit-limit-tokens          Max tokens allowed per minute
x-ratelimit-remaining-tokens      Tokens remaining in current minute
x-ratelimit-reset-tokens          Seconds until token limit resets
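
As an illustration, the headers listed above could be collected into a small structure like the one below. RateLimitStatus and parse_rate_limit_headers are hypothetical names, and the sketch assumes every header value is a plain number sent as a string. Note the differing units: the request reset is a Unix timestamp, while the token reset is a number of seconds.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit_requests: Optional[int]
    remaining_requests: Optional[int]
    reset_requests_at: Optional[float]  # Unix timestamp
    limit_tokens: Optional[int]
    remaining_tokens: Optional[int]
    reset_tokens_in: Optional[float]    # seconds from now


def _parse(headers: Mapping[str, str], name: str, cast):
    """Return cast(headers[name]), or None if the header is missing or malformed."""
    value = headers.get(name)
    if value is None:
        return None
    try:
        return cast(value)
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    return RateLimitStatus(
        limit_requests=_parse(h, "x-ratelimit-limit-requests", int),
        remaining_requests=_parse(h, "x-ratelimit-remaining-requests", int),
        reset_requests_at=_parse(h, "x-ratelimit-reset-requests", float),
        limit_tokens=_parse(h, "x-ratelimit-limit-tokens", int),
        remaining_tokens=_parse(h, "x-ratelimit-remaining-tokens", int),
        reset_tokens_in=_parse(h, "x-ratelimit-reset-tokens", float),
    )
```

A client could check remaining_requests and remaining_tokens after each response and slow down as either value approaches zero.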

Partner Tier

Partners get significantly higher rate limits:
Tier    Requests/min    Tokens/min
XS      500             2,000,000
S       150             1,500,000
M       100             1,500,000
L       60              1,000,000

Type         Requests/min
Image        60
Audio        120
Embedding    500
If you’re consistently hitting your rate limits and your usage patterns show sustained demand over time, reach out to discuss partner access: [email protected]. Partner tier limits can be adjusted based on your specific needs.