Venice AI is a drop-in replacement for OpenAI. Same SDK, same code — just change two lines. Get privacy-first inference, uncensored models, and competitive pricing.
The 2-Line Migration
Python
# Before (OpenAI)
from openai import OpenAI
client = OpenAI()

# After (Venice) — change api_key and base_url
from openai import OpenAI
client = OpenAI(
    api_key="your-venice-api-key",            # ← Change 1
    base_url="https://api.venice.ai/api/v1",  # ← Change 2
)
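The rest of your OpenAI code runs unchanged against the new client. As a quick sanity check, a minimal request like the sketch below should work as-is; venice-uncensored is just one Venice model, pick any from the mapping table further down.

# Sanity check: everything past the constructor is the standard OpenAI SDK
response = client.chat.completions.create(
    model="venice-uncensored",  # any Venice model from the mapping table below
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)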
Node.js
// Before (OpenAI)
import OpenAI from 'openai';
const client = new OpenAI();

// After (Venice)
import OpenAI from 'openai';
const client = new OpenAI({
  apiKey: 'your-venice-api-key',
  baseURL: 'https://api.venice.ai/api/v1',
});
cURL
# Before
curl https://api.openai.com/v1/chat/completions ...
# After — just change the URL and key
curl https://api.venice.ai/api/v1/chat/completions ...
Environment Variables
# Before
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1

# After
OPENAI_API_KEY=your-venice-api-key
OPENAI_BASE_URL=https://api.venice.ai/api/v1
Many libraries and tools read OPENAI_API_KEY and OPENAI_BASE_URL automatically. Just updating these env vars may be all you need.
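For instance, the official openai Python SDK (v1 and later) reads both variables on its own, so once they are exported a client constructed with no arguments already points at Venice. A minimal sketch:

from openai import OpenAI

# With OPENAI_API_KEY and OPENAI_BASE_URL exported as above, the SDK picks
# them up automatically; no constructor arguments are needed.
client = OpenAI()
print(client.base_url)  # https://api.venice.ai/api/v1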
Model Mapping
| OpenAI Model | Venice Equivalent | Type | Pricing (Input/Output per 1M) |
|---|---|---|---|
| gpt-4o | zai-org-glm-4.7 (Private) | Text | $0.55 / $2.65 |
| gpt-4o | openai-gpt-52 (Anonymized) | Text | $2.19 / $17.50 |
| gpt-4o-mini | qwen3-4b | Text | $0.05 / $0.15 |
| gpt-4-turbo | mistral-31-24b | Text | $0.50 / $2.00 |
| o1 / o3 | qwen3-235b-a22b-thinking-2507 (Private) | Reasoning | $0.45 / $3.50 |
| o1 / o3 | grok-41-fast (Anonymized) | Reasoning | $0.50 / $1.25 |
| gpt-4-vision | mistral-31-24b or qwen3-vl-235b-a22b | Vision | $0.50 / $2.00 |
| text-embedding-3-small | text-embedding-bge-m3 | Embeddings | $0.15 / $0.60 |
| dall-e-3 | qwen-image (Private, $0.01) or flux-2-pro | Image | From $0.01 |
| whisper | nvidia/parakeet-tdt-0.6b-v3 | STT | $0.0001/sec |
| tts-1 | tts-kokoro | TTS | $3.50/1M chars |
Feature Compatibility
| Feature | OpenAI | Venice | Notes |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | Fully compatible |
| Streaming | ✅ | ✅ | SSE format identical |
| Function Calling | ✅ | ✅ | Same tools parameter |
| Structured Output | ✅ | ✅ | Same response_format |
| Vision | ✅ | ✅ | Same content array format |
| Embeddings | ✅ | ✅ | Same API |
| Image Generation | ✅ | ✅ | OpenAI-compatible via /images/generations* |
| TTS | ✅ | ✅ | Compatible |
| STT | ✅ | ✅ | Compatible |
| Assistants API | ✅ | ❌ | Use Characters or Minds instead |
| Batch API | ✅ | ❌ | Not yet available |
| Fine-tuning | ✅ | ❌ | Not available |
*Venice also provides an OpenAI-compatible endpoint at POST /images/generations for easier migration from DALL-E. For Venice's native image API with additional options, see Image Generate.
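Function calling is worth a quick end-to-end check during migration, since the tools parameter and the tool_calls field in the response keep the same OpenAI shapes; only the model name changes. A minimal sketch with an illustrative tool schema and model choice (confirm the model you pick supports tool use):

# Same `tools` schema and `tool_calls` response shape as OpenAI
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not a Venice built-in
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistral-31-24b",  # pick a tool-capable Venice model
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)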
Venice-Only Features
Venice offers capabilities OpenAI doesn’t:
1. Built-in Web Search
response = client.chat.completions.create(
    model="venice-uncensored",
    messages=[{"role": "user", "content": "Latest AI news today"}],
    extra_body={
        "venice_parameters": {
            "enable_web_search": "auto"
        }
    }
)
2. Web Scraping
response = client.chat.completions.create(
    model="venice-uncensored",
    messages=[{"role": "user", "content": "Summarize https://example.com/article"}],
    extra_body={
        "venice_parameters": {
            "enable_web_scraping": True
        }
    }
)
3. Characters (AI Personas)
response = client.chat.completions.create(
    model="venice-uncensored",
    messages=[{"role": "user", "content": "Tell me about yourself"}],
    extra_body={
        "venice_parameters": {
            "character_slug": "venice-ai"
        }
    }
)
4. Uncensored Models
Venice’s private models have no content filtering, making them suitable for:
Creative writing without guardrails
Security research and red teaming
Honest analysis without refusal patterns
Medical/legal information without disclaimers
5. Video Generation
# Queue a video generation job
import requests

response = requests.post(
    "https://api.venice.ai/api/v1/video/queue",
    headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    json={
        "model": "wan-2.6-text-to-video",
        "prompt": "A serene lake at sunset with gentle waves",
        "resolution": "720p",
        "duration": 5,
    },
)
job_id = response.json()["id"]
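The queue endpoint returns a job ID rather than a finished video, so the job is polled until it completes. The retrieval path and the status/url fields in the sketch below are assumptions for illustration only; check the Venice video API reference for the actual endpoint and response shape.

import time

# Hypothetical polling loop: the endpoint path and the "status"/"url" fields
# are assumptions, not confirmed Venice API surface.
while True:
    job = requests.get(
        f"https://api.venice.ai/api/v1/video/retrieve/{job_id}",  # assumed path
        headers={"Authorization": f"Bearer {api_key}"},
    ).json()
    if job.get("status") == "complete":
        print("Video ready:", job.get("url"))
        break
    time.sleep(5)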
Why Migrate?
Privacy
Zero data retention on private models — your prompts are never stored
No training on your data — ever
OpenAI retains data for 30 days and may use it for safety research
Cost
Private models are often cheaper than OpenAI equivalents
qwen3-4b at $0.05/1M input is 3x cheaper than gpt-4o-mini at $0.15/1M (see the quick comparison below)
venice-uncensored at $0.20/1M input vs gpt-4o at $2.50/1M
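A rough back-of-envelope comparison, assuming 50M input and 10M output tokens per month (the OpenAI rates used here are the published prices at the time of writing and may change):

# Monthly cost for 50M input + 10M output tokens (prices are per 1M tokens)
def monthly_cost(input_price, output_price, in_m=50, out_m=10):
    return input_price * in_m + output_price * out_m

print(monthly_cost(0.05, 0.15))   # qwen3-4b:        $4.00
print(monthly_cost(0.15, 0.60))   # gpt-4o-mini:     $13.50
print(monthly_cost(0.55, 2.65))   # zai-org-glm-4.7: $54.00
print(monthly_cost(2.50, 10.00))  # gpt-4o:          $225.00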
Freedom
No content filtering on uncensored models
No account suspensions for controversial use cases
Web3-native with crypto payment options
DIEM staking for daily credits
Model Diversity
Access to models from multiple providers (Qwen, Llama, Mistral, Gemma, Claude, GPT, Grok, etc.)
Switch between private and anonymized models per request (see the listing sketch below)
New models added regularly
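Venice also exposes an OpenAI-style model listing, so the same client can enumerate what is currently available; a minimal sketch (any Venice-specific metadata on each entry is not shown here):

# List the models currently available through the OpenAI-compatible endpoint
for model in client.models.list():
    print(model.id)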
Framework Migration
Most AI frameworks work with Venice by changing the base URL:
| Framework | Change Required |
|---|---|
| LangChain | base_url in ChatOpenAI |
| Vercel AI SDK | baseURL in createOpenAI |
| CrewAI | OPENAI_API_BASE env var |
| LlamaIndex | api_base in OpenAI |
| AutoGen | base_url in config |
| Haystack | api_base_url in OpenAIGenerator |
| Claude Code | --api-base flag or env var |
| Cursor | Custom API endpoint in settings |
| Continue.dev | apiBase in config.json |
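As one concrete case, LangChain's ChatOpenAI from the langchain-openai package needs only the two constructor arguments changed; the model name below is illustrative.

from langchain_openai import ChatOpenAI

# Point LangChain at Venice by overriding the API key and base URL
llm = ChatOpenAI(
    model="venice-uncensored",            # illustrative Venice model
    api_key="your-venice-api-key",
    base_url="https://api.venice.ai/api/v1",
)
print(llm.invoke("Hello from LangChain via Venice").content)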
Get Your API Key: Generate a Venice API key and start migrating in minutes.