Get up and running with the Venice API in minutes. Generate an API key, make your first request, and start building.

Quickstart

1. Get your API key

Head to your Venice API Settings and generate a new API key. For a detailed walkthrough with screenshots, check out the API Key guide.
2. Set up your API key

Add your API key to your environment. You can export it in your shell:
export VENICE_API_KEY='your-api-key-here'
Or add it to a .env file in your project:
VENICE_API_KEY=your-api-key-here
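A startup check can make a missing key obvious before any request is sent. The sketch below is one way to do that. `require_api_key` is a hypothetical helper name, not part of any SDK. Note that Python does not read a .env file on its own, so the .env option assumes a loader such as python-dotenv or your framework's equivalent.

```python
import os


def require_api_key(env=None, name="VENICE_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not part of the OpenAI SDK.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return key
```

Calling `require_api_key()` once at startup and passing the result as `api_key=` avoids sending a request with an empty key.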
3. Install the SDK

Venice is OpenAI-compatible, so you can use the OpenAI SDK. If you prefer to use cURL or raw HTTP requests, you can skip this step.
pip install openai
4. Send your first request

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

completion = client.chat.completions.create(
    model="venice-uncensored",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant"},
        {"role": "user", "content": "Why is privacy important?"}
    ]
)

print(completion.choices[0].message.content)
Message roles:
  • system - Instructions for how the model should behave
  • user - Your prompts or questions
  • assistant - Previous model responses (for multi-turn conversations)
  • tool - Function calling results (when using tools)
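To illustrate how the roles above fit together in a multi-turn conversation, here is a minimal sketch that builds up a message history. `add_turn` is a hypothetical helper, and the assistant text is a placeholder standing in for a real model reply.

```python
def add_turn(history, user_text, assistant_text=None):
    """Return a new message list with one more user turn (and reply, if given)."""
    messages = list(history)  # copy so the caller's list is not mutated
    messages.append({"role": "user", "content": user_text})
    if assistant_text is not None:
        messages.append({"role": "assistant", "content": assistant_text})
    return messages


history = [{"role": "system", "content": "You are a helpful AI assistant"}]
# Placeholder reply; in practice this is completion.choices[0].message.content.
history = add_turn(history, "Why is privacy important?", "Privacy protects autonomy.")
history = add_turn(history, "Can you give a concrete example?")

# The list is now ready to pass as `messages=` in the next request.
print([m["role"] for m in history])
```

Sending the full history with each request is what lets the model see earlier turns.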
5. Choose your model (optional)

Venice has multiple models for different use cases. Popular choices:
  • llama-3.3-70b - Balanced performance, great for most use cases
  • zai-org-glm-4.7 - Flagship model for complex tasks and deep reasoning
  • qwen3-vl-235b-a22b - Vision support
  • venice-uncensored - No content filtering

View All Models

Browse the complete list of models with pricing, capabilities, and context limits
6. Use Venice Parameters

Enable Venice-specific features such as web search with venice_parameters. The OpenAI SDK has no named argument for it, so pass it through extra_body:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

completion = client.chat.completions.create(
    model="venice-uncensored",
    messages=[
        {"role": "user", "content": "What are the latest developments in AI?"}
    ],
    extra_body={
        "venice_parameters": {
            "enable_web_search": "auto",
            "include_venice_system_prompt": True
        }
    }
)

print(completion.choices[0].message.content)
See all available parameters.
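If you skipped the SDK and use raw HTTP, it may help to see where venice_parameters ends up in the request. The sketch below builds (but does not send) the equivalent request with the standard library. `build_chat_request` is a hypothetical helper, and the `/chat/completions` path is the one the OpenAI SDK appends to the base URL.

```python
import json
import urllib.request

API_BASE = "https://api.venice.ai/api/v1"


def build_chat_request(api_key, model, messages, venice_parameters=None):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = {"model": model, "messages": messages}
    if venice_parameters:
        # The SDK's extra_body merges keys into the top level of the JSON body,
        # so venice_parameters sits next to "model" and "messages".
        body["venice_parameters"] = venice_parameters
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.load`.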
7. Enable streaming (optional)

Stream responses in real time using stream=True:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

stream = client.chat.completions.create(
    model="venice-uncensored",
    messages=[{"role": "user", "content": "Write a short story about AI"}],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
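When you also need the complete text after streaming (for logging or for appending to the conversation history), the deltas can be collected as they arrive. This is a minimal sketch: `collect_stream` and `_fake_chunk` are hypothetical names, and the fake chunks only mimic the attribute shape used in the loop above so the helper can be tried offline.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_text=None):
    """Join the text deltas from a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices or no text; skip them.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text is None:
            continue
        parts.append(text)
        if on_text is not None:
            on_text(text)  # e.g. print each piece as it arrives
    return "".join(parts)


def _fake_chunk(text):
    """Stand-in for an SDK chunk, used only to exercise the helper offline."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo = [_fake_chunk("Once "), _fake_chunk(None), SimpleNamespace(choices=[]), _fake_chunk("upon a time")]
print(collect_stream(demo))
```

With a real stream, `collect_stream(stream, on_text=lambda t: print(t, end=""))` prints incrementally and still returns the full text.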
8. Customize response behavior (optional)

Control how the model responds with parameters like temperature, max tokens, and more:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

completion = client.chat.completions.create(
    model="venice-uncensored",
    messages=[
        {"role": "system", "content": "You are a creative storyteller"},
        {"role": "user", "content": "Tell me a creative story"}
    ],
    temperature=0.8,
    max_tokens=500,
    top_p=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.5,
    extra_body={
        "venice_parameters": {
            "include_venice_system_prompt": False
        }
    }
)

print(completion.choices[0].message.content)
Check out the Chat Completions docs for more information on all supported parameters.

More Capabilities

Image Generation

Create images from text prompts using diffusion models:
import os
import requests

url = "https://api.venice.ai/api/v1/image/generate"

payload = {
    "model": "venice-sd35",
    "prompt": "A cyberpunk city with neon lights and rain",
    "width": 1024,
    "height": 1024,
    "format": "webp"
}

headers = {
    "Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
Note: The response returns base64-encoded images in the images array. Decode the base64 string to save or display the image.

Popular Image Models:
  • qwen-image - Highest quality image generation
  • venice-sd35 - Default choice, works with all features
  • hidream - Fast generation for production use

View All Image Models

See all available image models with pricing and capabilities
For more advanced parameter options like cfg_scale, negative_prompt, style_preset, seed, variants, and more, check out the Images API Reference.
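The generation example prints the raw JSON, so the images still have to be decoded. The sketch below shows one way to do that for the images array described in the note above. `save_images` is a hypothetical helper, the file naming is arbitrary, and the demo uses placeholder bytes rather than a real API response.

```python
import base64
import tempfile
from pathlib import Path


def save_images(result, out_dir=".", stem="venice", ext="webp"):
    """Decode each base64 string in result["images"] and write it to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(result.get("images", [])):
        path = out / f"{stem}_{index}.{ext}"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths


# Offline demo with placeholder bytes standing in for response.json().
fake_result = {"images": [base64.b64encode(b"placeholder bytes").decode("ascii")]}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_images(fake_result, out_dir=tmp)
    saved_names = [p.name for p in saved]
    saved_bytes = saved[0].read_bytes()
print(saved_names)
```

With a real response, `save_images(response.json())` writes one file per generated image. Set `ext` to match the format you requested.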

Image Editing

Modify existing images with AI-powered inpainting using the Qwen-Image model:
import os
import requests
import base64

url = "https://api.venice.ai/api/v1/image/edit"

with open("image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode('utf-8')

payload = {
    "prompt": "Colorize",
    "image": image_base64
}

headers = {
    "Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop on an HTTP error instead of saving an error body as an image

with open("edited_image.png", "wb") as f:
    f.write(response.content)
Note: The image editor uses the Qwen-Image model and is an experimental endpoint. Send the input image as a base64-encoded string, and the API returns the edited image as binary data. See the Image Edit API for all parameters.

Image Upscaling

Enhance and upscale images to higher resolutions:
import os
import requests
import base64

url = "https://api.venice.ai/api/v1/image/upscale"

with open("image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode('utf-8')

payload = {
    "image": image_base64,
    "scale": 2
}

headers = {
    "Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop on an HTTP error instead of saving an error body as an image

with open("upscaled_image.png", "wb") as f:
    f.write(response.content)
Note: Send the input image as a base64-encoded string, and the API returns the upscaled image as binary data. See the Image Upscale API for all parameters.

Text-to-Speech

Convert text to audio with 50+ multilingual voices:
import os
import requests

response = requests.post(
    "https://api.venice.ai/api/v1/audio/speech",
    headers={
        "Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}",
        "Content-Type": "application/json"
    },
    json={
        "input": "Hello, welcome to Venice Voice.",
        "model": "tts-kokoro",
        "voice": "af_sky"
    }
)

response.raise_for_status()  # stop on an HTTP error instead of writing an error body to disk

with open("speech.mp3", "wb") as f:
    f.write(response.content)
The tts-kokoro model supports 50+ multilingual voices including af_sky, af_nova, am_liam, bf_emma, zf_xiaobei, and jm_kumo. See the TTS API for all voice options.

Speech-to-Text

Transcribe audio files to text:
import os
import requests

url = "https://api.venice.ai/api/v1/audio/transcriptions"

with open("audio.mp3", "rb") as f:
    response = requests.post(
        url,
        headers={"Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}"},
        files={"file": f},
        data={
            "model": "nvidia/parakeet-tdt-0.6b-v3",
            "response_format": "json"
        }
    )

print(response.json())
Supported formats: WAV, FLAC, MP3, M4A, AAC, MP4. Enable timestamps=true to get word-level timing data. See the Transcriptions API for all options.
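A quick local check against the supported formats listed above can catch an unsupported file before it is uploaded. This is only a sketch: `is_supported_audio` is a hypothetical helper, and it looks at the file extension, not the actual audio encoding.

```python
from pathlib import Path

# Extensions for the formats listed in the note above.
SUPPORTED_AUDIO = {".wav", ".flac", ".mp3", ".m4a", ".aac", ".mp4"}


def is_supported_audio(path):
    """Return True if the file extension matches one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_AUDIO
```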

Embeddings

Generate vector embeddings for semantic search, RAG, and recommendations:
import os
import requests

url = "https://api.venice.ai/api/v1/embeddings"

payload = {
    "model": "text-embedding-bge-m3",
    "input": "Privacy-first AI infrastructure for semantic search",
    "encoding_format": "float"
}

headers = {
    "Authorization": f"Bearer {os.getenv('VENICE_API_KEY')}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
See the Embeddings API for batch processing and advanced options.
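For the semantic search use case mentioned above, embeddings are usually compared with cosine similarity. The sketch below shows that comparison in plain Python. `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, and reading the vector from `response.json()["data"][0]["embedding"]` assumes the OpenAI-compatible response shape, which is worth confirming in the Embeddings API reference.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

For more than a few thousand documents, a vector index or a NumPy-based implementation would likely be a better fit than this pure-Python loop.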

Vision (Multimodal)

Analyze images alongside text using vision-capable models like qwen3-vl-235b-a22b:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

response = client.chat.completions.create(
    model="qwen3-vl-235b-a22b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://www.gstatic.com/webp/gallery/1.jpg"}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
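The example above references an image by public URL. For a local file, the OpenAI message format also allows a base64 data URL in the same `url` field. Whether every Venice vision model accepts data URLs is an assumption here, so treat the following as a sketch. `to_data_url` and `file_to_data_url` are hypothetical helpers.

```python
import base64
import mimetypes


def to_data_url(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_url(path):
    """Read a local image file and return it as a data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        return to_data_url(f.read(), mime_type or "application/octet-stream")
```

The result would replace the https URL, for example `{"type": "image_url", "image_url": {"url": file_to_data_url("image.jpg")}}`.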

Function Calling

Define functions that models can call to interact with external tools and APIs:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("VENICE_API_KEY"),
    base_url="https://api.venice.ai/api/v1"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="zai-org-glm-4.7",
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    tools=tools
)

print(response.choices[0].message)
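The request above only returns the model's decision to call a function. Your code still has to run the function and send the result back in a message with the tool role. The sketch below shows that middle step. `run_tool_call`, `TOOL_REGISTRY`, and the placeholder `get_weather` body are hypothetical; the field names follow the OpenAI SDK's tool call objects.

```python
import json


def get_weather(location):
    """Placeholder implementation; a real one would query a weather service."""
    return {"location": location, "forecast": "unknown"}


# Maps tool names from the `tools` list to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(call_id, name, arguments_json):
    """Execute one requested tool call and build the matching `tool` message."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        # The model sends its arguments as a JSON-encoded string.
        result = func(**json.loads(arguments_json))
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}
```

For each `call` in `response.choices[0].message.tool_calls`, pass `call.id`, `call.function.name`, and `call.function.arguments`. Then append the assistant message and the resulting tool messages to `messages` and send a second request so the model can write its final answer.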

Next Steps

Now that you’ve made your first requests, explore more of what the Venice API has to offer:

Browse Models

Compare all available models with their capabilities, pricing, and context limits

API Reference

Explore detailed API documentation with all endpoints and parameters

Structured Responses

Learn how to get JSON responses with guaranteed schemas

AI Agents Guide

Build autonomous AI agents with Venice API and frameworks like Eliza

Additional Resources

Rate Limiting

Understand rate limits and best practices for production usage

Error Codes

Reference for handling API errors and troubleshooting issues

Postman Collection

Import our complete Postman collection for easy testing

Privacy & Security

Learn about Venice’s privacy-first architecture and data handling

Need Help?