# API Spec
Source: https://docs.venice.ai/api-reference/api-spec
## Swagger Configuration
You can find the complete swagger definition for the Venice API here:
[https://api.venice.ai/doc/api/swagger.yaml](https://api.venice.ai/doc/api/swagger.yaml)
***
## OpenAI Compatibility
Venice's API implements the OpenAI API specification, ensuring compatibility with existing OpenAI clients and tools. This document outlines how to integrate with Venice using this familiar interface. The image API supports OpenAI's format, but for the full set of options, we also offer a custom Venice image API you can use.
### Base Configuration
#### Required Base URL
All API requests must use Venice's base URL:
```javascript
const BASE_URL = "https://api.venice.ai/api/v1"
```
### Client Setup
Configure your OpenAI client with Venice's base URL:
```javascript
import OpenAI from "openai";
new OpenAI({
apiKey: "--Your API Key--",
baseURL: "https://api.venice.ai/api/v1",
});
```
## Available Endpoints
### Models
* **Endpoint**: `/api/v1/models`
* **Documentation**: [Models API Reference](/api-reference/endpoint/models/list)
* **Purpose**: Retrieve available models and their capabilities
### Chat Completions
* **Endpoint**: `/api/v1/chat/completions`
* **Documentation**: [Chat Completions API Reference](/api-reference/endpoint/chat/completions)
* **Purpose**: Generate text responses in a chat-like format
### Image Generations
* **Endpoint**: `/api/v1/image/generations`
* **Documentation**: [Image Generations API Reference](/api-reference/endpoint/image/generations)
* **Purpose**: Generate images based on text prompts
## System Prompts
Venice provides default system prompts designed to ensure uncensored and natural model responses. You have two options for handling system prompts:
1. **Default Behavior**: Your system prompts are appended to Venice's defaults
2. **Custom Behavior**: Disable Venice's system prompts entirely
### Disabling Venice System Prompts
Use the `venice_parameters` option to remove Venice's default system prompts:
```javascript
const completionStream = await openAI.chat.completions.create({
model: "default",
messages: [
{
role: "system",
content: "Your system prompt",
},
{
role: "user",
content: "Why is the sky blue?",
},
],
// @ts-expect-error Venice.ai parameters are unique to Venice.
venice_parameters: {
include_venice_system_prompt: false,
},
});
```
## Best Practices
1. **Error Handling**: Implement robust error handling for API responses
2. **Rate Limiting**: Be mindful of rate limits during the beta period
3. **System Prompts**: Test both with and without Venice's system prompts to determine the best fit for your use case
4. **API Keys**: Keep your API keys secure and rotate them regularly
## Differences from OpenAI's API
While Venice maintains high compatibility with the OpenAI API specification, there are some Venice-specific features and parameters:
1. **venice\_parameters**: Venice offers additional configurations not available via OpenAI
2. **System Prompts**: Different default behavior for system prompt handling
3. **Model Names**: Venice maps some common OpenAI model names to comparable Venice models, although we recommend reviewing the models available on Venice directly ([https://docs.venice.ai/api-reference/endpoint/models/list](https://docs.venice.ai/api-reference/endpoint/models/list))
# Create API Key
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/create
POST /api_keys
Create a new API key.
# Delete API Key
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/delete
DELETE /api_keys
Delete an API key.
# Generate API Key with Web3 Wallet
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/generate_web3_key/get
GET /api_keys/generate_web3_key
Returns the token required to generate an API key via a wallet.
## Autonomous Agent API Key Creation
Please see [this guide](/overview/guides/generating-api-key-agent) on how to use this endpoint.
***
# Generate API Key with Web3 Wallet
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/generate_web3_key/post
POST /api_keys/generate_web3_key
Authenticates a wallet holding sVVV and creates an API key.
## Autonomous Agent API Key Creation
Please see [this guide](/overview/guides/generating-api-key-agent) on how to use this endpoint.
***
# Get API Key Details
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/get
GET /api_keys/{id}
Return details about a specific API key, including rate limits and balance data.
# List API Keys
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/list
GET /api_keys
Return a list of API keys.
# Rate Limit Logs
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/rate_limit_logs
GET /api_keys/rate_limits/log
Returns the last 50 rate limits that the account exceeded.
## Experimental Endpoint
This is an experimental endpoint and may be subject to change.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-b1bd9f3e-507b-46c5-ad35-be7419ea5ad3?action=share\&creator=38652128\&ctx=documentation\&active-environment=38652128-ef110f4e-d3e1-43b5-8029-4d6877e62041).
# Rate Limits and Balances
Source: https://docs.venice.ai/api-reference/endpoint/api_keys/rate_limits
GET /api_keys/rate_limits
Return details about user balances and rate limits.
# Speech API (Beta)
Source: https://docs.venice.ai/api-reference/endpoint/audio/speech
POST /audio/speech
Converts text to speech using various voice models and formats.
# Billing Usage API (Beta)
Source: https://docs.venice.ai/api-reference/endpoint/billing/usage
GET /billing/usage
Get paginated billing usage data for the authenticated user. NOTE: This is a beta endpoint and may be subject to change.
Exports usage data for a user. Descriptions of response fields can be found below:
* **timestamp**: The timestamp the billing usage entry was created
* **sku**: The product associated with the billing usage entry
* **pricePerUnitUsd**: The price per unit in USD
* **unit**: The number of units consumed
* **amount**: The total amount charged for the billing usage entry
* **currency**: The currency charged for the billing usage entry
* **notes**: Notes about the billing usage entry
* **inferenceDetails.requestId**: The request ID associated with the inference
* **inferenceDetails.inferenceExecutionTime**: Time taken for inference execution in milliseconds
* **inferenceDetails.promptTokens**: Number of tokens requested in the prompt. Only present for LLM usage.
* **inferenceDetails.completionTokens**: Number of tokens used in the completion. Only present for LLM usage.
# List Characters
Source: https://docs.venice.ai/api-reference/endpoint/characters/list
GET /characters
This is a preview API and may change. Returns a list of characters supported in the API.
## Experimental Endpoint
This is an experimental endpoint and may be subject to change.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-b1bd9f3e-507b-46c5-ad35-be7419ea5ad3?action=share\&creator=38652128\&ctx=documentation\&active-environment=38652128-ef110f4e-d3e1-43b5-8029-4d6877e62041).
# Chat Completions
Source: https://docs.venice.ai/api-reference/endpoint/chat/completions
POST /chat/completions
Run text inference based on the supplied parameters. Long-running requests should use the streaming API by setting `stream=true` in your request.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-5a71391b-5dd8-4fe8-80be-197a958907fe?action=share\&creator=38652128\&ctx=documentation\&active-environment=38652128-ef110f4e-d3e1-43b5-8029-4d6877e62041).
***
# Model Feature Suffix
Source: https://docs.venice.ai/api-reference/endpoint/chat/model_feature_suffix
Venice supports additional capabilities within its models that can be powered by the `venice_parameters` input on the chat completions endpoint.
In certain circumstances, you may be using a client that does not let you modify the request body. For those platforms, you can utilize Venice's Model Feature Suffix offering to pass flags in via the model ID.
## Instructions
You can append any valid `venice_parameters` value to the end of the model ID. Each feature suffix follows the model name after a `:`, and multiple features can be chained together with `&`:
### To Set Web Search to Auto
```
default:enable_web_search=auto
```
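Because the flags ride along in the model ID, no request-body changes are needed. A small helper can assemble the suffix; `withFeatures` below is a hypothetical convenience function for illustration, not part of any SDK:

```javascript
// Hypothetical helper (not part of any SDK): appends venice_parameters
// flags to a model ID using the suffix syntax described above.
function withFeatures(modelId, features) {
  const flags = Object.entries(features)
    .map(([key, value]) => `${key}=${value}`)
    .join("&");
  return flags ? `${modelId}:${flags}` : modelId;
}

// Chain multiple features together with "&":
const model = withFeatures("default", {
  enable_web_search: "auto",
  include_venice_system_prompt: "false",
});
console.log(model);
// → "default:enable_web_search=auto&include_venice_system_prompt=false"
```

The resulting string can be passed as the `model` field in any OpenAI-compatible client.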
### To Enable Web Search and Disable System Prompt
```
default:enable_web_search=on&include_venice_system_prompt=false
```
### To Enable Web Search and Add Citations to the Response
```
default:enable_web_search=on&enable_web_citations=true
```
### To Use a Character
```
default:character_slug=alan-watts
```
### To Hide Thinking Blocks on a Reasoning Model Response
```
qwen3-4b:strip_thinking_response=true
```
### To Disable Thinking on Supported Reasoning Models
Certain reasoning models (like Qwen 3) support disabling the thinking process. You can activate this using the suffix below:
```
qwen3-4b:disable_thinking=true
```
### To Add Web Search Results to a Streaming Response
This will enable web search, add citations to the response body and include the search results in the stream as the final response message.
You can see an example of this in our [Postman Collection here](https://www.postman.com/veniceai/workspace/venice-ai-workspace/request/38652128-ceef3395-451c-4391-bc7e-a40377e0357b?action=share\&source=copy-link\&creator=38652128\&active-environment=ef110f4e-d3e1-43b5-8029-4d6877e62041).
```
qwen3-4b:enable_web_search=on&enable_web_citations=true&include_search_results_in_stream=true
```
### To Add Web Search Results to a Non-Streaming Response
You can view an example of this feature in our [Postman Collection here](https://www.postman.com/veniceai/workspace/venice-ai-workspace/request/38652128-857f29ff-ee70-4c7c-beba-ef884bdc93be?action=share\&creator=38652128\&ctx=documentation\&active-environment=38652128-ef110f4e-d3e1-43b5-8029-4d6877e62041).
# Generate Embeddings
Source: https://docs.venice.ai/api-reference/endpoint/embeddings/generate
POST /embeddings
Create embeddings for the supplied input.
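As a sketch of how this endpoint is typically consumed, assuming the standard OpenAI embeddings shape (`{ model, input }` in, `data[i].embedding` out) — the model ID below is a placeholder; pick an embeddings-capable model from the `/models` endpoint:

```javascript
// Sketch: request embeddings, then compare two vectors. The request/response
// shape is assumed to follow the OpenAI embeddings format; the model ID is a
// placeholder -- choose a real one from the /models endpoint.
async function embed(texts, apiKey) {
  const res = await fetch("https://api.venice.ai/api/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "your-embedding-model", input: texts }),
  });
  if (!res.ok) throw new Error(`Embeddings request failed: ${res.status}`);
  const { data } = await res.json();
  return data.map((d) => d.embedding);
}

// Embeddings are usually compared with cosine similarity:
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```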
# Edit (aka Inpaint)
Source: https://docs.venice.ai/api-reference/endpoint/image/edit
POST /image/edit
Edit or modify an image based on the supplied prompt. The image can be provided either as a multipart form-data file upload or as a base64-encoded string in a JSON request.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-2d156cd6-a9bc-4586-8a8b-98e4b5c4435d?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
Venice's image editor runs on the Flux Kontext Dev model, which blocks any request that attempts to generate or add explicit sexual imagery, sexualize minors or make adults look child-like, or depict real-world violence or gore.
# Generate Images
Source: https://docs.venice.ai/api-reference/endpoint/image/generate
POST /image/generate
Generate an image based on input parameters
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-0adc004d-2edf-4b88-a3bb-0f868c791c9c?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# Generate Images (OpenAI Compatible API)
Source: https://docs.venice.ai/api-reference/endpoint/image/generations
POST /images/generations
Generate an image based on input parameters using an OpenAI compatible endpoint. This endpoint does not support the full feature set of the Venice Image Generation endpoint, but is compatible with the existing OpenAI endpoint.
# Image Styles
Source: https://docs.venice.ai/api-reference/endpoint/image/styles
GET /image/styles
List available image styles that can be used with the generate API.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-04b32328-197f-4548-b15e-79d4ab0728b1?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# Upscale and Enhance
Source: https://docs.venice.ai/api-reference/endpoint/image/upscale
POST /image/upscale
Upscale or enhance an image based on the supplied parameters. Using a scale of 1 with enhance enabled will only run the enhancer. The image can be provided either as a multipart form-data file upload or as a base64-encoded string in a JSON request.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-8c268e3a-614f-4e49-9816-e4b8d1597818?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# Compatibility Mapping
Source: https://docs.venice.ai/api-reference/endpoint/models/compatibility_mapping
GET /models/compatibility_mapping
Returns a list of model compatibility mappings and the associated model.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-59dfa959-7038-4cd8-b8ba-80cf09f2f026?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# List Models
Source: https://docs.venice.ai/api-reference/endpoint/models/list
GET /models
Returns a list of available models supported by the Venice.ai API for both text and image inference.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-59dfa959-7038-4cd8-b8ba-80cf09f2f026?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# Traits
Source: https://docs.venice.ai/api-reference/endpoint/models/traits
GET /models/traits
Returns a list of model traits and the associated model.
## Postman Collection
For additional examples, please see this [Postman Collection](https://www.postman.com/veniceai/workspace/venice-ai-workspace/folder/38652128-59dfa959-7038-4cd8-b8ba-80cf09f2f026?action=share\&source=copy-link\&creator=38652128\&ctx=documentation).
***
# Error Codes
Source: https://docs.venice.ai/api-reference/error-codes
Predictable error codes for the Venice API
When an error occurs in the API, we return a consistent error response format that includes an error code, HTTP status code, and a descriptive message. This reference lists all possible error codes that you might encounter while using our API, along with their corresponding HTTP status codes and messages.
| Error Code | HTTP Status | Message | Log Level |
| ------------------------------------ | ----------- | ----------------------------------------------------------------------------------------------------------------- | --------- |
| `AUTHENTICATION_FAILED` | 401 | Authentication failed | - |
| `AUTHENTICATION_FAILED_INACTIVE_KEY` | 401 | Authentication failed - Pro subscription is inactive. Please upgrade your subscription to continue using the API. | - |
| `INVALID_API_KEY` | 401 | Invalid API key provided | - |
| `UNAUTHORIZED` | 403 | Unauthorized access | - |
| `INVALID_REQUEST` | 400 | Invalid request parameters | - |
| `INVALID_MODEL` | 400 | Invalid model specified | - |
| `CHARACTER_NOT_FOUND` | 404 | No character could be found from the provided character\_slug | - |
| `INVALID_CONTENT_TYPE` | 415 | Invalid content type | - |
| `INVALID_FILE_SIZE` | 413 | File size exceeds maximum limit | - |
| `INVALID_IMAGE_FORMAT` | 400 | Invalid image format | - |
| `CORRUPTED_IMAGE` | 400 | The image file is corrupted or unreadable | - |
| `RATE_LIMIT_EXCEEDED` | 429 | Rate limit exceeded | - |
| `MODEL_NOT_FOUND` | 404 | Specified model not found | - |
| `INFERENCE_FAILED` | 500 | Inference processing failed | error |
| `UPSCALE_FAILED` | 500 | Image upscaling failed | error |
| `UNKNOWN_ERROR` | 500 | An unknown error occurred | error |
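When handling these responses in a client, the HTTP status is usually enough to decide whether a retry makes sense; a minimal sketch:

```javascript
// Sketch: classify errors from the table above. 429 (rate limit) and 5xx
// (inference/upscale/unknown failures) are transient and worth retrying;
// other 4xx codes indicate a problem with the request or credentials.
function isRetryable(status) {
  return status === 429 || status >= 500;
}

console.log(isRetryable(429)); // true  (RATE_LIMIT_EXCEEDED)
console.log(isRetryable(500)); // true  (INFERENCE_FAILED, UNKNOWN_ERROR)
console.log(isRetryable(401)); // false (INVALID_API_KEY)
```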
# Rate Limits
Source: https://docs.venice.ai/api-reference/rate-limiting
This page describes the request and token rate limits for the Venice API.
## Failed Request Rate Limits
Failed requests, including 500 errors, 503 capacity errors, and 429 rate limit errors, should be retried with exponential backoff.
For 429 rate limit errors, please use `x-ratelimit-reset-requests` and `x-ratelimit-remaining-requests` to determine when to next retry.
To protect our infrastructure from abuse, if a user generates more than 20 failed requests in a 30-second window, the API will return a 429 error indicating the error rate limit has been reached:
```
Too many failed attempts (> 20) resulting in a non-success status code. Please wait 30s and try again. See https://docs.venice.ai/api-reference/rate-limiting for more information.
```
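Putting the guidance above into code, a retry loop might look like the following sketch (`fetchWithBackoff` and `backoffDelay` are illustrative names, not part of any SDK):

```javascript
// Sketch: retry failed requests with exponential backoff, honoring the
// x-ratelimit-reset-requests header (a unix timestamp) on 429 responses.
async function fetchWithBackoff(url, options, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url, options);
    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt === maxAttempts - 1) return res;
    let delayMs = backoffDelay(attempt);
    if (res.status === 429) {
      const reset = Number(res.headers.get("x-ratelimit-reset-requests"));
      if (reset) delayMs = Math.max(delayMs, reset * 1000 - Date.now());
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}

// Exponential delay with jitter: roughly 1s, 2s, 4s, ... capped at 30s.
function backoffDelay(attempt) {
  const base = Math.min(1000 * 2 ** attempt, 30_000);
  return base / 2 + Math.random() * (base / 2);
}
```

The jitter spreads retries out so that many clients recovering at once do not hit the API in lockstep.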
## Paid Tier Rate Limits
Rate limits apply to users who have purchased API credits or staked VVV to gain Diem.
Helpful links:
* [Real time rate limits](https://docs.venice.ai/api-reference/endpoint/api_keys/rate_limits?playground=open)
* [Rate limit logs](https://docs.venice.ai/api-reference/endpoint/api_keys/rate_limit_logs?playground=open) - View requests that have hit the rate limiter
We will continue to monitor usage. As we add compute capacity to the network, we will review these limits. If you are consistently hitting rate limits, please contact [**support@venice.ai**](mailto:support@venice.ai) or post in the #API channel on Discord, and we can work with you to raise your limits.
### Paid Tier - LLMs
***
| Model | Model ID | Req / Min | Req / Day | Tokens / Min |
| --------------------- | ----------------------- | :-------: | :-------- | :----------: |
| Llama 3.2 3B | llama-3.2-3b | 500 | 288,000 | 1,000,000 |
| Qwen 3 4B | qwen3-4b | 500 | 288,000 | 1,000,000 |
| Deepseek Coder V2 | deepseek-coder-v2-lite | 75 | 54,000 | 750,000 |
| Qwen 2.5 Coder 32B | qwen-2.5-coder-32b | 75 | 54,000 | 750,000 |
| Qwen 2.5 QWQ 32B | qwen-2.5-qwq-32b | 75 | 54,000 | 750,000 |
| Dolphin 72B | dolphin-2.9.2-qwen2-72b | 50 | 36,000 | 750,000 |
| Llama 3.3 70B | llama-3.3-70b | 50 | 36,000 | 750,000 |
| Mistral Small 3.1 24B | mistral-31-24b | 50 | 36,000 | 750,000 |
| Qwen 2.5 VL 72B | qwen-2.5-vl | 50 | 36,000 | 750,000 |
| Qwen 3 235B | qwen3-235b | 50 | 36,000 | 750,000 |
| Llama 3.1 405B | llama-3.1-405b | 20 | 15,000 | 750,000 |
| Deepseek R1 671B | deepseek-r1-671b | 15 | 10,000 | 200,000 |
### Paid Tier - Image Models
***
| Model | Model ID | Req / Min | Req / Day |
| ---------- | ------------------------------ | --------- | :-------- |
| Flux | flux-dev / flux-dev-uncensored | 20 | 14,400 |
| All others | All | 20 | 28,800 |
### Paid Tier - Audio Models
***
| Model | Model ID | Req / Min | Req / Day |
| ---------------- | -------- | :-------: | :-------: |
| All Audio Models | All | 60 | 86,400 |
## Rate Limit and Consumption Headers
You can monitor your API utilization and remaining requests by evaluating the following headers:
| Header | Description |
| ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| `x-ratelimit-limit-requests` | The number of requests you've made in the current evaluation period. |
| `x-ratelimit-remaining-requests` | The remaining requests you can make in the current evaluation period. |
| `x-ratelimit-reset-requests` | The unix timestamp when the rate limit will reset. |
| `x-ratelimit-limit-tokens` | The number of total (prompt + completion) tokens used within a 1-minute sliding window. |
| `x-ratelimit-remaining-tokens` | The remaining number of total tokens that can be used during the evaluation period. |
| `x-ratelimit-reset-tokens` | The duration of time in seconds until the token rate limit resets. |
| `x-venice-balance-diem` | The user's Diem balance before the request has been processed. |
| `x-venice-balance-usd` | The user's USD balance before the request has been processed. |
# About Venice
Source: https://docs.venice.ai/overview/about-venice
Welcome to Venice.ai's API documentation! Our API enables you to harness the power of advanced AI models for text and image generation while maintaining the highest standards of privacy and performance.
Venice's API is rapidly evolving. Please help us improve our offering by providing feedback. Join our [Discord](https://discord.gg/askvenice) to interact with our community or request new features.
* Features and endpoints may evolve
* Model availability may change
* Your feedback shapes our development. We take it seriously and work quickly to ensure we are providing you with the best possible product.
## Venice's Values
* **Privacy-First Architecture**: Built from the ground up with user privacy as a core principle. Venice does not utilize or store user data for any purposes whatsoever.
* **Open-Source**: Venice only utilizes open-source models to ensure users have full transparency into the models they are interacting with.
* **OpenAI API Compatible**: Seamless integration with existing OpenAI clients using the Venice API base URL.
## What Can I do with Venice API?
* **Chat**: Prompt any of the supported models directly for simple chat applications with custom parameters and configurations. Use default settings, or customize as deeply as you prefer.
* **Generate Images**: Use the image models to generate new images from a simple prompt, or modify images with "inpainting".
* **Assist with Coding**: Prompt models for coding related outputs or integrate the Venice API into your preferred IDE or Visual Studio Code plugin.
* **Generate Speech (BETA)**: Use Venice's new voice models to convert text into speech using your preferred "speaker".
* **Analyze Documents**: Send images or PDF documents for interpretation, analysis or summarization.
* **Interact with Characters**: Chat with your favorite Venice characters through the API.
* **Anything you can imagine**: The Venice API has no bounds. Tie the API into your preferred integration using the API base URL and build anything you can imagine and code.
## Accessing the API
Venice users can access the API in 3 ways:
1. **Pro Account:** Users with a PRO account are issued a one-time \$10 API credit to experiment with the Venice API.
2. **Diem:** With Venice’s [launch of the VVV token](https://venice.ai/blog/introducing-the-venice-token-vvv), users who stake tokens within the Venice protocol gain access to a daily AI inference allocation (as well as ongoing staking yield). When staking, users receive Diem, which represent a portion of the overall Venice compute capacity. You can stake VVV tokens and [see your Diem allocation here](https://venice.ai/token).
3. **USD:** Users can also opt to deposit USD into their account to pay for API inference the same way that they would on other platforms, like OpenAI or Anthropic. Users with positive USD balance are entitled to “Paid Tier” rate limits.
## API settings
Venice recognizes that users may be integrating with various applications and require API key separation and usage limitation. Venice now offers the following settings for API Keys:
1. **Administrator Settings**: Users can create new API keys directly through the API, reducing the need for UI interactions.
2. **Expiration Time**: Users can set a date for API keys expiration.
3. **Usage Limits**: Users can set daily Diem or USD limits per API key.
## Resources
* Learn more about how our API handles your data and privacy.
* Learn more about our pricing.
* Learn more about how our API handles rate limits and usage.
* Explore our API reference.
## Start Building
Ready to begin? Head to our Getting Started Guide for a step-by-step walk-through of making your first API call.
These docs are open source and can be contributed to on [Github](https://github.com/veniceai/api-docs) by submitting a pull request. Here is a simple reference guide for ["How to use Venice API"](https://venice.ai/blog/how-to-use-venice-api)
# Deprecations
Source: https://docs.venice.ai/overview/deprecations
Model inclusion and lifecycle policy and deprecations for the Venice API
## Model inclusion and lifecycle policy for the Venice API
The Venice API exists to give developers unrestricted private access to production-grade models free from hidden filters or black-box decisions.
As models improve, we occasionally retire older ones in favor of smarter, faster, or more capable alternatives. We design these transitions to be predictable and low‑friction.
## Model Deprecations
We know deprecations can be disruptive. That’s why we aim to deprecate only when necessary, and we design features like traits and Venice-branded models to minimize disruption.
We may deprecate a model when:
* A newer model offers a clear improvement for the same use case
* The model no longer meets our standards for performance or reliability
* It sees consistently low usage, and continuing to support it would fragment the experience for everyone else
## Deprecation Process
When a model meets deprecation criteria, we announce the change with 30–60 days' notice. Deprecation notices are published via the [changelog](https://featurebase.venice.ai/changelog) and our [Discord server](https://discord.gg/askvenice). When you call a deprecated model during the notice period, the API response will include a deprecation warning.
During the notice period, the model remains available, though in some cases we may reduce infrastructure capacity. We always provide a recommended replacement, and when needed, offer migration guidance to help the transition.
After the sunset date, requests to the model will automatically route to a model of similar processing power at the same or lower price. If routing is not possible for technical or safety reasons, the API will return a 410 Gone response. If a deprecated model was selected via a trait (such as `default_code`, `default_vision`, or `fastest`) that trait will be reassigned to a compatible replacement.
We never remove models silently or alter behavior without versioning. You’ll always know what’s running and how to prepare for what’s next.
Performance-only upgrades: We may roll out improvements that preserve model behavior while improving performance, latency, or cost efficiency. These updates are backward-compatible and require no customer action.
See the [Model Deprecation Tracker](#model-deprecation-tracker) below. For earlier announcements, consult the [changelog](https://featurebase.venice.ai/changelog) and our [Discord server](https://discord.gg/askvenice).
## How models are selected for the Venice API
We carefully select which models to make available based on performance, reliability, and real-world developer needs. To be included, a model must demonstrate strong performance, behave consistently under OpenAI-compatible endpoints, and offer a clear improvement over at least one of the models we already support.
Models we’re evaluating may first be released in beta to gather feedback and validate performance at scale.
We don’t expose models that are redundant, unproven, or not ready for consistent production use. Our goal is to keep the Venice API clean, capable, and optimized for what developers actually build.
Learn more in [Model Deprecations](/overview/deprecations#model-deprecations) and Current Model List.
## Versioning and Aliases
All Venice models are identified by a unique, permanent ID. For example:
`venice-uncensored`
`qwen3-235b`
`llama-3.3-70b`
`deepseek-r1-671b`
Model IDs are stable. If there's a breaking change, we will release a new model ID (for example, add a version like v2). If there are no breaking changes, we may update the existing model and will communicate significant changes.
To provide flexibility, Venice also maintains symbolic aliases — implemented through traits — that point to the recommended default model for a given task. Examples include:
* `default` → currently routes to `llama-3.3-70b`
* `default_code` → currently routes to `qwen-2.5-coder-32b`
* `default_vision` → currently routes to `mistral-31-24b`
* `default_reasoning` → currently routes to `deepseek-r1-671b`
Traits offer a stable abstraction for selecting models while giving Venice the flexibility to improve the underlying implementation. Developers who prefer automatic access to the latest recommended models can rely on trait-based aliases.
For applications that require strict consistency and predictable behavior, we recommend referencing fixed model IDs.
## Beta Models
We sometimes release models in beta to gather feedback and confirm their performance before a full production rollout. Beta status does not guarantee promotion to production. A beta model may be removed if it is too costly to run, performs poorly at scale, or raises safety concerns. Beta models can change without notice and may have limited documentation or support. Models that prove stable, broadly useful, and aligned with our standards are promoted to general availability.
To request early access, join us on [Discord](https://discord.gg/askvenice) and let us know why you’d like to join the beta tester group.
## Feedback
You can submit your feedback or request through our [Featurebase portal](https://featurebase.venice.ai). We maintain a public [changelog](https://featurebase.venice.ai/changelog), roadmap tracker, and transparent rationale for adding, upgrading, or removing models, and we encourage continuous community participation.
## Model Deprecation Tracker
The following models are scheduled for deprecation. We recommend migrating to the suggested replacements before the removal date.
| Deprecated Model | Replacement | Removal by | Status | Reason |
| ------------------------- | ------------------------------- | ------------ | --------- | --------------------------------- |
| `deepseek-r1-671b` | `qwen3-235b` | Sep 22, 2025 | Available | Better model available, low usage |
| `llama-3.1-405b` | `qwen3-235b` | Sep 22, 2025 | Available | Better model available, low usage |
| `dolphin-2.9.2-qwen2-72b` | `venice-uncensored` | Sep 22, 2025 | Available | Better model available, low usage |
| `qwen-2.5-vl` | `mistral-31-24b` | Sep 22, 2025 | Available | Low usage |
| `qwen-2.5-qwq-32b` | `qwen3-235b` (disable thinking) | Sep 22, 2025 | Available | Low usage |
| `qwen-2.5-coder-32b` | `qwen3-235b` | Sep 22, 2025 | Available | Low usage |
| `deepseek-coder-v2-lite` | `qwen3-235b` | Sep 22, 2025 | Available | Low usage |
| `pony-realism` | `lustify-sdxl` | Sep 22, 2025 | Available | Better model available |
| `stable-diffusion-3.5` | `qwen-image` | Sep 22, 2025 | Available | Low usage |
| `flux-dev` | `qwen-image` | Oct 22, 2025 | Available | Better model available |
| `flux-dev-uncensored` | `lustify-sdxl` | Oct 22, 2025 | Available | Better model available |
# Quickstart
Source: https://docs.venice.ai/overview/getting-started
## Step-by-step guide
To get started with Venice quickly, you'll need to:
**Generate an API key.** Navigate to your [Venice API Settings](https://venice.ai/settings/api) and generate a new API key. For a more detailed guide, check out the [API Key](/overview/guides/generating-api-key) page.
**List the available models.** Go to the ["List Models"](https://docs.venice.ai/api-reference/endpoint/models/list) API reference page and enter your API key to output a list of all models, or use the following command in a terminal:
```bash Curl
# Open a terminal, replace with your actual API key, and run the following command
curl --request GET \
--url https://api.venice.ai/api/v1/models \
--header 'Authorization: Bearer '
```
```go Go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	url := "https://api.venice.ai/api/v1/models"

	client := &http.Client{}
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	req.Header.Add("Authorization", "Bearer ")

	res, err := client.Do(req)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(body))
}
```
```python Python
import http.client
conn = http.client.HTTPSConnection("api.venice.ai")
payload = ''
headers = {
'Authorization': 'Bearer '
}
conn.request("GET", "/api/v1/models", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
```
```js Javascript
/**
* Keep in mind that you will likely run into CORS issues when making requests from the browser.
* You can get around this by using a proxy service like
* https://corsproxy.io/
*
* If you're looking for a React/NextJS example, check out:
* https://codesandbox.io/p/devbox/adoring-cori-6skflx
**/
const myHeaders = new Headers();
myHeaders.append("Authorization", "Bearer ");
const requestOptions = {
method: "GET",
headers: myHeaders,
redirect: "follow"
};
fetch("https://api.venice.ai/api/v1/models", requestOptions)
.then((response) => response.text())
.then((result) => console.log(result))
.catch((error) => console.error(error));
```
Go to the ["Chat Completions"](https://docs.venice.ai/api-reference/endpoint/chat/completions) API reference page and enter your API key as well as text prompt configuration options, or modify the command below in a terminal
```bash Curl
# Open a terminal, replace with your actual API key, edit the information to your needs and run the following command
curl --request POST \
--url https://api.venice.ai/api/v1/chat/completions \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama-3.3-70b",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant"
},
{
"role": "user",
"content": "Tell me about AI"
}
],
"venice_parameters": {
"enable_web_search": "on",
"include_venice_system_prompt": true
},
"frequency_penalty": 0,
"presence_penalty": 0,
"max_tokens": 1000,
"max_completion_tokens": 998,
"temperature": 1,
"top_p": 0.1,
"stream": false
}'
```
Go to the ["Generate Images"](https://docs.venice.ai/api-reference/endpoint/image/generate) API reference page and enter your API key as well as image prompt configuration options, or modify the command below in a terminal
```bash Curl
# Open a terminal, replace with your actual API key, edit the information to your needs and run the following command
curl --request POST \
--url https://api.venice.ai/api/v1/image/generate \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"model": "fluently-xl",
"prompt": "A beautiful sunset over a mountain range",
"negative_prompt": "Clouds, Rain, Snow",
"style_preset": "3D Model",
"height": 1024,
"width": 1024,
"steps": 30,
"cfg_scale": 7.5,
"seed": 123456789,
"lora_strength": 50,
"safe_mode": false,
"return_binary": false,
"hide_watermark": false
}'
```
# AI Agents
Source: https://docs.venice.ai/overview/guides/ai-agents
Venice is supported by the following AI agent communities.
* [Coinbase Agentkit](https://www.coinbase.com/developer-platform/discover/launches/introducing-agentkit)
* [Eliza](https://github.com/ai16z/eliza) - Venice support introduced via this [PR](https://github.com/ai16z/eliza/pull/1008).
## Eliza Instructions
To set up Eliza with Venice, follow these instructions. A full blog post with more detail can be found [here](https://venice.ai/blog/how-to-build-a-social-media-ai-agent-with-elizaos-venice-api).
* Clone the Eliza repository:
```bash
# Clone the repository
git clone https://github.com/ai16z/eliza.git
```
* Copy `.env.example` to `.env`
* Update `.env`, specifying your `VENICE_API_KEY` and model selections for `SMALL_VENICE_MODEL`, `MEDIUM_VENICE_MODEL`, `LARGE_VENICE_MODEL`, and `IMAGE_VENICE_MODEL`. Instructions on generating your key can be found [here](/overview/guides/generating-api-key).
* Create a new character in the `/characters/` folder with a filename similar to `your_character.character.json` to specify the character profile, tools/functions, and Venice.ai as the model provider:
```typescript
modelProvider: "venice"
```
* Build the repo:
```bash
pnpm i
pnpm build
pnpm start
```
* Start your character
```bash
pnpm start --characters="characters/.character.json"
```
* Start the local UI to chat with the agent
# Generating an API Key
Source: https://docs.venice.ai/overview/guides/generating-api-key
Venice's API is protected via API keys. To begin using the Venice API, you'll first need to generate a new key. Follow these steps to get started.
Navigate to the API settings page by visiting [https://venice.ai/settings/api](https://venice.ai/settings/api). This page is also accessible by clicking "API" in the left-hand toolbar, or by clicking “API” within your user settings.
Within this dashboard, you're able to view your Diem and USD balances, your API Tier, your API Usage, and your API Keys.
Scroll down the dashboard and select "Generate New API Key". You'll be presented with a list of options.
* **Description:** This is used to name your API key
* **API Key Type:**
* “Admin” keys have the ability to delete or generate additional API keys programmatically.
* “Inference Only” keys are only permitted to run inference.
* **Expires at:** You can choose to set an expiration date for the API key after which it will cease to function. By default, a date will not be set, and the key will work in perpetuity.
* **Epoch Consumption Limits:** This allows you to create limits for API usage from the individual API key. You can choose to limit the Diem or USD amount allowable within a given epoch (24hrs).
Clicking Generate will show you the API key.
**Important:** This key is only shown once. Make sure to copy it and store it in a safe place. If you lose it, you'll need to delete it and create a new one.
# Autonomous Agent API Key Creation
Source: https://docs.venice.ai/overview/guides/generating-api-key-agent
Autonomous AI Agents can programmatically access Venice.ai's APIs without any human interaction using the "api\_keys" endpoint. AI Agents are now able to manage their own wallets on the BASE blockchain, allowing them to programmatically acquire and stake VVV token to earn a daily Diem inference allocation. Venice's new API endpoint allows them to automate further by generating their own API key.
To autonomously generate an API key within an agent, you must:
The agent will need VVV token to complete this process. This can be achieved by sending tokens directly to the agent wallet, or having the agent swap on a Decentralized Exchange (DEX), like [Aerodrome](https://aerodrome.finance/swap?from=eth\&to=0xacfe6019ed1a7dc6f7b508c02d1b04ec88cc21bf\&chain0=8453\&chain1=8453) or [Uniswap](https://app.uniswap.org/swap?chain=base\&inputCurrency=NATIVE\&outputCurrency=0xacfe6019ed1a7dc6f7b508c02d1b04ec88cc21bf).
Once funded, the agent will need to stake the VVV tokens within the [Venice Staking Smart Contract](https://basescan.org/address/0x321b7ff75154472b18edb199033ff4d116f340ff#code). To accomplish this you first must approve VVV tokens for staking, then execute a "stake" transaction.
When the transaction is complete, you will see the VVV tokens exit the wallet and sVVV tokens returned to your wallet. This indicates a successful stake.
To generate an API key, you first need to obtain your validation token. You can get this by calling the [API endpoint](https://docs.venice.ai/api-reference/endpoint/api_keys/generate_web3_key/get) `https://api.venice.ai/api/v1/api_keys/generate_web3_key`. The API response will provide you with a "token".
Here is an example request:
```bash
curl --request GET \
--url https://api.venice.ai/api/v1/api_keys/generate_web3_key
```
Sign the token with the wallet holding VVV to complete the association between the wallet and token.
Now you can call this same [API endpoint](https://docs.venice.ai/api-reference/endpoint/api_keys/generate_web3_key/get) `https://api.venice.ai/api/v1/api_keys/generate_web3_key` to create your API key.
You will need the following information to proceed, which is described further within the "[Generating API Key Guide](https://docs.venice.ai/overview/guides/generating-api-key)":
* API Key Type: Inference or Admin
* ConsumptionLimit: To be used if you want to limit the API key usage
* Signature: The signed token from step 4
* Token: The unsigned token from step 3
* Address: The agent's wallet address
* Description: String to describe your API Key
* ExpiresAt: Option to set an expiration date for the API key (empty for no expiration)
Here is an example request:
```bash
curl --request POST \
--url https://api.venice.ai/api/v1/api_keys/generate_web3_key \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"description": "Web3 API Key",
"apiKeyType": "INFERENCE",
"signature": "",
"token": "",
"address": "",
"consumptionLimit": {
"diem": 1
}
}'
```
Example code to interact with this API can be found below:
```js
import { ethers } from "ethers";

// NOTE: This is an example. To successfully generate a key, your address must be
// holding and staking VVV.
const wallet = ethers.Wallet.createRandom();
const address = wallet.address;
console.log("Created address:", address);

// Request a JWT from Venice's API
const response = await fetch('https://api.venice.ai/api/v1/api_keys/generate_web3_key');
const token = (await response.json()).data.token;
console.log("Validation Token:", token);

// Sign the token with your wallet and pass it back to the API to generate an API key
const signature = await wallet.signMessage(token);
const postResponse = await fetch('https://api.venice.ai/api/v1/api_keys/generate_web3_key', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    address,
    signature,
    token,
    apiKeyType: 'ADMIN'
  })
});
console.log(await postResponse.json());
```
# Integrations
Source: https://docs.venice.ai/overview/guides/integrations
Here is a list of third party tools with Venice.ai integrations.
[How to use Venice API](https://venice.ai/blog/how-to-use-venice-api) reference guide.
## Venice Confirmed Integrations
* Agents
* [ElizaOS](https://venice.ai/blog/how-to-build-a-social-media-ai-agent-with-elizaos-venice-api) (local build)
* [ElizaOS](https://venice.ai/blog/how-to-launch-an-elizaos-agent-on-akash-using-venice-api-in-less-than-10-minutes) (via [Akash Template](https://console.akash.network/templates/akash-network-awesome-akash-Venice-ElizaOS))
* Coding
* [Cursor IDE](https://venice.ai/blog/how-to-code-with-the-venice-api-in-cursor-a-quick-guide)
* [Cline](https://venice.ai/blog/how-to-use-the-venice-api-with-cline-in-vscode-a-developers-guide) (VSC Extension)
* [ROO Code](https://venice.ai/blog/how-to-use-the-roo-ai-coding-assistant-in-private-with-venice-api-a-quick-guide) (VSC Extension)
* [VOID IDE](https://venice.ai/blog/how-to-use-open-source-ai-code-editor-void-in-private-with-venice-api)
* Assistants
* [Brave Leo Browser](https://venice.ai/blog/how-to-use-brave-leo-ai-with-venice-api-a-privacy-first-browser-ai-assistant)
## Community Confirmed
These integrations have been confirmed by the community. Venice is in the process of confirming these integrations and creating how-to guides for each of the following:
* Agents/Bots
* [Coinbase Agentkit](https://www.coinbase.com/developer-platform/discover/launches/introducing-agentkit)
* [Eliza\_Starter](https://github.com/Baidis/eliza-Venice) - Simplified Eliza setup
* [Venice AI Discord Bot](https://bobbiebeach.space/blog/venice-ai-discord-bot-full-setup-guide-features/)
* [JanitorAI](https://janitorai.com/)
* Coding
* [Aider](https://github.com/Aider-AI/aider), AI pair programming in your terminal
* [Alexcodes.app](https://alexcodes.app/)
* Assistants
* [Jan - Local AI Assistant](https://github.com/janhq/jan)
* [llm-venice](https://github.com/ar-jan/llm-venice)
* [unOfficial PHP SDK for Venice](https://github.com/georgeglarson/venice-ai-php)
* [Msty](https://msty.app)
* [Open WebUI](https://github.com/open-webui/open-webui)
* [Librechat](https://www.librechat.ai/)
* [ScreenSnapAI](https://screensnap.ai/)
## Venice API Raw Data
Many users have requested access to Venice API docs and data in a format acceptable for use with RAG (Retrieval-Augmented Generation) for various purposes. The full API specification is available within the "API Swagger" document below, in yaml format. The Venice API documents included throughout this API Reference webpage are available from the link below, with most documents in .mdx format.
[API Swagger](https://api.venice.ai/doc/api/swagger.yaml)
[API Docs](https://github.com/veniceai/api-docs/archive/refs/heads/main.zip)
# Using Postman
Source: https://docs.venice.ai/overview/guides/postman
## Overview
Venice provides a comprehensive Postman collection that allows developers to explore and test the full capabilities of our API. This collection includes pre-configured requests, examples, and environment variables to help you get started quickly with Venice's AI services.
## Accessing the Collection
Our official Postman collection is available in the Venice AI Workspace:
* [Venice AI Postman Workspace](https://www.postman.com/veniceai/workspace/venice-ai-workspace)
* [Venice AI Postman Examples](https://postman.venice.ai/)
## Collection Features
* **Ready-to-Use Requests**: Pre-configured API calls for all Venice endpoints
* **Environment Templates**: Properly structured environment variables
* **Request Examples**: Real-world usage examples for each endpoint
* **Response Samples**: Example responses to help you understand the API's output
* **Documentation**: Inline documentation for each request
## Getting Started
* Navigate to the Venice AI Workspace
* Click "Fork" to create your own copy of the collection
* Choose your workspace destination
* Create a new environment in Postman
* Add your Venice API key
* Configure the base URL: `https://api.venice.ai/api/v1`
* Select any request from the collection
* Ensure your environment is selected
* Click "Send" to test the API
## Available Endpoints
The collection includes examples for all Venice API endpoints:
* Text Generation
* Image Generation
* Model Information
* Image Upscaling
* System Prompt Configuration
## Best Practices
* Keep your API key secure and never share it
* Use environment variables for sensitive information
* Test responses in the Postman console before implementation
* Review the example responses for expected data structures
*Note: The Postman collection is regularly updated to reflect the latest API changes and features.*
# Structured Responses
Source: https://docs.venice.ai/overview/guides/structured-responses
Using structured responses within the Venice API
Venice now supports structured outputs via “response\_format”, an available field in the API. This field enables you to generate responses to your prompts that follow a specific pre-defined format. With this method, the models are less likely to hallucinate incorrect keys or values in the response, which was more common when attempting the same through system prompt manipulation or function calling.
The structured output “response\_format” field follows the OpenAI API format, and is further described in the OpenAI guide [here](https://platform.openai.com/docs/guides/structured-outputs). OpenAI also published an introductory article on using structured outputs in the API [here](https://openai.com/index/introducing-structured-outputs-in-the-api/). As this is advanced functionality, there are a handful of “gotchas” at the bottom of this page that should be followed.
This functionality is not natively available for all models. Please refer to the models section [here](https://docs.venice.ai/api-reference/endpoint/models/list?playground=open), and look for “supportsResponseSchema” for applicable models.
```json
{
"id": "dolphin-2.9.2-qwen2-72b",
"type": "text",
"object": "model",
"created": 1726869022,
"owned_by": "venice.ai",
"model_spec": {
"availableContextTokens": 32768,
"capabilities": {
"supportsFunctionCalling": true,
"supportsResponseSchema": true,
"supportsWebSearch": true
},
```
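When selecting a model programmatically, you can check that capability flag in the `/models` response before sending a structured request. A minimal Python sketch, filtering an abridged response (the second model entry is hypothetical, added for contrast):

```python
# Abridged /api/v1/models response; a live response lists many more models
# and fields. The "example-model-no-schema" entry is hypothetical.
models_response = {
    "data": [
        {
            "id": "dolphin-2.9.2-qwen2-72b",
            "type": "text",
            "model_spec": {
                "capabilities": {
                    "supportsFunctionCalling": True,
                    "supportsResponseSchema": True,
                    "supportsWebSearch": True,
                }
            },
        },
        {
            "id": "example-model-no-schema",
            "type": "text",
            "model_spec": {"capabilities": {"supportsResponseSchema": False}},
        },
    ]
}

def supports_response_schema(model: dict) -> bool:
    """True if the model advertises structured-output support."""
    return model.get("model_spec", {}).get("capabilities", {}).get(
        "supportsResponseSchema", False
    )

schema_models = [m["id"] for m in models_response["data"] if supports_response_schema(m)]
print(schema_models)  # ['dolphin-2.9.2-qwen2-72b']
```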
### How to use Structured Responses
To use “response\_format”, define your schema with various “properties” representing categories of outputs, each with an individually configured data type. These objects can be nested to create more advanced output structures.
Here is an example of an API call using response\_format to explain the step-by-step process of solving a math equation.
You can see that the properties were configured to require both “steps” and “final\_answer” within the response. Through nesting, the “steps” category consists of both an “explanation” and an “output”, each as strings.
```bash
curl --request POST \
--url https://api.venice.ai/api/v1/chat/completions \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"model": "dolphin-2.9.2-qwen2-72b",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor."
},
{
"role": "user",
"content": "solve 8x + 31 = 2"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "math_response",
"strict": true,
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": {
"type": "string"
},
"output": {
"type": "string"
}
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": {
"type": "string"
}
},
"required": ["steps", "final_answer"],
"additionalProperties": false
}
}
}
}
```
Here is the response that was received from the model. You can see that the structure followed the requirements, first providing the “steps” with the “explanation” and “output” of each step, and then the “final\_answer”.
```json
{
"steps": [
{
"explanation": "Subtract 31 from both sides to isolate the term with x.",
"output": "8x + 31 - 31 = 2 - 31"
},
{
"explanation": "This simplifies to 8x = -29.",
"output": "8x = -29"
},
{
"explanation": "Divide both sides by 8 to solve for x.",
"output": "x = -29 / 8"
}
],
"final_answer": "x = -29 / 8"
}
```
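Because `strict` mode guarantees the shape, the JSON string returned in `choices[0].message.content` can be parsed and consumed without defensive key checks. A small Python sketch using the response above:

```python
import json

# The structured content arrives as a JSON string; json.loads yields a dict
# matching the schema. This is the example response shown above.
content = """
{
  "steps": [
    {"explanation": "Subtract 31 from both sides to isolate the term with x.",
     "output": "8x + 31 - 31 = 2 - 31"},
    {"explanation": "This simplifies to 8x = -29.", "output": "8x = -29"},
    {"explanation": "Divide both sides by 8 to solve for x.", "output": "x = -29 / 8"}
  ],
  "final_answer": "x = -29 / 8"
}
"""

parsed = json.loads(content)
for i, step in enumerate(parsed["steps"], start=1):
    print(f"Step {i}: {step['explanation']}")
print("Answer:", parsed["final_answer"])  # Answer: x = -29 / 8
```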
Although this is a simple example, this can be extrapolated into more advanced use cases like: Data Extraction, Chain of Thought Exercises, UI Generation, Data Categorization and many others.
### Gotchas
Here are some key requirements to keep in mind when using Structured Outputs via response\_format:
* Initial requests using response\_format may take longer to generate a response. Subsequent requests will not experience the same latency as the initial request.
* For larger queries, the model can fail to complete if either `max_tokens` or model timeout are reached, or if any rate limits are violated
* Incorrect schema format will result in errors on completion, usually due to timeout
* Although response\_format ensures the model will output a particular way, it does not guarantee that the model provided the correct information within. The content is driven by the prompt and the model performance.
* Structured Outputs via response\_format are not compatible with parallel function calls
* Important: All fields must be included in the `required` array. To make a field effectively optional, add a `null` option to the field's `type`, like this: `"type": ["string", "null"]`. This allows the model to return an empty value for that field.
* Important: `additionalProperties` must be set to false for response\_format to work properly
* Important: `strict` must be set to true for response\_format to work properly
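The rules above can be combined into a checklist-style sketch. The `user_profile` schema below is hypothetical; note that the nullable `nickname` field still appears in `required`, `additionalProperties` is `false`, and `strict` is `true`:

```python
# Hypothetical schema illustrating the gotchas: every field is listed in
# "required", the optional field is made nullable via ["string", "null"],
# "additionalProperties" is false, and "strict" is true.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "user_profile",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                # Effectively optional: the model may return null here.
                "nickname": {"type": ["string", "null"]},
            },
            "required": ["name", "nickname"],
            "additionalProperties": False,
        },
    },
}

schema = response_format["json_schema"]["schema"]
# Sanity-check the gotchas before sending the request.
assert response_format["json_schema"]["strict"] is True
assert schema["additionalProperties"] is False
assert set(schema["required"]) == set(schema["properties"])
print("schema ok")
```

This dict would be sent as the `response_format` field of a chat completions request body.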
# Current Models
Source: https://docs.venice.ai/overview/models
Complete list of available models on Venice AI platform
## Text Models
| Model Name | Model ID | Price (in/out) | Context Limit | Capabilities | Traits |
| ------------------------ | ------------------------- | ---------------- | ------------- | --------------------------- | ------------------ |
| Venice Uncensored 1.1 | `venice-uncensored` | `$0.50 / $2.00` | 32,768 | — | — |
| Venice Reasoning | `qwen-2.5-qwq-32b` | `$0.50 / $2.00` | 32,768 | Reasoning | — |
| Venice Small | `qwen3-4b` | `$0.15 / $0.60` | 40,960 | Function Calling, Reasoning | — |
| Venice Medium (3.2 beta) | `mistral-32-24b` | `$0.50 / $2.00` | 131,072 | Function Calling, Vision | — |
| Venice Medium (3.1) | `mistral-31-24b` | `$0.50 / $2.00` | 131,072 | Function Calling, Vision | default\_vision |
| Venice Large 1.1 | `qwen3-235b` | `$1.50 / $6.00` | 131,072 | Function Calling, Reasoning | — |
| Llama 3.2 3B | `llama-3.2-3b` | `$0.15 / $0.60` | 131,072 | Function Calling | fastest |
| Llama 3.3 70B | `llama-3.3-70b` | `$0.70 / $2.80` | 65,536 | Function Calling | default |
| Llama 3.1 405B (D) | `llama-3.1-405b` | `$1.50 / $6.00` | 65,536 | — | most\_intelligent |
| Dolphin 72B (D) | `dolphin-2.9.2-qwen2-72b` | `$0.70 / $2.80` | 32,768 | — | most\_uncensored |
| Qwen 2.5 VL 72B (D) | `qwen-2.5-vl` | `$0.70 / $2.80` | 32,768 | Vision | — |
| Qwen 2.5 Coder 32B (D) | `qwen-2.5-coder-32b` | `$0.50 / $2.00` | 32,768 | — | default\_code |
| DeepSeek R1 671B (D) | `deepseek-r1-671b` | `$3.50 / $14.00` | 131,072 | Reasoning | default\_reasoning |
| DeepSeek Coder V2 Lite | `deepseek-coder-v2-lite` | `$0.50 / $2.00` | 131,072 | — | — |
*Pricing is per 1M tokens (input / output). Models with reasoning capabilities support advanced reasoning via thinking mode*.
### Popular Text Models
`qwen3-235b` Venice Large 1.1 - Most powerful flagship model\
`mistral-31-24b` Venice Medium (3.1) - Vision + function calling\
`qwen3-4b` Venice Small - Fast, affordable for most tasks\
`llama-3.3-70b` Llama 3.3 70B - Balanced high-performance model
### Text Model Categories
**Reasoning Models**
`qwen3-235b` Venice Large 1.1 - Advanced reasoning capabilities\
`qwen3-4b` Venice Small - Efficient reasoning model
**Vision-Capable Models**
`mistral-31-24b` Venice Medium (3.1) - Vision-capable model
**Cost-Optimized Models**
`qwen3-4b` Venice Small - Best balance of speed and cost\
`llama-3.2-3b` Llama 3.2 3B - Fastest for simple tasks
**Uncensored Models**
`venice-uncensored` Venice Uncensored 1.1 - No content filtering
**High-Intelligence Models**
`llama-3.3-70b` Llama 3.3 70B - Balanced high-intelligence\
`qwen3-235b` Venice Large 1.1 - Most powerful flagship model
***
## Image Models
| Model Name | Model ID | Price | Model Source | Traits |
| ------------------------ | ---------------------- | ------- | -------------------------- | ---------------------- |
| Venice SD35 | `venice-sd35` | `$0.01` | Stable Diffusion 3.5 Large | default, eliza-default |
| HiDream | `hidream` | `$0.01` | HiDream I1 Dev | — |
| Qwen Image | `qwen-image` | `$0.01` | Qwen Image | — |
| FLUX Standard (D) | `flux-dev` | `$0.01` | FLUX.1 Dev | highest\_quality |
| FLUX Custom (D) | `flux-dev-uncensored` | `$0.01` | FLUX.1 Dev | — |
| Lustify SDXL | `lustify-sdxl` | `$0.01` | Lustify SDXL | — |
| Pony Realism (D) | `pony-realism` | `$0.01` | Pony Realism | most\_uncensored |
| Stable Diffusion 3.5 (D) | `stable-diffusion-3.5` | `$0.01` | Stable Diffusion 3.5 Large | — |
| Anime (WAI) | `wai-Illustrious` | `$0.01` | WAI-Illustrious | — |
### Popular Image Models
`qwen-image` Qwen Image - Highest quality image generation\
`venice-sd35` Venice SD35 - Default choice with Eliza integration\
`lustify-sdxl` Lustify SDXL - Uncensored image generation\
`hidream` HiDream - Production-ready generation
### Image Model Categories
**High-Quality Models**
`qwen-image` Qwen Image - Highest quality output\
`hidream` HiDream - Production-ready generation
**Default Models**
`venice-sd35` Venice SD35 - Default choice, Eliza-optimized
**Uncensored Models**
`lustify-sdxl` Lustify SDXL - Adult content generation\
`wai-Illustrious` Anime (WAI) - Anime-focused generation, NSFW capable
***
## Audio Models
### Text-to-Speech Models
`tts-kokoro` Kokoro TTS - 60+ multilingual voices for natural speech
| Model Name | Model ID | Price | Voices Available | Model Source |
| --------------------- | ------------ | -------------------- | ---------------- | ------------ |
| Kokoro Text to Speech | `tts-kokoro` | `$3.50` per 1M chars | 60+ voices | Kokoro-82M |
The tts-kokoro model supports a wide range of multilingual and stylistic voices (including af\_nova, am\_liam, bf\_emma, zf\_xiaobei, and jm\_kumo). Voice is selected using the voice parameter in the request payload.
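As a sketch, a speech request body selecting a voice might look like the following. The `/api/v1/audio/speech` path is assumed from the API's OpenAI-compatible conventions; confirm it against the API reference.

```python
import json

# Hypothetical request body for tts-kokoro; "voice" selects one of the 60+
# voices (af_nova here). The endpoint path is an assumption:
# POST https://api.venice.ai/api/v1/audio/speech
payload = {
    "model": "tts-kokoro",
    "input": "Hello from Venice.",
    "voice": "af_nova",
}
body = json.dumps(payload)
print(body)
```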
***
## Embedding Models
`text-embedding-bge-m3` BGE-M3 - Versatile embedding model for text similarity
| Model Name | Model ID | Price | Model Source |
| ---------- | ----------------------- | ----------------------------- | ------------------- |
| BGE-M3 | `text-embedding-bge-m3` | `$0.15 / $0.60` per 1K tokens | KimChen/bge-m3-GGUF |
## Image Processing Models
`upscaler` Image Upscaler - Enhance image resolution up to 4x\
`flux-kontext-dev` Flux Kontext DEV - Multimodal image editing model
### Image Upscaler
| Model Name | Model ID | Price | Upscale Options |
| ---------- | ---------- | ------- | ------------------------ |
| Upscaler | `upscaler` | `$0.01` | `2x ($0.02), 4x ($0.08)` |
### Image Editing (Inpaint)
| Model Name | Model ID | Price | Model Source | Traits |
| ---------------- | ------------------ | ------- | ------------ | -------------------- |
| Flux Kontext DEV | `flux-kontext-dev` | `$0.04` | Flux Kontext | specialized\_editing |
## Model Features
* **Vision**: Ability to process and understand images
* **Reasoning**: Advanced logical reasoning capabilities
* **Function Calling**: Support for calling external functions and tools
* **Traits**: Special characteristics or optimizations (e.g., fastest, most\_intelligent, most\_uncensored)
## Usage Notes
* Input pricing refers to tokens sent to the model
* Output pricing refers to tokens generated by the model
* Context limits define the maximum number of tokens the model can process in a single request
* (D) Scheduled for deprecation. For timelines and migration guidance, see the [Deprecation Tracker](/overview/deprecations#model-deprecation-tracker).
# API Pricing
Source: https://docs.venice.ai/overview/pricing
### Pro Users
Pro subscribers automatically receive a one-time \$10 API credit upon upgrading to Pro – double the credit amount compared to competitors. This credit provides capacity for testing and small applications, with seamless pathways to scale via VVV staking or direct USD payments for larger implementations.
### Paid Tier
Paid access to the Venice API can be obtained in two ways:
Users can purchase API credits via the [API Dashboard](https://venice.ai/settings/api).
Users can [stake VVV](https://venice.ai/blog/how-to-stake-and-claim-your-venice-tokens-vvv), which in return provides proportional access to Venice's compute pool in units called Diem. A Diem is worth \$1 of API credit per day. The more you stake, the higher your Diem allocation, and it renews daily. You also earn staking rewards while staked. Visit the [Token Dashboard](https://venice.ai/token) to stake VVV and to see how much Diem you control.
## Model Pricing
### Chat Models
Chat models are priced per million tokens, with separate pricing for input and output tokens. While the price is per million tokens, you will only be charged for the tokens you use.
You can estimate the token count of a chat request using [this calculator](https://quizgecko.com/tools/token-counter).
| Model | Input Tokens (Diem per M.) | Input Tokens (USD per M.) | Output Tokens (Diem per M.) | Output Tokens (USD per M.) |
| --------------------------------------------------------------------------------------------------------- | :-------: | :----: | :-------: | :-----: |
| Venice Small (Qwen 3 4B)<br />Llama 3.2 3B<br />BGE 3 Embeddings | 0.15 Diem | \$0.15 | 0.6 Diem | \$0.60 |
| Venice Medium (Mistral Small 3.1 24B)<br />Venice Uncensored<br />Qwen 2.5 Coder 32B<br />Qwen 2.5 QWQ 32B | 0.5 Diem | \$0.50 | 2.0 Diem | \$2.00 |
| Llama 3.3 70B<br />Dolphin 72B<br />Qwen 2.5 VL 72B | 0.7 Diem | \$0.70 | 2.8 Diem | \$2.80 |
| Venice Large (Qwen 3 235B)<br />Llama 3.1 405B | 1.5 Diem | \$1.50 | 6.0 Diem | \$6.00 |
| DeepSeek R1 671B | 3.5 Diem | \$3.50 | 14.0 Diem | \$14.00 |
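Since billing is per token at these rates, the cost of a single request can be estimated directly. A small Python sketch using a few of the USD rates from the table (per 1M tokens):

```python
# USD rates per 1M tokens (input, output), taken from the pricing table above.
RATES_USD = {
    "qwen3-4b": (0.15, 0.60),
    "llama-3.3-70b": (0.70, 2.80),
    "qwen3-235b": (1.50, 6.00),
    "deepseek-r1-671b": (3.50, 14.00),
}

def chat_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one chat request; only used tokens are billed."""
    rate_in, rate_out = RATES_USD[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000

# e.g. a 1,200-token prompt with an 800-token reply on llama-3.3-70b:
print(round(chat_cost_usd("llama-3.3-70b", 1200, 800), 6))  # 0.00308
```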
### Image Models
Venice Image models are currently priced at the following rates:
| Model | Diem Pricing | USD Pricing |
| ---------------------- | :----------: | :---------: |
| Generation | 0.01 Diem | \$0.01 USD |
| Upscale / Enhance (2x) | 0.02 Diem | \$0.02 USD |
| Upscale / Enhance (4x) | 0.08 Diem | \$0.08 USD |
| Edit (aka Inpaint) | 0.04 Diem | \$0.04 USD |
### Audio Models
All Venice Audio models are currently priced at the following rates:
| Model | Input Characters (Diem per M.) | Input Characters (USD per M.) |
| ----- | :----------------------------: | :---------------------------: |
| All | 3.5 Diem | \$3.50 USD |
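At this rate, the cost of a text-to-speech job is proportional to the character count of the input. A quick Python sketch:

```python
# $3.50 (or 3.5 Diem) per 1M input characters, per the table above.
RATE_USD_PER_M_CHARS = 3.50

def tts_cost_usd(text: str) -> float:
    """Estimate the USD cost of synthesizing the given text."""
    return len(text) / 1_000_000 * RATE_USD_PER_M_CHARS

# e.g. an 18,000-character script:
print(round(tts_cost_usd("x" * 18_000), 4))  # 0.063
```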
# Privacy
Source: https://docs.venice.ai/overview/privacy
Nearly all AI apps and services collect user data (personal information, prompt text, and AI text and image responses) in central servers, which they can access, and which they can (and do) share with third parties, ranging from ad networks to governments. Even if a company wants to keep this data safe, data breaches happen [all the time](https://www.wired.com/story/wired-guide-to-data-breaches/), often unreported.
> The only way to achieve reasonable user privacy is to avoid collecting this information in the first place. This is harder to do from an engineering perspective, but we believe it’s the correct approach.
### Privacy as a principle
One of Venice’s guiding principles is user privacy. The platform's architecture flows from this philosophical principle, and every component is designed with this objective in mind.
#### Architecture
The Venice API replicates the same technical architecture as the Venice platform from a backend perspective.
**Venice does not store or log any prompt or model responses on our servers.** API calls are forwarded directly to GPUs running across a collection of decentralized providers over encrypted HTTPS paths.