POST /chat/completions
curl --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}'

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
any
required

ID of the model to use, or a model trait used to select the model.

messages
object[]
required

A list of messages comprising the conversation so far.
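For example, a multi-turn conversation interleaves user and assistant messages, optionally led by a system message (an illustrative request body following the shape of the curl example above, and assuming the usual system/user/assistant roles of OpenAI-compatible chat APIs):

{
  "model": "dolphin-2.9.2-qwen2-72b",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "What is the capital of France?" },
    { "role": "assistant", "content": "Paris." },
    { "role": "user", "content": "And of Italy?" }
  ]
}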

venice_parameters
object
temperature
number
default: 1

The sampling temperature to use. Higher values make the output more random; lower values make it more focused and deterministic.

Required range: 0 < x < 2
top_p
number
default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens comprising the top_p probability mass.

Required range: 0 < x < 1
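For example, both sampling parameters can be set alongside the other request fields (an illustrative sketch; the values shown are arbitrary):

curl --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "temperature": 0.7,
  "top_p": 0.9,
  "messages": [
    { "role": "user", "content": "Suggest a name for a coffee shop." }
  ]
}'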
stream
boolean
default: false

Whether to stream back partial progress as server-sent events.
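For example, a streaming request differs only by the stream flag; curl's --no-buffer option prints each server-sent event as it arrives (an illustrative sketch):

curl --no-buffer --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "stream": true,
  "messages": [
    { "role": "user", "content": "Write a haiku about the sea." }
  ]
}'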

max_tokens
integer
deprecated

Maximum number of tokens to generate.

max_completion_tokens
integer | null

An upper bound for the number of tokens that can be generated for a completion.

tools
object[]

A list of tools the model may call.
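For example, a single function-style tool can be declared in the request (an illustrative sketch; get_weather and its parameters are hypothetical, and the schema assumes the common OpenAI-compatible tool format):

# get_weather below is a hypothetical example function
curl --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}'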

Response

200 - application/json
id
string
required

Unique identifier for the chat completion

object
enum<string>
required

The object type

Available options:
chat.completion
created
integer
required

Unix timestamp of when the completion was created

model
string
required

The model used for completion

choices
object[]
required

The list of chat completion choices generated by the model

usage
object

Token usage statistics for the completion request
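For reference, a successful 200 response looks roughly like the following (an illustrative sketch assembled from the fields above; the contents of choices and usage follow the common OpenAI-compatible shape and are not guaranteed by this reference):

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1715000000,
  "model": "dolphin-2.9.2-qwen2-72b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 8,
    "total_tokens": 22
  }
}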