POST /chat/completions
curl --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}'

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
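
As a rough Python counterpart to the curl call above, the sketch below attaches the Bearer header to a POST request using only the standard library. The URL is taken from the curl example; the helper name `build_request` is an illustrative assumption, not part of any official client.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example.
API_URL = "https://api.venice.ai/api/v1/chat/completions"


def build_request(token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request.

    `build_request` is a hypothetical helper name used for illustration.
    """
    if not token:
        raise ValueError("an auth token is required")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Header of the form "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; building it separately keeps the token handling easy to inspect.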

Body

application/json
messages
object[] | null
required

A list of messages comprising the conversation so far.

model
any
required

ID of the model to use, or a model trait used to select the model.
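
A minimal sketch of assembling the two required fields into a request body. The `role` and `content` keys are assumed from the curl example above, since the message schema is not spelled out here; `build_body` is a hypothetical helper name.

```python
import json


def build_body(model: str, messages: list, **options) -> dict:
    """Assemble a request body from the required fields plus optional ones.

    Options whose value is None are left out so the documented defaults apply.
    """
    if not model:
        raise ValueError("model is required")
    for message in messages:
        # Assumption: each message carries "role" and "content",
        # as in the curl example.
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in options.items() if v is not None})
    return body


body = build_body(
    "dolphin-2.9.2-qwen2-72b",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=None,  # omitted from the body
    stream=False,
)
payload = json.dumps(body)  # JSON text suitable for the request's data
```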

frequency_penalty
number | null
default:
0

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Required range: -2 <= x <= 2
max_completion_tokens
integer | null

An upper bound for the number of tokens that can be generated for a completion.

max_tokens
integer
deprecated

Maximum number of tokens to generate.

presence_penalty
number | null
default:
0

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Required range: -2 <= x <= 2
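
To make the difference between the two penalties concrete, the toy sketch below applies the adjustment that OpenAI documents for its parameters of the same names: a token's logit is lowered by `frequency_penalty` times the number of times it has already appeared, and by `presence_penalty` once if it has appeared at all. Whether the models behind this endpoint use exactly this rule is an assumption; the function name and the toy tokens are illustrative.

```python
def apply_penalties(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Return penalized logits for a toy vocabulary.

    logits: dict mapping token -> raw logit
    counts: dict mapping token -> occurrences in the text so far
    Assumed rule: logit - count * frequency_penalty - (count > 0) * presence_penalty
    """
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1.0 if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


# "cat" has already appeared twice, "dog" has not appeared.
penalized = apply_penalties(
    {"cat": 1.0, "dog": 1.0},
    {"cat": 2},
    frequency_penalty=0.5,
    presence_penalty=0.25,
)
```

Under this rule the frequency penalty grows with every repetition, while the presence penalty is a flat, one-time cost for any token that has been used before.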
prompt
string
default:
<|endoftext|>

The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.

Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.

stop

Up to 4 sequences where the API will stop generating further tokens.

stream
boolean
default:
false

Whether to stream back partial progress as server-sent events.
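
When `stream` is true, the response arrives as server-sent events, which a client reads line by line. The sketch below extracts the JSON payload from each `data:` line. The `[DONE]` end marker and the shape of the sample chunks are assumptions borrowed from OpenAI-style streaming, not confirmed by this page.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream marker
            break
        yield json.loads(data)


# Hypothetical stream contents, for illustration only.
sample_lines = [
    'data: {"text": "Par"}\n',
    "\n",
    'data: {"text": "is"}\n',
    "\n",
    "data: [DONE]\n",
]
chunks = list(iter_sse_data(sample_lines))
```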

temperature
number
default:
1

Sampling temperature. Higher values make the output more random; lower values make it more focused.

Required range: 0 <= x <= 2
tools
object[]

A list of tools the model may call.

top_p
number
default:
1

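An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the most likely tokens whose cumulative probability reaches top_p.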

Required range: 0 <= x <= 1
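
The two sampling controls can be illustrated on a toy distribution. This is a sketch of the standard definitions of temperature scaling and nucleus (top-p) filtering, not the service's actual implementation; the function names are illustrative, and the temperature helper assumes a value greater than zero.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Return indices of the smallest set of most likely tokens whose
    cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


sharp = softmax_with_temperature([1.0, 2.0], 0.5)  # more focused
flat = softmax_with_temperature([1.0, 2.0], 2.0)   # more random
kept = nucleus([0.5, 0.3, 0.2], 0.7)
```

A lower temperature concentrates probability on the top token, while a lower `top_p` shrinks the set of tokens that can be sampled at all.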
venice_parameters
object

Response

200 - application/json
choices
object[]
required
created
integer
required

Unix timestamp of when the completion was created

id
string
required

Unique identifier for the chat completion

model
string
required

The model used for completion

object
enum<string>
required

The object type

Available options:
chat.completion
usage
object
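
A minimal sketch of checking a decoded 200 response against the fields listed above. The internal structure of `choices` and `usage` is not described here, so both are kept as raw values; the class and function names and the sample payload are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional

# Fields marked "required" in the response description.
REQUIRED_FIELDS = ("choices", "created", "id", "model", "object")


@dataclass
class ChatCompletion:
    id: str
    model: str
    created: datetime  # converted from the Unix timestamp
    choices: list      # left as raw objects
    usage: Optional[Any]


def parse_completion(payload: dict) -> ChatCompletion:
    """Validate the documented fields and return a typed view of the response."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if payload["object"] != "chat.completion":
        raise ValueError(f"unexpected object type: {payload['object']!r}")
    return ChatCompletion(
        id=payload["id"],
        model=payload["model"],
        created=datetime.fromtimestamp(payload["created"], tz=timezone.utc),
        choices=payload["choices"],
        usage=payload.get("usage"),
    )


# Hypothetical payload, for illustration only.
completion = parse_completion({
    "id": "example-id",
    "object": "chat.completion",
    "created": 0,
    "model": "dolphin-2.9.2-qwen2-72b",
    "choices": [],
})
```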