Structured Responses
Using structured responses within the Venice API
Venice has now included structured outputs via “response_format” as an available field in the API. This field enables you to generate responses to your prompts that follow a specific pre-defined format. With this new method, the models are less likely to hallucinate incorrect keys or values within the response, which was more prevalent when attempting through system prompt manipulation or via function calling.
The structured output “response_format” field utilizes the OpenAI API format, and is further described in the openAI guide here. OpenAI also released an introduction article to using stuctured outputs within the API specifically here. As this is advanced functionality, there are a handful of “gotchas” on the bottom of this page that should be followed.
This functionality is not natively available for all models. Please refer to the models section here, and look for “supportsResponseSchema” for applicable models.
How to use Structured Responses
To properly use the “response_format” you can define your schema with various “properties”, representing categories of outputs, each with individually configured data types. These objects can be nested to create more advanced structures of outputs.
Here is an example of an API call using response_format to explain the step-by-step process of solving a math equation.
You can see that the properties were configured to require both “steps” and “final_answer” within the response. Within nesting, the steps category consists of both an “explanation” and an “output”, each as strings.
Here is the response that was received from the model. You can see that the structure followed the requirements by first providing the “steps” with the “explanation” and “output” of each step, and then the “final answer”.
Although this is a simple example, this can be extrapolated into more advanced use cases like: Data Extraction, Chain of Thought Exercises, UI Generation, Data Categorization and many others.
Gotchas
Here are some key requirements to keep in mind when using Structured Outputs via response_format:
-
Initial requests using response_format may take longer to generate a response. Subsequent requests will not experience the same latency as the initial request.
-
For larger queries, the model can fail to complete if either
max_tokens
or model timeout are reached, or if any rate limits are violated -
Incorrect schema format will result in errors on completion, usually due to timeout
-
Although response_format ensures the model will output a particular way, it does not guarantee that the model provided the correct information within. The content is driven by the prompt and the model performance.
-
Structured Outputs via response_format are not compatible with parallel function calls
-
Important: All fields fields or parameters must include a
required
tag. To make a field optional, you need to add anull
option within thetype
of the field, like this"type": ["string", "null"]
-
It is possible to make fields optional by giving a
null
options within the required field to allow an empty response. -
Important:
additionalProperties
must be set to false for response_format to work properly -
Important:
strict
must be set to true for response_format to work properly