Venice now supports structured outputs through the “response_format” field of the API. This field lets you generate responses that follow a specific, predefined format. With this method, models are less likely to hallucinate incorrect keys or values in the response, a problem that was more prevalent when structure was enforced through system prompt manipulation or function calling.

The “response_format” field follows the OpenAI API format and is described further in the OpenAI guide here. OpenAI also published an introductory article on using structured outputs in the API here. As this is advanced functionality, there are a handful of “gotchas” at the bottom of this page that you should keep in mind.

This functionality is not natively available for all models. Please refer to the models section here, and look for “supportsResponseSchema” for applicable models.

    {
      "id": "dolphin-2.9.2-qwen2-72b",
      "type": "text",
      "object": "model",
      "created": 1726869022,
      "owned_by": "venice.ai",
      "model_spec": {
        "availableContextTokens": 32768,
        "capabilities": {
          "supportsFunctionCalling": true,
          "supportsResponseSchema": true,
          "supportsWebSearch": true
        },
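As a minimal sketch of that lookup, the function below filters a list of model objects shaped like the excerpt above and keeps those that report “supportsResponseSchema”. The second entry in `sample_models` is hypothetical and exists only to show a model being filtered out; how you fetch the model list is left out here.

```python
def models_with_response_schema(models):
    """Return the ids of models whose capabilities report supportsResponseSchema."""
    supported = []
    for model in models:
        # Missing keys are treated as "not supported" rather than raising.
        capabilities = model.get("model_spec", {}).get("capabilities", {})
        if capabilities.get("supportsResponseSchema") is True:
            supported.append(model["id"])
    return supported


# Sample data: the first entry mirrors the excerpt above,
# the second is a hypothetical model without schema support.
sample_models = [
    {
        "id": "dolphin-2.9.2-qwen2-72b",
        "type": "text",
        "model_spec": {
            "capabilities": {
                "supportsFunctionCalling": True,
                "supportsResponseSchema": True,
                "supportsWebSearch": True,
            }
        },
    },
    {
        "id": "hypothetical-model-without-schema",
        "type": "text",
        "model_spec": {"capabilities": {"supportsResponseSchema": False}},
    },
]

print(models_with_response_schema(sample_models))
# ['dolphin-2.9.2-qwen2-72b']
```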

How to use Structured Responses

To use “response_format”, define a schema whose “properties” represent the categories of output, each with its own data type. These objects can be nested to create more advanced output structures.

Here is an example of an API call using response_format to explain the step-by-step process of solving a math equation.

The properties are configured to require both “steps” and “final_answer” in the response. Nested inside, each item in “steps” consists of an “explanation” and an “output”, both strings.

curl --request POST \
  --url https://api.venice.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "dolphin-2.9.2-qwen2-72b",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}'
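For readers who would rather build the same request in Python, here is a rough equivalent of the curl call using only the standard library. The names `build_math_request` and `MATH_RESPONSE_FORMAT` are illustrative and not part of the Venice API; the URL, headers, model id and schema are copied from the example above.

```python
import json
import urllib.request

VENICE_CHAT_URL = "https://api.venice.ai/api/v1/chat/completions"

# Same schema as the curl example: every property is required and
# additionalProperties is false at each object level.
MATH_RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "math_response",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "steps": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "explanation": {"type": "string"},
                            "output": {"type": "string"},
                        },
                        "required": ["explanation", "output"],
                        "additionalProperties": False,
                    },
                },
                "final_answer": {"type": "string"},
            },
            "required": ["steps", "final_answer"],
            "additionalProperties": False,
        },
    },
}


def build_math_request(api_key, question, model="dolphin-2.9.2-qwen2-72b"):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": question},
        ],
        "response_format": MATH_RESPONSE_FORMAT,
    }
    return urllib.request.Request(
        VENICE_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_math_request("<api-key>", "solve 8x + 31 = 2")
print(request.get_method(), request.full_url)
```

Sending the request is a matter of passing it to `urllib.request.urlopen` and decoding the body with `json.loads`; that step needs a valid API key and network access, so it is not shown.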

Here is the response received from the model. The structure follows the schema, first providing the “steps”, each with an “explanation” and an “output”, and then the “final_answer”.

{
  "steps": [
    {
      "explanation": "Subtract 31 from both sides to isolate the term with x.",
      "output": "8x + 31 - 31 = 2 - 31"
    },
    {
      "explanation": "This simplifies to 8x = -29.",
      "output": "8x = -29"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -29 / 8"
    }
  ],
  "final_answer": "x = -29 / 8"
}
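On the client side it can still be worth checking the shape before using it. The sketch below assumes the structured output reaches you as a JSON string (in the OpenAI format this would typically be the message content of the first choice, which is an assumption here rather than something this page states) and verifies the keys required by the schema above. The function name `parse_math_response` is illustrative.

```python
import json


def parse_math_response(content):
    """Decode the model's JSON text and check the keys required by the schema."""
    data = json.loads(content)

    missing = [key for key in ("steps", "final_answer") if key not in data]
    if missing:
        raise ValueError(f"response is missing required keys: {missing}")

    for index, step in enumerate(data["steps"]):
        for key in ("explanation", "output"):
            if not isinstance(step.get(key), str):
                raise ValueError(f"steps[{index}].{key} must be a string")
    return data


# A shortened version of the response shown above.
sample_content = (
    '{"steps": [{"explanation": "Divide both sides by 8 to solve for x.", '
    '"output": "x = -29 / 8"}], "final_answer": "x = -29 / 8"}'
)

result = parse_math_response(sample_content)
print(result["final_answer"])
# x = -29 / 8
```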

Although this is a simple example, the same approach extends to more advanced use cases such as data extraction, chain-of-thought exercises, UI generation, data categorization, and many others.

Gotchas

Here are some key requirements to keep in mind when using Structured Outputs via response_format:

  • Initial requests using response_format may take longer to generate a response. Subsequent requests will not experience the same latency as the initial request.

  • For larger queries, the model can fail to complete if max_tokens or the model timeout is reached, or if any rate limits are exceeded.

  • An incorrectly formatted schema will result in errors on completion, usually due to a timeout.

  • Although response_format ensures the model's output follows a particular structure, it does not guarantee that the information within it is correct. The content is driven by the prompt and the model's performance.

  • Structured Outputs via response_format are not compatible with parallel function calls.

  • Important: All fields or parameters must be listed as required. To make a field optional, add a null option to the type of the field, like this: "type": ["string", "null"]

  • A field made optional this way stays in the required list; the null option is what allows the model to return an empty value for it.

  • Important: additionalProperties must be set to false for response_format to work properly.

  • Important: strict must be set to true for response_format to work properly.
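The schema-related requirements in this list (every field required, additionalProperties set to false, strict set to true) can be checked locally before a request is sent. The following is a rough sketch of such a check, not an official validator: the function names are hypothetical, and it only walks "properties" and array "items", ignoring other JSON Schema keywords.

```python
def find_schema_problems(schema, path="schema"):
    """Recursively collect violations of the required/additionalProperties rules."""
    problems = []
    declared = schema.get("type")
    types = declared if isinstance(declared, list) else [declared]

    if "object" in types:
        properties = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        required = set(schema.get("required", []))
        for name in sorted(set(properties) - required):
            problems.append(f"{path}.{name}: not listed in required")
        for name, sub_schema in properties.items():
            problems.extend(find_schema_problems(sub_schema, f"{path}.{name}"))

    if "array" in types and isinstance(schema.get("items"), dict):
        problems.extend(find_schema_problems(schema["items"], f"{path}[]"))

    return problems


def find_response_format_problems(response_format):
    """Check the strict flag, then the schema itself."""
    json_schema = response_format.get("json_schema", {})
    problems = []
    if json_schema.get("strict") is not True:
        problems.append("json_schema.strict must be true")
    problems.extend(find_schema_problems(json_schema.get("schema", {})))
    return problems


# "note" is optional in practice: it is still listed in required,
# but its type allows null.
good_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "example",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "note": {"type": ["string", "null"]},
            },
            "required": ["title", "note"],
            "additionalProperties": False,
        },
    },
}

bad_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "example",
        "schema": {"type": "object", "properties": {"title": {"type": "string"}}},
    },
}

print(find_response_format_problems(good_format))
# []
print(find_response_format_problems(bad_format))
# ['json_schema.strict must be true', 'schema: additionalProperties must be false', 'schema.title: not listed in required']
```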