# File Inputs

> Attach PDFs, Office documents, text, data, and source-code files to Venice chat completion requests for summarization, Q&A, and transformation.

File inputs let you attach documents and source files directly to a `/chat/completions` request. Venice extracts the file to text before sending it to the selected model, so you can ask questions, summarize, compare, or transform file content without building your own parser first.

Use file inputs when your prompt depends on the contents of a document, spreadsheet, markdown file, JSON file, or code file. They are request-scoped inputs, not persistent file storage, so include the file in each request that needs it.

<Info>
  File inputs use the OpenAI-compatible chat content array. Add a content block with `type: "file"` and provide the file content in `file.file_data`.
</Info>

## Supported File Types

The chat API accepts file inputs as either base64 data URLs or publicly accessible URLs.

The maximum file size is **25 MB per file**. The limit applies to the raw file bytes: after decoding a base64 data URL, or after fetching a URL.

| Category      | Formats                                                                                                                        |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| Documents     | PDF, DOCX, PPTX                                                                                                                |
| Spreadsheets  | XLSX, XLS, CSV                                                                                                                 |
| Text and data | TXT, Markdown, JSON                                                                                                            |
| Source code   | Most common code files, including `.py`, `.js`, `.ts`, `.c`, `.cpp`, `.java`, `.go`, `.rs`, `.ps1`, `.sh`, `.yaml`, and `.sql` |

<Note>
  Files are extracted to text before inference. The extracted text counts toward the model's input context, so choose a model with enough `availableContextTokens` for the file plus your instructions and expected answer.
</Note>

## Basic Usage

Send a `messages` array in which the user message's `content` is an array of text and file blocks:

<CodeGroup>
  ```python Python theme={"system"}
  import base64
  import os
  from pathlib import Path

  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["VENICE_API_KEY"],
      base_url="https://api.venice.ai/api/v1",
  )

  # Read the file and wrap its bytes in a base64 data URL with the PDF MIME type.
  path = Path("q3-report.pdf")
  file_data = "data:application/pdf;base64," + base64.b64encode(path.read_bytes()).decode("utf-8")

  response = client.chat.completions.create(
      model="openai-gpt-55",
      messages=[
          {
              "role": "user",
              "content": [
                  {
                      "type": "text",
                      "text": "Summarize this report in five bullets and list the main risks.",
                  },
                  {
                      "type": "file",
                      "file": {
                          "file_data": file_data,
                          "filename": "q3-report.pdf",
                      },
                  },
              ],
          }
      ],
  )

  print(response.choices[0].message.content)
  ```

  ```javascript Node.js theme={"system"}
  import OpenAI from "openai";
  import { readFile } from "node:fs/promises";

  const client = new OpenAI({
    apiKey: process.env.VENICE_API_KEY,
    baseURL: "https://api.venice.ai/api/v1",
  });

  // Read the file and wrap its bytes in a base64 data URL with the PDF MIME type.
  const pdf = await readFile("q3-report.pdf");
  const fileData = `data:application/pdf;base64,${pdf.toString("base64")}`;

  const response = await client.chat.completions.create({
    model: "openai-gpt-55",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Summarize this report in five bullets and list the main risks.",
          },
          {
            type: "file",
            file: {
              file_data: fileData,
              filename: "q3-report.pdf",
            },
          },
        ],
      },
    ],
  });

  console.log(response.choices[0].message.content);
  ```

  ```bash cURL theme={"system"}
  # Encode the PDF as one unwrapped base64 line; embedded newlines would break the JSON string.
  PDF_BASE64=$(base64 < q3-report.pdf | tr -d '\n')

  curl https://api.venice.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $VENICE_API_KEY" \
    -H "Content-Type: application/json" \
    -d @- <<EOF
  {
    "model": "openai-gpt-55",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Summarize this report in five bullets and list the main risks."
          },
          {
            "type": "file",
            "file": {
              "file_data": "data:application/pdf;base64,$PDF_BASE64",
              "filename": "q3-report.pdf"
            }
          }
        ]
      }
    ]
  }
  EOF
  ```
</CodeGroup>

## File URLs

If the file is already hosted at a public HTTP or HTTPS URL, pass the URL in `file_data` instead of base64 encoding it:

```json theme={"system"}
{
  "model": "openai-gpt-55",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Identify the governing law, renewal terms, and termination rights in this agreement."
        },
        {
          "type": "file",
          "file": {
            "file_data": "https://example.com/contracts/vendor-agreement.pdf",
            "filename": "vendor-agreement.pdf"
          }
        }
      ]
    }
  ]
}
```

<Warning>
  Only use public URLs that Venice can fetch without authentication. For private files, send a base64 data URL.
</Warning>
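
The warning above can be partly enforced on the client before a request is sent. The following is a minimal sketch, not an official Venice utility: `url_file_block` is a hypothetical helper name, and the checks only catch URLs that are obviously unsuitable (a non-HTTP scheme or embedded credentials). It cannot confirm that Venice can actually reach the URL.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def url_file_block(url: str, filename: Optional[str] = None) -> dict:
    """Build a `type: "file"` content block from a public URL.

    Hypothetical helper for illustration; not part of any SDK.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("file URL must be an absolute http(s) URL")
    if parsed.username or parsed.password:
        # Credentials in the URL imply the file is not public.
        raise ValueError("URL contains credentials; send a base64 data URL instead")
    # Fall back to the last path segment when no filename is given.
    name = filename or PurePosixPath(parsed.path).name or "file"
    return {"type": "file", "file": {"file_data": url, "filename": name}}
```

The returned dictionary can be placed directly in the `content` array of a user message.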

## Multiple Files

You can include more than one file block in the same message. Put a short text instruction before the files so the model knows how to use them.

```json theme={"system"}
{
  "model": "openai-gpt-55",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Compare these two policy drafts. Return the material differences and recommend which version is clearer."
        },
        {
          "type": "file",
          "file": {
            "file_data": "data:application/pdf;base64,JVBERi0xLjQK...",
            "filename": "policy-v1.pdf"
          }
        },
        {
          "type": "file",
          "file": {
            "file_data": "data:application/pdf;base64,JVBERi0xLjQK...",
            "filename": "policy-v2.pdf"
          }
        }
      ]
    }
  ]
}
```

For best results, name each file clearly and refer to those names in your prompt.
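
When the set of files varies per request, the content array can be assembled programmatically. This is a sketch under stated assumptions: `build_content` is a hypothetical helper name, and the example URLs are placeholders. It places the instruction first and then one file block per file, following the ordering recommended above.

```python
def build_content(instruction: str, files: dict) -> list:
    """Return a content array: one text block, then one file block per entry.

    `files` maps filename -> file_data (a base64 data URL or a public URL).
    Hypothetical helper for illustration; not part of any SDK.
    """
    content = [{"type": "text", "text": instruction}]
    for filename, file_data in files.items():
        content.append(
            {"type": "file", "file": {"file_data": file_data, "filename": filename}}
        )
    return content


# Placeholder URLs; the prompt refers to the files by the same names.
content = build_content(
    "Compare policy-v1.pdf against policy-v2.pdf and list the material differences.",
    {
        "policy-v1.pdf": "https://example.com/policy-v1.pdf",
        "policy-v2.pdf": "https://example.com/policy-v2.pdf",
    },
)
messages = [{"role": "user", "content": content}]
```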

## Data URLs

For local files, encode the file bytes as base64 and prefix them with the correct MIME type:

| File type  | Data URL prefix                                                                          |
| ---------- | ---------------------------------------------------------------------------------------- |
| PDF        | `data:application/pdf;base64,`                                                           |
| DOCX       | `data:application/vnd.openxmlformats-officedocument.wordprocessingml.document;base64,`   |
| PPTX       | `data:application/vnd.openxmlformats-officedocument.presentationml.presentation;base64,` |
| XLSX       | `data:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;base64,`         |
| CSV        | `data:text/csv;base64,`                                                                  |
| Markdown   | `data:text/markdown;base64,`                                                             |
| Plain text | `data:text/plain;base64,`                                                                |
| JSON       | `data:application/json;base64,`                                                          |

If you do not know the exact MIME type, use the generic prefix `data:application/octet-stream;base64,`. Including an accurate `filename` still helps Venice identify and display the file.
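
The prefixes in the table above can be selected automatically from the file extension. The following is a minimal sketch: `to_data_url` and `MIME_BY_SUFFIX` are hypothetical names, and the extension-to-type mapping (for example `.md` for Markdown) is an assumption layered on top of the table. Unknown extensions fall back to Python's `mimetypes` guess and then to `application/octet-stream`.

```python
import base64
import mimetypes
from pathlib import Path

# MIME types taken from the table above, keyed by the usual file extension.
MIME_BY_SUFFIX = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".csv": "text/csv",
    ".md": "text/markdown",
    ".txt": "text/plain",
    ".json": "application/json",
}


def to_data_url(path) -> str:
    """Encode a local file as a base64 data URL (hypothetical helper)."""
    path = Path(path)
    mime = (
        MIME_BY_SUFFIX.get(path.suffix.lower())
        or mimetypes.guess_type(path.name)[0]
        or "application/octet-stream"  # generic fallback for unknown types
    )
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The result is suitable as the `file_data` value, with `path.name` passed as `filename`.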

## Working With Large Files

Because files become prompt text, large files increase latency, token usage, and cost. The extracted text, your instructions, and the expected answer must all fit within the model's context window.

The raw file must be 25 MB or smaller. Base64 encoding adds about 33% to the payload, so a file near the 25 MB limit produces a JSON request body of roughly 33 MB.
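
The size arithmetic can be checked before encoding. The following is a rough sketch, assuming the limit means 25,000,000 bytes; the documentation does not say whether "MB" is decimal or binary, so this uses the stricter reading. `check_upload_size` and `base64_length` are hypothetical names.

```python
MAX_FILE_BYTES = 25_000_000  # assumption: the limit read as decimal megabytes


def base64_length(raw_bytes: int) -> int:
    """Characters in the padded base64 encoding of `raw_bytes` bytes."""
    # Base64 emits 4 characters for every 3 input bytes, rounded up.
    return 4 * ((raw_bytes + 2) // 3)


def check_upload_size(raw_bytes: int) -> int:
    """Raise if the raw file exceeds the limit; otherwise return the encoded size."""
    if raw_bytes > MAX_FILE_BYTES:
        raise ValueError(f"file is {raw_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return base64_length(raw_bytes)
```

For a local file, call it as `check_upload_size(Path("q3-report.pdf").stat().st_size)`. The return value approximates how much the file contributes to the request body.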

Good patterns for large files:

* Ask for a specific task instead of a broad "analyze everything" prompt.
* Include only the documents needed for the current answer.
* Use models with larger `availableContextTokens` for long reports or codebases.
* Put stable, repeated documents before dynamic user questions if you are also using [prompt caching](/guides/features/prompt-caching).
* Use `stream: true` when you expect a long response.
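
The streaming suggestion in the list above could look like the following with the OpenAI Python client used earlier on this page. This is a sketch, not a definitive implementation: `stream_answer` is a hypothetical helper, `client` is expected to be an `OpenAI` instance configured for Venice, and `content` is a content array of text and file blocks.

```python
from typing import Any


def stream_answer(client: Any, model: str, content: list) -> str:
    """Print the response as it arrives and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": content}],
        stream=True,
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices (for example, usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:  # delta content can be None or empty
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)
```

Streaming does not reduce token usage or total generation time. It only lets you display output while a long answer is still being produced.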

## File Inputs vs. Text Parser

Use chat file inputs when you want the model to reason over the file immediately.

Use the [Text Parser API](/api-reference/endpoint/augment/text-parser) when you want to extract text first, inspect the token count, store the extracted text in your own system, or send the same extracted text to multiple requests.

| Need                                         | Use                                               |
| -------------------------------------------- | ------------------------------------------------- |
| Ask a model about a document in one request  | Chat file input                                   |
| Extract text without model inference         | Text Parser API                                   |
| Check extracted token count before prompting | Text Parser API                                   |
| Reuse extracted text across many requests    | Text Parser API, then include the text in prompts |

## Best Practices

* Include `filename` whenever possible, especially when sending multiple files.
* Put the instruction before the file blocks so the model knows the task before reading the extracted content.
* Use public URLs only for files that can be fetched without cookies, headers, or signed session state.
* Prefer base64 data URLs for private files or files generated inside your application.
* Ask focused questions and specify the output format you want.
* For structured extraction, combine file inputs with [structured responses](/guides/features/structured-responses).
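
As an illustration of the last point, a file block can be paired with a JSON schema. This sketch assumes that structured responses follow the OpenAI-compatible `response_format` shape with `type: "json_schema"`; confirm the exact fields in the structured responses guide. The schema, its field names, and the `extraction_request` helper are all hypothetical.

```python
# Hypothetical schema for illustration; adjust the fields to your document.
CONTRACT_SCHEMA = {
    "type": "object",
    "properties": {
        "governing_law": {"type": "string"},
        "renewal_terms": {"type": "string"},
        "termination_rights": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["governing_law", "renewal_terms", "termination_rights"],
    "additionalProperties": False,
}


def extraction_request(model: str, file_data: str, filename: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Instruction first, then the file block.
                    {"type": "text", "text": "Extract the contract terms from this agreement."},
                    {"type": "file", "file": {"file_data": file_data, "filename": filename}},
                ],
            }
        ],
        # Assumed OpenAI-compatible structured output format.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "contract_terms", "strict": True, "schema": CONTRACT_SCHEMA},
        },
    }
```

Pass the result as `client.chat.completions.create(**extraction_request(...))`, then parse `response.choices[0].message.content` with `json.loads`.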

## Troubleshooting

<AccordionGroup>
  <Accordion title="The model says it cannot access the file">
    Make sure the message content uses an array and includes a `type: "file"` block. If you used a URL, verify it is publicly reachable without authentication.
  </Accordion>

  <Accordion title="The request is slow or expensive">
    The file may extract to a large amount of text. Narrow the task, send fewer files, or pre-extract and trim the text with the Text Parser API. If the extracted text does not fit in the context window, switch to a model with more `availableContextTokens`.
  </Accordion>

  <Accordion title="The response ignores one of my files">
    Give each file a descriptive `filename` and refer to the filenames directly in your prompt. For example, "Compare `policy-v1.pdf` against `policy-v2.pdf`."
  </Accordion>

  <Accordion title="A model rejects the file content">
    File inputs are available on compatible chat models. Check the [Models page](/models/overview) for current model capabilities and context limits, or try a current large-context text model.
  </Accordion>
</AccordionGroup>
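
The first troubleshooting item can be checked locally before sending a request. The following is a minimal sketch with a hypothetical helper name, `check_messages`. It only inspects the shape of the payload and does not verify that a URL is reachable or that the base64 data decodes.

```python
def check_messages(messages: list) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    file_blocks = 0
    for i, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain-text content cannot carry a file block
        if not isinstance(content, list):
            problems.append(f"messages[{i}].content is neither a string nor an array")
            continue
        for j, block in enumerate(content):
            if not isinstance(block, dict) or block.get("type") != "file":
                continue
            file_blocks += 1
            data = (block.get("file") or {}).get("file_data")
            # file_data must be a data URL or an http(s) URL, not bare base64.
            if not isinstance(data, str) or not data.startswith(("data:", "http://", "https://")):
                problems.append(
                    f"messages[{i}].content[{j}].file.file_data is not a data URL or http(s) URL"
                )
    if file_blocks == 0:
        problems.append('no content block with type "file" was found')
    return problems
```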