Text Parser

Extracts text from a document file. Supports PDF, DOCX, PPTX, XLSX, and plain text formats. Upload a file via multipart/form-data.

Privacy: Text parsing runs entirely in-memory on Venice’s infrastructure with zero data retention. Documents are processed and immediately discarded — no content is stored or logged.

Authentication: This endpoint accepts either a Bearer API key or a SIGN-IN-WITH-X header for x402 wallet-based authentication. The legacy X-Sign-In-With-X header is also accepted during migration. When using x402, a 402 Payment Required response indicates insufficient balance and includes top-up instructions.
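
The two authentication modes described above can be sketched as a small helper. This is a minimal illustration, not Venice's client code: the function names are hypothetical, and the SIGN-IN-WITH-X value is treated as an opaque, pre-built string because its format is not described on this page.

```python
from typing import Optional


def build_auth_headers(api_key: Optional[str] = None,
                       siwx_value: Optional[str] = None) -> dict:
    """Return request headers for exactly one of the two auth modes.

    `siwx_value` is a hypothetical, already-constructed SIGN-IN-WITH-X
    header value; how it is produced is outside the scope of this sketch.
    """
    if (api_key is None) == (siwx_value is None):
        raise ValueError("provide exactly one of api_key or siwx_value")
    if api_key is not None:
        return {"Authorization": f"Bearer {api_key}"}
    return {"SIGN-IN-WITH-X": siwx_value}


def needs_top_up(status_code: int, request_headers: dict) -> bool:
    """True when an x402-authenticated request came back 402 Payment Required."""
    return status_code == 402 and "SIGN-IN-WITH-X" in request_headers
```

A caller would check needs_top_up after each request and, when it returns True, read the top-up instructions from the response body.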

POST /api/v1/augment/text-parser

curl --request POST \
  --url https://api.venice.ai/api/v1/augment/text-parser \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form response_format=json

{
  "text": "<string>",
  "tokens": 123
}

This is an experimental API. The request and response format may change without notice.

Upload a document file via multipart/form-data using the file field. Supported formats are PDF, DOCX, PPTX, XLSX, and plain text files, up to 25MB.

Set response_format to json (the default) for structured output containing the extracted text and its token count, or to text for the raw extracted text only.

Pricing: $0.01 per request.

Example (cURL)

curl -X POST https://api.venice.ai/api/v1/augment/text-parser \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -F "file=@example-file" \
  -F "response_format=json"
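
For readers who prefer Python, the same request can be approximated with the standard library alone. This is a sketch under stated assumptions rather than an official client: build_multipart and parse_document are hypothetical names, and sending the file part as application/octet-stream is an assumption about what the server accepts.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.venice.ai/api/v1/augment/text-parser"


def build_multipart(filename: str, file_bytes: bytes,
                    response_format: str = "json"):
    """Encode the `file` and `response_format` fields as multipart/form-data.

    Returns (body, content_type). The filename is not escaped, so it
    should not contain double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    format_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="response_format"\r\n\r\n'
        f"{response_format}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return file_header + file_bytes + format_part, (
        f"multipart/form-data; boundary={boundary}"
    )


def parse_document(path: str, api_key: str, response_format: str = "json"):
    """POST the file at `path` and return a dict (json) or a str (text)."""
    with open(path, "rb") as handle:
        file_bytes = handle.read()
    body, content_type = build_multipart(
        os.path.basename(path), file_bytes, response_format
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        payload = response.read().decode("utf-8")
    return json.loads(payload) if response_format == "json" else payload
```

Calling parse_document("example-file", os.environ["VENICE_API_KEY"]) would mirror the cURL command above. Unlike cURL's -F option, urllib does not generate the multipart boundary, which is why the body is assembled by hand here.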

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data

file (required)

The document file to parse. Supported formats: PDF, DOCX, PPTX, XLSX, and plain text files. Maximum size: 25MB.
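
Because every request is billed, it can be worth rejecting obviously invalid files before uploading. The check below is a minimal sketch: check_upload is a hypothetical helper, the extension list is inferred from the format names above (with .txt standing in for plain text files), and 25MB is read as 25 * 1024 * 1024 bytes, which may not match the server's exact limit.

```python
import os

# Assumption: extensions inferred from the documented format names;
# ".txt" stands in for "plain text files".
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".txt"}

# Assumption: "25MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}"
        )
    return problems
```

The server remains the authority on what it accepts; this only filters out requests that would clearly fail.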

response_format (enum<string>, default: json)

The format of the response output: "json" returns structured JSON with the extracted text and token count; "text" returns only the extracted text.

Available options: json, text

Response

Text extraction completed successfully

Text parser response containing extracted text and token count.

text (string, required)

The extracted text content from the document.

tokens (number, required)

The token count of the extracted text.
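
Since both fields are marked required and the API is experimental, a client may want to validate the JSON response before using it. The following is one possible sketch: ParsedDocument and parse_response are hypothetical names, and typing tokens as an integer is an assumption based on the sample response.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedDocument:
    text: str
    tokens: int  # documented as "number"; int is an assumption


def parse_response(raw: str) -> ParsedDocument:
    """Validate a json-format response body and wrap it in a dataclass."""
    payload = json.loads(raw)
    missing = [key for key in ("text", "tokens") if key not in payload]
    if missing:
        raise ValueError(
            f"response is missing required field(s): {', '.join(missing)}"
        )
    return ParsedDocument(text=payload["text"], tokens=payload["tokens"])
```

Failing loudly on a missing field makes a change in the response format visible immediately instead of surfacing later as a KeyError.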
