Speech-to-Text Models | Venice API Docs

Usage

Speech-to-text models transcribe spoken audio into written text. They are accessed via the Audio Transcriptions API.

Supported input audio formats: mp3, mp4, mpeg, mpga, m4a, wav, webm, flac, ogg
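
A client may want to reject unsupported files before uploading them. The sketch below checks a filename's extension against the list above. The helper name `is_supported_audio` is hypothetical, and the check looks only at the extension, not the file contents; how the API itself validates uploads is not stated here.

```python
from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = frozenset(
    {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "flac", "ogg"}
)


def is_supported_audio(filename: str) -> bool:
    """Return True if the file's extension is in the supported list.

    Hypothetical helper: it inspects only the final extension,
    case-insensitively, and never opens the file.
    """
    suffix = Path(filename).suffix  # ".MP3", or "" when there is no extension
    return suffix.lstrip(".").lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("talk.MP3")` is true, while `is_supported_audio("notes.txt")` and a name with no extension are both false.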

Response format	Description
`json`	Default. Returns `{ "text": "..." }`.
`text`	Plain transcribed text.
`srt`	SubRip subtitle format with timestamps.
`vtt`	WebVTT subtitle format with timestamps.
`verbose_json`	Full response with segment-level timestamps and metadata.
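
Because the body differs by format, client code usually branches on the format it requested. The following is a minimal sketch, assuming the raw response body is already available as a string. The function and parameter names are illustrative, and the only field relied on is the `text` key documented for `json`; the layout of `verbose_json` should be taken from the API reference.

```python
import json

# Format names taken from the table above.
JSON_FORMATS = {"json", "verbose_json"}
TEXT_FORMATS = {"text", "srt", "vtt"}


def parse_transcription(body: str, response_format: str = "json"):
    """Decode a response body according to the format that was requested.

    JSON formats are parsed into Python objects; plain-text and subtitle
    formats are returned unchanged as strings.
    """
    if response_format in JSON_FORMATS:
        return json.loads(body)
    if response_format in TEXT_FORMATS:
        return body
    raise ValueError(f"unknown response format: {response_format!r}")
```

With the default format, `parse_transcription('{"text": "hello"}')["text"]` yields the transcript string. For `srt` and `vtt` the returned string can be written straight to a subtitle file.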

Transcription is billed per second of input audio. See the Audio Transcriptions API for request examples and parameter details.
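
Per-second billing makes cost easy to estimate ahead of time. The sketch below multiplies audio duration by a per-second rate. The rate shown is a made-up placeholder, not a real price, and whether partial seconds are rounded is not stated here, so treat the result as an approximation.

```python
from decimal import Decimal


def estimate_cost(duration_seconds: float, price_per_second: Decimal) -> Decimal:
    """Estimate the charge for one transcription: duration x per-second rate.

    Decimal avoids binary floating-point drift in currency arithmetic.
    No rounding of partial seconds is applied (the billing rule is unknown).
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return Decimal(str(duration_seconds)) * price_per_second


# Hypothetical rate for illustration only; look up the real per-model price.
EXAMPLE_RATE = Decimal("0.0001")
```

Calling `estimate_cost(90, EXAMPLE_RATE)` gives the estimated charge for a 90-second clip at the placeholder rate.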
