POST /api/v1/audio/transcriptions
curl --request POST \
  --url https://api.venice.ai/api/v1/audio/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form model=nvidia/parakeet-tdt-0.6b-v3 \
  --form response_format=json \
  --form timestamps=false
{
  "text": "<string>",
  "timestamps": {
    "word": [
      {
        "word": "<string>",
        "start": 123,
        "end": 123
      }
    ],
    "segment": [
      {
        "text": "<string>",
        "start": 123,
        "end": 123
      }
    ],
    "char": [
      {
        "char": "<string>",
        "start": 123,
        "end": 123
      }
    ]
  }
}
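
The same request can be made from any HTTP client. The Python sketch below is illustrative rather than official client code: it assumes the third-party requests library, an API key exposed as the VENICE_API_KEY environment variable, and a local example.wav file.

import os
import requests

# Endpoint and credentials (VENICE_API_KEY is assumed to be set in the environment).
url = "https://api.venice.ai/api/v1/audio/transcriptions"
headers = {"Authorization": f"Bearer {os.environ['VENICE_API_KEY']}"}

# Multipart form upload: requests sets the multipart/form-data boundary automatically.
with open("example.wav", "rb") as audio:
    response = requests.post(
        url,
        headers=headers,
        files={"file": audio},
        data={
            "model": "nvidia/parakeet-tdt-0.6b-v3",
            "response_format": "json",
            "timestamps": "false",
        },
    )

response.raise_for_status()
print(response.json()["text"])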

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data

Request to transcribe audio to text.

file (file)

The audio file object (not a base64 string). Supported formats: WAV, WAVE, FLAC, M4A, AAC, MP4, MP3.
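
Because only the formats listed above are accepted, it can be worth validating a file's extension before uploading. The helper below is purely illustrative; the SUPPORTED_EXTENSIONS set and function name are this page's own, not part of the Venice API.

from pathlib import Path

# Extensions accepted by the endpoint, per the list above (illustrative helper).
SUPPORTED_EXTENSIONS = {".wav", ".wave", ".flac", ".m4a", ".aac", ".mp4", ".mp3"}

def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a supported format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported_audio("meeting.m4a"))  # True
print(is_supported_audio("notes.ogg"))    # False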

model (enum<string>, default: nvidia/parakeet-tdt-0.6b-v3)

The model to use for transcription. Currently only nvidia/parakeet-tdt-0.6b-v3 is supported.

Available options: nvidia/parakeet-tdt-0.6b-v3
Example: "nvidia/parakeet-tdt-0.6b-v3"

response_format (enum<string>, default: json)

The format of the transcript output: json or text.

Available options: json, text
Example: "json"

timestamps (boolean, default: false)

Whether to include timestamps in the response.

Example: false

Response

Transcription completed successfully

text (string)

The transcribed text.

timestamps (object)

Timestamps for the transcription, returned only when timestamps=true. Contains word, segment, and char arrays, each entry carrying start and end times.
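
When timestamps=true, the body includes the nested timestamps object shown in the schema at the top of this page. A minimal Python sketch for walking it, assuming the response has already been parsed into a dict:

def print_timestamps(result: dict) -> None:
    """Walk the timestamps object of a response made with timestamps=true.

    `result` is the parsed JSON body; field names mirror the schema above.
    """
    print("Transcript:", result["text"])
    timestamps = result.get("timestamps", {})

    # Word-level timings: each entry has "word", "start", and "end".
    for entry in timestamps.get("word", []):
        print(f'{entry["word"]}: {entry["start"]} -> {entry["end"]}')

    # Segment-level timings use "text" instead of "word"; a "char" array
    # with the same start/end structure is also present.
    for segment in timestamps.get("segment", []):
        print(f'{segment["text"]}: {segment["start"]} -> {segment["end"]}')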