POST /api/v1/speech-to-text
Convert speech to text
curl --request POST \
  --url https://waves-api.smallest.ai/api/v1/speech-to-text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form model=lightning \
  --form file=@example-file

ASR POST API

The ASR POST API converts speech to text from uploaded audio files. The endpoint accepts any file with an audio/* MIME type and returns the transcription produced by the Lightning ASR model, which detects the spoken language automatically.

Endpoint

POST https://waves-api.smallest.ai/api/v1/speech-to-text

Authentication

This endpoint requires authentication using a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY

Request Format

The API accepts multipart/form-data with the following fields:

Required Parameters

model
string
required
The ASR model to use for transcription.
Supported Values:
  • lightning - High-performance ASR model
file
file
required
Audio file to transcribe.
Supported Formats:
  • Any audio/* MIME type
  • Common formats: MP3, WAV, FLAC, M4A, OGG, OPUS, AAC
  • Maximum file size: 25MB
  • Maximum duration: 30 minutes
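The size and MIME type limits above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: validate_audio_file and MAX_FILE_SIZE_BYTES are hypothetical names, the MIME type is only guessed from the file extension, and "25MB" is assumed to mean 25 * 1024 * 1024 bytes. The 30-minute duration limit is not checked, because reading duration depends on the audio format.

```python
import mimetypes
import os

# Assumption: the documented "25MB" limit means 25 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024


def validate_audio_file(path: str) -> list:
    """Return a list of problems found; an empty list means the file looks uploadable."""
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a regular file"]
    problems = []
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    if size > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size} bytes, above the 25MB limit")
    # Guess the MIME type from the extension only; the file content is not inspected.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("audio/"):
        problems.append(f"guessed MIME type {mime_type!r} is not audio/*")
    return problems
```

Calling validate_audio_file("/path/to/your/audio.mp3") returns an empty list when no problem is detected. The server remains the authority on what it accepts.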

Response Format

Success Response (200)

{
  "text": "Hello, this is a sample transcription of the audio file.",
  "duration": 5.2,
  "language": "en"
}
text
string
The transcribed text from the audio file
duration
number
Duration of the audio file in seconds
language
string
Detected language of the audio (ISO 639-1 code)
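A client may want to turn the success response into a typed value. The sketch below assumes only the three fields documented above; Transcription and parse_transcription are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    text: str
    duration: float  # duration of the audio in seconds
    language: str    # ISO 639-1 code, for example "en"


def parse_transcription(body: str) -> Transcription:
    """Parse a 200 response body, failing loudly if a documented field is missing."""
    data = json.loads(body)
    missing = [key for key in ("text", "duration", "language") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return Transcription(
        text=str(data["text"]),
        duration=float(data["duration"]),
        language=str(data["language"]),
    )


# The sample body is the success response shown above.
sample = (
    '{"text": "Hello, this is a sample transcription of the audio file.", '
    '"duration": 5.2, "language": "en"}'
)
result = parse_transcription(sample)
print(result.language, result.duration)
```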

Code Examples

curl --location 'https://waves-api.smallest.ai/api/v1/speech-to-text' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--form 'model="lightning"' \
--form 'file=@"/path/to/your/audio.mp3"'
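The same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: build_multipart and transcribe are hypothetical helper names, and the file's Content-Type is guessed from its extension (falling back to application/octet-stream, which the server may reject because it is not audio/*).

```python
import mimetypes
import os
import urllib.request
import uuid

API_URL = "https://waves-api.smallest.ai/api/v1/speech-to-text"


def build_multipart(fields: dict, file_field: str, filename: str, file_bytes: bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    parts.append(
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
            f'Content-Type: {mime_type}\r\n\r\n'
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, api_key: str) -> bytes:
    """POST the audio file and return the raw JSON response body."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": "lightning"}, "file", os.path.basename(path), audio
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A call such as transcribe("/path/to/your/audio.mp3", "YOUR_API_KEY") sends the same two form fields as the curl command: model and file.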

Error Responses

400 Bad Request

{
  "error": "Invalid file format. Supported formats: audio/*"
}
Cause: Unsupported file format or missing required parameters
Solution: Ensure you’re uploading a valid audio file and include all required parameters

401 Unauthorized

{
  "error": "Unauthorized - Invalid API key"
}
Cause: Invalid, missing, or malformed API key
Solution: Verify your API key is correct and properly formatted in the Authorization header

413 Payload Too Large

{
  "error": "File size exceeds maximum limit of 25MB"
}
Cause: Audio file exceeds the 25MB size limit
Solution: Compress your audio file or use a shorter audio clip

429 Too Many Requests

{
  "error": "Rate limit exceeded. Please try again later."
}
Cause: You’ve exceeded the rate limit for your plan
Solution: Wait before making additional requests or upgrade your plan
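The error responses above share one shape: an HTTP status code and a JSON object with an "error" string. A client could map them to an exception along these lines. SpeechToTextError and error_from_response are hypothetical names, and the fallback messages are paraphrased from the causes listed above.

```python
import json

# Fallback descriptions for the documented status codes.
DOCUMENTED_ERRORS = {
    400: "Bad Request: unsupported file format or missing required parameters",
    401: "Unauthorized: invalid, missing, or malformed API key",
    413: "Payload Too Large: audio file exceeds the 25MB size limit",
    429: "Too Many Requests: rate limit exceeded for your plan",
}


class SpeechToTextError(Exception):
    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def error_from_response(status: int, body: str) -> SpeechToTextError:
    """Build an exception, preferring the server's "error" string when present."""
    try:
        message = json.loads(body).get("error")
    except (ValueError, AttributeError):
        message = None  # body was not a JSON object
    if not message:
        message = DOCUMENTED_ERRORS.get(status, "Unexpected error")
    return SpeechToTextError(status, message)
```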

Best Practices

Audio Quality

  • Sample Rate: 16kHz or higher for optimal results
  • Format: WAV or FLAC for best quality, MP3 for smaller file sizes
  • Duration: Keep files under 10 minutes for faster processing
  • Noise: Use clean audio with minimal background noise
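For uncompressed WAV files, the sample rate and duration recommendations above can be checked locally with Python's standard wave module. This sketch handles PCM WAV only; inspect_wav and MIN_SAMPLE_RATE_HZ are hypothetical names, and other formats would need a third-party audio library.

```python
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # recommendation from the list above


def inspect_wav(path: str) -> dict:
    """Read sample rate and duration from an uncompressed (PCM) WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        duration_seconds = wav.getnframes() / sample_rate
    return {
        "sample_rate": sample_rate,
        "duration_seconds": duration_seconds,
        "meets_sample_rate": sample_rate >= MIN_SAMPLE_RATE_HZ,
        "under_ten_minutes": duration_seconds < 10 * 60,
    }
```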

File Management

  • Size: Compress large files before uploading
  • Format: Use widely supported formats (MP3, WAV, M4A)
  • Encoding: Ensure proper audio encoding for your format

Error Handling

  • Always check the response status code
  • Implement retry logic for transient errors (429, 500)
  • Validate file format and size before uploading
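Retry logic for the transient statuses named above might look like the following sketch. It uses exponential backoff with jitter, which is a common choice rather than something this API prescribes. send_with_retry is a hypothetical helper, and send stands for any caller-supplied function that performs the request and returns (status_code, body).

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 500}  # the transient errors named above


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts:
            return status, body
        # Exponential backoff: base delays of 1s, 2s, 4s, ... plus up to 50% jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter exists so the waiting behavior can be replaced in tests. Errors such as 400, 401, and 413 are returned immediately, because repeating the same request would not change the outcome.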

Supported Languages

The Lightning ASR model supports automatic language detection and transcription for the following languages:
  • Italian (it)
  • Spanish (es)
  • Portuguese (pt)
  • English (en)
  • German (de)
  • French (fr)
  • Russian (ru)
  • Ukrainian (uk)
  • Polish (pl)
  • Dutch (nl)
  • Slovak (sk)
  • Czech (cs)
  • Bulgarian (bg)
  • Croatian (hr)
  • Romanian (ro)
  • Finnish (fi)
  • Hungarian (hu)
  • Swedish (sv)
  • Danish (da)
  • Estonian (et)
  • Maltese (mt)
  • Greek (el)
  • Lithuanian (lt)
  • Latvian (lv)
  • Slovenian (sl)
The model detects the language of your audio file automatically, so you do not need to specify a language in the request. Upload your audio and the transcription is returned in the detected language.
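Because the response reports the detected language as an ISO 639-1 code, a client can map that code back to the list above. The table below simply restates the documented languages; SUPPORTED_LANGUAGES and language_name are hypothetical names for illustration.

```python
from typing import Optional

# ISO 639-1 code -> language name, restating the supported languages listed above.
SUPPORTED_LANGUAGES = {
    "it": "Italian", "es": "Spanish", "pt": "Portuguese", "en": "English",
    "de": "German", "fr": "French", "ru": "Russian", "uk": "Ukrainian",
    "pl": "Polish", "nl": "Dutch", "sk": "Slovak", "cs": "Czech",
    "bg": "Bulgarian", "hr": "Croatian", "ro": "Romanian", "fi": "Finnish",
    "hu": "Hungarian", "sv": "Swedish", "da": "Danish", "et": "Estonian",
    "mt": "Maltese", "el": "Greek", "lt": "Lithuanian", "lv": "Latvian",
    "sl": "Slovenian",
}


def language_name(code: str) -> Optional[str]:
    """Return the language name for a detected code, or None if it is not listed."""
    return SUPPORTED_LANGUAGES.get(code.strip().lower())
```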

Authorizations

Authorization
string
header
required

API key authentication using Bearer token format. Include your API key in the Authorization header as: Bearer YOUR_API_KEY

Body

multipart/form-data

Response

200
application/json

Speech transcribed successfully

The response is of type object.