Fastest real-time speech-to-text transcription using the Lightning STT API.
The Waves speech-to-text (STT) stack processes audio via https://waves-api.smallest.ai/api/v1/lightning/get_text and returns low-latency transcripts, with configurable languages, formats, and pricing tiers suited to enterprise deployments.
Feature highlights
Our models specialize in processing audio to preserve information that is often lost during conventional speech-to-text conversion.
- 30+ languages – automatic language detection or ISO 639-1 codes (en, hi, etc.).
- Diarization – identify and separate generated text into speaker turns.
- Timestamps – receive sentence-level and word-level timing information.
- Age prediction – estimate the age group of each speaker.
- Gender prediction – detect the gender of speakers.
- Emotion detection – reports emotional tone as a strength value for each of 5 core emotion types.
- Low latency – streaming pipeline tuned for ~64 ms time to first transcript.
Supported languages
| Language | Code |
|---|---|
| Italian | it |
| Spanish | es |
| English | en |
| Portuguese | pt |
| Hindi | hi |
| German | de |
| French | fr |
| Ukrainian | uk |
| Russian | ru |
| Kannada | kn |
| Malayalam | ml |
| Polish | pl |
| Marathi | mr |
| Gujarati | gu |
| Czech | cs |
| Slovak | sk |
| Telugu | te |
| Oriya (Odia) | or |
| Dutch | nl |
| Bengali | bn |
| Latvian | lv |
| Estonian | et |
| Romanian | ro |
| Punjabi | pa |
| Finnish | fi |
| Swedish | sv |
| Bulgarian | bg |
| Tamil | ta |
| Hungarian | hu |
| Danish | da |
| Lithuanian | lt |
| Maltese | mt |
Use language=multi to auto-detect across the full list or specify one of the codes above to pin the model to a single language.
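As a rough illustration of the language option, the sketch below validates a code against the table above and builds a request URL for the endpoint. Several details are assumptions not confirmed by this page: that `language` is passed as a query parameter, that audio is sent as a raw POST body, and that authentication uses a Bearer token. The helper names `build_url` and `transcribe` are hypothetical.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://waves-api.smallest.ai/api/v1/lightning/get_text"

# ISO 639-1 codes copied from the supported-languages table.
SUPPORTED_LANGUAGES = {
    "it", "es", "en", "pt", "hi", "de", "fr", "uk", "ru", "kn", "ml",
    "pl", "mr", "gu", "cs", "sk", "te", "or", "nl", "bn", "lv", "et",
    "ro", "pa", "fi", "sv", "bg", "ta", "hu", "da", "lt", "mt",
}


def build_url(language: str = "multi") -> str:
    """Return the endpoint URL with a validated language setting.

    "multi" requests auto-detection; any other value must be a listed code.
    Assumption: the API reads `language` from the query string.
    """
    if language != "multi" and language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {language!r}")
    return f"{ENDPOINT}?{urllib.parse.urlencode({'language': language})}"


def transcribe(audio_path: str, api_key: str, language: str = "multi") -> bytes:
    """Send an audio file and return the raw response body.

    Assumptions: raw-bytes POST body, Bearer-token auth header, and an
    octet-stream content type. Check the API reference before relying on them.
    """
    with open(audio_path, "rb") as audio_file:
        audio = audio_file.read()
    request = urllib.request.Request(
        build_url(language),
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Validating the code locally catches typos such as `language=eng` before any audio is uploaded; pinning a single language is likely preferable when the spoken language is known in advance.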
Next steps