POST /api/v1/lightning-v3.1/get_speech
Generate speech from text (Lightning V3.1)
curl --request POST \
  --url https://waves-api.smallest.ai/api/v1/lightning-v3.1/get_speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "<string>",
  "voice_id": "<string>",
  "sample_rate": 44100,
  "speed": 1,
  "language": "auto",
  "output_format": "pcm",
  "pronunciation_dicts": [
    "<string>"
  ]
}
'

Overview

Lightning v3.1 is a text-to-speech model that generates natural, expressive speech at sample rates up to 44.1 kHz.

Key Features

  • Voice Cloning Support: Compatible with cloned voices
  • Ultra-Low Latency: Optimized for real-time applications
  • Multi-Language: Supports English (en) and Hindi (hi)
  • Multiple Output Formats: PCM, MP3, WAV, and mulaw
  • Flexible Sample Rates: 8000, 16000, 24000, or 44100 Hz
  • Speed Control: Adjustable from 0.5x to 2x speed

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <api_key>, where <api_key> is your API key.
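
As a minimal sketch of the header format described above, the following Python snippet prepares an authenticated request with the standard library's urllib.request. The endpoint URL comes from the curl example; the function name build_request and the voice ID "example_voice" are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this reference.
API_URL = "https://waves-api.smallest.ai/api/v1/lightning-v3.1/get_speech"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated POST request.

    build_request is a hypothetical helper, not part of any official SDK.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending requires a real API key and network access:
# with urllib.request.urlopen(build_request(key, payload)) as resp:
#     audio = resp.read()
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.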

Body

application/json

text (string, required)

The text to convert to speech.

voice_id (string, required)

The voice identifier to use for speech generation.

sample_rate (enum<integer>, default: 44100)

The sample rate of the generated audio, in Hz.

Available options: 8000, 16000, 24000, 44100
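
The sample rate determines how much raw audio data is produced per second. The sketch below works through that arithmetic for 16-bit PCM. It assumes mono output (one channel), which this reference does not state; both function names are hypothetical.

```python
BYTES_PER_SAMPLE = 2  # int16 samples are 2 bytes each


def pcm_bytes_per_second(sample_rate: int, channels: int = 1) -> int:
    """Raw PCM data rate. channels=1 (mono) is an assumption."""
    return sample_rate * BYTES_PER_SAMPLE * channels


def pcm_duration_seconds(num_bytes: int, sample_rate: int, channels: int = 1) -> float:
    """Playback duration of a raw int16 PCM buffer."""
    return num_bytes / pcm_bytes_per_second(sample_rate, channels)
```

Under these assumptions, one second of audio at 44100 Hz occupies 88200 bytes, and one second at 8000 Hz occupies 16000 bytes.
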

speed (number, default: 1)

Speed multiplier for the generated speech; 1 is normal speed.

Required range: 0.5 <= x <= 2

language (enum<string>, default: auto)

Language code for text normalization (e.g., how numbers, dates, and abbreviations are spelled out). Set to 'auto' for automatic language detection, or specify a language code like 'en' or 'hi'.

Available options: auto, en, hi, ta, es

output_format (enum<string>, default: pcm)

The format of the output audio.

Available options: pcm, mp3, wav, mulaw

pronunciation_dicts (string[])

The IDs of the pronunciation dictionaries to use for speech generation. Each array item is the ID of one pronunciation dictionary.
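
The body parameters above can be checked on the client before a request is sent. The following sketch builds the JSON body and enforces the documented options, ranges, and defaults. The function name build_payload and the voice ID used in the usage note are hypothetical; server-side validation may differ.

```python
# Allowed values copied from the parameter descriptions in this reference.
SAMPLE_RATES = {8000, 16000, 24000, 44100}
LANGUAGES = {"auto", "en", "hi", "ta", "es"}
OUTPUT_FORMATS = {"pcm", "mp3", "wav", "mulaw"}


def build_payload(text, voice_id, sample_rate=44100, speed=1.0,
                  language="auto", output_format="pcm",
                  pronunciation_dicts=None) -> dict:
    """Build the request body, raising ValueError on out-of-range values."""
    if not text or not voice_id:
        raise ValueError("text and voice_id are required")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"sample_rate must be one of {sorted(SAMPLE_RATES)}")
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must satisfy 0.5 <= speed <= 2")
    if language not in LANGUAGES:
        raise ValueError(f"language must be one of {sorted(LANGUAGES)}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(OUTPUT_FORMATS)}")

    payload = {
        "text": text,
        "voice_id": voice_id,
        "sample_rate": sample_rate,
        "speed": speed,
        "language": language,
        "output_format": output_format,
    }
    # pronunciation_dicts is optional, so it is included only when provided.
    if pronunciation_dicts:
        payload["pronunciation_dicts"] = list(pronunciation_dicts)
    return payload
```

For example, build_payload("Hello", "example_voice", speed=1.5) returns a body with the documented defaults for every other field, while speed=3 raises ValueError.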

Response

Synthesized speech retrieved successfully.

A PCM int16 WAV file at the specified sample rate.
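
If the response body turns out to be raw, headerless int16 samples (an assumption for output_format "pcm"; this reference does not say whether a WAV header is included), the samples can be wrapped in a WAV container with Python's standard wave module. The sketch below also assumes mono audio; pcm_to_wav_bytes is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int, channels: int = 1) -> bytes:
    """Wrap raw int16 PCM samples in a WAV container.

    Assumes headerless 16-bit samples; channels=1 (mono) is an assumption.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = int16
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

The sample_rate passed here should match the sample_rate sent in the request; otherwise the audio plays back at the wrong speed and pitch.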