POST /api/v1/lightning-v3.1/get_speech
Generate speech from text (Lightning V3.1)
curl --request POST \
  --url https://waves-api.smallest.ai/api/v1/lightning-v3.1/get_speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "<string>",
  "voice_id": "<string>",
  "sample_rate": 44100,
  "speed": 1,
  "language": "en",
  "output_format": "pcm",
  "pronunciation_dicts": [
    "<string>"
  ]
}
'
"<string>"

Overview

Lightning v3.1 is a 44.1 kHz text-to-speech model that delivers natural, expressive, and realistic speech synthesis.

Key Features

  • Voice Cloning Support: Compatible with cloned voices
  • Ultra-Low Latency: Optimized for real-time applications
  • Multi-Language: Supports English (en) and Hindi (hi)
  • Multiple Output Formats: PCM, MP3, WAV, and mulaw
  • Flexible Sample Rates: 8000, 16000, 24000, or 44100 Hz
  • Speed Control: Adjustable from 0.5x to 2x speed

Available Voices

Voice ID     Gender   Accent
sophia       Female   US
sandra       Female   US
rachel       Female   US
lauren       Female   US
hannah       Female   US
vanessa      Female   US
brooke       Female   US
megan        Female   US
robert       Male     US
johnny       Male     US
ethan        Male     US
lucas        Male     US
daniel       Male     US
edward       Male     British
vaibhav      Male     Indian
hitesh       Male     Indian
gaurav       Male     Indian
vivaan       Male     Indian
arjun        Male     Indian
kunal        Male     Indian
siddharth    Male     Indian
advika       Female   Indian
aisha        Female   Indian
yuvika       Female   Indian
ishani       Female   Indian
anuja        Female   Indian

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
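
As a rough illustration of the header described above, the following Python sketch builds the same POST request as the curl example using only the standard library. The helper names `build_request` and `send`, and the choice of `urllib`, are assumptions made for this example and are not part of the API.

```python
import json
import urllib.request

API_URL = "https://waves-api.smallest.ai/api/v1/lightning-v3.1/get_speech"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer token."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> bytes:
    """Send the request and return the raw response body (audio bytes)."""
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

A call might look like `send(build_request(token, {"text": "Hello", "voice_id": "sophia"}))`, where `token` holds your auth token.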

Body

application/json
text
string
required

The text to convert to speech.

voice_id
string
required

The voice identifier to use for speech generation.

sample_rate
enum<integer>
default:44100

The sample rate for the generated audio.

Available options: 8000, 16000, 24000, 44100
speed
number
default:1

The speed of the generated speech.

Required range: 0.5 <= x <= 2
language
enum<string>
default:en

Determines how numbers are spelled out. If set to 'en', numbers will be read in English. If set to 'hi', numbers will be read in Hindi.

Available options: en, hi
output_format
enum<string>
default:pcm

The format of the output audio.

Available options: pcm, mp3, wav, mulaw
pronunciation_dicts
string[]

The IDs of the pronunciation dictionaries to use for speech generation. Each array entry is the ID of one pronunciation dictionary.
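
The body parameters listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_payload` is a hypothetical helper, and the allowed values and defaults are copied from the parameter descriptions on this page. Whether the server rejects out-of-range values in the same way is not stated here.

```python
from typing import List, Optional

# Allowed values as listed in the parameter descriptions.
ALLOWED_SAMPLE_RATES = (8000, 16000, 24000, 44100)
ALLOWED_LANGUAGES = ("en", "hi")
ALLOWED_FORMATS = ("pcm", "mp3", "wav", "mulaw")


def build_payload(
    text: str,
    voice_id: str,
    sample_rate: int = 44100,
    speed: float = 1,
    language: str = "en",
    output_format: str = "pcm",
    pronunciation_dicts: Optional[List[str]] = None,
) -> dict:
    """Return a request body, raising ValueError on invalid input."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"sample_rate must be one of {ALLOWED_SAMPLE_RATES}")
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must satisfy 0.5 <= speed <= 2")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"language must be one of {ALLOWED_LANGUAGES}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")

    payload = {
        "text": text,
        "voice_id": voice_id,
        "sample_rate": sample_rate,
        "speed": speed,
        "language": language,
        "output_format": output_format,
    }
    # pronunciation_dicts is optional, so include it only when provided.
    if pronunciation_dicts:
        payload["pronunciation_dicts"] = list(pronunciation_dicts)
    return payload
```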

Response

Synthesized speech retrieved successfully.

A PCM int16 WAV file at the specified sample rate.
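
The response description does not say whether the bytes returned for the pcm format already include a WAV header. The following sketch handles both cases: it returns the data unchanged if it already starts with a RIFF/WAVE header, and otherwise wraps it as 16-bit PCM using Python's standard `wave` module. The function name `to_wav_bytes` is hypothetical, and mono audio (one channel) is an assumption not confirmed by this page.

```python
import io
import wave


def to_wav_bytes(audio: bytes, sample_rate: int = 44100, channels: int = 1) -> bytes:
    """Return audio as WAV bytes, wrapping raw int16 PCM when needed."""
    # Already a WAV container: leave it untouched.
    if audio[:4] == b"RIFF" and audio[8:12] == b"WAVE":
        return audio

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)   # assumption: mono output
        wav.setsampwidth(2)          # int16 means 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(audio)
    return buf.getvalue()
```

The `sample_rate` passed here should match the `sample_rate` sent in the request; otherwise playback speed and pitch will be wrong.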