Installation
Setup
Set your API key:
Usage
The skill triggers automatically when you ask your agent to generate speech or transcribe audio. Just talk naturally:
Text-to-Speech:
- “Say good morning in a male voice”
- “Read this aloud: The meeting is at 3pm”
- “Generate a voice note saying hello in Hindi”
Speech-to-Text:
- “Transcribe this audio file”
- “What did they say in this recording?”
Specific voices:
- “Say ‘namaste, kaise hain aap’ in advika’s voice”
- “Say ‘hola buenos dias’ using camilla”
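To illustrate how requests like these could be routed to speech generation or transcription, here is a minimal sketch. The keyword patterns and the `classify_request` function are hypothetical; the skill's actual trigger logic is not documented in this README.

```python
import re

# Hypothetical keyword patterns (assumptions, not the skill's real rules).
STT_PATTERNS = [r"\btranscribe\b", r"\bwhat did (they|he|she) say\b", r"\brecording\b"]
TTS_PATTERNS = [r"\bsay\b", r"\bread (this|it) aloud\b", r"\bvoice note\b"]


def classify_request(text: str) -> str:
    """Return 'stt', 'tts', or 'none' for a natural-language request."""
    lowered = text.lower()
    # Check transcription first so "What did they say in this recording?"
    # is not treated as speech generation just because it contains "say".
    if any(re.search(p, lowered) for p in STT_PATTERNS):
        return "stt"
    if any(re.search(p, lowered) for p in TTS_PATTERNS):
        return "tts"
    return "none"
```

Checking the transcription patterns first is a deliberate choice here: several transcription phrasings contain words that would otherwise match a speech-generation pattern.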
Voices
The skill auto-selects voices based on your request:
| Voice | Gender | Accent | Best For |
|---|---|---|---|
| sophia | Female | American | General use (default) |
| robert | Male | American | Professional (default male) |
| advika | Female | Indian | Hindi, code-switching |
| vivaan | Male | Indian | Bilingual English/Hindi |
| camilla | Female | Mexican/Latin | Spanish |
| zara | Female | American | Conversational |
| melody | Female | American | Storytelling |
| arjun | Male | Indian | English/Hindi bilingual |
| stella | Female | American | Expressive, warm |
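As a rough illustration of what auto-selection might look like, the sketch below maps a request to one of the voices in the table above. The voice names, genders, and accents come from the table; the matching rules and the `pick_voice` function are assumptions, since the README does not describe how the skill actually chooses.

```python
import re

# Voice data copied from the table above: name -> (gender, accent).
VOICES = {
    "sophia": ("female", "american"),
    "robert": ("male", "american"),
    "advika": ("female", "indian"),
    "vivaan": ("male", "indian"),
    "camilla": ("female", "mexican/latin"),
    "zara": ("female", "american"),
    "melody": ("female", "american"),
    "arjun": ("male", "indian"),
    "stella": ("female", "american"),
}
DEFAULT_FEMALE = "sophia"  # listed as the general default
DEFAULT_MALE = "robert"    # listed as the default male voice


def pick_voice(request: str) -> str:
    """Pick a voice name for a request (hypothetical rules, for illustration)."""
    lowered = request.lower()
    # An explicitly named voice always wins.
    for name in VOICES:
        if re.search(rf"\b{name}\b", lowered):
            return name
    # \bmale\b does not match inside "female", so this only fires on "male"/"man".
    wants_male = re.search(r"\b(male|man)\b", lowered) is not None
    if "hindi" in lowered:
        return "vivaan" if wants_male else "advika"
    if "spanish" in lowered:
        return "camilla"
    return DEFAULT_MALE if wants_male else DEFAULT_FEMALE
```

With these rules, a request that names a voice (for example "using camilla") bypasses the language and gender hints entirely.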
Features
- Sub-100ms text-to-speech latency via Lightning v3.1
- 64ms speech-to-text latency via Pulse
- Supports WAV, MP3, OGG, FLAC, M4A, and WebM audio formats (STT)
- 30+ languages with automatic language detection
- Speaker diarization and emotion detection (STT)
- Hindi-English code-switching
- Voice cloning: clone any voice from just 5 seconds of audio (Basic plan and above)
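Before sending a file for transcription, it can help to check it against the supported format list. The sketch below does this by file extension only. The `is_supported_audio` helper and the extension-to-format mapping are assumptions for illustration, and an extension does not guarantee the file's real container or codec.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above (STT input).
SUPPORTED_STT_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".m4a", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed STT audio format."""
    # Path.suffix returns only the final extension, e.g. ".gz" for "a.tar.gz".
    return Path(path).suffix.lower() in SUPPORTED_STT_EXTENSIONS
```

For example, `is_supported_audio("meeting.MP3")` is accepted because the comparison is case-insensitive, while `is_supported_audio("notes.txt")` is rejected.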

