Learn how to convert text to speech with real-time streaming synthesis.
The `WavesStreamingTTS` class provides high-performance text-to-speech conversion with configurable streaming parameters. This implementation is optimized for low-latency applications where immediate audio feedback is critical, such as voice assistants, live narration, or interactive applications.
Use a `TTSConfig` object to manage synthesis parameters:
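A minimal configuration sketch is shown below. The import path and the exact constructor signatures are assumptions; only the class names and the parameter names listed later on this page come from the documentation itself.

```python
# Sketch only: the import path and constructor signatures are assumptions.
from smallestai import TTSConfig, WavesStreamingTTS

config = TTSConfig(
    api_key="YOUR_SMALLEST_AI_API_KEY",  # your Smallest AI API key
    voice_id="aditi",                    # voice identifier
    language="en",                       # language code (default: "en")
    sample_rate=24000,                   # audio sample rate in Hz
)

# Assumed: the streaming client is constructed from the config object.
tts = WavesStreamingTTS(config)
```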
To convert text to speech, call the `synthesize` method:
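The sketch below assumes `synthesize` returns the complete audio as raw PCM bytes; the return type and the file-writing step are assumptions.

```python
# Sketch: assumes synthesize() returns the full audio as raw PCM bytes.
audio = tts.synthesize("Hello! This is a quick test of Waves text to speech.")

with open("output.raw", "wb") as f:
    f.write(audio)  # raw PCM at the configured sample_rate
```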
For real-time streaming output, use the `synthesize_streaming` method:
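The sketch below assumes `synthesize_streaming` yields audio chunks as they are generated; `play_chunk` is a hypothetical placeholder for whatever audio sink you use.

```python
# Sketch: assumes synthesize_streaming() is a generator of raw audio chunks.
def play_chunk(chunk: bytes) -> None:
    """Hypothetical audio sink; replace with real playback or network output."""
    ...

for chunk in tts.synthesize_streaming("Streaming lets playback begin before synthesis finishes."):
    play_chunk(chunk)
```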
The following parameters are available:

- `voice_id`: Voice identifier (e.g., “aditi”, “male-1”, “female-2”)
- `api_key`: Your Smallest AI API key
- `language`: Language code for synthesis (default: “en”)
- `sample_rate`: Audio sample rate in Hz (default: 24000)
- `speed`: Speech speed multiplier (default: 1.0 = normal speed, 0.5 = half speed, 2.0 = double speed)
- `consistency`: Voice consistency parameter (default: 0.5, range: 0.0-1.0)
- `enhancement`: Audio enhancement level (default: 1)
- `similarity`: Voice similarity parameter (default: 0, range: 0.0-1.0)
- `max_buffer_flush_ms`: Maximum buffer time in milliseconds before forcing audio output (default: 0)
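Putting the parameters together, a tuned low-latency configuration might look like the sketch below; the constructor signatures and the raw-bytes streaming output remain assumptions.

```python
# Sketch: parameter names follow the list above; signatures are assumptions.
config = TTSConfig(
    api_key="YOUR_SMALLEST_AI_API_KEY",
    voice_id="female-2",
    language="en",
    sample_rate=24000,
    speed=1.2,               # slightly faster than normal (1.0)
    consistency=0.7,         # steadier voice across the stream
    enhancement=1,
    similarity=0.0,
    max_buffer_flush_ms=40,  # force audio out at least every 40 ms
)

tts = WavesStreamingTTS(config)

# Write the streamed chunks to a raw PCM file as they arrive.
with open("narration.raw", "wb") as f:
    for chunk in tts.synthesize_streaming("Thanks for listening!"):
        f.write(chunk)
```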