Add voice capabilities to your OpenClaw agent. Generate speech with sub-100ms latency and transcribe audio with the Smallest AI skill.

Installation

# Via ClawHub (recommended)
clawhub install smallest-ai

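# Or manually: clone the repository and copy it into the skills directory
git clone https://github.com/smallest-inc/smallest-ai-openclaw.git
mkdir -p ~/.openclaw/skills   # make sure the skills directory exists first
cp -r smallest-ai-openclaw ~/.openclaw/skills/smallest-ai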

Setup

Get a free key at waves.smallest.ai, then set it as an environment variable:
export SMALLEST_API_KEY="your_key_here"
Restart the gateway:
openclaw gateway stop && openclaw gateway start
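
Before restarting, it can help to confirm that the key is actually set in the current shell. The following is a minimal sketch, not part of the skill: `check_smallest_key` is a hypothetical helper name, and it only checks that the variable is non-empty and no longer holds the placeholder from the example above. It does not validate the key against the service.

```shell
# Hypothetical helper (not shipped with the skill): check that
# SMALLEST_API_KEY is set to something other than the placeholder.
check_smallest_key() {
  if [ -z "${SMALLEST_API_KEY:-}" ]; then
    echo "SMALLEST_API_KEY is not set" >&2
    return 1
  fi
  if [ "$SMALLEST_API_KEY" = "your_key_here" ]; then
    echo "SMALLEST_API_KEY still holds the placeholder value" >&2
    return 1
  fi
  echo "SMALLEST_API_KEY is set"
}

# Possible usage: restart the gateway only if the check passes.
#   check_smallest_key && openclaw gateway stop && openclaw gateway start
```

Chaining the check with `&&` means the gateway is left untouched when the key is missing.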

Usage

The skill triggers automatically when you ask your agent to generate speech or transcribe audio. Just talk naturally:
Text-to-Speech:
  • “Say good morning in a male voice”
  • “Read this aloud: The meeting is at 3pm”
  • “Generate a voice note saying hello in Hindi”
Speech-to-Text:
  • “Transcribe this audio file”
  • “What did they say in this recording?”
Multilingual:
  • “Say ‘namaste, kaise hain aap’ in advika’s voice”
  • “Say ‘hola buenos dias’ using camilla”

Voices

The skill auto-selects voices based on your request:
Voice    Gender  Accent         Best For
sophia   Female  American       General use (default)
robert   Male    American       Professional (default male)
advika   Female  Indian         Hindi, code-switching
vivaan   Male    Indian         Bilingual English/Hindi
camilla  Female  Mexican/Latin  Spanish
zara     Female  American       Conversational
melody   Female  American       Storytelling
arjun    Male    Indian         English/Hindi bilingual
stella   Female  American       Expressive, warm
80+ more voices available. The agent picks the right voice based on language and gender preference.
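
To make the idea of language- and gender-based selection concrete, here is an illustrative sketch. It is not the skill's actual selection logic, which is not documented here. The function name `pick_voice`, the language codes (`en`, `hi`, `es`), and the fallback rules are all assumptions; only the voice names and their defaults come from the table above.

```shell
# Hypothetical sketch: map a language code and a gender preference to one
# of the voices listed in the table. The real skill may choose differently.
pick_voice() {
  lang="$1"    # assumed codes: en, hi, es
  gender="$2"  # "male" or "female"; anything else falls back to female
  case "$lang:$gender" in
    hi:male) echo "vivaan" ;;   # Indian male, bilingual English/Hindi
    hi:*)    echo "advika" ;;   # Indian female, Hindi and code-switching
    es:*)    echo "camilla" ;;  # the only Spanish voice listed in the table
    *:male)  echo "robert" ;;   # default male voice
    *)       echo "sophia" ;;   # overall default
  esac
}
```

For example, `pick_voice hi female` prints `advika`, and any unrecognized language with no gender preference falls back to `sophia`.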

Features

  • Sub-100ms text-to-speech latency via Lightning v3.1
  • 64ms speech-to-text latency via Pulse
  • Supports WAV, MP3, OGG, FLAC, M4A, and WebM audio formats (STT)
  • 30+ languages with automatic language detection
  • Speaker diarization and emotion detection (STT)
  • Hindi-English code-switching
  • Voice cloning: clone any voice with just 5 seconds of audio (Basic plan and above)
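
Since speech-to-text accepts a fixed set of audio formats, a quick local check can catch unsupported files before you ask the agent to transcribe them. The sketch below is an assumption-laden illustration: `is_supported_audio` is a hypothetical helper, and it looks only at the file extension, not at the actual file contents.

```shell
# Hypothetical helper: succeed if the file extension is one of the STT
# formats listed above (WAV, MP3, OGG, FLAC, M4A, WebM).
is_supported_audio() {
  # A name without a dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Lower-case the extension so "CLIP.WAV" and "clip.wav" are treated alike.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    wav|mp3|ogg|flac|m4a|webm) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_audio meeting.WAV && echo "ok to transcribe"` prints the message, while a file such as `clip.aac` is rejected.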