ASR WebSocket Best Practices
This guide covers best practices for implementing, optimizing, and troubleshooting the Waves ASR WebSocket API for production applications.
Audio Quality Optimization
Sample Rate Selection
Choose the optimal sample rate for your use case:
16 kHz (Recommended)
Best for: Speech recognition, real-time applications
- Optimal balance of quality and performance
- Lower bandwidth requirements
- Faster processing times
8 kHz
Best for: Telephony applications
- Standard for phone call quality
- Minimal bandwidth usage
- Good for voice-only content
44.1 kHz
Best for: High-fidelity audio
- Music or broadcast content
- Higher accuracy for complex audio
- Increased bandwidth and processing time
48 kHz
Best for: Video/multimedia
- Professional audio production
- Maximum quality requirements
- Higher resource consumption
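The bandwidth trade-off between these sample rates can be estimated with simple arithmetic. The sketch below assumes uncompressed 16-bit mono PCM and ignores WebSocket framing overhead; the function name is illustrative and not part of the Waves API.

```python
# Raw payload size of uncompressed PCM at each sample rate listed above.
# Assumptions: 16-bit samples, one channel, no framing or protocol overhead.

BYTES_PER_SAMPLE = 2  # 16-bit linear PCM
CHANNELS = 1          # mono


def pcm_bytes_per_second(sample_rate_hz: int) -> int:
    """Return the number of audio bytes produced per second of audio."""
    return sample_rate_hz * BYTES_PER_SAMPLE * CHANNELS


for rate in (8_000, 16_000, 44_100, 48_000):
    bytes_per_s = pcm_bytes_per_second(rate)
    print(f"{rate} Hz: {bytes_per_s} bytes/s ({bytes_per_s / 1024:.1f} KiB/s)")
```

Under these assumptions, 48 kHz audio needs three times the bandwidth of 16 kHz audio, which is why the lower rate is usually the better fit for speech.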
Audio Format Guidelines
Recommended Configuration:
16-bit Linear PCM (Recommended)
- Uncompressed, high quality
- Predictable bandwidth usage
- Wide compatibility
- Best accuracy/performance ratio
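Many capture libraries deliver samples as floats in the range -1.0 to 1.0, so a conversion step is often needed before sending 16-bit linear PCM. The following is a minimal sketch using only the standard library; it assumes little-endian byte order, which you should confirm against the encoding parameters you pass to the API. The helper name `float_to_pcm16` is hypothetical.

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit linear PCM bytes.

    Assumption: little-endian signed samples ("<h" in struct notation).
    """
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clip out-of-range values
        ints.append(int(round(s * 32767)))  # scale to the int16 range
    return struct.pack(f"<{len(ints)}h", *ints)


pcm = float_to_pcm16([0.0, 1.0, -1.0])
print(len(pcm))  # two bytes per sample
```

Clipping before scaling avoids integer overflow when the input briefly exceeds full scale.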
Audio Preprocessing
Implement client-side audio processing for better results.
Troubleshooting Guide
Common Issues and Solutions
Connection Refused or Timeout
Symptoms:
- WebSocket connection fails immediately
- Connection timeout errors
- “Failed to connect” messages
Solutions:
- Verify API Key: Ensure your API key is valid and properly formatted
- Check Subscription: Confirm you have an active Enterprise plan
- Network Issues: Test connectivity and check firewall settings
- Rate Limiting: Implement exponential backoff for reconnections
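Exponential backoff for reconnections could look roughly like the sketch below. It is not tied to any particular WebSocket library: `connect` is a hypothetical zero-argument callable that opens the connection and raises `OSError` on failure, and the default delays are illustrative values, not limits documented by Waves.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, max_delay=30.0, attempts=6):
    """Yield capped exponential delays: base, base*factor, ... up to max_delay."""
    for attempt in range(attempts):
        yield min(max_delay, base * factor ** attempt)


def connect_with_backoff(connect, sleep=time.sleep, jitter=random.random, **kwargs):
    """Call `connect()` until it succeeds, sleeping after each failure.

    `connect` is a hypothetical callable supplied by the caller; extra
    keyword arguments are forwarded to `backoff_delays`.
    """
    last_error = None
    for delay in backoff_delays(**kwargs):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            # Sleep a random fraction of the delay so that many clients
            # do not retry in lockstep.
            sleep(delay * jitter())
    raise ConnectionError("all reconnection attempts failed") from last_error
```

Passing `sleep` and `jitter` as parameters keeps the retry loop easy to test without real waiting.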
No Transcription Results
Symptoms:
- WebSocket connects but no transcription responses
- Audio sent but no text returned
- Silent failures
Solutions:
- Audio Format: Verify that the audio encoding matches the connection parameters
- Audio Quality: Ensure audio contains actual speech
- Chunk Size: Check that audio chunks are an appropriate size
- Parameters: Validate all connection parameters
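One quick way to rule out silent input is to measure the signal level of a chunk before sending it. The sketch below assumes 16-bit little-endian mono PCM; the 0.01 threshold is an arbitrary starting point rather than a documented value, and a level check only shows that the audio is not silent, not that it contains speech.

```python
import math
import struct


def rms_level(pcm: bytes) -> float:
    """Return the RMS amplitude of 16-bit little-endian PCM, scaled to 0.0-1.0."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


def looks_silent(pcm: bytes, threshold: float = 0.01) -> bool:
    """Heuristic check: True when the chunk's RMS level is below `threshold`."""
    return rms_level(pcm) < threshold
```

If every chunk reports as silent, the problem is more likely in capture or encoding than in the WebSocket session itself.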
High Latency or Delays
Symptoms:
- Slow transcription responses
- Poor real-time performance
Solutions:
- Reduce Chunk Size: Use smaller audio chunks (0.5 to 1 second)
- Optimize Audio Processing: Minimize client-side processing
- Network Optimization: Use a faster network connection
- Parameter Tuning: Adjust speechEndThreshold for faster responses
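To make the chunk-size advice concrete, a chunk duration can be converted to a byte count and used to slice the outgoing audio. This is a sketch under the assumption of 16-bit mono PCM; the helper names are hypothetical and the actual send call depends on your WebSocket client.

```python
def chunk_size_bytes(sample_rate_hz: int, chunk_seconds: float,
                     bytes_per_sample: int = 2) -> int:
    """Return the byte length of one chunk, aligned to whole samples."""
    return int(sample_rate_hz * chunk_seconds) * bytes_per_sample


def iter_chunks(pcm: bytes, size: int):
    """Yield consecutive slices of `pcm`, each at most `size` bytes long."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


size = chunk_size_bytes(16_000, 0.5)  # half a second at 16 kHz
audio = bytes(40_000)                 # placeholder: 1.25 s of silence
for chunk in iter_chunks(audio, size):
    print(len(chunk))                 # each chunk would be sent as one binary message
```

Computing the size from whole samples keeps every chunk boundary on a sample boundary, so no 16-bit sample is split across two messages.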