ASR WebSocket Best Practices

This guide covers best practices for implementing, optimizing, and troubleshooting the Waves ASR WebSocket API for production applications.

Audio Quality Optimization

Sample Rate Selection

Choose the optimal sample rate for your use case:

16 kHz (Recommended)

Best for: Speech recognition, real-time applications
  • Optimal balance of quality and performance
  • Lower bandwidth requirements
  • Faster processing times

8 kHz

Best for: Telephony applications
  • Standard for phone call quality
  • Minimal bandwidth usage
  • Good for voice-only content

44.1 kHz

Best for: High-fidelity audio
  • Music or broadcast content
  • Higher accuracy for complex audio
  • Increased bandwidth and processing time

48 kHz

Best for: Video/multimedia
  • Professional audio production
  • Maximum quality requirements
  • Higher resource consumption
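
The bandwidth differences between these sample rates follow directly from the size of uncompressed PCM audio. As a rough illustration, the sketch below computes the raw byte rate for each option, assuming 16-bit mono audio; the helper name `pcmBytesPerSecond` is made up for this example and is not part of the API.

```javascript
// Raw byte rate of uncompressed PCM: sampleRate * channels * bytesPerSample.
// pcmBytesPerSecond is a hypothetical helper for illustration only.
function pcmBytesPerSecond(sampleRate, channels = 1, bitsPerSample = 16) {
    return sampleRate * channels * (bitsPerSample / 8);
}

const sampleRates = [8000, 16000, 44100, 48000];
for (const rate of sampleRates) {
    const kbPerSecond = pcmBytesPerSecond(rate) / 1000;
    console.log(`${rate} Hz, 16-bit mono: ${kbPerSecond} kB/s`);
}
// 8000 Hz, 16-bit mono: 16 kB/s
// 16000 Hz, 16-bit mono: 32 kB/s
// 44100 Hz, 16-bit mono: 88.2 kB/s
// 48000 Hz, 16-bit mono: 96 kB/s
```

At 48 kHz the stream carries three times the data of a 16 kHz stream, which is the cost to weigh against any quality benefit.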

Audio Format Guidelines

Recommended Configuration:
const optimalConfig = {
    audioEncoding: 'linear16',
    audioSampleRate: '16000',
    audioChannels: '1',        // Mono for efficiency
    addPunctuation: 'true',
};

Format-Specific Tips:

16-bit Linear PCM (Recommended)
  • Uncompressed, high quality
  • Predictable bandwidth usage
  • Wide compatibility
  • Best accuracy/performance ratio
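
Because linear PCM has a fixed byte rate, a chunk of a given duration always has the same size, which makes send pacing easy to reason about. The following is a minimal sketch of splitting 16-bit mono samples into fixed-duration chunks. The function name `splitPcmIntoChunks` and the 100 ms default are assumptions made for illustration, not documented API requirements.

```javascript
// Split 16-bit mono PCM samples into fixed-duration chunks.
// splitPcmIntoChunks and the 100 ms default are illustrative assumptions.
function splitPcmIntoChunks(samples, sampleRate, chunkMs = 100) {
    const samplesPerChunk = Math.floor((sampleRate * chunkMs) / 1000);
    if (samplesPerChunk <= 0) {
        throw new RangeError('chunkMs is too small for this sample rate');
    }
    const chunks = [];
    for (let start = 0; start < samples.length; start += samplesPerChunk) {
        // slice() copies, so each chunk owns its ArrayBuffer and
        // chunk.buffer contains only that chunk's bytes.
        chunks.push(samples.slice(start, start + samplesPerChunk));
    }
    return chunks;
}

const oneSecond = new Int16Array(16000); // 1 s of silence at 16 kHz
const chunks = splitPcmIntoChunks(oneSecond, 16000);
console.log(chunks.length, chunks[0].byteLength); // 10 3200
```

Using `slice()` rather than `subarray()` matters when sending `chunk.buffer` over a WebSocket: a `subarray()` view shares the full underlying buffer, so sending its `.buffer` would transmit the entire recording instead of one chunk.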

Audio Preprocessing

Implement client-side audio processing for better results:
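// Audio preprocessing example
function preprocessAudio(audioBuffer) {
    const processedBuffer = new Float32Array(audioBuffer.length);

    // 1. Find the peak level with a loop; spreading a large buffer into
    //    Math.max(...) can exceed the engine's argument limit
    let maxValue = 0;
    for (let i = 0; i < audioBuffer.length; i++) {
        maxValue = Math.max(maxValue, Math.abs(audioBuffer[i]));
    }
    const normalizationFactor = maxValue > 0 ? 0.8 / maxValue : 1;

    // 2. Apply normalization and basic filtering
    let previousInput = 0;
    for (let i = 0; i < audioBuffer.length; i++) {
        const sample = audioBuffer[i] * normalizationFactor;

        // First-order high-pass (pre-emphasis) filter to reduce
        // low-frequency noise: y[n] = x[n] - 0.95 * x[n-1].
        // It must use the previous input sample; subtracting the previous
        // output instead would amplify high frequencies rather than cut lows.
        processedBuffer[i] = sample - 0.95 * previousInput;
        previousInput = sample;
    }

    // Filtered samples can briefly exceed [-1, 1]; clamp when converting to Int16
    return processedBuffer;
}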

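// Apply in the audio processor (a ScriptProcessorNode here; that node is
// deprecated in favor of AudioWorklet, but the conversion logic is the same)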
processor.onaudioprocess = (e) => {
    const inputData = e.inputBuffer.getChannelData(0);
    const processedData = preprocessAudio(inputData);
    
    // Convert to Int16 and send
    const int16Data = new Int16Array(processedData.length);
    for (let i = 0; i < processedData.length; i++) {
        int16Data[i] = Math.max(-32768, Math.min(32767, processedData[i] * 32768));
    }
    
    if (ws.readyState === WebSocket.OPEN) {
        ws.send(int16Data.buffer);
    }
};
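
The audio delivered to `onaudioprocess` arrives at the `AudioContext` sample rate, which may differ from the `audioSampleRate` declared in the configuration. If the two differ, the samples need to be resampled before conversion to Int16. The sketch below shows one simple approach, assuming block averaging is acceptable for speech; `downsample` is a hypothetical helper, and block averaging is only a crude anti-aliasing step, not a proper low-pass resampler.

```javascript
// Downsample Float32 audio by averaging the input samples that map to
// each output sample. downsample is a hypothetical helper; averaging is a
// crude low-pass, not a full-quality resampler.
function downsample(input, inputRate, outputRate) {
    if (outputRate > inputRate) {
        throw new RangeError('outputRate must not exceed inputRate');
    }
    if (outputRate === inputRate) {
        return Float32Array.from(input);
    }
    const ratio = inputRate / outputRate;
    const outputLength = Math.floor((input.length * outputRate) / inputRate);
    const output = new Float32Array(outputLength);
    for (let i = 0; i < outputLength; i++) {
        const start = Math.floor(i * ratio);
        const end = Math.min(input.length, Math.floor((i + 1) * ratio));
        let sum = 0;
        for (let j = start; j < end; j++) {
            sum += input[j];
        }
        output[i] = sum / (end - start);
    }
    return output;
}

const captured = new Float32Array(4800).fill(0.5); // 100 ms at 48 kHz
const resampled = downsample(captured, 48000, 16000);
console.log(resampled.length); // 1600
```

In the processor callback this would run on the output of `preprocessAudio`, using `audioContext.sampleRate` as the input rate. Some browsers also accept a `sampleRate` option when constructing the `AudioContext`, which can avoid manual resampling where it is supported.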

Troubleshooting Guide

Common Issues and Solutions