Pre-recorded best practices

Follow these recommendations to keep Lightning STT latencies low while preserving transcript fidelity.

Audio preprocessing workflow

Convert with FFmpeg

```shell
# Convert to 16 kHz mono, 16-bit WAV (recommended ingest format)
ffmpeg -i input.mp3 -ar 16000 -ac 1 -sample_fmt s16 output.wav

# Convert to MP3 with speech-friendly settings (16 kHz mono, 128 kb/s)
ffmpeg -i input.wav -ar 16000 -ac 1 -b:a 128k output.mp3
```

Python example

```python
from pydub import AudioSegment  # pydub shells out to FFmpeg for decoding

# Decode, resample to 16 kHz, and downmix to mono before export.
audio = AudioSegment.from_file("input.mp3")
audio = audio.set_frame_rate(16000).set_channels(1)
audio.export("output.wav", format="wav")
```
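After exporting, it is worth confirming that a file really is 16 kHz mono 16-bit PCM before submitting a large batch. A minimal sketch using Python's standard-library `wave` module (`check_format` is an illustrative name, not part of any API):

```python
import wave

def check_format(path: str) -> None:
    """Raise AssertionError if the WAV file is not 16 kHz, mono, 16-bit PCM."""
    with wave.open(path, "rb") as wf:
        assert wf.getframerate() == 16000, f"expected 16 kHz, got {wf.getframerate()} Hz"
        assert wf.getnchannels() == 1, f"expected mono, got {wf.getnchannels()} channels"
        assert wf.getsampwidth() == 2, f"expected 16-bit, got {8 * wf.getsampwidth()}-bit"

# check_format("output.wav")
```

Running this on every converted file before upload catches resampling mistakes early, when they are cheap to fix.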

JavaScript example

```javascript
import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

const ffmpeg = createFFmpeg({ log: true });
await ffmpeg.load(); // downloads the WebAssembly FFmpeg core on first use

// Copy the source file into ffmpeg.wasm's virtual filesystem,
// convert to 16 kHz mono WAV, then read the result back out.
ffmpeg.FS('writeFile', 'input.mp3', await fetchFile('input.mp3'));
await ffmpeg.run('-i', 'input.mp3', '-ar', '16000', '-ac', '1', 'output.wav');
const data = ffmpeg.FS('readFile', 'output.wav');
```

Quality checklist

  1. Use 16 kHz mono whenever possible; downsample higher-fidelity recordings.
  2. Normalize audio levels so peaks stay consistent across large batches.
  3. Remove silence at the beginning and end to avoid wasted compute.
  4. Handle multiple speakers by enabling diarization when agents and customers share a channel.
  5. Test with a sample clip before launching full backfills to validate accuracy and metadata.
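Items 2 and 3 above can be sketched without extra dependencies for 16-bit mono WAV files. This is a minimal illustration, not a production loudness pipeline; `SILENCE_THRESHOLD`, `TARGET_PEAK`, and the `process` name are assumptions chosen for the example:

```python
import array
import wave

SILENCE_THRESHOLD = 500   # abs sample value below which audio counts as silence (assumed)
TARGET_PEAK = 30000       # peak level after normalization, below the 16-bit max of 32767

def process(in_path: str, out_path: str) -> None:
    """Trim leading/trailing silence and peak-normalize a 16-bit mono WAV."""
    with wave.open(in_path, "rb") as wf:
        params = wf.getparams()
        samples = array.array("h", wf.readframes(wf.getnframes()))

    # Checklist item 3: drop silence before the first and after the last loud sample.
    loud = [i for i, s in enumerate(samples) if abs(s) >= SILENCE_THRESHOLD]
    if loud:
        samples = samples[loud[0]:loud[-1] + 1]

    # Checklist item 2: scale so the peak lands at TARGET_PEAK, keeping levels
    # consistent across a batch.
    peak = max((abs(s) for s in samples), default=0)
    if peak:
        scale = TARGET_PEAK / peak
        samples = array.array("h", (int(s * scale) for s in samples))

    with wave.open(out_path, "wb") as wf:
        wf.setparams(params._replace(nframes=len(samples)))
        wf.writeframes(samples.tobytes())

# process("output.wav", "clean.wav")
```

Peak normalization is the simplest choice here; for large heterogeneous batches, loudness-based normalization (e.g. FFmpeg's `loudnorm` filter) usually gives more consistent results.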