Skip to main content
Lightning STT evaluations revolve around four pillars:
  1. Accuracy – how close transcripts are to the ground truth.
  2. Latency & throughput – how quickly and efficiently results arrive.
  3. Enrichment quality – how reliable diarization, timestamps, and metadata are.

Accuracy metrics

Word Error Rate (WER)

  • Formula: WER = (Substitutions + Deletions + Insertions) / Total Words.
  • Interprets overall transcript fidelity; normalize casing/punctuation before computing.

Character Error Rate (CER)

  • Parallel to WER but at the character level; useful for languages with compact scripts or heavy compounding.

Sentence accuracy

  • Percentage of sentences that match ground truth exactly.
  • Tracks readability for QA/customer-support recaps.

Latency & throughput

Time to First Result (TTFR)

  • Measures the delay between request start and first interim token.
  • For real-time agents, keep TTFR below ~30 ms to maintain natural turn-taking.

End-to-end latency

  • Wall-clock time from submission to final transcript.
  • Report p50/p90/p95 to capture outliers introduced by long files or retries.

Real-Time Factor (RTF)

  • RTF = Processing Time / Audio Duration.
  • Values less than 1 indicate faster-than-real-time processing; Lightning STT typically runs near 0.4 RTF on clean inputs.

Enrichment quality

MetricWhat to watchWhy it matters
Diarization accuracy% of words with correct speaker_idCall-center QA, coaching, compliance
Word timestamp driftGap between predicted and reference timestampsSubtitle alignment and editing
Sentence-level timestamps% of audio covered by utterances segmentsChaptering, meeting notes
Emotion/age/gender precisionConfidence distributionRouting, analytics, compliance flags

Coverage & robustness

  • Language detection accuracy: share of files that land on the intended ISO 639-1 code or auto-detected language.
  • Noise robustness: WER delta at multiple SNR levels (clean vs. +5 dB noise, etc.).
  • Accent/domain diversity: track WER per accent or scenario (support, media, meetings) to avoid blind spots.

Operational metrics

  • Requests per second / concurrent sessions: validate you stay within quota and plan scaling needs.
  • Cost per minute: Lightning STT bills per second at $0.025/minute list price—include enrichment toggles when modeling cost.
  • Retry volume: differentiate infrastructure retries (HTTP 5xx) from transcription failures to spot upstream vs downstream issues.

Reporting checklist

  1. Describe dataset composition (language, accent, domain, duration).
  2. Publish WER/CER, TTFR, and RTF with averages and percentiles.
  3. Include enrichment coverage (how many segments include diarization/timestamps).
  4. Summarize cost/latency impact when enabling optional features.
  5. Link to reproducible scripts or notebooks for auditing.