Example Response
Response Fields
- type: Message type identifier, set to "transcription" for transcription results.
- status: Status of the transcription request, typically "success" for valid responses.
- session_id: Unique identifier for the transcription session.
- transcript: Partial or complete transcription text for the current segment.
- is_final: Indicates whether this is the final transcription for the current segment. false indicates a partial/interim transcript; true indicates a final transcript.
- is_last: Indicates whether this is the last transcription in the session. true when the session is complete.
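As a minimal sketch of how a client might decode these fields, the snippet below parses one JSON text frame into a small dataclass. The `TranscriptionMessage` class, the `parse_message` helper, and the sample payload values are hypothetical and for illustration only; defaulting a missing flag to false is also an assumption, not documented behavior.

```python
import json
from dataclasses import dataclass


@dataclass
class TranscriptionMessage:
    """Core fields of a transcription response (hypothetical container)."""
    type: str
    status: str
    session_id: str
    transcript: str
    is_final: bool
    is_last: bool


def parse_message(raw: str) -> TranscriptionMessage:
    """Decode one JSON text frame into the core response fields."""
    data = json.loads(raw)
    return TranscriptionMessage(
        type=data["type"],
        status=data["status"],
        session_id=data["session_id"],
        transcript=data.get("transcript", ""),
        # Assumption: a missing flag is treated as False.
        is_final=bool(data.get("is_final", False)),
        is_last=bool(data.get("is_last", False)),
    )


# Hypothetical payload; the values are invented for illustration.
sample = json.dumps({
    "type": "transcription",
    "status": "success",
    "session_id": "session-123",
    "transcript": "hello world",
    "is_final": True,
    "is_last": False,
})

msg = parse_message(sample)
print(msg.transcript, msg.is_final, msg.is_last)
```

Keeping the parsing in one place makes it easier to adjust if the service adds or renames fields.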
Optional Fields
The following fields may be included in responses under certain conditions:
- full_transcript: Complete transcription text accumulated so far. Only included when the full_transcript=true query parameter is set AND is_final=true.
- language: Detected primary language code. Only returned when is_final=true.
- languages: Array of language codes detected in the audio. Only returned when is_final=true.
- words: Array of word-level timestamps (only included when word_timestamps=true in query parameters). Each word object contains word, start, end, and confidence fields. When diarize=true, also includes speaker (integer ID) and speaker_confidence (0.0 to 1.0) fields.
- utterances: Array of sentence-level timestamps (only included when sentence_timestamps=true in query parameters). Each utterance object contains text, start, and end fields. When diarize=true, also includes speaker (integer ID) field.
- redacted_entities: Array of redacted entity placeholders (only included when redact_pii=true or redact_pci=true). Examples: [FIRSTNAME_1], [CREDITCARDCVV_1].
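To illustrate how the words array might be consumed when diarize=true, the sketch below merges consecutive words from the same speaker into turns. The `speaker_turns` helper and the sample word objects are hypothetical; only the field names (word, start, end, confidence, speaker, speaker_confidence) come from the description above.

```python
from typing import Any, Dict, List


def speaker_turns(words: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge consecutive words that share a speaker ID into one turn."""
    turns: List[Dict[str, Any]] = []
    for w in words:
        # "speaker" is only present when diarize=true; None otherwise.
        speaker = w.get("speaker")
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + w["word"]
            turns[-1]["end"] = w["end"]
        else:
            turns.append({
                "speaker": speaker,
                "text": w["word"],
                "start": w["start"],
                "end": w["end"],
            })
    return turns


# Hypothetical word objects; the timings and scores are invented.
sample_words = [
    {"word": "hello", "start": 0.0, "end": 0.4, "confidence": 0.98,
     "speaker": 0, "speaker_confidence": 0.91},
    {"word": "there", "start": 0.4, "end": 0.8, "confidence": 0.95,
     "speaker": 0, "speaker_confidence": 0.89},
    {"word": "hi", "start": 1.0, "end": 1.2, "confidence": 0.97,
     "speaker": 1, "speaker_confidence": 0.93},
]

for turn in speaker_turns(sample_words):
    print(turn["speaker"], turn["start"], turn["end"], turn["text"])
```

Without diarization every word has no speaker key, so the helper collapses the whole array into a single turn.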
Handling Responses
We maintain an internal server-side buffer that collects the chunked audio sent by the user. Once this buffer reaches a specific size, the server sends a special response with is_final set to true, containing the transcription of the user audio collected since the last such response.
is_final = true
We recommend processing responses of this kind for optimal transcription accuracy. The internal buffer size is calibrated to optimize response times and accuracy.
- Additionally, the language field is set to the specified language, or to the detected language if the language parameter is set to multi. Other responses will not include the language field.
- The full_transcript field is non-empty if the user sends the end token {"type":"end"} to signal end of session.
is_final = false
These are interim transcript responses sent for each chunk. They provide quick feedback for low-latency use cases.
- These responses may provide inaccurate results for the most recent words. This occurs when the audio for these words is not fully sent to the server in the respective chunk.
The full_transcript field is only included when the full_transcript query parameter is set to true. Learn more about the Full Transcript feature.
is_last = true
This response is similar to an is_final=true response, but it is the final response received after the user sends the end token {"type":"end"}. When is_last=true, the server has finished processing all audio and the session is complete.
- This is the last response of the live transcription session and contains all the fields of the is_final=true response.
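The three response kinds described in this section can be handled with a small state holder, sketched below. The `TranscriptCollector` class is hypothetical, and it rests on assumptions that are not confirmed here: that final segments can be joined with a single space, and that the is_last response carries its own segment text rather than repeating an earlier one. The sample messages are invented.

```python
from typing import Any, Dict, List, Optional


class TranscriptCollector:
    """Tracks interim text, final segments, and session completion."""

    def __init__(self) -> None:
        self.finals: List[str] = []
        self.interim: str = ""
        self.full_transcript: Optional[str] = None
        self.done: bool = False

    def handle(self, msg: Dict[str, Any]) -> None:
        if msg.get("type") != "transcription":
            return  # ignore anything that is not a transcription result
        text = msg.get("transcript", "")
        if msg.get("is_final") or msg.get("is_last"):
            # Final segment: commit it and discard the interim preview.
            if text:
                self.finals.append(text)
            self.interim = ""
            if msg.get("full_transcript"):
                self.full_transcript = msg["full_transcript"]
        else:
            # Interim result: the latest one replaces the previous preview.
            self.interim = text
        if msg.get("is_last"):
            self.done = True

    def text(self) -> str:
        """Best available transcript; prefers the server's full_transcript."""
        if self.full_transcript:
            return self.full_transcript
        parts = self.finals + ([self.interim] if self.interim else [])
        return " ".join(parts)


# Hypothetical message sequence for one short session.
messages = [
    {"type": "transcription", "transcript": "hello wor", "is_final": False},
    {"type": "transcription", "transcript": "hello world", "is_final": True},
    {"type": "transcription", "transcript": "how are", "is_final": False},
    {"type": "transcription", "transcript": "how are you",
     "is_final": True, "is_last": True},
]

collector = TranscriptCollector()
for m in messages:
    collector.handle(m)
    if collector.done:
        break
print(collector.text())
```

In a live client, `handle` would be called for each frame received on the connection, with the interim text used only for display and the final segments used for anything that needs accuracy.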

