How to detect when an agent has finished speaking

When working with Text-to-Speech (TTS) audio in the agent system, you may need to detect when an agent has completely finished speaking. This guide covers the recommended approach and how to make this information available to web clients.

Why detect playback completion?

Common use cases include:

UI feedback: Update a visual indicator (like an animated avatar or speaking icon) when the agent stops talking.
Turn-taking logic: Trigger actions only after the agent finishes, such as enabling a "your turn" prompt.
Analytics: Measure actual speaking duration for conversation analysis.
Accessibility: Provide screen reader cues that the agent has finished speaking.

Using SpeechHandle.wait_for_playout() (Recommended)

The most accurate way to detect when an agent has finished speaking is to use the SpeechHandle returned by session.say() or session.generate_reply():

# Using session.say()
handle = session.say("Hello, how can I help you today?")
await handle.wait_for_playout()
logger.info("Agent finished speaking")

# Using session.generate_reply()
handle = session.generate_reply(instructions="Greet the user")
await handle.wait_for_playout()
logger.info("Agent finished speaking")

The wait_for_playout() method returns once the audio has been fully played to the participant or the speech has been interrupted. To tell the two cases apart, check handle.interrupted afterwards:

handle = session.say("Let me explain...")
await handle.wait_for_playout()

if handle.interrupted:
    logger.info("Speech was interrupted by user")
else:
    logger.info("Speech completed naturally")
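
For the analytics use case listed earlier, the wait can be wrapped in a small helper that also records how long it took. This is a minimal sketch, not part of the LiveKit API: await_playout, PlayoutResult, and FakeHandle are hypothetical names, and FakeHandle only stands in for a real SpeechHandle so the example runs without a LiveKit session. It assumes the helper is called right after the speech is scheduled, so the measured time also includes any delay before audio starts.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Protocol


class PlayoutHandle(Protocol):
    """Duck-typed view of the two SpeechHandle members used in this guide."""

    interrupted: bool

    async def wait_for_playout(self) -> None: ...


@dataclass
class PlayoutResult:
    duration_s: float   # wall-clock time spent waiting for playout
    interrupted: bool   # True if the speech was cut short


async def await_playout(handle: PlayoutHandle) -> PlayoutResult:
    """Wait for playout and report the elapsed time and interruption flag."""
    started = time.monotonic()
    await handle.wait_for_playout()
    return PlayoutResult(time.monotonic() - started, handle.interrupted)


class FakeHandle:
    """Hypothetical stand-in for SpeechHandle (assumption, for local runs only)."""

    def __init__(self, playout_s: float, interrupted: bool = False) -> None:
        self.interrupted = interrupted
        self._playout_s = playout_s

    async def wait_for_playout(self) -> None:
        await asyncio.sleep(self._playout_s)


if __name__ == "__main__":
    print(asyncio.run(await_playout(FakeHandle(0.05))))
```

In an agent, the same helper would be called as await_playout(session.say("...")), assuming the returned handle exposes the two members shown in the examples above.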

Using callbacks for non-blocking detection

If you don't want to block on the playout, register a callback with add_done_callback(). The callback receives the SpeechHandle once the speech finishes:

def on_playout_complete(handle):
    logger.info(f"Playout complete, interrupted: {handle.interrupted}")

handle = session.say("Processing your request...")
handle.add_done_callback(on_playout_complete)

Notifying web clients

To notify web clients when speech completes, you can use participant attributes or data messages. Here's an example using attributes:

Agent-side implementation

async def notify_speech_complete(room, interrupted: bool):
    # Update participant attributes to signal completion
    await room.local_participant.set_attributes({
        "speech_state": "idle",
        "last_speech_interrupted": str(interrupted).lower()
    })

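# Usage (ctx is assumed to be the JobContext passed to the agent's entrypoint)
# Mark the agent as speaking first: clients are only notified of attributes
# whose values change, so "idle" must differ from the previous value.
await ctx.room.local_participant.set_attributes({"speech_state": "speaking"})
handle = session.say("Hello!")
await handle.wait_for_playout()
await notify_speech_complete(ctx.room, handle.interrupted)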
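
The data-message alternative mentioned above could look roughly like the following. This is a sketch under stated assumptions rather than a prescribed format: the topic name, the event fields, and both helper functions are hypothetical, and only the payload encoding is exercised here. The commented call at the end shows where the payload would be handed to publish_data on the local participant.

```python
import json
import time

# Hypothetical topic name chosen for this example; not a LiveKit constant.
SPEECH_EVENT_TOPIC = "agent.speech"


def build_speech_complete_payload(interrupted: bool) -> bytes:
    """Encode a speech-completion event as UTF-8 JSON for a data message."""
    event = {
        "type": "speech_complete",      # hypothetical event name
        "interrupted": interrupted,
        "sent_at_ms": int(time.time() * 1000),
    }
    return json.dumps(event).encode("utf-8")


def parse_speech_event(payload: bytes) -> dict:
    """Decode a payload produced by build_speech_complete_payload()."""
    return json.loads(payload.decode("utf-8"))


# In an agent, the payload would be sent after playout, for example:
#   await handle.wait_for_playout()
#   await ctx.room.local_participant.publish_data(
#       build_speech_complete_payload(handle.interrupted),
#       reliable=True,
#       topic=SPEECH_EVENT_TOPIC,
#   )

if __name__ == "__main__":
    print(parse_speech_event(build_speech_complete_payload(False)))
```

Each data message arrives as a discrete event, which suits one-off notifications. Attributes, in contrast, persist on the participant, so a client that joins late can still read the current state.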

Client-side implementation

On the web client, listen for attribute changes:

room.on(RoomEvent.ParticipantAttributesChanged, (changedAttributes, participant) => {
  if (participant.isAgent && changedAttributes.speech_state === 'idle') {
    console.log('Agent finished speaking');
    const wasInterrupted = changedAttributes.last_speech_interrupted === 'true';
    // Handle the completion event
  }
});

Important consideration

The final flag in the transcription listener does not indicate that the audio has finished playing. It only indicates that the transcription is complete, which may happen before or after the actual audio playback ends.

For accurate playback completion detection, always use SpeechHandle.wait_for_playout() rather than relying on transcription events.

Summary

| Approach | Use case | Accuracy |
| --- | --- | --- |
| `SpeechHandle.wait_for_playout()` | Detecting actual audio playback completion | ✅ Accurate |
| `SpeechHandle.add_done_callback()` | Non-blocking playback detection | ✅ Accurate |
| Transcription `final` flag | Detecting when transcription is complete | ❌ Not for playback timing |

Additional resources

For more examples and advanced implementations, refer to our voice agents examples repository.
