VAD and turn detection configuration guide
Understand which VAD and turn detection parameters apply to your agent type, and how to configure them for optimal voice interaction.
LiveKit Agents provides extensive control over Voice Activity Detection (VAD) and turn detection. However, not all parameters apply to every agent configuration. This guide explains which parameters matter for your specific setup.
VAD Configuration Parameters
These parameters are defined in the Silero VAD implementation and control how speech is detected in audio streams:
| Parameter | Description |
|---|---|
| min_speech_duration | Minimum duration of speech required to trigger detection |
| min_silence_duration | Minimum duration of silence that marks the end of speech |
| prefix_padding_duration | Audio included before the detected start of speech |
| max_buffered_speech | Maximum speech audio to buffer before processing |
| activation_threshold | Confidence threshold for classifying audio as speech |
These parameters are used whenever VAD is present in your pipeline.
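For orientation, here is a minimal sketch of how these options are passed to silero.VAD.load; the values below are placeholders rather than recommendations:

```python
from livekit.plugins import silero

# Placeholder values; tune them for your audio environment.
vad = silero.VAD.load(
    min_speech_duration=0.05,      # speech shorter than this is ignored
    min_silence_duration=0.55,     # silence required before speech is considered ended
    prefix_padding_duration=0.5,   # audio kept from before the detected speech
    max_buffered_speech=60.0,      # cap on buffered speech audio, in seconds
    activation_threshold=0.5,      # model confidence required to count as speech
)
```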
Turn Detection and Interrupt Parameters
These parameters are defined in the AgentSession constructor and control how the agent handles conversational turns:
| Parameter | Description |
|---|---|
| allow_interruptions | Whether users can interrupt agent responses |
| min_interruption_duration | Minimum speech duration required to trigger an interruption |
| min_endpointing_delay | Minimum wait time before finalizing a user turn |
| max_endpointing_delay | Maximum wait time when turn detector confidence is low |
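As a minimal sketch of where these settings live, the snippet below uses placeholder values and assumes the pipeline components (VAD, STT, LLM, TTS) are configured elsewhere:

```python
from livekit.agents import AgentSession

# Placeholder values; the pipeline components (vad, stt, llm, tts) are assumed
# to be configured elsewhere.
session = AgentSession(
    allow_interruptions=True,        # let user speech cut off agent playback
    min_interruption_duration=0.5,   # seconds of user speech that count as an interruption
    min_endpointing_delay=0.5,       # minimum wait before finalizing the user's turn
    max_endpointing_delay=6.0,       # upper bound when turn detector confidence is low
)
```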
Scenario Reference
Pipeline Agents with Turn Detector Model
When using a turn detector model alongside VAD and STT, all parameters are active:
| Parameter | Status |
|---|---|
| All VAD parameters | Used |
| allow_interruptions | Used |
| min_interruption_duration | Used |
| min_endpointing_delay | Used |
| max_endpointing_delay | Used |
The turn detector model provides confidence scores that determine where the actual wait time falls: when the model is confident the user has finished speaking, the turn ends after min_endpointing_delay; when it predicts the user is likely to continue, the agent waits up to max_endpointing_delay.
Pipeline Agents without Turn Detector Model
Without a turn detector model, max_endpointing_delay becomes irrelevant since there are no confidence predictions to act on:
| Parameter | Status |
|---|---|
| All VAD parameters | Used |
| allow_interruptions | Used |
| min_interruption_duration | Used |
| min_endpointing_delay | Used (as the fixed delay) |
| max_endpointing_delay | Not used |
Realtime Model Agents (OpenAI, Google)
When using realtime models like OpenAI's Realtime API or Google's realtime models, VAD and turn detection are handled server-side:
| Parameter | Status | Reason |
|---|---|---|
| min_speech_duration | Not used | Server-side VAD |
| min_silence_duration | Not used | Server-side VAD |
| prefix_padding_duration | Not used | Server-side VAD |
| max_buffered_speech | Not used | Server-side VAD |
| activation_threshold | Not used | Server-side VAD |
| min_endpointing_delay | Not used | Server-side turn detection |
| max_endpointing_delay | Not used | Server-side turn detection |
| allow_interruptions | Used | Controls client-side behavior |
| min_interruption_duration | Used | Controls client-side behavior |
Universal Parameters
These parameters work across all agent types:
| Parameter | Description |
|---|---|
| allow_interruptions | Controls whether user speech interrupts agent playback |
| min_interruption_duration | Sets the minimum speech duration that counts as an interruption |
Quick Reference by Agent Type
| Agent Type | VAD Params | Endpointing Delays | Interruption Params |
|---|---|---|---|
| Pipeline + Turn Detector | All | Both | Both |
| Pipeline (no Turn Detector) | All | min only | Both |
| Realtime (OpenAI/Google) | None | None | Both |
How Turn Detection Mode is Selected
The turn detection mode is automatically selected based on available components, with this priority order:
- realtime_llm — Uses server-side VAD and turn detection
- vad — Uses client-side Silero VAD
- stt — Falls back to STT-based detection
- manual — No automatic turn detection
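If you want to pin a specific mode rather than rely on automatic selection, you can pass it explicitly. The sketch below assumes the AgentSession turn_detection argument accepts the mode names listed above as strings (or a turn detector model instance):

```python
from livekit.agents import AgentSession
from livekit.plugins import silero

# Force VAD-based turn detection instead of relying on automatic selection.
# Assumes turn_detection accepts the mode names listed above:
# "realtime_llm", "vad", "stt", or "manual".
session = AgentSession(
    vad=silero.VAD.load(),
    turn_detection="vad",
)
```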
Configuration Examples
Pipeline Agent with Turn Detector
```python
from livekit.agents import AgentSession
from livekit.plugins import silero

session = AgentSession(
    vad=silero.VAD.load(
        min_speech_duration=0.1,
        min_silence_duration=0.3,
        prefix_padding_duration=0.5,
        activation_threshold=0.5,
    ),
    turn_detection=my_turn_detector,  # a turn detector model instance
    allow_interruptions=True,
    min_interruption_duration=0.5,
    min_endpointing_delay=0.5,
    max_endpointing_delay=6.0,  # Used with turn detector confidence
)
```
Pipeline Agent without Turn Detector
```python
from livekit.agents import AgentSession
from livekit.plugins import silero

session = AgentSession(
    vad=silero.VAD.load(
        min_speech_duration=0.1,
        min_silence_duration=0.5,  # More important without a turn detector
        prefix_padding_duration=0.5,
        activation_threshold=0.5,
    ),
    allow_interruptions=True,
    min_interruption_duration=0.5,
    min_endpointing_delay=0.8,  # Acts as the fixed delay
    # max_endpointing_delay is not needed without a turn detector
)
```
Realtime Model Agent
```python
from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    llm=openai.realtime.RealtimeModel(),
    # VAD params not needed: handled server-side
    # Endpointing delays not needed: handled server-side
    allow_interruptions=True,
    min_interruption_duration=0.5,
)
```
Summary
Understanding which parameters apply to your agent configuration prevents confusion and helps you tune the right settings. Remember:
- Pipeline agents give you full control over VAD and turn detection
- Realtime agents delegate most detection to the server, but you still control interruption behavior
- Turn detector models enable confidence-based timing with the endpointing delay range
For complete documentation on turn detection modes, interruption handling, and session configuration, see the Turns overview in the LiveKit docs.