Plans designed to scale with your projects
From building your first AI voice or video agent to realtime applications with millions of users and everything in between.
Estimate costs for AI voice and video agents
Preview the per-minute cost to run an agent on LiveKit Cloud. Our plans include monthly allotments for agent session minutes, inbound calling minutes (for US local phone numbers), and inference credits to call the most popular AI models.
For detailed LLM, STT, and TTS model pricing, see Inference pricing.
For detailed provider and model API support, see Documentation.
- Build: $0/mo
- Ship: $50/mo
- Scale: $500/mo
- Enterprise: Custom
AI voice and video agents
Deploy and host agents on LiveKit Cloud infrastructure
Agent session minutes
Concurrent agent sessions
Agent deployments
Deployment metrics
Cold start prevention
Instant rollback
LiveKit Inference
Access LLM, STT, and TTS models with a single API key
LiveKit Inference credits
(~50 minutes)
(~100 minutes)
(~1,000 minutes)
LiveKit Inference concurrency
Agent observability
Gather insights into your agent’s behavior and performance
Agent session recordings
Agent observability events
Export to cloud storage
Telephony
Connect with your users over regular phone calls
US local phone numbers
US local inbound minutes
US toll-free phone numbers
US toll-free inbound minutes
Third-party SIP minutes
Custom SIP domains
Participants
Allow end users to connect to realtime sessions
WebRTC minutes
Concurrent connections
Media transport
Deliver voice and video worldwide in under 250ms
Uptime
Enhanced noise cancellation
Downstream data transfer
Stream import
Ingest media encoded in another format and deliver it as a realtime stream
Transcode minutes
then $0.005 per minute (audio-only)
then $0.004 per minute (audio-only)
Concurrent imports
Recording and export
Capture realtime media and encode it in another format for recording or multistreaming
Transcode minutes
then $0.005 per minute (audio-only)
then $0.004 per minute (audio-only)
Track egress
Concurrent exports
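The "then $X per minute" entries above describe overage billing: usage up to a plan's included transcode minutes is covered, and each minute beyond that is charged at the listed rate. A minimal sketch of that arithmetic follows. The function name and the included allotment of 1,000 minutes are hypothetical, chosen only for illustration; the $0.005 audio-only rate is one of the rates listed above.

```python
def transcode_overage_cost(used_minutes, included_minutes, rate_per_minute):
    """Return the overage charge for transcode minutes in one billing period.

    Only minutes beyond the included allotment are billable.
    """
    billable_minutes = max(0, used_minutes - included_minutes)
    return billable_minutes * rate_per_minute


# Hypothetical allotment of 1,000 included minutes (not a figure from this page),
# with 1,500 audio-only minutes used at the listed $0.005 per-minute rate.
cost = transcode_overage_cost(1500, 1000, 0.005)
print(f"Overage: ${cost:.2f}")
```

Usage at or below the allotment produces no overage charge, so the function returns 0 in that case.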
Platform
Build, ship, and manage your applications with additional tools and features
Dashboard
CLI
Team collaboration
Metrics export APIs
Shared plan across projects
Non-credit card billing
Security and compliance
Protect your applications through access, application, and operational security
End-to-end encryption
DPA
Role-based access
Region pinning
Security reports
- SOC 2 Type II
- Network pentest
- SOC 2 Type II
- Network pentest
HIPAA compliance
Single sign-on (SSO)
AWS Assume Role for S3 egress
Support
Get help and technical assistance for building your applications
Community support
Email support
Shared Slack channel
Designated solutions engineer
Support SLA
Frequently asked questions
What's the difference between agent deployments, concurrent agent sessions, and LiveKit Inference concurrency?
An agent deployment is a running version of your agent backend hosted on LiveKit Cloud, typically with its own prompt, set of voice AI models, and function calls. You can configure an agent to complete different tasks or workflows. Deploy separate agents when you need distinct reasoning behavior or tool access. For example, a front-office receptionist agent might handle inbound phone calls for appointment scheduling and triage, while a back-office agent makes outbound calls to insurance providers to verify patient coverage.
A concurrent agent session is a live interaction between your agent and an end user. If your agent is handling 10 calls or conversations at the same time, that counts as 10 concurrent sessions, regardless of how many agent deployments you have on LiveKit Cloud.
LiveKit Inference concurrency is the number of AI inference requests across LLM, STT, and TTS that can run at the same time through LiveKit Inference. It limits how many model calls can be processed concurrently, independent of how many agent sessions or deployments you have. The LiveKit Inference concurrency limit for each plan applies to your aggregate usage of a model type (e.g., total connections to any LiveKit Inference STT model). For example, if 10 concurrent agent sessions are running and the agent is configured to use LiveKit Inference for STT, then there are 10 concurrent STT connections.
For more information on LiveKit Cloud quotas and limits, refer to our docs.
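The relationship between sessions and inference concurrency can be sketched in a few lines. This is an illustrative model only, not a LiveKit API: the function names, the way sessions are represented, and the limit of 5 are all assumptions. It counts one connection per model type for each session that routes that type through LiveKit Inference, then compares the aggregate against a per-type limit.

```python
from collections import Counter


def concurrent_connections(sessions):
    """Count concurrent inference connections per model type.

    `sessions` is a list with one entry per live agent session; each entry is
    the set of model types ("llm", "stt", "tts") that session sends through
    LiveKit Inference. This representation is hypothetical.
    """
    counts = Counter()
    for model_types in sessions:
        counts.update(set(model_types))
    return counts


def over_limit(sessions, limit_per_type):
    """Return the model types whose aggregate connections exceed the limit."""
    counts = concurrent_connections(sessions)
    return {t: n for t, n in counts.items() if n > limit_per_type}


# The example from the answer above: 10 concurrent sessions, each using
# LiveKit Inference for STT, regardless of how many deployments serve them.
sessions = [{"stt"}] * 10
usage = concurrent_connections(sessions)
print(usage["stt"])

# A limit of 5 is an assumed value, not a quota taken from any plan.
print(over_limit(sessions, limit_per_type=5))
```

The count depends only on live sessions and the model types they use, which matches the point that the limit is independent of the number of agent deployments.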
Can I self-host LiveKit?
The LiveKit Agents framework and LiveKit media server are both completely open source and available to run locally or host on your own infrastructure.
LiveKit Cloud is the best way to run LiveKit in production, with fully managed agent deployments, built-in observability and dashboards, and ultra-low-latency global media transport.
Sign up for LiveKit Cloud here, or refer to our docs on how to run LiveKit’s media server locally or deploy LiveKit Agents in a custom environment.
Do you offer on-premises or private deployments?
Yes. Contact sales so we can better understand your needs.
Ready to build?
Start building a voice AI agent with a free account. Reach out to us if you're interested in custom pricing.
No credit card required • 1,000 free agent session minutes monthly