Building Performant Voice Agents in India
Complete guide to building and running high-performance voice agents in India.
India is one of the world's fastest-growing regions for voice agents. With LiveKit Cloud's regional agent deployment in Mumbai, you can deploy your entire voice stack in India to minimize latency.
Key takeaway: co-locate your AI models near your voice agent
The key to building fast voice agents on LiveKit is to host your agents in a region close to your entire AI model stack, including your STT (speech-to-text model), LLM (large language model), and TTS (text-to-speech model).
There are two factors that affect the speed of your voice agent: User ↔ Agent latency and Agent ↔ Model latency.
User ↔ Agent latency:
- Audio packets between the user and the agent travel relatively uninterrupted over long distances on LiveKit's optimized private network.
- This means it's fine for users to be far from your agent, as long as your agent is close to your model stack.
Agent ↔ Model latency:
- Every call from the agent to the AI models is sequential (e.g. agent to LLM only runs after agent to STT finishes).
- This includes any intermediary function calls your LLM might be making, such as to your database.
- Because this path involves more steps and each call depends on the previous one, it's important that each individual agent-to-model call travels the shortest possible path. The best way to do this is to geographically co-locate each AI model with your agent; the sketch below shows how these sequential hops add up.
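To see why co-location matters so much, here's a back-of-the-envelope sketch in Python. The per-hop latencies are made-up illustrative numbers, not measurements of any provider or region; the point is only that sequential hops add up.

```python
# Illustrative per-hop network costs in milliseconds; these are made-up example
# numbers, not benchmarks of any particular provider or region.
CROSS_REGION_MS = 250  # e.g. a Mumbai agent calling a US-hosted model
IN_REGION_MS = 5       # e.g. a Mumbai agent calling a Mumbai-hosted model


def network_overhead_per_turn(stt_ms: int, llm_ms: int, tts_ms: int) -> int:
    """Network overhead added to one conversational turn.

    The calls run sequentially (STT -> LLM -> TTS), so their network costs
    add up rather than overlap. Model inference time is excluded because it
    is the same in either layout.
    """
    return stt_ms + llm_ms + tts_ms


# Models in the US, agent and user in India: every hop crosses regions.
print("cross-region models:",
      network_overhead_per_turn(CROSS_REGION_MS, CROSS_REGION_MS, CROSS_REGION_MS), "ms")

# Agent and models co-located in Mumbai: only short in-region hops remain.
print("co-located models:  ",
      network_overhead_per_turn(IN_REGION_MS, IN_REGION_MS, IN_REGION_MS), "ms")
```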
Voice agent latency comparison by architecture
Let's use an example where a user and agent are both located in India, but the AI models are all located in the United States. Since the agent and models are not co-located, most requests travel between cloud regions, adding latency with each hop.

When the models and the agent are co-located, most requests travel only a short distance, with the long user-to-agent leg running on LiveKit's private network.

Building an India-based voice AI stack
Model selection for Indian languages
Because of the large number of regional dialects across India, it's important to choose STT and TTS models that have broad language support. Here's what's currently available through LiveKit Inference and LiveKit Agents plugins:
| AI Model Type | Model Providers | Indian Language Support |
|---|---|---|
| STT | Deepgram Nova-3 (via LiveKit Inference, hosted in India) | Hindi and 7 other languages. Auto-routes to India regional deployment. |
| STT | Cartesia Ink Whisper (via LiveKit Inference) | 100+ languages, including Hindi |
| STT | Sarvam AI (via LiveKit Agents plugin) | Hindi, Tamil, Telugu, Kannada, Malayalam, Bengali, Marathi, Gujarati, Punjabi and more |
| TTS | Cartesia (via LiveKit Inference) | Multilingual, including Hindi |
| TTS | ElevenLabs (via LiveKit Inference) | 29+ languages, including Hindi |
| TTS | Rime (via LiveKit Inference) | Multilingual, including Hindi |
| TTS | Sarvam AI (via LiveKit Agents plugin) | 11 Indian languages, including Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Odia |
| LLM | Azure OpenAI (India) / Vertex Gemini (asia-south1) (via LiveKit plugins) | 50+ languages, including Hindi and other Indic languages |
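As a rough sketch of what language selection looks like in code, the snippet below configures a Hindi STT/TTS pair using the Deepgram and Cartesia plugins from LiveKit Agents. The model name and language codes are assumptions for illustration; check each plugin's reference (and the Sarvam plugin for the other Indic languages) for the exact values it accepts.

```python
from livekit.plugins import cartesia, deepgram

# Hindi STT: the model/language combination shown here is illustrative.
# LiveKit's co-located Deepgram deployment documents which combinations it
# supports (e.g. nova-2-general for English/Hindi).
stt = deepgram.STT(model="nova-2-general", language="hi")

# Hindi TTS via Cartesia; pick a voice trained for the target language.
tts = cartesia.TTS(language="hi")
```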
If you want to minimize latency to serve users within India, here's an example stack we would recommend:
| Layer | Provider | Notes |
|---|---|---|
| Agent hosting | LiveKit Cloud Mumbai (ap-south) | Deploy agents in LiveKit Cloud's India (ap-south) region. |
| STT | Deepgram (regional deployment via LiveKit Inference) | LiveKit-hosted Deepgram instance in India. |
| LLM | Azure OpenAI (India) or Vertex Gemini (asia-south1) | Use Azure's or Google's India cloud regions. |
| TTS | Cartesia (via LiveKit Inference) | Routed to Cartesia's India region through LiveKit. In-region failover is not guaranteed. |
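Put together as a LiveKit Agents worker, the recommended stack looks roughly like the sketch below. Treat it as an outline rather than a drop-in implementation: the Azure endpoint, deployment name, and API version are placeholders for your own resource, credentials are assumed to come from environment variables, and the exact constructor arguments should be confirmed against each plugin's reference.

```python
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import cartesia, deepgram, openai, silero


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # Co-located Deepgram STT (LiveKit-hosted in India); verify the
        # model/language combination against the STT docs.
        stt=deepgram.STT(model="nova-3", language="hi"),
        # Azure OpenAI deployment in an Indian region; the endpoint and
        # deployment name below are placeholders for your own resource.
        llm=openai.LLM.with_azure(
            azure_deployment="gpt-4o",
            azure_endpoint="https://<your-india-resource>.openai.azure.com/",
            api_version="2024-10-21",
        ),
        # Cartesia TTS routed through LiveKit Inference.
        tts=cartesia.TTS(),
        vad=silero.VAD.load(),
    )

    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful assistant for callers in India."),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

Run it the same way as the quickstart agent (for example, `python agent.py dev` locally), then deploy it to the Mumbai region.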
When and how to host voice agents in India
With LiveKit's hosted Deepgram STT co-located in its Mumbai (ap-south) agent deployment region, you can now host both your agent and your entire model stack in India.
To get the lowest-latency experience, the best option is to host your agent in the same region as the AI models it calls. However, if you need specific models that are only available in other regions (e.g., the latest GPT, specialized TTS voices), it's best to host your agent in the region nearest your models rather than your end users.
Example performance: An agent hosted in LiveKit Cloud Mumbai (ap-south) using GPT-4o, Cartesia TTS, and co-located Deepgram STT achieves ~1.67s end-to-end latency, ~1s faster than with an equivalent model stack hosted in a different region.
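To measure what your own deployment achieves, you can log the per-turn metrics that AgentSession emits (assuming a 1.x release of LiveKit Agents). The helper below is a minimal sketch that attaches a logger and usage collector to an existing session; the function name is our own.

```python
from livekit.agents import AgentSession, MetricsCollectedEvent, metrics


def setup_latency_logging(session: AgentSession) -> metrics.UsageCollector:
    """Attach latency/usage logging to an existing AgentSession."""
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        # Logs per-turn timings (e.g. LLM time-to-first-token, TTS
        # time-to-first-byte), which is where cross-region hops show up.
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    return usage_collector
```

Comparing these logs between a Mumbai-hosted deployment and one in another region is the most direct way to quantify the difference for your own traffic.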
Telephony for voice agents in India
For voice agents handling calls in India, we recommend using a regional SIP trunk provider. Using a SIP trunk has a few advantages:
- Local phone numbers: Indian users are far more likely to pick up a call from a local number (+91) than an international one. Local numbers also enable features like DND registry compliance.
- Better call quality: Calls that originate and terminate within India avoid international hops, reducing jitter and packet loss.
- TRAI Regulation: India's telecom regulator (TRAI) has specific requirements around number provisioning, call handling, and caller ID that a local SIP trunk helps you meet. All telephony calls are further subject to Department of Telecommunications (DoT) data residency requirements.
LiveKit works with most SIP trunk providers. Here's how to get started:
- Get phone numbers from a local SIP provider (Plivo, Twilio, Exotel).
- Configure your SIP trunk to point to LiveKit's India SIP endpoint (see the sketch after this list for creating the LiveKit-side trunk and dispatch rule).
- Enable region pinning by contacting LiveKit. Region pinning helps keep inbound and outbound calls within the India region.
- Deploy your agent in LiveKit Cloud's Mumbai (ap-south) region.
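For the LiveKit side of the setup, you can create the inbound trunk and dispatch rule programmatically with the server API. The sketch below uses the Python `livekit-api` package; the +91 number is a placeholder, credentials are read from environment variables, and the request and field names should be double-checked against the SIP API reference, since they may differ between versions.

```python
import asyncio

from livekit import api


async def main() -> None:
    # Reads LIVEKIT_URL, LIVEKIT_API_KEY and LIVEKIT_API_SECRET from the environment.
    lkapi = api.LiveKitAPI()

    # Inbound trunk for a placeholder Indian number purchased from your SIP provider.
    await lkapi.sip.create_sip_inbound_trunk(
        api.CreateSIPInboundTrunkRequest(
            trunk=api.SIPInboundTrunkInfo(
                name="india-inbound",
                numbers=["+91XXXXXXXXXX"],
            )
        )
    )

    # Route each inbound call into its own room so an agent can be dispatched to it.
    await lkapi.sip.create_sip_dispatch_rule(
        api.CreateSIPDispatchRuleRequest(
            rule=api.SIPDispatchRule(
                dispatch_rule_individual=api.SIPDispatchRuleIndividual(room_prefix="call-")
            )
        )
    )

    await lkapi.aclose()


asyncio.run(main())
```

Outbound calling needs a corresponding outbound trunk configured with your provider's SIP address and credentials; the request shape mirrors the inbound one.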
Additional considerations for LiveKit voice agents in India
- Fallback Availability: LiveKit's co-located Deepgram STT models in Mumbai, India fall back to Deepgram's Frankfurt, Germany region if the Mumbai deployment is unavailable.
- Language Support: LiveKit's co-located Deepgram STT models currently only support English and Hindi for nova-2-general and English for flux-general. More details are available here.
- Questions: If you have any questions about optimal setup or compliance requirements, reach out here or in our community. Regulatory and data residency requirements vary by customer and use case.
- Getting Started: Check out the LiveKit Agents quickstart to get your first agent running, then adapt with the India stack config above.