Voice AI
See, hear, and speak with LLMs in realtime.
If one day AI is as smart as a human, we’ll interact with it like we do with each other.
Sometimes we text, but most times we talk.
Your LLM or Diffusion model is the brain.
LiveKit is the nervous system.
Developer chatting with our open-source, multimodal AI named KITT.
Build the next generation of stateful, voice-driven applications using LiveKit on the server.
Detect and convert speech to text
Turn realtime speech into text using your own model or one of our STT plugins like Whisper or Deepgram.
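Whatever backend you choose, each STT plugin exposes the same streaming shape: audio frames in, transcription events out. The sketch below is illustrative only, not the actual LiveKit plugin API; `FakeSTT`, `SpeechEvent`, and `mic` are hypothetical names standing in for a real Whisper or Deepgram integration:

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class SpeechEvent:
    """One transcription result for a chunk of audio."""
    text: str
    is_final: bool


class FakeSTT:
    """Stub STT backend: every plugin exposes the same streaming contract."""

    async def stream(self, audio: AsyncIterator[bytes]) -> AsyncIterator[SpeechEvent]:
        idx = 0
        async for _chunk in audio:
            # A real plugin would forward _chunk to Whisper/Deepgram here
            # and yield events as partial and final results arrive.
            yield SpeechEvent(text=f"transcript {idx}", is_final=True)
            idx += 1


async def mic(n_frames: int) -> AsyncIterator[bytes]:
    """Pretend microphone: n_frames of 10 ms silence at 16 kHz mono."""
    for _ in range(n_frames):
        yield b"\x00\x00" * 160


async def main() -> list[str]:
    stt = FakeSTT()
    return [event.text async for event in stt.stream(mic(3))]


print(asyncio.run(main()))
```

Because results stream out as the user speaks, downstream steps (like prompting the LLM) can start before the utterance ends.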
Run your LLM at the edge
Send tokens to integrated LLMs like GPT-4, Claude and Mistral, or your own fine-tuned model.
```python
async def talk_to_llm(self, stream: rtc.AudioStream):
    async for event in stream:
        alt = event.alternatives[0]
        msg = LLMMessage(role=LLMMessageRole.User, content=alt.text)
        llm_stream = self.llm.send_message(msg)
```
Give AI a voice
Use any TTS model like OpenAI or ElevenLabs to synthesize and stream speech to client devices.
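Synthesis mirrors the STT contract: text in, audio frames out, yielded incrementally so playback can begin before the whole sentence is rendered. The `FakeTTS` class below is a hypothetical sketch of that shape, not the OpenAI or ElevenLabs SDK:

```python
import asyncio
from typing import AsyncIterator


class FakeTTS:
    """Hypothetical TTS backend that turns text into (silent) PCM frames."""

    sample_rate = 24000  # Hz, 16-bit mono

    async def synthesize(self, text: str) -> AsyncIterator[bytes]:
        # Yield one 20 ms frame per word, so the client can start playing
        # audio before synthesis of the full sentence finishes.
        frame = b"\x00\x00" * (self.sample_rate // 50)
        for _word in text.split():
            # A real backend would call the provider's streaming API here.
            yield frame


async def main() -> int:
    tts = FakeTTS()
    frames = [f async for f in tts.synthesize("hello from the agent")]
    return len(frames)


print(asyncio.run(main()))
```

Streaming frame-by-frame rather than returning one finished clip is what keeps perceived latency low in a voice conversation.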
Stream text or tokens
Send arbitrary binary data, such as tokens from your LLM or audio transcriptions, to one or many users.
```python
# stream tokens to client
async def process_llm_tokens(self, stream: LLMStream):
    async for token in stream:
        await self.ctx.room.local_participant.publish_data(
            json.dumps({"type": "agent_token", "text": token})
        )
```
Build Conversational AI for free
LiveKit Cloud is a managed WebRTC platform and the fastest path to production with the open-source LiveKit stack.