Voice AI
See, hear, and speak with LLMs in realtime.
If one day AI is as smart as a human, we’ll interact with it like we do with each other.
Sometimes we text, but most times we talk.
Your LLM or Diffusion model is the brain.
LiveKit is the nervous system.
Developer chatting with our open-source, multimodal AI named KITT.
Build the next generation of stateful, voice-driven applications using LiveKit on the server.
Detect and convert speech to text
Turn realtime speech into text using your own model or one of our STT plugins like Whisper or Deepgram.
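Whatever backend you choose, each STT plugin exposes the same streaming shape: audio frames in, transcription events out. The sketch below is illustrative only, not the actual LiveKit plugin API; `FakeSTT`, `SpeechEvent`, and `mic` are hypothetical names standing in for a real Whisper or Deepgram integration:

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class SpeechEvent:
    """One transcription result for a chunk of audio."""
    text: str
    is_final: bool


class FakeSTT:
    """Stub STT backend: every plugin exposes the same streaming contract."""

    async def stream(self, audio: AsyncIterator[bytes]) -> AsyncIterator[SpeechEvent]:
        idx = 0
        async for _chunk in audio:
            # A real plugin would forward _chunk to Whisper/Deepgram here
            # and yield events as partial and final results arrive.
            yield SpeechEvent(text=f"transcript {idx}", is_final=True)
            idx += 1


async def mic(n_frames: int) -> AsyncIterator[bytes]:
    """Pretend microphone: n_frames of 10 ms silence at 16 kHz mono."""
    for _ in range(n_frames):
        yield b"\x00\x00" * 160


async def main() -> list[str]:
    stt = FakeSTT()
    return [event.text async for event in stt.stream(mic(3))]


print(asyncio.run(main()))
```

Because results stream out as the user speaks, downstream steps (like prompting the LLM) can start before the utterance ends.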
Run your LLM at the edge
Send tokens to integrated LLMs like GPT-4, Claude and Mistral, or your own fine-tuned model.
```python
async def talk_to_llm(self, stream: rtc.AudioStream):
    async for event in stream:
        alt = event.alternatives[0]
        msg = LLMMessage(role=LLMMessageRole.User, content=alt.text)
        llm_stream = self.llm.send_message(msg)
```
Give AI a voice
Use any TTS model like OpenAI or ElevenLabs to synthesize and stream speech to client devices.
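Synthesis mirrors the STT contract: text in, audio frames out, yielded incrementally so playback can begin before the whole sentence is rendered. The `FakeTTS` class below is a hypothetical sketch of that shape, not the OpenAI or ElevenLabs SDK:

```python
import asyncio
from typing import AsyncIterator


class FakeTTS:
    """Hypothetical TTS backend that turns text into (silent) PCM frames."""

    sample_rate = 24000  # Hz, 16-bit mono

    async def synthesize(self, text: str) -> AsyncIterator[bytes]:
        # Yield one 20 ms frame per word, so the client can start playing
        # audio before synthesis of the full sentence finishes.
        frame = b"\x00\x00" * (self.sample_rate // 50)
        for _word in text.split():
            # A real backend would call the provider's streaming API here.
            yield frame


async def main() -> int:
    tts = FakeTTS()
    frames = [f async for f in tts.synthesize("hello from the agent")]
    return len(frames)


print(asyncio.run(main()))
```

Streaming frame-by-frame rather than returning one finished clip is what keeps perceived latency low in a voice conversation.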
Stream text or tokens
Send arbitrary binary data, such as tokens from your LLM or audio transcriptions, to one or many users.
```python
# stream tokens to client
async def process_llm_tokens(self, stream: LLMStream):
    async for token in stream:
        await self.ctx.room.local_participant.publish_data(
            json.dumps({"type": "agent_token", "text": token})
        )
```
Build Conversational AI for free
LiveKit Cloud is a managed WebRTC platform and the fastest path to production with the open-source LiveKit stack.