Best practices for managing webhook event streams

Build reliable, load-tested webhook consumers that are resilient to transient failures and scale with your application.

LiveKit webhooks let your backend receive near-realtime notifications when rooms, participants, and tracks change (and for ingress/egress lifecycle events). They're ideal for triggering application logic, maintaining a "room state" model outside of LiveKit, auditing, billing, or kicking off downstream jobs.

This guide focuses on building a reliable, load-tested webhook consumer that is resilient to transient failures.

With LiveKit Cloud, webhooks are configured in your project's dashboard under Settings -> Webhooks.

For Egress, additional webhooks can also be configured within individual Egress requests.

Understand what LiveKit sends

Before building your consumer, it helps to know how events are formatted and what to expect.

Payload format and headers

  • LiveKit sends HTTP POST requests with a JSON-encoded WebhookEvent in the body.
  • Requests use Content-Type: application/webhook+json (make sure your framework accepts it).
  • Requests include an Authorization header containing a signed JWT with a SHA-256 hash of the payload, used to validate authenticity/integrity.
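
If you use a LiveKit server SDK, its webhook receiver handles both checks for you. A minimal sketch with the Node SDK, assuming a recent version where receive() is async and credentials live in LIVEKIT_API_KEY / LIVEKIT_API_SECRET environment variables:

```ts
import { WebhookReceiver } from 'livekit-server-sdk';

// The API key/secret must belong to the project that signs the webhooks.
const receiver = new WebhookReceiver(
  process.env.LIVEKIT_API_KEY!,
  process.env.LIVEKIT_API_SECRET!,
);

// `rawBody` is the unparsed request body; `authHeader` is the Authorization
// header. receive() verifies the JWT signature and the SHA-256 payload hash,
// then returns the parsed event (it throws if verification fails).
async function verifyWebhook(rawBody: string, authHeader?: string) {
  return receiver.receive(rawBody, authHeader);
}
```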

Event types you'll see

Common webhook events include:

  • room_started, room_finished
  • participant_joined, participant_left, participant_connection_aborted
  • track_published, track_unpublished
  • egress_started, egress_updated, egress_ended
  • ingress_started, ingress_ended

Each event includes an id (UUID) and a createdAt timestamp (UNIX seconds).
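
For orientation, here is an illustrative TypeScript subset of the envelope. Field presence varies by event type (see the note on partially hydrated objects below), and the SDK's WebhookEvent type is the authoritative definition:

```ts
// Illustrative subset only, not the full schema.
interface WebhookEventSubset {
  event: string;     // e.g. 'room_started', 'participant_left'
  id: string;        // UUID, unique per event: your natural dedupe key
  createdAt: number; // UNIX seconds
  room?: { sid: string; name: string };            // on room-scoped events
  participant?: { sid: string; identity: string }; // on participant-scoped events
}
```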

Delivery, retries, and what that implies for your design

LiveKit webhooks are push-based HTTP requests, so there are no guarantees of delivery.

LiveKit mitigates transient failures by retrying delivery multiple times and preserves ordering when events queue up—newer events won't be delivered ahead of older ones until the older ones are delivered or abandoned.

This has two core implications:

  1. Assume at-least-once delivery. You may receive duplicates, so your consumer must be idempotent.
  2. Don't block the webhook request. If your endpoint is slow, you'll increase retries, amplify your load, and fall behind.

The golden rule: ACK fast, process async

Keeping the request path fast and moving work off the critical path is the key to reliable webhook handling.

What not to do in the webhook request path

Avoid doing these synchronously before returning 2xx:

  • Writing to your primary database (especially if it enforces constraints or triggers heavy indexes)
  • Calling other services (payment, CRM, provisioning, LLM jobs, etc.)
  • Running "full validation" that could fail on unexpected new fields
  • Performing expensive enrichment (e.g., looking up customer/account metadata)

If any of that work fails and you return a non-2xx status code, LiveKit will retry the webhook, potentially causing duplicates—and if you've already partially processed the event, you can end up with inconsistent downstream state.

What to do instead

In the request path:

  1. Verify the webhook signature/authenticity.
  2. Run minimal schema sanity checks (e.g., "has event and id").
  3. Enqueue the raw payload (or store it durably).
  4. Return 2xx immediately.

After the response:

  • A worker processes the queue, performing validation, persistence, side effects, and fan-out.

The SDK webhook receivers validate against the raw POST body (not a parsed JSON object). In Express, for example, you must use express.raw({ type: 'application/webhook+json' }) so signature validation works.
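
Putting the request path together, here is a minimal Express sketch. The enqueue() function is a hypothetical stand-in for your queue producer:

```ts
import express from 'express';
import { WebhookReceiver } from 'livekit-server-sdk';

const receiver = new WebhookReceiver(
  process.env.LIVEKIT_API_KEY!,
  process.env.LIVEKIT_API_SECRET!,
);

// Stand-in for your queue producer (SQS, Kafka, Redis Streams, ...).
async function enqueue(msg: { payload: string; receivedAt: number }): Promise<void> {
  // push to your queue here
}

const app = express();

app.post(
  '/webhooks/livekit',
  // Raw body is required: the signature covers the exact bytes LiveKit sent.
  express.raw({ type: 'application/webhook+json' }),
  async (req, res) => {
    const body = req.body.toString('utf8');

    let event;
    try {
      event = await receiver.receive(body, req.get('Authorization'));
    } catch {
      res.status(401).end(); // authenticity check failed
      return;
    }

    // Minimal sanity check only; full validation happens in the worker.
    if (!event.event || !event.id) {
      res.status(400).end();
      return;
    }

    await enqueue({ payload: body, receivedAt: Date.now() });
    res.status(200).end(); // ACK fast; everything else is async
  },
);

app.listen(3000);
```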

Idempotency and deduping

LiveKit includes a unique ID for each webhook event. Use it as your primary dedupe key:

  • Maintain a "seen events" store (a database table with a unique constraint, etc.).
  • If an event id has already been processed, treat the delivery as a no-op and still return 2xx.

This protects you from:

  • Retries due to transient network errors
  • Repeated deliveries caused by your own slowdowns
  • Worker restarts that replay messages from a queue
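
A minimal sketch of that dedupe step, assuming Postgres via the pg library and a webhook_events table keyed by event id:

```ts
import { Pool } from 'pg';

const db = new Pool(); // connection settings come from PG* environment variables

// Assumes: CREATE TABLE webhook_events (id TEXT PRIMARY KEY, processed_at TIMESTAMPTZ DEFAULT now());
async function processOnce(eventId: string, handler: () => Promise<void>): Promise<void> {
  const inserted = await db.query(
    'INSERT INTO webhook_events (id) VALUES ($1) ON CONFLICT (id) DO NOTHING',
    [eventId],
  );
  if (inserted.rowCount === 0) {
    return; // duplicate delivery: treat as a no-op
  }
  // In production, run the insert and the handler in one transaction so a
  // failed handler doesn't leave the event marked as "seen" but unprocessed.
  await handler();
}
```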

Store raw first; be flexible about fields

LiveKit may expand event payloads over time (new event types, new fields). Your consumer should:

  • Parse unknown fields safely
  • Prefer storing the raw JSON (or a minimal normalized subset plus raw)
  • Version your internal schema so you can reprocess historical events if needed

Even within existing events, some sections are intentionally partial. For example, for track publish/unpublish events, the docs note that only sid, identity, and name are sent within the Room and Participant objects.

So downstream consumers should not assume every object is "fully hydrated."
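
One way to honor both points is to persist the raw body alongside a small normalized subset. A sketch, with an illustrative StoredEvent shape and schemaVersion field:

```ts
// Keep the raw payload so historical events can be reprocessed as the
// schema evolves; normalize only the fields you rely on today.
interface StoredEvent {
  id: string;
  event: string;
  createdAt: number;
  schemaVersion: number; // version of *your* normalization, for reprocessing
  raw: string;           // the exact payload LiveKit sent
}

function normalize(rawBody: string): StoredEvent {
  const parsed = JSON.parse(rawBody) as Record<string, unknown>;
  return {
    id: String(parsed.id),
    event: String(parsed.event),
    createdAt: Number(parsed.createdAt),
    schemaVersion: 1,
    raw: rawBody, // unknown or new fields survive here even if ignored above
  };
}
```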

Monitoring and operational guardrails

At minimum, monitor:

  • Request rate, 2xx vs non-2xx
  • p95/p99 latency of the webhook endpoint
  • Queue depth/consumer lag
  • Worker error rate and DLQ (dead-letter queue) growth
  • Dedup hit rate (spikes often indicate retries/backpressure)
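
As one way to instrument these, a sketch using the prom-client library for Node (metric names are illustrative):

```ts
import client from 'prom-client';

const deliveries = new client.Counter({
  name: 'webhook_deliveries_total',
  help: 'Webhook deliveries by HTTP status class',
  labelNames: ['status'],
});

const latency = new client.Histogram({
  name: 'webhook_request_seconds',
  help: 'Webhook endpoint latency',
  buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1],
});

const dedupHits = new client.Counter({
  name: 'webhook_dedup_hits_total',
  help: 'Events skipped because their id was already processed',
});

// In the endpoint handler:
//   const done = latency.startTimer();
//   ...handle the request...
//   done();
//   deliveries.inc({ status: '2xx' });
// In the worker, call dedupHits.inc() whenever the dedupe check short-circuits.
```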

Alert on patterns like:

  • Sustained non-2xx responses
  • Rising latency combined with rising retries (often correlated with queue depth)
  • Processing lag exceeding your business tolerance (e.g., billing/automation delayed)

Because delivery is best-effort and retries are finite, treat webhook processing as a production system with on-call runbooks.

Suggested reference architecture (simple and robust)

A minimal, production-ready setup looks like this:

Webhook endpoint

  • Raw-body capture + signature verification
  • Enqueue raw payload + headers + received timestamp
  • Return 2xx

Queue

  • SQS / PubSub / Kafka / NATS / Redis Streams (pick your standard)
  • Optional DLQ for poison messages

Worker

  • Dedup by event id
  • Validate/normalize
  • Update state store (room lifecycle, participant sessions, track metrics)
  • Trigger downstream actions (jobs, billing, analytics)
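
A sketch of the worker tying these pieces together. The queue interface is hypothetical (adapt it to your client), and processOnce() is the dedupe helper sketched earlier:

```ts
// Hypothetical queue consumer shape; substitute your queue client's API.
interface QueueMessage {
  payload: string;
  ack(): Promise<void>;
}
interface QueueConsumer {
  consume(): AsyncIterable<QueueMessage>;
}

// Dedupe helper from the idempotency section above.
declare function processOnce(id: string, fn: () => Promise<void>): Promise<void>;

async function runWorker(queue: QueueConsumer): Promise<void> {
  for await (const msg of queue.consume()) {
    const event = JSON.parse(msg.payload);
    await processOnce(event.id, async () => {
      switch (event.event) {
        case 'room_started':
        case 'room_finished':
          // update room lifecycle state
          break;
        case 'participant_joined':
        case 'participant_left':
          // open/close participant session records
          break;
        case 'egress_ended':
          // trigger post-processing of the recording
          break;
        default:
          // Unknown event types: keep the raw payload and move on, so new
          // LiveKit events never break the worker.
          break;
      }
    });
    // Ack only after dedupe + handling succeed; repeated failures should
    // route the message to the DLQ.
    await msg.ack();
  }
}
```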