Scalable Virtual Classrooms: Building 1000+ User Rooms with LiveKit and NATS
Introduction
Building real-time collaborative applications at scale is one of the most challenging problems in modern web engineering. When we set out to create a virtual classroom platform capable of hosting thousands of concurrent users, we quickly discovered that WebRTC alone couldn't solve the problem elegantly. While WebRTC delivers the low-latency, interactive experience users expect, it doesn't scale beyond a few hundred participants in a single room.
This is the story of how we built a scalable room architecture that supports 1000+ concurrent users through intelligent role-based management, side-channel signaling, and seamless user promotion—and the engineering decisions that made it possible.
The Problem: WebRTC's Scalability Ceiling
WebRTC excels at low-latency, bidirectional communication. It powers Google Meet and countless other browser-based video conferencing applications. However, WebRTC was designed for small-group communication, not large-scale scenarios.
A typical LiveKit SFU (Selective Forwarding Unit) can handle 200-300 participants before experiencing degradation. Beyond this point, several issues emerge:
Meanwhile, our virtual classroom platform needed to support 1000+ concurrent users while maintaining real-time interactivity for teachers and active students. A typical lecture might have one teacher, 10-20 active participants asking questions, and hundreds of passive observers who don't need bidirectional communication.
The challenge became clear: How do we scale WebRTC rooms beyond their natural limits while maintaining the interactive experience for those who need it?
The Solution: Role-Based Architecture
We designed a role-based architecture that separates users by their interaction needs:
| Role | Connection Type | Permissions | Capacity | Use Case |
|------|----------------|-------------|----------|----------|
| Interactive | LiveKit WebRTC | Full publish/subscribe | ~200 users | Teachers, active students |
| Passive | HTTP + SSE (no WebRTC connection) | View-only, no publish | Unlimited | Observers, late joiners |
The key insight is that in most educational scenarios, only a small percentage of users need bidirectional communication at any given time. The majority are passive observers who can be managed through a separate signaling channel without consuming WebRTC resources.
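As a rough illustration of the split described above, the sketch below assigns a role at join time. The `JoinRequest` shape, the `assignRole` name, and the cap of 200 (taken from the table's approximate capacity) are assumptions made for this example, not part of our production code.

```typescript
type Role = 'interactive' | 'passive';

// Hypothetical join request: only users who need to publish compete for WebRTC slots.
interface JoinRequest {
  wantsToPublish: boolean;
}

// Assumed cap, based on the ~200 interactive users listed in the table.
const MAX_INTERACTIVE = 200;

// Grant an interactive slot only when it is both needed and available;
// everyone else becomes a passive viewer.
function assignRole(
  request: JoinRequest,
  interactiveCount: number,
  max: number = MAX_INTERACTIVE,
): Role {
  if (request.wantsToPublish && interactiveCount < max) {
    return 'interactive';
  }
  return 'passive';
}
```

A user who is turned away at the cap still joins as a passive viewer and can be promoted later once a slot frees up.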
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                     LiveKit SFU Cluster                     │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │   Teacher   │    │  Student A  │    │  Student B  │      │
│  │  (publish)  │    │  (pub/sub)  │    │  (pub/sub)  │      │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘      │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│            Interactive Participants (WebRTC)                │
│                            │                                │
└────────────────────────────┼────────────────────────────────┘
                             │
                 ┌───────────┴───────────┐
                 │                       │
                 ▼                       ▼
        ┌─────────────────┐     ┌─────────────────┐
        │     ha-api      │     │ NATS JetStream  │
        │ (Orchestrator)  │     │   (Signaling)   │
        └────────┬────────┘     └────────┬────────┘
                 │                       │
                 ▼                       ▼
         ┌───────────────────────────────────────┐
         │        Passive Viewers (1000+)        │
         │  ┌──────────┐ ┌──────────┐ ┌──────┐   │
         │  │ Viewer 1 │ │ Viewer 2 │ │ ...  │   │
         │  │  (HTTP)  │ │  (HTTP)  │ │      │   │
         │  └──────────┘ └──────────┘ └──────┘   │
         └───────────────────────────────────────┘
The ha-api Service: Orchestration Layer
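The ha-api service is the brain of our scalable architecture. Built with Node.js and Express, it handles room creation and LiveKit token issuance, raise-hand signaling over NATS, SSE event delivery, and the promotion of passive viewers to interactive participants.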
Here's how we manage room creation and token issuance:
import { AccessToken } from 'livekit-server-sdk';

interface UserRole {
  userId: string;
  roomId: string;
  role: 'interactive' | 'passive';
  canPublish: boolean;
  canSubscribe: boolean;
}

// Build a signed LiveKit access token for one user in one room.
async function generateToken(userRole: UserRole): Promise<string> {
  const token = new AccessToken(
    process.env.LIVEKIT_API_KEY!,
    process.env.LIVEKIT_API_SECRET!,
    {
      identity: userRole.userId,
      ttl: '24h',
    }
  );

  token.addGrant({
    room: userRole.roomId,
    roomJoin: true,
    canPublish: userRole.canPublish,
    canSubscribe: userRole.canSubscribe,
    canPublishData: userRole.canPublish,
  });

  // toJwt() returns a Promise in livekit-server-sdk v2 and a string in v1;
  // returning it from an async function works with both.
  return token.toJwt();
}

// Interactive user gets full permissions
const interactiveToken = await generateToken({
  userId: 'student-123',
  roomId: 'class-456',
  role: 'interactive',
  canPublish: true,
  canSubscribe: true,
});

// Passive user gets no token (watches via alternative method)
// They can still signal via HTTP → NATS → SSE
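In the snippet above, `role`, `canPublish`, and `canSubscribe` are passed independently, so they can disagree. One possible tightening, shown here as a sketch, derives the grant from the role alone. The `RoomGrant` interface and `grantForRole` helper are hypothetical names for this example; the fields mirror the ones passed to `addGrant` above.

```typescript
type Role = 'interactive' | 'passive';

// Mirrors the grant fields used in the token code above (illustrative type).
interface RoomGrant {
  room: string;
  roomJoin: boolean;
  canPublish: boolean;
  canSubscribe: boolean;
  canPublishData: boolean;
}

// Derive permissions from the role so the two can never drift apart.
// Passive viewers get no grant at all, matching "passive user gets no token".
function grantForRole(roomId: string, role: Role): RoomGrant | null {
  if (role === 'passive') {
    return null;
  }
  return {
    room: roomId,
    roomJoin: true,
    canPublish: true,
    canSubscribe: true,
    canPublishData: true,
  };
}
```

A caller would skip token generation entirely when `grantForRole` returns `null`.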
Deep Dive: NATS JetStream Signaling
The most interesting engineering challenge we faced was this: How does a passive viewer (who has no WebRTC connection to the room) signal the moderator that they want to speak?
In a traditional WebRTC setup, participants use data channels or the signaling server to send messages. But our passive viewers are completely disconnected from the LiveKit infrastructure—they have no WebRTC connection at all.
We solved this with a side-channel signaling system using NATS JetStream.
Why NATS JetStream?
We evaluated several options for our signaling backbone:
| Technology | Pros | Cons |
|------------|------|------|
| Redis Pub/Sub | Simple, fast | No persistence, no replay |
| Kafka | Durable, scalable | Heavy, complex setup |
| RabbitMQ | Mature, reliable | Doesn't fit event-streaming model |
| NATS JetStream | Lightweight, persistent, exactly-once | Perfect fit |
NATS JetStream gave us the best of all worlds: the simplicity of Redis with the durability of Kafka, all in a single lightweight binary. In our load tests it handled 50,000+ messages per second with minimal resource usage.
Signal Flow Architecture
┌──────────────────┐                          ┌──────────────────┐
│  Passive Viewer  │                          │    Moderator     │
│  (HTTP Client)   │                          │ (WebRTC Client)  │
└────────┬─────────┘                          └────────┬─────────┘
         │                                             │
         │ HTTP POST /api/rooms/{id}/raise-hand        │
         ▼                                             │
┌──────────────────┐                                   │
│      ha-api      │                                   │
│   (Express.js)   │                                   │
└────────┬─────────┘                                   │
         │                                             │
         │ js.publish('room.{id}.signal.raise-hand')   │
         ▼                                             │
┌──────────────────┐                                   │
│  NATS JetStream  │                                   │
│     (Stream)     │                                   │
└────────┬─────────┘                                   │
         │                                             │
         │ Consumer subscription                       │
         ▼                                             │
┌──────────────────┐                                   │
│      ha-api      │──────── SSE Push ────────────────▶│
│  (SSE Endpoint)  │        'RAISE_HAND' event         │
└──────────────────┘                                   ▼
                                              ┌──────────────────┐
                                              │  Moderator sees  │
                                              │  raise-hand UI   │
                                              └──────────────────┘
Implementation Details
NATS Stream Configuration:
import { connect, RetentionPolicy, StorageType } from 'nats';

async function setupNatsStreams() {
  const nc = await connect({ servers: process.env.NATS_URL });
  const jsm = await nc.jetstreamManager();

  // Create stream for room signals
  await jsm.streams.add({
    name: 'ROOM_SIGNALS',
    // One token per wildcard: room.<roomId>.signal.<event>
    subjects: ['room.*.signal.*'],
    retention: RetentionPolicy.Limits,
    storage: StorageType.Memory,
    max_age: 3600 * 1e9, // 1 hour in nanoseconds
    max_msgs_per_subject: 1000,
  });

  return nc;
}
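To make the wildcard in the stream's subject list concrete, here is a small sketch of NATS subject matching, where `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. The `subjectMatches` helper is written for illustration only; the NATS server performs this matching itself.

```typescript
// Illustrative re-implementation of NATS subject wildcard matching.
// '*' matches exactly one token; '>' matches one or more tokens and must be last.
function subjectMatches(pattern: string, subject: string): boolean {
  const p = pattern.split('.');
  const s = subject.split('.');
  for (let i = 0; i < p.length; i++) {
    if (p[i] === '>') {
      // '>' is only valid as the final token and needs at least one token to consume.
      return i === p.length - 1 && s.length > i;
    }
    if (i >= s.length) {
      return false; // subject ran out of tokens
    }
    if (p[i] !== '*' && p[i] !== s[i]) {
      return false; // literal token mismatch
    }
  }
  return p.length === s.length;
}
```

Under these rules, `room.class-456.signal.raise-hand` matches `room.*.signal.*`, while a subject shaped like `room.class-456.user.student-123.event` does not, so per-user events would need their own stream or an additional entry in `subjects`.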
Publishing a Raise-Hand Event:
app.post('/api/rooms/:roomId/raise-hand', authenticate, async (req, res) => {
  const { roomId } = req.params;
  const { userId, displayName } = req.user;

  const subject = `room.${roomId}.signal.raise-hand`;
  const payload = JSON.stringify({
    type: 'RAISE_HAND',
    userId,
    displayName,
    timestamp: Date.now(),
    metadata: {
      viewerType: 'passive',
      connectionId: req.headers['x-connection-id'],
    },
  });

  await js.publish(subject, payload, {
    msgID: `raise-hand-${userId}-${Date.now()}`, // Deduplication
  });

  res.json({ success: true, message: 'Hand raised successfully' });
});
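JetStream deduplicates on the message ID within the stream's duplicate window (two minutes by default), so an ID that embeds the exact millisecond timestamp is effectively always unique. If the goal is to collapse rapid repeat clicks, one option is to bucket the timestamp, as in this sketch. The `raiseHandMsgId` helper and the 30-second bucket are assumptions for illustration, not what the handler above does.

```typescript
// Build a message ID that stays stable within a time bucket, so repeated
// raise-hand requests from the same user in the same bucket share one ID
// and JetStream can drop the duplicates.
function raiseHandMsgId(
  roomId: string,
  userId: string,
  nowMs: number,
  bucketMs: number = 30000, // assumed bucket size
): string {
  const bucket = Math.floor(nowMs / bucketMs);
  return `raise-hand-${roomId}-${userId}-${bucket}`;
}
```

The bucket should be no longer than the stream's duplicate window, otherwise IDs can repeat after the server has already forgotten them.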
SSE Endpoint for Moderators:
app.get('/api/rooms/:roomId/events', authenticate, requireModerator, async (req, res) => {
  const { roomId } = req.params;

  // Set up SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  // Subscribe to room signals
  const consumer = await js.consumers.get('ROOM_SIGNALS', `moderator-${roomId}`);
  const messages = await consumer.consume();

  // Register the disconnect handler before the loop below,
  // which only finishes once the consumer is stopped.
  req.on('close', () => {
    messages.stop();
  });

  const decoder = new TextDecoder();
  for await (const msg of messages) {
    // msg.data is a Uint8Array, so decode it before parsing
    const event = JSON.parse(decoder.decode(msg.data));
    res.write(`event: ${event.type}\n`);
    res.write(`data: ${JSON.stringify(event)}\n\n`);
    msg.ack();
  }
});
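The two `res.write` calls above produce one SSE frame: an `event:` line, a `data:` line, and a blank line that ends the frame. A small helper can keep that framing in one place and also handle payloads that contain newlines, which must be split across several `data:` lines. `formatSseEvent` is a hypothetical helper sketched for this article.

```typescript
// Serialize one server-sent event. Each payload line gets its own "data:" field,
// and the frame is terminated by a blank line, as the SSE format requires.
function formatSseEvent(eventType: string, data: string, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) {
    lines.push(`id: ${id}`);
  }
  lines.push(`event: ${eventType}`);
  for (const line of data.split(/\r?\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}
```

In the endpoint above this would be used as `res.write(formatSseEvent(event.type, JSON.stringify(event)))`.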
The Seamless Promotion Flow
The crown jewel of our architecture is the promotion flow—the ability to instantly upgrade a passive viewer to an active WebRTC participant without any page reload or context loss.
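The User Journey
1. A passive viewer clicks "Raise Hand", which sends an HTTP POST to ha-api.
2. ha-api publishes the signal to NATS JetStream, and the moderator receives it over SSE.
3. The moderator approves the request, and ha-api issues a LiveKit token with canPublish: true and canSubscribe: true.
4. The viewer receives a PROMOTE event over SSE and connects to the LiveKit room without reloading the page.
Client-Side Implementation (React)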
import { useEffect, useState, useCallback } from 'react';
import { LiveKitRoom, VideoConference } from '@livekit/components-react';
import PassiveViewer from './PassiveViewer';

type ViewerMode = 'passive' | 'interactive' | 'transitioning';

interface RoomViewerProps {
  roomId: string;
  userId: string;
}

export function RoomViewer({ roomId, userId }: RoomViewerProps) {
  const [viewerMode, setViewerMode] = useState<ViewerMode>('passive');
  const [livekitToken, setLivekitToken] = useState<string | null>(null);
  const [isHandRaised, setIsHandRaised] = useState(false);

  // SSE connection for receiving events
  useEffect(() => {
    const eventSource = new EventSource(
      `/api/rooms/${roomId}/user-events?userId=${userId}`
    );

    eventSource.addEventListener('PROMOTE', (event) => {
      const data = JSON.parse(event.data);
      console.log('Promotion received!', data);
      setViewerMode('transitioning');
      setLivekitToken(data.token);
      // Small delay to ensure clean transition
      setTimeout(() => {
        setViewerMode('interactive');
        setIsHandRaised(false);
      }, 500);
    });

    eventSource.addEventListener('DEMOTE', () => {
      console.log('Demotion received');
      setViewerMode('transitioning');
      setTimeout(() => {
        setLivekitToken(null);
        setViewerMode('passive');
      }, 500);
    });

    eventSource.onerror = (error) => {
      console.error('SSE connection error:', error);
      // Implement reconnection logic
    };

    return () => eventSource.close();
  }, [roomId, userId]);

  const handleRaiseHand = useCallback(async () => {
    try {
      const response = await fetch(`/api/rooms/${roomId}/raise-hand`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
      });
      // fetch() only rejects on network failure, so check the status explicitly
      if (!response.ok) {
        throw new Error(`Raise hand failed with status ${response.status}`);
      }
      setIsHandRaised(true);
    } catch (error) {
      console.error('Failed to raise hand:', error);
    }
  }, [roomId]);

  // Render based on current mode
  if (viewerMode === 'transitioning') {
    return (
      <div>Connecting to live session...</div>
    );
  }

  if (viewerMode === 'interactive' && livekitToken) {
    return (
      <LiveKitRoom
        token={livekitToken}
        serverUrl={process.env.NEXT_PUBLIC_LIVEKIT_URL}
        connect={true}
        audio={true}
        video={true}
      >
        <VideoConference />
      </LiveKitRoom>
    );
  }

  // Passive mode - no WebRTC connection
  return (
    <div>
      {/* The roomId prop on PassiveViewer is an assumption; its real props are not shown here */}
      <PassiveViewer roomId={roomId} />
      {/* Raise Hand Button */}
      <button
        onClick={handleRaiseHand}
        disabled={isHandRaised}
        className={`px-6 py-3 rounded-full font-medium transition-all ${
          isHandRaised
            ? 'bg-yellow-500 text-black'
            : 'bg-blue-600 hover:bg-blue-700 text-white'
        }`}
      >
        {isHandRaised ? '✋ Hand Raised' : '🙋 Raise Hand'}
      </button>
      {/* Passive mode indicator */}
      <div>👁️ Viewing Mode</div>
    </div>
  );
}
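The component above moves between three modes using timers inside event handlers. The same transitions can be expressed as a pure reducer, which makes duplicate or out-of-order PROMOTE and DEMOTE events easier to reason about and to test. This is a sketch of one possible design, not the code we ship: the `ViewerState` shape, the `TRANSITION_DONE` event, and the `reduceViewer` name are all assumptions.

```typescript
type ViewerMode = 'passive' | 'transitioning' | 'interactive';

type ViewerEvent =
  | { type: 'PROMOTE'; token: string }
  | { type: 'DEMOTE' }
  | { type: 'TRANSITION_DONE' }; // hypothetical event fired when the 500 ms delay elapses

interface ViewerState {
  mode: ViewerMode;
  target: 'passive' | 'interactive'; // where a transition is heading
  token: string | null;
  handRaised: boolean;
}

function reduceViewer(state: ViewerState, event: ViewerEvent): ViewerState {
  switch (event.type) {
    case 'PROMOTE':
      // Ignore a duplicate promotion for a user who is already live.
      if (state.mode === 'interactive') return state;
      return { ...state, mode: 'transitioning', target: 'interactive', token: event.token };
    case 'DEMOTE':
      // Nothing to do if the user is already a passive viewer.
      if (state.mode === 'passive') return state;
      return { ...state, mode: 'transitioning', target: 'passive' };
    case 'TRANSITION_DONE':
      if (state.mode !== 'transitioning') return state;
      return state.target === 'interactive'
        ? { ...state, mode: 'interactive', handRaised: false }
        : { ...state, mode: 'passive', token: null };
  }
}
```

In a React component this could back a `useReducer` hook, with the timer dispatching `TRANSITION_DONE` instead of calling several setters.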
Server-Side Promotion Handler
app.post('/api/rooms/:roomId/promote/:userId', authenticate, requireModerator, async (req, res) => {
  const { roomId, userId } = req.params;

  // Check if room has capacity for another interactive participant
  const roomInfo = await getRoomInfo(roomId);
  if (roomInfo.interactiveCount >= MAX_INTERACTIVE_PARTICIPANTS) {
    return res.status(400).json({
      error: 'Room at capacity for interactive participants'
    });
  }

  // Generate LiveKit token with publishing permissions
  const token = new AccessToken(
    process.env.LIVEKIT_API_KEY!,
    process.env.LIVEKIT_API_SECRET!,
    {
      identity: userId,
      ttl: '24h',
    }
  );
  token.addGrant({
    room: roomId,
    roomJoin: true,
    canPublish: true,
    canSubscribe: true,
    canPublishData: true,
  });

  // toJwt() returns a Promise in livekit-server-sdk v2, so await it;
  // awaiting is harmless on v1, where it returns a string.
  const jwt = await token.toJwt();

  // Publish promotion event via NATS
  await js.publish(`room.${roomId}.user.${userId}.event`, JSON.stringify({
    type: 'PROMOTE',
    token: jwt,
    roomId,
    timestamp: Date.now(),
  }));

  // Update user role in database
  await updateUserRole(userId, roomId, 'interactive');

  res.json({ success: true, message: 'User promoted successfully' });
});
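The handler above checks capacity and updates the role in separate asynchronous steps, so two promotions arriving at the same moment could both pass the check. One way to close that gap within a single process is to reserve the slot synchronously before doing any awaited work, as sketched below. The `InteractiveSlots` class is hypothetical, and a multi-instance deployment would need the same logic in a shared store rather than in memory.

```typescript
// In-memory tracker of interactive slots per room (single-process sketch).
class InteractiveSlots {
  private readonly held = new Map<string, Set<string>>();
  private readonly maxPerRoom: number;

  constructor(maxPerRoom: number) {
    this.maxPerRoom = maxPerRoom;
  }

  // Check and reserve in one synchronous step, so concurrent requests cannot both succeed.
  tryReserve(roomId: string, userId: string): boolean {
    const users = this.held.get(roomId) ?? new Set<string>();
    if (users.has(userId)) return true; // already holds a slot: idempotent
    if (users.size >= this.maxPerRoom) return false;
    users.add(userId);
    this.held.set(roomId, users);
    return true;
  }

  // Free the slot on demotion, disconnect, or a failed promotion.
  release(roomId: string, userId: string): void {
    const users = this.held.get(roomId);
    if (!users) return;
    users.delete(userId);
    if (users.size === 0) this.held.delete(roomId);
  }

  count(roomId: string): number {
    return this.held.get(roomId)?.size ?? 0;
  }
}
```

A promotion handler built on this would call `tryReserve` first, respond with the capacity error when it returns `false`, and call `release` if token generation or publishing fails afterwards.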
Performance Results and Lessons Learned
After extensive load testing and production deployment, here are our results:
Metrics
| Metric | Target | Achieved |
|--------|--------|----------|
| Max concurrent users | 1,000 | 1,247 |
| Promotion latency | <5s | 2.3s avg |
| Signal delivery latency | <1s | 0.4s avg |
| Infrastructure cost vs pure WebRTC | -50% | -78% |
| Client CPU usage (passive) | <10% | 4.2% |
| NATS message throughput | 10k/sec | 52k/sec |
Key Lessons
We use the subject pattern room.{id}.signal.{event}, which allows for flexible subscription patterns. Moderators can subscribe to all signals in a room, or filter by event type.
Conclusion
Building scalable real-time applications requires thinking beyond single-technology solutions. Our role-based WebRTC architecture demonstrates that by intelligently managing user roles and using side-channel signaling, we can achieve both the interactivity users expect and the scalability businesses require.
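The key architectural decisions that made this possible were separating users into interactive and passive roles, moving signaling onto a NATS JetStream side channel, and promoting passive viewers to WebRTC participants without a page reload.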
This approach has allowed us to scale our virtual classroom platform from hundreds to thousands of users while actually reducing infrastructure costs. The same patterns could be applied to webinars, live events, gaming spectator modes, or any scenario where you need to combine real-time interaction with large-scale observation.
---
Have questions about implementing similar architectures? We'd love to hear from you in the comments below.