Portfolio of Yousef Kakhki - Software Developer

Introduction

Building real-time collaborative applications at scale is one of the most challenging problems in modern web engineering. When we set out to create a virtual classroom platform capable of hosting thousands of concurrent users, we quickly discovered that WebRTC alone couldn't solve the problem elegantly. While WebRTC delivers the low-latency, interactive experience users expect, it doesn't scale beyond a few hundred participants in a single room.

This is the story of how we built a scalable room architecture that supports 1000+ concurrent users through intelligent role-based management, side-channel signaling, and seamless user promotion—and the engineering decisions that made it possible.

The Problem: WebRTC's Scalability Ceiling

WebRTC excels at low-latency, bidirectional communication. It powers Google Meet and many other video conferencing applications. However, it was designed for small-group communication, not for rooms with thousands of participants.

A typical LiveKit SFU (Selective Forwarding Unit) can handle 200-300 participants before experiencing degradation. Beyond this point, several issues emerge:

CPU and bandwidth exhaustion on the SFU as it forwards media streams to each participant

Client-side resource constraints as browsers struggle to decode multiple video streams

Network complexity as the number of forwarded streams grows roughly quadratically with participant count

Cost scaling as infrastructure requirements grow linearly with participant count
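To make the fan-out cost concrete, here is a rough back-of-the-envelope sketch. It assumes every participant subscribes to every publisher except themselves, which is a simplification (a real SFU can pause or downscale tracks nobody is viewing). `forwardedStreams` is a hypothetical helper for illustration, not part of LiveKit.

```typescript
// Total media streams an SFU forwards when every participant subscribes
// to every publisher except themselves (simplifying assumption).
function forwardedStreams(publishers: number, participants: number): number {
  if (publishers < 0 || publishers > participants) {
    throw new RangeError('publishers must be between 0 and participants');
  }
  // Each publisher's stream is forwarded to all other participants.
  return publishers * (participants - 1);
}

// Everyone publishes in a 300-person room: 300 * 299 forwarded streams.
const everyonePublishes = forwardedStreams(300, 300); // 89700

// Only 20 of the same 300 people publish: 20 * 299 forwarded streams.
const fewPublish = forwardedStreams(20, 300); // 5980
```

Under this model the load is driven by publishers times subscribers, which is why limiting who may publish matters as much as limiting who may join.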

Meanwhile, our virtual classroom platform needed to support 1000+ concurrent users while maintaining real-time interactivity for teachers and active students. A typical lecture might have one teacher, 10-20 active participants asking questions, and hundreds of passive observers who don't need bidirectional communication.

The challenge became clear: How do we scale WebRTC rooms beyond their natural limits while maintaining the interactive experience for those who need it?

The Solution: Role-Based Architecture

We designed a role-based architecture that separates users by their interaction needs:

| Role | Connection Type | Permissions | Capacity | Use Case |
|------|----------------|-------------|----------|----------|
| Interactive | LiveKit WebRTC | Full publish/subscribe | ~200 users | Teachers, active students |
| Passive | No direct connection | View-only, no publish | Unlimited | Observers, late joiners |

The key insight is that in most educational scenarios, only a small percentage of users need bidirectional communication at any given time. The majority are passive observers who can be managed through a separate signaling channel without consuming WebRTC resources.
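As a minimal sketch of that idea, role assignment can be reduced to a capacity check. The `JoinRequest` shape, the `wantsToPublish` flag, and the `assignRole` helper are assumptions made for illustration; only the ~200 cap comes from the table above.

```typescript
type Role = 'interactive' | 'passive';

interface JoinRequest {
  userId: string;
  wantsToPublish: boolean; // hypothetical flag set by the client or teacher
}

// Assumed cap, taken from the ~200 interactive users in the table above.
const MAX_INTERACTIVE = 200;

// A user becomes interactive only if they need to publish AND a slot is free.
function assignRole(
  request: JoinRequest,
  interactiveCount: number,
  maxInteractive: number = MAX_INTERACTIVE,
): Role {
  const hasCapacity = interactiveCount < maxInteractive;
  return request.wantsToPublish && hasCapacity ? 'interactive' : 'passive';
}

const teacher = assignRole({ userId: 'teacher-1', wantsToPublish: true }, 0);
const observer = assignRole({ userId: 'student-7', wantsToPublish: false }, 10);
const lateJoiner = assignRole({ userId: 'student-900', wantsToPublish: true }, 200);
```

Here `teacher` is interactive, while `observer` and `lateJoiner` fall back to passive: one by choice, the other because the room is full.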

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                      LiveKit SFU Cluster                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐         │
│  │   Teacher   │  │  Student A  │  │  Student B  │         │
│  │  (publish)  │  │ (pub/sub)   │  │ (pub/sub)   │         │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘         │
│         └────────────────┼────────────────┘                 │
│                          │                                  │
│         Interactive Participants (WebRTC)                   │
│                          │                                  │
└──────────────────────────┼──────────────────────────────────┘
                           │
                           │
                  ┌─────────┴─────────┐
                  │                   │
                  ▼                   ▼
         ┌──────────────┐    ┌──────────────┐
         │    ha-api    │    │ NATS JetStream│
         │ (Orchestrator)│    │  (Signaling)  │
         └──────┬───────┘    └──────┬───────┘
                │                   │
                │                   │
                ▼                   ▼
    ┌───────────────────────────────────────┐
    │      Passive Viewers (1000+)         │
    │  ┌──────────┐  ┌──────────┐  ┌──────┐│
    │  │ Viewer 1 │  │ Viewer 2 │  │ ...  ││
    │  │ (HTTP)   │  │ (HTTP)   │  │      ││
    │  └──────────┘  └──────────┘  └──────┘│
    └───────────────────────────────────────┘

The ha-api Service: Orchestration Layer

The ha-api service is the brain of our scalable architecture. Built with Node.js and Express, it handles:

Room lifecycle management - Creating rooms, managing participant limits

Token generation - Issuing appropriate LiveKit tokens based on user role

User promotion/demotion - Moving users between interactive and passive roles

Signal bridging - Connecting passive viewers to the interactive layer via NATS

Here's how we manage room creation and token issuance:

import { AccessToken } from 'livekit-server-sdk';

interface UserRole {
  userId: string;
  roomId: string;
  role: 'interactive' | 'passive';
  canPublish: boolean;
  canSubscribe: boolean;
}

// Builds a signed LiveKit JWT whose grants mirror the user's permissions.
async function generateToken(userRole: UserRole): Promise<string> {
  const token = new AccessToken(
    process.env.LIVEKIT_API_KEY!,
    process.env.LIVEKIT_API_SECRET!,
    {
      identity: userRole.userId,
      ttl: '24h',
    }
  );

  token.addGrant({
    room: userRole.roomId,
    roomJoin: true,
    canPublish: userRole.canPublish,
    canSubscribe: userRole.canSubscribe,
    canPublishData: userRole.canPublish,
  });

  // toJwt() returns a Promise in livekit-server-sdk v2; awaiting is harmless on v1.
  return await token.toJwt();
}

// Interactive user gets full permissions
const interactiveToken = await generateToken({
  userId: 'student-123',
  roomId: 'class-456',
  role: 'interactive',
  canPublish: true,
  canSubscribe: true,
});

// Passive user gets no token (watches via alternative method)
// They can still signal via HTTP → NATS → SSE
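The mapping from role to permissions can also be kept in one small pure function, which makes the "passive users get no token" rule explicit and easy to test. This is a sketch under assumptions: `RoomGrants` and `grantsForRole` are hypothetical names, and the field names simply mirror the grant fields used above.

```typescript
type Role = 'interactive' | 'passive';

interface RoomGrants {
  roomJoin: boolean;
  canPublish: boolean;
  canSubscribe: boolean;
  canPublishData: boolean;
}

// Returns the grants to embed in a token, or null when no token should be issued.
function grantsForRole(role: Role): RoomGrants | null {
  if (role === 'passive') {
    // Passive viewers never connect to the SFU, so they get no token at all.
    return null;
  }
  return {
    roomJoin: true,
    canPublish: true,
    canSubscribe: true,
    canPublishData: true,
  };
}

const interactiveGrants = grantsForRole('interactive');
const passiveGrants = grantsForRole('passive');
```

Returning `null` instead of a restricted grant keeps the caller from accidentally minting a subscribe-only token for a viewer who should stay off WebRTC entirely.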

Deep Dive: NATS JetStream Signaling

The most interesting engineering challenge we faced was this: How does a passive viewer (who has no WebRTC connection to the room) signal the moderator that they want to speak?

In a traditional WebRTC setup, participants use data channels or the signaling server to send messages. But our passive viewers are completely disconnected from the LiveKit infrastructure—they have no WebRTC connection at all.

We solved this with a side-channel signaling system using NATS JetStream.

Why NATS JetStream?

We evaluated several options for our signaling backbone:

| Technology | Pros | Cons |
|------------|------|------|
| Redis Pub/Sub | Simple, fast | No persistence, no replay |
| Kafka | Durable, scalable | Heavy, complex setup |
| RabbitMQ | Mature, reliable | Doesn't fit event-streaming model |
| NATS JetStream | Lightweight, persistent, exactly-once | No significant drawbacks for our use case |

NATS JetStream gave us the simplicity of Redis with the durability of Kafka, all in a single lightweight binary. In our load tests it sustained 50,000+ messages per second with minimal resource usage.

Signal Flow Architecture

┌──────────────────┐                              ┌──────────────────┐
│  Passive Viewer  │                              │    Moderator     │
│   (HTTP Client)  │                              │ (WebRTC Client)  │
└────────┬─────────┘                              └────────┬─────────┘
         │                                                  │
         │ HTTP POST /api/rooms/{id}/raise-hand            │
         ▼                                                  │
┌──────────────────┐                                       │
│     ha-api       │                                       │
│  (Express.js)    │                                       │
└────────┬─────────┘                                       │
         │                                                  │
         │ js.publish('room.{id}.signal.raise-hand')       │
         ▼                                                  │
┌──────────────────┐                                       │
│  NATS JetStream  │                                       │
│    (Stream)      │                                       │
└────────┬─────────┘                                       │
         │                                                  │
         │ Consumer subscription                           │
         ▼                                                  │
┌──────────────────┐                                       │
│     ha-api       │──────── SSE Push ─────────────────────▶
│  (SSE Endpoint)  │         'RAISE_HAND' event            │
└──────────────────┘                                       ▼
                                              ┌──────────────────┐
                                              │  Moderator sees  │
                                              │  raise-hand UI   │
                                              └──────────────────┘

Implementation Details

NATS Stream Configuration:

import { connect, RetentionPolicy, StorageType } from 'nats';

async function setupNatsStreams() {
  const nc = await connect({ servers: process.env.NATS_URL });
  const jsm = await nc.jetstreamManager();

  // Create stream for room signals
  await jsm.streams.add({
    name: 'ROOM_SIGNALS',
    // '*' matches exactly one token: room.<roomId>.signal.<event>
    subjects: ['room.*.signal.*'],
    retention: RetentionPolicy.Limits,
    storage: StorageType.Memory,
    max_age: 3600 * 1e9, // 1 hour in nanoseconds
    max_msgs_per_subject: 1000,
  });

  return nc;
}
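NATS subjects are dot-separated tokens, where `*` matches exactly one token and `>` matches one or more trailing tokens. The following is a small illustrative matcher for reasoning about which subjects a stream or consumer filter would capture. `matchesSubject` is a hypothetical helper written for this post, not a function exported by the `nats` package.

```typescript
// Illustrative re-implementation of NATS subject wildcard matching.
function matchesSubject(pattern: string, subject: string): boolean {
  const patternTokens = pattern.split('.');
  const subjectTokens = subject.split('.');

  for (let i = 0; i < patternTokens.length; i++) {
    const token = patternTokens[i];
    if (token === '>') {
      // '>' is only valid as the last token and needs at least one token to match.
      return i === patternTokens.length - 1 && subjectTokens.length > i;
    }
    if (i >= subjectTokens.length) {
      return false; // subject ran out of tokens
    }
    if (token !== '*' && token !== subjectTokens[i]) {
      return false; // literal token mismatch
    }
  }
  // Without '>', both sides must have the same number of tokens.
  return patternTokens.length === subjectTokens.length;
}

const raiseHand = matchesSubject('room.*.signal.*', 'room.class-456.signal.raise-hand');
const userEvent = matchesSubject('room.*.signal.*', 'room.class-456.user.student-123.event');
const everything = matchesSubject('room.class-456.>', 'room.class-456.signal.raise-hand');
```

With these inputs `raiseHand` and `everything` are `true` and `userEvent` is `false`, which is a reminder that every subject published through JetStream needs a stream whose subject list covers it.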

Publishing a Raise-Hand Event:

app.post('/api/rooms/:roomId/raise-hand', authenticate, async (req, res) => {
  const { roomId } = req.params;
  const { userId, displayName } = req.user;
  
  const subject = `room.${roomId}.signal.raise-hand`;
  const payload = JSON.stringify({
    type: 'RAISE_HAND',
    userId,
    displayName,
    timestamp: Date.now(),
    metadata: {
      viewerType: 'passive',
      connectionId: req.headers['x-connection-id'],
    },
  });
  
  await js.publish(subject, payload, {
    msgID: `raise-hand-${userId}-${Date.now()}`, // Sent as Nats-Msg-Id for deduplication
  });
  
  res.json({ success: true, message: 'Hand raised successfully' });
});
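One caveat about the message ID: JetStream deduplicates on the ID within the stream's duplicate window, so an ID that embeds the raw millisecond timestamp is effectively unique per request and will not collapse repeated clicks. A possible alternative, sketched here as an assumption rather than as what the handler above does, is to round the timestamp into buckets. `raiseHandMsgId` and the 10-second window are hypothetical.

```typescript
// Derives a message ID that stays the same for repeated raise-hand requests
// from one user within the same time bucket.
function raiseHandMsgId(
  roomId: string,
  userId: string,
  nowMs: number,
  windowMs: number = 10_000, // assumed bucket size
): string {
  const bucket = Math.floor(nowMs / windowMs);
  return `raise-hand-${roomId}-${userId}-${bucket}`;
}

const firstClick = raiseHandMsgId('class-456', 'student-123', 10_000);
const doubleClick = raiseHandMsgId('class-456', 'student-123', 19_999);
const laterClick = raiseHandMsgId('class-456', 'student-123', 20_000);
```

`firstClick` and `doubleClick` are identical, so the second publish would be treated as a duplicate, while `laterClick` falls into the next bucket. Two clicks that straddle a bucket boundary still produce different IDs, so this reduces duplicates rather than eliminating them.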

SSE Endpoint for Moderators:

app.get('/api/rooms/:roomId/events', authenticate, requireModerator, async (req, res) => {
  const { roomId } = req.params;
  
  // Set up SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();
  
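  // Subscribe to room signals (assumes a durable consumer named
  // moderator-<roomId> already exists on the stream)
  const consumer = await js.consumers.get('ROOM_SIGNALS', `moderator-${roomId}`);
  const messages = await consumer.consume();

  // Register cleanup before the loop: the for-await below only returns
  // once the iterator has been stopped.
  req.on('close', () => {
    messages.stop();
  });

  for await (const msg of messages) {
    // msg.data is a byte array, so decode it with json() rather than toString()
    const event = msg.json<{ type: string }>();
    res.write(`event: ${event.type}\n`);
    res.write(`data: ${JSON.stringify(event)}\n\n`);
    msg.ack();
  }
});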
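The wire format written by the endpoint is simple enough to isolate in a helper: each Server-Sent Event is a set of `field: value` lines terminated by a blank line. The sketch below shows one way to build such a frame. `formatSseEvent` is a hypothetical helper, and the optional `id` field is an addition that the endpoint above does not send.

```typescript
// Serializes one Server-Sent Event frame.
function formatSseEvent(type: string, data: unknown, id?: string): string {
  if (/[\r\n]/.test(type)) {
    throw new Error('event type must not contain line breaks');
  }
  const lines: string[] = [];
  if (id !== undefined) {
    lines.push(`id: ${id}`);
  }
  lines.push(`event: ${type}`);
  // JSON.stringify escapes newlines inside strings, so one data line is enough.
  lines.push(`data: ${JSON.stringify(data)}`);
  // A blank line marks the end of the event.
  return lines.join('\n') + '\n\n';
}

const frame = formatSseEvent('RAISE_HAND', { userId: 'student-123' });
// frame === 'event: RAISE_HAND\ndata: {"userId":"student-123"}\n\n'
```

On the client, `EventSource` dispatches this frame to listeners registered with `addEventListener('RAISE_HAND', ...)`, with the JSON string available as `event.data`.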

The Seamless Promotion Flow

The crown jewel of our architecture is the promotion flow—the ability to instantly upgrade a passive viewer to an active WebRTC participant without any page reload or context loss.

The User Journey

Passive viewer observes the room - No WebRTC connection, minimal resource usage

Viewer raises hand - HTTP request to ha-api → NATS → SSE to moderator

Moderator approves - Clicks "Promote" button in their UI

ha-api generates LiveKit token - With canPublish: true and canSubscribe: true

SSE pushes PROMOTE event - Contains the new token and room info

React client hot-swaps components - Passive viewer component unmounts, LiveKit room mounts

User is now interactive - Can speak, share video, participate fully via WebRTC
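The journey above can be modeled as a small state machine, which helps when reasoning about out-of-order events (for example, a PROMOTE arriving while the user is already interactive). This is a sketch under assumptions: the `TRANSITION_DONE` event and the `nextState` function are hypothetical and stand in for whatever marks the end of the loading state in the real client.

```typescript
type ViewerMode = 'passive' | 'transitioning' | 'interactive';
type ViewerEvent = 'PROMOTE' | 'DEMOTE' | 'TRANSITION_DONE';

interface ViewerState {
  mode: ViewerMode;
  // Where a transition in progress will land.
  target: 'passive' | 'interactive';
}

// Pure transition function: events that make no sense in the current mode are ignored.
function nextState(state: ViewerState, event: ViewerEvent): ViewerState {
  if (event === 'PROMOTE' && state.mode === 'passive') {
    return { mode: 'transitioning', target: 'interactive' };
  }
  if (event === 'DEMOTE' && state.mode === 'interactive') {
    return { mode: 'transitioning', target: 'passive' };
  }
  if (event === 'TRANSITION_DONE' && state.mode === 'transitioning') {
    return { mode: state.target, target: state.target };
  }
  return state;
}

const start: ViewerState = { mode: 'passive', target: 'passive' };
const promoting = nextState(start, 'PROMOTE');
const promoted = nextState(promoting, 'TRANSITION_DONE');
const ignored = nextState(start, 'DEMOTE');
```

After these calls `promoting.mode` is `'transitioning'`, `promoted.mode` is `'interactive'`, and `ignored` is the unchanged starting state.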

Client-Side Implementation (React)

import { useEffect, useState, useCallback } from 'react';
import { LiveKitRoom, VideoConference } from '@livekit/components-react';
import PassiveViewer from './PassiveViewer';

type ViewerMode = 'passive' | 'interactive' | 'transitioning';

interface RoomViewerProps {
  roomId: string;
  userId: string;
}

export function RoomViewer({ roomId, userId }: RoomViewerProps) {
  const [viewerMode, setViewerMode] = useState<ViewerMode>('passive');
  const [livekitToken, setLivekitToken] = useState<string | null>(null);
  const [isHandRaised, setIsHandRaised] = useState(false);
  
  // SSE connection for receiving events
  useEffect(() => {
    const eventSource = new EventSource(
      `/api/rooms/${roomId}/user-events?userId=${userId}`
    );
    
    eventSource.addEventListener('PROMOTE', (event) => {
      const data = JSON.parse(event.data);
      console.log('Promotion received!', data);
      
      setViewerMode('transitioning');
      setLivekitToken(data.token);
      
      // Small delay to ensure clean transition
      setTimeout(() => {
        setViewerMode('interactive');
        setIsHandRaised(false);
      }, 500);
    });
    
    eventSource.addEventListener('DEMOTE', (event) => {
      console.log('Demotion received');
      setViewerMode('transitioning');
      
      setTimeout(() => {
        setLivekitToken(null);
        setViewerMode('passive');
      }, 500);
    });
    
    eventSource.onerror = (error) => {
      console.error('SSE connection error:', error);
      // Implement reconnection logic
    };
    
    return () => eventSource.close();
  }, [roomId, userId]);
  
  const handleRaiseHand = useCallback(async () => {
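    try {
      const response = await fetch(`/api/rooms/${roomId}/raise-hand`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
      });
      // fetch only rejects on network failure, so check the status explicitly
      if (!response.ok) {
        throw new Error(`Raise hand failed with status ${response.status}`);
      }
      setIsHandRaised(true);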
    } catch (error) {
      console.error('Failed to raise hand:', error);
    }
  }, [roomId]);
  
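  // Render based on current mode
  // (wrapper elements below are plain placeholders; styling classes omitted)
  if (viewerMode === 'transitioning') {
    return (
      <div>
        <div>
          <div />
          <p>Connecting to live session...</p>
        </div>
      </div>
    );
  }

  if (viewerMode === 'interactive' && livekitToken) {
    return (
      <LiveKitRoom
        token={livekitToken}
        serverUrl={process.env.NEXT_PUBLIC_LIVEKIT_URL}
        connect={true}
        audio={true}
        video={true}
      >
        <VideoConference />
      </LiveKitRoom>
    );
  }

  // Passive mode - no WebRTC connection
  return (
    <div>
      {/* The roomId prop is an assumption; PassiveViewer's real props are not shown */}
      <PassiveViewer roomId={roomId} />

      {/* Raise Hand Button */}
      <div>
        <button
          onClick={handleRaiseHand}
          disabled={isHandRaised}
          className={`px-6 py-3 rounded-full font-medium transition-all ${
            isHandRaised
              ? 'bg-yellow-500 text-black'
              : 'bg-blue-600 hover:bg-blue-700 text-white'
          }`}
        >
          {isHandRaised ? '✋ Hand Raised' : '🙋 Raise Hand'}
        </button>
      </div>

      {/* Passive mode indicator */}
      <div>
        👁️ Viewing Mode
      </div>
    </div>
  );
}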
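Because the PROMOTE payload carries a credential, it can be worth validating it before handing the token to the LiveKit component instead of trusting `JSON.parse` directly. The sketch below assumes the payload has the `type`, `token`, `roomId`, and `timestamp` fields that the server-side handler publishes. `parsePromoteEvent` is a hypothetical helper, not part of the component above.

```typescript
interface PromoteEvent {
  type: 'PROMOTE';
  token: string;
  roomId: string;
  timestamp: number;
}

// Returns a typed event, or null if the payload is malformed.
function parsePromoteEvent(raw: string): PromoteEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== 'object' || value === null) {
    return null;
  }
  const { type, token, roomId, timestamp } = value as Record<string, unknown>;
  if (type !== 'PROMOTE') return null;
  if (typeof token !== 'string' || token.length === 0) return null;
  if (typeof roomId !== 'string') return null;
  if (typeof timestamp !== 'number') return null;
  return { type: 'PROMOTE', token, roomId, timestamp };
}

const valid = parsePromoteEvent(
  '{"type":"PROMOTE","token":"abc","roomId":"class-456","timestamp":1}',
);
const missingToken = parsePromoteEvent('{"type":"PROMOTE","roomId":"class-456","timestamp":1}');
```

In the PROMOTE listener, a `null` result would be a reason to stay in passive mode and log the problem rather than switch to the transitioning state.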

Server-Side Promotion Handler

app.post('/api/rooms/:roomId/promote/:userId', authenticate, requireModerator, async (req, res) => {
  const { roomId, userId } = req.params;
  
  // Check if room has capacity for another interactive participant
  const roomInfo = await getRoomInfo(roomId);
  if (roomInfo.interactiveCount >= MAX_INTERACTIVE_PARTICIPANTS) {
    return res.status(400).json({ 
      error: 'Room at capacity for interactive participants' 
    });
  }
  
  // Generate LiveKit token with publishing permissions
  const token = new AccessToken(
    process.env.LIVEKIT_API_KEY!,
    process.env.LIVEKIT_API_SECRET!,
    {
      identity: userId,
      ttl: '24h',
    }
  );
  
  token.addGrant({
    room: roomId,
    roomJoin: true,
    canPublish: true,
    canSubscribe: true,
    canPublishData: true,
  });
  
  // toJwt() returns a Promise in livekit-server-sdk v2
  const jwt = await token.toJwt();
  
  // Publish promotion event via NATS
  await js.publish(`room.${roomId}.user.${userId}.event`, JSON.stringify({
    type: 'PROMOTE',
    token: jwt,
    roomId,
    timestamp: Date.now(),
  }));
  
  // Update user role in database
  await updateUserRole(userId, roomId, 'interactive');
  
  res.json({ success: true, message: 'User promoted successfully' });
});
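One thing to watch in a handler like this is the gap between reading the interactive count and recording the new role: two promotions arriving at the same moment could both pass the capacity check. A minimal sketch of a check-and-reserve step that closes this gap within a single Node.js process is shown below. `InteractiveSlots` is a hypothetical class, and a deployment with several ha-api instances would need the same logic in a shared store instead.

```typescript
// Tracks which users hold an interactive slot in one room.
class InteractiveSlots {
  private readonly holders = new Set<string>();
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Checks capacity and reserves in one synchronous step, so no other
  // request can interleave between the check and the reservation.
  tryReserve(userId: string): boolean {
    if (this.holders.has(userId)) {
      return true; // already promoted; treat as idempotent
    }
    if (this.holders.size >= this.max) {
      return false;
    }
    this.holders.add(userId);
    return true;
  }

  // Frees the slot on demotion, disconnect, or a failed promotion.
  release(userId: string): void {
    this.holders.delete(userId);
  }

  get count(): number {
    return this.holders.size;
  }
}

const slots = new InteractiveSlots(2);
const first = slots.tryReserve('student-1');
const second = slots.tryReserve('student-2');
const third = slots.tryReserve('student-3');
```

With a cap of 2, `first` and `second` succeed and `third` is refused. Calling `slots.release('student-1')` afterwards would make room for the next promotion.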

Performance Results and Lessons Learned

After extensive load testing and production deployment, here are our results:

Metrics

| Metric | Target | Achieved |
|--------|--------|----------|
| Max concurrent users | 1,000 | 1,247 |
| Promotion latency | <5s | 2.3s avg |
| Signal delivery latency | <1s | 0.4s avg |
| Infrastructure cost vs pure WebRTC | -50% | -78% |
| Client CPU usage (passive) | <10% | 4.2% |
| NATS message throughput | 10k/sec | 52k/sec |

Key Lessons

NATS is incredibly lightweight - A single NATS server handles 50,000+ messages/second with minimal resource usage. JetStream's persistence adds negligible overhead. This made it perfect for our signaling layer.

SSE > WebSocket for simple push - For one-way server-to-client communication, SSE is simpler to implement and more reliable than WebSocket. It also works better with load balancers and doesn't require connection upgrade logic.

Transition UX is critical - Users initially found the 2-3 second promotion delay jarring. Adding a "transitioning" state with a loading animation dramatically improved perceived performance. The visual feedback made the wait feel intentional rather than broken.

Role-based architecture scales - By limiting interactive participants to ~200 and keeping the rest as passive observers, we achieved 6x capacity improvement with 78% cost reduction. The key was recognizing that not all users need the same level of interactivity.

NATS subject naming matters - We use hierarchical subjects like room.{id}.signal.{event} which allows for flexible subscription patterns. Moderators can subscribe to all signals in a room, or filter by event type.

Token generation is fast - Generating LiveKit tokens on-demand for promotions adds only ~50ms latency. We cache room metadata but generate tokens fresh to ensure proper permissions.
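As a quick check on the "6x" figure in the role-based lesson above, the ratio can be computed from two numbers reported in this post: 1,247 concurrent users achieved and a cap of roughly 200 interactive participants. `capacityMultiple` is a hypothetical helper used only for this arithmetic.

```typescript
// Ratio of total room size to the interactive (WebRTC) cap.
function capacityMultiple(totalUsers: number, interactiveCap: number): number {
  if (interactiveCap <= 0) {
    throw new RangeError('interactiveCap must be positive');
  }
  return totalUsers / interactiveCap;
}

const multiple = capacityMultiple(1247, 200);
const rounded = Math.round(multiple * 10) / 10; // one decimal place
```

`multiple` comes out a little above 6, which matches the rounded 6x quoted in the lesson.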

Conclusion

Building scalable real-time applications requires thinking beyond single-technology solutions. Our role-based WebRTC architecture demonstrates that by intelligently managing user roles and using side-channel signaling, we can achieve both the interactivity users expect and the scalability businesses require.

The key architectural decisions that made this possible:

Role-based tier separation - Not all users need the same level of interactivity. Passive observers don't need WebRTC connections.

Side-channel signaling with NATS - Decoupling passive viewers from the WebRTC infrastructure allows unlimited scale for observers.

Seamless client-side transitions - Making the technology invisible to the end user through hot-swapping components without page reloads.

Intelligent capacity management - Actively managing the number of interactive participants to stay within WebRTC's sweet spot.

This approach has allowed us to scale our virtual classroom platform from hundreds to thousands of users while actually reducing infrastructure costs. The same patterns could be applied to webinars, live events, gaming spectator modes, or any scenario where you need to combine real-time interaction with large-scale observation.

---

Have questions about implementing similar architectures? We'd love to hear from you in the comments below.

Scalable Virtual Classrooms: Building 1000+ User Rooms with LiveKit and NATS