Building an Award-Winning AI Hologram: How We Won Best Booth at ITEX 2024

12/18/2025 · 10 min

ai, computer-vision, python, fastapi, hologram, award

The Challenge

When Capitalino decided to participate in ITEX 2024, we wanted to create something that would truly stand out. A regular booth with brochures and demos wouldn't cut it. We needed something innovative, interactive, and memorable.

That's when the idea of an AI-powered hologram installation was born. The goal was an interactive experience in which visitors could engage with a holographic character that would respond to gestures, answer questions, and showcase Capitalino's innovative spirit.

The Vision

We envisioned a system where:

  • Visitors could approach the hologram

  • The hologram would detect gestures and respond

  • Natural language processing would enable conversations

  • Real-time AI would generate contextual responses

  • The experience would be visually stunning

Technical Architecture

    ┌─────────────────────────────────────────┐
    │  Camera System                          │
    │  - Depth sensing                        │
    │  - Gesture recognition                  │
    │  - Person detection                     │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  AI Processing Layer                    │
    │  - Computer vision                      │
    │  - Gesture classification               │
    │  - Natural language processing          │
    │  - Response generation                  │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  Hologram Display                       │
    │  - 3D rendering                         │
    │  - Animation system                     │
    │  - Audio output                         │
    └─────────────────────────────────────────┘
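
The flow is strictly one-directional: frames come in from the camera, the AI layer turns them into response events, and the display layer acts on them. As a conceptual sketch only (not the exhibition code), the three layers can be pictured as stages connected by queues:

    import asyncio

    async def camera_stage(frames: asyncio.Queue):
        # Stand-in for the camera layer: push a few dummy "frames" downstream
        for i in range(3):
            await frames.put(f"frame-{i}")
        await frames.put(None)  # sentinel: no more frames

    async def ai_stage(frames: asyncio.Queue, responses: asyncio.Queue):
        # Stand-in for the AI layer: turn each frame into a response event
        while (frame := await frames.get()) is not None:
            await responses.put({'frame': frame, 'action': 'greet'})
        await responses.put(None)

    async def display_stage(responses: asyncio.Queue):
        # Stand-in for the display layer: act on each response event
        while (event := await responses.get()) is not None:
            print('render', event['action'], 'for', event['frame'])

    async def main():
        frames, responses = asyncio.Queue(), asyncio.Queue()
        await asyncio.gather(
            camera_stage(frames),
            ai_stage(frames, responses),
            display_stage(responses),
        )

    asyncio.run(main())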

Implementation

Computer Vision Pipeline

    import cv2
    import mediapipe as mp
    import numpy as np

    class GestureRecognizer:
        def __init__(self):
            self.mp_hands = mp.solutions.hands
            self.hands = self.mp_hands.Hands(
                static_image_mode=False,
                max_num_hands=2,
                min_detection_confidence=0.7
            )

        def recognize_gesture(self, frame):
            # Convert BGR to RGB (MediaPipe expects RGB input)
            rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

            # Process frame
            results = self.hands.process(rgb_frame)

            if results.multi_hand_landmarks:
                # Extract landmarks of the first detected hand
                landmarks = results.multi_hand_landmarks[0]

                # Classify gesture
                gesture = self.classify_gesture(landmarks)

                return gesture

            return None

        def classify_gesture(self, landmarks):
            # Extract key points
            thumb_tip = landmarks.landmark[4]
            index_tip = landmarks.landmark[8]

            # Calculate distance between thumb tip and index tip
            thumb_index_dist = np.sqrt(
                (thumb_tip.x - index_tip.x) ** 2 +
                (thumb_tip.y - index_tip.y) ** 2
            )

            # Classify based on distances and positions
            if thumb_index_dist < 0.05:
                return 'point'
            elif self.is_fist(landmarks):
                return 'wave'
            else:
                return 'open_hand'

        def is_fist(self, landmarks):
            # Simple heuristic: the hand is a fist when each fingertip
            # sits below its middle (PIP) joint in image coordinates
            finger_tips = [8, 12, 16, 20]
            finger_pips = [6, 10, 14, 18]
            return all(
                landmarks.landmark[tip].y > landmarks.landmark[pip].y
                for tip, pip in zip(finger_tips, finger_pips)
            )
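
For local development, the recognizer can be smoke-tested against a plain webcam feed. The snippet below is a minimal sketch assuming OpenCV's default camera index 0 and a desktop preview window; it is not the exhibition capture pipeline:

    # Hypothetical smoke test: run the recognizer against a local webcam
    recognizer = GestureRecognizer()
    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break

        gesture = recognizer.recognize_gesture(frame)
        if gesture:
            print(f"Detected gesture: {gesture}")

        cv2.imshow('Gesture preview', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()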

Real-Time AI Inference

    from fastapi import FastAPI, WebSocket
    import torch
    from transformers import pipeline

    app = FastAPI()

    # Load AI models
    nlp_pipeline = pipeline(
        'text-generation',
        model='gpt2',
        device=0 if torch.cuda.is_available() else -1
    )

    gesture_classifier = load_gesture_model()  # custom loader, not shown here

    @app.websocket("/ws/hologram")
    async def hologram_websocket(websocket: WebSocket):
        await websocket.accept()

        try:
            while True:
                # Receive data from client
                data = await websocket.receive_json()

                if data['type'] == 'gesture':
                    # Process gesture
                    gesture = data['gesture']
                    response = await process_gesture(gesture)

                    await websocket.send_json({
                        'type': 'hologram_response',
                        'action': response['action'],
                        'text': response['text']
                    })

                elif data['type'] == 'speech':
                    # Process speech
                    text = data['text']
                    response = await process_speech(text)

                    await websocket.send_json({
                        'type': 'hologram_response',
                        'text': response['text'],
                        'animation': response['animation']
                    })

        except Exception as e:
            print(f"WebSocket error: {e}")
        finally:
            await websocket.close()

    async def process_gesture(gesture: str):
        # Map gestures to responses
        gesture_responses = {
            'wave': {
                'action': 'wave_back',
                'text': 'Hello! Welcome to Capitalino.'
            },
            'point': {
                'action': 'look_at',
                'text': 'I see you! How can I help?'
            },
            'open_hand': {
                'action': 'greet',
                'text': 'Nice to meet you!'
            }
        }

        return gesture_responses.get(gesture, {
            'action': 'idle',
            'text': ''
        })

    async def process_speech(text: str):
        # Build a prompt that gives the model context about its role
        context = f"You are a friendly hologram assistant for Capitalino, a FinTech company. User said: {text}"

        response = nlp_pipeline(
            context,
            max_length=100,
            num_return_sequences=1,
            temperature=0.7
        )[0]['generated_text']

        # Strip the prompt so only the generated reply remains
        response_text = response.split('User said:')[1].strip() if 'User said:' in response else response

        return {
            'text': response_text,
            'animation': 'speak'
        }
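
To make the message contract concrete, here is a rough client-side sketch that sends a single gesture event and prints the reply. It assumes the third-party `websockets` package and a server running locally on port 8000; it is only an illustration of the JSON messages the endpoint expects:

    import asyncio
    import json
    import websockets  # assumed client library, not part of the server code above

    async def send_gesture(gesture: str):
        # Connect to the hologram endpoint (local dev URL assumed)
        async with websockets.connect("ws://localhost:8000/ws/hologram") as ws:
            await ws.send(json.dumps({'type': 'gesture', 'gesture': gesture}))
            reply = json.loads(await ws.recv())
            print(reply['action'], '-', reply['text'])

    asyncio.run(send_gesture('wave'))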

Hologram Rendering

    import pygame
    from OpenGL.GL import *

    class HologramRenderer:
        def __init__(self):
            pygame.init()
            self.screen = pygame.display.set_mode(
                (1920, 1080), pygame.OPENGL | pygame.DOUBLEBUF
            )

            # Initialize OpenGL
            glEnable(GL_DEPTH_TEST)
            glEnable(GL_BLEND)
            glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)

            # Load 3D model (asset loader not shown here)
            self.model = load_3d_model('hologram_character.obj')
            self.animation_state = 'idle'

        def render(self, action: str, text: str = None):
            glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

            # Update animation based on action
            if action == 'wave_back':
                self.animate_wave()
            elif action == 'speak':
                self.animate_speak(text)
            elif action == 'greet':
                self.animate_greet()

            # Render 3D model
            self.render_model()

            # Render text overlay if needed
            if text:
                self.render_text(text)

            pygame.display.flip()

        def animate_wave(self):
            # Wave animation logic
            pass

        def animate_speak(self, text: str):
            # Speaking animation with lip sync
            pass
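
As a rough illustration of how the renderer might be driven, the loop below polls for window events and re-renders whenever a new response arrives. `get_next_event()` is a hypothetical placeholder for whatever delivers gesture and speech responses, not part of the project code:

    # Hypothetical driver loop: get_next_event() stands in for the real event source
    renderer = HologramRenderer()
    running = True

    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False

        hologram_event = get_next_event()  # e.g. the latest WebSocket response
        if hologram_event:
            renderer.render(hologram_event['action'], hologram_event.get('text'))
        else:
            renderer.render('idle')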

Deployment

Docker Setup

    FROM python:3.9-slim

    WORKDIR /app

    # Install system dependencies
    RUN apt-get update && apt-get install -y \
        libgl1-mesa-glx \
        libglib2.0-0 \
        && rm -rf /var/lib/apt/lists/*

    # Install Python dependencies
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    # Copy application
    COPY . .

    # Expose port
    EXPOSE 8000

    # Run application
    CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Results

Exhibition Impact

  • Visitors: 5,000+ visitors interacted with the hologram

  • Engagement Time: Average 3-5 minutes per visitor

  • Social Media: 10,000+ shares and mentions

  • Award: Best Booth at ITEX 2024

Technical Metrics

  • Response Time: < 200ms for gesture recognition

  • AI Inference: < 500ms for text generation

  • Uptime: 99.9% during exhibition

  • Concurrent Users: Handled 10+ simultaneous interactions

Lessons Learned

  • User experience is everything - Technical excellence means nothing if users don't enjoy the experience

  • Real-time is challenging - Low latency requires careful optimization

  • Robustness matters - Exhibition environments are unpredictable

  • Visual impact counts - The hologram's visual appeal was a major factor in winning

  • Team collaboration - Success required close coordination between hardware, software, and design teams

Conclusion

Building the AI hologram installation was one of the most exciting projects of my career. It combined cutting-edge AI, computer vision, and interactive design to create an experience that truly stood out. Winning Best Booth at ITEX 2024 was a testament to the technical excellence and innovative thinking that went into the project.

The project demonstrated that when you combine technical innovation with user-centric design, you can create experiences that are both impressive and memorable.

---

Interested in AI, computer vision, or interactive installations? Let's connect!