
πŸ₯ SavyOR - AI First-Aid Assistant

A comprehensive AI-powered first-aid assistance platform featuring real-time CPR guidance with computer vision, multimodal AI interaction, and emergency response capabilities.

🌟 Features

Backend (FastAPI)

  • Real-time Video Analysis: Computer vision with MediaPipe for pose detection and CPR monitoring
  • Speech Recognition: Faster Whisper integration for voice commands and questions
  • Text-to-Speech: Real-time audio guidance and feedback
  • RAG System: Medical knowledge retrieval from official first-aid documents
  • BPM Monitoring: Accurate CPR compression rate calculation and quality assessment
  • WebSocket Support: Real-time bidirectional communication
  • Multimodal AI: Integration with LLaVA for visual analysis and LLaMA for text generation

Frontend (React + TypeScript)

  • Interactive Dashboard: Real-time monitoring and control interface
  • Video Stream: Live camera feed with pose visualization
  • AI Guidance Panel: Dynamic instruction display with emergency level indicators
  • BPM Monitor: Visual CPR rate tracking with quality metrics
  • Mode Switching: Vision-only, conversation-only, and multimodal modes
  • Mobile Ready: Responsive design with Capacitor for mobile deployment

Key Capabilities

  • CPR Guidance: Real-time visual feedback on compression technique and rate
  • Emergency Classification: AI-powered scene analysis and emergency level assessment
  • Voice Interaction: Natural language queries and responses for emergency scenarios
  • Knowledge Base: Searchable medical database from official first-aid manuals
  • Session Management: Complete CPR session tracking and analytics

πŸš€ Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • Docker & Docker Compose (optional)

Option 1: Docker Compose (Recommended)

  1. Clone the repository:

    git clone https://github.com/SynapsysESPRIT/SavyOR.git
    cd SavyOR
  2. Configure environment:

    cp .env.example .env
    # Edit .env with your API credentials
  3. Start all services:

    docker-compose up -d
  4. Access the application at the host ports mapped in docker-compose.yml (the backend's interactive API docs are served at http://localhost:8000/docs).

Option 2: Manual Setup

Backend Setup

  1. Navigate to backend directory:

    cd backend
  2. Create virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Configure environment:

    cp .env.example .env
    # Edit .env with your API credentials
  5. Add knowledge base documents:

    mkdir knowledge_base
    # Copy your PDF first-aid manuals to knowledge_base/
  6. Start the server:

    python start_server.py
    # Or: uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Frontend Setup

  1. Navigate to frontend directory:

    cd frontend
  2. Install dependencies:

    npm install
    # Or: bun install
  3. Start development server:

    npm run dev
    # Or: bun dev
  4. Access the application at http://localhost:5173 (the Vite dev server's default port, also listed in the backend's CORS_ORIGINS).

πŸ“– API Documentation

WebSocket Endpoints

Main WebSocket: ws://localhost:8000/ws/assistant/{session_id}

Supported Message Types:

  1. Frame Analysis:

    {
      "type": "frame_analysis",
      "data": {
        "image_data": "base64_encoded_image",
        "current_bpm": 105
      }
    }
  2. Speech Input:

    {
      "type": "speech_input",
      "data": {
        "audio_data": "base64_encoded_audio",
        "language": "en"
      }
    }
  3. Text Query:

    {
      "type": "text_input",
      "data": {
        "text": "How do I check for a pulse?"
      }
    }
  4. Mode Change:

    {
      "type": "mode_change",
      "data": {
        "mode": "multimodal"
      }
    }
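Every message type above shares the same `{type, data}` envelope. A minimal sketch of building these payloads on the client side (the `make_message` and `frame_message` helper names are illustrative, not part of the API):

```python
import base64
import json

def make_message(msg_type: str, data: dict) -> str:
    """Wrap a payload in the {type, data} envelope the assistant expects."""
    return json.dumps({"type": msg_type, "data": data})

def frame_message(jpeg_bytes: bytes, current_bpm: int) -> str:
    """Build a frame_analysis message from raw JPEG bytes."""
    return make_message("frame_analysis", {
        "image_data": base64.b64encode(jpeg_bytes).decode("ascii"),
        "current_bpm": current_bpm,
    })

# Sending it over the main WebSocket (e.g. with the `websockets` package)
# would look roughly like:
#   async with websockets.connect("ws://localhost:8000/ws/assistant/demo") as ws:
#       await ws.send(frame_message(frame, 105))
#       reply = await ws.recv()
```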

REST API Endpoints

  • GET / - Root endpoint with API information
  • GET /api/v1/health - Health check
  • POST /api/v1/analyze-frame - Analyze video frame
  • POST /api/v1/speech-to-text - Convert speech to text
  • POST /api/v1/text-to-speech - Convert text to speech
  • POST /api/v1/rag-query - Query knowledge base
  • POST /api/v1/bpm-update - Update BPM calculation
  • GET /api/v1/bpm-status - Get BPM status
  • POST /api/v1/upload-knowledge - Upload PDF documents

Full API documentation is available at: http://localhost:8000/docs
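The POST endpoints can be exercised with nothing but the standard library. A sketch of preparing a JSON request (the endpoint path comes from the list above; the `build_request` helper and the `query` field name are assumptions, not confirmed parts of the API schema):

```python
import json
import urllib.request

BASE = "http://localhost:8000"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for one of the /api/v1 endpoints."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("/api/v1/rag-query", {"query": "How do I treat a burn?"})

# With the backend running, the request would be sent like this:
#   with urllib.request.urlopen(req) as resp:
#       answer = json.loads(resp.read())
```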

🎯 Usage Modes

1. Multimodal Mode (Recommended)

  • Vision: Real-time video analysis for CPR technique assessment
  • Audio: Voice commands and natural language queries
  • Guidance: Combined visual and conversational AI responses
  • Use Case: Complete AI-assisted emergency response

2. Vision-Only Mode

  • Video Analysis: LLaVA-powered scene understanding
  • CPR Monitoring: Real-time technique feedback
  • Audio Output: Spoken guidance based on visual analysis
  • Use Case: Silent environments or hearing-impaired users

3. Conversation-Only Mode

  • Voice Interaction: Speech recognition and natural responses
  • Knowledge Queries: RAG-based medical information retrieval
  • Audio Guidance: Spoken instructions and advice
  • Use Case: Hands-free operation or visually impaired users

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend      β”‚    β”‚    Backend      β”‚    β”‚   AI Services   β”‚
β”‚                 β”‚    β”‚                 β”‚    β”‚                 β”‚
β”‚ β€’ React App     │◄──►│ β€’ FastAPI       │◄──►│ β€’ LLaVA Vision  β”‚
β”‚ β€’ WebSocket     β”‚    β”‚ β€’ WebSocket     β”‚    β”‚ β€’ LLaMA Text    β”‚
β”‚ β€’ Camera Feed   β”‚    β”‚ β€’ MediaPipe     β”‚    β”‚ β€’ Whisper STT   β”‚
β”‚ β€’ Audio I/O     β”‚    β”‚ β€’ RAG System    β”‚    β”‚ β€’ Knowledge DB  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Core Components

  1. Vision Service: MediaPipe pose detection + LLaVA analysis
  2. Audio Service: Faster Whisper STT + IndexTTS2 TTS (voice cloning)
  3. BPM Service: Real-time CPR rate calculation and quality assessment
  4. RAG Service: FAISS vectorstore + medical document retrieval
  5. WebSocket Manager: Real-time communication hub
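The BPM Service's core calculation can be illustrated with a small sketch: each detected compression is timestamped, and the rate is 60 divided by the mean interval over a sliding window, then compared against the 100–120 BPM target band. This is an illustration of the approach, not the project's actual implementation:

```python
from collections import deque

class BPMTracker:
    """Estimate CPR compression rate from compression timestamps."""

    def __init__(self, window: int = 10, bpm_min: int = 100, bpm_max: int = 120):
        self.times = deque(maxlen=window)  # recent compression timestamps (seconds)
        self.bpm_min, self.bpm_max = bpm_min, bpm_max

    def add_compression(self, t: float) -> None:
        self.times.append(t)

    def bpm(self) -> float:
        """60 / mean interval between the buffered compressions."""
        if len(self.times) < 2:
            return 0.0
        span = self.times[-1] - self.times[0]
        return 60.0 * (len(self.times) - 1) / span

    def quality(self) -> str:
        """Compare the current rate against the target band."""
        rate = self.bpm()
        if rate < self.bpm_min:
            return "too slow"
        if rate > self.bpm_max:
            return "too fast"
        return "good"
```

For example, compressions arriving every 0.5 s yield 120 BPM, the top of the target band.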

IndexTTS2 Setup

IndexTTS2 provides high-quality voice cloning TTS. To set it up:

# The repository is cloned at backend/first_aid_assistant/ai_components/indextts_repo/
# Download models (run from the indextts_repo directory):
cd backend/first_aid_assistant/ai_components/indextts_repo
huggingface-cli download IndexTeam/IndexTTS-2 --local-dir checkpoints

# Optional: Provide a custom reference voice for voice cloning
# Copy your voice file to: backend/first_aid_assistant/ai_components/reference_voice.wav

πŸ“± Mobile Deployment

The frontend includes Capacitor configuration for mobile deployment:

cd frontend
npm run build
npx cap add android
npx cap add ios
npx cap sync
npx cap run android

πŸ”§ Configuration

Backend Configuration (backend/.env)

# AI API Configuration
API_KEY=your_openai_api_key
BASE_URL=https://your-api-url.com/v1
LLAMA_MODEL=hosted_vllm/Llama-3.1-70B-Instruct
LLAVA_MODEL=hosted_vllm/llava-1.5-7b-hf

# Server Settings
DEBUG=True
HOST=0.0.0.0
PORT=8000
CORS_ORIGINS=["http://localhost:3000", "http://localhost:5173"]

# Audio/Video Settings
SAMPLE_RATE=16000
TARGET_BPM_MIN=100
TARGET_BPM_MAX=120

Frontend Configuration

  • API URL configuration in environment files
  • WebSocket connection settings
  • Camera and audio permissions

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ₯ Medical Disclaimer

This AI assistant is designed to provide guidance based on official first-aid manuals but should not replace professional medical training or emergency services. Always call emergency services (911/112) for life-threatening situations.

πŸ“ž Support

For support and questions, please open an issue on the repository.

πŸ™ Acknowledgments

  • First-Aid manuals and guidelines from official medical organizations
  • OpenAI for AI model APIs
  • MediaPipe for pose detection
  • FastAPI and React communities

Built with ❀️ by the ESPRIT Synapsys team for emergency response and life-saving applications.
