Content Recommendation Classifier API

A production-ready content recommendation classifier API built with FastAPI and DSPy (Declarative Self-improving Language Programs). The system uses Azure OpenAI to intelligently classify content summaries and determine if they should be recommended to users, supporting multiple project-specific fine-tuned models.

🚀 Features

  • Multi-Project Model Support: Load and serve multiple project-specific DSPy models simultaneously
  • Azure OpenAI Integration: Leverages Azure OpenAI for intelligent content classification
  • Chain-of-Thought Reasoning: Uses DSPy's ChainOfThought module for explainable predictions
  • FastAPI Framework: Modern, high-performance API with automatic OpenAPI documentation
  • Comprehensive Testing: 95% test coverage requirement with extensive test suite
  • Type Safety: Full mypy type checking with strict mode enabled
  • Code Quality: Automated formatting (Black), linting (Ruff), and import sorting (isort)

📋 Table of Contents

  • ⚡ Quick Start
  • 📚 API Documentation
  • 📁 Project Structure
  • 🛠️ Development
  • 🧪 Testing
  • 🏗️ Architecture
  • ⚙️ Configuration
  • 🐳 Docker
  • 🚢 Deployment
  • 📝 Common Gotchas

⚡ Quick Start

Prerequisites

  • Python 3.11 (specified in pyproject.toml)
  • Poetry for dependency management
  • Azure OpenAI API key and endpoint access

Installation

  1. Clone the repository

    git clone https://github.com/kosinal/aim-classification-llm
    cd aim-classification-llm
  2. Install dependencies

    make install
    # OR: poetry install
  3. Configure environment variables

    cp .env.example .env
    # Edit .env and add your Azure OpenAI credentials
  4. Add trained models

    • Place DSPy model files in src/aim/model_definitions/
    • Follow naming pattern: flag_classifier_project_project_{n}.json where {n} is the project ID
    • Models are auto-discovered and loaded at startup

Running the Application

Development mode (with auto-reload):

make dev
# OR: poetry run uvicorn aim.main:app --host 0.0.0.0 --port 8000 --reload

Production mode:

make run
# OR: poetry run uvicorn aim.main:app --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000

📚 API Documentation

Once running, interactive API documentation is available at the FastAPI defaults: http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc).

Available Endpoints

GET /health

Health check endpoint that returns application status and loaded project models.

Response:

{
  "status": "healthy",
  "models_loaded": true,
  "model_count": 3,
  "project_ids": ["2", "3", "4"]
}
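
For example, a quick check with Python (assuming the service runs locally on the default port; any HTTP client works):

import requests

# Query the health endpoint of a locally running instance
response = requests.get("http://localhost:8000/health", timeout=5)
response.raise_for_status()

health = response.json()
print(health["status"], health["project_ids"])  # e.g. healthy ['2', '3', '4']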

POST /api/project/{project_id}/assess

Classify content summary for a specific project.

Parameters:

  • project_id (path, integer): The project ID for which to use the trained model

Request Body:

{
  "summary": "Your content summary text here"
}

Response:

{
  "recommend": true,
  "recommendation_score": 0.85,
  "reasoning": "The content is highly relevant because...",
  "project_id": "1"
}

Status Codes:

  • 200: Success
  • 404: No model found for the requested project_id
  • 503: Models not loaded yet (startup in progress)
  • 500: Processing error during classification
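
A minimal client call in Python (the summary text below is made up for illustration; assumes a model is loaded for project 2):

import requests

payload = {"summary": "Quarterly report on renewable energy adoption trends."}  # hypothetical content
response = requests.post(
    "http://localhost:8000/api/project/2/assess",
    json=payload,
    timeout=30,
)

if response.status_code == 200:
    result = response.json()
    print(result["recommend"], result["recommendation_score"])
    print(result["reasoning"])
elif response.status_code == 404:
    print("No model is loaded for this project_id")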

πŸ“ Project Structure

aim-assignment/
├── src/aim/                          # Main application package
│   ├── __init__.py
│   ├── main.py                       # FastAPI app, lifespan management, DSPy config
│   ├── config.py                     # Azure OpenAI configuration
│   ├── models.py                     # DSPy model definitions (FlagAssessor, FlagClassifier)
│   ├── routes.py                     # API endpoints
│   ├── schemas.py                    # Pydantic request/response models
│   └── model_definitions/            # Serialized DSPy models (*.json)
│       ├── __init__.py
│       └── flag_classifier_project_project_{n}.json
├── tests/                            # Test suite (mirrors src structure)
│   ├── __init__.py
│   ├── test_main.py                  # Application lifespan and startup tests
│   ├── test_routes.py                # API endpoint tests
│   ├── test_models.py                # DSPy model tests
│   ├── test_schemas.py               # Pydantic schema tests
│   └── test_config.py                # Configuration tests
├── _notebooks/                       # Jupyter notebooks (paired with .py files)
│   ├── 00_EDA.ipynb/py               # Exploratory Data Analysis
│   ├── 01a_LLM_classifier.ipynb/py   # Single model training approach
│   └── 01b_LLM_separate.ipynb/py     # Multi-model training (current approach)
├── _data/                            # Training/evaluation datasets (gitignored)
├── .env.example                      # Environment variable template
├── .env                              # Environment variables (gitignored)
├── .env.test                         # Test environment variables (gitignored)
├── pyproject.toml                    # Poetry dependencies and tool config
├── poetry.lock                       # Locked dependency versions
├── Makefile                          # Development commands
├── CLAUDE.md                         # AI assistant guidance
└── README.md                         # This file

πŸ› οΈ Development

Makefile Commands

make help          # Show available commands
make install       # Install dependencies with Poetry
make dev           # Run with auto-reload (development)
make run           # Run in production mode
make test          # Run all tests
make test-verbose  # Run tests with verbose output
make lint          # Run all linters and formatters
make clean         # Remove cache and temporary files

# Docker commands
make docker-build  # Build Docker image
make docker-run    # Run application in Docker using docker-compose
make docker-stop   # Stop Docker containers
make docker-logs   # View Docker container logs
make docker-clean  # Remove Docker containers and images
make docker-shell  # Open shell in running container

Code Quality Tools

The project enforces high code quality standards:

# Run all quality tools at once
make lint

# Or run individually:
poetry run black .                                              # Code formatter
poetry run isort .                                              # Import sorter
poetry run ruff check --fix                                     # Linter with auto-fix
poetry run mypy --namespace-packages --explicit-package-bases src  # Type checker

Quality Standards:

  • Line length: 100 characters (Black, isort)
  • Type checking: Enabled with mypy strict mode
  • Test coverage: Minimum 95% (enforced by pytest-cov)

Jupyter Notebooks

Notebooks are located in _notebooks/ and use Jupytext for version control:

  • Notebooks are paired with Python scripts (.py files) using percent format
  • Editing .ipynb auto-syncs to .py and vice versa
  • Never edit both files - choose one and let Jupytext sync automatically
  • Format: ipynb,py:percent (configured in pyproject.toml)

Available Notebooks:

  • 00_EDA: Exploratory Data Analysis
  • 01a_LLM_classifier: Single model training approach
  • 01b_LLM_separate: Multi-model training (current production approach)

🧪 Testing

Running Tests

# Run all tests
make test
# OR: poetry run pytest

# Verbose output
make test-verbose
# OR: poetry run pytest -v

# Specific test file
poetry run pytest tests/test_routes.py

# Specific test function
poetry run pytest tests/test_routes.py::test_assess_content_success

# With coverage report
poetry run pytest --cov=aim --cov-report=html

Test Environment

  • Uses .env.test for test configuration (configured in pyproject.toml)
  • Test files mirror src/ structure with test_{module}.py naming
  • Coverage threshold: 95% minimum (fails below this)
  • Comprehensive mocking of DSPy models to avoid Azure API calls during tests

Testing Approaches

  1. FastAPI TestClient: Integration tests for API endpoints
  2. Mocking DSPy: Avoids external Azure API calls
  3. Parameterized Tests: Multiple scenarios via @pytest.mark.parametrize
  4. Async Tests: Uses pytest-asyncio for async endpoint testing
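
A simplified illustration of the TestClient-plus-mocking approach (the mock object and summary text are made up for illustration; the real tests in tests/test_routes.py may patch things differently):

from unittest.mock import MagicMock

from fastapi.testclient import TestClient

from aim.main import app

def test_assess_with_mocked_model():
    # Inject a mocked classifier so no Azure OpenAI call is made
    mock_model = MagicMock()
    mock_model.return_value = MagicMock(
        prediction="positive", prediction_score=0.9, reasoning="Relevant content."
    )
    app.state.models = {"2": mock_model}

    client = TestClient(app)  # no context manager, so the startup lifespan is not triggered
    response = client.post("/api/project/2/assess", json={"summary": "Example text"})

    assert response.status_code == 200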

πŸ—οΈ Architecture

Multi-Model Loading Pattern

Critical Design: The application loads multiple project-specific models at startup:

  1. At startup, main.py scans src/aim/model_definitions/ for files matching pattern: flag_classifier_project_project_{n}.json
  2. Extracts project_id from filename using regex
  3. Loads each model into app.state.models dict with project_id as key
  4. API routes access models via: request.app.state.models[project_id_str]
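
The discovery logic is roughly equivalent to this sketch (simplified; the actual implementation in main.py may differ in details):

import re
from pathlib import Path

from aim.models import FlagClassifier

def load_project_models(model_dir: Path) -> dict[str, FlagClassifier]:
    """Scan model_dir and return a mapping of project_id -> loaded DSPy classifier."""
    models: dict[str, FlagClassifier] = {}
    pattern = re.compile(r"flag_classifier_project_project_(\d+)\.json")
    for path in model_dir.glob("*.json"):
        match = pattern.fullmatch(path.name)
        if match is None:
            continue
        project_id = match.group(1)      # the {n} part of the filename, kept as a string
        classifier = FlagClassifier()
        classifier.load(str(path))       # dspy.Module.load restores the serialized program state
        models[project_id] = classifier
    return models

# In lifespan(): app.state.models = load_project_models(Path("src/aim/model_definitions"))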

Adding New Project Models:

  1. Place model file in src/aim/model_definitions/ following naming pattern
  2. Restart the application - models are auto-discovered
  3. Verify via /health endpoint which shows loaded project IDs

DSPy Integration

Configuration (happens once at startup in lifespan()):

  • Uses Azure OpenAI endpoint (not standard OpenAI)
  • Configures global DSPy settings with dspy.configure() and dspy.settings.configure()
  • Model path format: azure/{model_name} (not just model name)
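
A minimal sketch of that configuration (environment variable names follow the .env template in the Configuration section; the exact call in main.py may differ):

import os

import dspy

# Configure the global DSPy language model once at startup
lm = dspy.LM(
    f"azure/{os.environ['AZURE_MODEL_NAME']}",  # note the azure/ prefix
    api_key=os.environ["AIM_OPENAI_KEY"],
    api_base=os.environ["AZURE_ENDPOINT"],
    api_version=os.environ["AZURE_API_VERSION"],
)
dspy.configure(lm=lm)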

Model Architecture:

  • FlagAssessor (Signature): Defines input/output schema with field descriptions
  • FlagClassifier (Module): Uses ChainOfThought reasoning for explainable predictions
  • Model outputs:
    • reasoning (string): Explanation of classification decision
    • prediction_score (float 0-1): Confidence score
    • prediction (string): "positive" or "negative"
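
A sketch of what such a Signature/Module pair typically looks like in DSPy (field descriptions abbreviated; the authoritative definitions live in src/aim/models.py):

import dspy

class FlagAssessor(dspy.Signature):
    """Decide whether a content summary should be recommended."""

    summary: str = dspy.InputField(desc="Content summary to classify")
    prediction_score: float = dspy.OutputField(desc="Confidence score between 0 and 1")
    prediction: str = dspy.OutputField(desc="Either 'positive' or 'negative'")

class FlagClassifier(dspy.Module):
    def __init__(self) -> None:
        super().__init__()
        # ChainOfThought adds a `reasoning` output field on top of the signature
        self.assess = dspy.ChainOfThought(FlagAssessor)

    def forward(self, summary: str) -> dspy.Prediction:
        return self.assess(summary=summary)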

Request Flow

Client Request
    ↓
POST /api/project/{project_id}/assess
    ↓
Validate project_id exists in loaded models
    ↓
Load project-specific FlagClassifier model
    ↓
Process summary with DSPy ChainOfThought
    ↓
Azure OpenAI API call (via DSPy)
    ↓
Parse prediction, score, reasoning
    ↓
Return AssessmentResponse JSON

βš™οΈ Configuration

Environment Variables

Create a .env file from .env.example:

# Azure OpenAI Configuration (required)
AIM_OPENAI_KEY=your-azure-openai-api-key-here
AZURE_ENDPOINT=https://aim-australia-east.openai.azure.com/
AZURE_MODEL_NAME=gpt-5-mini-hiring
AZURE_API_VERSION=2025-03-01-preview

Important:

  • Development uses .env
  • Tests use .env.test (automatically loaded by pytest-dotenv)
  • Never commit .env files to version control

Configuration Files

  • pyproject.toml: Dependencies, tool configuration, and project metadata
  • poetry.lock: Locked dependency versions for reproducibility
  • .flake8: Flake8 linter configuration (legacy, mostly replaced by Ruff)

🐳 Docker

Running with Docker

The application includes Docker support for containerized deployment.

Quick Start with Docker

  1. Build the Docker image

    make docker-build
    # OR: docker build -t aim-classifier-api:latest .
  2. Run with docker-compose (recommended)

    make docker-run
    # OR: docker-compose up -d
  3. View logs

    make docker-logs
    # OR: docker-compose logs -f
  4. Stop containers

    make docker-stop
    # OR: docker-compose down

Docker Commands Reference

make docker-build   # Build Docker image
make docker-run     # Run application in Docker using docker-compose
make docker-stop    # Stop Docker containers
make docker-logs    # View Docker container logs
make docker-clean   # Remove Docker containers and images
make docker-shell   # Open shell in running container

Docker Environment Setup

When running with Docker, ensure you have:

  • A .env file with Azure OpenAI credentials

The application will be available at http://localhost:8000 after starting the container.

🚢 Deployment

Production Considerations

  1. Environment Variables: Set all required Azure OpenAI credentials
  2. Model Files: Ensure all project model files are present in src/aim/model_definitions/
  3. Dependencies: Install with poetry install --no-dev for production
  4. ASGI Server: Use production ASGI server like Gunicorn with Uvicorn workers:
    gunicorn aim.main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
  5. Health Checks: Use /health endpoint for container health probes
  6. Monitoring: Monitor model loading status and Azure OpenAI API latency

πŸ“ Common Gotchas

  1. Model Loading: Models load at startup, not on-demand. Add new models → restart the app.
  2. Project ID Type: URL path uses int, model dict keys are str. Always convert: project_id_str = str(project_id)
  3. LLM Output Parsing: prediction_score may come back from the LLM as a string, so parse it defensively with try/except (see the sketch after this list)
  4. Environment Files: Dev uses .env, tests use .env.test - they're separate
  5. DSPy Global State: DSPy configured once at startup - never reconfigure in routes
  6. Coverage Threshold: 95% minimum - comprehensive tests required for new code
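
A defensive parse along the lines of gotcha 3 (illustrative only; the actual handling lives in the application code):

def parse_score(raw_score: object) -> float:
    """Coerce the LLM-provided prediction_score into a float, falling back to 0.0."""
    try:
        return float(str(raw_score).strip())  # handles both 0.85 and "0.85"
    except (TypeError, ValueError):
        return 0.0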

About

Uses an LLM as a judge for classification.
