A production-ready content recommendation classifier API built with FastAPI and DSPy (Declarative Self-improving Language Programs). The system uses Azure OpenAI to intelligently classify content summaries and determine if they should be recommended to users, supporting multiple project-specific fine-tuned models.
- Multi-Project Model Support: Load and serve multiple project-specific DSPy models simultaneously
- Azure OpenAI Integration: Leverages Azure OpenAI for intelligent content classification
- Chain-of-Thought Reasoning: Uses DSPy's ChainOfThought module for explainable predictions
- FastAPI Framework: Modern, high-performance API with automatic OpenAPI documentation
- Comprehensive Testing: 95% test coverage requirement with an extensive test suite
- Type Safety: Full mypy type checking with strict mode enabled
- Code Quality: Automated formatting (Black), linting (Ruff), and import sorting (isort)
- Quick Start
- API Documentation
- Project Structure
- Development
- Testing
- Code Quality
- Architecture
- Configuration
- Docker
- Deployment
- Python 3.11 (specified in pyproject.toml)
- Poetry for dependency management
- Azure OpenAI API key and endpoint access
1. Clone the repository:
   git clone https://github.com/kosinal/aim-classification-llm
   cd aim-classification-llm
2. Install dependencies:
   make install   # OR: poetry install
3. Configure environment variables:
   cp .env.example .env
   # Edit .env and add your Azure OpenAI credentials
4. Add trained models:
   - Place DSPy model files in src/aim/model_definitions/
   - Follow the naming pattern flag_classifier_project_project_{n}.json, where {n} is the project ID
   - Models are auto-discovered and loaded at startup
Development mode (with auto-reload):
make dev
# OR: poetry run uvicorn aim.main:app --host 0.0.0.0 --port 8000 --reload

Production mode:
make run
# OR: poetry run uvicorn aim.main:app --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000.
Once running, interactive API documentation is available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
GET /health

Health check endpoint that returns the application status and the loaded project models.
Response:
{
  "status": "healthy",
  "models_loaded": true,
  "model_count": 3,
  "project_ids": ["2", "3", "4"]
}

POST /api/project/{project_id}/assess

Classify a content summary for a specific project.
Parameters:
project_id (path, integer): The project ID for which to use the trained model
Request Body:
{
"summary": "Your content summary text here"
}

Response:
{
"recommend": true,
"recommendation_score": 0.85,
"reasoning": "The content is highly relevant because...",
"project_id": "1"
}

Status Codes:
- 200: Success
- 404: No model found for the requested project_id
- 503: Models not loaded yet (startup in progress)
- 500: Processing error during classification
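For example, both endpoints can be exercised from Python (a sketch assuming the requests package, which is not part of this project's dependencies):

```python
import requests

BASE = "http://localhost:8000"

# Check that the service is up and see which project models were loaded.
health = requests.get(f"{BASE}/health").json()
print(health["project_ids"])  # e.g. ["2", "3", "4"]

# Ask project 2's model whether a summary should be recommended.
resp = requests.post(
    f"{BASE}/api/project/2/assess",
    json={"summary": "A tutorial on deploying FastAPI services to Azure."},
)
resp.raise_for_status()
body = resp.json()
print(body["recommend"], body["recommendation_score"], body["reasoning"])
```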
aim-assignment/
├── src/aim/                          # Main application package
│   ├── __init__.py
│   ├── main.py                       # FastAPI app, lifespan management, DSPy config
│   ├── config.py                     # Azure OpenAI configuration
│   ├── models.py                     # DSPy model definitions (FlagAssessor, FlagClassifier)
│   ├── routes.py                     # API endpoints
│   ├── schemas.py                    # Pydantic request/response models
│   └── model_definitions/            # Serialized DSPy models (*.json)
│       ├── __init__.py
│       └── flag_classifier_project_project_{n}.json
├── tests/                            # Test suite (mirrors src structure)
│   ├── __init__.py
│   ├── test_main.py                  # Application lifespan and startup tests
│   ├── test_routes.py                # API endpoint tests
│   ├── test_models.py                # DSPy model tests
│   ├── test_schemas.py               # Pydantic schema tests
│   └── test_config.py                # Configuration tests
├── _notebooks/                       # Jupyter notebooks (paired with .py files)
│   ├── 00_EDA.ipynb/py               # Exploratory Data Analysis
│   ├── 01a_LLM_classifier.ipynb/py   # Single model training approach
│   └── 01b_LLM_separate.ipynb/py     # Multi-model training (current approach)
├── _data/                            # Training/evaluation datasets (gitignored)
├── .env.example                      # Environment variable template
├── .env                              # Environment variables (gitignored)
├── .env.test                         # Test environment variables (gitignored)
├── pyproject.toml                    # Poetry dependencies and tool config
├── poetry.lock                       # Locked dependency versions
├── Makefile                          # Development commands
├── CLAUDE.md                         # AI assistant guidance
└── README.md                         # This file
make help # Show available commands
make install # Install dependencies with Poetry
make dev # Run with auto-reload (development)
make run # Run in production mode
make test # Run all tests
make test-verbose # Run tests with verbose output
make lint # Run all linters and formatters
make clean # Remove cache and temporary files
# Docker commands
make docker-build # Build Docker image
make docker-run # Run application in Docker using docker-compose
make docker-stop # Stop Docker containers
make docker-logs # View Docker container logs
make docker-clean # Remove Docker containers and images
make docker-shell   # Open shell in running container

The project enforces high code quality standards:
# Run all quality tools at once
make lint
# Or run individually:
poetry run black . # Code formatter
poetry run isort . # Import sorter
poetry run ruff check --fix # Linter with auto-fix
poetry run mypy --namespace-packages --explicit-package-bases src   # Type checker

Quality Standards:
- Line length: 100 characters (Black, isort)
- Type checking: Enabled with mypy strict mode
- Test coverage: Minimum 95% (enforced by pytest-cov)
Notebooks are located in _notebooks/ and use Jupytext for version control:
- Notebooks are paired with Python scripts (.py files) using the percent format
- Editing the .ipynb auto-syncs to the .py file, and vice versa
- Never edit both files; pick one and let Jupytext sync automatically
- Format: ipynb,py:percent (configured in pyproject.toml)
Available Notebooks:
- 00_EDA: Exploratory Data Analysis
- 01a_LLM_classifier: Single model training approach
- 01b_LLM_separate: Multi-model training (current production approach)
# Run all tests
make test
# OR: poetry run pytest
# Verbose output
make test-verbose
# OR: poetry run pytest -v
# Specific test file
poetry run pytest tests/test_routes.py
# Specific test function
poetry run pytest tests/test_routes.py::test_assess_content_success
# With coverage report
poetry run pytest --cov=aim --cov-report=html

- Uses .env.test for test configuration (configured in pyproject.toml)
- Test files mirror the src/ structure with test_{module}.py naming
- Coverage threshold: 95% minimum (the run fails below this)
- Comprehensive mocking of DSPy models to avoid Azure API calls during tests
- FastAPI TestClient: Integration tests for API endpoints
- Mocking DSPy: Avoids external Azure API calls (illustrated below)
- Parameterized Tests: Multiple scenarios via @pytest.mark.parametrize
- Async Tests: Uses pytest-asyncio for async endpoint testing
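As an illustration of the mocking pattern, a test can stub the loaded model before calling the endpoint. This is a hypothetical sketch: the exact attributes the route reads from the DSPy prediction, and the mapping of "positive" to recommend: true, depend on routes.py.

```python
from unittest.mock import MagicMock

from fastapi.testclient import TestClient

from aim.main import app


def test_assess_uses_mocked_model():
    # Stub one loaded DSPy model so the test never calls Azure OpenAI.
    prediction = MagicMock(
        reasoning="Relevant to the project's audience.",
        prediction_score=0.9,
        prediction="positive",
    )
    app.state.models = {"1": MagicMock(return_value=prediction)}

    client = TestClient(app)  # lifespan does not run outside a context manager
    resp = client.post("/api/project/1/assess", json={"summary": "Example text"})

    assert resp.status_code == 200
    assert resp.json()["recommend"] is True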
Critical Design: The application loads multiple project-specific models at startup:
- At startup, main.py scans src/aim/model_definitions/ for files matching the pattern flag_classifier_project_project_{n}.json (sketched below)
- Extracts the project_id from the filename using a regex
- Loads each model into the app.state.models dict with project_id as the key
- API routes access models via request.app.state.models[project_id_str]
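A minimal sketch of that discovery logic, assuming dspy.Module.load restores the serialized state (the real implementation in main.py may differ in its details):

```python
import re
from pathlib import Path

import dspy

from aim.models import FlagClassifier

MODEL_DIR = Path(__file__).parent / "model_definitions"
PATTERN = re.compile(r"flag_classifier_project_project_(\d+)\.json")


def load_models() -> dict[str, dspy.Module]:
    """Scan the model directory and load one FlagClassifier per project."""
    models: dict[str, dspy.Module] = {}
    for path in sorted(MODEL_DIR.glob("*.json")):
        match = PATTERN.fullmatch(path.name)
        if not match:
            continue  # ignore files that do not follow the naming pattern
        model = FlagClassifier()
        model.load(str(path))  # dspy.Module.load restores the saved program state
        models[match.group(1)] = model  # keys are strings, e.g. "2"
    return models
```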
Adding New Project Models:
- Place the model file in src/aim/model_definitions/ following the naming pattern
- Restart the application; models are auto-discovered
- Verify via the /health endpoint, which shows the loaded project IDs
Configuration (happens once at startup in lifespan()):
- Uses Azure OpenAI endpoint (not standard OpenAI)
- Configures global DSPy settings with dspy.configure() and dspy.settings.configure()
- Model path format: azure/{model_name}, not just the model name (see the sketch below)
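Concretely, the startup wiring might look like the sketch below. This is illustrative: the real code reads these values via config.py rather than os.environ, and dspy.LM routes the azure/ prefix through LiteLLM to the Azure endpoint.

```python
import os

import dspy

# Illustrative sketch: the real application reads these values via config.py.
lm = dspy.LM(
    f"azure/{os.environ['AZURE_MODEL_NAME']}",  # note the required azure/ prefix
    api_key=os.environ["AIM_OPENAI_KEY"],
    api_base=os.environ["AZURE_ENDPOINT"],
    api_version=os.environ["AZURE_API_VERSION"],
)
dspy.configure(lm=lm)  # global DSPy state, set once at startup
```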
Model Architecture:
- FlagAssessor (Signature): Defines the input/output schema with field descriptions
- FlagClassifier (Module): Uses ChainOfThought reasoning for explainable predictions (sketched below)
- Model outputs:
  - reasoning (string): Explanation of the classification decision
  - prediction_score (float, 0 to 1): Confidence score
  - prediction (string): "positive" or "negative"
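For orientation, the two classes might look roughly like this (a sketch: the field descriptions are illustrative, and recent DSPy versions inject the reasoning output automatically via ChainOfThought):

```python
import dspy


class FlagAssessor(dspy.Signature):
    """Decide whether a content summary should be recommended."""

    summary: str = dspy.InputField(desc="Content summary to classify")
    prediction_score: float = dspy.OutputField(desc="Confidence between 0 and 1")
    prediction: str = dspy.OutputField(desc='Either "positive" or "negative"')


class FlagClassifier(dspy.Module):
    def __init__(self) -> None:
        super().__init__()
        # ChainOfThought prepends a `reasoning` output field to the signature,
        # which is where the explanation in the API response comes from.
        self.assess = dspy.ChainOfThought(FlagAssessor)

    def forward(self, summary: str) -> dspy.Prediction:
        return self.assess(summary=summary)
```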
Client Request
      ↓
POST /api/project/{project_id}/assess
      ↓
Validate project_id exists in loaded models
      ↓
Load project-specific FlagClassifier model
      ↓
Process summary with DSPy ChainOfThought
      ↓
Azure OpenAI API call (via DSPy)
      ↓
Parse prediction, score, reasoning
      ↓
Return AssessmentResponse JSON
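Put together, the handler behind this flow could look roughly like the following sketch (AssessmentRequest is an assumed name; the response fields and status codes follow the API section above):

```python
from fastapi import APIRouter, HTTPException, Request

from aim.schemas import AssessmentRequest, AssessmentResponse  # AssessmentRequest is assumed

router = APIRouter()


@router.post("/api/project/{project_id}/assess", response_model=AssessmentResponse)
async def assess_content(
    project_id: int, body: AssessmentRequest, request: Request
) -> AssessmentResponse:
    models = getattr(request.app.state, "models", None)
    if not models:
        raise HTTPException(status_code=503, detail="Models not loaded yet")

    key = str(project_id)  # path param is int, model dict keys are str
    if key not in models:
        raise HTTPException(status_code=404, detail=f"No model for project {key}")

    try:
        result = models[key](summary=body.summary)  # DSPy ChainOfThought call
    except Exception as exc:
        raise HTTPException(status_code=500, detail="Classification failed") from exc

    return AssessmentResponse(
        recommend=result.prediction == "positive",
        recommendation_score=float(result.prediction_score),
        reasoning=result.reasoning,
        project_id=key,
    )
```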
Create a .env file from .env.example:
# Azure OpenAI Configuration (required)
AIM_OPENAI_KEY=your-azure-openai-api-key-here
AZURE_ENDPOINT=https://aim-australia-east.openai.azure.com/
AZURE_MODEL_NAME=gpt-5-mini-hiring
AZURE_API_VERSION=2025-03-01-preview

Important:
- Development uses .env
- Tests use .env.test (automatically loaded by pytest-dotenv)
- Never commit .env files to version control
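For illustration only, the variables above could be mapped to typed fields with pydantic-settings (an assumption; config.py may implement this differently):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class AzureOpenAISettings(BaseSettings):
    """Maps the documented .env variables to typed fields (matched case-insensitively)."""

    model_config = SettingsConfigDict(env_file=".env")

    aim_openai_key: str     # AIM_OPENAI_KEY
    azure_endpoint: str     # AZURE_ENDPOINT
    azure_model_name: str   # AZURE_MODEL_NAME
    azure_api_version: str  # AZURE_API_VERSION


settings = AzureOpenAISettings()
```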
- pyproject.toml: Dependencies, tool configuration, and project metadata
- poetry.lock: Locked dependency versions for reproducibility
- .flake8: Flake8 linter configuration (legacy, mostly replaced by Ruff)
The application includes Docker support for containerized deployment.
1. Build the Docker image:
   make docker-build   # OR: docker build -t aim-classifier-api:latest .
2. Run with docker-compose (recommended):
   make docker-run   # OR: docker-compose up -d
3. View logs:
   make docker-logs   # OR: docker-compose logs -f
4. Stop containers:
   make docker-stop   # OR: docker-compose down
make docker-build # Build Docker image
make docker-run # Run application in Docker using docker-compose
make docker-stop # Stop Docker containers
make docker-logs # View Docker container logs
make docker-clean # Remove Docker containers and images
make docker-shell   # Open shell in running container

When running with Docker, ensure you have:
- A .env file with Azure OpenAI credentials
The application will be available at http://localhost:8000 after starting the container.
- Environment Variables: Set all required Azure OpenAI credentials
- Model Files: Ensure all project model files are present in src/aim/model_definitions/
- Dependencies: Install with poetry install --no-dev for production
- ASGI Server: Use a production ASGI server such as Gunicorn with Uvicorn workers:
  gunicorn aim.main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
- Health Checks: Use the /health endpoint for container health probes
- Monitoring: Monitor model loading status and Azure OpenAI API latency
- Model Loading: Models load at startup, not on demand. To add a new model, restart the app.
- Project ID Type: The URL path uses int, while model dict keys are str. Always convert: project_id_str = str(project_id)
- LLM Output Parsing: prediction_score may come back from the LLM as a string; parse it robustly with try/except (sketched below)
- Environment Files: Dev uses .env, tests use .env.test; they are separate files
- DSPy Global State: DSPy is configured once at startup; never reconfigure it in routes
- Coverage Threshold: 95% minimum; new code needs comprehensive tests
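For the LLM output parsing gotcha, a defensive conversion might look like this sketch:

```python
def parse_score(raw: str | float, default: float = 0.0) -> float:
    """Coerce the LLM's prediction_score into a float clamped to [0, 1]."""
    try:
        score = float(raw)  # handles both 0.85 and "0.85"
    except (TypeError, ValueError):
        return default  # fall back instead of failing the whole request
    return min(max(score, 0.0), 1.0)
```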