Rush Analytics Engine

High-performance, privacy-focused web analytics engine written in Rust. Built for speed, security, and scalability.

What It Does

Rush Analytics provides real-time web analytics without compromising user privacy:

  • High-Throughput Ingestion: Buffered event processing with async batch persistence
  • Privacy First: No PII stored. Visitor IDs are daily-rotating hashes derived with cryptographic salts
  • Production Ready: Dead letter queue, exponential backoff, graceful shutdown
  • Live Metrics: Real-time visitor tracking with Redis/memory hybrid cache
  • Multi-Tenant: CRUD API for managing multiple sites programmatically
  • Observable: Health checks, Prometheus metrics, structured JSON logging
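The daily-rotating visitor hashing mentioned above can be sketched in shell: the visitor ID is a hash of a per-day salt plus request attributes, so the same visitor maps to a stable ID within a day but an unlinkable ID the next day. The inputs and salt scheme below are illustrative assumptions, not the engine's actual implementation:

```shell
# Illustrative sketch of daily-rotating visitor hashing (not the engine's actual code).
# visitor_id = SHA-256(day || daily_salt || ip || user_agent)
day="2026-01-26"       # in production: today's UTC date, e.g. $(date -u +%F)
salt="3f1c9a"          # in production: a random secret regenerated each day
ip="203.0.113.7"
ua="Mozilla/5.0"

visitor_id=$(printf '%s' "${day}${salt}${ip}${ua}" | sha256sum | cut -d' ' -f1)

# Same inputs on the same day -> same ID (countable as one visitor)
again=$(printf '%s' "${day}${salt}${ip}${ua}" | sha256sum | cut -d' ' -f1)

# Next day, new salt -> different ID, so days cannot be linked together
next_day_id=$(printf '%s' "2026-01-27""9b2e4d""${ip}${ua}" | sha256sum | cut -d' ' -f1)
```

Because the raw IP and user agent are only hash inputs and the salt is discarded on rotation, yesterday's IDs cannot be reversed or correlated with today's.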

Architecture

Stack: Rust (Axum, Tokio, SQLx) + PostgreSQL + Redis (optional)

Design: Hexagonal architecture with clear separation:

  • src/api: HTTP handlers, DTOs, middleware
  • src/core: Domain models, business logic, ports (traits)
  • src/infra: Database repositories, config, state
  • src/workers: Background jobs (flusher, cleanup)

Quick Start

Prerequisites

  • Rust 1.75+
  • PostgreSQL 15+
  • Redis 7+ (optional, for live visitors)

Setup

  1. Clone and configure:

    git clone <repo-url>
    cd analytics
    cp .env.example .env
  2. Edit .env:

    DATABASE_URL=postgres://user:pass@localhost:5432/analytics
    REDIS_URL=redis://localhost:6379
    AUTH_SECRET=your_32_char_secret_key_min_length
    ADMIN_SECRET=admin_32_char_secret_key_min_length
  3. Run migrations:

    sqlx migrate run
  4. Start server:

    cargo run --release
  5. Verify:

    curl http://localhost:3000/health

API Usage

1. Create a Site (First Step)

Before ingesting events, create a site via the admin API:

curl -X POST http://localhost:3000/admin/sites \
  -H "Authorization: Bearer your_admin_secret_here" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "name": "My Website"
  }'

Response:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "domain": "example.com",
  "name": "My Website",
  "created_at": "2026-01-26T10:00:00Z"
}

Save the id - you'll need it for ingestion.

2. Ingest Events (Public Endpoint)

Track pageviews from your website:

curl -X POST http://localhost:3000/ingest \
  -H "Content-Type: application/json" \
  -H "Origin: https://example.com" \
  -d '{
    "site_id": "550e8400-e29b-41d4-a716-446655440000",
    "session_id": "unique-session-id",
    "path": "/blog/post-1",
    "referrer": "https://google.com",
    "utm_source": "newsletter"
  }'

Returns 202 Accepted - the event is buffered for async processing.

3. Query Stats (Protected Endpoint)

Retrieve analytics data:

curl "http://localhost:3000/stats/550e8400-e29b-41d4-a716-446655440000?period=7d" \
  -H "Authorization: Bearer your_auth_secret_here"

Response:

{
  "site_id": "550e8400-e29b-41d4-a716-446655440000",
  "from": "2026-01-19",
  "to": "2026-01-26",
  "summary": {
    "total_visitors": 1523,
    "total_pageviews": 4891,
    "avg_bounce_rate": 42.5,
    "live_visitors": 12
  },
  "chart_data": [...],
  "top_pages": [...],
  "top_sources": [...]
}

4. Additional Endpoints

Site Management (Admin):

  • GET /admin/sites - List all sites
  • GET /admin/sites/:id - Get site details
  • PUT /admin/sites/:id - Update site
  • DELETE /admin/sites/:id - Delete site

Operations:

  • GET /health - System health check
  • POST /admin/dlq/replay - Replay failed events from DLQ

For complete API documentation, request/response schemas, and error codes, see:

  • OpenAPI Spec: Coming soon
  • Integration Guide: Check docs/ directory

Development

Run Tests

# All tests (unit + integration + e2e)
cargo test

# E2E flows only
cargo test --test e2e_flows

# With logs
cargo test -- --nocapture

Linting

# Format check
cargo fmt --check

# Clippy (strict mode)
cargo clippy -- -D warnings

Docker

# Production build
docker compose up -d

# Access at http://localhost:3000

Configuration Reference

All environment variables with defaults:

# Database (required)
DATABASE_URL=postgres://user:pass@localhost:5432/analytics

# Cache (optional)
REDIS_URL=redis://localhost:6379

# Server
PORT=3000

# Security (min 32 chars each)
AUTH_SECRET=your_secret_key_here
ADMIN_SECRET=admin_secret_key_here

# CORS (comma-separated)
ALLOWED_ORIGINS=https://example.com,https://www.example.com

# Buffer & Performance
BUFFER_SIZE=1000                    # Events before flush
INGEST_RATE_LIMIT_PER_SEC=100      # Requests/sec
INGEST_BURST_SIZE=200              # Burst allowance

# Flusher Worker
FLUSHER_MAX_RETRIES=3
FLUSHER_INITIAL_BACKOFF_MS=100
FLUSHER_MAX_BACKOFF_MS=5000

# Retention
DATA_RETENTION_DAYS=90             # Auto-cleanup old partitions
CLEANUP_DRY_RUN=false              # Set true to test cleanup
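The FLUSHER_* settings above imply a capped exponential backoff: the retry delay starts at FLUSHER_INITIAL_BACKOFF_MS and grows until clamped at FLUSHER_MAX_BACKOFF_MS. A sketch of the resulting schedule, assuming a doubling factor (the actual worker may use a different multiplier or add jitter):

```shell
# Capped exponential backoff schedule (doubling factor is an assumption;
# the real flusher may add jitter or use a different multiplier).
initial_ms=100     # FLUSHER_INITIAL_BACKOFF_MS
max_ms=5000        # FLUSHER_MAX_BACKOFF_MS
retries=8          # more than FLUSHER_MAX_RETRIES=3, just to show the cap

backoff=$initial_ms
schedule=""
for i in $(seq 1 $retries); do
  schedule="$schedule $backoff"
  backoff=$(( backoff * 2 ))
  [ "$backoff" -gt "$max_ms" ] && backoff=$max_ms
done
schedule="${schedule# }"

echo "$schedule"   # 100 200 400 800 1600 3200 5000 5000
```

After FLUSHER_MAX_RETRIES failed attempts, the batch goes to the dead letter queue for later replay via POST /admin/dlq/replay.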

Observability

Health Check

curl http://localhost:3000/health

Returns:

  • System uptime
  • Database latency
  • Redis status
  • Worker health
  • Buffer capacity

Metrics (Prometheus)

Coming in v0.2.0. Will expose:

  • Request rates by endpoint
  • P50/P95/P99 latencies
  • Buffer utilization
  • Flush success/failure rates

Troubleshooting

Database Connection Errors

Error: could not connect to database

Fix: Verify DATABASE_URL format and PostgreSQL is running:

psql "$DATABASE_URL" -c "SELECT 1"

Auth Secret Too Short

ERROR: AUTH_SECRET must be at least 32 characters

Fix: Generate a secure secret:

openssl rand -base64 32

SQLx Offline Mode

If you see sqlx-data.json errors:

# Regenerate query metadata
cargo sqlx prepare --database-url "$DATABASE_URL"

Port Already in Use

Error: Address already in use (os error 98)

Fix: Change port in .env or kill existing process:

lsof -ti:3000 | xargs kill -9

Production Deployment

Environment Checklist

  • Strong secrets (32+ chars, random)
  • CORS origins configured
  • PostgreSQL with SSL enabled
  • Redis persistence enabled
  • Log aggregation configured
  • Health checks monitored
  • Auto-restart on crash

Performance Tips

  1. Database: Use connection pooling (default pool size: 10)
  2. Buffer: Tune BUFFER_SIZE based on traffic (100-10000)
  3. Redis: Enable AOF persistence for live visitor data
  4. Partitions: Monitor partition_drops table for cleanup logs
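For tip 2, a quick way to reason about BUFFER_SIZE: at a steady ingest rate, the buffer fills in BUFFER_SIZE / rate seconds, which bounds both how often size-triggered flushes happen and the worst-case number of events at risk if the process dies without a graceful shutdown. The numbers below are illustrative:

```shell
# Rough BUFFER_SIZE sizing: seconds between size-triggered flushes at a steady rate.
# Illustrative numbers; tune against your real traffic.
rate_per_sec=100    # sustained ingest rate (matches the INGEST_RATE_LIMIT_PER_SEC default)
buffer_size=1000    # BUFFER_SIZE

flush_interval_sec=$(( buffer_size / rate_per_sec ))
echo "flush every ~${flush_interval_sec}s; up to ${buffer_size} events at risk on hard crash"
```

A larger buffer means fewer, bigger batch writes (better DB throughput) at the cost of more unflushed events in memory; a smaller buffer inverts that trade-off.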

License

MIT - See LICENSE file.

Contributing

See CONTRIBUTING.md for guidelines.

Support

  • Issues: GitHub Issues
  • Docs: docs/ directory
  • Changelog: CHANGELOG.md

About

A high-performance, privacy-focused analytics engine written in Rust.

https://rushcms.com/analytics
