API Rate Limiting

Eric Fitzgerald edited this page Jan 24, 2026 · 2 revisions

Version: 2.0.0
Last Updated: 2026-01-24
Status: Fully Implemented

Overview

The TMI API implements rate limiting to protect against abuse, ensure fair resource allocation, and maintain service availability. Rate limits are applied at multiple scopes depending on the endpoint category and authentication status.

This document details the rate limiting strategy that the OpenAPI specification declares via x-rate-limit extensions.

Table of Contents

  1. Rate Limiting Strategy
  2. Tier Definitions
  3. Multi-Scope Rate Limiting
  4. Configurable Quotas
  5. Quota Caching
  6. Rate Limit Headers
  7. Client Integration
  8. Database Schema
  9. Implementation Notes

Rate Limiting Strategy

TMI uses a tiered rate limiting approach with five distinct tiers:

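| Tier | Name                | Scope       | Configurable | Endpoint Count |
|------|---------------------|-------------|--------------|----------------|
| 1    | Public Discovery    | IP          | No           | 5              |
| 2    | Auth Flows          | Multi-scope | No           | 9              |
| 3    | Resource Operations | User        | Yes          | 112            |
| 4    | Webhooks            | User        | Yes          | 7              |
| 5    | Addon Invocations   | User        | Yes (DB)     | 3              |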

Design Principles

  1. Unauthenticated endpoints use IP-based rate limiting
  2. Authenticated endpoints use user-based rate limiting (extracted from JWT subject)
  3. Auth flow endpoints use multi-scope limiting to balance security and usability
  4. Webhook endpoints leverage existing database-backed quota system
  5. Configurable limits allow per-user customization for resource operations and webhooks

Tier Definitions

Tier 1: Public Discovery

Applies to: Unauthenticated endpoints that provide API metadata and discovery information.

Endpoints:

  • GET / - API information
  • GET /.well-known/openid-configuration - OpenID configuration
  • GET /.well-known/oauth-authorization-server - OAuth metadata
  • GET /.well-known/jwks.json - JSON Web Key Set
  • GET /.well-known/oauth-protected-resource - Protected resource metadata

Rate Limit Configuration:

scope: ip
tier: public-discovery
limits:
  - type: requests_per_minute
    default: 10
    configurable: false
    tracking_method: Source IP address

Rationale:

  • These endpoints are cacheable and low-cost
  • Low limit (10/min) prevents excessive polling
  • IP-based tracking is appropriate for unauthenticated access

Tier 2: Auth Flows

Applies to: OAuth 2.0 and SAML 2.0 authentication endpoints.

Endpoints:

  • OAuth: /oauth2/authorize, /oauth2/callback, /oauth2/token, /oauth2/refresh, /oauth2/introspect
  • SAML: /saml/login, /saml/acs, /saml/slo (GET and POST)

Rate Limit Configuration:

strategy: multi-scope
tier: auth-flows
scopes:
  - name: session
    limits:
      - type: requests_per_minute
        default: 5
        configurable: false
        tracking_method: OAuth state parameter or SAML request ID
  - name: ip
    limits:
      - type: requests_per_minute
        default: 100
        configurable: false
        tracking_method: Source IP address
  - name: user_identifier
    limits:
      - type: attempts_per_hour
        default: 10
        configurable: false
        tracking_method: login_hint parameter or email address
enforcement: Most restrictive limit applies

Multi-Scope Enforcement:

Auth flow endpoints use three concurrent rate limit scopes:

  1. Session Scope (5 requests/minute)

    • Prevents individual browser sessions from hammering the endpoint
    • Tracked via OAuth state parameter or SAML request ID
    • Protects against misconfigured clients or tight retry loops
  2. IP Scope (100 requests/minute)

    • Prevents DoS from single IP address
    • High limit accommodates large organizations that share one address (corporate NAT, universities)
    • Avoids penalizing multi-user applications that sit behind a shared IP
  3. User Identifier Scope (10 attempts/hour)

    • Prevents credential stuffing attacks on specific accounts
    • Tracked via login_hint parameter (OAuth) or email/username (form inputs)
    • Independent of session or IP for maximum protection

Example Scenarios:

| Scenario                          | Session Limit | IP Limit      | User Limit       | Result                  |
|-----------------------------------|---------------|---------------|------------------|-------------------------|
| Single user, normal login         | 1/min         | 1/min         | 1/hour           | Allowed                 |
| User refreshing page rapidly      | 6/min         | 6/min         | 6/hour           | Blocked (session limit) |
| Corporate office (100 users)      | 1/min each    | 100/min total | 1/hour each      | Allowed                 |
| Attacker trying alice@example.com | 5/min         | 5/min         | 11/hour          | Blocked (user limit)    |
| Distributed botnet                | Varies        | Varies        | 11/hour per user | Blocked (user limit)    |

Rationale:

  • Single IP limit alone would block legitimate users in shared environments
  • Session tracking prevents tight retry loops
  • User identifier tracking prevents account takeover attempts
  • Most restrictive limit applies - any scope hitting its limit blocks the request

Tier 3: Resource Operations

Applies to: All authenticated endpoints for threat models, diagrams, users, and collaboration.

Endpoints:

  • User management: /me, /oauth2/userinfo
  • Threat models: /threat_models/*
  • Diagrams: /threat_models/{id}/diagrams/*
  • Sub-resources: Assets, threats, documents, notes, repositories, metadata
  • Collaboration: /me/sessions

Rate Limit Configuration:

scope: user
tier: resource-operations
limits:
  - type: requests_per_minute
    default: 1000
    configurable: true
    quota_source: user_api_quotas

User-Based Tracking:

  • Rate limit applied per JWT subject (user ID)
  • Default: 1000 requests/minute per user
  • Configurable: Operators can customize limits per user via database

Quota Source:

  • Table: user_api_quotas
  • Schema includes:
    • user_internal_uuid (UUID, primary key, foreign key to users)
    • max_requests_per_minute (INT, default 100 in DB, 1000 in code)
    • max_requests_per_hour (INT, default NULL)
    • created_at, modified_at (timestamps)

Rationale:

  • 1000 req/min supports interactive UI usage and reasonable automation
  • User-based tracking ensures fair allocation across all users
  • Configurability allows VIP users, integrations, or CI/CD to have higher limits
  • Existing pattern from webhook quotas ensures consistency

Tier 4: Webhooks

Applies to: Webhook subscription management and delivery history.

Endpoints:

  • /webhooks/subscriptions (GET, POST)
  • /webhooks/subscriptions/{id} (GET, DELETE)
  • /webhooks/subscriptions/{id}/test (POST)
  • /webhooks/deliveries (GET)
  • /webhooks/deliveries/{id} (GET)

Rate Limit Configuration:

scope: user
tier: webhooks
limits:
  - type: subscription_requests_per_minute
    default: 10
    configurable: true
    quota_source: webhook_quotas.max_subscription_requests_per_minute
  - type: subscription_requests_per_day
    default: 20
    configurable: true
    quota_source: webhook_quotas.max_subscription_requests_per_day
  - type: events_per_minute
    default: 12
    configurable: true
    quota_source: webhook_quotas.max_events_per_minute
  - type: max_subscriptions
    default: 10
    configurable: true
    quota_source: webhook_quotas.max_subscriptions

Multiple Rate Limits:

Webhook endpoints enforce four distinct limits:

  1. Subscription Requests Per Minute (10/min)

    • Applies to: POST, DELETE on /webhooks/subscriptions
    • Prevents rapid subscription churn
  2. Subscription Requests Per Day (20/day)

    • Applies to: POST, DELETE on /webhooks/subscriptions
    • Prevents subscription quota farming
  3. Events Per Minute (12/min)

    • Applies to: Webhook event publications (not HTTP API calls)
    • Limits rate of events sent to user's subscriptions
  4. Max Subscriptions (10 total)

    • Static limit on number of active subscriptions per user
    • Prevents resource exhaustion

Existing Implementation:

Webhook rate limiting is fully implemented:

  • Database table: webhook_quotas (see docs/reference/legacy-migrations/002_business_domain.up.sql)
  • Rate limiter: api/webhook_rate_limiter.go
  • Storage: Redis sorted sets for sliding window algorithm
  • Tests: api/webhook_rate_limiter_test.go

Rationale:

  • Multiple limits provide granular control over webhook usage
  • Database-backed quotas proven effective in implementation
  • Configurable limits support different subscription tiers
  • Event publication limit prevents webhook spam

Tier 5: Addon Invocations

Applies to: Add-on invocation endpoints for executing custom code against threat models.

Endpoints:

  • /addons/{addon_id}/invoke (POST)
  • /addons/invocations/{invocation_id} (GET)
  • /addons/invocations/{invocation_id} (DELETE)

Rate Limit Configuration:

scope: user
tier: addon-invocations
limits:
  - type: max_active_invocations
    default: 3
    configurable: true
    quota_source: addon_invocation_quotas.max_active_invocations
  - type: invocations_per_hour
    default: 10
    configurable: true
    quota_source: addon_invocation_quotas.max_invocations_per_hour
tracking_method: Sliding window with Redis sorted sets

Enforcement Details:

  1. Active Invocation Limit (3 concurrent)

    • Prevents users from running too many addons simultaneously
    • Checked before creating new invocation
    • Releases when invocation completes or times out
  2. Hourly Rate Limit (10/hour)

    • Sliding window using Redis sorted sets
    • Prevents addon abuse and resource exhaustion
    • Old entries automatically cleaned up
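
The two checks above can be sketched for a single user as follows. This is a minimal in-memory illustration, not the TMI implementation (which the page says uses Redis sorted sets); the class name `AddonInvocationLimiter` and its methods are hypothetical, and the order in which the two limits are checked is an assumption.

```python
import time
from collections import deque


class AddonInvocationLimiter:
    """Per-user limiter: a cap on active invocations plus a sliding hourly window.

    Hypothetical in-memory stand-in; defaults mirror the Tier 5 configuration.
    """

    def __init__(self, max_active=3, max_per_hour=10):
        self.max_active = max_active
        self.max_per_hour = max_per_hour
        self.active = set()    # IDs of invocations still running
        self.recent = deque()  # start times within the last hour

    def try_start(self, invocation_id, now=None):
        """Return (allowed, violated_limit_name)."""
        now = time.time() if now is None else now
        # Drop start times that have slid out of the one-hour window.
        while self.recent and self.recent[0] <= now - 3600:
            self.recent.popleft()
        if len(self.active) >= self.max_active:
            return False, "max_active_invocations"
        if len(self.recent) >= self.max_per_hour:
            return False, "invocations_per_hour"
        self.active.add(invocation_id)
        self.recent.append(now)
        return True, None

    def finish(self, invocation_id):
        """Release the active slot when an invocation completes or times out."""
        self.active.discard(invocation_id)
```

Finishing an invocation frees an active slot but does not refund the hourly window, so a user who runs short jobs back to back still hits the hourly limit.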

Database Schema:

Table: addon_invocation_quotas

CREATE TABLE IF NOT EXISTS addon_invocation_quotas (
    owner_internal_uuid UUID PRIMARY KEY,
    max_active_invocations INT NOT NULL DEFAULT 1,
    max_invocations_per_hour INT NOT NULL DEFAULT 10,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (owner_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

Implementation Status:

  • Database table created
  • GlobalAddonInvocationQuotaStore initialized
  • GlobalAddonRateLimiter created and integrated
  • Rate limiting enforced in addon invocation handlers
  • Admin API endpoints for managing quotas implemented

Rationale:

  • Addons execute custom code and consume significant resources
  • Capping concurrent invocations (default 3) prevents resource exhaustion
  • Hourly limit prevents abuse while allowing reasonable automation
  • Database-backed quotas allow per-user customization for power users

Multi-Scope Rate Limiting

Overview

Multi-scope rate limiting applies multiple independent rate limits to a single request, enforcing the most restrictive limit. This approach balances security with usability.

How It Works

For each request to an auth flow endpoint:

  1. Extract identifiers from request:

    • Session ID (OAuth state or SAML request ID)
    • Source IP address
    • User identifier (login_hint, email, or username)
  2. Check all scopes against their respective limits:

    • Session: 5 requests/minute
    • IP: 100 requests/minute
    • User: 10 attempts/hour
  3. Enforce most restrictive limit:

    • If ANY scope exceeds its limit -> Return 429
    • If ALL scopes are under limit -> Allow request
  4. Record request in all applicable scopes
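
The four steps above can be sketched as follows. This is an in-memory illustration under stated assumptions, not the TMI code: `MultiScopeLimiter` and `SCOPE_LIMITS` are hypothetical names, the real limiter stores its windows in Redis, and whether a blocked request is also recorded is not stated on this page (here it is not).

```python
import time
from collections import defaultdict, deque

# (max requests, window in seconds), taken from the Tier 2 configuration.
SCOPE_LIMITS = {
    "session": (5, 60),
    "ip": (100, 60),
    "user_identifier": (10, 3600),
}


class MultiScopeLimiter:
    """Hypothetical in-memory stand-in for the Redis-backed auth flow limiter."""

    def __init__(self, limits=SCOPE_LIMITS):
        self.limits = limits
        self.events = defaultdict(deque)  # (scope, identifier) -> timestamps

    def check(self, identifiers, now=None):
        """identifiers maps scope name -> extracted value (or None if absent).

        Returns (allowed, blocking_scope).
        """
        now = time.time() if now is None else now
        # Step 1: only scopes whose identifier was present are applicable.
        applicable = [(s, i) for s, i in identifiers.items()
                      if i and s in self.limits]

        # Step 2: check every applicable scope before recording anything.
        for scope, ident in applicable:
            limit, window = self.limits[scope]
            stamps = self.events[(scope, ident)]
            while stamps and stamps[0] <= now - window:
                stamps.popleft()
            if len(stamps) >= limit:
                # Step 3: any exhausted scope blocks the request (HTTP 429).
                return False, scope

        # Step 4: record the request in all applicable scopes.
        for scope, ident in applicable:
            self.events[(scope, ident)].append(now)
        return True, None
```

Checking all scopes before recording keeps a blocked request from consuming quota in the scopes that still had room; that ordering is a design choice of this sketch.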

Tracking Mechanisms

Session Tracking:

  • OAuth: Extract from state query parameter
  • SAML: Extract from SAMLRequest or RelayState
  • Lifespan: Typically 5-15 minutes (OAuth spec)
  • Storage: Redis sorted set per session ID

IP Tracking:

  • Source IP from X-Forwarded-For (if trusted proxy) or direct connection
  • Storage: Redis sorted set per IP address
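
One common way to apply the "if trusted proxy" rule is to walk X-Forwarded-For from right to left and stop at the first hop that is not a known proxy. The sketch below assumes that approach; the function name `client_ip` and the `trusted_proxies` parameter are hypothetical, and TMI's actual extraction logic may differ.

```python
import ipaddress


def client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Pick the address to rate limit on.

    remote_addr: address of the direct TCP peer.
    x_forwarded_for: raw header value, or None.
    trusted_proxies: CIDR strings for proxies whose header we believe.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(addr):
        ip = ipaddress.ip_address(addr)  # raises ValueError on garbage
        return any(ip in net for net in networks)

    # Ignore the header entirely unless the direct peer is a trusted proxy;
    # otherwise any client could spoof its own address.
    if not x_forwarded_for or not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # The right-most entries were appended by our own proxies; the first
    # untrusted hop from the right is the closest address we can believe.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            return remote_addr  # malformed header: fall back to the peer
    return remote_addr
```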

User Identifier Tracking:

  • OAuth: login_hint query parameter (optional)
  • Form login: Username or email field
  • Only tracked when identifier is provided
  • Storage: Redis sorted set per normalized identifier (lowercase email)

Redis Key Patterns

# Session scope
ratelimit:session:{state_or_request_id}:minute

# IP scope
ratelimit:ip:{ip_address}:minute

# User identifier scope
ratelimit:user:{normalized_email}:hour
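
A small helper that produces keys in these patterns might look like the sketch below. The function name `rate_limit_key` is hypothetical, and trimming whitespace in addition to lowercasing is an assumption (the page only mentions lowercase emails).

```python
def rate_limit_key(scope: str, identifier: str, window: str) -> str:
    """Build a key of the form ratelimit:{scope}:{identifier}:{window}."""
    if scope == "user":
        # Normalize so "Alice@Example.com" and "alice@example.com" share a
        # counter. Stripping whitespace is an assumption of this sketch.
        identifier = identifier.strip().lower()
    return f"ratelimit:{scope}:{identifier}:{window}"
```

Session identifiers are left untouched because OAuth state values are opaque and case-sensitive.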

Graceful Degradation

If Redis is unavailable:

  • Session and user limits: Disabled (logs warning)
  • IP limit: Falls back to in-memory tracking (loses distributed state)
  • Service continues: Requests are allowed rather than rejected, favoring availability over strict enforcement

Configurable Quotas

Overview

Tiers 3 (Resource Operations), 4 (Webhooks), and 5 (Addon Invocations) support per-user configurable quotas stored in PostgreSQL. This allows operators to:

  • Increase limits for VIP users or integrations
  • Implement tiered subscription plans
  • Grant higher quotas to CI/CD systems
  • Throttle specific users if needed

Default Values

| Quota Type       | Field                                | Default Value    |
|------------------|--------------------------------------|------------------|
| User API         | max_requests_per_minute              | 1000             |
| User API         | max_requests_per_hour                | 60000 (optional) |
| Webhook          | max_subscriptions                    | 10               |
| Webhook          | max_events_per_minute                | 12               |
| Webhook          | max_subscription_requests_per_minute | 10               |
| Webhook          | max_subscription_requests_per_day    | 20               |
| Addon Invocation | max_active_invocations               | 3                |
| Addon Invocation | max_invocations_per_hour             | 10               |

Admin API Endpoints

TMI provides comprehensive quota management for administrators to control resource limits per user. All quota endpoints require administrator privileges.

User API Quota Endpoints

GET    /admin/quotas/users              # List all custom user API quotas
GET    /admin/quotas/users/{user_id}    # Get user's API quota
PUT    /admin/quotas/users/{user_id}    # Create/update user's API quota
DELETE /admin/quotas/users/{user_id}    # Delete quota (revert to defaults)

Example: Set higher quota for power user

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_requests_per_minute": 5000, "max_requests_per_hour": 300000}' \
  "https://api.example.com/admin/quotas/users/550e8400-e29b-41d4-a716-446655440000"

Webhook Quota Endpoints

GET    /admin/quotas/webhooks              # List all custom webhook quotas
GET    /admin/quotas/webhooks/{user_id}    # Get user's webhook quota
PUT    /admin/quotas/webhooks/{user_id}    # Create/update webhook quota
DELETE /admin/quotas/webhooks/{user_id}    # Delete quota (revert to defaults)

Example: Set webhook quota

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "max_subscriptions": 20,
    "max_events_per_minute": 24,
    "max_subscription_requests_per_minute": 20,
    "max_subscription_requests_per_day": 40
  }' \
  "https://api.example.com/admin/quotas/webhooks/550e8400-e29b-41d4-a716-446655440000"

Addon Invocation Quota Endpoints

GET    /admin/quotas/addons              # List all custom addon invocation quotas
GET    /admin/quotas/addons/{user_id}    # Get user's addon invocation quota
PUT    /admin/quotas/addons/{user_id}    # Create/update addon invocation quota
DELETE /admin/quotas/addons/{user_id}    # Delete quota (revert to defaults)

Example: Set addon invocation quota

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_active_invocations": 5, "max_invocations_per_hour": 100}' \
  "https://api.example.com/admin/quotas/addons/550e8400-e29b-41d4-a716-446655440000"

Common Response Codes

| Code             | Description                            |
|------------------|----------------------------------------|
| 200 OK           | Request successful, returns quota data |
| 201 Created      | Quota created successfully             |
| 204 No Content   | Quota deleted successfully             |
| 400 Bad Request  | Invalid request body or user ID        |
| 401 Unauthorized | Missing or invalid authentication      |
| 403 Forbidden    | User is not an administrator           |
| 404 Not Found    | Quota or user not found                |

Best Practices

  1. List Before Modifying: Use list endpoints to discover which users have custom quotas
  2. Pagination: Always use limit and offset parameters for large result sets
  3. Default Values: Only set custom quotas when needed; defaults work for most users
  4. Documentation: Document why specific users have custom quotas

Quota Caching

Overview

To avoid database queries on every API request, TMI implements an in-memory quota cache with automatic expiration and invalidation.

Cache Implementation

Global Instance: GlobalQuotaCache (initialized in main.go)

Configuration:

  • TTL: 60 seconds (configurable)
  • Storage: In-memory maps with read-write mutex
  • Cleanup: Automatic background goroutine removes expired entries

Cached Data:

  • User API quotas (map[string]*cachedUserAPIQuota)
  • Webhook quotas (map[string]*cachedWebhookQuota)

Cache Behavior

On Cache Miss:

  1. Fetch quota from database via store interface
  2. Store in cache with expiration timestamp (now + TTL)
  3. Return quota to caller

On Cache Hit:

  1. Check whether the entry is still valid (time.Now().Before(expiresAt))
  2. If still valid: Return cached quota
  3. If expired: Fetch from database and update cache

Cache Invalidation

Per-User Invalidation (Primary):

GlobalQuotaCache.InvalidateUserAPIQuota(userID)  // Removes specific user's API quota
GlobalQuotaCache.InvalidateWebhookQuota(userID)  // Removes specific user's webhook quota

Automatic Invalidation:

  • Called automatically when admin updates user quota via PUT endpoint
  • Called automatically when admin deletes user quota via DELETE endpoint
  • Ensures quota changes take effect on the next cache lookup instead of waiting for the TTL to expire

Global Invalidation (Available but not exposed):

GlobalQuotaCache.InvalidateAll()  // Clears all cached quotas
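
The miss, hit, expiry, and invalidation behavior described in this section can be sketched as a small TTL cache. This is an illustration only: the class `QuotaCache` and its `loader` and `clock` parameters are hypothetical, and it omits the locking and background cleanup that the Go implementation is described as having.

```python
import time


class QuotaCache:
    """Minimal TTL cache keyed by user ID (single-threaded sketch)."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self.loader = loader  # called as loader(user_id) on a miss
        self.ttl = ttl_seconds
        self.clock = clock    # injectable so expiry can be tested
        self.entries = {}     # user_id -> (quota, expires_at)

    def get(self, user_id):
        now = self.clock()
        entry = self.entries.get(user_id)
        # Hit: the entry exists and has not reached its expiry time.
        if entry is not None and now < entry[1]:
            return entry[0]
        # Miss or expired: reload and store with a fresh expiry.
        quota = self.loader(user_id)
        self.entries[user_id] = (quota, now + self.ttl)
        return quota

    def invalidate(self, user_id):
        """Drop one user's entry so the next get() reloads it."""
        self.entries.pop(user_id, None)

    def invalidate_all(self):
        self.entries.clear()
```

Calling `invalidate` from the code path that writes a quota is what makes an admin change visible on the very next lookup.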

Performance Impact

Without Caching:

  • Database query on every API request
  • ~5-10ms latency per request
  • Increased database load

With Caching:

  • Database query only on cache miss (at most once every 60 seconds per user)
  • ~0.1ms latency for cache hits
  • 99%+ reduction in database queries

Trade-off:

  • Quota changes take up to 60 seconds to propagate (or immediate with invalidation)
  • Small memory overhead (negligible for typical user counts)
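
As a rough check on the reduction figure, assume one database query per user per TTL and a steady request rate. The helper below is hypothetical and ignores invalidations and uneven traffic.

```python
def query_reduction(requests_per_minute: float, ttl_seconds: float = 60.0) -> float:
    """Fraction of quota lookups that the cache absorbs for one user.

    Assumes a steady request rate and exactly one database query per TTL.
    """
    requests_per_ttl = requests_per_minute * ttl_seconds / 60.0
    if requests_per_ttl <= 1:
        return 0.0  # every request arrives after the entry has expired
    return 1.0 - 1.0 / requests_per_ttl
```

Under these assumptions a user sending 100 requests per minute sees a 99% reduction and a user at the 1000/min default sees 99.9%, while users who send only a few requests per minute benefit much less.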

Implementation Details

Location: api/quota_cache.go

Key Features:

  • Thread-safe with sync.RWMutex
  • Automatic cleanup goroutine prevents memory leaks
  • Graceful shutdown via Stop() method
  • Falls back to database on cache failure

Rate Limit Headers

When a request exceeds a rate limit, the API returns HTTP 429 with informative headers:

Response Headers

| Header                | Type    | Description                          | Example    |
|-----------------------|---------|--------------------------------------|------------|
| X-RateLimit-Limit     | Integer | Maximum requests allowed in window   | 100        |
| X-RateLimit-Remaining | Integer | Requests remaining in current window | 0          |
| X-RateLimit-Reset     | Integer | Unix timestamp when window resets    | 1640000000 |
| Retry-After           | Integer | Seconds to wait before retrying      | 60         |

Example 429 Response

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1732233600
Retry-After: 45

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded: 100 requests per minute. Retry after 45 seconds.",
  "details": {
    "limit": 100,
    "window": "minute",
    "retry_after": 45
  }
}
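
To show how the header values relate to each other, here is a sketch that derives them from a window's state. The function `rate_limit_headers` and its parameters are hypothetical; in particular, treating the reset time as "window start plus window length" is an assumption, since the page does not say how the sliding window reports its reset.

```python
import math


def rate_limit_headers(limit, used, window_start, window_seconds, now):
    """Derive response headers from the current window state.

    window_start and now are Unix timestamps in seconds.
    """
    reset = int(window_start + window_seconds)
    remaining = max(0, limit - used)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset),
    }
    if used >= limit:
        # Only meaningful on a 429: whole seconds until the window resets.
        headers["Retry-After"] = str(max(1, math.ceil(reset - now)))
    return headers
```

With a window that started at 1732233540 and a request arriving 15 seconds later, this yields the same Reset (1732233600) and Retry-After (45) values as the example response.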

Multi-Scope Headers

For auth flow endpoints with multi-scope limits, headers reflect the most restrictive scope:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1732233660
Retry-After: 30
X-RateLimit-Scope: session

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded: 5 requests per minute per session. Retry after 30 seconds.",
  "details": {
    "limit": 5,
    "scope": "session",
    "window": "minute",
    "retry_after": 30
  }
}

Client Integration

Best Practices

  1. Always check rate limit headers in responses (even 200 OK)
  2. Implement exponential backoff when receiving 429
  3. Respect Retry-After header before retrying
  4. Pre-emptively throttle when X-RateLimit-Remaining is low

Sample Client Code

Python

import requests
import time

def make_request_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
            continue

        # Check remaining quota
        remaining = int(response.headers.get('X-RateLimit-Remaining', 100))
        if remaining < 10:
            print(f"Warning: Only {remaining} requests remaining")

        return response

    raise Exception("Max retries exceeded")

Go

import (
    "fmt"
    "log"
    "net/http"
    "strconv"
    "time"
)

func makeRequestWithRetry(url string, token string, maxRetries int) (*http.Response, error) {
    client := &http.Client{}

    for attempt := 0; attempt < maxRetries; attempt++ {
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            return nil, err
        }
        req.Header.Set("Authorization", "Bearer "+token)

        resp, err := client.Do(req)
        if err != nil {
            return nil, err
        }

        if resp.StatusCode == http.StatusTooManyRequests {
            // Close the body before retrying so the connection can be reused.
            resp.Body.Close()

            // Fall back to 60 seconds if Retry-After is missing or not an integer.
            retryAfter, err := strconv.Atoi(resp.Header.Get("Retry-After"))
            if err != nil || retryAfter <= 0 {
                retryAfter = 60
            }
            log.Printf("Rate limited. Waiting %d seconds...", retryAfter)
            time.Sleep(time.Duration(retryAfter) * time.Second)
            continue
        }

        // Check remaining quota; skip the warning when the header is absent.
        remaining, err := strconv.Atoi(resp.Header.Get("X-RateLimit-Remaining"))
        if err == nil && remaining < 10 {
            log.Printf("Warning: Only %d requests remaining", remaining)
        }

        // The caller is responsible for closing resp.Body.
        return resp, nil
    }

    return nil, fmt.Errorf("max retries exceeded")
}

JavaScript/TypeScript

async function makeRequestWithRetry(
    url: string,
    token: string,
    maxRetries: number = 3
): Promise<Response> {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
        const response = await fetch(url, {
            headers: { 'Authorization': `Bearer ${token}` }
        });

        if (response.status === 429) {
            const retryAfter = parseInt(response.headers.get('Retry-After') || '60');
            console.log(`Rate limited. Waiting ${retryAfter} seconds...`);
            await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
            continue;
        }

        // Check remaining quota
        const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '100');
        if (remaining < 10) {
            console.warn(`Warning: Only ${remaining} requests remaining`);
        }

        return response;
    }

    throw new Error('Max retries exceeded');
}

Database Schema

Existing Tables

webhook_quotas

See docs/reference/legacy-migrations/002_business_domain.up.sql for complete schema.

Purpose: Store per-user webhook rate limits and subscription quotas.

Key Fields:

  • owner_id - User UUID (primary key)
  • max_subscriptions - Maximum active subscriptions (default: 10)
  • max_events_per_minute - Event publication rate (default: 12)
  • max_subscription_requests_per_minute - API request rate (default: 10)
  • max_subscription_requests_per_day - Daily API quota (default: 20)

user_api_quotas

Purpose: Store per-user API rate limits for resource operations.

Status: Fully implemented (table exists in database)

Schema:

CREATE TABLE IF NOT EXISTS user_api_quotas (
    user_internal_uuid UUID PRIMARY KEY,
    max_requests_per_minute INT NOT NULL DEFAULT 100,
    max_requests_per_hour INT DEFAULT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

addon_invocation_quotas

Purpose: Store per-user addon invocation rate limits.

Status: Fully implemented (table exists in database)

Schema:

CREATE TABLE IF NOT EXISTS addon_invocation_quotas (
    owner_internal_uuid UUID PRIMARY KEY,
    max_active_invocations INT NOT NULL DEFAULT 1,
    max_invocations_per_hour INT NOT NULL DEFAULT 10,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (owner_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

Implementation Notes

Current Status

Fully Implemented:

  • OpenAPI specification with x-rate-limit extensions
  • 429 response component with proper headers
  • API rate limiter (Redis-based sliding window)
  • Webhook rate limiter (Redis-based sliding window)
  • Addon invocation rate limiter (active + hourly limits)
  • user_api_quotas database table and store
  • webhook_quotas database table and store
  • addon_invocation_quotas database table and store
  • Quota caching system with 60s TTL
  • Cache invalidation on quota updates
  • Rate limiting middleware (RateLimitMiddleware)
  • Middleware registered in router (main.go)
  • Admin API for user API quota management
  • Admin API for webhook quota management
  • Admin API for addon invocation quota management
  • Rate limit headers (X-RateLimit-*)
  • Comprehensive test coverage for rate limiting

Partially Implemented:

  • Multi-scope rate limiter for auth flows (code exists, basic integration)
  • IP-based rate limiting for public endpoints (code exists)

Middleware Integration

Rate Limit Middleware (api/rate_limit_middleware.go):

  • Registered globally in router: r.Use(api.RateLimitMiddleware(apiServer))
  • Applied to all authenticated endpoints
  • Skips public discovery endpoints (/, /.well-known/*)
  • Skips auth flow endpoints (OAuth, SAML)
  • Extracts user ID from JWT context
  • Checks per-minute and per-hour limits
  • Returns HTTP 429 with retry-after on limit exceeded
  • Adds rate limit headers to all responses
  • Fails open on errors (allows request, logs warning)

IP Rate Limit Middleware (api/ip_and_auth_rate_limit_middleware.go):

  • Registered: r.Use(api.IPRateLimitMiddleware(apiServer))
  • Protects public endpoints from IP-based abuse
  • 10 requests/minute per IP address
  • Uses Redis sorted sets for distributed tracking

Auth Flow Rate Limit Middleware (api/ip_and_auth_rate_limit_middleware.go):

  • Registered: r.Use(api.AuthFlowRateLimitMiddleware(apiServer))
  • Applies to OAuth and SAML endpoints
  • Multi-scope tracking (session, IP, user identifier)
  • Prevents credential stuffing and auth flow abuse

Technology Stack

Rate Limiting:

  • Algorithm: Sliding window log (chosen as an alternative to a token bucket)
  • Storage: Redis sorted sets (ZSET)
  • Key Pattern: ratelimit:{scope}:{identifier}:{window}
  • TTL: Window duration + 60 seconds buffer

Database:

  • Storage: PostgreSQL
  • Tables: webhook_quotas, user_api_quotas, addon_invocation_quotas
  • Access: Via store interface pattern

Graceful Degradation:

  • Redis unavailable -> Rate limiting disabled, logs warning
  • Database unavailable -> Falls back to default quotas
  • Maintains service availability over strict enforcement
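
The database fallback can be sketched as a lookup that never raises. The function `resolve_requests_per_minute` and its `load_quota` callable are hypothetical; the default of 1000 comes from the Tier 3 configuration.

```python
DEFAULT_MAX_REQUESTS_PER_MINUTE = 1000  # code-level default for Tier 3


def resolve_requests_per_minute(user_id, load_quota, log=print):
    """Return the user's per-minute limit, falling back to the default.

    load_quota(user_id) is assumed to return a dict for users with a custom
    quota, None for users without one, and to raise if the database is down.
    """
    try:
        quota = load_quota(user_id)
    except Exception as exc:
        # Fail open: keep serving with the default rather than rejecting.
        log(f"quota lookup failed for {user_id}: {exc}; using default")
        return DEFAULT_MAX_REQUESTS_PER_MINUTE
    if quota is None:
        return DEFAULT_MAX_REQUESTS_PER_MINUTE
    return quota["max_requests_per_minute"]
```

One consequence of this choice is that a user who has been throttled below the default temporarily regains the default limit while the database is unreachable.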

Performance Considerations

Redis Operations:

  • Rate limit checks: 2-3 Redis commands (ZREMRANGEBYSCORE, ZCOUNT, ZADD)
  • Pipelined for atomicity and performance
  • Expected latency: <5ms per check

Database Queries:

  • Quota lookups cached in-memory (TTL: 60 seconds)
  • No database query on every request
  • Quota changes take effect within 60 seconds

Sliding Window Cleanup:

  • Automatic via ZREMRANGEBYSCORE before each check
  • TTL ensures old keys are eventually cleaned up
  • No separate cleanup job needed
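
The prune, count, add sequence can be illustrated without Redis by using a sorted list in place of one sorted set. This is a sketch of the general sliding window log technique; the class `SlidingWindowLog` is hypothetical, and the comments name the Redis command each step stands in for.

```python
import bisect


class SlidingWindowLog:
    """Sorted list of timestamps standing in for one Redis sorted set."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = []

    def allow(self, now):
        cutoff = now - self.window
        # ZREMRANGEBYSCORE: drop entries at or before the cutoff. Doing this
        # on every check is why no separate cleanup job is needed.
        expired = bisect.bisect_right(self.timestamps, cutoff)
        del self.timestamps[:expired]
        # ZCOUNT: how many requests remain inside the window?
        if len(self.timestamps) >= self.limit:
            return False
        # ZADD: record this request with its timestamp as the score.
        bisect.insort(self.timestamps, now)
        return True
```

For example, with `limit=3` and a 60 second window, requests at t = 0, 10, and 20 pass, a request at t = 30 is rejected, and a request at t = 61 passes because the entry from t = 0 has been pruned.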

Security Considerations

Distributed Attacks:

  • Multi-scope limiting prevents single-vector attacks
  • User identifier tracking stops credential stuffing
  • IP limiting prevents single-IP DoS

Quota Bypass:

  • JWT validation ensures user identity
  • Redis atomic operations prevent race conditions
  • Database foreign key constraints prevent orphaned quotas

Information Disclosure:

  • Rate limit headers reveal system limits (acceptable for public API)
  • Error messages don't expose internal implementation details
  • Quota configuration not exposed via user-facing APIs

Changelog

2.0.0 (2026-01-24)

  • Added Tier 5: Addon Invocations with active and hourly limits
  • Updated default values to match current implementation
  • Added comprehensive admin API documentation for all quota types
  • Added quota caching section
  • Marked implementation status as Fully Implemented
  • Migrated to wiki format

1.0.0 (2025-11-21)

  • Initial specification
  • Four-tier rate limiting strategy
  • Multi-scope auth flow protection
  • Database-backed configurable quotas
  • Comprehensive client integration guide
