🏛️ FOIA Stream

The Open-Source Command Center for Public Transparency and Records Management.

Overview

FOIA Stream is a professional-grade, full-stack monorepo designed to automate the lifecycle of Freedom of Information Act (FOIA) requests. Public records access is often hampered by bureaucratic friction, fragmented tracking, and complex redaction requirements. This platform centralizes the process into a high-performance "Command Center."

Built with a focus on speed and security, FOIA Stream uses the Bun runtime and the Hono web framework for low-latency API responses, and provides a modern, accessible interface through Astro and React. It turns the arduous task of records management into a streamlined, auditable workflow.

Who is this for?

  • 📰 Investigative Journalists managing massive document dumps and multiple agency deadlines.
  • ⚖️ Legal Advocates requiring strict audit trails and statutory compliance tracking.
  • 🏛️ Government Agencies seeking a modern interface for managing public record disclosures.
  • 🛡️ Transparency Activists wanting to visualize agency responsiveness and compliance scores.

Features

📋 Request Management

  • ✨ Smart Templates: Pre-configured legal citations for local, state, and federal requests.
  • ⚡ Statutory Tracking: Automated countdowns based on jurisdiction-specific legal deadlines.
  • 🎯 Multi-Agency Routing: Send the same request to multiple agencies with one click.
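
To illustrate statutory tracking: the federal FOIA gives agencies 20 working days to respond (5 U.S.C. § 552(a)(6)(A)(i)). The sketch below counts business days from a receipt date. The helper names are hypothetical and not taken from the codebase, public holidays are ignored, and state deadlines differ, so treat it as a starting point only.

```typescript
// Hypothetical helpers for illustration; not part of the FOIA Stream codebase.

// Adds `days` business days (Monday to Friday) to `start`.
// Assumption: public holidays are NOT skipped in this sketch.
function addBusinessDays(start: Date, days: number): Date {
  const result = new Date(start.getTime());
  let remaining = days;
  while (remaining > 0) {
    result.setUTCDate(result.getUTCDate() + 1);
    const weekday = result.getUTCDay(); // 0 = Sunday, 6 = Saturday
    if (weekday !== 0 && weekday !== 6) remaining--;
  }
  return result;
}

// Whole calendar days left until the deadline (negative when overdue).
function daysRemaining(deadline: Date, now: Date): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.ceil((deadline.getTime() - now.getTime()) / msPerDay);
}

const received = new Date("2024-01-02T00:00:00Z"); // a Tuesday
const deadline = addBusinessDays(received, 20);
console.log(deadline.toISOString().slice(0, 10)); // 2024-01-30
```

A production version would also need a per-jurisdiction holiday calendar and the jurisdiction's own day count.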

🏢 Agency Intelligence

  • 🔍 Verified Directory: Database of agency contacts and Records Access Officers (RAO).
  • 📊 Compliance Metrics: Real-time scoring of agency responsiveness and fulfillment rates.
  • 🏷️ Jurisdiction Scoping: Filters for Police, Education, Finance, and Executive branches.
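
The README does not define how compliance metrics are calculated. As one possible reading, the sketch below derives a fulfillment rate and an average response time from a list of request outcomes. The `RequestOutcome` shape and the formulas are assumptions made for illustration.

```typescript
// Hypothetical record shape; the real schema may differ.
interface RequestOutcome {
  status: "fulfilled" | "denied" | "pending";
  responseDays: number | null; // null while no response has arrived
}

interface AgencyMetrics {
  fulfillmentRate: number; // fulfilled / closed requests, in the range 0..1
  averageResponseDays: number | null; // null when nothing has been answered yet
}

// Assumed formulas: rate over closed requests, mean over answered requests.
function computeMetrics(outcomes: RequestOutcome[]): AgencyMetrics {
  const closed = outcomes.filter((o) => o.status !== "pending");
  const fulfilled = closed.filter((o) => o.status === "fulfilled");
  const responseTimes = outcomes
    .map((o) => o.responseDays)
    .filter((d): d is number => d !== null);
  const total = responseTimes.reduce((sum, d) => sum + d, 0);
  return {
    fulfillmentRate: closed.length === 0 ? 0 : fulfilled.length / closed.length,
    averageResponseDays:
      responseTimes.length === 0 ? null : total / responseTimes.length,
  };
}
```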

🛡️ Document & Security

  • 🖋️ True Redaction: Specialized engine that physically removes PII rather than masking it.
  • 🤖 AI-Assisted Detection: Automated identification of SSNs, names, and addresses.
  • 🦠 Malware Protection: Integrated VirusTotal scanning for all incoming agency responses.
  • 🔒 Encrypted Storage: AES-256 encryption for private requests and sensitive documents.
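
The difference between removing and masking can be shown on plain text. Masking draws a box over content that stays in the file, whereas removal overwrites the underlying characters so they cannot be recovered. The sketch below is a plain-text analogy only: the actual engine works on PDF content, and `Span` and `redactText` are hypothetical names.

```typescript
// Half-open character range [start, end) to redact.
interface Span {
  start: number;
  end: number;
}

// Replaces the characters in each span with block glyphs.
// Overlapping or out-of-range spans are tolerated.
function redactText(text: string, spans: Span[]): string {
  const sorted = [...spans].sort((a, b) => a.start - b.start);
  let out = "";
  let cursor = 0;
  for (const span of sorted) {
    const start = Math.max(span.start, cursor);
    const end = Math.min(span.end, text.length);
    if (end <= start) continue; // already covered, empty, or out of range
    out += text.slice(cursor, start) + "█".repeat(end - start);
    cursor = end;
  }
  return out + text.slice(cursor);
}

const redacted = redactText("SSN 123-45-6789 on file", [{ start: 4, end: 15 }]);
console.log(redacted); // SSN ███████████ on file
```

The original digits are absent from the output string, which is the property that separates removal from a visual overlay.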

Architecture

FOIA Stream is organized as a Turborepo monorepo, ensuring strict type safety between the frontend, backend, and shared utilities.

System Components

graph TD
    subgraph "Client Layer (Astro + React)"
        UI[User Dashboard]
        RedactUI[Interactive Redactor]
        Form[Request Wizard]
    end

    subgraph "API Layer (Hono + Bun)"
        Gateway[API Gateway]
        Auth[Auth & MFA Service]
        RedactSvc[Redaction Engine]
        AgencySvc[Agency Search]
    end

    subgraph "Infrastructure"
        DB[(PostgreSQL + Drizzle)]
        Cache[LRU Memory Cache]
        Storage[S3/Local Encrypted Disk]
        AV{{VirusTotal API}}
    end

    UI --> Gateway
    RedactUI --> RedactSvc
    Gateway --> Auth
    Gateway --> RedactSvc
    Gateway --> AgencySvc
    RedactSvc --> AV
    RedactSvc --> Storage
    AgencySvc --> Cache
    Auth --> DB
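
The diagram routes agency search through an LRU memory cache. A minimal sketch of that eviction policy follows, relying on the insertion order of `Map`. The class name and API are assumptions, and the project's real cache may be implemented differently.

```typescript
// Minimal LRU cache sketch. Assumes capacity >= 1.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}
```

Reading a key refreshes it, so a frequently searched agency stays cached while rarely used entries are evicted first.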

Data Flow: Document Ingestion & Redaction

flowchart LR
    PDF[/"Raw Agency PDF"/] --> Scan[VirusTotal Scan]
    Scan --> |"Clean"| Parse[Text/Object Extraction]
    Parse --> |"Tokens"| Pattern[Pattern Matching Trie/Rabin-Karp]
    Pattern --> |"Suggestions"| UI[User Manual Review]
    UI --> |"Approved"| Flatten[PDF Flattening & Sanitization]
    Flatten --> |"Redacted PDF"| Store[(Secure Storage)]
    
    Scan -.-> |"Malicious"| Quarantine[Quarantine & Notify]
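
The pattern-matching stage above names Rabin-Karp. For readers unfamiliar with it, here is a minimal single-pattern sketch using a rolling hash. It is a textbook version written for this README, not the code in packages/shared, and the base and modulus are arbitrary choices.

```typescript
// Rabin-Karp substring search: returns every index where `pattern` occurs in `text`.
function rabinKarp(text: string, pattern: string): number[] {
  const m = pattern.length;
  const n = text.length;
  if (m === 0 || m > n) return [];

  const BASE = 256;
  const MOD = 1000000007; // keeps every intermediate product below 2^53

  // BASE^(m-1) % MOD, used to remove the leading character from the window.
  let highPow = 1;
  for (let i = 1; i < m; i++) highPow = (highPow * BASE) % MOD;

  let patternHash = 0;
  let windowHash = 0;
  for (let i = 0; i < m; i++) {
    patternHash = (patternHash * BASE + pattern.charCodeAt(i)) % MOD;
    windowHash = (windowHash * BASE + text.charCodeAt(i)) % MOD;
  }

  const matches: number[] = [];
  for (let i = 0; i + m <= n; i++) {
    // Compare characters only when the hashes agree, to rule out collisions.
    if (windowHash === patternHash && text.startsWith(pattern, i)) {
      matches.push(i);
    }
    if (i + m < n) {
      // Roll the window: drop text[i], append text[i + m].
      const leading = (text.charCodeAt(i) * highPow) % MOD;
      windowHash =
        ((windowHash - leading + MOD) * BASE + text.charCodeAt(i + m)) % MOD;
    }
  }
  return matches;
}

console.log(rabinKarp("abracadabra", "abra")); // [ 0, 7 ]
```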

Technology Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Runtime | Bun | High-performance JS runtime & package manager |
| Frontend | Astro + React | Content-rich pages with interactive React islands |
| Backend | Hono | Lightweight, TypeScript-first web framework |
| Database | PostgreSQL | Relational data with Drizzle ORM |
| Validation | Zod | Schema validation across the entire stack |
| Security | Argon2 + Jose | Password hashing and JWT session management |

Quick Start

Prerequisites

  • Bun: >= 1.1.0
  • Docker & Docker Compose: For database and proxy services
  • VirusTotal API Key: (Optional) For automated malware scanning

Installation

# Clone the repository
git clone https://github.com/FOIA-Stream/foia-stream.git
cd foia-stream

# Install dependencies
bun install

# Configure environment (Edit .env files with your credentials)
cp apps/api/.env.example apps/api/.env
cp apps/astro/.env.example apps/astro/.env

# Start development environment (Postgres + API + Frontend)
docker-compose up -d
bun run dev

Expected Output

The API will be available at http://localhost:3000 and the Frontend at http://localhost:4321. You should see a "Ready" message in the console indicating the Drizzle migrations are complete.

Usage & Examples

Creating a New FOIA Request

  1. Select Agency: Use the search bar to find the target jurisdiction.
  2. Apply Template: Choose "General Records Request" or "Police Incident Report."
  3. Submit: The system generates the PDF/Email and begins the statutory clock.

API Usage Example (Programmatic Request)

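// Create a new FOIA request. Replace <YOUR_TOKEN> with a valid session token.
const response = await fetch('http://localhost:3000/api/v1/requests', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer <YOUR_TOKEN>'
  },
  body: JSON.stringify({
    agencyId: "uuid-gov-123",
    subject: "Budget Allocations 2023",
    templateId: "standard-request",
    isPublic: true
  })
});

// Surface HTTP errors instead of trying to parse an error page as JSON.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
console.log(`Request tracked with ID: ${data.id}`);

Advanced: Custom Redaction Patterns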

You can extend the redaction engine by adding custom regex patterns to packages/shared/src/utils/redacted.ts. This allows you to target agency-specific ID formats or proprietary data structures.
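
As a rough idea of what a custom pattern might look like, the sketch below pairs each regex with a label and collects match offsets. The `RedactionPattern` shape, the `findSuggestions` helper, and the `CASE-` ID format are all invented for illustration; check redacted.ts for the structure the engine actually expects.

```typescript
// Hypothetical shape; the actual exports of the shared package may differ.
interface RedactionPattern {
  label: string;
  regex: RegExp;
}

interface Suggestion {
  label: string;
  start: number;
  end: number;
  text: string;
}

const patterns: RedactionPattern[] = [
  { label: "ssn", regex: /\b\d{3}-\d{2}-\d{4}\b/ },
  // Invented agency-specific format: "CASE-" followed by six digits.
  { label: "case-id", regex: /\bCASE-\d{6}\b/ },
];

function findSuggestions(text: string, active: RedactionPattern[]): Suggestion[] {
  const found: Suggestion[] = [];
  for (const pattern of active) {
    // Copy the regex with the global flag so exec() can walk every match.
    const flags = pattern.regex.flags.indexOf("g") === -1
      ? pattern.regex.flags + "g"
      : pattern.regex.flags;
    const regex = new RegExp(pattern.regex.source, flags);
    let match: RegExpExecArray | null;
    while ((match = regex.exec(text)) !== null) {
      const value = match[0];
      if (value.length === 0) {
        regex.lastIndex++; // avoid an infinite loop on empty matches
        continue;
      }
      found.push({
        label: pattern.label,
        start: match.index,
        end: match.index + value.length,
        text: value,
      });
    }
  }
  return found.sort((a, b) => a.start - b.start);
}
```

Calling `findSuggestions("Ref CASE-004217, SSN 123-45-6789.", patterns)` yields two suggestions ordered by position, which a reviewer could then approve or reject.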

Configuration

Environment Variables (API)

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| DATABASE_URL | Yes | - | PostgreSQL connection string |
| JWT_SECRET | Yes | - | Secret for signing session tokens |
| VIRUSTOTAL_API_KEY | No | - | API key for malware scanning |
| PORT | No | 3000 | Port for the Hono server |
| NODE_ENV | No | development | Runtime environment |
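
One way to enforce the table above at startup is to validate the environment once and fail fast on missing values. The sketch below is an assumption about how that could look: `loadConfig` and `ApiConfig` are hypothetical names, and the project may instead use Zod for this.

```typescript
// Hypothetical config loader mirroring the variables in the table above.
interface ApiConfig {
  databaseUrl: string;
  jwtSecret: string;
  virusTotalApiKey: string | undefined;
  port: number;
  nodeEnv: string;
}

type Env = Record<string, string | undefined>;

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): ApiConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got: ${env["PORT"]}`);
  }
  return {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    jwtSecret: requireVar(env, "JWT_SECRET"),
    virusTotalApiKey: env["VIRUSTOTAL_API_KEY"], // optional
    port,
    nodeEnv: env["NODE_ENV"] ?? "development",
  };
}
```

In the API entry point this would typically be called as `loadConfig(process.env)`.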

API Reference

Agencies API

GET /api/v1/agencies

  • Params: q (search string), limit, offset
  • Returns: Array of agency objects with RAO contact info.
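
A small helper can assemble the query string for this endpoint. The parameter names come from the list above. The `AgencyQuery` type and `buildAgenciesUrl` function are hypothetical, and the base URL assumes the local development server.

```typescript
// Hypothetical client-side helper for GET /api/v1/agencies.
interface AgencyQuery {
  q?: string;
  limit?: number;
  offset?: number;
}

function buildAgenciesUrl(baseUrl: string, query: AgencyQuery): string {
  const parts: string[] = [];
  if (query.q !== undefined) parts.push(`q=${encodeURIComponent(query.q)}`);
  if (query.limit !== undefined) parts.push(`limit=${query.limit}`);
  if (query.offset !== undefined) parts.push(`offset=${query.offset}`);
  const path = `${baseUrl}/api/v1/agencies`;
  return parts.length > 0 ? `${path}?${parts.join("&")}` : path;
}

console.log(
  buildAgenciesUrl("http://localhost:3000", { q: "police dept", limit: 20, offset: 40 })
);
// http://localhost:3000/api/v1/agencies?q=police%20dept&limit=20&offset=40
```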

Redaction API

POST /api/v1/redaction/analyze

  • Body: File (Multipart PDF)
  • Returns: BoundingBox[] of suggested PII locations.

Requests API

PATCH /api/v1/requests/:id/status

  • Body: { "status": "fulfilled" | "denied" | "appealed" }
  • Returns: Updated request metadata.
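
Because the endpoint accepts only three status values, a client can check the value before sending it. The sketch below uses a type guard for that. The helper names are hypothetical, and the server may accept additional statuses not listed here.

```typescript
// The three statuses documented for PATCH /api/v1/requests/:id/status.
const REQUEST_STATUSES = ["fulfilled", "denied", "appealed"] as const;
type RequestStatus = (typeof REQUEST_STATUSES)[number];

// Narrows an unknown value to one of the documented statuses.
function isRequestStatus(value: unknown): value is RequestStatus {
  return (
    typeof value === "string" &&
    (REQUEST_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

// Builds the JSON body, rejecting anything outside the documented set.
function buildStatusBody(status: unknown): string {
  if (!isRequestStatus(status)) {
    throw new Error(`Unsupported status: ${String(status)}`);
  }
  return JSON.stringify({ status });
}

console.log(buildStatusBody("denied")); // {"status":"denied"}
```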

Development

Project Structure

  • apps/api: Hono backend, database schemas, and core services.
  • apps/astro: Frontend pages and React UI components.
  • packages/shared: Shared Zod schemas and the core data structures and algorithms (Trie, Rabin-Karp).
  • compliance/: Documentation regarding PII handling and security controls.

Running Tests

# Run unit tests for shared package
bun run test --filter=shared

# Run API integration tests
bun run test --filter=api

# Run E2E Cypress tests
bun run test:e2e --filter=astro

Troubleshooting

| Error | Cause | Solution |
| --- | --- | --- |
| ECONNREFUSED 127.0.0.1:5432 | Postgres container not running | Run docker-compose up -d postgres |
| Unexpected token in JSON | API crashed or returned 500 | Check docker logs foia-api for stack traces |
| Redaction engine timeout | PDF is too large or complex | Increase MAX_FILE_SIZE in apps/api/src/config/index.ts |

Contributing

We welcome contributions from journalists and developers alike.

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature/amazing-feature).
  3. Commit your changes using conventional commits (feat: add auto-redaction).
  4. Push to the branch and open a pull request.

Please ensure you run bun run lint (using Biome) before submitting code.

Roadmap & Known Issues

  • 📅 Integration with Google Calendar for deadline alerts.
  • ✉️ Direct SMTP integration for sending emails from the dashboard.
  • 📱 Mobile application for capturing field documents.
  • ⚠️ Issue: OCR performance is currently limited for handwritten documents.
  • ⚠️ Issue: Multi-page PDF redaction (>100 pages) may require significant memory in the Bun runtime.

License & Credits

  • License: MIT - See LICENSE for details.
  • Maintainer: FOIA Stream Core Team
  • Inspirations: MuckRock, FOIA Machine, and the investigative journalism community.