OLAP analytics layer for FormBD - columnar aggregations and time-series analysis.
FormBD-Analytics provides high-performance analytical queries over FormBD documents. While FormBD prioritizes auditability and reversibility, analytics workloads require different optimization strategies. This separation allows FormBD to maintain its principles while enabling fast aggregations, rollups, and time-series analysis.
┌─────────────────────────────────────────────────────────────┐
│                  FormBD (Source of Truth)                   │
│                Documents with PROMPT scores                 │
└─────────────────────────────────┬───────────────────────────┘
                                  │ HTTP API
                                  ▼
┌─────────────────────────────────────────────────────────────┐
│                      FormBD-Analytics                       │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │   Ingester   │   │   Columnar   │   │    Query     │     │
│  │              │──▶│    Store     │──▶│    Engine    │     │
│  │  (ETL from   │   │  (Arrow/     │   │ (DataFrames, │     │
│  │   FormBD)    │   │   Parquet)   │   │  Aggregates) │     │
│  └──────────────┘   └──────────────┘   └──────────────┘     │
└─────────────────────────────────────────────────────────────┘
                                  │
                                  ▼ HTTP API
┌─────────────────────────────────────────────────────────────┐
│       Consumers (formbd-studio, dashboards, reports)        │
└─────────────────────────────────────────────────────────────┘

- Columnar Storage: Arrow/Parquet for analytical query patterns
- PROMPT Score Analytics: Aggregations over epistemological dimensions
- Time-Series Analysis: Document creation/modification trends
- Rollups: Pre-computed aggregations for common queries
- Provenance Tracking: Analyze who contributed what, when
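
As a rough illustration of the time-series and rollup features listed above, the sketch below buckets document creation timestamps by day and then rolls the daily counts up into a cumulative series. It uses only Julia's `Dates` standard library. The sample timestamps and the function names `daily_counts` and `cumulative_series` are hypothetical and are not part of the FormBD-Analytics API.

```julia
using Dates

# Hypothetical creation timestamps, standing in for document metadata.
created = [
    DateTime(2024, 1, 1, 9, 30),
    DateTime(2024, 1, 1, 17, 5),
    DateTime(2024, 1, 2, 8, 0),
    DateTime(2024, 1, 4, 12, 45),
]

# Count documents per calendar day (a "day" interval).
function daily_counts(timestamps::Vector{DateTime})
    counts = Dict{Date,Int}()
    for ts in timestamps
        day = Date(ts)                          # truncate to the day bucket
        counts[day] = get(counts, day, 0) + 1
    end
    return counts
end

# Roll daily counts up into a cumulative series over the full date range.
# Days without documents contribute zero. Assumes at least one bucket.
function cumulative_series(counts::Dict{Date,Int})
    days = minimum(keys(counts)):Day(1):maximum(keys(counts))
    series = Tuple{Date,Int}[]
    running = 0
    for day in days
        running += get(counts, day, 0)
        push!(series, (day, running))
    end
    return series
end

counts = daily_counts(created)
series = cumulative_series(counts)
```

Computing the daily counts once and deriving coarser views from them is the basic idea behind a pre-computed rollup: the expensive scan over documents happens a single time, and later queries read the small summary.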
FormBD-Analytics uses Julia because:
- Native columnar operations: DataFrames.jl is optimized for analytical workloads
- Arrow integration: Arrow.jl provides zero-copy interop
- Performance: JIT-compiled code approaches C performance
- Scientific computing: Strong ecosystem for statistical analysis
- Hyperpolymath policy: Julia is the approved language for data/batch processing
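
To make the columnar point above concrete, here is a minimal sketch contrasting a row-oriented and a column-oriented layout using plain Julia vectors. The field names and score values are made up for illustration. It does not use DataFrames.jl or Arrow.jl, so it only shows the idea, not how FormBD-Analytics stores data.

```julia
using Statistics

# Row-oriented view: one NamedTuple per document (hypothetical fields).
rows = [
    (collection = "evidence", provenance = 80.0),
    (collection = "claims",   provenance = 55.0),
    (collection = "evidence", provenance = 90.0),
]

# Column-oriented view: one contiguous vector per field.
columns = (
    collection = [r.collection for r in rows],
    provenance = [r.provenance for r in rows],
)

# An aggregate reads only the columns it needs.
mask = columns.collection .== "evidence"
evidence_mean = mean(columns.provenance[mask])
```

Because each field lives in its own vector, a filter plus an average scans two arrays and never touches unrelated fields, which is the access pattern analytical queries favor.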
GET /analytics/health
Health check
GET /analytics/stats
Overall statistics about indexed data
POST /analytics/query
Execute analytical query
Body: { "query": "...", "params": {...} }
GET /analytics/prompt-scores?collection=X&groupBy=Y
PROMPT score aggregations
GET /analytics/time-series?collection=X&field=Y&interval=day
Time-series analysis
GET /analytics/contributors?collection=X
Contributor/provenance analysis

[formbd]
api_url = "http://localhost:8080"
collections = ["evidence", "claims"]
[server]
host = "127.0.0.1"
port = 8082
[storage]
# Path for Parquet files
data_dir = "./data"
# Retention in days (0 = forever)
retention_days = 0
[sync]
# Auto-sync interval in minutes (0 = manual only)
auto_sync_minutes = 60

FormBD documents may include PROMPT epistemological scores:
- Provenance - Source traceability
- Replicability - Can findings be reproduced?
- Objective - Methodological rigor
- Methodology - Analytical approach quality
- Publication - Peer review status
- Transparency - Data/method openness
FormBD-Analytics provides aggregations:
# Average PROMPT scores for a collection, grouped by source
prompt_stats(collection="evidence", groupby=:source)
# Score distribution histograms
prompt_distribution(collection="evidence", dimension=:provenance)
# Correlation between dimensions
prompt_correlations(collection="evidence")
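
The kind of computation behind such aggregations can be sketched with the `Statistics` standard library. This is an illustration under assumptions, not the package's implementation: the helper names `mean_by` and `dimension_correlation`, the sample documents, and the 0-100 score scale are all hypothetical.

```julia
using Statistics

# Hypothetical documents: a source label plus two PROMPT dimensions.
docs = [
    (source = "journal", provenance = 90.0, transparency = 80.0),
    (source = "journal", provenance = 70.0, transparency = 60.0),
    (source = "blog",    provenance = 40.0, transparency = 30.0),
    (source = "blog",    provenance = 60.0, transparency = 50.0),
]

# Mean of one dimension per group; `dimension` and `by` are field names.
function mean_by(docs, dimension::Symbol, by::Symbol)
    groups = Dict{String,Vector{Float64}}()
    for d in docs
        scores = get!(groups, getproperty(d, by), Float64[])
        push!(scores, getproperty(d, dimension))
    end
    return Dict(k => mean(v) for (k, v) in groups)
end

# Pearson correlation between two dimensions across all documents.
function dimension_correlation(docs, a::Symbol, b::Symbol)
    xs = [getproperty(d, a) for d in docs]
    ys = [getproperty(d, b) for d in docs]
    return cor(xs, ys)
end

by_source = mean_by(docs, :provenance, :source)
rho = dimension_correlation(docs, :provenance, :transparency)
```

In this sample data, transparency is always provenance minus 10, so the two dimensions are perfectly correlated. Real collections would show weaker relationships.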