Brain Telemetry Infrastructure
| Status | Implemented |
|---|---|
| Feature gate | fractal-brain (morphee-core) |
| Location | morphee-core brain/telemetry.rs + bench-cli |
| Triggered by | brain-critical-analysis.md |
Problem
The fractal brain makes dozens of decisions per problem — substrat matching, tree recall, confidence gating, anti-recall, verification, reward — but none of it was queryable. Brain metrics were scattered across tracing::info!() logs that vanished after the process ended. We couldn't answer basic questions like "How many problems did the brain recognize correctly?" or "Is the brain getting better over time?"
Architecture
Three layers: SQLite (persistent decision log), Prometheus (live gauges), CLI + Dashboard (analysis and visualization).
┌─────────────────────────────────────────────────────────────────┐
│ NeuronRecallStrategy.process() │
│ ├── recognition timing → metadata["recognition_ms"] │
│ ├── recall timing → metadata["recall_ms"] │
│ └── total timing → metadata["total_brain_ms"] │
└────────────────────────┬────────────────────────────────────────┘
│ HashMap<String, serde_json::Value>
▼
┌─────────────────────────────────────────────────────────────────┐
│ BrainTelemetry::from_metadata() │
│ (crates/morphee-core/src/brain/telemetry.rs) │
│ Extracts & classifies: execution_path, recall_type, etc. │
└────────────────────────┬────────────────────────────────────────┘
│
┌──────────┼──────────┐
▼ ▼ ▼
┌──────────┐ ┌────────┐ ┌──────────┐
│ SQLite │ │Promethe│ │ CLI + │
│ brain.db │ │us │ │Dashboard │
└──────────┘ └────────┘ └──────────┘
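The middle layer of the diagram can be sketched as follows. This is a hypothetical Python mirror of the extraction step (the real implementation is Rust, in crates/morphee-core/src/brain/telemetry.rs); field names follow the brain_events schema below, but the defaults and exact typing are assumptions:

```python
# Illustrative sketch only: the actual BrainTelemetry::from_metadata is Rust.
# Field names match the brain_events schema; the defaults are assumptions.
from dataclasses import dataclass

@dataclass
class BrainTelemetry:
    execution_path: str   # cerebellum / neocortex / guided_llm / raw_llm
    recall_type: str      # exact / variation / method / guided_llm / novel
    recognition_ms: int
    recall_ms: int
    total_brain_ms: int

def from_metadata(metadata: dict) -> BrainTelemetry:
    """Pull typed telemetry fields out of the strategy's metadata map."""
    return BrainTelemetry(
        execution_path=metadata.get("execution_path", "raw_llm"),
        recall_type=metadata.get("recall_type", "novel"),
        recognition_ms=int(metadata.get("recognition_ms", 0)),
        recall_ms=int(metadata.get("recall_ms", 0)),
        total_brain_ms=int(metadata.get("total_brain_ms", 0)),
    )
```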
SQLite Schema (brain.db)
A separate database file from bench.db, with three tables:

brain_events — One row per problem per run
| Column | Type | Description |
|---|---|---|
| run_id | TEXT | Bench run identifier |
| problem_id | TEXT | Problem identifier |
| problem_index | INTEGER | Position in run |
| correct | INTEGER | 1 if correct, 0 otherwise |
| substrat_id | TEXT | Matched substrat (nullable) |
| substrat_membership | REAL | Membership strength [0, 1] |
| recognition_result | TEXT | recognized / familiar / novel |
| tree_id | TEXT | Matched neuron tree (nullable) |
| recall_type | TEXT | exact / variation / method / guided_llm / novel |
| recall_similarity | REAL | Cosine similarity of match |
| recall_confidence | REAL | Confidence score |
| substitution_count | INTEGER | Parameter substitutions |
| execution_path | TEXT | cerebellum / neocortex / guided_llm / raw_llm |
| llm_calls | INTEGER | LLM calls for this problem |
| working_memory_size | INTEGER | Candidates in working memory |
| candidate_count | INTEGER | Total candidates considered |
| predicted_confidence | REAL | Pre-execution confidence |
| surprise | REAL | Prediction error |
| recognition_ms | INTEGER | Recognition phase timing |
| recall_ms | INTEGER | Recall phase timing |
| total_brain_ms | INTEGER | Total brain overhead |
| created_at | TEXT | ISO 8601 timestamp |
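Because brain_events is plain SQLite, it can be queried with any client. A minimal sketch using Python's stdlib sqlite3 against an in-memory database with a subset of the columns above (against a real run, open data/brain.db instead):

```python
# Sketch: per-path accuracy from brain_events. Uses an in-memory DB and a
# subset of the real schema; open data/brain.db to query an actual run.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE brain_events (
        run_id TEXT, problem_id TEXT, correct INTEGER,
        execution_path TEXT, llm_calls INTEGER
    )
""")
conn.executemany("INSERT INTO brain_events VALUES (?, ?, ?, ?, ?)", [
    ("r1", "p1", 1, "cerebellum", 0),
    ("r1", "p2", 0, "cerebellum", 0),
    ("r1", "p3", 1, "raw_llm", 2),
])

# Accuracy and LLM usage per execution path.
per_path = conn.execute("""
    SELECT execution_path, AVG(correct), SUM(llm_calls)
    FROM brain_events WHERE run_id = 'r1'
    GROUP BY execution_path ORDER BY execution_path
""").fetchall()
print(per_path)  # [('cerebellum', 0.5, 0), ('raw_llm', 1.0, 2)]
```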
brain_snapshots — Periodic topology snapshots
Captured every N problems, as set by --snapshot-every (default 10).
| Column | Type | Description |
|---|---|---|
| run_id | TEXT | Bench run identifier |
| problem_index | INTEGER | Snapshot position |
| total_trees | INTEGER | Neuron tree count |
| total_substrats | INTEGER | Substrat count |
| total_method_neurons | INTEGER | Method neuron count |
| avg_confidence | REAL | Mean confidence |
| recognition_rate | REAL | % of problems not novel |
| recall_accuracy | REAL | Accuracy of recalled answers |
| llm_calls_saved | INTEGER | Problems with 0 LLM calls |
| total_llm_calls | INTEGER | Cumulative LLM calls |
| avg_brain_overhead_ms | REAL | Mean brain overhead |
| current_accuracy | REAL | Running accuracy at snapshot |
| created_at | TEXT | ISO 8601 timestamp |
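Several snapshot columns are aggregates over the events seen so far. A sketch of how they can be derived (field names as in brain_events; the actual aggregation in bench-cli may differ):

```python
# Illustrative derivation of snapshot aggregates from per-problem events.
# The real bench-cli computation may differ in detail.
def snapshot_stats(events: list[dict]) -> dict:
    n = len(events)
    return {
        # fraction of problems the brain did not classify as novel
        "recognition_rate": sum(e["recognition_result"] != "novel" for e in events) / n,
        # problems answered without any LLM call
        "llm_calls_saved": sum(e["llm_calls"] == 0 for e in events),
        "total_llm_calls": sum(e["llm_calls"] for e in events),
        # running accuracy at this snapshot
        "current_accuracy": sum(e["correct"] for e in events) / n,
    }

events = [
    {"recognition_result": "recognized", "llm_calls": 0, "correct": 1},
    {"recognition_result": "familiar",   "llm_calls": 1, "correct": 1},
    {"recognition_result": "novel",      "llm_calls": 2, "correct": 0},
]
print(snapshot_stats(events))
```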
dream_events — One row per dream consolidation
| Column | Type | Description |
|---|---|---|
| run_id | TEXT | Bench run identifier |
| merges, prunes, deleted_hopeless, rehabilitated | INTEGER | Dream consolidation counts |
| events_pruned, branches_pruned | INTEGER | Pruning stats |
| substrats_formed, substrats_assigned | INTEGER | Substrat changes |
| method_neurons_born | INTEGER | New method neurons |
| code_tested, code_boosted, code_fragile, code_removed | INTEGER | Code verification stats |
| created_at | TEXT | ISO 8601 timestamp |
Prometheus Metrics (11 gauges/counters/histograms)
| Metric | Type | Labels | Purpose |
|---|---|---|---|
| brain_substrat_count | Gauge | — | Current substrat count |
| brain_recognition_rate | Gauge | — | % of problems recognized |
| brain_execution_path | Counter | path | Distribution across paths |
| brain_llm_calls_saved | Counter | — | Cumulative 0-LLM recalls |
| brain_recall_accuracy | Gauge | type | Per recall-type accuracy |
| brain_overhead_ms | Histogram | — | Brain overhead distribution |
| brain_substrat_membership | Histogram | — | Membership strength distribution |
| brain_surprise | Histogram | — | Prediction error distribution |
| brain_confidence_mean | Gauge | — | Running mean confidence |
| brain_trees_total | Gauge | — | Live neuron tree count |
| brain_method_neurons | Gauge | — | Live method neuron count |
CLI Commands
5 subcommands under bench brain:
# Full brain report for latest (or specific) run
bench brain report [--run ID] [--brain-db path]
# Side-by-side comparison of two runs
bench brain compare --runs A,B [--brain-db path]
# Learning curve from snapshots
bench brain curve [--run ID] [--brain-db path]
# Full decision trace for a specific problem
bench brain explain --run ID --problem PID [--brain-db path]
# Substrat topology table
bench brain substrats [--run ID] [--json] [--brain-db path]
Dashboard
The standalone bench dashboard (bench/dashboard/) provides brain visualization via two pages:
Brain page (/brain):
- Run selector with brain event counts
- Learning curve (Recharts) — accuracy + recognition rate over problem index
- Execution path distribution with progress bars
- Substrat topology table
- Dream consolidation event timeline
Runners page (/runners):
- Live runner status with auto-refresh (10s heartbeats)
- Brain stats per runner (trees, substrats, method neurons)
- Progress tracking (problems done/total)
Data Flow (Dual Mode)
Local mode (default): Brain telemetry stored in SQLite (data/brain.db). CLI subcommands query it directly.
Hub mode (DASHBOARD_URL set): Runners batch-submit brain data to the hub via REST API:
- POST /api/runner/brain-events — Decision telemetry per problem
- POST /api/runner/brain-snapshots — Topology snapshots every N problems
- POST /api/runner/dream-events — Dream consolidation results
All data lands in PostgreSQL (schema: bench/migrations/002_brain_tables.sql). Dashboard reads from PostgreSQL.
Dashboard REST endpoints: /api/brain/runs, /api/brain/report/:id
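The hub-mode flow above can be sketched as a small event buffer that flushes a batch per N problems. The endpoint path comes from this doc, but the payload shape, flush size, and class names here are assumptions, not the actual runner_client.rs wire format:

```python
# Sketch of hub-mode batch submission. Payload shape and names are
# hypothetical; serialized payloads stand in for actual HTTP POSTs.
import json

class BrainEventBuffer:
    def __init__(self, run_id: str, batch_size: int = 10):
        self.run_id = run_id
        self.batch_size = batch_size
        self.pending: list[dict] = []
        self.sent: list[str] = []  # stand-in for POSTed request bodies

    def record(self, event: dict) -> None:
        self.pending.append(event)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        payload = json.dumps({"run_id": self.run_id, "events": self.pending})
        # Real runner: POST $DASHBOARD_URL/api/runner/brain-events
        self.sent.append(payload)
        self.pending = []

buf = BrainEventBuffer("r1", batch_size=2)
for i in range(5):
    buf.record({"problem_id": f"p{i}", "correct": 1})
buf.flush()  # final partial batch: 5 events -> two full posts + one of size 1
print(len(buf.sent))
```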
Key Files
| File | Lines | Tests | Purpose |
|---|---|---|---|
| crates/morphee-core/src/brain/telemetry.rs | ~200 | 6 | Data contract, from/to metadata |
| crates/morphee-core/src/brain/store.rs | ~1300 | 24 | NeuronStore trait + File/Git/InMemory + sync |
| bench/cli/src/brain_store.rs | ~860 | 13 | SQLite persistence (3 tables) |
| bench/cli/src/runner_client.rs | ~460 | 4 | HTTP client for hub (brain events, snapshots, dreams) |
| bench/cli/src/commands/brain.rs | ~350 | 10 | 5 CLI subcommands |
| bench/cli/src/metrics.rs | +130 | 3 | 11 Prometheus metrics |
| bench/cli/src/commands/bench.rs | +90 | — | Wiring (store, snapshots, dreams, hub submission) |
| bench/dashboard/server/routes/brain.ts | ~100 | — | Brain dashboard API (PostgreSQL) |
| bench/dashboard/server/routes/runner-api.ts | ~335 | — | Runner API (brain events/snapshots/dreams) |
| bench/dashboard/src/pages/Brain.tsx | ~200 | — | Brain visualization page |
| bench/migrations/002_brain_tables.sql | ~60 | — | PostgreSQL schema |
Design Decisions
- Dual storage — Local SQLite for dev/quick tests (no infrastructure needed). PostgreSQL via hub for production benchmarking (centralized, multi-runner). Same data model, different backends.
- Both Prometheus + PostgreSQL — Prometheus for live monitoring during runs (real-time gauges, Grafana dashboards). PostgreSQL for post-hoc analysis (reports, comparisons, learning curves).
- Batch submission — Runners buffer brain events and submit every 10 problems (configurable). Reduces HTTP overhead while keeping the dashboard reasonably up-to-date.
- Snapshot frequency — The default --snapshot-every 10 balances granularity vs. overhead. For short runs, use --snapshot-every 1.
- Brain tree sync via git — Brain knowledge (neuron trees) syncs through GitNeuronStore.sync(), using git push/pull to a bare repo on the hub. Content-addressable SHA-256 tree IDs mean no merge conflicts. Telemetry (events/snapshots) flows through the REST API; trees flow through git. They're complementary.
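The content-addressable ID scheme is what makes git sync conflict-free: a tree's ID is the SHA-256 of its serialized content, so identical trees get identical IDs on every runner and never diverge. A sketch, assuming a canonical sorted-key JSON serialization (the actual GitNeuronStore encoding may differ):

```python
# Sketch of content-addressable tree IDs. The canonical serialization
# shown here (sorted-key JSON) is an assumption, not the real encoding.
import hashlib
import json

def tree_id(tree: dict) -> str:
    """SHA-256 of the tree's canonical serialized form."""
    canonical = json.dumps(tree, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Same content -> same ID, regardless of insertion order, so two runners
# that learn the same tree produce the same git object and cannot conflict.
a = tree_id({"root": "solve", "children": ["parse", "compute"]})
b = tree_id({"children": ["parse", "compute"], "root": "solve"})
assert a == b
```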