
Brain Telemetry Infrastructure

Status: Implemented
Feature gate: fractal-brain (morphee-core)
Location: morphee-core brain/telemetry.rs + bench-cli
Triggered by: brain-critical-analysis.md

Problem

The fractal brain makes dozens of decisions per problem — substrat matching, tree recall, confidence gating, anti-recall, verification, reward — but none of it was queryable. Brain metrics were scattered across tracing::info!() logs that vanished after the process ended. We couldn't answer basic questions like "How many problems did the brain recognize correctly?" or "Is the brain getting better over time?"

Architecture

Three layers: SQLite (persistent decision log), Prometheus (live gauges), CLI + Dashboard (analysis and visualization).

┌──────────────────────────────────────────────────────────┐
│ NeuronRecallStrategy.process()                           │
│   ├── recognition timing → metadata["recognition_ms"]    │
│   ├── recall timing      → metadata["recall_ms"]         │
│   └── total timing       → metadata["total_brain_ms"]    │
└────────────────────────────┬─────────────────────────────┘
                             │ HashMap<String, serde_json::Value>
                             ▼
┌──────────────────────────────────────────────────────────┐
│ BrainTelemetry::from_metadata()                          │
│ (crates/morphee-core/src/brain/telemetry.rs)             │
│ Extracts & classifies: execution_path, recall_type, etc. │
└────────────────────────────┬─────────────────────────────┘
                             │
              ┌──────────────┼──────────────┐
              ▼              ▼              ▼
        ┌──────────┐  ┌────────────┐  ┌───────────┐
        │  SQLite  │  │ Prometheus │  │   CLI +   │
        │ brain.db │  │            │  │ Dashboard │
        └──────────┘  └────────────┘  └───────────┘
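The metadata contract above can be sketched as follows. This is an illustrative Python stand-in for the Rust `BrainTelemetry::from_metadata()`, not the actual implementation; the field names come from the schema in this document, while the struct shape and defaults are assumptions.

```python
# Hypothetical sketch of the metadata -> telemetry contract: the recall
# strategy writes loosely-typed JSON values into a map, and
# from_metadata-style code extracts and classifies the fields it knows.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class BrainTelemetry:
    execution_path: str            # cerebellum / neocortex / guided_llm / raw_llm
    recall_type: str               # exact / variation / method / guided_llm / novel
    recognition_ms: Optional[int]  # recognition phase timing
    recall_ms: Optional[int]       # recall phase timing
    total_brain_ms: Optional[int]  # total brain overhead

def from_metadata(meta: dict[str, Any]) -> BrainTelemetry:
    def as_int(key: str) -> Optional[int]:
        v = meta.get(key)
        return int(v) if v is not None else None
    # Defaults ("raw_llm", "novel") are assumptions for missing fields.
    return BrainTelemetry(
        execution_path=str(meta.get("execution_path", "raw_llm")),
        recall_type=str(meta.get("recall_type", "novel")),
        recognition_ms=as_int("recognition_ms"),
        recall_ms=as_int("recall_ms"),
        total_brain_ms=as_int("total_brain_ms"),
    )

t = from_metadata({"recognition_ms": 3, "recall_ms": 12, "total_brain_ms": 15,
                   "execution_path": "cerebellum", "recall_type": "exact"})
print(t.total_brain_ms)  # 15
```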

SQLite Schema (brain.db)

brain.db is a separate database file from bench.db. Three tables:

brain_events — One row per problem per run

| Column | Type | Description |
| --- | --- | --- |
| run_id | TEXT | Bench run identifier |
| problem_id | TEXT | Problem identifier |
| problem_index | INTEGER | Position in run |
| correct | INTEGER | 1 if correct, 0 otherwise |
| substrat_id | TEXT | Matched substrat (nullable) |
| substrat_membership | REAL | Membership strength [0, 1] |
| recognition_result | TEXT | recognized / familiar / novel |
| tree_id | TEXT | Matched neuron tree (nullable) |
| recall_type | TEXT | exact / variation / method / guided_llm / novel |
| recall_similarity | REAL | Cosine similarity of match |
| recall_confidence | REAL | Confidence score |
| substitution_count | INTEGER | Parameter substitutions |
| execution_path | TEXT | cerebellum / neocortex / guided_llm / raw_llm |
| llm_calls | INTEGER | LLM calls for this problem |
| working_memory_size | INTEGER | Candidates in working memory |
| candidate_count | INTEGER | Total candidates considered |
| predicted_confidence | REAL | Pre-execution confidence |
| surprise | REAL | Prediction error |
| recognition_ms | INTEGER | Recognition phase timing |
| recall_ms | INTEGER | Recall phase timing |
| total_brain_ms | INTEGER | Total brain overhead |
| created_at | TEXT | ISO 8601 timestamp |
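With this table in place, the questions from the Problem section become one-line queries. A minimal sketch using Python's stdlib sqlite3 against an in-memory database, with a trimmed version of the schema above and made-up sample rows:

```python
# Trimmed brain_events table plus the kind of query the telemetry was
# built to answer. Column names match the schema; rows are sample data.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE brain_events (
        run_id TEXT, problem_id TEXT, correct INTEGER,
        recognition_result TEXT, execution_path TEXT, llm_calls INTEGER
    )""")
db.executemany(
    "INSERT INTO brain_events VALUES (?, ?, ?, ?, ?, ?)",
    [("run-1", "p1", 1, "recognized", "cerebellum", 0),
     ("run-1", "p2", 0, "novel",      "raw_llm",    2),
     ("run-1", "p3", 1, "recognized", "neocortex",  1)])

# "How many problems did the brain recognize correctly?"
recognized_correct, = db.execute(
    "SELECT COUNT(*) FROM brain_events "
    "WHERE recognition_result = 'recognized' AND correct = 1").fetchone()
print(recognized_correct)  # 2
```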

brain_snapshots — Periodic topology snapshots

Captured every --snapshot-every N problems (default 10).

| Column | Type | Description |
| --- | --- | --- |
| run_id | TEXT | Bench run identifier |
| problem_index | INTEGER | Snapshot position |
| total_trees | INTEGER | Neuron tree count |
| total_substrats | INTEGER | Substrat count |
| total_method_neurons | INTEGER | Method neuron count |
| avg_confidence | REAL | Mean confidence |
| recognition_rate | REAL | % of problems not novel |
| recall_accuracy | REAL | Accuracy of recalled answers |
| llm_calls_saved | INTEGER | Problems with 0 LLM calls |
| total_llm_calls | INTEGER | Cumulative LLM calls |
| avg_brain_overhead_ms | REAL | Mean brain overhead |
| current_accuracy | REAL | Running accuracy at snapshot |
| created_at | TEXT | ISO 8601 timestamp |
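These snapshot rows are what a learning curve is read from: one (problem_index, accuracy, recognition_rate) point per snapshot. A sketch of that read path, again with stdlib sqlite3 and stand-in data (the real table lives in brain.db, and `bench brain curve` is presumably doing the equivalent):

```python
# Read a learning curve out of brain_snapshots: ordered points of
# running accuracy and recognition rate. Sample data, in-memory DB.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE brain_snapshots (
    run_id TEXT, problem_index INTEGER,
    recognition_rate REAL, current_accuracy REAL)""")
db.executemany("INSERT INTO brain_snapshots VALUES (?, ?, ?, ?)",
               [("run-1", 10, 0.20, 0.5),
                ("run-1", 20, 0.40, 0.6),
                ("run-1", 30, 0.55, 0.7)])

curve = db.execute(
    "SELECT problem_index, current_accuracy, recognition_rate "
    "FROM brain_snapshots WHERE run_id = ? ORDER BY problem_index",
    ("run-1",)).fetchall()
print(curve[-1])  # (30, 0.7, 0.55)
```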

dream_events — One row per dream consolidation

| Column | Type | Description |
| --- | --- | --- |
| run_id | TEXT | Bench run identifier |
| merges, prunes, deleted_hopeless, rehabilitated | INTEGER | Dream consolidation counts |
| events_pruned, branches_pruned | INTEGER | Pruning stats |
| substrats_formed, substrats_assigned | INTEGER | Substrat changes |
| method_neurons_born | INTEGER | New method neurons |
| code_tested, code_boosted, code_fragile, code_removed | INTEGER | Code verification stats |
| created_at | TEXT | ISO 8601 timestamp |

Prometheus Metrics (11 gauges/counters/histograms)

| Metric | Type | Labels | Purpose |
| --- | --- | --- | --- |
| brain_substrat_count | Gauge | | Current substrat count |
| brain_recognition_rate | Gauge | | % of problems recognized |
| brain_execution_path | Counter | path | Distribution across paths |
| brain_llm_calls_saved | Counter | | Cumulative 0-LLM recalls |
| brain_recall_accuracy | Gauge | type | Per recall-type accuracy |
| brain_overhead_ms | Histogram | | Brain overhead distribution |
| brain_substrat_membership | Histogram | | Membership strength distribution |
| brain_surprise | Histogram | | Prediction error distribution |
| brain_confidence_mean | Gauge | | Running mean confidence |
| brain_trees_total | Gauge | | Live neuron tree count |
| brain_method_neurons | Gauge | | Live method neuron count |
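For orientation, a scrape of the metrics endpoint would look roughly like the fragment below, in the standard Prometheus text exposition format. The metric names and label keys come from the table above; all sample values are illustrative, not captured output.

```text
# TYPE brain_recognition_rate gauge
brain_recognition_rate 0.62
# TYPE brain_execution_path counter
brain_execution_path{path="cerebellum"} 41
brain_execution_path{path="raw_llm"} 12
# TYPE brain_recall_accuracy gauge
brain_recall_accuracy{type="exact"} 0.95
# TYPE brain_overhead_ms histogram
brain_overhead_ms_bucket{le="5"} 18
brain_overhead_ms_bucket{le="+Inf"} 53
brain_overhead_ms_sum 412
brain_overhead_ms_count 53
```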

CLI Commands

Five subcommands under bench brain:

# Full brain report for latest (or specific) run
bench brain report [--run ID] [--brain-db path]

# Side-by-side comparison of two runs
bench brain compare --runs A,B [--brain-db path]

# Learning curve from snapshots
bench brain curve [--run ID] [--brain-db path]

# Full decision trace for a specific problem
bench brain explain --run ID --problem PID [--brain-db path]

# Substrat topology table
bench brain substrats [--run ID] [--json] [--brain-db path]

Dashboard

The standalone bench dashboard (bench/dashboard/) provides brain visualization via two pages:

Brain page (/brain):

  • Run selector with brain event counts
  • Learning curve (Recharts) — accuracy + recognition rate over problem index
  • Execution path distribution with progress bars
  • Substrat topology table
  • Dream consolidation event timeline

Runners page (/runners):

  • Live runner status with auto-refresh (10s heartbeats)
  • Brain stats per runner (trees, substrats, method neurons)
  • Progress tracking (problems done/total)

Data Flow (Dual Mode)

Local mode (default): Brain telemetry stored in SQLite (data/brain.db). CLI subcommands query it directly.

Hub mode (DASHBOARD_URL set): Runners batch-submit brain data to the hub via REST API:

  • POST /api/runner/brain-events — Decision telemetry per problem
  • POST /api/runner/brain-snapshots — Topology snapshots every N problems
  • POST /api/runner/dream-events — Dream consolidation results

All data lands in PostgreSQL (schema: bench/migrations/002_brain_tables.sql). Dashboard reads from PostgreSQL.

Dashboard REST endpoints: /api/brain/runs, /api/brain/report/:id
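The runner-side batching can be sketched as below. This is a hypothetical Python stand-in for runner_client.rs, not the real wire format: the endpoint path matches the list above, while the `{"events": [...]}` payload shape, the DASHBOARD_URL handling, and the injectable `post` hook are assumptions for illustration.

```python
# Buffer one telemetry record per problem and flush a JSON batch to the
# hub's POST /api/runner/brain-events every N problems (default 10).
import json
import urllib.request

class BrainEventBatcher:
    def __init__(self, dashboard_url, batch_size=10, post=None):
        self.url = dashboard_url.rstrip("/") + "/api/runner/brain-events"
        self.batch_size = batch_size
        self.buffer = []
        # `post` is injectable so the sketch can be exercised offline.
        self.post = post or self._http_post

    def _http_post(self, url, body):
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)

    def record(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        body = json.dumps({"events": self.buffer}).encode()
        self.post(self.url, body)
        self.buffer = []

sent = []
b = BrainEventBatcher("http://hub:3000", batch_size=2,
                      post=lambda url, body: sent.append((url, body)))
b.record({"problem_id": "p1", "correct": 1})
b.record({"problem_id": "p2", "correct": 0})  # hits batch_size, flushes
print(len(sent))  # 1
```

The injected `post` hook also makes the batcher trivial to unit-test, which mirrors the design decision below to keep batching configurable.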

Key Files

| File | Lines | Tests | Purpose |
| --- | --- | --- | --- |
| crates/morphee-core/src/brain/telemetry.rs | ~200 | 6 | Data contract, from/to metadata |
| crates/morphee-core/src/brain/store.rs | ~1300 | 24 | NeuronStore trait + File/Git/InMemory + sync |
| bench/cli/src/brain_store.rs | ~860 | 13 | SQLite persistence (3 tables) |
| bench/cli/src/runner_client.rs | ~460 | 4 | HTTP client for hub (brain events, snapshots, dreams) |
| bench/cli/src/commands/brain.rs | ~350 | 10 | 5 CLI subcommands |
| bench/cli/src/metrics.rs | +130 | 3 | 11 Prometheus metrics |
| bench/cli/src/commands/bench.rs | +90 | | Wiring (store, snapshots, dreams, hub submission) |
| bench/dashboard/server/routes/brain.ts | ~100 | | Brain dashboard API (PostgreSQL) |
| bench/dashboard/server/routes/runner-api.ts | ~335 | | Runner API (brain events/snapshots/dreams) |
| bench/dashboard/src/pages/Brain.tsx | ~200 | | Brain visualization page |
| bench/migrations/002_brain_tables.sql | ~60 | | PostgreSQL schema |

Design Decisions

  1. Dual storage — Local SQLite for dev/quick tests (no infrastructure needed). PostgreSQL via hub for production benchmarking (centralized, multi-runner). Same data model, different backends.

  2. Both Prometheus + PostgreSQL — Prometheus for live monitoring during runs (real-time gauges, Grafana dashboards). PostgreSQL for post-hoc analysis (reports, comparisons, learning curves).

  3. Batch submission — Runners buffer brain events and submit every 10 problems (configurable). Reduces HTTP overhead while keeping dashboard reasonably up-to-date.

  4. Snapshot frequency — Default --snapshot-every 10 balances granularity vs. overhead. For short runs, use --snapshot-every 1.

  5. Brain tree sync via git — Brain knowledge (neuron trees) syncs through GitNeuronStore.sync() using git push/pull to a bare repo on the hub. Content-addressable SHA-256 tree IDs mean no merge conflicts. Telemetry (events/snapshots) flows through REST API. Trees flow through git. They're complementary.
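The conflict-free property in decision 5 follows from content addressing, which a few lines make concrete. The canonical-JSON-then-SHA-256 scheme here is an assumption about how GitNeuronStore derives IDs, shown purely for illustration:

```python
# Why content-addressable IDs avoid git merge conflicts: a tree's ID is
# a hash of its canonicalized content, so two runners that learn the
# same tree produce the same file at the same path, and different trees
# land at different paths -- push/pull only ever adds files.
import hashlib
import json

def tree_id(tree: dict) -> str:
    # Canonical form: sorted keys, no whitespace (assumed scheme).
    canonical = json.dumps(tree, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = tree_id({"root": "factor", "children": ["gcd", "lcm"]})
b = tree_id({"children": ["gcd", "lcm"], "root": "factor"})  # same content
c = tree_id({"root": "factor", "children": ["gcd"]})         # different tree

print(a == b, a == c)  # True False
```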