Fractal Brain — Digital Organism Architecture
Status: Implemented (Phases 1-6 + Prediction + Telemetry). Actively benchmarked via Kaggle AIMO competition.
Feature gate: fractal-brain (brain modules), grpc (proto/tonic)
ADR: Follows from ADR-012 (RL Policy Network)
Location: crates/morphee-core/src/brain/ (26 files, ~10,000 lines, 233 tests)
Benchmarking: bench/ (CLI + Dashboard + Docker + Remote Runners) — 284 tests
Telemetry: brain-telemetry.md — SQLite + Prometheus + CLI + Dashboard
Competition context: The Fractal Brain is being demonstrated through the Kaggle AIMO math competition. Math is the perfect domain — it requires abstraction, pattern recognition, and compositional reasoning. Success here proves the brain can handle day-to-day tasks for families, teams, and professionals. Learned knowledge will be shareable through the Knowledge Marketplace (V2.1).
See also: Digital Brain Vision — the next evolution. Brain Critical Analysis — honest assessment of strengths and gaps. Brain Telemetry — measurement infrastructure.
Overview
The Fractal Brain implements Universal Recursive Intelligence — everything is an Organism (neurons, spaces, groups, LLMs, WASM modules). Same trait, same lifecycle, every scale. The core pattern: receive signal → recall → respond → learn.
Six scales of organization, each implementing the same Organism trait:
| Scale | Example | Recall | Learn |
|---|---|---|---|
| Neuron | Single concept/fact | Fingerprint match | Hebbian weight update |
| Experience | LLM call, WASM module | Direct delegation | Reward signal |
| Space | Family, Classroom, Project | NeuronMemory (3-mode) | Edge weight update + neuron storage |
| Group | The Dupont Family | Route to best Space | Cross-space edge learning |
| Instance | Desktop/mobile app | Local organism graph | Sync with server |
| Network | Specialist neurons | gRPC signal propagation | Federated learning |
Phase 1-3: Neuron-based Knowledge Representation
Problem
morphee-core's knowledge pipeline used flat 384-dim embedding vectors for recall (recall_similar via cosine similarity). This loses structural information:
- "Find the GCD of 48 and 18" and "Find the GCD of 360 and 240" appear "similar" but there's no way to know the operation matches and only the arguments differ
- Flat cosine can't distinguish structural similarity from surface similarity
- Every near-miss requires a full LLM call even when only parameter substitution is needed
Solution: Neurons + Synapses + Trees
The Fractal Brain treats each embedding vector as a neuron (a point in 384-dim space) and connections between them as synapses (weighted by influence). A single BERT forward pass produces per-token hidden states; trajectory segmentation recursively decomposes them into a tree.
Three Recall Modes
| Mode | Condition | Action | LLM Calls |
|---|---|---|---|
| Exact | All neurons match (root >0.95, children >0.90) | Replay stored solution code | 0 |
| Variation | Operation neurons match, leaf neurons differ | Substitute parameters in stored code | 0 |
| Novel | No structural match | Full LLM call, store new tree | 1+ |
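The three-mode dispatch above can be sketched in Rust. This is a minimal illustration using the thresholds from the table; the helper names (`classify`, `llm_calls_for`) are hypothetical, not the actual recall API:

```rust
/// The three recall outcomes described in the table above.
#[derive(Debug, PartialEq)]
pub enum TreeMatch {
    Exact,
    Variation,
    Novel,
}

/// LLM cost of each recall mode: replay and substitution are free.
pub fn llm_calls_for(m: &TreeMatch) -> u32 {
    match m {
        TreeMatch::Exact | TreeMatch::Variation => 0,
        TreeMatch::Novel => 1,
    }
}

/// Classify a tree comparison: root similarity > 0.95 and all child
/// similarities > 0.90 → Exact; matching operation neurons with differing
/// leaves → Variation; otherwise Novel.
pub fn classify(root_sim: f32, child_sims: &[f32], ops_match: bool) -> TreeMatch {
    if root_sim > 0.95 && child_sims.iter().all(|s| *s > 0.90) {
        TreeMatch::Exact
    } else if ops_match {
        TreeMatch::Variation
    } else {
        TreeMatch::Novel
    }
}
```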
Recall Architecture
Query text
↓
embed_tokens() → per-token hidden states (384-dim each)
↓
TrajectorySegmenter → NeuronTree (recursive fractal structure)
↓
FingerprintIndex.find_similar() → candidate trees (O(1) lookup)
↓
compare_trees() → TreeMatch { Exact | Variation | Novel }
↓
NeuronRecallStrategy:
Exact → replay stored code
Variation → string-replace stored labels → execute
Novel → fallback strategy → store new tree
Trajectory Segmentation Algorithm
- Compute running mean of hidden states at each token position
- Compute deltas (how the running mean changes per token)
- Compute cosine between consecutive deltas
- Adaptive threshold: `mean(cosines) - 1.0 * std(cosines)`
- Split at points where cosine drops below threshold (direction change)
- Recurse on each segment until `min_segment_len` or `max_depth`
- Synapse weight = `||child_mean - parent_mean|| * token_count / total_influence`
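The adaptive split step can be sketched as follows. This is a minimal illustration of the thresholding rule, not the actual `TrajectorySegmenter`:

```rust
fn mean(xs: &[f32]) -> f32 {
    xs.iter().sum::<f32>() / xs.len() as f32
}

fn std_dev(xs: &[f32]) -> f32 {
    let m = mean(xs);
    (xs.iter().map(|x| (x - m).powi(2)).sum::<f32>() / xs.len() as f32).sqrt()
}

/// Given cosines between consecutive running-mean deltas, return the indices
/// where the trajectory changes direction: cosine below the adaptive
/// threshold mean(cosines) - 1.0 * std(cosines).
pub fn split_points(cosines: &[f32]) -> Vec<usize> {
    let threshold = mean(cosines) - 1.0 * std_dev(cosines);
    cosines
        .iter()
        .enumerate()
        .filter(|(_, c)| **c < threshold)
        .map(|(i, _)| i)
        .collect()
}
```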
Sparse Fingerprints
For O(1) approximate matching, each neuron stores a SparseFingerprint:
- Top-32 most active dimensions (by absolute value) + their signs
- Jaccard-like similarity (dimension overlap × sign agreement)
- Hash-based `FingerprintIndex` for bucket-based candidate retrieval
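A minimal sketch of fingerprint construction and the Jaccard-like comparison. Field layout and method names are assumptions for illustration, not the actual `SparseFingerprint` API:

```rust
use std::collections::HashMap;

/// Top-k most active dimensions (by absolute value) with their signs.
pub struct SparseFingerprint {
    pub dims: HashMap<usize, i8>, // dimension index → sign (+1 / -1)
}

impl SparseFingerprint {
    pub fn from_vector(v: &[f32], k: usize) -> Self {
        // Sort dimension indices by descending absolute activation.
        let mut idx: Vec<usize> = (0..v.len()).collect();
        idx.sort_by(|a, b| v[*b].abs().partial_cmp(&v[*a].abs()).unwrap());
        let dims = idx[..k.min(v.len())]
            .iter()
            .map(|&i| (i, if v[i] >= 0.0 { 1 } else { -1 }))
            .collect();
        Self { dims }
    }

    /// Jaccard-like similarity: dimension overlap weighted by sign agreement.
    pub fn similarity(&self, other: &Self) -> f32 {
        let mut agree = 0usize;
        for (d, s) in &self.dims {
            if other.dims.get(d) == Some(s) {
                agree += 1;
            }
        }
        let union = self.dims.len() + other.dims.len() - agree;
        if union == 0 { 0.0 } else { agree as f32 / union as f32 }
    }
}
```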
Hebbian Plasticity
Synapse weights are updated by RL reward signals:
- Positive reward → strengthen connections (weight increases)
- Negative reward → weaken connections (weight decreases)
- Weights clamped to [0.01, 1.0]
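The update rule above can be written in a few lines. The learning rate is an illustrative assumption; the clamp range comes from the text:

```rust
/// Hebbian update: positive reward strengthens a synapse, negative reward
/// weakens it, and the weight stays within [0.01, 1.0].
/// `learning_rate` is an illustrative parameter, not a documented constant.
pub fn hebbian_update(weight: f32, reward: f64, learning_rate: f32) -> f32 {
    (weight + learning_rate * reward as f32).clamp(0.01, 1.0)
}
```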
Neuron Merging
Two neurons with fingerprint similarity >0.92 can merge:
- Weighted average of activation vectors (by strength)
- Combined strength count
- Creates "concept neurons" that generalize across experiences
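The merge itself is a strength-weighted average. A sketch (function shape is illustrative; the real merge operates on full `Neuron` structs):

```rust
/// Merge two activation vectors into a concept neuron: weighted average
/// by strength, with strengths combined.
pub fn merge_activations(a: &[f32], sa: f32, b: &[f32], sb: f32) -> (Vec<f32>, f32) {
    let total = sa + sb;
    let merged = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x * sa + y * sb) / total)
        .collect();
    (merged, total)
}
```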
Phase 4: Organism Architecture
Organism Trait
The universal contract that everything implements:
pub trait Organism: Send + Sync {
fn id(&self) -> &OrganismId;
fn scale(&self) -> Scale;
fn health(&self) -> Health;
async fn receive(&self, signal: &BrainSignal, ctx: &SignalContext) -> Result<Vec<BrainSignal>>;
async fn learn(&self, signal_id: &str, reward: f64) -> Result<()>;
}
BrainSignal
Signals carry data between organisms with modality awareness:
- `Modality`: Text, Audio, Scalar, Image, Structured, Event
- `Activation`: substrat_id + embedding vector
- `SignalContext`: source organism, depth, budget, timestamp
Edge System
Directed weighted edges connect organisms:
- `EdgeKind`: Temporal, Hierarchical, Associative, CrossSubstrat, CrossOrganism
- `AdaptiveFilter`: learned gate on each edge (pass_rate updated by reward)
- Weight range: [0.0, 1.0], Hebbian update on signal flow
SubstratEncoder
Abstraction for different embedding modalities:
- `BertSubstrat` wraps the existing `Embedder` trait
- `TokenState` tracks per-token activation history
- Designed for multi-modal: text, audio, and image substrats share the same trait
Grammar System
Tokenization/detokenization per modality:
- `TextGrammar` wraps `BertTokenizer`
- `AudioGrammar` stub for future audio signal processing
- `Token` with `TokenKind` (Word, Subword, Punctuation, Special, AudioFrame, ScalarValue)
Phase 5: Signal Propagation & Execution
SignalGraphExecutor
The engine that propagates signals through the organism graph:
pub struct SignalGraphExecutor {
organisms: Arc<RwLock<HashMap<OrganismId, Arc<RwLock<dyn Organism>>>>>,
config: ExecutorConfig, // max_depth=5, budget_ms=1000, max_fanout=8
}
Safety bounds:
- `max_depth=5` — prevents infinite recursion
- `budget_ms=1000` — checked at each recursion via `Instant::elapsed()`
- `max_fanout=8` — limits edges explored per organism
Propagation loop:
- Deliver signal to target organism via `receive()`
- For each response signal, check outgoing edges from target
- For each edge where `AdaptiveFilter.pass_rate > threshold`, recurse
- Guard: depth < max_depth AND elapsed < budget_ms AND fanout < max_fanout
- Record `SignalTrace` (path of organism hops with edge weights)
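The guard condition can be sketched as a single predicate (parameter names mirror `ExecutorConfig`; the function itself is illustrative):

```rust
use std::time::{Duration, Instant};

/// Propagation continues only while all three safety bounds hold:
/// recursion depth, elapsed wall-clock budget, and per-organism fanout.
pub fn may_recurse(
    depth: usize,
    started: Instant,
    fanout: usize,
    max_depth: usize,
    budget: Duration,
    max_fanout: usize,
) -> bool {
    depth < max_depth && started.elapsed() < budget && fanout < max_fanout
}
```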
SignalTrace
Records the path a signal took through the graph. When learn() is called with a reward, the trace identifies exactly which edges to update:
pub struct SignalTrace {
pub signal_id: String,
pub path: Vec<(OrganismId, OrganismId, f32)>, // (from, to, edge_weight)
pub timestamp: u64,
}
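Applying a reward along a recorded trace then reduces to walking the path. A minimal sketch (learning rate and clamp range are illustrative assumptions):

```rust
/// Nudge every edge weight along a SignalTrace path by the reward.
/// Positive reward strengthens the traversed edges; negative weakens them.
pub fn apply_reward(path: &mut [(String, String, f32)], reward: f32) {
    for (_from, _to, weight) in path.iter_mut() {
        *weight = (*weight + 0.1 * reward).clamp(0.0, 1.0);
    }
}
```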
SpaceOrganism
The first production `impl Organism`. Each Space becomes an independent learning organism:
pub struct SpaceOrganism {
id: OrganismId,
space_id: String,
group_id: String,
neuron_memory: Arc<NeuronMemory>,
edges: Vec<Edge>,
substrats: Vec<Arc<Substrat>>,
centroid: Option<Activation>,
child_organisms: Vec<OrganismId>,
signal_traces: Arc<Mutex<VecDeque<SignalTrace>>>,
pipeline_fallback: Option<Arc<Pipeline>>,
}
receive() flow:
- Grammar: decode signal modality → tokens
- SubstratEncoder: encode tokens → activations
- NeuronMemory::recall() → Exact/Variation/Novel
- Exact → construct response from stored `source_text` (0 LLM)
- Variation → apply parameter substitutions (0 LLM)
- Novel → delegate to LLM child organism or Pipeline fallback
- Store new NeuronTree if Novel
- Return response signals
learn() flow:
- Find SignalTrace for signal_id
- Update edge weights along path (Hebbian: reward strengthens, punishment weakens)
- Update NeuronTree strength in NeuronStore
- Does NOT propagate to other spaces (independence)
LlmOrganism
Wraps Arc<dyn Inferencer> as Organism at Scale::Experience. Receives text signals, calls inferencer.generate(), returns response signal. learn() is a no-op but logs for RL policy.
WasmOrganism
Wraps Arc<dyn Executor> as Organism at Scale::Experience. Maps signal → executor input, executor output → response signal.
Phase 6: Reward System & Confidence Tracking
Reward Architecture (reward.rs, 564 lines, 14 tests)
The reward system tracks confidence at the neuron/tree level:
- Confidence tracking: Per-tree confidence based on reward history (uses/correct/incorrect counts)
- Quarantine: Trees with reject rate >60% and 2+ uses are quarantined (excluded from recall)
- Branch-level blame: `BranchBlame` distributes reward across tree children proportionally
Key types:
- `RewardEvent` — timestamped reward with match_type and context_hash
- `TreeRewardLedger` — per-tree tracking (events, total_uses, total_correct, confidence, quarantined)
- `BranchBlame` — branch-level blame attribution (child_id → correct/total)
- `QuarantineConfig` — thresholds (min_uses=2, reject_rate=0.6)
Substrat Index — Problem Type Recognition
SubstratIndex (substrat_index.rs, 952 lines, 36 tests)
The substrat index provides problem type recognition before recall. Each substrat is a Gaussian cluster in 384-dim sentence embedding space. When a new query arrives, it's classified as Recognized/Familiar/Novel at the substrat level before tree-level recall begins.
SubstratCluster
pub struct SubstratCluster {
id: SubstratId,
centroid: Vec<f32>, // mean embedding
scope: f32, // sigma (spread)
temperature: f32, // plasticity control
confidence: f32, // learned reliability
exemplar_tree_ids: Vec<String>,
origin: SubstratOrigin, // Explicit / Emergent / Archived
}
Membership function: exp(-d² / (2σ²)) where σ = scope × (1.0 + 0.3 × temperature)
Constants:
- `MEMBERSHIP_THRESHOLD` = 0.3
- `NOVEL_DISTANCE_THRESHOLD` = 0.80
- `TEMPERATURE_DECAY` = 0.95
- `METHOD_NEURON_THRESHOLD` = 5 (exemplars to birth method neuron)
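The membership function above translates directly into code (a sketch; the real `SubstratCluster` computes `d` from its centroid):

```rust
/// Gaussian membership: exp(-d² / (2σ²)),
/// where σ = scope × (1.0 + 0.3 × temperature).
/// Higher temperature widens the cluster, making membership more permissive.
pub fn membership(distance: f32, scope: f32, temperature: f32) -> f32 {
    let sigma = scope * (1.0 + 0.3 * temperature);
    (-(distance * distance) / (2.0 * sigma * sigma)).exp()
}
```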
RecognitionResult
| Result | Meaning | Action |
|---|---|---|
| Recognized | High membership in existing substrat | Recall within that substrat's trees |
| Familiar | Moderate membership | Broader recall + centroid update |
| Novel | No substrat match | Full LLM call, potentially create new substrat |
WorkingMemory
Transient context during recall: candidate substrats + candidate trees. Scoped to a single query, discarded after.
Method Neurons — Learned Procedures
MethodNeuron (method_neuron.rs, 254 lines, 10 tests)
Method neurons encapsulate learned procedures — code templates, solution patterns, or best exemplars that have proven reliable. They mature through three stages inspired by neuroscience:
| Stage | Plasticity | Confidence | Behavior |
|---|---|---|---|
| Hippocampus | High | Low | New, actively learning, every execution verified |
| Neocortex | Moderate | Medium | Verified, dual-path check (execute + verify) |
| Cerebellum | Low | High | Automated, no LLM verification needed |
Promotion thresholds:
- Hippocampus → Neocortex: 10+ uses, 70%+ confidence
- Neocortex → Cerebellum: 50+ uses, 90%+ confidence
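The promotion rule can be sketched as a pure function over the thresholds above (stage names from the table; the function itself is illustrative):

```rust
#[derive(Debug, PartialEq)]
pub enum NeuronStage {
    Hippocampus,
    Neocortex,
    Cerebellum,
}

/// Promote a method neuron when it crosses the documented thresholds:
/// 10+ uses at 70%+ confidence, then 50+ uses at 90%+ confidence.
pub fn promote(stage: NeuronStage, uses: u32, confidence: f32) -> NeuronStage {
    match stage {
        NeuronStage::Hippocampus if uses >= 10 && confidence >= 0.70 => NeuronStage::Neocortex,
        NeuronStage::Neocortex if uses >= 50 && confidence >= 0.90 => NeuronStage::Cerebellum,
        s => s, // thresholds not met: stay at the current stage
    }
}
```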
Procedure types:
- `ParameterizedCode` — code template with parameter slots
- `SolutionTemplate` — structured solution pattern
- `BestExemplar` — reference to the highest-confidence exemplar tree
Prediction System — Surprise-Driven Learning
PredictionTracker (prediction.rs, 257 lines, 10 tests)
The prediction system enables surprise-driven learning. Before execution, the brain predicts its confidence. After execution, the actual outcome is compared. High surprise (|predicted - actual| > 0.3) triggers stronger centroid updates in the substrat.
pub struct Prediction {
substrat_id: SubstratId,
predicted_confidence: f32, // before execution
actual_outcome: Option<f32>, // after execution
surprise: f32, // |predicted - actual|
}
This makes the brain learn faster from unexpected results — both surprising failures and surprising successes drive more aggressive substrat reorganization.
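The surprise computation and its effect on the update can be sketched as (the 2× boost factor is an illustrative assumption; the 0.3 threshold comes from the text):

```rust
/// Surprise is the absolute prediction error.
pub fn surprise(predicted: f32, actual: f32) -> f32 {
    (predicted - actual).abs()
}

/// High surprise (> 0.3) triggers a stronger centroid update.
/// The doubling is an illustrative choice, not a documented constant.
pub fn centroid_learning_rate(base: f32, surprise: f32) -> f32 {
    if surprise > 0.3 { base * 2.0 } else { base }
}
```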
Dream Consolidation
DreamConsolidator (dream.rs, 473 lines, 10 tests)
9-phase background consolidation cycle (phases 7-9 require fractal-brain):
- Neuron merging — merge neurons with fingerprint similarity >0.92
- Synapse pruning — remove synapses with weight < 0.05
- Weak tree deletion — delete trees with 0% correct + quarantined
- Rehabilitation — reset quarantine for old trees
- Event pruning — keep only 100 most recent events per tree
- Branch pruning — remove worst-performing children
- Mitosis detection — detect low-coherence clusters → split
- Substrat clustering — assign trees to substrat clusters
- Method neuron birth — 5+ exemplars in a substrat → birth method neuron
DreamScheduler (dream_scheduler.rs, 216 lines, 4 tests)
Background timer for dream cycles. Default interval: 5 minutes (configurable via MORPHEE_DREAM_INTERVAL_SECS).
Lifecycle (lifecycle.rs, 280 lines, 6 tests)
- `MitosisDetector` — monitors organism coherence (thresholds: 200 neurons, 0.4 coherence, min cluster size 10), triggers split
- `DecayPolicy` — configurable decay for edges and synapses
Telemetry Infrastructure
BrainTelemetry (telemetry.rs, 364 lines, 6 tests)
Every brain decision is captured as structured telemetry:
pub struct BrainTelemetry {
substrat_id: Option<String>,
recognition_result: String, // recognized/familiar/novel
tree_id: Option<String>,
recall_type: String, // exact/variation/novel
execution_path: String, // neuron_exact/neuron_variation/knowledge/llm
llm_calls: u32,
working_memory_size: u32,
candidate_count: u32,
predicted_confidence: Option<f32>,
surprise: Option<f32>,
timing: BrainTiming, // recognition_ms, recall_ms, total_brain_ms
}
Three-layer measurement: SQLite (persistent, per-problem decisions), Prometheus (live gauges, 11 metrics), CLI + Dashboard (analysis and visualization). See brain-telemetry.md.
Multi-Space Management
SpaceOrganismRegistry (space_registry.rs, 380 lines, 11 tests)
Manages multiple independent SpaceOrganisms per group:
pub struct SpaceOrganismRegistry {
organisms: HashMap<OrganismId, Arc<RwLock<SpaceOrganism>>>,
executor: Arc<SignalGraphExecutor>,
dream_handles: HashMap<OrganismId, DreamHandle>,
}
- `create_space()` — creates a new SpaceOrganism with its own NeuronStore
- `send_signal()` — routes a signal to the correct space
- `add_cross_space_edge()` — connects spaces via `EdgeKind::CrossOrganism`
- `export_space()` / `import_space()` — bundle spaces for marketplace sharing
Cross-Space Edges
Spaces connect via EdgeKind::CrossOrganism edges:
- A "Math" space has an edge to a "Calculator" space
- When Math gets a Novel signal, it propagates through the edge to Calculator
- Calculator processes and returns response signals
- Math's edge weight to Calculator is updated by learn() — independent of Calculator's internal state
NeuronStore Implementations
NeuronStore trait (store.rs, 459 lines, 11 tests)
pub trait NeuronStore: Send + Sync {
async fn store_tree(&self, tree: &NeuronTree) -> Result<()>;
async fn get_tree(&self, id: &str) -> Result<Option<NeuronTree>>;
async fn find_similar(&self, fingerprint: &SparseFingerprint, limit: usize) -> Result<Vec<NeuronTree>>;
async fn list_trees(&self) -> Result<Vec<NeuronTree>>;
async fn delete_tree(&self, id: &str) -> Result<()>;
}
Three implementations:
- InMemoryNeuronStore — for testing, HashMap-backed
- FileNeuronStore — persistent, JSON files in `{data_dir}/neurons/{space_id}/`
- SqliteNeuronStore — persistent, SQLite database with FTS for fingerprint search
Per-space isolation is handled at construction time — each SpaceOrganism gets its own store instance.
gRPC Proto Definitions
organism.proto (212 lines)
7 RPCs for organism communication:
| RPC | Direction | Purpose |
|---|---|---|
Send | server streaming | Signal propagation through organism graph |
Learn | unary | Reward feedback to organism |
Observe | server streaming | Live signal stream for frontend visualization |
GetOrganism | unary | Inspect organism state |
ListOrganisms | unary | Enumerate organisms |
TriggerDream | unary | On-demand consolidation |
Chat | server streaming | Text chat (replaces SSE /v1/chat) |
Proto ↔ Rust conversions in proto_convert.rs (408 lines, 8 tests), gated behind #[cfg(feature = "grpc")].
Pipeline → Organism Mapping
| Pipeline component | Organism equivalent | How |
|---|---|---|
Embedder.embed() | SubstratEncoder.encode() | BertSubstrat wraps Embedder |
Router.route() | Edge weights + AdaptiveFilter | Signal follows strongest edges |
Strategy.process() | Recursive receive() | Signal propagation through depth |
Executor.execute() | WasmOrganism child | WASM module as organism |
Inferencer.generate() | LlmOrganism child | LLM as organism |
Scorer.score() | Implicit in edge weights | Learned, not computed |
FeedbackLoop | learn() + Hebbian edges | Reward propagates through trace |
MiddlewareChain | AdaptiveFilter on edges | Filters learn what to pass |
EventBus | SignalPropagated events | Observer of signal flow |
Events
Brain-related events emitted to EventBus:
| Event | Source | Purpose |
|---|---|---|
NeuronTreeBuilt | recall | New neuron tree stored |
NeuronRecalled | recall | Existing tree matched (Exact/Variation) |
OrganismEdgeFormed | space | New edge between organisms |
OrganismMitosis | lifecycle | Organism split detected |
OrganismFusion | lifecycle | Organisms merged |
DreamCycleCompleted | dream | Background consolidation finished |
SignalPropagated | executor | Signal hop recorded |
Strategy Chain (bench-cli)
DirectToolStrategy (pattern match → tools, fastest)
→ NeuronRecallStrategy (fractal-brain, 3-tier recognition)
→ KnowledgeRecallStrategy (flat cosine, experience store)
→ AdaptiveStrategy (teaching → solver_first → code_execution → single_shot)
Three-Tier Recognition Stack (NeuronRecallStrategy)
- Substrat Recognition — sentence-level Gaussian clustering. Classifies the problem TYPE (e.g., "number theory / GCD"). Narrows the search space.
- Tree Recall — fingerprint-based matching within the recognized substrat. Finds structurally similar past experiences. Returns Exact/Variation/Novel.
- Method Neuron Execution — if a mature method neuron exists for this substrat, use its procedure directly (parameterized code or solution template).
Each tier short-circuits: if substrat recognition finds a Cerebellum-stage method neuron, zero LLM calls. If tree recall hits Exact, zero LLM calls. Only Novel queries fall through to LLM.
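The short-circuit logic can be sketched as a tiered fallthrough (all names here are illustrative; the real strategy operates on substrat and tree results):

```rust
/// Where a query was ultimately answered.
pub enum Answer {
    MethodNeuron, // tier 1/3: mature Cerebellum-stage procedure
    ExactReplay,  // tier 2: exact tree match, replay stored solution
    Llm,          // fallthrough: Novel query, full LLM call
}

/// Returns the answer source and the LLM call count it incurs.
pub fn resolve(has_cerebellum_method: bool, tree_exact: bool) -> (Answer, u32) {
    if has_cerebellum_method {
        (Answer::MethodNeuron, 0)
    } else if tree_exact {
        (Answer::ExactReplay, 0)
    } else {
        (Answer::Llm, 1)
    }
}
```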
Key Files
Brain module files (26 files, ~10,000 lines, 233 tests)
| File | Tests | Purpose |
|---|---|---|
mod.rs | — | Module registration + feature gates |
neuron.rs | 11 | Neuron, Synapse, NeuronTree, Hebbian update, merge |
fingerprint.rs | 7 | SparseFingerprint, FingerprintIndex |
segmenter.rs | 9 | TrajectorySegmenter (recursive tree builder) |
store.rs | 19 | NeuronStore trait + InMemory + File + Git |
recall.rs | 17 | TreeMatch, compare_trees, NeuronMemory |
reward.rs | 14 | TreeRewardLedger, quarantine, BranchBlame |
dream.rs | 10 | DreamConsolidator (9-phase cycle) |
substrat_index.rs | 36 | SubstratCluster, SubstratIndex, RecognitionResult, WorkingMemory |
method_neuron.rs | 10 | MethodNeuron, NeuronStage, Procedure |
prediction.rs | 10 | PredictionTracker, surprise-driven learning |
telemetry.rs | 6 | BrainTelemetry, structured decision capture |
organism.rs | 3 | Universal Organism trait, Scale, Health |
signal.rs | 5 | BrainSignal, Activation, Modality |
edge.rs | 7 | Edge, EdgeKind, AdaptiveFilter |
substrat.rs | 4 | SubstratEncoder, BertSubstrat |
grammar/mod.rs | 2 | Grammar trait, Token types |
grammar/text.rs | 3 | TextGrammar |
executor.rs | 12 | SignalGraphExecutor, SignalTrace |
space_organism.rs | 16 | SpaceOrganism (impl Organism) |
lifecycle.rs | 6 | MitosisDetector, DecayPolicy |
llm_organism.rs | 6 | LlmOrganism |
wasm_organism.rs | 4 | WasmOrganism |
space_registry.rs | 11 | SpaceOrganismRegistry |
dream_scheduler.rs | 4 | DreamScheduler |
proto_convert.rs | 8 | Proto ↔ Rust conversions |
Proto definitions
| File | Lines | Purpose |
|---|---|---|
proto/organism.proto | 212 | gRPC schema (7 RPCs) |
Modified files
| File | Change |
|---|---|
Cargo.toml | fractal-brain = [] + grpc features |
lib.rs | #[cfg(feature = "fractal-brain")] pub mod brain; |
traits/embedder.rs | TokenActivation + embed_tokens() |
providers/embeddings.rs | embed_tokens_sync() in candle_impl |
providers/candle_embedder.rs | Override embed_tokens() |
providers/tokenizer.rs | reverse_vocab + id_to_token() |
events/types.rs | 7 brain-related events |
providers/rl_policy/state.rs | NeuronContext (6 dims) |
space.rs | edges, substrats, centroid fields |
Bench Infrastructure — Brain Development Toolkit
The bench CLI and dashboard provide the full toolkit for growing, testing, and analyzing brain performance. This is the primary development loop for AIMO competition work.
CLI Commands (bench brain)
| Command | Purpose |
|---|---|
bench brain report [--run ID] | Full run report (accuracy, recognition, execution paths, LLM savings) |
bench brain compare --runs A,B | Side-by-side comparison of two runs |
bench brain curve [--run ID] | ASCII learning curve (brain improvement over time) |
bench brain explain --run ID --problem PID | Single-problem decision trace |
bench brain substrats [--run ID] | Substrat topology breakdown |
Benchmark Runner
# Local run with brain telemetry
bench bench --model qwen2.5-math-1.5b --suite math-dataset --limit 500 \
--brain_db brain.db --snapshot_every 50 --dream
# Remote run with dashboard heartbeat
bench bench --model qwen2.5-math-7b --suite aime --limit 100 \
--dashboard_url http://dashboard:3939 --brain_db brain.db --dream
Docker Infrastructure
| File | Purpose |
|---|---|
bench/Dockerfile | Rust bench-cli build (2-stage, debian-slim) |
bench/Dockerfile.dashboard | Node.js dashboard + Rust API (3-stage) |
bench/docker-compose.bench.yml | Run benchmarks (8GB mem, 4 CPU) |
bench/docker-compose.coolify.yml | Production dashboard (Coolify-ready) |
bench/scripts/run-remote.sh | Remote runner script (auto model download) |
Dashboard Pages
- Brain — 4 stat cards (trees, substrats, method neurons, recognition rate) + learning curve chart + substrat accuracy chart + dream events table
- Runners — Remote runner monitoring (status, progress, brain topology, system metrics, 5s auto-refresh)
SQLite Brain Store (bench/cli/src/brain_store.rs, 859 lines)
Separate brain.db with 3 tables:
- `brain_events` — per-problem decision capture
- `brain_snapshots` — periodic topology snapshots
- `dream_events` — consolidation stats
Design Decisions
- Feature-gated: `fractal-brain` keeps the brain module optional — no impact on builds that don't need it
- Last BERT layer only: simpler, validated in Python. The Neuron struct is layer-aware for future multi-layer support
- Three store implementations: InMemory (testing), File (desktop/offline), SQLite (server/production)
- Graceful degradation: if `embed_tokens()` is not supported, NeuronRecallStrategy silently delegates to its fallback
- Content-addressable IDs: NeuronId = SHA-256 of the activation vector — deterministic, collision-resistant
- Safety bounds on executor: Prevents runaway signal propagation (depth, budget, fanout limits)
- Independent space learning: Each SpaceOrganism learns independently, cross-space edges are opt-in
- Pipeline fallback: SpaceOrganism falls back to Pipeline for Novel queries during transition period