8O — Brain System Reader’s Guide
Purpose: how to read the brain. Not what it is — that’s 8L/8M/8N. This is the field manual: every number, panel, broadcast, and arc decoded.
Use this when:
- You see a broadcast and want to know if it matters
- The dashboard shows a number and you need to know if it’s bad
- You want to verify the brain is actually alive, not just claim it
- Something looks wrong and you need to trace it to a source of truth
The Mental Model (30 seconds)
The brain is five cooperating systems:
- Clusters (vault): arcs = narrative units of work. One arc = one theme, accumulated over time. Stored as markdown at <vault>/clusters/YYYY-MM-DD/<slug>.md on shared storage.
- Ticks (cron): every 2h at HH:07 UTC, brain-ops scans arcs, computes heat, writes brain-feed/*.md, tombstones stale signals.
- Brain-feed (outbox): plaintext summary files (hot-arcs.md, signals.md, inject.md, intent.md, last-tick-diff.md, health.jsonl). Synced to every machine via rsync every 5min.
- Agentihooks (injection): on UserPromptSubmit, reads brain-feed/*.md, injects as BROADCAST blocks into Claude’s context. Emits OTel spans per inject + delivery + marker-write.
- Brain-keeper (ops agent): first-class agent at brain-keeper.<your-namespace>.svc:8200. Runs triage, heal, replay, test via LiteLLM model brain-keeper.
Data flows: sessions → clusters → tick → brain-feed → hooks → broadcast → new sessions (+ markers flow back).
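Because new arcs only reach the broadcast after a tick, it helps to know when the next one is due. The sketch below computes that from the schedule stated above (every 2h at HH:07 UTC). It assumes ticks land on even hours, as a cron expression like `7 */2 * * *` would produce; the actual CronJob schedule is not shown in this guide, so check it before relying on the result. `next_tick` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

TICK_MINUTE = 7          # from the guide: ticks fire at HH:07 UTC
TICK_INTERVAL_HOURS = 2  # from the guide: every 2h


def next_tick(now: datetime) -> datetime:
    """Return the first tick strictly after `now`.

    Assumption: ticks fall on even UTC hours (00:07, 02:07, ...).
    `now` should be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(minute=TICK_MINUTE, second=0, microsecond=0)
    # Snap back to the even hour at or before the current hour.
    candidate -= timedelta(hours=candidate.hour % TICK_INTERVAL_HOURS)
    while candidate <= now:
        candidate += timedelta(hours=TICK_INTERVAL_HOURS)
    return candidate
```

Calling `next_tick(datetime.now(timezone.utc))` gives the earliest moment a freshly written arc could show up in hot-arcs.md without a manual tick.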
Reading the Broadcast
Every session starts with one or two BROADCAST blocks. They look like:
╔══════════════════════════════════════════════════╗
║ === BROADCAST [ALERT] === ║
║ From: brain-adapter ║
║ [Active Signals] ║
║ - **[nuclear]** (auth-broker) - ... ║
║ Expires: 2026-04-15T20:22:21Z ║
╚══════════════════════════════════════════════════╝
Severity (in brackets):
- [ALERT] → contains signals you should know about. Acknowledge.
- [INFO] → hot arcs, historical context. Background awareness.
Signal severity (inline):
- [nuclear] → FLEET HALT territory. Auth broken, data loss, security breach. Drop current task if unrelated.
- [critical] → service degraded. Operator attention needed. Don’t ignore.
- [warning] → drift, approaching threshold. Acknowledge in context.
- [info] → FYI. Not actionable by itself.
- [resolved] → previously raised, now fixed. Will auto-tombstone next tick.
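For scripted triage, the block layout shown above can be picked apart with a few regular expressions. This is a minimal sketch that assumes the text format matches the example exactly (header line, From, bulleted signals, Expires); the real emitter may differ, and `parse_broadcast`, `Broadcast`, and `ACTION` are hypothetical names, not part of agentihooks.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple

HEADER_RE = re.compile(r"=== BROADCAST \[(\w+)\] ===")
FROM_RE = re.compile(r"From:\s*(\S+)")
EXPIRES_RE = re.compile(r"Expires:\s*(\S+)")
SIGNAL_RE = re.compile(r"-\s+\*\*\[(\w+)\]\*\*\s+\(([^)]+)\)")

# Reading guidance paraphrased from the severity list above.
ACTION = {
    "nuclear": "drop current task if unrelated",
    "critical": "operator attention needed",
    "warning": "acknowledge in context",
    "info": "FYI only",
    "resolved": "will auto-tombstone next tick",
}


@dataclass
class Broadcast:
    level: Optional[str]
    sender: Optional[str]
    expires: Optional[datetime]
    signals: List[Tuple[str, str]] = field(default_factory=list)  # (severity, source)


def _first(pattern: "re.Pattern[str]", text: str) -> Optional[str]:
    match = pattern.search(text)
    return match.group(1) if match else None


def parse_broadcast(text: str) -> Broadcast:
    """Extract level, sender, expiry, and signals from one BROADCAST block."""
    raw_expires = _first(EXPIRES_RE, text)
    expires = (
        datetime.fromisoformat(raw_expires.replace("Z", "+00:00"))
        if raw_expires
        else None
    )
    return Broadcast(
        level=_first(HEADER_RE, text),
        sender=_first(FROM_RE, text),
        expires=expires,
        signals=SIGNAL_RE.findall(text),
    )
```

Looking up each parsed severity in `ACTION` turns a block into a short to-do list, which is usually all a reader needs from it.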
Hot arcs table columns:
- Arc → wikilink to vault file. Click path: /vault/clusters/<date>/<slug>.md
- Heat → 0-5. Higher = more recent activity. Decays 1 point per 2d after 2d age.
- Region → brain anatomy metaphor:
  - left-hemisphere = active/execution work (sessions, deploys)
  - right-hemisphere = strategy/reflection (plans, profiles)
  - pineal = scheduled/systemic (ticks, cron, observability)
  - frontal-lobe = decisions/hot priorities
  - bridge = cross-cutting (primitives, schemas)
  - amygdala = emergency signals only
- Status → active (in rotation), graduated (retired, below heat threshold).
Expires is the TTL. Broadcasts are re-evaluated every prompt.
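The Heat column's decay rule (0-5, minus 1 point per 2d after 2d age) can be sketched as a small function. This is one reading of that sentence, not the tick engine's code: it assumes a 2-day grace period with no decay, then one point lost per full 2 days beyond it, measured from the arc's last activity. `decayed_heat` is a hypothetical name.

```python
import math

MAX_HEAT = 5        # from the guide: heat is 0-5
GRACE_DAYS = 2      # assumption: no decay during the first 2 days
DAYS_PER_POINT = 2  # from the guide: 1 point per 2d


def decayed_heat(initial_heat: int, age_days: float) -> int:
    """Heat after `age_days` without activity, under the reading above."""
    if not 0 <= initial_heat <= MAX_HEAT:
        raise ValueError("heat must be between 0 and 5")
    if age_days <= GRACE_DAYS:
        return initial_heat
    lost = math.floor((age_days - GRACE_DAYS) / DAYS_PER_POINT)
    return max(0, initial_heat - lost)
```

Under this reading, an arc at heat 5 needs 12 days of silence to reach 0, which is a useful sanity check when an old arc still shows as hot.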
Reading the Dashboard
The kernel ships a starter Grafana dashboard at observability/brain-health.json — import it into your own Grafana instance, point it at the ClickHouse datasource the tick-engine writes to, and adapt as needed. Auth, hosting, and admin credentials are entirely your platform’s concern; the kernel doesn’t assume any particular secret store.
Panel groups
Frontal Lobe — Decisions & Hot Arcs
- Active Arcs → count of non-graduated arcs. Healthy: 15-30. >50 = noise buildup, triage due.
- Hot Arcs Written → count of heat≥2 arcs at last tick. Healthy: 5-15.
- Brain Activity (7d) → three series: arcs scanned, signals collected, lessons extracted per tick. Flat zero line = tick broken.
- Arc Mutations (24h) → heat changes + promotions + demotions. Healthy: 10-50. Zero = decay engine stalled.
Amygdala — Emergency Signals
- Signals (last tick) → 0 is good. >5 = either a real incident or noise buildup.
- Lessons (last tick) → count of @lesson markers harvested. Healthy: 20-100 depending on session volume.
- Heat Changes (last tick) → arcs whose heat moved. 0 on every tick = suspicious (decay curve should trigger something in 2+ days).
- Signals collected per tick (7d) → time series. Spikes = real incidents. Sustained high = noise drowning the signal.
Broadcast Cortex — Nervous System
- Injects / hour → rate of broadcast deliveries. Matches session volume. Zero + active sessions = OTel emission broken.
- Skip reasons (stacked) → dedup / throttle / cap suppressions. Healthy ratio: dedup majority, because broadcasts are re-evaluated on every prompt and the same one keeps coming up for delivery.
- Bytes injected, top sessions (1h) → which sessions pay the broadcast cost. Outliers = long sessions.
- Active broadcasts (live) → messages currently in the bus.
Pineal — Tick Circadian
- Health Score → AI rating 1-10. Written by tick’s AI reasoner. <5 = tick itself flagged issues (stale signals, arc pollution).
- Last Tick → timestamp. >3h old = cron stalled. Check the <your-ops-namespace>/brain-ops CronJob.
- Ticks Today → should be 12 (every 2h = 12/day). <10 = failures or suspension.
- Tick Duration (p50/p95) → p50 20-30s, p95 <60s. Sustained >60s = AI reasoner slow or vault bloated.
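The Pineal thresholds above translate directly into a checklist. The sketch below is illustrative only: the function name and arguments are hypothetical, the inputs are whatever you read off the panels, and the "Ticks Today" check is only meaningful late in the UTC day, since the count starts at zero each midnight.

```python
from typing import List


def pineal_findings(
    health_score: float,
    last_tick_age_hours: float,
    ticks_today: int,
    p95_seconds: float,
) -> List[str]:
    """Return a human-readable finding for each Pineal threshold that is breached."""
    findings = []
    if health_score < 5:
        findings.append("health score <5: the tick itself flagged issues")
    if last_tick_age_hours > 3:
        findings.append("last tick >3h old: cron may be stalled")
    if ticks_today < 10:
        # Only a real finding once the day is nearly over.
        findings.append("fewer than 10 ticks today: failures or suspension")
    if p95_seconds > 60:
        # The guide says "sustained"; a single reading is only a hint.
        findings.append("tick p95 >60s: AI reasoner slow or vault bloated")
    return findings
```

An empty list means nothing in this panel group needs attention.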
Hippocampus — Memory Markers
- Markers written (24h) → count of @lesson/@milestone/@decision/@signal emissions harvested from transcripts. Reflects how much genuine learning the fleet captured.
- Marker write latency p95 → should be <200ms. >1s = NFS contention or outbox saturation.
- Markers / hour → stacked by type. Lessons dominate during active work, milestones spike on block completions.
Hook Observability
- brain.inject p50/p95 → latency of broadcast injection. p50 <50ms, p95 <300ms. Higher = NFS slow or brain-feed bloated.
- Span emission by name (6h) → three series: brain.delivery, brain.inject, brain.marker_write. All zero = agentihooks OTel broken (this exact failure mode fixed 2026-04-15).
Empty panels rule
If ALL of Injects/hour, brain.inject p50/p95, and Span emission by name are flat zero → agentihooks isn’t emitting OTel spans. Don’t guess — query ClickHouse directly:
SELECT ServiceName, max(Timestamp), dateDiff('hour', max(Timestamp), now64(9)) as hours_ago
FROM otel.otel_traces
WHERE ServiceName='agentihooks'
GROUP BY ServiceName;
If hours_ago > 2 while agents are working, OTel pipeline is broken. See §7 in TELEMETRY.md.
Verifying the Brain is Actually Alive
Three independent signals must all agree:
Signal 1 — Pod health
kubectl get pods -A | grep -E "brain|amygdala"
Expect 4 brain-keeper + 1 amygdala, all Running. If any pod is in CrashLoopBackOff → run kubectl describe and kubectl logs on it.
Signal 2 — Last tick
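kubectl -n <your-ops-namespace> logs $(kubectl -n <your-ops-namespace> get pods -l job-name --sort-by=.metadata.creationTimestamp -o name | grep brain-ops | tail -1) --tail=60 | grep -E 'arcs_scanned|signals_|total_ms|dry_run'
Expect a non-empty value for each field. total_ms under 30000. dry_run: false on the live phase. Sorting by creation time and taking the last pod selects the most recent tick rather than an arbitrary one.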
Signal 3 — Span emission (last hour)
SELECT SpanName, count()
FROM otel.otel_traces
WHERE ServiceName='agentihooks' AND Timestamp > now64(9) - INTERVAL 1 HOUR
GROUP BY SpanName;
Expect brain.delivery > 50, brain.inject > 5, brain.marker_write > 0.
All three green → brain is alive. Two of three green → degraded, investigate. One or zero green → broken, read the corresponding subsystem doc.
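The three-signal rule above reduces to a count of green checks. A minimal sketch, using the span minimums stated for Signal 3; the function names are hypothetical, and the inputs for pods and tick are whatever you concluded from the kubectl output.

```python
from typing import Dict

# Minimum counts for the last hour, from Signal 3 above (strictly greater than).
SPAN_MINIMUMS = {
    "brain.delivery": 50,
    "brain.inject": 5,
    "brain.marker_write": 0,
}


def spans_green(counts: Dict[str, int]) -> bool:
    """True when every expected span name exceeds its minimum."""
    return all(counts.get(name, 0) > minimum for name, minimum in SPAN_MINIMUMS.items())


def liveness_verdict(pods_ok: bool, tick_ok: bool, spans_ok: bool) -> str:
    """Map the number of green signals to the verdict used in this guide."""
    green = sum([pods_ok, tick_ok, spans_ok])
    if green == 3:
        return "alive"
    if green == 2:
        return "degraded"
    return "broken"
```

Feeding `spans_green` the rows returned by the ClickHouse query, as a dict of SpanName to count, keeps the threshold logic in one place.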
Verifying a Specific Arc Landed
Operator asked: “I have a new ARC from today. How do I know it was registered?”
Three checks:
SLUG="<arc-slug>" # e.g. 2026-04-15-983af4cc-writer
DATE="$(date -u +%Y-%m-%d)"
# 1. File exists on vault
ssh <your-vault-host> "ls <your-vault-path>/clusters/$DATE/ | grep -F '$SLUG'"
# 2. Broadcast references it (means tick picked it up)
grep -l "$SLUG" ~/.agentihooks/brain-feed/hot-arcs.md
# 3. Tick health OK after creation
tail -1 ~/.agentihooks/brain-feed/health.jsonl
All three → arc is registered and flowing into broadcasts.
The Noise Problem (How to Spot It)
Tick health scores 5-6/10 mean the AI reasoner is finding structural issues. Most common:
| Complaint pattern | Root cause | Fix |
|---|---|---|
| “20+ single-session writer arcs” | Every session becomes its own arc | Run brain-keeper triage to merge by session UUID |
| “drill signals drowning real ones” | Stress-test signals not cleared | Confirm the stale-signal sweep is running (it filters signals >1d old, except nuclear/critical) |
| “nuclear X has no mitigation” | Active signal with no linked mitigation arc | Create mitigation arc OR mark signal resolved when fixed |
| “orphan arc” | Arc has no parent/sibling edges | Add parent or sibling frontmatter in the .md file |
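The stale-signal sweep mentioned in the table (drop signals older than 1 day, but never nuclear or critical ones) can be sketched as a filter. This is an illustration of the rule as described, not the tick engine's implementation; the signal dict shape (`severity`, `raised_at`) and the function name are assumptions, and `stale_days` mirrors the BRAIN_STALE_SIGNAL_DAYS setting.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

EXEMPT_SEVERITIES = {"nuclear", "critical"}  # never swept, per the table above


def sweep_stale(signals: List[Dict], now: datetime, stale_days: int = 1) -> List[Dict]:
    """Keep signals that are exempt or newer than the cutoff.

    Each signal is assumed to be a dict with 'severity' (str) and
    'raised_at' (timezone-aware datetime); both keys are hypothetical.
    """
    cutoff = now - timedelta(days=stale_days)
    return [
        s for s in signals
        if s["severity"] in EXEMPT_SEVERITIES or s["raised_at"] >= cutoff
    ]
```

If drill signals survive several ticks, comparing their age and severity against this rule shows whether the sweep is misconfigured or simply not running.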
The Seven Marker Types (Output Protocol)
When you (or any agent) emit these in your output, brain_writer_hook.py harvests them on session Stop:
Primary markers (emitted by agents in normal output):
<!-- @lesson -->
Specific technical insight that isn't in the docs.
<!-- @/lesson -->
<!-- @milestone status=done scope=X -->
A meaningful unit of work is complete.
<!-- @/milestone -->
<!-- @signal severity=warning source=Y -->
Something is broken/at-risk/resolved.
<!-- @/signal -->
<!-- @decision date=2026-04-15 -->
Architectural or design choice.
<!-- @/decision -->
Infrastructure markers (used by brain_keeper and brain_apply internally):
<!-- @hot priority=8 -->
Force an arc to stay hot regardless of natural decay.
<!-- @/hot -->
<!-- @edge target=arc-id type=sibling|causal|temporal -->
Explicit relationship between arcs.
<!-- @/edge -->
<!-- @inject ttl=3600 -->
Content to inject into brain-feed directly.
<!-- @/inject -->
- All markers are HTML comments → invisible in rendered markdown.
- Max 5 per session (hook enforces for primary markers).
- Hook scans transcript on Stop → writes to ~/.agentihooks/brain-outbox/.
- @milestone and @signal also XADD to Redis stream events:brain for real-time delivery.
- Next tick ingests outbox → becomes part of the arc narrative.
- Full spec: MARKERS.md (7 types with regex patterns and attributes).
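To make the harvesting step concrete, here is a minimal sketch of extracting primary markers from a transcript. The authoritative patterns live in MARKERS.md and are not reproduced in this guide, so the regex below is an approximation written from the examples above. How the hook applies the 5-marker cap is also not specified here; keeping the first five is an assumption, and `harvest` is a hypothetical name.

```python
import re
from typing import Dict, List

PRIMARY = ("lesson", "milestone", "signal", "decision")
MAX_PRIMARY_PER_SESSION = 5  # from the guide; enforcement details are assumed

# Opening comment with optional key=value attributes, body, matching close.
MARKER_RE = re.compile(
    r"<!--\s*@(?P<kind>\w+)(?P<attrs>[^>]*?)\s*-->"
    r"(?P<body>.*?)"
    r"<!--\s*@/(?P=kind)\s*-->",
    re.DOTALL,
)
ATTR_RE = re.compile(r"(\w+)=(\S+)")


def harvest(transcript: str) -> List[Dict]:
    """Return up to five primary markers found in `transcript`, in order."""
    found: List[Dict] = []
    for match in MARKER_RE.finditer(transcript):
        if match.group("kind") not in PRIMARY:
            continue  # infrastructure markers are ignored in this sketch
        found.append({
            "type": match.group("kind"),
            "attrs": dict(ATTR_RE.findall(match.group("attrs"))),
            "body": match.group("body").strip(),
        })
        if len(found) == MAX_PRIMARY_PER_SESSION:
            break
    return found
```

Running this over your own output before a session ends is a quick way to confirm that a marker is well-formed and has its closing tag.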
When Something Looks Wrong
| Symptom | First check |
|---|---|
| Broadcasts stop appearing | ~/.agentihooks/brain-feed/ file mtimes — rsync cron |
| Dashboard panels flat zero | OTel ClickHouse query (see Empty panels rule) |
| Signals piling up | Last tick log — is sweep running? BRAIN_STALE_SIGNAL_DAYS=1 env set? |
| Ticks stalled | kubectl -n <your-ops-namespace> get cronjob brain-ops — not suspended, last schedule recent |
| New arc not in hot list | Tick must run first. Wait until HH:07 UTC OR dispatch brain-keeper manually |
| Tick health score <5 | Read the reason field in health.jsonl — tells you exactly what’s polluting |
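For the last row of the table, a small reader for health.jsonl saves scrolling. The guide confirms the file holds one JSON line per tick and that entries carry a reason field; the name of the score field is not stated, so `score` below is a placeholder to adjust after inspecting a real line. Function names are hypothetical.

```python
import json
from typing import Optional


def latest_health(jsonl_text: str) -> Optional[dict]:
    """Parse the last non-empty line of health.jsonl content."""
    lines = [line for line in jsonl_text.splitlines() if line.strip()]
    return json.loads(lines[-1]) if lines else None


def needs_attention(entry: dict, score_key: str = "score", threshold: int = 5) -> bool:
    """True when the entry's score is below the guide's <5 threshold.

    `score_key` is an assumed field name; check your file's actual schema.
    """
    score = entry.get(score_key)
    return score is not None and score < threshold
```

A typical use is to read ~/.agentihooks/brain-feed/health.jsonl, pass its text to `latest_health`, and print the entry's reason field when `needs_attention` returns True.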
Invoking the Brain-Keeper
Three paths, use whichever:
LiteLLM model (fast, chat-style):
model: brain-keeper
AgentiBridge dispatch (parallel, durable):
mcp__tools-agent__agentibridge-run_agent
agent_id: brain-keeper-0 | brain-keeper-1
profile: brain-keeper
Direct HTTP (bypass LiteLLM):
POST http://<your-brain-keeper>.<your-namespace>.svc.cluster.local:8200/run
Commands (send as task prompt):
- test: 6-point self-diagnosis, uploads md+csv to miscellaneous/brain-keeper/
- triage: merge redundant arcs, graduate stale ones
- heal: 7-point drift audit
- replay <arc-slug>: re-execute an arc’s workflow
- tick: manually fire a brain tick now (bypass cron schedule)
- extract: run the day’s session extraction manually
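For the direct HTTP path, a request can be assembled with the standard library. The guide gives only the endpoint (POST to /run on port 8200) and says commands are sent as the task prompt; the JSON body shape is not documented here, so the `task` key below is a guess to verify against KEEPER.md. `build_run_request` is a hypothetical helper, and it only builds the request without sending it.

```python
import json
import urllib.request


def build_run_request(base_url: str, command: str) -> urllib.request.Request:
    """Build a POST request for the brain-keeper /run endpoint.

    Assumption: the body is JSON with the command under a "task" key.
    """
    payload = json.dumps({"task": command}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=...)` from inside the cluster would send it; choose the timeout with the longer commands such as triage in mind.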
Source of Truth Table
| Question | Authoritative source |
|---|---|
| What arcs exist? | <vault>/clusters/*/ on shared storage |
| What are current hot arcs? | /vault/brain-feed/hot-arcs.md (synced to ~/.agentihooks/brain-feed/) |
| What signals are active? | /vault/brain-feed/signals.md |
| When did tick X run? | ~/.agentihooks/brain-feed/ticks/YYYY-MM-DDTHH-MM-SSZ-ai-output.md |
| Tick health history | ~/.agentihooks/brain-feed/health.jsonl (one line per tick) |
| Span events | ClickHouse otel.otel_traces |
| Log events | ClickHouse otel.otel_logs |
| Causal trace graph | Langfuse <your-langfuse-host> |
Related Docs
- ARCHITECTURE.md: architecture, why the tick exists, hybrid deterministic+AI reasoning
- TELEMETRY.md: OTel span taxonomy, ClickHouse queries, troubleshooting
- KEEPER.md: brain-keeper agent internals, commands, report format
- MATURITY.md: maturity scorecard
Stress Testing
Reproducible smoke tests for the brain pipeline. Run these to verify the full read/write/broadcast path.
Test 1: Broadcast injection
dispatch_task("Report brain content in your context: Hot Arcs, Signals, BROADCAST, nuclear")
# Expected: agent reports 5+ BROADCAST blocks from brain-adapter
Test 2: Marker write path
dispatch_task("Emit: <!-- @lesson -->Test lesson<!-- @/lesson -->")
# After completion: ls ~/.agentihooks/brain-outbox/ → new .json file
Test 3: Overlay lifecycle
dispatch_task("python3 -c 'from scripts.overlay import overlay_add, overlay_remove; print(overlay_add(\"brain\")); print(overlay_remove(\"brain\"))'")
# Check broadcast.json for activate + deactivate messages
Test 4: Redis XADD
dispatch_task("Emit: <!-- @signal severity=info source=test -->Test<!-- @/signal -->")
# Check: docker exec <redis-container> redis-cli -n 11 XLEN events:brain
Test 5: Concurrent load
# Dispatch 3 agents in parallel, each emitting markers
# Verify: outbox has files from all 3, no conflicts (uuid filenames)
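Test 5 relies on uuid filenames to keep parallel writers from colliding. The sketch below shows why that works, using threads to stand in for agents. It is not the code in brain_writer_hook.py: the helper names are hypothetical, the demo writes to a temporary directory rather than the real outbox, and the write-then-rename step is an added precaution rather than something the guide describes.

```python
import json
import os
import tempfile
import uuid
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_outbox_entry(outbox: Path, marker: dict) -> Path:
    """Write one marker as <uuid>.json; unique names avoid writer conflicts."""
    outbox.mkdir(parents=True, exist_ok=True)
    final = outbox / f"{uuid.uuid4().hex}.json"
    tmp = outbox / (final.name + ".tmp")
    tmp.write_text(json.dumps(marker), encoding="utf-8")
    os.replace(tmp, final)  # rename so readers never see a half-written file
    return final


def demo(writers: int = 3, per_writer: int = 4) -> int:
    """Simulate parallel agents and return how many .json files landed."""
    with tempfile.TemporaryDirectory() as tmpdir:
        outbox = Path(tmpdir) / "brain-outbox"

        def work(agent: int) -> None:
            for n in range(per_writer):
                write_outbox_entry(outbox, {"agent": agent, "n": n})

        with ThreadPoolExecutor(max_workers=writers) as pool:
            list(pool.map(work, range(writers)))
        return len(list(outbox.glob("*.json")))
```

If the count after a real concurrent run is lower than the number of markers emitted, the loss is happening somewhere other than filename collisions.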
Test 6: Forensic transcript check
# get_session MCP only returns user+assistant turns, NOT hook attachments
# For hook forensics, read raw JSONL: ~/.claude/projects/<project-dir>/<session_id>.jsonl
# Check attachment entries with hookName=SessionStart for brain content
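The raw-JSONL check in Test 6 can be scripted without knowing the exact transcript schema. The sketch below walks each record recursively and returns those that contain a hookName key with the requested value anywhere inside. Only the hookName=SessionStart detail comes from this guide; the function names are hypothetical and nothing else about the record layout is assumed.

```python
import json
from typing import Any, Iterator


def _mentions_hook(node: Any, hook_name: str) -> bool:
    """Recursively look for {"hookName": hook_name} anywhere in a record."""
    if isinstance(node, dict):
        if node.get("hookName") == hook_name:
            return True
        return any(_mentions_hook(value, hook_name) for value in node.values())
    if isinstance(node, list):
        return any(_mentions_hook(value, hook_name) for value in node)
    return False


def iter_hook_entries(jsonl_text: str, hook_name: str = "SessionStart") -> Iterator[dict]:
    """Yield transcript records that reference the given hook."""
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partial lines
        if _mentions_hook(record, hook_name):
            yield record
```

Searching the yielded records for the word BROADCAST then tells you whether brain content was actually attached to the session.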
Common failure modes
| Symptom | Root cause | Fix |
|---|---|---|
| No broadcasts in dispatched agent | Missing .agentihooks.json at CWD | Create $HOME/.agentihooks.json with channels |
| BRAIN_ENABLED=false | Stale .pyc cache | find agentihooks -type d -name __pycache__ -prune -exec rm -rf {} + |
| brain_adapter publishes but broadcasts empty | SessionStart ordering wrong | brain_adapter BEFORE broadcast in hook_manager.py |
| Overlay blocked | Profile not in allowedOverlays | Add to profile.yml |
| Nested dispatch timeout | Chain of dispatches exceeds 300s | Simplify task or increase timeout |