How Waldo Works · Flow Diagrams · Whiteboard Companion

Waldo System Flow Atlas

Every load-bearing flow in the Waldo agent harness and interface layer, drawn end-to-end — from a wearable's overnight samples to a message that feels like "already on it." Built for walkthroughs, deep dives and whiteboarding.

00
Atlas index

Four passes over the same machine, each one level deeper: the product view, the harness internals, the body-and-memory substrate, and the interface & trust flows.

Fig | Flow | Grounded in
01 | A day with Waldo: the proactive rhythm | ADR-0017 · Brief/Patrol/Dreaming cadence
02 | Master request lifecycle: the whole machine, one pass | Architecture overview §6
03 | Initiation & KAIROS routing: why it costs ~nothing to be always-on | ADR-0017 · cost model v2
04 | The bounded ReAct loop | Harness plan · ADR-0008/0033
05 | Run journal: crash-resume state machine | ADR-0054
06 | Delivery consistency: the transactional outbox | ADR-0054 · ADR-0009
07 | Failure ladder · circuit breaker · spend degrade | ADR-0051 · harness plan
08 | Safety pipeline: the 9-event hook gauntlet | ADR-0032/0024/0033
09 | Multi-source reconciliation → CRS | ADR-0011 (+ amendment)
10 | Memory lifecycle: write · read · dream | ADR-0005/0006/0024/0031/0037
11 | Telegram, rich two-way: one brain on every surface | Architecture overview Flow E · ADR-0055
12 | The Handoff: Explore · Plan · Act with a human tap | ADR-0018 · ADR-0027 (token mechanic)
13 | Trigger concurrency inside the DO | Harness plan (two-level guard) · ADR-0014
14 | Timezone & travel: circadian intelligence on the move | Backend plan (tz detection)
15 | GDPR deletion: erasure across every store | ADR-0055

A
The product view

FIG 01 | A day with Waldo | product rhythm

The agent works while the user doesn't: it gets smarter at 2am, pre-computes the morning at 6:30, delivers the Brief in under 3 seconds at wake, and patrols quietly all day. Chat can interrupt anywhere — same brain, same memory.

flowchart LR
  N2["2:00 · Dreaming Mode<br/>consolidate memory<br/>pre-compute tomorrow"] --> S1["6:30 · pre-Brief sweep<br/>today.md refreshed"]
  S1 --> B1["7:00 · Morning Brief<br/>from checkpoint · under 3s<br/>in-app · never a push"]
  B1 --> P["all day · Patrol every 15 min<br/>pure compute · KAIROS skips ~80%"]
  P --> M["12:00 · midday Brief"]
  M --> F["afternoon · stress sustained 10 min<br/>Fetch candidate · shadow in V1"]
  F --> E["18:00 · evening Close"]
  E --> Q["night · sleep window<br/>alarms mostly skip"]
  Q --> N2
  C(["chat / Telegram · anytime"]) -.->|user_message| P
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  classDef act fill:#10241c,stroke:#34d399,color:#f3ece0;
  class N2 ev;
  class S1,P,Q neu;
  class B1,M,E act;
  class F,C core;

FIG 02 | Master request lifecycle | canonical

Surface to delivery, crossing the two trust zones. The hard invariant: only derived insight crosses from Supabase into the agent — raw HRV/HR/sleep never leave the data zone. Every side-effect exits through the journaled outbox, so a crash can never half-deliver.

flowchart TB
  W[Wearable] --> A[App / Telegram]
  A -->|encrypted sync · JWT| EF["Edge Functions<br/>sync + build-intelligence<br/>deterministic · zero LLM"]
  EF --> DB[("Postgres + RLS<br/>raw health stays here")]
  AL(["DO alarm · webhook · chat"]) --> K{"KAIROS<br/>tick-and-decide"}
  K -->|not worth it| Z(["hibernate · ~$0"])
  K -->|act| LOOP["runAgentLoop<br/>bounded ReAct ≤3"]
  DB -->|"derived insight only<br/>per-user RLS JWT"| LOOP
  LOOP <-->|provider-shaped| L[["LLM · AI Gateway"]]
  LOOP <-->|Zod · ACL · taint| T["Tool plane"]
  LOOP --> G["Quality gates"]
  G --> OB[("run journal + outbox")]
  OB -->|idempotent flush| CH["Channels<br/>in-app · APNs / FCM · Telegram"]
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class W,A,Z,CH neu;
  class EF,DB info;
  class K,LOOP,T,OB core;
  class G risk;
  class L ev;

FIG 03 | Initiation & KAIROS routing decision | canonical

Everything starts inside the per-user DO: alarms, webhooks, chat. Roughly 80% of Patrol ticks resolve as pure compute and are skipped by KAIROS; the model fires only when a branch resolves to a real trigger, and even then the pre-filter can answer with a free template. This is the cost-control heart, drawn as the decision it actually is.

flowchart TB
  AL(["DO alarm · every 15 min waking"]) --> RC["recompute CRS + stress sniff<br/>zero LLM"]
  UM(["user message · channel webhook"]) --> LOOP
  RC --> D1{"wake window or<br/>pre-Brief sweep?"}
  D1 -->|yes| BR["brief · variant"]
  D1 -->|no| D2{"stress ≥0.60 sustained 10 min<br/>cooldown ok · budget ok"}
  D2 -->|yes| FE["Fetch candidate<br/>shadow in V1 · gated + logged"]
  D2 -->|no| D3{"2am · 3+ unconsolidated<br/>or 48h since last run?"}
  D3 -->|yes| DM["dreaming_mode"]
  D3 -->|no| Z(["hibernate · ~$0"])
  BR --> PRE{"pre-filter<br/>Form>60 and calm?"}
  PRE -->|yes| TPL["template with real data<br/>no LLM · $0"]
  PRE -->|no| LOOP["runAgentLoop"]
  FE --> LOOP
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  classDef act fill:#10241c,stroke:#34d399,color:#f3ece0;
  class AL,UM,Z,TPL neu;
  class RC,LOOP core;
  class D1,D2,D3,PRE risk;
  class DM ev;
  class BR,FE act;

IN FLIGHT · The Fetch goes live (push) only after its measured false-positive rate clears a written bar on the validation cohort — until then it runs the full pipeline in shadow.
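
The routing decision in FIG 03 can be read as one pure function. The sketch below is illustrative only: the `TickSignals` shape, the `Route` names and `routeTick` are hypothetical, and the third branch assumes that "2am" is a precondition for both the 3+ unconsolidated check and the 48h check. The thresholds (stress ≥0.60 for 10 minutes, Form>60 and calm) are taken from the diagram.

```typescript
// Hypothetical signals available after the zero-LLM recompute step.
interface TickSignals {
  inWakeWindowOrSweep: boolean; // wake window or pre-Brief sweep
  stress: number;               // assumed 0..1 stress score
  stressSustainedMin: number;   // minutes the stress level has held
  cooldownOk: boolean;
  budgetOk: boolean;
  isDreamHour: boolean;         // the 2am local tick
  unconsolidated: number;       // memory entries awaiting consolidation
  hoursSinceLastDream: number;
  form: number;                 // Form score, assumed 0..100
  calm: boolean;
}

type Route = "brief_template" | "brief_llm" | "fetch_shadow" | "dreaming_mode" | "hibernate";

// Branch order mirrors the diagram: brief, then Fetch, then dreaming, else hibernate.
function routeTick(s: TickSignals): Route {
  if (s.inWakeWindowOrSweep) {
    // Pre-filter: a calm, high-Form morning is answered by a template, no LLM.
    return s.form > 60 && s.calm ? "brief_template" : "brief_llm";
  }
  if (s.stress >= 0.6 && s.stressSustainedMin >= 10 && s.cooldownOk && s.budgetOk) {
    return "fetch_shadow"; // shadow in V1: full pipeline, no push
  }
  if (s.isDreamHour && (s.unconsolidated >= 3 || s.hoursSinceLastDream >= 48)) {
    return "dreaming_mode";
  }
  return "hibernate";
}

// A quiet mid-afternoon tick: nothing to do.
const quietTick: TickSignals = {
  inWakeWindowOrSweep: false, stress: 0.2, stressSustainedMin: 0,
  cooldownOk: true, budgetOk: true, isDreamHour: false,
  unconsolidated: 0, hoursSinceLastDream: 5, form: 72, calm: true,
};
console.log(routeTick(quietTick));
```

Keeping the decision pure (no I/O, no model call) is what lets the common path cost nothing: only the `brief_llm` and `fetch_shadow` results ever reach `runAgentLoop`.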

B
The agent harness, under the hood

FIG 04 | The bounded ReAct loop | canonical

Trust is reset on every wake — an injection in an earlier session can't carry permissions forward. The loop is bounded three ways: max 3 tool iterations, a diminishing-returns guard, and a circuit breaker on provider failures.

sequenceDiagram
  autonumber
  participant DO as DO loop
  participant LLM as Model
  participant T as Tool plane
  participant CH as Channel
  DO->>DO: session trust reset · emergency scan · pre-filter
  DO->>DO: recall(ctx) · build REASONS prompt (frozen snapshot)
  loop max 3 iterations
    DO->>LLM: prompt + tool schemas
    LLM->>T: tool_use (Zod parse · trigger ACL · autonomy + taint gates)
    T-->>LLM: sanitised result, capped ~500 tok
    Note over DO: guards — 3-iteration cap · diminishing returns · circuit breaker
  end
  LLM-->>DO: candidate message
  DO->>DO: quality gates (medical scrub · hallucination · canary)
  DO->>CH: deliver via outbox (push budget · idempotency key)
  DO->>DO: write episode · trace · memory intents
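
The three bounds on the loop can be sketched as a small driver. This is a minimal illustration, not the harness code: `runBoundedLoop`, `StepResult` and the `newFacts` counter are hypothetical, and "diminishing returns" is assumed here to mean a tool round that adds no new information, which the source does not define.

```typescript
// One model turn either yields a final message or a tool round.
type StepResult =
  | { kind: "final"; message: string }
  | { kind: "tool"; newFacts: number }; // newFacts: assumed measure of added information

interface LoopOutcome {
  message: string | null;
  iterations: number;
  stoppedBy: "final" | "iteration_cap" | "diminishing_returns" | "circuit_open";
}

function runBoundedLoop(
  step: (iteration: number) => StepResult,
  breakerOpen: () => boolean,
  maxIterations = 3, // the 3-iteration cap from the diagram
): LoopOutcome {
  for (let i = 1; i <= maxIterations; i++) {
    // Guard 1: stop calling the provider while the circuit breaker is open.
    if (breakerOpen()) return { message: null, iterations: i - 1, stoppedBy: "circuit_open" };
    const result = step(i);
    if (result.kind === "final") return { message: result.message, iterations: i, stoppedBy: "final" };
    // Guard 2: a tool round that learned nothing new ends the loop early.
    if (result.newFacts === 0) return { message: null, iterations: i, stoppedBy: "diminishing_returns" };
  }
  // Guard 3: hard cap reached.
  return { message: null, iterations: maxIterations, stoppedBy: "iteration_cap" };
}
```

When the outcome carries no message, the caller decides what happens next (for example, stepping down the failure ladder); that choice is outside this sketch.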
  
FIG 05 | Run journal: crash-resume (ADR-0054) | canonical

Cloudflare may evict a DO mid-loop at any time. Every state transition is committed as a single DO SQLite transaction, so the next wake calls tick(runId) and resumes from the last committed state: the run never goes silently dark and never double-sends.

stateDiagram-v2
  [*] --> PENDING : startRun(trigger · variant · nonce)
  PENDING --> CONTEXT_BUILT : recall + prompt committed
  CONTEXT_BUILT --> LLM_CALLED : model response committed
  LLM_CALLED --> TOOLS_DONE : tool batch committed
  TOOLS_DONE --> GATED : safety gates passed
  GATED --> DELIVERED : outbox flushed (idempotent)
  DELIVERED --> DONE : episode + trace written
  LLM_CALLED --> FAILED : retries exhausted → fallback ladder
  FAILED --> [*]
  DONE --> [*]
  note right of GATED
    journal + outbox persist together.
    DO evicted anywhere? alarm refires,
    tick(runId) resumes from the last
    committed state - outbox dedupes.
  end note
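
The resume behaviour above can be sketched with an in-memory journal. This is an illustration under stated assumptions: `RunJournal` and its `Map` stand in for the DO SQLite table, the `work` callback is hypothetical, and the `LLM_CALLED --> FAILED` branch is left out for brevity. The state names and the `startRun` / `tick(runId)` vocabulary come from the diagram.

```typescript
type RunState =
  | "PENDING" | "CONTEXT_BUILT" | "LLM_CALLED" | "TOOLS_DONE"
  | "GATED" | "DELIVERED" | "DONE" | "FAILED";

// Happy-path transitions from the state diagram.
const NEXT: Partial<Record<RunState, RunState>> = {
  PENDING: "CONTEXT_BUILT",
  CONTEXT_BUILT: "LLM_CALLED",
  LLM_CALLED: "TOOLS_DONE",
  TOOLS_DONE: "GATED",
  GATED: "DELIVERED",
  DELIVERED: "DONE",
};

class RunJournal {
  // Stand-in for the committed journal rows (assumption: one row per run).
  private states = new Map<string, RunState>();

  startRun(runId: string): void {
    if (!this.states.has(runId)) this.states.set(runId, "PENDING");
  }

  stateOf(runId: string): RunState | undefined {
    return this.states.get(runId);
  }

  // Performs the work for one transition, then commits the new state.
  // If work throws (a simulated eviction), nothing is committed and the
  // next tick retries the same transition.
  tick(runId: string, work: (from: RunState, to: RunState) => void): RunState {
    const from = this.states.get(runId);
    if (from === undefined) throw new Error(`unknown run ${runId}`);
    const to = NEXT[from];
    if (to === undefined) return from; // DONE and FAILED are terminal
    work(from, to);
    this.states.set(runId, to);
    return to;
  }
}
```

Because a retried transition may repeat its side effects, this only stays safe when those side effects go through a deduplicating outbox, which is the role FIG 06 describes.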
  
FIG 06 | Delivery consistency: the transactional outbox | canonical

The write fan-out is where a crash would leave stores inconsistent. The outbox makes each side-effect independently retryable; the idempotency key makes every retry safe.

flowchart TB
  C["candidate msg + state delta"] --> J[("run journal<br/>DO SQLite · single txn")]
  J --> OB[("outbox rows<br/>each with idempotency_key")]
  OB --> P1["push · APNs / FCM / Telegram"]
  OB --> P2["memory intents -> Scribe inbox"]
  OB --> P3["trace -> agent_logs"]
  OB --> P4["scores / feedback -> Supabase"]
  P1 -->|ack| M["mark sent"]
  P1 -.->|crash before ack| RT["retry next tick<br/>idempotency_key dedups"]
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef mem fill:#10241c,stroke:#34d399,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  class C,J,OB core;
  class P1 neu;
  class P2 mem;
  class P3 ev;
  class P4 info;
  class M neu;
  class RT risk;

LOCKED · The key is content-derived — hash(user · trigger · variant · run_nonce) (ADR-0054). A retry that crosses a time boundary keeps its key; two distinct same-hour runs never collide. Rate-limiting lives separately, in the DO push budget (ADR-0009).
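
A content-derived key of this shape can be sketched as follows. The hash here is FNV-1a, used purely as a dependency-free stand-in: the source does not say which hash ADR-0054 specifies, and a production key would more plausibly use a cryptographic digest. `idempotencyKey`, `Outbox` and the field separator are hypothetical names and choices.

```typescript
// FNV-1a (32-bit), a stand-in hash for illustration only.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

// The key depends only on content, never on the clock, so a retry that
// crosses an hour or day boundary still produces the same key.
function idempotencyKey(user: string, trigger: string, variant: string, runNonce: string): string {
  return fnv1a([user, trigger, variant, runNonce].join("\u001f"));
}

class Outbox {
  // Assumption: wherever dedupe really happens, it behaves like this set.
  private sent = new Set<string>();
  deliveries: string[] = [];

  // Returns true if the payload was delivered, false if it was a duplicate.
  flush(key: string, payload: string): boolean {
    if (this.sent.has(key)) return false;
    this.deliveries.push(payload);
    this.sent.add(key);
    return true;
  }
}
```

Two runs in the same hour differ in `run_nonce`, so they get different keys, while a retry of one run reuses its nonce and is dropped as a duplicate.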

FIG 07 | Failure ladder · circuit breaker · spend degrade | canonical

The agent never "fails" — it degrades, level by level, and the user never sees an error. A spend-cap hit degrades to the cheap model, never to a template, and is never logged as an attack — normal heavy use is not abuse (ADR-0051).

flowchart TB
  L1[L1 · primary model · full context] -->|ok| OUT([deliver])
  L1 -->|fail| L2[L2 · primary model · reduced context]
  L2 -->|fail| L3[L3 · template with real data · no LLM]
  L3 -->|fail| L4[L4 · silent · log · retry next alarm]
  CB{"circuit breaker<br/>3 consecutive fails → OPEN<br/>2 successes → CLOSED"} -->|open| L3
  GW{"AI Gateway down?"} -->|yes| L3
  SC{"spend cap hit?"} -->|yes| GE["degrade to cheap model<br/>not template · never flagged as abuse"]
  GE --> OUT
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  class L1,L2,GE core;
  class L3,L4,OUT neu;
  class CB,GW,SC risk;

IN FLIGHT · The cap threshold is a pricing decision (a power user can cost more than a Pro tier earns) — the mechanism above is locked; the number is the open founder call (ADR-0051).
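
The breaker thresholds and the ladder order can be sketched together. This is a hedged illustration: `CircuitBreaker`, `startLevel` and `runLadder` are hypothetical names, the sketch assumes successes observed while the breaker is open come from probe calls, and it assumes a failure resets the success count. The spend-cap branch is omitted.

```typescript
class CircuitBreaker {
  private consecutiveFails = 0;
  private successesWhileOpen = 0;
  private open = false;

  isOpen(): boolean {
    return this.open;
  }

  recordFailure(): void {
    this.successesWhileOpen = 0;
    this.consecutiveFails++;
    if (this.consecutiveFails >= 3) this.open = true; // 3 consecutive fails → OPEN
  }

  recordSuccess(): void {
    this.consecutiveFails = 0;
    if (!this.open) return;
    this.successesWhileOpen++;
    if (this.successesWhileOpen >= 2) { // 2 successes → CLOSED
      this.open = false;
      this.successesWhileOpen = 0;
    }
  }
}

type Level = "L1" | "L2" | "L3" | "L4";
const LADDER: Level[] = ["L1", "L2", "L3", "L4"];

// An open breaker or a gateway outage skips straight to the template level.
function startLevel(breaker: CircuitBreaker, gatewayDown: boolean): Level {
  return breaker.isOpen() || gatewayDown ? "L3" : "L1";
}

// Tries each level in order; L4 (silent, log, retry next alarm) always terminates.
function runLadder(start: Level, attempt: (level: Level) => boolean): Level {
  for (const level of LADDER.slice(LADDER.indexOf(start))) {
    if (level === "L4" || attempt(level)) return level;
  }
  return "L4";
}
```

The returned level is the one that produced the outcome, so the caller can log how far a run degraded without surfacing anything to the user.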

FIG 08 | Safety pipeline: the 9-event hook gauntlet | canonical

Nine deterministic hook events wrap every invocation (ADR-0032) — this flow shows the gauntlet a message runs. Emergency detection fires before the model and bypasses the whole loop. A canary leak terminates the session. All of it is code, none of it is prompt instruction.

flowchart TB
  IN["input / trigger"] --> STR["session trust reset<br/>fresh ACL · fresh canaries"]
  STR --> ED{"emergency patterns?"}
  ED -->|yes| SAFE["safe response + hotlines<br/>no LLM · terminate · log"]
  ED -->|no| PF{"pre-filter calm?"}
  PF -->|yes| TPL["template"]
  PF -->|no| H1["pre-LLM hooks<br/>fence memory · inject canaries · budget check"]
  H1 --> LLM[["LLM · pre/post-tool hooks<br/>ACL · autonomy · taint · sanitise"]]
  LLM --> QG{"quality gates<br/>medical scrub · hallucination · canary"}
  QG -->|canary leak| KILL["terminate session · log incident"]
  QG -->|ok| PB{"push budget ok?"}
  PB -->|yes| SEND(["deliver"])
  PB -->|no| HOLD["hold · in-app only"]
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class IN,TPL,SEND,HOLD neu;
  class STR,ED,PF,QG,PB,SAFE,KILL risk;
  class H1 core;
  class LLM ev;

C
The body & memory substrate

FIG 09 | Multi-source reconciliation → CRS | body substrate

Per-metric confidence is authoritative for conflicts (device priority is only the tiebreaker), and baselines are kept per source so switching devices never silently re-baselines the user. The formula is locked, citable science — pure math, zero LLM.

flowchart TB
  AW[Apple Watch · 1.0] --> R
  OU[Oura · 0.85] --> R
  WH[WHOOP · 0.8] --> R
  SA[Samsung proxy · 0.6] --> R
  R{"per-metric confidence rank<br/>priority list = tiebreaker only"} --> HD[("health_daily<br/>primary_source + contributing_sources")]
  R --> BG["per-source baselines<br/>no silent re-baseline on switch"]
  HD --> BI["build-intelligence EF · zero LLM"]
  BG --> BI
  BI --> FORM["Form = Sleep ×0.50 + HRV ×0.35<br/>+ Circadian ×0.075 + Motion ×0.075<br/>SAFTE-FAST grounded · 856-day validated"]
  FORM --> CRS[("crs_scores<br/>zone + pillar drag")]
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef act fill:#10241c,stroke:#34d399,color:#f3ece0;
  class AW,OU,WH,SA,HD,CRS info;
  class BI,FORM core;
  class R risk;
  class BG act;

IN FLIGHT · Device capability is honest by design: WHOOP/Oura expose overnight data only — intraday stress detection needs Apple Watch / Health Connect. Onboarding states this per device.
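
The conflict rule and the Form formula from FIG 09 can be sketched in a few lines. Two assumptions are made and should be read as such: the numbers next to each device in the diagram (1.0, 0.85, 0.8, 0.6) are treated as the device-priority list, and each pillar is assumed to be a 0..100 score. `Reading`, `pickPrimary` and `computeForm` are hypothetical names.

```typescript
interface Reading {
  source: string;
  value: number;
  confidence: number; // per-metric confidence, authoritative for conflicts
}

// Assumed reading of the diagram: these values are device priority, tiebreaker only.
const DEVICE_PRIORITY: Record<string, number> = {
  "Apple Watch": 1.0,
  "Oura": 0.85,
  "WHOOP": 0.8,
  "Samsung proxy": 0.6,
};

// Highest per-metric confidence wins; device priority breaks exact ties.
function pickPrimary(readings: Reading[]): Reading {
  if (readings.length === 0) throw new Error("no readings to reconcile");
  return readings.reduce((best, r) => {
    if (r.confidence !== best.confidence) return r.confidence > best.confidence ? r : best;
    return (DEVICE_PRIORITY[r.source] ?? 0) > (DEVICE_PRIORITY[best.source] ?? 0) ? r : best;
  });
}

interface Pillars {
  sleep: number;
  hrv: number;
  circadian: number;
  motion: number;
}

// Form = Sleep ×0.50 + HRV ×0.35 + Circadian ×0.075 + Motion ×0.075 (weights sum to 1).
function computeForm(p: Pillars): number {
  return p.sleep * 0.5 + p.hrv * 0.35 + p.circadian * 0.075 + p.motion * 0.075;
}
```

For example, pillars of 80, 60, 100 and 40 give 40 + 21 + 7.5 + 3 = 71.5. Per-source baselines are not modelled here; the sketch only covers source selection and the weighted sum.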

FIG 10 | Memory lifecycle: write · read · dream | memory substrate

Nothing writes the brain directly: writes are sanitised at write time and staged in the Scribe inbox; reads union committed halls with the sanitised pending entries, so same-day learning is visible before the 2am merge — no amnesia, no poisoning window.

flowchart TB
  TC["tool / agent observation"] --> SAN["sanitise at write<br/>5 checks · taint label · per-hall ACL"]
  SAN --> INB[("memory_inbox · staged")]
  Q["recall ctx · every invocation"] --> U{"union read:<br/>committed halls ∪ sanitised pending"}
  INB -.->|"same-day visible<br/>trust = provisional"| U
  U --> RRF["BM25 + recency · RRF k=60<br/>confidence + salience boosts"]
  RRF --> BLK["fenced recall block<br/>marked NOT instructions"]
  INB --> MRG["2am Scribe merge -> halls<br/>pattern_id dedupe / supersede<br/>bi-temporal · never DELETE"]
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef mem fill:#10241c,stroke:#34d399,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class TC,Q,BLK core;
  class SAN risk;
  class INB,U,RRF mem;
  class MRG ev;
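
The "RRF k=60" step in the read path is reciprocal rank fusion: each ranked list contributes 1 / (k + rank) to an item's score. The sketch below shows only that fusion over two rankings (for example a BM25 ranking and a recency ranking); the confidence and salience boosts are not modelled, and `rrfFuse` and the memory ids are illustrative names.

```typescript
// Reciprocal rank fusion: score(id) = sum over rankings of 1 / (k + rank),
// where rank is 1-based. Items missing from a ranking contribute nothing for it.
function rrfFuse(rankings: string[][], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical rankings over the union of committed and pending memories.
const bm25Ranking = ["m1", "m2", "m3"];
const recencyRanking = ["m3", "m1", "m4"];
const fused = rrfFuse([bm25Ranking, recencyRanking]);
console.log(fused.map((entry) => entry.id).join(","));
```

Here `m1` (ranks 1 and 2) edges out `m3` (ranks 3 and 1), and both beat items that appear in only one list. A large `k` such as 60 flattens the difference between adjacent ranks, so agreement across lists matters more than a single top position.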

D
The interface layer & trust flows

FIG 11 | Telegram, rich two-way: one brain on every surface | interface layer

Telegram is not a second agent — it's a channel adapter into the same per-user DO. Unknown senders are silently dropped; linking uses a one-time deep-link token; and a Telegram conversation lands in the same thread the app shows, with the same memory and approvals.

flowchart TB
  TU[Telegram user] -->|message · voice · button tap| WH[webhooks EF router]
  WH --> AL{"telegram_user_id<br/>on allowlist?"}
  AL -->|no| DROP(["silent drop · always 200"])
  AL -->|link flow| OTT["one-time deep-link token<br/>10 min · single use"] --> LINK["account linked"]
  AL -->|yes| NRM["normalize event<br/>text · voice -> Whisper transcript · callback"]
  NRM --> DOX["per-user DO<br/>same brain · same memory"]
  DOX --> THR[("canonical thread store<br/>shared with in-app Chat")]
  DOX --> RSP["reply · channel-shaped<br/>concise prose + inline buttons"]
  RSP --> OBX[("outbox · idempotent flush")]
  OBX --> TU
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ifc fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class TU,DROP neu;
  class WH,OTT,LINK info;
  class DOX,OBX core;
  class AL risk;
  class NRM,THR,RSP ifc;

BY DESIGN · Health-derived content on Telegram is consent-gated, never carries raw biometric values, and its cloud-chat residual is documented in the deletion runbook (ADR-0055).
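
The allowlist check and the link flow can be sketched as a small router. This is an illustration under assumptions: `LinkTokens` and `TelegramRouter` are hypothetical, the in-memory maps stand in for real storage, and the token is assumed to arrive as the payload of a `/start <token>` message, which is how Telegram deep links hand a parameter to a bot. The 10-minute, single-use rule and the always-200 response come from the diagram.

```typescript
const LINK_TTL_MS = 10 * 60 * 1000; // 10 min

class LinkTokens {
  private pending = new Map<string, { userId: string; expiresAt: number }>();

  issue(token: string, userId: string, nowMs: number): void {
    this.pending.set(token, { userId, expiresAt: nowMs + LINK_TTL_MS });
  }

  // Single use: the token is removed on first lookup, valid or not.
  consume(token: string, nowMs: number): string | null {
    const entry = this.pending.get(token);
    if (!entry) return null;
    this.pending.delete(token);
    return nowMs < entry.expiresAt ? entry.userId : null;
  }
}

type RouteResult = { status: 200; action: "drop" | "linked" | "forward" };

class TelegramRouter {
  private allowlist = new Map<number, string>(); // telegram_user_id -> Waldo user id
  private tokens: LinkTokens;

  constructor(tokens: LinkTokens) {
    this.tokens = tokens;
  }

  // Every outcome answers 200, so an unknown sender learns nothing from the response.
  handle(telegramUserId: number, text: string, nowMs: number): RouteResult {
    if (this.allowlist.has(telegramUserId)) return { status: 200, action: "forward" };
    const match = /^\/start (\S+)$/.exec(text);
    const userId = match ? this.tokens.consume(match[1], nowMs) : null;
    if (userId !== null) {
      this.allowlist.set(telegramUserId, userId);
      return { status: 200, action: "linked" };
    }
    return { status: 200, action: "drop" };
  }
}
```

A `forward` result is where the event would be normalized and handed to the per-user DO; that step is outside this sketch.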

FIG 12 | The Handoff: Explore · Plan · Act, with a human tap | interface layer

Waldo's most powerful loop is also its most gated. Explore is read-only; Plan produces proposal cards; Act fires only after a signed human tap. The approval token is HMAC-signed and expires in 5 minutes — a prompt injection cannot fake the tap.

flowchart TB
  ST(["Handoff starts"]) --> EX["EXPLORE<br/>read-only tools · build context"]
  EX --> PL["PLAN<br/>propose_action -> proposal cards"]
  PL --> CARD["user sees card<br/>Do it · Modify · Not now"]
  CARD -->|tap approve| TOK{"HMAC approval token<br/>5-min expiry · server-validated"}
  CARD -->|2h no response| EXP(["auto-decline · expire"])
  TOK -->|valid| ACT["ACT<br/>execute_action · the only gated door"]
  TOK -->|invalid / expired| EXP
  ACT --> PE["Patrol entry + undo affordance"]
  AUT{"autonomy level"} -.->|L1 observe only| EX
  AUT -.->|L2 propose · default| PL
  AUT -.->|"L3 auto + undo<br/>except connector_write"| ACT
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef act fill:#10241c,stroke:#34d399,color:#f3ece0;
  classDef ifc fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class ST,EXP neu;
  class EX,PL core;
  class CARD,PE ifc;
  class TOK,AUT risk;
  class ACT act;
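
An HMAC-signed approval token with a 5-minute expiry can be sketched with Node's built-in `crypto` module. The token layout (`actionId.expiresAt.mac`), the function names and the choice of SHA-256 are assumptions for illustration; the actual mechanic is defined in ADR-0027, which this sketch does not claim to reproduce.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const APPROVAL_TTL_MS = 5 * 60 * 1000; // 5-min expiry

function sign(secret: string, payload: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Assumed layout: "<actionId>.<expiresAtMs>.<hex mac>"; actionId must not contain ".".
function issueApprovalToken(secret: string, actionId: string, nowMs: number): string {
  const payload = `${actionId}.${nowMs + APPROVAL_TTL_MS}`;
  return `${payload}.${sign(secret, payload)}`;
}

// Server-side validation: the signature, the bound action and the expiry must all hold.
function verifyApprovalToken(secret: string, token: string, actionId: string, nowMs: number): boolean {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [id, expiresAt, mac] = parts;
  const given = Buffer.from(mac, "hex");
  const expected = Buffer.from(sign(secret, `${id}.${expiresAt}`), "hex");
  // Constant-time comparison; lengths are checked first because timingSafeEqual requires it.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return false;
  if (id !== actionId) return false;
  return nowMs < Number(expiresAt);
}
```

Because the secret never enters the model's context, text produced by the model (or injected into it) cannot construct a token that passes `verifyApprovalToken`; only the server that rendered the card can.
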
FIG 13 | Trigger concurrency inside the DO | canonical

The DO is single-writer by construction. Overlapping triggers queue through the two-level guard (pendingInterrupt); proactive content arriving during a live chat merges into the open thread instead of pushing — Waldo never talks over itself.

flowchart TB
  A1[Patrol alarm] --> GATE
  A2[user message] --> GATE
  GATE{"DO input gate<br/>single-writer"}
  GATE -->|run active| Q["pendingInterrupt queue<br/>processed after current run"]
  GATE -->|idle| RUN["start run"]
  Q --> RUN
  RUN --> PC{"proactive content while<br/>a chat is live?"}
  PC -->|yes| SUPP["merge into the open thread<br/>notification dot · no push"]
  PC -->|no| SEND(["deliver normally"])
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef ifc fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  class A2,SEND neu;
  class A1,RUN core;
  class GATE,PC risk;
  class Q,SUPP ifc;
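
The single-writer gate can be sketched as a tiny queueing class. This is a simplified model: `InputGate` and `deliveryMode` are hypothetical names, and only the queue named `pendingInterrupt` in the diagram is modelled; the second level of the "two-level guard" is not described in enough detail here to sketch.

```typescript
type Trigger = { kind: "patrol_alarm" | "user_message"; id: string };

class InputGate {
  private active: Trigger | null = null;
  private pendingInterrupt: Trigger[] = [];
  started: string[] = []; // order in which runs actually started

  // A trigger starts immediately when idle, otherwise waits for the current run.
  submit(trigger: Trigger): "started" | "queued" {
    if (this.active !== null) {
      this.pendingInterrupt.push(trigger);
      return "queued";
    }
    this.start(trigger);
    return "started";
  }

  // Called when the current run completes; the next queued trigger takes over.
  finish(): void {
    this.active = null;
    const next = this.pendingInterrupt.shift();
    if (next !== undefined) this.start(next);
  }

  private start(trigger: Trigger): void {
    this.active = trigger;
    this.started.push(trigger.id);
  }
}

// Proactive content that lands during a live chat merges into the thread, with no push.
function deliveryMode(proactive: boolean, chatLive: boolean): "merge_into_thread" | "deliver_normally" {
  return proactive && chatLive ? "merge_into_thread" : "deliver_normally";
}
```

Only one run is ever active, so two triggers can never interleave their writes; ordering falls out of the queue rather than from locks.
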
FIG 14 | Timezone & travel: circadian intelligence on the move | canonical

A circadian product has to survive a red-eye. Timezone changes are detected on the next health sync and the DO reschedules every per-user alarm; scheduling is IANA-zone based so DST never silently shifts the wake alarm.

flowchart TB
  FLY["travel LAX -> LHR"] --> SY["next HealthKit sync<br/>X-Device-Timezone header"]
  SY --> CH{"tz changed?"}
  CH -->|no| OK(["no-op"])
  CH -->|yes| UP["update users.timezone<br/>IANA zone · DST-safe"]
  UP --> RS["reschedule signal -> DO"]
  RS --> AL["recompute wake · Patrol · 2am alarms<br/>in the new local time"]
  AL --> JL["recovery-day skill primes<br/>for 24-48h jet-lag window"]
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  classDef act fill:#10241c,stroke:#34d399,color:#f3ece0;
  class FLY,OK neu;
  class SY,UP info;
  class RS,AL core;
  class CH risk;
  class JL act;
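
The change detection and the DST-safety claim can be illustrated with the standard `Intl.DateTimeFormat` API. The function names are hypothetical, and the sketch assumes the `X-Device-Timezone` header carries an IANA zone name such as `Europe/London`. It shows why scheduling by zone name rather than by a stored UTC offset keeps a 07:00 wake at 07:00 across a DST change.

```typescript
// Intl.DateTimeFormat throws a RangeError for an unknown time zone name.
function isValidZone(zone: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: zone });
    return true;
  } catch (e) {
    return false;
  }
}

// Local hour (0..23) in the given IANA zone at a given instant.
function localHour(zone: string, at: Date): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: zone,
    hour: "2-digit",
    hourCycle: "h23",
  }).formatToParts(at);
  const hour = parts.find((part) => part.type === "hour");
  if (hour === undefined) throw new Error("no hour part in formatted date");
  return Number(hour.value);
}

// Decide whether a sync should trigger a reschedule signal to the DO.
function onSync(storedZone: string, headerZone: string): { zone: string; reschedule: boolean } {
  if (!isValidZone(headerZone) || headerZone === storedZone) {
    return { zone: storedZone, reschedule: false }; // no-op
  }
  return { zone: headerZone, reschedule: true };
}
```

In London, 07:00 local is 07:00 UTC in January but 06:00 UTC in July; resolving the alarm through the zone name picks the right instant in both cases, whereas a fixed offset would drift by an hour.
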
FIG 15 | GDPR deletion: erasure across every store | canonical · ADR-0055

An Article-9 product owes a complete answer to "delete me." The runbook enumerates every store that holds personal or derived-health data — including observability — with the Telegram cloud-chat residual handled honestly: disable, notify, and instruct.

flowchart TB
  DEL["deletion request<br/>in-app · Apple webhook"] --> ORCH["deletion orchestrator<br/>30-day ceiling"]
  ORCH --> S1["Supabase · FK cascade delete"]
  ORCH --> S2["DO SQLite · wake + purge tables"]
  ORCH --> S3["R2 · workspace + archives by prefix"]
  ORCH --> S4["AI Gateway logs<br/>payload logging off on health routes"]
  ORCH --> S5["observability · hashed IDs<br/>Langfuse / PostHog / Sentry"]
  ORCH --> S6["Telegram · disable + notice<br/>+ user-side delete instructions"]
  classDef neu fill:#1b1813,stroke:#322c23,color:#c9bfae;
  classDef core fill:#2a1d10,stroke:#f97316,color:#f3ece0;
  classDef info fill:#10202e,stroke:#60a5fa,color:#f3ece0;
  classDef mem fill:#10241c,stroke:#34d399,color:#f3ece0;
  classDef ev fill:#1e1430,stroke:#c084fc,color:#f3ece0;
  classDef risk fill:#2a1414,stroke:#f87171,color:#f3ece0;
  class DEL neu;
  class ORCH core;
  class S1 info;
  class S2 core;
  class S3 mem;
  class S4,S5 ev;
  class S6 risk;

IN FLIGHT · Per-store latency commitments and the consent/DPIA wording are being finalized (ADR-0055, proposed) — the store enumeration itself is locked.
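
The fan-out can be sketched as a retryable job over the enumerated stores. Everything here is an assumption layered on the diagram: the `StoreName` labels, `runPass`, `jobStatus` and the boolean `purge` callback are hypothetical, and for Telegram "purged" means the disable, notice and instructions step, since the cloud-chat residual cannot be erased server-side.

```typescript
type StoreName =
  | "supabase" | "do_sqlite" | "r2"
  | "ai_gateway_logs" | "observability" | "telegram";

// The store enumeration from the diagram, in fan-out order.
const STORES: StoreName[] = [
  "supabase", "do_sqlite", "r2", "ai_gateway_logs", "observability", "telegram",
];

const CEILING_MS = 30 * 24 * 60 * 60 * 1000; // 30-day ceiling

interface DeletionJob {
  requestedAt: number;
  done: Set<StoreName>;
}

// One orchestrator pass: attempt every store not yet purged and return what remains.
// Stores already marked done are never purged twice.
function runPass(job: DeletionJob, purge: (store: StoreName) => boolean): StoreName[] {
  for (const store of STORES) {
    if (!job.done.has(store) && purge(store)) job.done.add(store);
  }
  return STORES.filter((store) => !job.done.has(store));
}

function jobStatus(job: DeletionJob, nowMs: number): "complete" | "in_progress" | "overdue" {
  if (job.done.size === STORES.length) return "complete";
  return nowMs - job.requestedAt > CEILING_MS ? "overdue" : "in_progress";
}
```

Tracking completion per store means a single failing store (for example an R2 prefix delete that times out) is retried on the next pass without repeating the purges that already succeeded.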