Agents are becoming capable, but they are context-blind.
The next interface problem is not whether agents can use tools. It is whether normal people can safely use powerful agents with enough context to make good decisions.
Digital context is incomplete
Calendar, email, documents, and tasks explain what someone is doing. They do not explain whether that person is recovered, overloaded, sharp, or depleted.
Health is the first vertical
Wearables already collect the signal. Waldo turns that signal into useful action through readiness, memory, timing, and proactive delivery.
The infrastructure is horizontal
The same harness can later support engineering, legal, finance, coaching, or team workflows by swapping tools, policies, and domain skills.
Waldo feels like an agent that is already watching the right signals.
Morning context
Waldo opens the day with a read on sleep, recovery, workload, and the best window for hard work.
Stress intervention
When stress patterns appear, Waldo alerts the user gently, explains why, and suggests a small corrective action.
Day planning
Waldo can walk through the plan, propose schedule changes, and ask for approval before acting.
Trust trail
Every observation, action, and suggestion has an audit trail so the user can see what happened and why.
A body-data pipeline feeding a persistent personal agent.
Waldo separates deterministic health computation from agent judgement. Supabase stores and computes trusted data. Cloudflare Durable Objects run the per-user agent brain.
Deterministic first
Raw health values are normalized, scored, and confidence-weighted before anything reaches the model as narrative context.
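One way to picture the deterministic step is the minimal sketch below. It is not Waldo's actual readiness model: the metric names, reference ranges, score bands, and the `readinessScore` and `toNarrative` helpers are all assumptions made for illustration. It shows raw samples being normalized, combined with confidence weights, and reduced to a label so the model only ever receives narrative text.

```typescript
// Hypothetical sample shape; metric names and ranges are illustrative assumptions.
interface RawSample {
  metric: "hrv_ms" | "resting_hr_bpm" | "sleep_hours";
  value: number;
  confidence: number; // 0..1, lower when the reading is unreliable
}

interface Range { min: number; max: number; higherIsBetter: boolean }

const RANGES: Record<RawSample["metric"], Range> = {
  hrv_ms: { min: 20, max: 100, higherIsBetter: true },
  resting_hr_bpm: { min: 45, max: 90, higherIsBetter: false },
  sleep_hours: { min: 4, max: 9, higherIsBetter: true },
};

interface Readiness { score: number; confidence: number }

// Clamp the value into its reference range and map it to 0..1 (1 is best).
function normalize(sample: RawSample): number {
  const { min, max, higherIsBetter } = RANGES[sample.metric];
  const clamped = Math.min(max, Math.max(min, sample.value));
  const unit = (clamped - min) / (max - min);
  return higherIsBetter ? unit : 1 - unit;
}

// Confidence-weighted mean of the normalized samples, scaled to 0..100.
function readinessScore(samples: RawSample[]): Readiness | null {
  let weighted = 0;
  let totalWeight = 0;
  for (const s of samples) {
    const w = Math.min(1, Math.max(0, s.confidence));
    weighted += normalize(s) * w;
    totalWeight += w;
  }
  if (totalWeight === 0) return null; // nothing trustworthy to score
  return {
    score: Math.round((weighted / totalWeight) * 100),
    confidence: totalWeight / samples.length,
  };
}

// Only this label-level text is handed to the model, never the raw numbers.
function toNarrative(r: Readiness | null): string {
  if (r === null) return "Readiness is unknown; not enough reliable data.";
  const band = r.score >= 70 ? "high" : r.score >= 40 ? "moderate" : "low";
  const trust = r.confidence >= 0.75 ? "strong" : "limited";
  return `Readiness is ${band}; signal confidence is ${trust}.`;
}
```

Because the score is plain arithmetic, the same inputs always give the same label, which keeps this step testable independently of any model.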
Persistent per user
The agent has durable state, memory, alarms, traces, and scoped tool access across sessions.
Visible trust
The app renders agent-authored cards and threads while exposing memory, approvals, and audit history.
The model is not the product. The harness makes the model useful.
Waldo follows the modern agent-harness pattern: model intelligence wrapped in state, tools, context, orchestration, safety, memory, and verification.
| Harness layer | What Waldo owns | Why it matters |
|---|---|---|
| Context | Form, Recovery, Weight, schedule, active threads, memory snapshot, channel rules | The model reasons from the user's real state, not generic advice. |
| Orchestration | Durable Object alarms, bounded ReAct loop, Handoff continuation, provider routing | The agent can start, pause, resume, and complete long-running tasks. |
| Memory | Facts, events, discoveries, preferences, advice effectiveness, episodes | Waldo becomes more personal without stuffing every detail into prompt context. |
| Tools | Typed tool registry, per-trigger permissions, adapters, MCP, approvals | Capability expands without letting every trigger access every action. |
| Safety | Emergency detection, medical scrub, hallucination guard, canary checks, human approval | Health guidance and proactive actions require deterministic boundaries. |
| Evaluation | Trace replay, WIS, KeepRate, memory evals, cost gates | The system can improve based on evidence, not anecdotes. |
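The Tools row can be made concrete with a small sketch. This is an illustration under stated assumptions, not Waldo's registry: the trigger names, the tool names, and the `ToolRegistry` class are hypothetical. It shows how a registry might refuse a call when the trigger is not permitted or when a required approval is missing.

```typescript
// Hypothetical trigger names; the source does not list Waldo's real triggers.
type Trigger = "morning_brief" | "stress_alert" | "user_chat";

interface Tool<A, R> {
  name: string;
  allowedTriggers: readonly Trigger[];
  requiresApproval: boolean;
  run: (args: A) => R;
}

class ToolRegistry {
  // Stored with `any` so tools with different argument types can share one map.
  private tools = new Map<string, Tool<any, any>>();

  register<A, R>(tool: Tool<A, R>): void {
    if (this.tools.has(tool.name)) throw new Error(`duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  // Names of the tools a given trigger is allowed to see.
  listFor(trigger: Trigger): string[] {
    return Array.from(this.tools.values())
      .filter((t) => t.allowedTriggers.includes(trigger))
      .map((t) => t.name);
  }

  // Permission and approval are checked before the tool body ever runs.
  call(trigger: Trigger, name: string, args: unknown, approved = false): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    if (!tool.allowedTriggers.includes(trigger)) {
      throw new Error(`${name} is not permitted for trigger ${trigger}`);
    }
    if (tool.requiresApproval && !approved) {
      throw new Error(`${name} requires user approval`);
    }
    return tool.run(args);
  }
}

// Illustrative tools: a read-only lookup and an action that needs approval.
const registry = new ToolRegistry();
registry.register({
  name: "read_calendar",
  allowedTriggers: ["morning_brief", "user_chat"],
  requiresApproval: false,
  run: () => ["09:00 standup"],
});
registry.register({
  name: "move_meeting",
  allowedTriggers: ["user_chat"],
  requiresApproval: true,
  run: (args: { to: string }) => `moved to ${args.to}`,
});
```

In this sketch a `stress_alert` trigger cannot reach `move_meeting` at all, and a `user_chat` trigger can reach it only once the caller passes `approved = true`.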
Our harness thesis
As models improve, some planning and verification ability will move into the model. But the product-specific harness will still matter because it defines the user's data boundary, tool permissions, trust model, feedback loop, and user experience. Waldo's harness is specialized around body context and safe proactive action.
The product is built around a small set of memorable, user-facing concepts.
Health data changes the standard for agent safety.
Waldo is designed so sensitive signals become useful context without becoming careless logs, raw prompts, or invisible autonomous decisions.
Raw values are minimized
Users see clear labels and derived insights. Raw health values are kept out of logs and push messages by default.
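As a rough illustration of keeping raw values out of logs by default, the sketch below masks known raw-health fields before a payload is logged. The key list and the `redactForLog` helper are assumptions for this example, not Waldo's actual scrubber.

```typescript
// Hypothetical list of raw-health field names to mask.
const RAW_HEALTH_KEYS = new Set(["hrv_ms", "resting_hr_bpm", "sleep_hours", "spo2_pct"]);

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Returns a deep copy with raw health fields masked; derived labels pass through.
function redactForLog(value: Json): Json {
  if (Array.isArray(value)) return value.map(redactForLog);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      out[key] = RAW_HEALTH_KEYS.has(key) ? "[redacted]" : redactForLog(value[key]);
    }
    return out;
  }
  return value;
}
```

A payload such as `{ readiness: "high", samples: [{ hrv_ms: 62 }] }` keeps its `readiness` label but has `hrv_ms` replaced with `"[redacted]"`. The input object itself is left unmodified.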
Scoped by identity
Authenticated access, row-level security, per-user agent state, adapter boundaries, and egress allowlists protect data paths.
Actions need approval
Destructive or external actions go through a preview, a confirmation step, and an expiry window, with an audit trail and undo where possible.
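The approval flow might look something like the sketch below. It assumes a simple in-memory queue; the `ApprovalQueue` class, its method names, and the audit event names are hypothetical, and undo is left out for brevity. An action is proposed with a preview and a time-to-live, runs only if it is confirmed before it expires, and every step is recorded.

```typescript
type ApprovalStatus = "pending" | "approved" | "expired";

interface PendingAction {
  id: string;
  preview: string;   // human-readable description shown before confirming
  expiresAt: number; // epoch milliseconds
  status: ApprovalStatus;
}

interface AuditEntry { actionId: string; event: string; at: number }

class ApprovalQueue {
  private actions = new Map<string, PendingAction>();
  readonly audit: AuditEntry[] = [];

  // `now` is passed in explicitly so the behavior is deterministic and testable.
  propose(id: string, preview: string, now: number, ttlMs: number): PendingAction {
    const action: PendingAction = { id, preview, expiresAt: now + ttlMs, status: "pending" };
    this.actions.set(id, action);
    this.log(id, "proposed", now);
    return action;
  }

  // Runs `execute` only when the action is still pending and has not expired.
  confirm(id: string, now: number, execute: () => void): boolean {
    const action = this.actions.get(id);
    if (!action || action.status !== "pending") return false;
    if (now >= action.expiresAt) {
      action.status = "expired";
      this.log(id, "expired", now);
      return false;
    }
    action.status = "approved";
    this.log(id, "approved", now);
    execute();
    this.log(id, "executed", now);
    return true;
  }

  private log(actionId: string, event: string, at: number): void {
    this.audit.push({ actionId, event, at });
  }
}
```

Confirming the same action twice does nothing the second time, and a late confirmation marks the action as expired instead of running it.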
Note: Waldo is designed as a decision-support and personal productivity agent, not a medical device.
V1 proves the health-context agent loop, not every future vertical.
What ships first
An iOS-first app, HealthKit sync, a deterministic readiness model, a per-user agent brain, typed memory, scoped tools, Chat, The Brief, The Fetch, The Handoff, Patrol, Settings, and delivery over APNs and Telegram.
What is intentionally deferred
Full Constellations graph, autonomous code/browser tools, external Waldo MCP server, legal/finance/engineering vertical packs, team dashboards, and high-autonomy workflows.
From health wedge to horizontal context layer.
V1
Personal health-context agent with proactive daily guidance and visible trust surfaces.
Memory graph
Constellations, long-term causal patterns, and richer evidence trails.
Agent economy
MCP, external agents, and domain-specific tool packs using body context safely.
Verticals
Engineering, legal, finance, coaching, and team workflows powered by the same core harness.