
World Engine Reference

This is a reference doc. It covers schemas, formats, API surface, and bus topics — not conceptual explanations.

See also: explanation/world-engine-concepts.md for the design rationale behind these components.


WorldStateEngine is generic — it makes no assumptions about what domains exist. Domain data shape is entirely defined by the application via workspace/domains.yaml.

// Engine-level types only. Domain data is application-defined.
interface WorldState {
  domains: Record<string, WorldStateDomain<unknown>>;
  snapshotVersion: number; // incremented on each knowledge.db write
}
interface WorldStateDomain<T = unknown> {
  data: T;
  metadata: WorldStateMetadata;
}
interface WorldStateMetadata {
  collectedAt: number; // Unix ms
  domain: string;
  tickNumber: number;
  failed?: boolean;
  errorMessage?: string;
}

Domains are registered via workspace/domains.yaml (see Workspace Files for schema). Each domain declares:

  • name — unique key in the world state map
  • url — HTTP endpoint to poll (env vars interpolated at poll time)
  • intervalMs — poll interval (default: 60 000 ms)
  • headers — optional request headers

There are no built-in or hardcoded domain names. All domains are configuration.

Each domain write stores two keys in Redis:

worldstate:{domain}:{collectedAt} — timestamped snapshot
worldstate:{domain}:latest — stable "latest" key for polling

TTL is 2× the domain’s intervalMs. When Redis is unavailable, WorldStateEngine falls back to an in-memory Map.
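The key layout and TTL rule above can be sketched as follows. This is only an illustration: `cacheEntriesFor` is a hypothetical helper, not part of the WorldStateEngine API, and it assumes the same TTL applies to both keys.

```typescript
// Hypothetical helper (not part of the engine API): derives the two cache
// keys and their TTL for a single domain write.
interface CacheEntry {
  key: string;
  ttlMs: number;
}

function cacheEntriesFor(
  domain: string,
  collectedAt: number,
  intervalMs = 60_000, // documented default poll interval
): CacheEntry[] {
  const ttlMs = 2 * intervalMs; // TTL is 2× the domain's intervalMs
  return [
    { key: `worldstate:${domain}:${collectedAt}`, ttlMs }, // timestamped snapshot
    { key: `worldstate:${domain}:latest`, ttlMs }, // stable "latest" key (same TTL assumed)
  ];
}

const exampleEntries = cacheEntriesFor("services", 1_700_000_000_000, 30_000);
```

With a 30 s poll interval, both entries would expire after 60 s unless a newer write refreshes them.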

WorldStateEngine writes the full WorldState to data/knowledge.db (SQLite) every 5 minutes (configurable via snapshotIntervalMs). On startup it restores the latest snapshot. The last 50 snapshots are retained; older rows are pruned.


Goals are declared in workspace/goals.yaml with optional per-project overrides at .proto/projects/{slug}/goals.yaml.

version: "1.0"
goals:
  - id: auth-service-healthy
    type: Invariant
    description: Auth service must be healthy
    severity: critical
    enabled: true # default: true
    tags: [infrastructure]
    selector: services.auth.status
    operator: eq
    expected: "healthy"
  - id: cpu-usage-ok
    type: Threshold
    description: CPU must stay below 80%
    severity: high
    selector: metrics.cpu.usage
    max: 80
  - id: flow-distribution
    type: Distribution
    description: Feature work must be at least 40% of WIP
    severity: medium
    selector: flow.distribution
    distribution:
      feature: 0.4
    tolerance: 0.1

| Type | What it checks | Required fields |
| --- | --- | --- |
| Invariant | Boolean condition on a state value | selector, optionally operator, expected |
| Threshold | Numeric min/max bounds | selector, at least one of min / max |
| Distribution | Value proportions in an array/object | selector, distribution or pattern |

Invariant operators: truthy (default), falsy, eq, neq, in, not_in
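As a rough illustration of how Invariant and Threshold goals could be checked, the sketch below resolves a dot-path selector and applies the listed operators. The function names are hypothetical, and two details are assumptions rather than documented behavior: selectors are treated as plain dot-separated paths, and `min` / `max` are treated as inclusive bounds.

```typescript
type InvariantOperator = "truthy" | "falsy" | "eq" | "neq" | "in" | "not_in";

// Walk a dot-separated selector such as "services.auth.status".
function resolveSelector(state: unknown, selector: string): unknown {
  let current: unknown = state;
  for (const part of selector.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[part];
  }
  return current;
}

// Returns true when the invariant holds. "truthy" is the documented default.
function checkInvariant(
  actual: unknown,
  operator: InvariantOperator = "truthy",
  expected?: unknown,
): boolean {
  switch (operator) {
    case "truthy":
      return Boolean(actual);
    case "falsy":
      return !actual;
    case "eq":
      return actual === expected;
    case "neq":
      return actual !== expected;
    case "in":
      return Array.isArray(expected) && expected.indexOf(actual) !== -1;
    case "not_in":
      return Array.isArray(expected) && expected.indexOf(actual) === -1;
  }
}

// Returns true when the value is numeric and within the (assumed inclusive) bounds.
function checkThreshold(actual: unknown, min?: number, max?: number): boolean {
  if (typeof actual !== "number") return false;
  if (min !== undefined && actual < min) return false;
  if (max !== undefined && actual > max) return false;
  return true;
}

const sampleState = {
  services: { auth: { status: "degraded" } },
  metrics: { cpu: { usage: 91 } },
};
const authOk = checkInvariant(resolveSelector(sampleState, "services.auth.status"), "eq", "healthy");
const cpuOk = checkThreshold(resolveSelector(sampleState, "metrics.cpu.usage"), undefined, 80);
```

In this sample state both goals from the YAML example would be violated, since the auth status is not "healthy" and CPU usage exceeds 80.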

| Level | Color | Use |
| --- | --- | --- |
| low | Blue | Informational |
| medium | Orange | Should investigate |
| high | Red | Requires prompt attention |
| critical | Purple | System-breaking |

When a goal is violated, GoalEvaluatorPlugin emits world.goal.violated:

{
  topic: "world.goal.violated",
  payload: {
    type: "world.goal.violated",
    violation: {
      goalId: string;
      goalType: "Invariant" | "Threshold" | "Distribution";
      severity: "low" | "medium" | "high" | "critical";
      description: string;
      message: string; // human-readable diff
      actual: unknown;
      expected: unknown;
      timestamp: number;
      projectSlug?: string;
    };
  }
}

When a goal is violated, the system tries the cheapest capable tier first and escalates only when that tier cannot handle the violation.

Goal violated
  │
  ▼
L0: Deterministic rule matcher
  Match found? ── Yes ──► Execute action (no LLM, no cost)
  │ No
  ▼
L1: A* planner (HTN/GOAP)
  Plan found within budget? ── Yes ──► Execute plan (cheap model call)
  │ No or over budget
  ▼
L2: Ava (LLM reasoning)
  Within L2 cost threshold? ── Yes ──► Ava evaluates and acts
  │ No
  ▼
L3: Human in the loop (HITL)
  BudgetPlugin publishes hitl.request.budget.{requestId}
  Human approves/rejects via Discord/Plane/API

Each escalation tier corresponds to a cost threshold enforced by BudgetPlugin and TierRouter:

| Tier | Label | Max est. cost | Min remaining budget | Action |
| --- | --- | --- | --- | --- |
| L0 | Autonomous | < $0.10 | ≥ 50% | Execute immediately |
| L1 | Notify | < $1.00 | ≥ 25% | Execute, notify ops channel |
| L2 | Soft-gate | < $5.00 | ≥ 10% | Log warning, execute with caution |
| L3 | HITL Required | unlimited | any | Block, escalate to human |

Daily caps: $10 per project per day, $50 total across all projects.
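One way to read the tier table is as an ordered rule list in which the first matching row wins. The sketch below follows that reading. `selectTier` and `TIER_RULES` are hypothetical names, and it is an assumption that both the cost and the remaining-budget condition must hold for a tier to apply; daily caps are not modeled.

```typescript
type Tier = "L0" | "L1" | "L2" | "L3";

interface TierRule {
  tier: Tier;
  maxCostUsd: number; // estimated cost must be strictly below this
  minRemainingRatio: number; // remaining budget as a fraction (0.0 to 1.0)
}

// Rows of the tier table, cheapest first. L3 is the fallback.
const TIER_RULES: TierRule[] = [
  { tier: "L0", maxCostUsd: 0.1, minRemainingRatio: 0.5 },
  { tier: "L1", maxCostUsd: 1.0, minRemainingRatio: 0.25 },
  { tier: "L2", maxCostUsd: 5.0, minRemainingRatio: 0.1 },
];

function selectTier(estimatedCostUsd: number, remainingBudgetRatio: number): Tier {
  for (const rule of TIER_RULES) {
    const cheapEnough = estimatedCostUsd < rule.maxCostUsd;
    const budgetLeft = remainingBudgetRatio >= rule.minRemainingRatio;
    if (cheapEnough && budgetLeft) return rule.tier;
  }
  return "L3"; // nothing matched: block and escalate to a human
}
```

Under this reading, a $0.05 request with only 30% of the budget left would land in L1 rather than L0, because the L0 row requires at least 50% remaining.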


BudgetPlugin handles pre-flight cost checks. Any agent publishes a BudgetRequest and waits for a BudgetDecision.

Publish to budget.request.{requestId}:

{
  type: "budget_request";
  requestId: string; // UUID
  agentId: string;
  projectId: string;
  goalId?: string;
  modelId?: string; // e.g. "claude-sonnet-4-6"
  promptText?: string; // used for heuristic token count
  estimatedPromptTokens?: number;
  estimatedCompletionTokens?: number;
}

Subscribe to budget.decision.{requestId}:

{
  type: "budget_decision";
  requestId: string;
  tier: "L0" | "L1" | "L2" | "L3";
  approved: boolean;
  estimatedCost: number; // USD
  maxCost: number; // conservative upper bound (1.5× if heuristic used)
  budgetState: BudgetState;
  reason: string;
  escalationContext?: EscalationContext; // present when tier === "L3"
}

After execution, publish to budget.actual.{requestId}:

{
  type: "budget_actual";
  requestId: string;
  agentId: string;
  projectId: string;
  actualCost: number;
  actualPromptTokens?: number;
  actualCompletionTokens?: number;
}

Discrepancies > 20% between estimated and actual cost trigger an ops.alert.budget event.
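The discrepancy rule can be sketched as a relative-difference check. This is a guess at the comparison, not the plugin's actual code: `hasCostDiscrepancy` is a hypothetical name, and it assumes the difference is measured relative to the estimated cost and counts in both directions (over and under).

```typescript
// Returns true when actual cost deviates from the estimate by more than
// `threshold` (20% by default), relative to the estimate.
function hasCostDiscrepancy(
  estimatedCost: number,
  actualCost: number,
  threshold = 0.2,
): boolean {
  if (estimatedCost === 0) {
    // No meaningful ratio; treat any non-zero spend as a discrepancy (assumption).
    return actualCost > 0;
  }
  return Math.abs(actualCost - estimatedCost) / estimatedCost > threshold;
}
```

For example, an estimate of $1.00 with an actual cost of $1.25 would trip the check, while an actual cost of $1.10 would not.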

pre_flight_estimate uses a 4-chars-per-token heuristic when token counts are not supplied. The conservative maxCost is 1.5× the estimate when heuristics are used.

| Model | Input ($/token) | Output ($/token) |
| --- | --- | --- |
| claude-opus-4-6 | $0.000015 | $0.000075 |
| claude-sonnet-4-6 | $0.000003 | $0.000015 |
| claude-haiku-4-5 | $0.00000025 | $0.00000125 |
| default | $0.000003 | $0.000015 |
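Combining the 4-chars-per-token heuristic with the pricing table, a pre-flight estimate might look like the sketch below. `preFlightEstimate` here is an illustrative stand-in, not the real `pre_flight_estimate`. Two assumptions are made: completion tokens default to 0 when not supplied, and the 1.5× margin applies whenever the prompt token count had to be derived from `promptText`.

```typescript
interface ModelPrice {
  inputPerToken: number; // USD
  outputPerToken: number; // USD
}

const DEFAULT_PRICE: ModelPrice = { inputPerToken: 0.000003, outputPerToken: 0.000015 };

// Values copied from the pricing table.
const PRICES: Record<string, ModelPrice> = {
  "claude-opus-4-6": { inputPerToken: 0.000015, outputPerToken: 0.000075 },
  "claude-sonnet-4-6": { inputPerToken: 0.000003, outputPerToken: 0.000015 },
  "claude-haiku-4-5": { inputPerToken: 0.00000025, outputPerToken: 0.00000125 },
};

interface EstimateInput {
  modelId?: string;
  promptText?: string;
  estimatedPromptTokens?: number;
  estimatedCompletionTokens?: number;
}

function preFlightEstimate(req: EstimateInput) {
  const price = (req.modelId !== undefined ? PRICES[req.modelId] : undefined) ?? DEFAULT_PRICE;
  const usedHeuristic = req.estimatedPromptTokens === undefined;
  // Heuristic: roughly 4 characters per token when no count is supplied.
  const promptTokens = req.estimatedPromptTokens ?? Math.ceil((req.promptText ?? "").length / 4);
  const completionTokens = req.estimatedCompletionTokens ?? 0; // assumption
  const estimatedCost =
    promptTokens * price.inputPerToken + completionTokens * price.outputPerToken;
  // Conservative upper bound: 1.5× the estimate when the heuristic was used.
  const maxCost = usedHeuristic ? estimatedCost * 1.5 : estimatedCost;
  return { estimatedCost, maxCost, usedHeuristic };
}

const heuristicEstimate = preFlightEstimate({ modelId: "claude-sonnet-4-6", promptText: "abcdefgh" });
const explicitEstimate = preFlightEstimate({
  modelId: "claude-opus-4-6",
  estimatedPromptTokens: 1000,
  estimatedCompletionTokens: 500,
});
```

An unknown or missing `modelId` falls back to the `default` row of the table.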

FlowMonitorPlugin continuously tracks 5 Flow Framework metrics. Metrics are recomputed on every work item event and on a 60 s background tick.

1. Velocity — items completed per period

{
  currentPeriodCount: number; // completions in current 24 h window
  rollingAverage: number; // 30-day rolling average
  trend: number; // (recent 3d − prior 3d) / prior 3d
  history: VelocityDataPoint[]; // 30 daily data points
  period: "daily";
  calculatedAt: number;
}
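The `trend` formula in the comment above can be worked through on a small series. The shape of `VelocityDataPoint` is not given in this reference, so the `{ date, count }` fields below are assumed, as is returning 0 when the prior window has no completions.

```typescript
// Assumed shape; the reference names the type but does not define it.
interface VelocityDataPoint {
  date: string; // e.g. "2025-01-31"
  count: number; // completions on that day
}

function sumCounts(points: VelocityDataPoint[]): number {
  return points.reduce((total, p) => total + p.count, 0);
}

// trend = (recent 3d − prior 3d) / prior 3d, using the last six daily points.
function velocityTrend(history: VelocityDataPoint[]): number {
  const recent = sumCounts(history.slice(-3));
  const prior = sumCounts(history.slice(-6, -3));
  if (prior === 0) return 0; // avoid division by zero (assumption)
  return (recent - prior) / prior;
}

const sampleDays: VelocityDataPoint[] = [
  { date: "2025-01-01", count: 2 },
  { date: "2025-01-02", count: 2 },
  { date: "2025-01-03", count: 2 },
  { date: "2025-01-04", count: 3 },
  { date: "2025-01-05", count: 3 },
  { date: "2025-01-06", count: 3 },
];
const sampleTrend = velocityTrend(sampleDays); // (9 − 6) / 6 = 0.5
```

A positive value means the most recent three days completed more items than the three days before them.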

2. Lead Time — creation-to-completion duration (requires ≥ 5 samples)

{
  p50Ms: number | null;
  p85Ms: number | null;
  p95Ms: number | null;
  sampleSize: number;
  state: "PENDING" | "READY";
  minRequired: 5;
  calculatedAt: number;
}
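The PENDING/READY gating can be illustrated with a small percentile calculation. The reference does not say which percentile method FlowMonitorPlugin uses; the sketch below picks nearest-rank as an assumption, and the function names are hypothetical.

```typescript
type LeadTimeState = "PENDING" | "READY";

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function nearestRank(sortedMs: number[], p: number): number {
  const rank = Math.max(1, Math.ceil((p * sortedMs.length) / 100));
  return sortedMs[rank - 1];
}

function leadTimeMetrics(durationsMs: number[], minRequired = 5) {
  const sorted = durationsMs.slice().sort((a, b) => a - b);
  const ready = sorted.length >= minRequired;
  const state: LeadTimeState = ready ? "READY" : "PENDING";
  return {
    // Percentiles stay null until enough samples have been collected.
    p50Ms: ready ? nearestRank(sorted, 50) : null,
    p85Ms: ready ? nearestRank(sorted, 85) : null,
    p95Ms: ready ? nearestRank(sorted, 95) : null,
    sampleSize: sorted.length,
    state,
  };
}

const readyMetrics = leadTimeMetrics([1000, 300, 700, 100, 900, 500, 200, 800, 400, 600]);
const pendingMetrics = leadTimeMetrics([100, 200]);
```

With ten samples from 100 ms to 1000 ms, nearest-rank gives p50 = 500, p85 = 900, and p95 = 1000. Other methods (for example linear interpolation) would give slightly different values.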

3. Efficiency — active time ÷ total cycle time (target: ≥ 35%)

{
  ratio: number; // 0.0–1.0
  target: 0.35;
  healthy: boolean; // ratio >= 0.35
  totalActiveMs: number;
  totalCycleMs: number;
  byStage: Record<string, { activeMs: number; cycleMs: number; ratio: number }>;
  calculatedAt: number;
}

4. Load (WIP) — work-in-progress count with Little’s Law enforcement

{
  totalWIP: number;
  byStage: Record<string, number>;
  wipLimit: WIPLimitResult;
  calculatedAt: number;
}
// WIPLimitResult:
{
  state: "PENDING" | "ok" | "exceeded";
  currentWIP: number;
  wipLimit: number | null; // null while PENDING (< 5 lead-time samples)
  suggestedDelayMs?: number; // delay hint when exceeded
  waitQueue: string[]; // item IDs held back
}

5. Distribution — feature / defect / risk / debt ratio

{
  ratios: { feature: number; defect: number; risk: number; debt: number };
  counts: { feature: number; defect: number; risk: number; debt: number };
  total: number;
  balanced: boolean; // feature >= 40% AND defect <= 30%
  recommended: { feature: 0.4; defect: 0.3; risk: 0.15; debt: 0.15 };
  calculatedAt: number;
}
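The `balanced` rule in the comment above can be computed directly from per-type counts. The sketch below is illustrative: `computeDistribution` is a hypothetical name, and treating an empty backlog as all-zero ratios is an assumption.

```typescript
type ItemType = "feature" | "defect" | "risk" | "debt";
type PerType = Record<ItemType, number>;

function computeDistribution(counts: PerType) {
  const total = counts.feature + counts.defect + counts.risk + counts.debt;
  const ratioOf = (n: number): number => (total === 0 ? 0 : n / total);
  const ratios: PerType = {
    feature: ratioOf(counts.feature),
    defect: ratioOf(counts.defect),
    risk: ratioOf(counts.risk),
    debt: ratioOf(counts.debt),
  };
  // balanced: feature >= 40% AND defect <= 30%
  const balanced = ratios.feature >= 0.4 && ratios.defect <= 0.3;
  return { ratios, counts, total, balanced };
}

const balancedMix = computeDistribution({ feature: 5, defect: 2, risk: 2, debt: 1 });
const defectHeavyMix = computeDistribution({ feature: 3, defect: 4, risk: 2, debt: 1 });
```

The first mix is balanced (50% feature, 20% defect); the second is not, because features fall below 40% and defects exceed 30%.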

Little’s Law: WIP = Throughput × Lead Time. The WIP limit is set to 1.5× the calculated WIP ceiling. When exceeded, new dispatch requests are queued (not rejected) with a suggestedDelayMs hint.
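As a worked example of the Little's Law calculation, the sketch below derives a WIP limit from throughput and lead time. Several details are assumptions: throughput is expressed in items per day, a single representative lead time (for example the median) is used, and the result is rounded up to a whole number of items. `computeWipLimit` is a hypothetical name.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns null while lead-time data is still PENDING.
function computeWipLimit(throughputPerDay: number, leadTimeMs: number | null): number | null {
  if (leadTimeMs === null) return null;
  // Little's Law: WIP = Throughput × Lead Time (in consistent units).
  const wipCeiling = throughputPerDay * (leadTimeMs / MS_PER_DAY);
  // The limit is 1.5× the calculated ceiling; rounding up is an assumption.
  return Math.ceil(1.5 * wipCeiling);
}

// 4 items/day with a 2-day lead time: ceiling = 8, limit = 12.
const exampleLimit = computeWipLimit(4, 2 * MS_PER_DAY);
```

The 1.5× headroom means short bursts above the steady-state WIP do not immediately trigger queuing.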

Bottleneck detection (Theory of Constraints)

Stages are ranked by total accumulation time (item count × avg dwell time). A stage is flagged as a bottleneck when avg dwell exceeds 2 hours. The primary bottleneck is the highest-ranked stage.
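The ranking and flagging rules above might be sketched like this. The `StageStats` shape and function names are hypothetical. The text does not say whether an unflagged stage can be the primary bottleneck, so the sketch assumes the primary is the highest-ranked stage among those flagged.

```typescript
interface StageStats {
  stage: string;
  itemCount: number;
  avgDwellMs: number;
}

const HOUR_MS = 60 * 60 * 1000;
const BOTTLENECK_DWELL_MS = 2 * HOUR_MS; // flagged when avg dwell exceeds 2 hours

function rankStages(stages: StageStats[]) {
  return stages
    .map((s) => ({
      stage: s.stage,
      accumulationMs: s.itemCount * s.avgDwellMs, // item count × avg dwell time
      bottleneck: s.avgDwellMs > BOTTLENECK_DWELL_MS,
    }))
    .sort((a, b) => b.accumulationMs - a.accumulationMs); // highest accumulation first
}

// Highest-ranked flagged stage, or null when nothing is flagged (assumption).
function primaryBottleneck(stages: StageStats[]): string | null {
  const flagged = rankStages(stages).filter((s) => s.bottleneck);
  return flagged.length > 0 ? flagged[0].stage : null;
}

const sampleStages: StageStats[] = [
  { stage: "backlog", itemCount: 10, avgDwellMs: 1 * HOUR_MS },
  { stage: "review", itemCount: 3, avgDwellMs: 3 * HOUR_MS },
  { stage: "in-progress", itemCount: 5, avgDwellMs: 2.5 * HOUR_MS },
];
const rankedStages = rankStages(sampleStages);
const primaryStage = primaryBottleneck(sampleStages);
```

Here "backlog" ranks second by accumulation (10 h) but is not flagged, since its average dwell is only 1 hour; "in-progress" (12.5 h accumulated, 2.5 h dwell) comes out as the primary bottleneck.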


WorldStateEngine topics:

| Topic | Direction | Description |
| --- | --- | --- |
| tool.world_state.get | Inbound | Bus-based world state query |
| mcp.tool.get_world_state | Inbound | MCP tool invocation |
| event.world_state.db_error | Outbound | knowledge.db write failure |

tool.world_state.get / mcp.tool.get_world_state payload:

{
  domain?: string; // any registered domain name from domains.yaml
  maxAgeMs?: number; // reject stale data (default: 60000 ms)
}

Reply published to msg.reply.topic:

{ success: true; data: WorldState | WorldStateDomain<unknown> }
// or
{ success: false; error: string }
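The `maxAgeMs` check and the two reply shapes can be sketched as follows for a single-domain query. This is a simplified illustration: `answerDomainQuery` is a hypothetical name, the error strings are invented for the example, and staleness is assumed to be measured from `metadata.collectedAt`.

```typescript
interface DomainSnapshot {
  data: unknown;
  metadata: { collectedAt: number; domain: string; tickNumber: number };
}

type QueryReply =
  | { success: true; data: DomainSnapshot }
  | { success: false; error: string };

function answerDomainQuery(
  snapshot: DomainSnapshot | undefined,
  now: number,
  maxAgeMs = 60_000, // documented default
): QueryReply {
  if (snapshot === undefined) {
    return { success: false, error: "unknown domain" }; // illustrative message
  }
  const ageMs = now - snapshot.metadata.collectedAt;
  if (ageMs > maxAgeMs) {
    return { success: false, error: `stale data: ${ageMs} ms old` }; // illustrative message
  }
  return { success: true, data: snapshot };
}

const sampleSnapshot: DomainSnapshot = {
  data: { auth: { status: "healthy" } },
  metadata: { collectedAt: 1_000_000, domain: "services", tickNumber: 42 },
};
const freshReply = answerDomainQuery(sampleSnapshot, 1_030_000); // 30 s old
const staleReply = answerDomainQuery(sampleSnapshot, 1_090_000); // 90 s old
```

Callers can branch on `success` and let TypeScript narrow the union to either the data or the error case.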

FlowMonitorPlugin topics:

| Topic | Direction | Description |
| --- | --- | --- |
| flow.item.created | Inbound | Register a new work item |
| flow.item.updated | Inbound | Update item status/stage |
| flow.item.completed | Inbound | Mark item complete (production) |
| flow.item.dispatch | Inbound | Request to dispatch (WIP gating) |
| tool.flow.metrics.get | Inbound | Query current metrics |
| mcp.tool.get_flow_metrics | Inbound | MCP tool invocation |
| event.flow.metrics.updated | Outbound | After each metric tick |
| event.flow.wip_exceeded | Outbound | WIP limit breached |
| event.flow.bottleneck.detected | Outbound | Significant bottleneck found |
| event.flow.goal.updated | Outbound | Goal state changed |
| event.flow.efficiency.debug | Outbound | Debug when efficiency < 35% |

flow.item.created payload:

{
  id?: string; // UUID generated if omitted
  type: "feature" | "defect" | "risk" | "debt";
  stage: string; // e.g. "backlog", "in-progress", "review"
  createdAt?: number; // Unix ms (defaults to now)
  meta?: Record<string, unknown>;
}

flow.item.dispatch reply (WIP gating):

// Accepted:
{ accepted: true; currentWIP: number; wipLimit: number }
// Not accepted (WIP exceeded); the request is queued, not dropped:
{
  accepted: false;
  reason: "WIP_EXCEEDED";
  currentWIP: number;
  wipLimit: number;
  suggestedDelayMs: number;
  queuePosition: number;
}
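A caller could model the two reply shapes as a discriminated union and branch on `accepted`. The sketch below only restates the fields shown above; `describeDispatch` is a hypothetical helper and the message wording is invented for the example.

```typescript
type DispatchReply =
  | { accepted: true; currentWIP: number; wipLimit: number }
  | {
      accepted: false;
      reason: "WIP_EXCEEDED";
      currentWIP: number;
      wipLimit: number;
      suggestedDelayMs: number;
      queuePosition: number;
    };

// Summarize what the caller should do next for a given reply.
function describeDispatch(reply: DispatchReply): string {
  if (reply.accepted) {
    return `dispatched (${reply.currentWIP}/${reply.wipLimit} WIP)`;
  }
  // Narrowed to the WIP_EXCEEDED shape: the item is queued with a delay hint.
  return `queued at position ${reply.queuePosition}, retry in ${reply.suggestedDelayMs} ms`;
}

const acceptedText = describeDispatch({ accepted: true, currentWIP: 3, wipLimit: 12 });
const queuedText = describeDispatch({
  accepted: false,
  reason: "WIP_EXCEEDED",
  currentWIP: 12,
  wipLimit: 12,
  suggestedDelayMs: 5000,
  queuePosition: 2,
});
```

Because `accepted` is a literal `true` or `false` in each branch, the compiler prevents reading `suggestedDelayMs` on an accepted reply.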

BudgetPlugin topics:

| Topic | Direction | Description |
| --- | --- | --- |
| budget.request.# | Inbound | Pre-flight cost check |
| budget.actual.# | Inbound | Post-execution cost reconciliation |
| budget.decision.{requestId} | Outbound | Tier decision (approved/rejected) |
| hitl.request.budget.{requestId} | Outbound | L3 HITL escalation |
| budget.alert.threshold | Outbound | 50% or 80% budget threshold crossed |
| budget.circuit.open.{key} | Outbound | Circuit breaker opened |
| ops.alert.budget | Outbound | Autonomous rate below 85%, or cost discrepancy |

GoalEvaluatorPlugin topics:

| Topic | Direction | Description |
| --- | --- | --- |
| world.state.# | Inbound | World state updates to evaluate |
| world.goal.violated | Outbound | Goal violation detected |