The loop in one picture
You ingest a module, the Oracle collapses the known runtime, the decompiler lifts the rest to C, per-function agents propose understanding bottom-up, you review and correct in the UI, the provenance economy propagates everyone’s work without clobbering, you re-run the agents, and when a new build ships the diff carries every annotation forward. Understanding converges over iterations.
Ingest
Parse the .wasm and its JS glue, fingerprint every function, seed names for free, and see
your starting coverage.
Identify
Run the Emscripten Oracle to auto-identify known musl, libc++, dlmalloc, and runtime code with
real upstream names, so effort concentrates on the application-specific remainder.
Decompile and understand
Lift the remaining functions to readable pseudo-C, then run the per-function deep engine
bottom-up so each function gets a name, an understanding, variable renames, and cleaned C.
Review and correct
Open the UI, read the inferred C, jump between functions, and fix names by hand. Human edits are
sovereign and locked, and the economy means your corrections are never overwritten.
Bring your own model
Expose the same knowledge base over MCP so Claude or any MCP-capable model can drive it
alongside you, economy-gated.
Everything below uses real commands. Run warden demo first to watch the whole loop on generated
samples with no network and no native toolchain, then point these commands at your own module.
One-time setup
Configure your model provider once, globally, so you never export an API key or pass it again.
The global settings live in ~/.config/warden/config.json. A project can override any of them in its own
.warden/config.json (warden config set --project ...), which is handy when one engagement needs
a different key or model.
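The override rule above can be pictured as a layered merge. The following is a minimal sketch, not WARDEN's actual loader: it assumes both files are flat JSON objects and that a project key simply replaces the global key of the same name. The function names and the example keys are hypothetical.

```python
import json
from pathlib import Path


def read_json(path: Path) -> dict:
    """Return the parsed JSON object, or an empty dict if the file is absent."""
    return json.loads(path.read_text()) if path.is_file() else {}


def effective_config(global_path: Path, project_path: Path) -> dict:
    """Merge global settings with project overrides (project keys win).

    Assumption: a shallow, key-by-key override. The real precedence rules
    may be richer than this.
    """
    return {**read_json(global_path), **read_json(project_path)}
```

With a global file containing a provider and a model, and a project file containing only a different model, the merged result keeps the global provider and takes the project's model.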
Start: ingest the module and see coverage
Create a project, then ingest the target.
warden init makes a .warden/ directory with the
database inside it, so every command you run from this directory tree uses it automatically with no
--db flag. Pass the Emscripten .js glue with --glue so WARDEN can read the export-index map
and dynCall signatures. Ingestion parses the binary in pure Python, fingerprints every function,
and seeds names for free from exports, imports, and the name section.
warden coverage is your scoreboard for the whole loop. Run it again after each stage to watch the
named fraction climb.
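As a rough illustration of what a "named fraction" measures, here is a small sketch. It assumes coverage is simply the share of functions that currently carry a name; the real warden coverage report may weigh or break this down differently, and the data shape is hypothetical.

```python
def named_fraction(names_by_function: dict) -> float:
    """Share of functions that have a non-empty name.

    names_by_function maps a function index to its name, or None if unnamed.
    This shape is an assumption made for illustration only.
    """
    if not names_by_function:
        return 0.0
    named = sum(1 for name in names_by_function.values() if name)
    return named / len(names_by_function)


# Two of four functions are named, so coverage is one half.
example = {0: "malloc", 1: None, 2: "main", 3: None}
coverage = named_fraction(example)
```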
Identify: collapse the known runtime with the Oracle
A real Emscripten module is mostly musl, libc++, dlmalloc, and Emscripten runtime code that already has public source. The Oracle fingerprints those functions against a corpus of labeled, compiled ground truth and writes the real upstream names straight into the knowledge base at high confidence. You stop reversing code you could have read on GitHub.
Decompile and understand: run the per-function deep engine
Now lift the application-specific remainder to C and let the agents understand it. The decompiler is a pure-Python lifter that renders readable pseudo-C (real if/else, while loops with
break/continue, and switch), with no native tooling.
warden deep is the centerpiece of understanding. It runs one agent per function, walked
bottom-up over the call graph so leaves are analyzed first and a caller is only seen after its
callees are understood. Each agent gets the function’s disassembly facts, its decompiled C, and the
recovered understanding of the functions it calls. That callee context is read from the knowledge
base, not from raw bytes, so a parent’s context stays bounded even in a module with thousands of
functions. Each agent returns a name, an understanding, a variable rename map, cleaned C, and
whether the function is now closed.
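The bottom-up walk described above amounts to a topological order over the call graph in which callees come before callers. The sketch below uses Python's standard graphlib to show the idea. It is not WARDEN's implementation: the graph shape, the analyze callback, and the function names are hypothetical, and it assumes an acyclic graph (recursive call cycles would need extra handling, since graphlib raises CycleError on them).

```python
from graphlib import TopologicalSorter


def bottom_up_order(call_graph: dict) -> list:
    """Order functions so every callee appears before its callers.

    call_graph maps a caller to the set of functions it calls.
    TopologicalSorter treats the mapped values as prerequisites,
    so callees are emitted first.
    """
    return list(TopologicalSorter(call_graph).static_order())


def analyze_all(call_graph: dict, analyze) -> dict:
    """Run one analysis per function, feeding it only callee summaries."""
    knowledge = {}
    for fn in bottom_up_order(call_graph):
        # Context comes from stored results, not from re-reading callee bodies,
        # so it stays bounded by the number of direct callees.
        callee_context = {c: knowledge[c] for c in call_graph.get(fn, ())}
        knowledge[fn] = analyze(fn, callee_context)
    return knowledge


call_graph = {
    "main": {"parse", "render"},
    "parse": {"read_byte"},
    "render": {"read_byte"},
    "read_byte": set(),
}
order = bottom_up_order(call_graph)
```

In this example read_byte is analyzed first and main last; the relative order of parse and render is not fixed.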
With --watch you see each agent live, one line per function.
Leaf vs parent tiering
Leaf functions are the bulk of any module, so name them with a cheap leaf --backend. Escalate
the harder, high-fan-in parents with a stronger --parent-model. The leaf and parent backends
are configured independently, so you pay frontier prices only where they earn their keep.
Cheap by default
Naming from grounded facts is bulk work, not frontier reasoning, and it is gated by the verifier
and the economy, so the default backend is the cheapest capable one. The auto-detect order is
OpenRouter (Kimi K2.6) first, then OpenAI, then Anthropic, then the zero-dependency offline
heuristic. The whole loop runs offline with no API key as a dry run of the bottom-up flow,
context drop, events, and storage; set OPENROUTER_API_KEY and a real --backend to make the
agents real.
Dedup by stable identity
Identical functions (same stable identity) are analyzed once and reused. That is the lever that
makes a module with thousands of functions tractable; you can see it in the reused events.
Use warden agent when you only want names (not full per-function
understanding). It sweeps unnamed functions bottom-up, proposes names grounded in hard facts, gates
each through a verifier, and writes back under the economy. See the agent crew
for the full backend and routing detail.
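The dedup-by-stable-identity idea from this section can be sketched as a cache keyed on an identity value. How WARDEN actually computes a stable identity is not described here, so the sketch stands in a SHA-256 of the raw function body purely as an assumption; the function names and data shapes are hypothetical as well.

```python
import hashlib


def stable_identity(body: bytes) -> str:
    """Stand-in identity: a hash of the function body (an assumption)."""
    return hashlib.sha256(body).hexdigest()


def analyze_with_dedup(functions: dict, analyze):
    """Analyze each distinct body once and reuse the result for duplicates.

    functions maps a function index to its body bytes.
    Returns (results by index, list of indices served from the cache).
    """
    cache = {}
    results = {}
    reused = []
    for index, body in functions.items():
        key = stable_identity(body)
        if key in cache:
            reused.append(index)  # would surface as a "reused" event
        else:
            cache[key] = analyze(body)
        results[index] = cache[key]
    return results, reused
```

Three functions where the first and third share a body trigger only two analyses, and the third index is reported as reused.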
Review and correct: the human turn
This is where the loop earns its keep. Start the read-only dashboard and read what the agents produced.
The dashboard is structurally read-only: the HTTP connection runs in query-only mode for the whole
request, so no route can write. Writes go through the CLI, the library, or MCP, all of which pass
through the provenance economy. See the UI reference.
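To make "query-only mode" concrete, here is a sketch of how a connection can be made structurally unable to write. It assumes the knowledge base is an SQLite database, which this page does not state, and uses SQLite's query_only pragma. The table and column names are hypothetical.

```python
import sqlite3

# Build a tiny stand-in database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (func_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO names VALUES (1, 'parse_header')")
conn.commit()

# From here on, this connection refuses any statement that would write.
conn.execute("PRAGMA query_only = ON")

# Reads still work.
rows = conn.execute("SELECT name FROM names").fetchall()

# Writes fail at the database layer, regardless of which route issued them.
try:
    conn.execute("INSERT INTO names VALUES (2, 'should_not_land')")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True
```

The point of enforcing this on the connection rather than in each route is that a forgotten check in one handler cannot turn into a write.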
Bring your own model over MCP
You are not limited to WARDEN’s own agents. warden mcp serves the knowledge base over the Model
Context Protocol, so Claude or any MCP-capable model can drive the same loop alongside you.
Iterate: converge, then carry forward
Re-running the agents is always safe. Already-confident and locked entries are skipped before a backend is even called, and the economy rejects any proposal that would overwrite stronger work. So re-run them after you lock a batch of names: a caller named with thin context in one round can be re-proposed once its callees carry your real names, and understanding converges.
When a new build ships, the diff carries annotations forward, so the work you did on v1 is already attached to the matching
functions in v2. You review only the handful that genuinely changed. That is the whole point: the
second through hundredth decompile cost almost nothing.
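The reason re-running is safe can be illustrated with a small write gate. This is a sketch of the general idea only: the authority ranking, the Entry fields, and the comparison rule are all assumptions made for illustration, not WARDEN's actual provenance model.

```python
from dataclasses import dataclass

# Hypothetical ranking: higher numbers outrank lower ones.
AUTHORITY = {"heuristic": 0, "agent": 1, "oracle": 2, "human": 3}


@dataclass
class Entry:
    name: str
    source: str
    confidence: float
    locked: bool = False


def propose(kb: dict, func_id: int, candidate: Entry) -> bool:
    """Write candidate into kb only if it does not overwrite stronger work."""
    current = kb.get(func_id)
    if current is not None:
        if current.locked:
            return False  # locked entries are never touched
        current_rank = (AUTHORITY[current.source], current.confidence)
        candidate_rank = (AUTHORITY[candidate.source], candidate.confidence)
        if candidate_rank <= current_rank:
            return False  # equal or weaker proposals are rejected
    kb[func_id] = candidate
    return True


kb = {}
first = propose(kb, 7, Entry("sub_7_guess", "agent", 0.6))
human = propose(kb, 7, Entry("decode_frame", "human", 1.0, locked=True))
retry = propose(kb, 7, Entry("parse_packet", "agent", 0.99))
```

Here the agent's first proposal lands, the human correction replaces it and locks it, and the later agent proposal is rejected even though its confidence is high.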
Export the deliverable
At any point, emit a deliverable for an external tool or a teammate.
The reverse direction is warden import, keyed on stable identity
first so a name recovered against one build lands on the same logical function in another, and routed
through the economy so an import never clobbers higher-authority work.
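Keying on stable identity means names follow the function rather than its index. The sketch below shows that matching step in isolation, under the assumption that each function in each build already has some identity string. The data shapes and names are hypothetical, and the economy check that a real import would also pass through is omitted.

```python
def carry_forward(old_names: dict, new_functions: dict):
    """Attach names from one build to matching functions in another.

    old_names maps a stable identity to a recovered name.
    new_functions maps a function index in the new build to its identity.
    Returns (names by new index, indices with no match that need review).
    """
    carried = {}
    needs_review = []
    for index, identity in new_functions.items():
        if identity in old_names:
            carried[index] = old_names[identity]
        else:
            needs_review.append(index)
    return carried, needs_review


# Identities here are placeholder strings, not real fingerprints.
old_names = {"id-a": "decode_frame", "id-b": "checksum"}
new_build = {10: "id-b", 11: "id-a", 12: "id-new"}
carried, needs_review = carry_forward(old_names, new_build)
```

Even though the indices shifted between builds, both known functions keep their names, and only the function with an unseen identity is left for review.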
Best practices
Correct high-fan-in functions first
A name you lock on a widely-called function propagates context to every caller on the next
bottom-up round. Fixing one high-fan-in parent is worth more than fixing many leaves.
Lock names you trust
warden set-name locks by default. Lock a name the moment you are confident in it so no
automated pass can ever touch it, and so it carries forward cleanly on the next diff.
Measure two models on a slice
Before committing to one backend for a whole module, run two on a representative slice and
compare coverage and quality. Naming is cheap, gated work; pick the model that earns its cost.
Use history to see evolution
warden history <name-or-id> shows when a function first appeared, when its body actually
changed, and who named it. Use it to decide whether a “modified” function needs a real re-review.
Where to go next
CLI reference
Every command, flag, and default in one place.
MCP reference
Drive the knowledge base from Claude or any MCP-capable model.
UI reference
The terminal diff view and the read-only HTTP dashboard.
Diff and carry-over
How annotations survive a rebuild by stable identity.