The reverse-engineering loop

This is the answer to “how do I actually reverse-engineer a WASM module with WARDEN”. It is not a linear checklist. It is a loop that you, the agent crew, and any external model you bring over MCP all run together against one versioned knowledge base, where every change is tracked and reversible.

The loop in one picture

You ingest a module, the Oracle collapses the known runtime, the decompiler lifts the rest to C, per-function agents propose understanding bottom-up, you review and correct in the UI, the provenance economy propagates everyone’s work without clobbering, you re-run the agents, and when a new build ships the diff carries every annotation forward. Understanding converges over iterations.

Ingest

Parse the .wasm and its JS glue, fingerprint every function, seed names for free, and see your starting coverage.

Identify

Run the Emscripten Oracle to auto-identify known musl, libc++, dlmalloc, and runtime code with real upstream names, so effort concentrates on the application-specific remainder.

Decompile and understand

Lift the remaining functions to readable pseudo-C, then run the per-function deep engine bottom-up so each function gets a name, an understanding, variable renames, and cleaned C.

Review and correct

Open the UI, read the inferred C, jump between functions, and fix names by hand. Human edits are sovereign and locked, and the provenance economy ensures that no automated pass overwrites your corrections.

Bring your own model

Expose the same knowledge base over MCP so Claude or any MCP-capable model can drive it alongside you, economy-gated.

Iterate

Re-run the agents (they respect your locked names) so understanding converges. When a new build ships, the diff carries every annotation forward by stable identity with zero rework.

Everything below uses real commands. Run warden demo first to watch the whole loop on generated samples with no network and no native toolchain, then point these commands at your own module.

One-time setup

Configure your model provider once, globally, so you never export an API key or pass it again:

warden config set openrouter_api_key sk-or-...
warden config set openrouter_model xiaomi/mimo-v2.5-pro

Settings live in ~/.config/warden/config.json. A project can override any of them in its own .warden/config.json (warden config set --project ...), which is handy when one engagement needs a different key or model.
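The override behaviour can be pictured as a layered merge. The sketch below is only an illustration of that idea, not WARDEN's loader: the function name load_settings is hypothetical, and the assumption that a project file overrides the global file key by key (a shallow merge) is ours.

```python
import json
from pathlib import Path


def load_settings(global_path: Path, project_path: Path) -> dict:
    """Merge global settings with per-project overrides (project wins).

    Assumption: a shallow, key-by-key override. Missing files are skipped.
    """
    settings: dict = {}
    # Later files override earlier ones, so the project file goes last.
    for path in (global_path, project_path):
        if path.is_file():
            settings.update(json.loads(path.read_text(encoding="utf-8")))
    return settings
```

Called with ~/.config/warden/config.json and .warden/config.json, a key set only globally survives, while a key set in both places takes the project's value.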

Start: ingest the module and see coverage

Create a project, then ingest the target. warden init makes a .warden/ directory with the database inside it, so every command you run from this directory tree uses it automatically with no --db flag. Pass the Emscripten .js glue with --glue so WARDEN can read the export-index map and dynCall signatures. Ingestion parses the binary in pure Python, fingerprints every function, and seeds names for free from exports, imports, and the name section.

warden init                                              # creates .warden/ (db inside; no --db needed)
warden ingest app_v1.wasm --glue app_v1.js --label v1
warden coverage v1                                       # how much is named already?
warden funcs v1 --unnamed                                # what still needs a name?

warden coverage is your scoreboard for the whole loop. Run it again after each stage to watch the named fraction climb.

Identify: collapse the known runtime with the Oracle

A real Emscripten module is mostly musl, libc++, dlmalloc, and Emscripten runtime code that already has public source. The Oracle fingerprints those functions against a corpus of labeled, compiled ground truth and writes the real upstream names straight into the knowledge base at high confidence. You stop reversing code you could have read on GitHub.

# Build a signature store from any module that still carries a name section
# (a debug or --profiling-funcs build), then identify the stripped target against it.
warden oracle build runtime_debug.wasm --out oracle.json --emver 3.1.55 --opt -O2
warden oracle identify v1 --store oracle.json
warden oracle inspect oracle.json                        # counts by library and Emscripten version
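To make the build-then-identify idea concrete, here is a minimal sketch of exact-fingerprint matching. It is an illustration under stated assumptions, not the Oracle's algorithm: hashing raw body bytes with SHA-256 is our simplification (a real fingerprint would need to tolerate index and layout differences), and fingerprint, build_store, and identify are hypothetical names.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Hypothetical fingerprint: a plain hash of the function body bytes."""
    return hashlib.sha256(body).hexdigest()


def build_store(named_functions: dict[str, bytes]) -> dict[str, str]:
    """Map fingerprint -> upstream name, from a module that still has names."""
    store: dict[str, str] = {}
    for name, body in named_functions.items():
        # Keep the first name seen if two functions share a fingerprint.
        store.setdefault(fingerprint(body), name)
    return store


def identify(target: dict[int, bytes], store: dict[str, str]) -> dict[int, str]:
    """Return index -> name for every stripped function found in the store."""
    matches: dict[int, str] = {}
    for index, body in target.items():
        name = store.get(fingerprint(body))
        if name is not None:
            matches[index] = name
    return matches
```

Functions that do not match stay unnamed, which is the application-specific remainder the later stages work on.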

For larger corpora, the index makes matching sublinear:

warden oracle identify v1 --store oracle.json --indexed   # MinHash-LSH candidate index
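For readers unfamiliar with MinHash-LSH, the sketch below shows the general technique: each function's feature set is reduced to a short signature, the signature is cut into bands, and only functions sharing at least one band are compared. The parameters (32 hashes, 8 bands), the string-token features, and the LshIndex class are illustrative assumptions, not WARDEN's actual index.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 32
BANDS = 8                      # 8 bands of 4 rows each
ROWS = NUM_HASHES // BANDS


def _hash(seed: int, token: str) -> int:
    """One member of a seeded hash family, built from BLAKE2b."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(tokens: set[str]) -> tuple[int, ...]:
    """MinHash signature of a non-empty token set."""
    if not tokens:
        raise ValueError("minhash needs at least one token")
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def band_keys(signature: tuple[int, ...]):
    """Yield one bucket key per band of the signature."""
    for band in range(BANDS):
        yield band, signature[band * ROWS:(band + 1) * ROWS]


class LshIndex:
    """Buckets names by signature band so lookups avoid a full scan."""

    def __init__(self) -> None:
        self.buckets: dict = defaultdict(set)

    def add(self, name: str, tokens: set[str]) -> None:
        for key in band_keys(minhash(tokens)):
            self.buckets[key].add(name)

    def candidates(self, tokens: set[str]) -> set[str]:
        found: set[str] = set()
        for key in band_keys(minhash(tokens)):
            found |= self.buckets.get(key, set())
        return found
```

A lookup touches only the buckets its own bands hash to, so the number of exact comparisons depends on how many near-duplicates exist rather than on the size of the corpus.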

The Oracle is honest about its corpus. A small seed store ships with the repo for the offline demo and tests. The multi-thousand-signature corpus that gives the high identification rate is produced by the containerized farm in scripts/corpus/, which needs the emsdk toolchain. Once a matrix is built, warden oracle harvest <dir> reads its manifest.json and builds the store in one call. See the Oracle guide.

Decompile and understand: run the per-function deep engine

Now lift the application-specific remainder to C and let the agents understand it. The decompiler is a pure-Python lifter that renders readable pseudo-C (real if/else, while loops with break/continue, and switch), with no native tooling.

warden lift v1 --index 412        # decompile one function to pseudo-C
warden lift v1 --out v1.c         # or lift the whole module to a file

warden deep is the centerpiece of understanding. It runs one agent per function, walked bottom-up over the call graph so leaves are analyzed first and a caller is only seen after its callees are understood. Each agent gets the function’s disassembly facts, its decompiled C, and the recovered understanding of the functions it calls. That callee context is read from the knowledge base, not from raw bytes, so a parent’s context stays bounded even in a module with thousands of functions. Each agent returns a name, an understanding, a variable rename map, cleaned C, and whether the function is now closed.

warden deep v1 --backend kimi --parent-model moonshotai/kimi-k2.6 --watch

With --watch you see each agent live, one line per function:

analyzing func[412]
 proposed func[412] parse_packet_header
   closed func[412] parse_packet_header
   reused func[418]
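The bottom-up walk described above amounts to a post-order traversal of the call graph. The following is a minimal sketch of that ordering, assuming the call graph is available as a mapping from function index to callee indices; bottom_up_order and callee_context are hypothetical names, and breaking recursion cycles by skipping already-visited functions is our simplification.

```python
def bottom_up_order(call_graph: dict[int, list[int]]) -> list[int]:
    """Return function indices so that callees come before their callers."""
    order: list[int] = []
    visited: set[int] = set()
    for root in sorted(call_graph):
        if root in visited:
            continue
        visited.add(root)
        # Iterative depth-first search, so deep call chains cannot overflow
        # the Python recursion limit.
        stack = [(root, iter(call_graph.get(root, ())))]
        while stack:
            func, callees = stack[-1]
            for callee in callees:
                if callee not in visited:
                    visited.add(callee)
                    stack.append((callee, iter(call_graph.get(callee, ()))))
                    break
            else:
                # Every callee is done (or part of a cycle): emit this function.
                stack.pop()
                order.append(func)
    return order


def callee_context(func: int, call_graph: dict[int, list[int]],
                   understanding: dict[int, str]) -> list[str]:
    """Collect the stored summaries of a function's direct callees only."""
    return [understanding[c] for c in call_graph.get(func, ()) if c in understanding]
```

Because callee_context reads short stored summaries of direct callees instead of their code, the context handed to a caller grows with its fan-out, not with the size of the module.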

Leaf vs parent tiering

Leaf functions are the bulk of any module, so name them with a cheap leaf --backend. Escalate the harder, high-fan-in parents with a stronger --parent-model. The leaf and parent backends are configured independently, so you pay frontier prices only where they earn their keep.

Cheap by default

Naming from grounded facts is bulk work, not frontier reasoning, and it is gated by the verifier and the economy, so the default backend is the cheapest capable one. The auto-detect order is OpenRouter (Kimi K2.6) first, then OpenAI, then Anthropic, then the zero-dependency offline heuristic. With no API key, the whole loop still runs offline as a dry run that exercises the bottom-up flow, context drop, events, and storage. Once you have run warden config set openrouter_api_key ... (the one-time setup above), the agents call a real model.

Dedup by stable identity

Identical functions (same stable identity) are analyzed once and reused. That is the lever that makes a module with thousands of functions tractable; you can see it in the reused events.
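The reuse step is essentially memoization keyed on identity. A minimal sketch, assuming each function index maps to some stable identity string and that analyze is whatever expensive per-function call you make (both are placeholders here, not WARDEN's API):

```python
from typing import Callable


def analyze_all(functions: dict[int, str],
                analyze: Callable[[int], str]) -> tuple[dict[int, str], list[str]]:
    """Analyze each distinct identity once and reuse the result for duplicates.

    `functions` maps function index -> stable identity (hypothetical layout).
    """
    by_identity: dict[str, str] = {}
    results: dict[int, str] = {}
    events: list[str] = []
    for index, identity in functions.items():
        if identity in by_identity:
            # Same identity already analyzed: copy the result, skip the call.
            results[index] = by_identity[identity]
            events.append(f"reused func[{index}]")
        else:
            by_identity[identity] = results[index] = analyze(index)
            events.append(f"analyzing func[{index}]")
    return results, events
```

The number of backend calls then scales with the number of distinct identities, not with the raw function count.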

There is also a lighter pass, warden agent, for when you only want names and not full per-function understanding. It sweeps unnamed functions bottom-up, proposes names grounded in hard facts, gates each through a verifier, and writes back under the economy. See the agent crew for the full backend and routing detail.

Review and correct: the human turn

This is where the loop earns its keep. Start the read-only dashboard and read what the agents produced.

warden serve                       # local dashboard at http://127.0.0.1:8787

The dashboard renders a confidence heatmap (every function tinted by who claimed its name and how sure they are), the raw and cleaned C side by side, and panels that let you search for a symbol, jump to a single function, and walk its history. When you find a wrong or thin name, fix it from the CLI:

warden show v1 412                            # everything known about a function
warden set-name v1 412 parse_packet_header    # provenance=human, locked by default

A human-set name is sovereign. It is written at the top authority tier and locked by default, so no automated source can overwrite it. Because every name and variable rename is logged, every change is reversible. You are not editing a fragile file; you are committing to a tracked, versioned knowledge base.
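One way to picture the gate is as a comparison of authority and confidence, plus a log that makes each accepted write undoable. The sketch below is a toy model of that idea only: the three-tier ranking (agent below oracle below human), the tuple comparison, and the Entry and KnowledgeBase classes are all assumptions made for illustration, not WARDEN's schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical authority tiers; the text only states that human is the top tier.
AUTHORITY = {"agent": 1, "oracle": 2, "human": 3}


@dataclass
class Entry:
    name: str
    provenance: str
    confidence: float
    locked: bool = False


@dataclass
class KnowledgeBase:
    entries: dict[int, Entry] = field(default_factory=dict)
    log: list[tuple[int, Optional[Entry], Entry]] = field(default_factory=list)

    def propose(self, index: int, new: Entry) -> bool:
        """Accept `new` only if it does not overwrite stronger work."""
        current = self.entries.get(index)
        if current is not None and new.provenance != "human":
            if current.locked:
                return False  # automated sources never touch a locked name
            stronger = (AUTHORITY[new.provenance], new.confidence) > (
                AUTHORITY[current.provenance], current.confidence)
            if not stronger:
                return False
        self.log.append((index, current, new))  # record the change for undo
        self.entries[index] = new
        return True

    def undo(self) -> None:
        """Revert the most recent accepted change."""
        index, previous, _ = self.log.pop()
        if previous is None:
            del self.entries[index]
        else:
            self.entries[index] = previous
```

In this model a locked human name rejects every agent or Oracle proposal regardless of its confidence, and undo restores whatever the entry held before the last accepted write.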

The dashboard is structurally read-only: the HTTP connection runs in query-only mode for the whole request, so no route can write. Writes go through the CLI, the library, or MCP, all of which pass through the provenance economy. See the UI reference.
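If the knowledge base is a SQLite file (an assumption on our part; the text only says "database" and "query-only mode"), SQLite's own query_only pragma gives this kind of structural guarantee. The table and column names below are hypothetical and exist only for the demonstration.

```python
import sqlite3

# Stand-in for the knowledge base, with one named function in it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (func_index INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO names VALUES (412, 'parse_packet_header')")
conn.commit()

# From here on, this connection may read but not write.
conn.execute("PRAGMA query_only = ON")

rows = conn.execute("SELECT name FROM names WHERE func_index = 412").fetchall()

try:
    conn.execute("UPDATE names SET name = 'clobbered' WHERE func_index = 412")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses the write at the engine level, whatever the caller intended.
    write_blocked = True
```

The point of enforcing it on the connection is that a route handler cannot write even by mistake, because the refusal happens below the application code.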

Bring your own model over MCP

You are not limited to WARDEN’s own agents. warden mcp serves the knowledge base over the Model Context Protocol, so Claude or any MCP-capable model can drive the same loop alongside you.

pip install -e '.[mcp]'
warden mcp                          # serves the KB over MCP (stdio)

The MCP surface exposes project reads, function facts, backend discovery, server-side agent runs, and symbol proposals. Every write through MCP is economy-gated at the KB layer, exactly like the agent crew and exactly like an import. An external model cannot clobber your locked names or a high-confidence Oracle match. Humans, the agent crew, and external models all collaborate on one knowledge base, and every change is tracked and reversible. See the MCP reference.

Iterate: converge, then carry forward

Re-running the agents is always safe. Already-confident and locked entries are skipped before a backend is even called, and the economy rejects any proposal that would overwrite stronger work. So re-run them after you lock a batch of names: a caller named with thin context in one round can be re-proposed once its callees carry your real names, and understanding converges.

warden deep v1                      # re-run; locked human names are respected
warden coverage v1                  # watch the named fraction climb

The payoff arrives on the next build. Ingest the new version and diff it. WARDEN classifies every function as unchanged, moved, modified, new, or deleted, carries all annotations forward by stable identity for unchanged and moved functions, applies a confidence penalty to fuzzy matches, and prints a changelog that separates genuine application deltas from runtime churn caused by an Emscripten version bump.

warden ingest app_v2.wasm --glue app_v2.js --label v2
warden diff v1 v2                   # carry-over plus a semantic changelog
warden history parse_packet_header  # when it first appeared, how it evolved, who named it
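The five-way classification can be sketched as an exact pass by identity followed by a fuzzy pass over what is left. Everything specific here is an assumption for illustration: the Func record, the use of Jaccard similarity over a token set as the fuzzy matcher, and the 0.6 threshold are ours, not WARDEN's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    index: int
    identity: str          # hypothetical stable identity
    tokens: frozenset      # hypothetical features for fuzzy matching


def jaccard(a: frozenset, b: frozenset) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify(old: list[Func], new: list[Func], threshold: float = 0.6):
    """Return ({new index: (status, old index or None)}, [deleted old indices])."""
    old_by_identity = {f.identity: f for f in old}
    matched_old: set[int] = set()
    result: dict[int, tuple] = {}
    pending: list[Func] = []
    # Pass 1: exact identity matches are unchanged or moved.
    for f in new:
        prev = old_by_identity.get(f.identity)
        if prev is not None and prev.index not in matched_old:
            matched_old.add(prev.index)
            status = "unchanged" if prev.index == f.index else "moved"
            result[f.index] = (status, prev.index)
        else:
            pending.append(f)
    # Pass 2: fuzzy-match the rest against still-unmatched old functions.
    for f in pending:
        candidates = [o for o in old if o.index not in matched_old]
        best = max(candidates, key=lambda o: jaccard(o.tokens, f.tokens), default=None)
        if best is not None and jaccard(best.tokens, f.tokens) >= threshold:
            matched_old.add(best.index)
            result[f.index] = ("modified", best.index)
        else:
            result[f.index] = ("new", None)
    deleted = sorted(o.index for o in old if o.index not in matched_old)
    return result, deleted
```

Annotations would be copied from the paired old index for unchanged and moved functions, and a "modified" pairing is where a confidence penalty would apply, since that match is a similarity guess and not an identity.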

Every name, type, and note you and the agents produced for v1 is already attached to the matching functions in v2. You review only the handful that genuinely changed. That is the whole point: the second through hundredth decompiles cost almost nothing.

Export the deliverable

At any point, emit a deliverable for an external tool or a teammate:

warden export v1 --format pseudo    # readable per-function pseudo-C
warden export v1 --format headers   # a C header
warden export v1 --format ghidra    # a runnable Ghidra rename script
warden export v1 --format csv       # neutral symbols for round-trip
warden import v1 names.csv --provenance human --lock   # bring edits back, economy-gated

Names you edit in Ghidra, IDA, or by hand come back through warden import, keyed on stable identity first so a name recovered against one build lands on the same logical function in another, and routed through the economy so an import never clobbers higher-authority work.
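"Keyed on stable identity first" can be read as a two-step lookup per row. The sketch below illustrates that order of resolution only; the CSV column names (identity, index, name) and the resolve_rows helper are hypothetical and are not the documented export format.

```python
import csv
import io


def resolve_rows(csv_text: str, identity_to_index: dict[str, int],
                 valid_indices: set[int]) -> dict[int, str]:
    """Map each CSV row to a function in the target build.

    Identity wins; the row's numeric index is only a fallback.
    """
    names: dict[int, str] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = identity_to_index.get(row.get("identity") or "")
        if target is None:
            raw_index = row.get("index") or ""
            if raw_index.isdigit() and int(raw_index) in valid_indices:
                target = int(raw_index)
        if target is not None and row.get("name"):
            names[target] = row["name"]
    return names
```

Resolving by identity is what lets a name exported from one build land on the same logical function in another build where its index has shifted. Each resolved name would still be submitted as a proposal through the economy, not written directly.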

Best practices

Correct high-fan-in functions first

A name you lock on a widely-called function propagates context to every caller on the next bottom-up round. Fixing one high-fan-in parent is worth more than fixing many leaves.

Lock names you trust

warden set-name locks by default. Lock a name the moment you are confident in it so no automated pass can ever touch it, and so it carries forward cleanly on the next diff.

Measure two models on a slice

Before committing to one backend for a whole module, run two on a representative slice and compare coverage and quality. Naming is cheap, gated work; pick the model that earns its cost.

Use history to see evolution

warden history <name-or-id> shows when a function first appeared, when its body actually changed, and who named it. Use it to decide whether a “modified” function needs a real re-review.

Where to go next

CLI reference

Every command, flag, and default in one place.

MCP reference

Drive the knowledge base from Claude or any MCP-capable model.

UI reference

The terminal diff view and the read-only HTTP dashboard.

Diff and carry-over

How annotations survive a rebuild by stable identity.
