warden export converts the knowledge base for a given version into one of four deliverable
formats. All four are deterministic: given the same KB state, the output is byte-identical
on every run. That means they diff cleanly in git and compose naturally with CI pipelines.
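One way to check the byte-identical guarantee in CI is to hash consecutive exports and compare the digests. The sketch below is illustrative only: `export_once` is a hypothetical caller-supplied callable (for example, a wrapper around a subprocess call to warden export) and is not part of WARDEN.

```python
import hashlib
from typing import Callable


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one export's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def is_deterministic(export_once: Callable[[], bytes], runs: int = 2) -> bool:
    """Run the export several times and report whether every run matched.

    `export_once` is a hypothetical stand-in for whatever produces the
    export bytes; it is an assumption of this sketch.
    """
    digests = {digest(export_once()) for _ in range(runs)}
    return len(digests) == 1
```

Comparing digests rather than raw bytes keeps CI logs short when a mismatch needs to be reported.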
The four formats
headers
A C header of recovered function prototypes. Use this to feed names into a downstream
C/C++ toolchain or as a human-readable symbol sheet.
pseudo
Per-function listings: the recovered name, type signature, agent summary, provenance, and
lifted pseudo-C (requires the original .wasm to be present; falls back to a mnemonic count
otherwise). Use this for manual review.
kb-text
A columnar, stable text dump of every symbol (index, stable id, lock flag, provenance,
confidence, and name), sorted by function index. The primary format for committing alongside
source and reading diffs across versions.
ghidra
A Python script that pushes recovered names back into a Ghidra project. Run it from Ghidra’s
Python console after loading the same module.
Basic usage
By default, output is printed to stdout. Pass --out <file> to write to a file instead.
Use --db to select the project database when it is not the default warden.db.
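For scripted use, the documented flags can be assembled into an argument list. This is a minimal sketch: `export_argv` is a hypothetical helper, and it uses only the --format, --out, and --db flags listed in the reference table. How a specific version is selected is not shown in this section, so it is left out here.

```python
from typing import List, Optional

FORMATS = ("headers", "pseudo", "kb-text", "ghidra")


def export_argv(fmt: str = "kb-text",
                out: Optional[str] = None,
                db: Optional[str] = None) -> List[str]:
    """Build an argument list for `warden export` from the documented flags."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    argv = ["warden", "export", "--format", fmt]
    if out is not None:
        argv += ["--out", out]   # omit to print to stdout
    if db is not None:
        argv += ["--db", db]     # omit to use the default warden.db
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a machine where warden is installed.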
Format details
headers
Emits a C header wrapped in an include guard. Each defined function gets a comment line with
its function index, Wasm type signature, provenance, and confidence score, followed by a
skeleton declaration. Only defined (non-imported) functions are emitted.
The prototypes are stubs. Parameter types are not yet recovered from the Wasm type signature.
The comment carries the full signature string so you can fill it in. This is a known limitation;
proper C prototype reconstruction is on the roadmap.
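The layout described above can be sketched roughly as follows. The `Func` record, the guard name, the comment layout, and the `void name(void);` stub shape are all assumptions made for illustration; WARDEN's actual output may differ in every detail.

```python
from typing import Iterable, NamedTuple


class Func(NamedTuple):
    # Hypothetical record shape, for illustration only.
    index: int
    name: str
    signature: str       # Wasm type signature string
    provenance: str
    confidence: float
    imported: bool = False


def render_header(funcs: Iterable[Func], guard: str = "WARDEN_EXPORT_H") -> str:
    """Render stub prototypes inside an include guard, skipping imports."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for f in sorted(funcs, key=lambda fn: fn.index):
        if f.imported:
            continue  # only defined functions are emitted
        lines.append(
            f"/* func {f.index}  {f.signature}  {f.provenance}  {f.confidence:.2f} */"
        )
        # Stub only: parameter types are not recovered, so none are declared.
        lines.append(f"void {f.name}(void);")
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"
```

Sorting by function index before emitting is what keeps a rendering like this stable from run to run.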
pseudo
Emits a readable listing of every defined function with its recovered name, type signature,
agent-generated summary, and provenance/confidence. When the original .wasm path is still
accessible in the KB, instruction mnemonics are included inline; otherwise the listing notes
the instruction count without disassembly.
kb-text
A columnar dump of every function in index order (including imports) with a fixed-width
layout designed for git diff. The format is:
index (Wasm function index), stable_id (first 16 hex chars of the full stable
identity hash), lk (L when the symbol is locked; space otherwise), provenance,
confidence, name (dash when unnamed).
When to use it: source-controlled annotation snapshots, CI regression detection, sharing the
current KB state without giving someone access to the database.
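A fixed-width row with the columns listed above could be produced along these lines. The column widths and spacing are assumptions chosen for the sketch, and `kb_text_row` is a hypothetical helper, not WARDEN's formatter.

```python
from typing import Optional


def kb_text_row(index: int, full_hash: str, locked: bool,
                provenance: str, confidence: float,
                name: Optional[str]) -> str:
    """Format one symbol as a fixed-width row (column widths are assumptions)."""
    stable_id = full_hash[:16]       # first 16 hex chars of the identity hash
    lk = "L" if locked else " "      # lock flag: L when locked, space otherwise
    shown = name if name else "-"    # dash when unnamed
    return (f"{index:>6}  {stable_id:<16}  {lk}  "
            f"{provenance:<12}  {confidence:4.2f}  {shown}")
```

Because every column before the name has a fixed width, a rename or a confidence change touches only one line in a diff.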
ghidra
Emits a Python script for Ghidra’s built-in scripting console. The script iterates over every
defined function that has a recovered name and calls fn.setName(name, SourceType.USER_DEFINED)
to apply it.
Make sure the loaded .wasm is the same binary that WARDEN ingested for that version.
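A generator for a script of this kind might look like the sketch below. How the real exported script locates each function is not described here, so matching on the name Ghidra currently shows is purely an assumption, and `ghidra_script` is a hypothetical helper. The Ghidra calls in the template (`getFunctionManager().getFunctions(True)`, `getName()`, `setName(..., SourceType.USER_DEFINED)`) are standard Ghidra API.

```python
from typing import Dict

TEMPLATE = '''\
# Illustrative rename script; run from the Ghidra Python console.
from ghidra.program.model.symbol import SourceType

NAMES = {names!r}

for fn in currentProgram.getFunctionManager().getFunctions(True):
    new_name = NAMES.get(fn.getName())
    if new_name:
        fn.setName(new_name, SourceType.USER_DEFINED)
'''


def ghidra_script(names: Dict[str, str]) -> str:
    """Render a rename script; keys are the names Ghidra currently shows.

    Keying on the current Ghidra name is an assumption of this sketch.
    """
    ordered = dict(sorted(names.items()))  # sorted keys keep output stable
    return TEMPLATE.format(names=ordered)
```

The generated text only runs inside Ghidra, where `currentProgram` is provided by the scripting environment.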
Built-in decompiler
The warden.lift module contains a pure-Python stack-machine lifter that re-folds Wasm
stack operations back into readable pseudo-C. It handles the integer subset comprehensively
including infix arithmetic, memory loads and stores, local and global variables, and function calls.
It degrades gracefully for anything unmodeled by emitting a /* mnemonic */ comment and
an opaque temporary instead of crashing.
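The folding idea can be shown with a toy lifter over a handful of instructions. This is a simplified sketch, not the warden.lift implementation: `lift_sketch`, the tuple instruction encoding, and the `localN` / `tmpN` naming are all assumptions, and the fallback path does not model the stack effect of the instruction it skips.

```python
from typing import List, Sequence, Tuple

BINOPS = {"i32.add": "+", "i32.sub": "-", "i32.mul": "*",
          "i32.and": "&", "i32.or": "|", "i32.xor": "^"}


def lift_sketch(instrs: Sequence[Tuple]) -> List[str]:
    """Fold (mnemonic, *operands) tuples into pseudo-C statement lines."""
    stack: List[str] = []   # symbolic expression stack
    lines: List[str] = []   # emitted statements
    temps = 0
    for op, *args in instrs:
        if op == "i32.const":
            stack.append(str(args[0]))
        elif op == "local.get":
            stack.append(f"local{args[0]}")
        elif op == "local.set":
            lines.append(f"local{args[0]} = {stack.pop()};")
        elif op in BINOPS:
            rhs = stack.pop()   # the top of the stack is the right operand
            lhs = stack.pop()
            stack.append(f"({lhs} {BINOPS[op]} {rhs})")
        elif op == "return":
            lines.append(f"return {stack.pop()};" if stack else "return;")
        else:
            # Unmodeled instruction: note it and push an opaque temporary.
            lines.append(f"/* {op} */")
            stack.append(f"tmp{temps}")
            temps += 1
    return lines
```

For example, `local.get 0`, `local.get 1`, `i32.add`, `i32.const 2`, `i32.mul`, `local.set 2` folds into the single statement `local2 = ((local0 + local1) * 2);`.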
How --format pseudo uses it
When you run warden export --format pseudo and the original .wasm is available, the
exporter now calls the lifter instead of dumping raw instruction mnemonics. Each function
block contains a proper pseudo-C body. Exports where the .wasm is not on disk fall back
to the previous mnemonic-count note; the switch is automatic.
Targeting a single function
Use warden lift to decompile one function by name without running a full export.
The --index N flag is useful when multiple functions share a recovered name across an
ambiguous KB state.
Python API
lift_module skips imports (they have no body) and concatenates in function-index order so
the result diffs cleanly across builds.
The lifter covers the integer and control-flow subset that Emscripten-compiled C/C++
produces in practice. Floating-point ops and SIMD instructions emit /* mnemonic */
placeholders. The output is always valid pseudo-C, never a crash or a partial file.
HTML report
warden report writes a self-contained HTML file: no server, no CDN, no build step.
Everything is inlined so the file opens from any clone with a double-click, and the output
is deterministic (same KB state in, byte-identical HTML out) so it diffs cleanly in git.
What the report contains
| Section | Description |
|---|---|
| Coverage summary | Named / total defined functions with a progress bar broken down by provenance (oracle, human, agent). |
| Confidence heatmap | Every defined function in index order. Row background hue encodes provenance; alpha encodes confidence. Solid green rows are human-verified; fading amber rows are agent guesses that need review. |
| Thread and memory model | Atomic sites, pthread markers, and shared-memory facts recorded by warden analyze. Hidden when the KB has no thread facts. |
| Changelog | The diff from the nearest earlier version: a chip summary (unchanged / moved / modified / new / deleted) followed by a “needs review” list of genuine app-level deltas. Hidden for the first version. |

The heatmap uses the following colors:
| Color | Provenance | Trust level |
|---|---|---|
| Emerald | human | Verified by hand |
| Blue | oracle | Matched against a known corpus |
| Cyan / teal | export / import | Free fact from the binary |
| Violet | string-xref | Inferred from a string reference |
| Amber | diff-carry | Carried across a version bump |
| Dark amber | agent | Model guess (lowest trust) |
| Zinc (desaturated) | (unnamed) | No symbol recovered |
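The hue-plus-alpha encoding can be sketched as a lookup that returns a CSS color. The RGB triples below are illustrative picks that roughly match the color names in the table, not WARDEN's actual palette, and `row_background` is a hypothetical helper.

```python
from typing import Optional

# RGB values are assumptions chosen for illustration.
PROVENANCE_RGB = {
    "human": (16, 185, 129),        # emerald
    "oracle": (59, 130, 246),       # blue
    "export": (6, 182, 212),        # cyan / teal
    "import": (6, 182, 212),
    "string-xref": (139, 92, 246),  # violet
    "diff-carry": (245, 158, 11),   # amber
    "agent": (180, 83, 9),          # dark amber
}
UNNAMED_RGB = (113, 113, 122)       # zinc, used when no symbol was recovered


def row_background(provenance: Optional[str], confidence: float) -> str:
    """Return a CSS rgba() value: hue from provenance, alpha from confidence."""
    r, g, b = PROVENANCE_RGB.get(provenance, UNNAMED_RGB)
    alpha = min(max(confidence, 0.0), 1.0)  # clamp to the valid CSS range
    return f"rgba({r}, {g}, {b}, {alpha:.2f})"
```

Formatting alpha to a fixed number of decimals keeps the generated markup stable across runs.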
Python API
Pass module=<Module> to either function if you have the parsed .wasm on hand; it is
optional and reserved for future inline disassembly views. The report is fully driven by
the KB without it.
Comparing across versions
Because all formats are deterministic, you can snapshot them at each version and use standard diff tooling to review what changed.
Functions that keep the same stable_id and annotations appear as unchanged lines. New functions,
dropped functions, and any confidence or provenance changes are visible immediately.
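When git is not at hand, the same comparison can be done with Python's standard difflib. This is a minimal sketch: `snapshot_diff` is a hypothetical helper that takes two snapshot strings, however they were saved.

```python
import difflib
from typing import List


def snapshot_diff(old: str, new: str,
                  old_label: str = "old", new_label: str = "new") -> List[str]:
    """Return unified-diff lines between two export snapshots."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=old_label, tofile=new_label,
        lineterm="", n=0,   # n=0 drops context so only changed rows appear
    ))
```

An empty result means the two snapshots are identical, which makes the function usable as a regression gate in CI.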
For a richer semantic changelog (which functions are new, removed, carried over, or only
partially matched), use warden diff before exporting. The diff engine
runs the same fingerprinting that export relies on, so the two views are consistent.
Reference
| Flag | Default | Description |
|---|---|---|
| --format, -f | kb-text | Output format: headers, pseudo, kb-text, or ghidra. |
| --out, -o | (stdout) | Write output to a file instead of printing to stdout. |
| --db | warden.db | Project database path (or WARDEN_DB env var). |
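A wrapper script that needs to know which database will be used can mirror the lookup described in the --db row. The precedence shown here (explicit flag, then WARDEN_DB, then warden.db) is an assumption; the table does not state the order, and `resolve_db` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_db(flag: Optional[str] = None,
               env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the database path: --db value, then WARDEN_DB, then warden.db.

    The precedence order is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return flag or env.get("WARDEN_DB") or "warden.db"
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.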