All notable changes to WARDEN are documented here. The format follows Keep a Changelog and the project aims to adhere to Semantic Versioning.
0.1.0
First public alpha, released 2026-06-07
WARDEN runs end-to-end with zero native dependencies. The warden demo command walks the whole pipeline on generated sample modules: ingest, Oracle identification, agent crew, ingest of a new version, and diff with carry-over. Roadmap phases 0 through 6 all have working, tested implementations.

Added

  • Ingestion (phase 0). Pure-Python WebAssembly binary parser covering LEB128, all standard sections, and full opcode disassembly including sign-ext, bulk-memory, reference types, threads/atomics, and SIMD immediates. Includes the name custom-section parser and an Emscripten JS-glue parser (version, dynCall signatures, pthread/PROXY_TO_PTHREAD markers).
  • Knowledge base. SQLite-backed, versioned symbol store keyed to stable function identities, with provenance, confidence, evidence, struct layouts, a thread/memory model, an audit log, and the provenance/confidence economy enforced at the write layer.
  • Stable identity and fingerprinting. Structural skeleton hash, opcode-class histogram, call-neighborhood, surviving type signature, and a deterministic MinHash fuzzy signature. One composite similarity engine reused by both the Oracle and the diff.
  • Emscripten Oracle (phase 2). Signature store, corpus builder from labeled modules, identification pass, and Emscripten-version inference. The warden.oracle.index module adds a band-based MinHash-LSH index for sublinear candidate lookup: SignatureIndex.build(store, *, bands=8), index.candidates(fp), and identify_indexed(kb, version_id, store, *, threshold=0.82, write=True). At the same threshold, identify_indexed produces results that match the linear identify(). CLI: pass --indexed to warden oracle identify. Containerized emsdk matrix scaffold under scripts/corpus/.
  • Cross-version diff and carry-over (phase 3). Match/classify pipeline (unchanged / moved / modified / new / deleted), automatic annotation carry-over (verbatim for shared identities, penalized for fuzzy matches), and a semantic changelog separating app changes from runtime churn.
  • Built-in decompiler / lifter (phase 1). warden.lift (lift_function, lift_module): a pure-Python stack-machine lifter that renders readable pseudo-C. CLI: warden lift <label> [--index N] [--out FILE]. It also backs warden export --format pseudo, so pseudocode export emits real pseudo-C instead of a mnemonic dump. Example: i32 parse_token(i32 p0, i32 p1) { return ((p0 + p1) * 7); }.
  • Agent crew (phase 4). A propose, verify, write-back loop with a deterministic offline heuristic backend (no API key required), an optional OpenAI backend (gpt-5.3-codex by default, with codex and oai aliases), and an optional Anthropic backend (claude-opus-4-8 by default). LLM backends use structured JSON output.
  • Call-graph agent strategy (phase 4). warden.analysis.callgraph provides build_call_graph(module), which returns a CallGraph with .edges, .imports_called, .indirect_callers, .table_targets, and .callees(index), and layered_schedule(module, graph=None), which returns bottom-up layers of function indices with strongly-connected components condensed via iterative Tarjan. run_agent_pass gained two keyword arguments: strategy ("call-graph" by default, or "flat") and concurrency (int, default 8). FunctionFacts gained two new fields: callee_names (list[str]) and notes (list[str]). CLI: warden agent <label> [--strategy call-graph|flat]. The call-graph strategy works in five steps:
      1. Build a static intra-module call graph. Direct calls are exact; call_indirect / dynCall indirect calls are over-approximated to table targets of the matching type from module.elements.
      2. Condense SCCs and sort them into bottom-up layers so that every function’s defined callees are in earlier layers.
      3. Run the concurrency and struct analyzers before the naming pass and route their findings into per-function notes (atomic sites, struct layouts).
      4. Process layers bottom-up, giving each function a FunctionFacts.callee_names list of its callees’ recovered names, so the naming LLM sees callee meanings before producing a name for the caller.
      5. Propose functions within the same layer concurrently in-process via asyncio, since they are independent. Blocking LLM backends run in worker threads capped by concurrency.
    Writes still go through the provenance/confidence economy, so concurrent branches sharing a callee cannot clobber each other. The "flat" strategy preserves the original single-pass, leaves-first ordering.
  • Specialized concurrency and struct analyzers (phase 4). warden.analysis.concurrency (analyze_concurrency returns a ConcurrencyReport with .shared_memory, .atomic_sites, .pthread_markers, .facts) and warden.analysis.structs (analyze_structs returns StructLayout objects with .name, .fields, .source_function). Both populate the previously-empty thread_model and structs KB tables. CLI: warden analyze <label>.
  • Verification (phase 5). Determinism verification (runs today) plus an optional wasm2c/w2c2 differential-equivalence plan that is offered when those tools are detected. A zero-dependency interpreter (warden.interp: execute_function, differential_execute) makes behavioral-equivalence checking runnable without external tooling. CLI: warden exec <label> <index> [args...].
  • Static HTML report generator (phase 6). warden.report (render_report, write_report): a self-contained HTML file (inline CSS, no server required) with a coverage summary, a confidence heatmap colored by provenance and confidence, a thread/memory model section, and the diff changelog. CLI: warden report <label> [--out FILE].
  • Exporters. C headers, readable pseudocode, a git-diffable KB text dump, and a Ghidra rename script.
  • MCP server. Optional warden mcp tool surface mirroring the GhidraMCP pattern, including backend discovery, grounded function facts, server-side agent runs, and economy-gated symbol proposals.
  • CLI and UX. warden init, ingest, versions, coverage, funcs, show, set-name, oracle, agent, agent-backends, diff, lift, exec, analyze, report, export, verify, mcp, and demo with rich terminal output.
  • Test suite (122 tests), CI (lint, types, tests, wasm validation), a PyPI Trusted Publishing release workflow, a Mintlify documentation site, a Docker image, pre-commit hooks, and a full documentation set.
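The Ingestion entry mentions LEB128, the variable-length integer encoding used throughout the WebAssembly binary format. As a minimal sketch of that encoding (the function names decode_uleb128 and decode_sleb128 are illustrative and are not taken from WARDEN's parser), decoding can look like this:

```python
def decode_uleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        # The low 7 bits carry payload, least-significant group first.
        result |= (byte & 0x7F) << shift
        shift += 7
        # A clear high bit marks the final byte.
        if (byte & 0x80) == 0:
            return result, offset


def decode_sleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a signed LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            # Sign-extend when bit 6 of the final byte is set.
            if byte & 0x40:
                result -= 1 << shift
            return result, offset


print(decode_uleb128(bytes([0xE5, 0x8E, 0x26])))  # (624485, 3)
print(decode_sleb128(bytes([0xC0, 0xBB, 0x78])))  # (-123456, 3)
```

Returning the next offset alongside the value lets a caller walk a section byte by byte without tracking lengths separately.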
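The Emscripten Oracle entry describes a band-based MinHash-LSH structure for candidate lookup. The sketch below illustrates the general banding technique only; BandIndex, minhash, and the opcode token sets are hypothetical stand-ins, and the 32-hash signature length is an assumption rather than WARDEN's actual parameter.

```python
import hashlib
from collections import defaultdict


def minhash(tokens: set[str], num_perm: int = 32) -> list[int]:
    """Deterministic MinHash: per seed, keep the smallest hash over all tokens."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(hashlib.sha256(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens
        ))
    return signature


class BandIndex:
    """Hypothetical LSH index: signatures sharing any full band become candidates."""

    def __init__(self, bands: int = 8):
        self.bands = bands
        self.buckets = defaultdict(set)  # (band number, band values) -> names
        self.signatures = {}

    def _keys(self, sig):
        rows = len(sig) // self.bands
        for b in range(self.bands):
            yield b, tuple(sig[b * rows:(b + 1) * rows])

    def add(self, name, sig):
        self.signatures[name] = sig
        for key in self._keys(sig):
            self.buckets[key].add(name)

    def candidates(self, sig):
        found = set()
        for key in self._keys(sig):
            found |= self.buckets.get(key, set())
        return found


def estimated_jaccard(a, b):
    """Fraction of signature positions that agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


index = BandIndex(bands=8)
index.add("memcpy_like", minhash({"i32.load8_u", "i32.store8", "br_if"}))
index.add("strlen_like", minhash({"i32.load", "i32.eqz", "loop"}))

query = minhash({"i32.load8_u", "i32.store8", "br_if"})
print(index.candidates(query))  # only the entry sharing at least one band
```

Only the candidates returned by the band lookup need a full similarity comparison against the threshold, which is what makes the lookup cheaper than scanning every stored signature.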
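The built-in lifter entry describes a stack-machine lifter that renders pseudo-C. A toy sketch of the idea, assuming a straight-line body with only three i32 operators and using the hypothetical helper names lift_expression and sketch_lift (not the warden.lift API), reproduces the example from that entry:

```python
def lift_expression(ops, params):
    """Symbolically run a straight-line Wasm body; return one pseudo-C expression."""
    binops = {"i32.add": "+", "i32.sub": "-", "i32.mul": "*"}
    stack = []
    for op, *args in ops:
        if op == "local.get":
            stack.append(params[args[0]])
        elif op == "i32.const":
            stack.append(str(args[0]))
        elif op in binops:
            # The right operand was pushed last, so it is popped first.
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(f"({lhs} {binops[op]} {rhs})")
        else:
            raise NotImplementedError(op)
    if len(stack) != 1:
        raise ValueError("body must leave exactly one value on the stack")
    return stack[0]


def sketch_lift(name, ops, param_count):
    """Render an i32-only function as a single-return pseudo-C definition."""
    params = [f"p{i}" for i in range(param_count)]
    signature = ", ".join(f"i32 {p}" for p in params)
    expr = lift_expression(ops, params)
    return f"i32 {name}({signature}) {{ return {expr}; }}"


body = [
    ("local.get", 0),
    ("local.get", 1),
    ("i32.add",),
    ("i32.const", 7),
    ("i32.mul",),
]
sketch = sketch_lift("parse_token", body, 2)
print(sketch)  # i32 parse_token(i32 p0, i32 p1) { return ((p0 + p1) * 7); }
```

A real lifter also has to handle control flow, locals, memory access, and multiple value types; this sketch covers only expression reconstruction from the operand stack.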
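The call-graph agent strategy entry condenses strongly-connected components and sorts them into bottom-up layers. The following is a rough standalone sketch of that scheduling idea over a plain adjacency dict; tarjan_scc and bottom_up_layers are illustrative names, the graph is made up, and WARDEN's layered_schedule may differ in detail.

```python
def tarjan_scc(graph):
    """Iterative Tarjan. SCCs come out callees-first (reverse topological order)."""
    index_of, low = {}, {}
    stack, on_stack, sccs = [], set(), []
    counter = 0
    for root in graph:
        if root in index_of:
            continue
        index_of[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(graph.get(root, [])))]
        while work:
            node, successors = work[-1]
            descended = False
            for nxt in successors:
                if nxt not in index_of:
                    index_of[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph.get(nxt, []))))
                    descended = True
                    break
                if nxt in on_stack:
                    low[node] = min(low[node], index_of[nxt])
            if descended:
                continue
            work.pop()
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[node])
            if low[node] == index_of[node]:
                component = []
                while True:
                    member = stack.pop()
                    on_stack.discard(member)
                    component.append(member)
                    if member == node:
                        break
                sccs.append(sorted(component))
    return sccs


def bottom_up_layers(graph):
    """Place each SCC one layer above its deepest callee SCC."""
    sccs = tarjan_scc(graph)
    comp_of = {n: i for i, comp in enumerate(sccs) for n in comp}
    depth = [0] * len(sccs)
    # Callee components have smaller indices, so their depth is already final.
    for i, comp in enumerate(sccs):
        for node in comp:
            for callee in graph.get(node, []):
                j = comp_of[callee]
                if j != i:
                    depth[i] = max(depth[i], depth[j] + 1)
    layers = [[] for _ in range(max(depth, default=-1) + 1)]
    for i, comp in enumerate(sccs):
        layers[depth[i]].extend(comp)
    return [sorted(layer) for layer in layers]


# Functions 3 and 4 are mutually recursive; 5 is an isolated leaf.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: [3], 5: []}
print(bottom_up_layers(graph))  # [[3, 4, 5], [1, 2], [0]]
```

Mutually recursive functions land in the same layer because they form one component, so neither can wait for the other's name.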
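Step 5 of the call-graph strategy proposes functions in the same layer concurrently, with blocking backends in worker threads capped by concurrency. One way this pattern might be written with the standard library is sketched below; blocking_propose is a hypothetical stand-in for a blocking LLM call, and run_layers is not WARDEN's run_agent_pass.

```python
import asyncio
import time


def blocking_propose(func_index, callee_names):
    """Hypothetical blocking backend: derives a name from the first known callee."""
    time.sleep(0.01)  # simulate network latency
    if callee_names:
        return f"wraps_{callee_names[0]}"
    return f"leaf_{func_index}"


async def run_layers(layers, graph, concurrency=8):
    """Process layers in order; propose each layer's functions concurrently."""
    names = {}
    limit = asyncio.Semaphore(concurrency)

    async def propose(index):
        # Callees sit in earlier layers, so their names are already recorded.
        callees = [names[c] for c in graph[index] if c in names]
        async with limit:
            # Run the blocking call in a worker thread, off the event loop.
            name = await asyncio.to_thread(blocking_propose, index, callees)
        return index, name

    for layer in layers:
        results = await asyncio.gather(*(propose(i) for i in layer))
        # Names are recorded only after the whole layer has finished.
        names.update(results)
    return names


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
layers = [[3], [1, 2], [0]]
names = asyncio.run(run_layers(layers, graph, concurrency=2))
print(names)
```

The semaphore bounds how many worker threads are busy at once, and writing names only between layers keeps functions in the same layer from observing each other's partial results.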
Last modified on June 7, 2026