Decompiler Construction: Chapter 9 - Coverage-Guided Fuzzing for Validation and Symbolic Execution

Posted May 27, 2026 Updated Jun 17, 2026

By Daniel

2 min read

Assume all your passes are wrong and have never been proven.

The idea is to find failures before someone else does, pass fail at the smallest scale first, and snowball into catastrophic behaviours.

Some of these include:

Mis-reconstructed control flow
Incorrect SSA transformations
Undefined behavior misinterpretation
Incorrect type recovery
Silent arithmetic mismatches

Standard tests do not always catch these; your best strategy is using coverage-guided fuzzing.

Core Idea

Instead of randomly generating inputs, a fuzzer:

Generates an input program
Runs it through the passes
Measures coverage
Keeps inputs that explore new paths
Mutates them to go deeper

Over time, it systematically explores edge cases.

Minimum Viable Target

Your fuzzing loop should look like this:

If behavior differs -> Bug!!!!.

Symbolic Execution

Symbolic execution runs programs with symbols instead of concrete values, tracks constraints instead of actual data. It should be as optimal and fast as possible

Inputs become variables (e.g., x = X)
Branches add constraints (e.g. if (x > 10) -> X > 10, else -> X <= 10
Each path stores: path constraints, symbolic expressions for the state

A good symbolic execution library you could utilize: Z3.

Purpose

Solves for inputs that reach specific paths or behaviors by solving constraints.

In validation

Used to detect mismatches between original and transformed IR:

Diverging path conditions Impossible constraints after transformation Missing or extra execution paths

Limitation

Path explosion (exponential growth) makes full execution infeasible; it requires bounding or selective use. In practice, it is applied with bounded depth or selective path exploration.

Oracle

An oracle determines whether the results are correct.

This means comparing behaviors in this pseudo-code

State_1 = symbolic_execution(original_IR);

Transformed = original_IR;
For every pass as p:
	Transformed = p(Transformed);
	if not equivalent(State_1, symbollic_execution(Transformed)):
	 	equivalence is defined over satisfiable path constraints and observable state

Inputs must avoid or filter undefined behavior, or the results are meaningless. Comparisons must define what differences are allowed (if any)

Additional Checks (Should consider)

SSA invariants hold CFG is valid (no broken edges or dominance violations) Type logic remains consistent

A weak oracle hides bugs. A strong oracle exposes bugs

Model

A program is modeled as an input-output system operating over an internal state.

An execution produces a trace:

T = (S, M, O)

S: final machine state (registers)
M: observable memory/abstracted states
O: observable outputs (effects)

Equivalence between executions is defined over equality of O and agreement of S and M under allowed abstraction.

Symbolic execution operates over symbolic states and path constraints, where a path is valid only if its constraints are satisfiable.

All comparisons in the oracle are performed over these trace and constraint representations.

Next Chapter: Chapter 10 - Dead Code Elimination and IR Canonicalization

Prev Chapter: Chapter 8 - Designing Safe Optimization Passes

Decompiler Construction, Chapter 9 - Coverage-Guided Fuzzing for Validation and Symbolic Execution

This post is licensed under CC BY 4.0 by the author.