Decompiler Construction: Chapter 1 - Utilizing Dynamic Binary Instrumentation for Lifting
When analyzing assembly statically, we cannot always determine execution behavior precisely. Will a branch be taken? Where does an indirect jump go? Which instructions actually execute at runtime, and at which addresses?
Dynamic Binary Instrumentation (DBI) takes a different approach: it logs execution as it happens. This lets us record the actual instruction stream, the branches taken, and the resolved targets of indirect control flow for a given execution.
Unlike static analysis, which must approximate all possible paths, DBI records the exact path taken in a specific run. The trade-off is coverage: a trace reflects only that one execution, so it does not describe every possible program behavior.
We also track a logical program counter (PC) within the instrumentation layer, which helps keep the trace correctly ordered even in cases such as self-modifying code. If execution re-enters a region that has been modified, the trace reflects the updated instructions rather than a fixed static view.
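One way to picture the logical PC is as a counter that the instrumentation layer advances once per executed instruction, stored alongside the bytes that were in memory at that moment. The sketch below assumes that reading. `TraceLogger`, `on_instruction`, and `versions_by_address` are hypothetical names, and the callback stands in for whatever per-instruction hook a real DBI framework would provide.

```python
class TraceLogger:
    """Records executed instructions in the order they were observed."""

    def __init__(self):
        self.logical_pc = 0   # advances once per executed instruction
        self.events = []      # (logical_pc, address, bytes executed)

    def on_instruction(self, address, code_bytes):
        # Copy the bytes at execution time so later writes to the same
        # memory cannot change what was already logged.
        self.events.append((self.logical_pc, address, bytes(code_bytes)))
        self.logical_pc += 1


def versions_by_address(events):
    """Group the distinct byte sequences seen at each address, in trace order."""
    versions = {}
    for _, address, code in sorted(events):
        seen = versions.setdefault(address, [])
        if not seen or seen[-1] != code:
            seen.append(code)
    return versions


# Simulated self-modifying code: the byte at 0x1000 is rewritten between
# two executions of the same address.
memory = bytearray(b"\x90")           # x86 nop
logger = TraceLogger()
logger.on_instruction(0x1000, memory)
memory[0] = 0xC3                      # the program overwrites itself with ret
logger.on_instruction(0x1000, memory)
```

Because each event carries its own copy of the bytes, the two executions of address 0x1000 stay distinguishable, and the counter fixes their relative order.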
The fundamental information we log includes:
- The executed instruction at each address.
- The resolved target of control-flow transfer instructions (jumps, calls, returns).
- Key CPU state changes required for semantic reconstruction (e.g., flags or mode-dependent behavior).
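As a rough illustration, the three items above could be held in a single record per executed instruction. The field names, the `resolved_transfers` helper, and the sample addresses and flag values are assumptions made for this sketch, not a fixed trace format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TraceRecord:
    address: int                          # where the instruction executed
    raw_bytes: bytes                      # the instruction bytes as executed
    branch_target: Optional[int] = None   # resolved target, if control flow
    state: Dict[str, int] = field(default_factory=dict)  # e.g. {"zf": 1}


def resolved_transfers(trace: List[TraceRecord]) -> List[Tuple[int, int]]:
    """Return (source, target) pairs for every logged control-flow transfer."""
    return [(r.address, r.branch_target) for r in trace
            if r.branch_target is not None]


# Illustrative x86 bytes; the addresses and flag values are made up.
trace = [
    TraceRecord(0x401000, bytes.fromhex("39d8"), state={"zf": 1}),         # cmp eax, ebx
    TraceRecord(0x401002, bytes.fromhex("7404"), branch_target=0x401008),  # je, taken
    TraceRecord(0x401008, bytes.fromhex("c3"), branch_target=0x401100),    # ret
]
```

Keeping the resolved target on the record itself means later analysis never has to re-derive where a jump, call, or return went.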
With this, we have the core execution data needed for decompilation and IR reconstruction. As an added benefit, a saved trace can be revisited and re-analyzed at any later time.
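Saving a trace can be as simple as writing one record per line. The following is a minimal sketch that assumes a JSON Lines layout with hypothetical keys `addr`, `bytes`, and `target`. Any stable format would do.

```python
import io
import json


def save_trace(records, fp):
    """Write (address, raw_bytes, target) tuples as one JSON object per line."""
    for address, raw_bytes, target in records:
        entry = {"addr": address, "bytes": raw_bytes.hex(), "target": target}
        fp.write(json.dumps(entry) + "\n")


def load_trace(fp):
    """Read back the tuples written by save_trace, preserving their order."""
    records = []
    for line in fp:
        if line.strip():
            entry = json.loads(line)
            records.append(
                (entry["addr"], bytes.fromhex(entry["bytes"]), entry["target"])
            )
    return records


# Round trip through an in-memory buffer; a real run would use a file.
original = [(0x401002, b"\x74\x04", 0x401008), (0x401008, b"\xc3", None)]
buffer = io.StringIO()
save_trace(original, buffer)
buffer.seek(0)
restored = load_trace(buffer)
```

Raw bytes are stored as hex strings because JSON has no binary type, and a missing target is stored as `null`.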
To improve program coverage, traces from multiple runs can be logged and merged during analysis.
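Merging can be sketched as a union of the transitions observed in each run. Here each trace is simplified to a list of executed addresses, and `merge_traces` and `covered_addresses` are hypothetical helper names. A real merge would also need to reconcile differing instruction bytes at the same address.

```python
from collections import defaultdict


def merge_traces(traces):
    """Union the observed address-to-address transitions of several runs."""
    successors = defaultdict(set)
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            successors[src].add(dst)
    return successors


def covered_addresses(traces):
    """Every address executed in at least one run."""
    return set().union(*traces) if traces else set()


# Two runs that diverge at the branch at 0x12.
run_a = [0x10, 0x12, 0x20]   # branch taken
run_b = [0x10, 0x12, 0x14]   # branch falls through
merged = merge_traces([run_a, run_b])
```

After the merge, `merged[0x12]` contains both successors, which neither run reveals on its own.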
Next Chapter: Chapter 2 - Designing an ISA-like Intermediate Language (IL)