Branch simplification and recovery are not complicated conceptually, but doing them correctly without breaking semantics is where things can get fairly weird.
At the low level, control flow is just jumps. The goal here is to reduce these to a more conceptual understanding of what is going on at the IR level while safely preserving the same logic.
When simplifying IR control flow heavily, you also have the bonus of usually making SSA and CFG construction faster.
Transformations are only safe if they present no side effects and no syntactical violations occur with the IR.
Here are some of the most common ones:
Basic Branch Merging
1
2
| if (?) then goto L; end
if (?) then goto L; end
|
1
| if (? or ?) then goto L; end
|
Sequential Folding
1
2
3
| if (?) then
if (?) then goto L; end
end
|
1
| if (? and ?) then goto L; end
|
Branch Normalization (Inversion + Fall)
1
2
3
| if (!?) then goto L1; end
goto L2;
L1:
|
1
| if (?) then goto L2; end
|
Loop recovery
1
2
3
4
5
| L1:
....
If (?) then goto L2; end
Goto L1:
L2:
|
1
2
3
| Do {
....
} while (not ?);
|
Jump Threading
1
2
3
| if (?) then goto L1; end
L1:
goto L2;
|
1
| if (?) then goto L2; end
|
Empty Block Removal
This block has no value and should be eliminated entirely by redirecting all incoming edges to L2.
Notes
All of these transformations seem trivial; any control flow transformations on the IR you must check if they do not have any structural violations (e.g., “break or continue not inside of a loop”). Any wrong transformation WILL break control flow integrity.
Next Chapter: Chapter 12 - Abstract Modeling: Pages, Regions, Self-Modifying Code, and Indirect Calls
Prev Chapter: Chapter 10 - Dead Code Elimination and IR Canonicalization