Diff
Chapter 4 ยท Part II begins ยท Calcagno 2009, O'Hearn 2019, Arjovsky 2019, Rubin 1915
Chapter 3 showed the same abstract schema under eight names. This chapter defines the primitive: snapshot before, snapshot after, XOR. What flipped is figure; what held is ground. Diff is the contrast that abduction operates over.
Mechanic
A driver turns the key and gets nothing: no crank, no start. Yesterday the car started fine. The mechanic turns the key herself, watches the dash lights dim and the starter give a single click, and two hypotheses fire: weak battery, or a bad connection at the terminals. Nobody taught her a hypothesis-generation algorithm. The shape of the failure named the next test.
Where did the hypotheses come from? She compared two states: the car that started yesterday, the car that won't crank today. She noted what changed (dim lights, a single click, voltage sagging under load) and what stayed the same (fuel level, security light, the key turning freely). The changes pointed at the starting circuit. The hypotheses followed from the diff.
Diff is computable. The mechanic performed it in her head. OBD-II performs it in silicon. Facebook Infer performs it on code. The representation changes; the operation stays the same.
Primitive
Diff is the substrate of abduction, not abduction entire. It produces the contrast that abduction operates over:
Given expected state E, observed state O, and background frame B, partition what differs from what remains invariant.
Abduction then uses that partition to generate candidate causes H: minimal revisions to B that would make O unsurprising. The partition itself is the primitive. All eight names in Chapter 3 instantiate it. The simplest encoding: take two snapshots of a system's state, one before an event and one after. The comparison partitions state into two sets:
Figure — what changed. Variables whose values differ between snapshots. In Gestalt terms, what pops out against a stable background.
Ground — what held. Variables whose values stayed the same. The context that remained invariant while the figure shifted. In separation logic, the frame that bi-abduction infers.
Formally:
diff(state_before, state_after) → (figure, ground) The partition is symmetric. It doesn't matter which snapshot is “first.” But use is asymmetric. The before-state is the baseline; the after-state is the perturbation. The diff tells you what the perturbation touched.
Caution: diff gives candidates, not causes. A changed variable may be cause, effect, symptom, or coincident noise. The mechanic's diff puts the chief complaint (no crank) next to the diagnostic signs (dim lights, single click, sagging voltage), all in the figure. The diff alone does not know which is upstream. Separating cause from effect requires a dependency graph or an experiment. The diff just names what changed.
XOR
The simplest instantiation: state is a set of key-value pairs. The figure is every key whose value changed. The ground is every key whose value didn't.
Apply it to the no-start:
Five figure entries: no crank, no start, dim lights, a single click from the starter, and a slightly lower resting voltage. Two ground entries: fuel and the security light, both unchanged.
The hypotheses follow from the figure. Dim lights, a single click, no crank. What connects them? The starting circuit: battery, wiring, the starter motor, the ignition switch. Each can starve a crank, and the battery and wiring sit upstream of the dimming lights and the sagging voltage too. The ground tells the mechanic what to deprioritize. Fuel is full and the security light is off, so neither a fuel fault nor an immobilizer lockout is implicated (though that doesn't eliminate them unless the measurement set is complete).
This is the contrast primitive. Hypothesis generation, ranking, and testing build on top of it.
Degrees of freedom
The diff above is the minimal case: one before, one after, one partition. The variants in the literature add degrees of freedom to this primitive:
| Variant | Inputs | Outputs | What it adds |
|---|---|---|---|
| Unary diff | One before, one after | Figure + ground | The primitive. This chapter. |
| Bi-abduction | Partial before, partial after | Inferred frame + inferred anti-frame | Infers the ground autonomously. Ch 5. |
| Incorrectness | One before, one after | Under-approximation of bugs | Flip polarity: attend to failure, not success. |
| Tri-abduction | Fork: shared start, two branches | Causal edge (what the branch changed) | Diff across branches, not just time. Ch 6. |
Each step adds an operand. One snapshot pair gives one frame. Two pairs (actual and counterfactual) give one causal edge. N pairs across N branches give a typed subgraph. The pattern stays diff; the arity grows.
Three witnesses
Three systems, three decades, three fields. Each encodes the diff. They are rarely presented as instances of one operation.
OBD-II (1996) — hardcoded diff
OBD-II reads sensor states, diffs against expected values, and generates fault codes. (Real OBD-II adds thresholds, monitors, and enable conditions; "hardcoded diff" is a simplification of the core logic.) Vehicles have run this since 1996.
But OBD-II is hardcoded. The fault tree is hand-authored, the hypotheses enumerated in advance. An engineer decided which deviations map to which codes. If a failure mode wasn't anticipated, nothing fires. The primitive works; it just runs on a fixed table.
The limitation is the table. Every hypothesis must be written down before the system ships. No table entry, no hypothesis. The diff is there; the inference is manual.
Facebook Infer (2009) — automated diff
Infer (Calcagno et al. 2009) runs bi-abduction on millions of lines of production code. Given a function's precondition and postcondition, it infers the frame (heap the function didn't touch) and the anti-frame (heap that must exist for the function to be safe).
The figure is what the function modifies; the ground is what it leaves alone. Infer doesn't require the programmer to specify the ground. That's the “bi”: abduction in both directions, computing what must hold before and what the function preserves.
Infer moved the diff from a hand-authored table (OBD-II) to an automated inference engine. Same primitive, higher automation.
IRM (2019) — learned diff
Invariant Risk Minimization (Arjovsky et al. 2019) uses environment variation to force figure/ground separation. Train a model across multiple environments. Features that predict the outcome in all environments are invariant (ground). Features that predict in some but not others are spurious (figure).
IRM diffs across environments rather than across time. Instead of comparing two snapshots of one system, it compares the same learning task under different conditions. Invariant features are ground; environment-specific features are figure. (Note the inversion: in IRM, the invariant features are the causal signal you want. Calling them "ground" follows the changed/unchanged definition but reverses the ordinary sense of "figure = thing of interest." The role assignment holds; the valence flips.)
| System | Year | Diff over | Figure | Ground |
|---|---|---|---|---|
| OBD-II | 1996 | Expected vs. observed sensor values | Fault codes (hand-authored) | Normal operating range (hand-authored) |
| Infer | 2009 | Precondition vs. postcondition | Modified heap (automated) | Frame: untouched heap (inferred) |
| IRM | 2019 | Environment A vs. environment B | Spurious features (learned) | Invariant features (learned) |
Three encodings. Handcoded table (OBD-II), automated inference (Infer), learned separation (IRM). IRM is the loosest analogue; it seeks invariant representations rather than literally diffing states. But the structural role holds: partition observations into signal and noise.
Code: full loop
Combine the primitive with hypothesis generation. Given a diff, produce candidate explanations by examining what the figure touches in a dependency graph.
Four candidates from the figure: battery, wiring, starter motor, ignition switch. They don't weigh equally. The battery is upstream of every changed variable (the crank, the lights, the click, the resting voltage), and wiring of most; the starter motor explains the crank and the click, the ignition switch only the crank. The ground (full fuel, security off) keeps the fuel system and the immobilizer off the list. Check the battery and its connections first.
Notice what the code does not do. It does not test the hypotheses, rank them, or estimate their probability. It generates them. The diff is a hypothesis-generation primitive. Testing is induction. Ranking is economy of research (ch 8). The diff names the candidates.
What breaks
The diff requires you to know what to observe. Every variable was chosen by someone: the mechanic who checked six gauges, the engineer who wired six sensors, the programmer who logged six fields. If the relevant state lives in a variable nobody snapshotted, the diff misses it.
Return to the mechanic. She snapshotted resting voltage but never voltage under load. If the real cause is a battery that holds 12.1V at rest yet collapses to 8V the instant the starter pulls current, the diff cannot find it. Load voltage isn't in either snapshot. Not in the figure, not in the ground. Absent.
This is the structural limitation. The unary diff partitions observed state into figure and ground. It cannot reason about unobserved state. The hypothesis space is bounded by what you chose to measure.
OBD-II shows this concretely. If a failure involves a sensor the ECU doesn't monitor, no code fires. Engineers add more sensors, but you can never instrument everything. Some state will always be unmeasured.
Bi-abduction addresses this by inferring the frame: the state the operation must not have touched for the result to be valid. It reasons backward from the postcondition, including state never explicitly observed. The diff goes from "compare what you measured" to "infer what you must have missed."
Sources
| Rubin 1915 | Synsoplevede Figurer. Figure-ground segregation in visual perception. The perceptual ancestor of diff. |
| SAE J1962 (1996) | OBD-II standard. Diagnostic connector, protocol, and trouble code conventions. The diff, hardcoded since 1996. |
| Calcagno et al. 2009 | "Compositional Shape Analysis by Means of Bi-Abduction." POPL. The frame inference engine behind Facebook Infer. Automated figure/ground from separation logic. |
| Arjovsky et al. 2019 | "Invariant Risk Minimization." Environment variation as the lever for figure/ground separation. Learned diff across training conditions. |
| O'Hearn 2019 | "Incorrectness Logic." POPL. Flip the polarity of the diff: under-approximate bugs instead of over-approximating correctness. |
| Ernst et al. 2001 | "Dynamically Discovering Likely Program Invariants to Support Program Evolution." Daikon: infer invariants (ground) from observed execution traces. More samples, sharper ground. |
Neighbors
- Methodeutics
- Ch 2: Security and Uberty — the tradeoff that makes abduction fertile
- Ch 8: Economy of Research — selecting among the hypotheses this chapter generates
- Abduction — the blog post this chapter formalizes
External
- Facebook Infer — bi-abduction in production
OBD-II (Wikipedia)
- Arjovsky et al. 2019 — Invariant Risk Minimization (arXiv)