Temporal Compression: A Crosswalk

For agents and reviewers. This document is the source-of-truth backing the reading section. Every claim in the chapters traces to a citation here. Every citation was fetched and validated against the original paper. If a claim looks wrong, check this file first — the chapter may have simplified. If this file is wrong, open an issue.

Video codec theory, temporal graph theory, and sheaf cohomology on directed networks all work on directed dependency graphs with time-ordered composition. Each field has tools the others lack. This document maps the vocabulary, grades the parallels, and states the open problems.


Shared structure

Parallels graded on a three-level rubric: genuine (same formal primitive or theorem pattern under relabeling), moderate (same structural role but different state space/objective), forced (workflow resemblance or metaphor only).

Event graph / Prediction dependency DAG

Both fields admit DAG representations of temporal dependency. Temporal graph researchers call it a temporal event graph (TEG) (Mellor 2018) or weighted event graph (Kivelä et al. 2018). Codec engineers use a prediction dependency DAG for packet or frame dependencies (Chou & Miao 2006). Closely analogous static DAG representations: the TEG encodes event adjacency under time order; the prediction DAG encodes decode dependencies over packetized data units. The identification is representational, not ontological: same structural pattern, different node semantics. This observation is ours, not made in the cited papers.

Chain fragility (genuine parallel)

TVG: Temporal reachability does not compose transitively. A→B at t₁ and B→C at t₂ only yields A→C if t₂ ≥ t₁ + ζ(e₁, t₁) — arrival must precede next departure, including latency (Casteigts et al. 2012, §4.4). When the timing constraint fails, the journey breaks even though both edges exist.

Codecs: P-frame chains break when a reference is lost — every downstream frame becomes undecodable even though the data exists. This is dependency failure under reference loss, not a timing constraint.

Shared: Both exhibit fragility of sequential dependency: downstream validity depends on every earlier link in the chain. The failure mechanisms differ — TVG chains break when time-ordering fails to compose; codec chains break when a reference node is lost. The structural pattern (sequential dependency where any break severs the tail) is the same.

Algebraic analogue: Krishnan (2014, Theorem 5.12) proves that duality gaps (max-flow < min-cut) arise from failures of exactness in directed sheaf cohomology. “Local consistency fails to globalize” is the structural content of chain fragility in sheaf-theoretic language. (The three-way connection is this crosswalk’s reading, not Krishnan’s.)

Grading rationale: genuine. Both instantiate sequential dependency over a time-ordered dependency graph where tail validity requires every predecessor. The trigger differs; the dependency pattern does not.

Temporal disconnection ↔ Scene-change detection (forced parallel)

TVG: Combinatorial reachability failure; forward composition stops working.

Codecs: Statistical change-point in visual prediction; forward prediction stops being useful.

Different failure criteria, different formal objects. The analogy is only at the workflow level: both detect when forward prediction should be abandoned.

Waiting ↔ Buffering (forced parallel)

TVG: Waiting at intermediate vertices when the next edge is not yet available. Bounded waiting changes complexity and expressivity (Casteigts et al. 2021).

Codecs: B-frame buffering via the decoded picture buffer (DPB) (Richardson, H.264/AVC). DPB exists because decode order, display order, and reference management diverge.

Both involve deferred usability under temporal constraints, but the mathematical objects are not close.


Temporal graph theory (TVG side)

Formal framework

A temporal graph is a 5-tuple G = (V, E, T, ρ, ζ) where ρ is a presence function and ζ is a latency function. The formalism is edge-time based, not restricted to snapshot sequences. It unifies evolving graphs, temporal networks, and schedule-conforming paths (Casteigts et al. 2012).

Journey

A time-respecting path: a sequence of edges where each edge departs after the previous one arrives, t_{i+1} ≥ t_i + ζ(e_i, t_i). Reachability is neither symmetric nor guaranteed to compose transitively.

Temporal connectivity classes

13 classes with strict inclusion (Casteigts et al. 2012, §5, Figure 4). The paper connects some classes to necessary conditions for particular distributed problems.

Journey optimality criteria

Three formally distinct objectives:

(Casteigts et al. 2012, §4.5)

Waiting semantics

Bounded vs. unbounded waiting. Strict vs. non-strict journeys. These choices change complexity results (Casteigts et al. 2021).

Temporal spanners

Sparse subgraphs preserving journey distances up to multiplicative stretch α (called “temporal α-spanners”). Temporal cliques admit sparse spanners (Casteigts, Peters & Schoeters 2021). Bilò et al. 2022 give upper and lower bounds on spanner size. The optimization target is edge count, not bits.

Blackout-tolerant spanners

Spanners that remain valid under timestep-wide edge deletion (Blackout-tolerant spanners, 2023).

Timestamp perturbation

Enright et al. 2025 model worst-case reachability change under timestamp perturbation.

Temporal network compression

Adhikari et al. 2017 condense temporal networks while preserving propagation dynamics. Allen et al. 2024 propose a commutator-based chronology compression method for preserving epidemic dynamics under aggregation. These are heuristic or dynamical-preservation programs, not formal rate-distortion theory.

Event graphs

Mellor (2018) defines the temporal event graph (TEG), a static DAG encoding event adjacency under time order; DAG-ness is proved in Lemma 3.2. Kivelä et al. (2018) define the weighted event graph, which explicitly encapsulates the complete set of δt-constrained time-respecting paths. Mellor’s TEG is narrower than Kivelä’s construction.


Video codecs side

Temporal prediction (H.261 → H.264)

H.261 (1988, ITU-T H.261) uses intra/inter coding with motion-compensated prediction. MPEG-1 (1993, ISO 11172-2) adds B-pictures. H.264/AVC (2003, ITU-T H.264) adds multiple reference frames, decoded picture buffer management, and rate-distortion optimized mode selection. The dependency structure is engineered as a DAG, not usually treated as a formal mathematical object.

I-frame / P-frame / B-frame

B-frames have no clean TVG analogue.

Rate-distortion optimization

Bits vs. reconstruction quality via Lagrangian optimization: min D + λR. Sullivan & Wiegand (1998) systematize this approach for video encoder control, applying the Lagrangian framework of Shoham & Gersho (1988) to hybrid codecs (Sullivan & Wiegand 1998). TVG theory has no direct equivalent.

Packet dependency formalization

Chou & Miao 2006 formalize packet dependencies for rate-distortion optimized streaming. The dependency DAG is taken as input to a scheduling problem, not itself optimized as a compressible object.

Error accumulation and checkpoint resets

P-frame chains accumulate quantization or prediction error across references. I-frames reset that accumulation. TVG spanners preserve path properties approximately, but they do not model per-hop reconstruction error with explicit reset points.

Error concealment

Approximate reconstruction when a reference is lost. TVG theory studies deletion and survivability, but not graceful approximate recovery of a missing journey.

Multi-scale structure

Codecs operate at block, frame, and GOP levels simultaneously. Many temporal network compression papers work on snapshot sequences and treat the snapshot as the atomic unit.

Static graph compression

Navlakha et al. 2008 propose MDL-based graph summarization with bounded per-node reconstruction error. This is not Shannon rate-distortion theory: no rate-distortion function, no converse bounds. A temporal extension with journey-aware distortion remains open.


Converging from a third direction: algebraic topology on directed networks

Three bodies of work approach directed dependency networks from outside both video codecs and TVG theory. They do not solve the bridge, but they sharpen where it could be built.

Cellular sheaves on directed graphs (strongest candidate)

Sheaf cohomology provides algebraic tools for distinguishing removable from irreducible edgewise error on directed dependency graphs. This is the closest formal contact point with codec-style predictive error found so far.

Predictive coding as a sheaf-coboundary system. Seely (2025, NeurIPS Workshop) constructs a cellular sheaf over a computational graph. The coboundary operator

δ⁰: C⁰ → C¹, where (δ⁰s)_e = s_v − W_e s_u

maps activations to prediction errors. This has the same algebraic form as motion-compensated subtraction (current frame minus predicted frame). The predictive-coding energy is E = ½‖δ⁰s‖², and inference is diffusion under the sheaf Laplacian L = (δ⁰)ᵀδ⁰.

With clamped inputs/outputs, Seely replaces δ⁰ by a relative coboundary D. Inference minimizes ½‖Dz + b‖². The Hodge decomposition is exact:

b = (−Dz*) + r*, where −Dz* ∈ im D (removable by inference) and r* ∈ ker Dᵀ (irreducible)

The first cohomology H¹ = C¹/im δ⁰ ≅ ker(δ⁰)ᵀ captures prediction-error patterns that no choice of activations can eliminate. The learning gradient factorizes as ∂E/∂W_e = (Hb)_e · (Gb)_uᵀ, where H is the harmonic projector and G is a pseudoinverse. “Learning requires both a residual and a source. If either vanishes, the update at edge e is zero.”

Two points matter for the codec analogy: the framework is not restricted to DAGs (recurrent topologies handled via monodromy operator Φ), and it is linear networks only. The prediction step in codecs (subtract motion-compensated reference from current frame) is linear, though the full codec pipeline includes nonlinear stages. The linear restriction aligns with the prediction stage specifically, not end-to-end.

Network coding sheaves. Ghrist & Hiraoka define a network coding sheaf F on a directed graph X = (V, E). Theorem 8: H⁰(X; F) = valid information flows. Max-flow ≤ min-cut derived via long exact sequence. Proposition 11 (Network Robustness): H⁰_Z(X; F) = global flows on failure network G\A; robust iff p⁰ = 0. Proposition 13 (Data Merging via Mayer-Vietoris): local flows merge iff g⁰(σ_U, σ_V) = 0. Exact network coding over a field k — no temporal dynamics, no noise.

Semiring generalization. Krishnan 2014 treats sheaves on digraphs as functors from incidence posets into semimodules. Theorem 5.12: sheaf-theoretic MFMC — flow values equal the homotopy limit over cuts; equality when cohomology is exact. Duality gaps = failures of exactness. Works over semirings — handles probabilities, capacities, and tropical quantities natively.

What sheaf theory still does not give: temporal dynamics in the sheaf structure, rate-distortion, checkpoint/reset formalism.

Tropical algebra / temporal state machines (quantitative and invariance side)

Temporal state machines (PMC 9792072). The tropical semiring T = (R∪{∞}, min, +) is physically realized as (first-arrival, delay). Key result — time-translational invariance constraint: all pure race logic must satisfy f(t₁+δ,…,tₙ+δ) = f(t₁,…,tₙ)+δ. Consequences:

Parametric tropical geometry. Joswig et al. 2019 study parametric shortest paths via tropical geometry. Parametric weights (intervals), not temporal dynamics. Should not be cited for temporal state or non-commutativity claims.

Persistent homology / path homology (weaker than expected)

Chowdhury & Huntsman 2020 apply path homology to temporal networks through static digraph snapshots via sliding time windows, not the temporal structure itself. No stability theorems. No journey or reachability connection.

Pritam et al. 2025 classify temporal graphs using persistent homology with average-timestamp-gap filtrations. No stability theorems in the algebraic-topology sense. No journey connection.

This literature supplies summary invariants and algebraic tools, but not a temporal-compression theory.

Speculative synthesis

A cellular sheaf on an event/dependency graph makes prediction error algebraic. Seely provides the operator dictionary:

Krishnan’s semiring generality (Theorem 5.12 works over semirings, so tropical coefficients are algebraically admissible) and the temporal state machines’ invariance result (the boundary between tropical-linear and tropical-multiplicative is the boundary between stateless prediction and stateful buffering) strengthen the synthesis.

What remains missing: an encoding family, a distortion functional, a rate constraint, and temporal dynamics in the sheaf structure. Most of the ingredients exist separately; the assembly doesn’t.


Computational attempts (March 2026)

Three concrete computations were run against the open problems. Results below.

Finding 1: Tropical MFMC breaks on event graphs (strongest result)

On a 4-vertex temporal graph, constructing the event graph (6 nodes, 6 edges, DAG) and defining a cellular sheaf with tropical stalks T = R∪{∞}:

This is not a failure but a structural finding: Krishnan’s Theorem 5.12 (sheaf MFMC over semirings) does not transfer naively to the tropical semiring on event graphs. The “flow” is an earliest-arrival schedule, and the “cut” must be redefined in terms of tropical potentials rather than crossing-edge minima. The obstruction is algebraic (lack of additive inverses), not an artifact of the example.

Status: Likely novel and nontrivial. Publishable as a short note if the failure criterion is made precise (counterexample + theorem characterizing when naive tropical cuts fail). Closest existing work: Ghrist-Hiraoka (2011) and sheaf MFMC over abelian groups (2017), both in additive/vector-space settings where conservation makes sense.

Finding 2: Checkpoint spacing formula (known math, new interpretation)

For a linear prediction chain with iid weight matrix W and per-hop Gaussian drift N(0, σ²I):

Status: The mathematics (Green’s function on a path with transport and drift) is standard. The codec interpretation (this recovers the GOP heuristic and corrects the condition-number intuition) is new but not sufficient for a standalone paper. Closest: Hansen-Ghrist (2019) on spectral sheaf theory, and standard matrix-weighted graph Laplacian literature.

Finding 3: R-D achievability via timestamp coarsening (novel direction, shaky converse)

Status: Novel direction, probably new at the exact formulation level. Not reviewer-ready. Closest: Lossy compression of link streams (2020), which gives an information-theoretic framework for lossy temporal graph compression but does not use journey-aware distortion.


Open problems (updated)

1. Tropical flow-cut duality on event graphs

The naive tropicalization of sheaf MFMC breaks. What is the correct notion of tropical cut on a temporal event graph? Does Krishnan’s equalizer formulation (avoiding explicit subtraction) resolve the obstruction, or is a fundamentally different cut notion needed? Characterize the class of semirings for which the sheaf MFMC transfers to event graphs.

2. R-D bounds for journey observables

The achievability direction works (timestamp coarsening). The converse needs: (a) a repaired distortion definition (additive slack, not multiplicative stretch), (b) a coupling argument showing the tensor-level converse extends to end-to-end journeys, or (c) a different approach entirely. The lossy link-stream framework of Lamarche-Perrin (2019) is the closest existing work and should be compared.

3. Checkpoint spacing on general graphs

The linear chain formula E[‖r*‖²] = σ² · tr Gₙ(W) is known math. The open question is whether it generalizes to non-chain dependency graphs (trees, DAGs with fan-out) in a way that produces nontrivial results. On general graphs, the Green’s function structure is richer and the checkpoint placement becomes a combinatorial optimization.


Computational attempts (April 2026)

Seventeen hypotheses tested against the 2n-3 temporal spanner conjecture. Key findings:

Finding 4: Forward reachability is transitive, backward is not

The tropical semiring (min, +) composes associatively forward in time: if A reaches B by time t₁ and B reaches C at time t₂ ≥ t₁, then A reaches C. Backward composition fails — you can’t un-arrive. This asymmetry kills every symmetric proof technique (union-find, undirected spanning trees, Farey trees, BGP-style route composition). Six of seventeen hypotheses died from this symmetry mismatch alone.

Status: Structural finding. Explains why the semiring gap (graph reachability ≠ temporal reachability) is one-directional. Any proof technique that assumes undirected, transitive, or time-independent structure will fail on temporal spanners.

Finding 5: Best-response dynamics converge in 2 rounds (PoA ≤ 1.25)

Model each vertex as a selfish agent that drops its most redundant edge. Sequential or simultaneous best-response converges to a Nash equilibrium in exactly 2 rounds. The equilibrium matches centralized greedy optimum (Price of Anarchy ≤ 1.25). Per-node degree is O(1) (avg 3-4 edges). This is the first distributed temporal spanner construction — the literature has none.

Status: Novel and publishable. Distributed temporal spanner construction is an open problem. The game-theoretic characterization (Nash = near-optimal, 2-round convergence) adds a new angle. Construction complexity is O(k⁵) due to per-edge reachability checking.

Finding 6: The 4k-3 bound may not hold for K_{k,k}

Exhaustive search at k=4 confirms ~6% of random all-distinct K_{k,k} matrices genuinely require >4k-3 edges for a temporal spanner. Budget compliance drops to 20% at k=8. Optimal spanner size scales as ~4.1k at k=10, growing faster than 4k-3. The CPS conjecture (2n-3) is stated for K_n, not K_{k,k} — the bipartite case may have a different (larger) bound.

Status: Relevant to conjecture scope. If confirmed at larger k, this clarifies that the 2n-3 conjecture applies to complete graphs under the dismountability reduction, not directly to bipartite temporal cliques with arbitrary timestamps.

Finding 7: Relay value is non-local

No local heuristic (per-node edge selection, column degree, timestamp spread, Farey mediant) predicts which edges are critical relays. An edge’s value as a temporal relay depends on global timestamp structure — whether it bridges timestamps for pairs that have no alternative path. Seven hypotheses testing local construction rules all failed. An O(n) construction appears impossible; the best known is O(k⁵).

Status: Negative result. Explains why the conjecture resists constructive proofs: the right edges can only be identified by global reachability queries, not local criteria.

Finding 8: Three proof-technique families are structurally incompatible

Systematic application of the Proof Manual (proof-manual.yml) identified three families that are structurally killed:

Status: These are permanent obstructions, not technical gaps. Any future proof attempt using volumetric, probabilistic, or algebraic rank methods will hit the same barriers. The proof must be combinatorial and constructive.

Finding 9: SM(k) optimal spanner = border of k×k matrix

For the shifted-diagonal matrix SM(k), the optimal temporal spanner is the border: all edges in the first row, last row, and M⁻/M⁺ matchings. Total: 4k-4 edges. The interior edges are redundant because SM’s monotone timestamps allow rerouting through the border. This is the I-frame/B-frame structure from codec theory: the border edges are I-frames (essential), and the interior edges are B-frames (redundant given the I-frames).

Status: Minor but clean result. Connects to the codec analogy from Chapter 4.


Questions for discussion

  1. Where does the event graph ≅ prediction dependency DAG identification break first: B-frame bidirectionality, multi-reference prediction, or weighted event graph δt-semantics?
  2. Temporal spanners optimize edge count, codec R-D optimizes bits. Is edge count a defensible proxy for bitrate, or does the encoding family matter fundamentally?
  3. Is the lack of per-hop error accumulation in standard TVG a conceptual absence, or a modeling choice from exact/unweighted journey semantics?
  4. [Partially answered] Seely’s sheaf formalism is linear. The checkpoint spacing computation confirms this aligns with codec prediction. The open question is nonlinear extension.
  5. [Answered: no, not naively] Can Krishnan’s flow machinery be lifted to tropical event graphs? The naive lift breaks. Modified cut notions needed.
  6. The temporal state-machine invariance constraint draws a formal line between stateless composition and stateful computation. Does this correspond to a known TVG distinction?
  7. [Answered: the duality breaks] The tropical semiring does not yield a meaningful flow-cut duality via naive tropicalization. The question is now: what cut notion works?

Sources