A Synthetic Approach to Markov Kernels
Tobias Fritz · 2020 · arXiv:1908.07021
Stochastic functions form a category with copy and discard. Copying detects whether a function is deterministic or stochastic.
What's a Markov category
A Markov category (nLab) is three things:
- Objects are types: finite sets, state spaces
- Morphisms are stochastic functions: input → distribution over outputs
- Structure: you can copy data and discard data
That's it. In Python, a morphism is a function that returns a dict: keys are possible outputs, values are probabilities. The dict IS the stochastic matrix.
```python
# A morphism in FinStoch — it's just a dict
weather_to_temp = {
    "sunny":  {"hot": 0.8, "cold": 0.2},
    "cloudy": {"hot": 0.3, "cold": 0.7},
}
# For each input (weather), a distribution over outputs (temperature).
# That's a stochastic matrix. That's a morphism. That's FinStoch.
```
Copy detects determinism
Every object in a Markov category has a copy morphism: ν(x) = (x, x). Fritz's insight: a morphism f is deterministic if and only if it commutes with copy. "Apply then copy" equals "copy then apply to both copies."
Stochastic functions fail this test. If you flip a coin and then copy the result, you get (heads, heads) or (tails, tails). If you copy first and then flip each independently, you can get (heads, tails). Different distributions. The function is stochastic.
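The coin-flip test above can be run directly in the dict encoding. A minimal sketch, with helper names of my own choosing (not from the paper):

```python
def copy_then_apply(f):
    """Copy the input, then apply f independently to each copy."""
    def h(x):
        return {(a, b): pa * pb
                for a, pa in f(x).items()
                for b, pb in f(x).items()}
    return h

def apply_then_copy(f):
    """Apply f once, then copy the result: every outcome is a pair (y, y)."""
    def h(x):
        return {(y, y): p for y, p in f(x).items()}
    return h

def is_deterministic(f, inputs):
    """f commutes with copy iff the two composites agree on every input."""
    return all(copy_then_apply(f)(x) == apply_then_copy(f)(x) for x in inputs)

coin = lambda _: {"heads": 0.5, "tails": 0.5}   # stochastic
double = lambda x: {x * 2: 1.0}                  # deterministic

print(is_deterministic(coin, [0]))       # False — (heads, tails) appears after copying first
print(is_deterministic(double, [1, 2]))  # True
```

The probabilities here (0.5 × 0.5 = 0.25) are exact in floating point, so dict equality suffices; a robust version would compare with a tolerance.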
Kleisli composition
Morphisms in FinStoch compose via Kleisli composition: feed the output distribution of f into g, marginalizing over the intermediate value. This is `>>=` (bind) from monads, specialized to the distribution monad.
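In the dict encoding, marginalizing over the intermediate is a double loop. A sketch, using the `kleisli` name from the notation table below:

```python
def kleisli(f, g):
    """Compose stochastic functions: P(z|x) = sum over y of P(y|x) * P(z|y)."""
    def h(x):
        out = {}
        for y, py in f(x).items():
            for z, pz in g(y).items():
                out[z] = out.get(z, 0.0) + py * pz
        return out
    return h

weather = lambda _: {"sunny": 0.6, "cloudy": 0.4}
temp = lambda w: {"sunny":  {"hot": 0.8, "cold": 0.2},
                  "cloudy": {"hot": 0.3, "cold": 0.7}}[w]

forecast = kleisli(weather, temp)
d = forecast(None)
# hot ≈ 0.6 (= 0.6*0.8 + 0.4*0.3), cold ≈ 0.4
```

This is matrix multiplication of the two stochastic matrices; the floating-point sums are only approximately 0.6 and 0.4, so comparisons should use a tolerance.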
Support — which outputs are possible
The support of a distribution is the set of outcomes with nonzero probability. Fritz uses support to define possibilistic reasoning — forget the probabilities, just ask "can this happen?"
This connects to 🍞 Fritz, Perrone, Rezagholi 2021, where support becomes a monad morphism.
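In the dict encoding, support is a one-line comprehension (matching the `supp` row in the notation table):

```python
def supp(dist):
    """Possibilistic view: keep only the outcomes with nonzero probability."""
    return {k for k, v in dist.items() if v > 0}

print(sorted(supp({"hot": 0.6, "cold": 0.4, "freezing": 0.0})))  # ['cold', 'hot']
```

Forgetting the probabilities this way turns a stochastic matrix into a relation: the question changes from "how likely?" to "possible at all?".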
Informativeness preorder
One morphism is "more informative" than another when it factors through: f ≤ g if you can recover g's output from f's output by post-processing. This is the categorical version of sufficient statistics.
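A simplified pointwise illustration of factoring through (the paper's definition uses almost-sure factorization, and the example names here are my own): a fine-grained thermometer reading is at least as informative as a hot/cold report, because the report is a deterministic post-processing of the reading.

```python
def kleisli(f, g):
    """Compose stochastic functions: P(z|x) = sum over y of P(y|x) * P(z|y)."""
    def h(x):
        out = {}
        for y, py in f(x).items():
            for z, pz in g(y).items():
                out[z] = out.get(z, 0.0) + py * pz
        return out
    return h

f = lambda _: {15: 0.5, 30: 0.5}                          # fine: exact degrees
g = lambda _: {"hot": 0.5, "cold": 0.5}                   # coarse: hot/cold
binning = lambda t: {("hot" if t >= 20 else "cold"): 1.0}  # deterministic post-processing

# g factors through f as g = binning after f, so f <= g in this page's convention
composed = kleisli(f, binning)
print(composed(None) == g(None))  # True
```

The 0.5 × 1.0 products are exact, so plain dict equality works here; in general the witness `binning` need not be deterministic.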
Notation reference
| Paper | Python | Meaning |
|---|---|---|
| FinStoch | # dicts as distributions | Category of finite stochastic matrices |
| f >=> g | kleisli(f, g) | Kleisli composition (marginalize intermediate) |
| ν | lambda x: (x, x) | Copy morphism |
| ε | lambda x: None | Discard morphism |
| C_det | # functions where copy commutes | Deterministic subcategory |
| supp(f) | {k for k,v in d.items() if v>0} | Support — nonzero outcomes |
| f ≤ g | # g factors through f | Informativeness preorder |
Neighbors
Other paper pages
- 🍞 Staton 2025 — Hoare logic works in FinStoch because it's an imperative category
- 🍞 Baez, Fritz, Leinster 2011 — entropy is the unique information loss measure in FinStoch
- 🍞 Fritz, Perrone, Rezagholi 2021 — support as a monad morphism
Foundations (Wikipedia)
Translation notes
All examples use finite dicts as distributions over small sets. Fritz's paper works over arbitrary measurable spaces, Borel categories, and continuous distributions. For example: the copy-naturality test on this page checks a function over a handful of integers. In the paper, the same test applies to a Gaussian process over ℝ: infinite-dimensional, continuous, requiring measure theory to even state. The structure (does copying commute?) is identical. The generality is not.
The informativeness preorder example is a simplified illustration. The full definition involves almost-sure factorization, not pointwise equality. Every example is simplified unless marked otherwise.
Read the paper. Start at §10 (p.48) for the deterministic/stochastic boundary, §16 (p.72) for the informativeness preorder.
Framework connection: FinStoch is the ambient category for the Natural Framework pipeline; the copy-naturality test distinguishes deterministic stages from stochastic ones. (Ambient Category, The Natural Framework)