Category: The Essence of Composition

Bartosz Milewski · 2014 · Category Theory for Programmers, Ch. 1

A category is objects plus arrows (morphisms) plus composition. Composition is associative. Every object has an identity arrow. That's it. Every compositional structure you'll meet is a special case of this.

Composition — the one operation

If there's an arrow from A to B and an arrow from B to C, there must be an arrow from A to C: their composition. In Haskell it's the dot operator, g . f. In math it's a small circle, g ∘ f, read "g after f". Here we build it from scratch.

Scheme

; Composition: if f : A -> B and g : B -> C,
; then (compose g f) : A -> C
; Note: g is written first but applied second, after f.

(define (compose g f)
  (lambda (x) (g (f x))))

; Example arrows
(define (add1 x) (+ x 1))
(define (double x) (* x 2))

; Compose: first add1, then double
(define add1-then-double (compose double add1))

(display "add1-then-double(3) = ")
(display (add1-then-double 3)) (newline)
; 3 -> 4 -> 8

; Compose the other way
(define double-then-add1 (compose add1 double))

(display "double-then-add1(3) = ")
(display (double-then-add1 3))
; 3 -> 6 -> 7

Identity — the arrow that does nothing

Every object A has an identity morphism id_A that goes from A back to A. Composing any arrow with identity gives back the same arrow. It's the "do nothing" function, and it's required to exist for every object in the category.

Scheme

; Identity morphism: id(x) = x
; For every arrow f:
;   compose(f, id) = f
;   compose(id, f) = f

(define (id x) x)

(define (add1 x) (+ x 1))
(define (compose g f) (lambda (x) (g (f x))))

; Right identity: compose(add1, id) = add1   (id sits on the right)
(define right (compose add1 id))
(display "compose(add1, id)(5) = ")
(display (right 5)) (newline)

; Left identity: compose(id, add1) = add1   (id sits on the left)
(define left (compose id add1))
(display "compose(id, add1)(5) = ")
(display (left 5)) (newline)

; Both equal add1(5) = 6
(display "add1(5)              = ")
(display (add1 5))
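
The check above compares outputs at a single input, because two procedures cannot be tested for equal behaviour directly. A minimal Python sketch of the same idea, checking both identity laws pointwise over a small sample. The helper name `agree_on` and the sample range are illustrative assumptions, and agreement on a sample is evidence, not a proof.

```python
def compose(g, f):
    """Return g after f: the function x -> g(f(x))."""
    return lambda x: g(f(x))

def identity(x):
    """The identity arrow: returns its argument unchanged."""
    return x

def add1(x):
    return x + 1

def agree_on(f, g, samples):
    """True if f and g return equal results on every sample (hypothetical helper)."""
    return all(f(x) == g(x) for x in samples)

samples = range(-5, 6)

# add1 . id behaves like add1 on every sample
f_after_id_ok = agree_on(compose(add1, identity), add1, samples)
# id . add1 behaves like add1 on every sample
id_after_f_ok = agree_on(compose(identity, add1), add1, samples)

print(f_after_id_ok, id_after_f_ok)
```

Comparing results over a sample is the closest a program can get to "the same arrow" here; the law itself is a statement about all inputs.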

Associativity — parentheses don't matter

Given three arrows f, g, h that chain together, h . (g . f) = (h . g) . f. You can group any way you like. This is what makes long pipelines unambiguous: every way of parenthesizing a chain gives the same arrow.

Scheme

; Associativity: h . (g . f) = (h . g) . f
; Three arrows that chain: A -> B -> C -> D

(define (compose g f) (lambda (x) (g (f x))))

(define (add1 x) (+ x 1))
(define (double x) (* x 2))
(define (square x) (* x x))

; Group left: (square . double) . add1
(define left-grouped
  (compose (compose square double) add1))

; Group right: square . (double . add1)
(define right-grouped
  (compose square (compose double add1)))

(display "((square . double) . add1)(3) = ")
(display (left-grouped 3)) (newline)
; 3 -> 4 -> 8 -> 64

(display "(square . (double . add1))(3) = ")
(display (right-grouped 3)) (newline)
; 3 -> 4 -> 8 -> 64

(display "Equal? ")
(display (= (left-grouped 3) (right-grouped 3)))
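
Because grouping does not matter, a chain can be written as a flat list and folded. The Python sketch below assumes a hypothetical helper `compose_all` built on `functools.reduce`; it is one possible way to express this, not something taken from Milewski's text.

```python
from functools import reduce

def compose(g, f):
    """Return g after f: the function x -> g(f(x))."""
    return lambda x: g(f(x))

def identity(x):
    return x

def compose_all(*fs):
    """Compose right to left: compose_all(h, g, f)(x) == h(g(f(x))).
    With no arguments it returns identity, the neutral element of compose.
    (Hypothetical helper for illustration.)"""
    return reduce(compose, fs, identity)

def add1(x):
    return x + 1

def double(x):
    return x * 2

def square(x):
    return x * x

left_grouped = compose(compose(square, double), add1)
right_grouped = compose(square, compose(double, add1))
flat = compose_all(square, double, add1)

# Compare all three groupings on a small sample of inputs.
all_equal = all(
    left_grouped(x) == right_grouped(x) == flat(x) for x in range(-3, 4)
)

print(flat(3), all_equal)  # 64 True
```

The fold only makes sense because of the two laws: associativity lets the list stay flat, and identity gives the empty list a meaning.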

The category of types and functions

Milewski's running example: objects are types (Integer, Boolean, String), arrows are functions between types. Composition is function composition. Identity is the identity function. If you read types as sets of values, this is the category Set; Haskellers call their version Hask. Every program you write lives here.

Scheme

; Types as objects, functions as arrows
; The category of types and functions

(define (compose g f) (lambda (x) (g (f x))))
(define (id x) x)

; Arrows between "types"
(define (length-of s) (string-length s))   ; String -> Integer
(define (is-even? n) (= 0 (modulo n 2)))   ; Integer -> Boolean
(define (show b) (if b "yes" "no"))         ; Boolean -> String

; Compose a pipeline: String -> Integer -> Boolean -> String
(define pipeline
  (compose show (compose is-even? length-of)))

(display "pipeline(\"hello\") = ")
(display (pipeline "hello")) (newline)    ; 5 letters, odd -> "no"

(display "pipeline(\"hi\")    = ")
(display (pipeline "hi")) (newline)       ; 2 letters, even -> "yes"

(display "pipeline(\"test\")  = ")
(display (pipeline "test"))               ; 4 letters, even -> "yes"
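
In the Scheme version the types live only in comments. A Python sketch with type hints makes the "arrows must line up" condition explicit; the `TypeVar` names A, B, C and the snake_case function names are assumptions chosen to mirror the example above.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """g after f; the annotations require f's result type to match g's argument type."""
    def composed(x: A) -> C:
        return g(f(x))
    return composed

def length_of(s: str) -> int:
    return len(s)

def is_even(n: int) -> bool:
    return n % 2 == 0

def show(b: bool) -> str:
    return "yes" if b else "no"

# str -> int -> bool -> str
pipeline = compose(show, compose(is_even, length_of))

print(pipeline("hello"))  # 5 characters, odd
print(pipeline("hi"))     # 2 characters, even

# compose(length_of, is_even) would feed a bool into a function expecting str.
# A static checker such as mypy is expected to reject it; plain Python
# only fails when the composed function is actually called.
```

The hints are not enforced at runtime, so this is documentation plus optional static checking rather than a guarantee.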

Why composition matters

Milewski's argument: human working memory holds only a handful of chunks at a time. We manage complexity by decomposing problems into pieces and recomposing solutions. Composition is not an optional design pattern. It's how finite minds handle complexity that exceeds that budget. Hence the surface-area rule: an interface (the surface) should grow more slowly than the implementation it hides (the volume), so good interfaces expose less than they hide. This chunking constraint is the starting point for the natural framework, which models information processing as a six-stage pipeline where each stage composes into the next.

Scheme

; The chunking argument: decompose, solve, recompose.
; A complex function built from simple composed pieces.

(define (compose g f) (lambda (x) (g (f x))))

; Simple pieces
(define (double x) (* x 2))
(define (add1 x) (+ x 1))
(define (square x) (* x x))

; Compose into pipelines
(define double-then-add1 (compose add1 double))
(define add1-then-square (compose square add1))

(display "double-then-add1(3) = ")
(display (double-then-add1 3)) (newline)  ; (3*2)+1 = 7

(display "add1-then-square(3) = ")
(display (add1-then-square 3)) (newline)  ; (3+1)^2 = 16

; Deeper pipeline: three functions composed
(define pipeline (compose square (compose add1 double)))
(display "square(add1(double(3))) = ")
(display (pipeline 3))  ; square(add1(6)) = square(7) = 49
; Each piece is one chunk. The pipeline is one chunk.
; Total cognitive load: 4 chunks, not n^2 interactions.

Notation reference

Haskell	Scheme	Python	Meaning
g . f	(compose g f)	compose(g, f)	Composition: apply f then g
id	(lambda (x) x)	lambda x: x	Identity morphism
f :: A -> B	(define (f a) ...)	def f(a: A) -> B	Arrow from A to B
h.(g.f) = (h.g).f	verified above	verified above	Associativity law
f . id = f = id . f	verified above	verified above	Identity laws

Neighbors

Other paper pages

🍞 Spivak 2013 — categories applied to databases and data migration
🍞 Leinster 2021 — entropy as a functor, another category-theoretic lens
🍞 Staton 2025 — composition of programs follows the same axioms

Related foundations

🔗 Judson Ch.4 Isomorphisms — group isomorphisms as the algebraic instance of categorical isomorphisms

Translation notes

All examples use total functions over simple types. The original post uses Haskell, where the category Hask includes bottom (non-termination) and laziness, which our Scheme and Python translations ignore. Milewski also poses six challenges (e.g., implementing identity and composition in your favorite language, deciding when a directed graph is a category). Those are omitted here but worth doing. The surface-area-vs-volume argument is informal motivation, not a theorem.

Ready for the real thing? Read Milewski's post. The challenges at the end are particularly good.

by june.kim