Caches All the Way Down

Part of the cognition series.

In software, we say “everything’s a wrapper.” An ORM wraps SQL, which wraps disk I/O, which wraps silicon. Each layer exposes the same four verbs (create, read, update, delete) and delegates to the layer below. Wrappers all the way down.
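The claim fits in a few lines. A minimal sketch with hypothetical class names: every layer exposes the same four verbs and does nothing but delegate to the layer below.

```python
# Hypothetical sketch: each layer exposes the same four verbs (CRUD)
# and delegates to the layer below it.
class Store:
    """Bottom layer: the actual persistent state."""
    def __init__(self):
        self.data = {}
    def create(self, key, value):
        self.data[key] = value
    def read(self, key):
        return self.data.get(key)
    def update(self, key, value):
        self.data[key] = value
    def delete(self, key):
        self.data.pop(key, None)

class Wrapper:
    """Any layer above: same interface, pure delegation."""
    def __init__(self, below):
        self.below = below
    def create(self, key, value):
        self.below.create(key, value)
    def read(self, key):
        return self.below.read(key)
    def update(self, key, value):
        self.below.update(key, value)
    def delete(self, key):
        self.below.delete(key)

# "Wrappers all the way down": a stack of delegations over one store.
stack = Wrapper(Wrapper(Store()))
stack.create("x", 1)
print(stack.read("x"))  # 1
```

Seen through this interface alone, every layer really is interchangeable, which is where the slogan comes from.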

But CRUD is only the Remember interface. Store, retrieve, update, delete: that’s the API to the persistent store. When we say “wrapper” we’re seeing one role out of six and calling it the whole thing.

The rest of the pipeline

SQL has WHERE (Filter) and ORDER BY (Attend). The ORM above it has scopes (Filter) and eager loading (Attend). The API above that has authorization (Filter) and pagination (Attend). The frontend above that has conditional rendering (Filter) and sort/highlight (Attend).

Every layer re-implements the full pipeline over the data from the layer below. We say “just a wrapper” because Remember looks identical at every level. The other roles look different, so we don’t notice the repetition.
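The repetition is easy to make concrete. A sketch using Python's stdlib sqlite3 and a hypothetical posts table: the SQL layer and the application layer each run their own Filter and Attend over the same data, and arrive at the same answer.

```python
import sqlite3

# Hypothetical posts table: the same Filter -> Attend pair at two layers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (title TEXT, published INTEGER, score INTEGER)")
db.executemany("INSERT INTO posts VALUES (?, ?, ?)", [
    ("a", 1, 5), ("b", 0, 9), ("c", 1, 7),
])

# Layer 1 -- SQL: WHERE is Filter, ORDER BY is Attend.
sql_top = db.execute(
    "SELECT title FROM posts WHERE published = 1 ORDER BY score DESC"
).fetchall()

# Layer 2 -- application code re-implements the same pipeline in memory.
rows = db.execute("SELECT title, published, score FROM posts").fetchall()
visible = [r for r in rows if r[1] == 1]           # Filter
app_top = sorted(visible, key=lambda r: -r[2])     # Attend

print([t for (t,) in sql_top])        # ['c', 'a']
print([t for (t, _, _) in app_top])   # ['c', 'a']
```

Only the Remember verbs (the SELECTs) look like delegation; the Filter and Attend logic is written twice, once per layer.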

The computing stack

Digital computing is the clearest case because we built it from the floor up. The entire hardware-software stack is the Cache tower made visible, each level adding capacity until the full pipeline emerges.

| Level | Capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Transistor | 1 bit | Voltage | | Threshold gate | | | |
| Logic gate | few bits | Input lines | Transistors | Boolean function | | | Output line |
| ALU | word | Operands, opcode | Logic gates, registers | Overflow, flags | Opcode selects operation | | Result register |
| CPU | KB (L1) | Fetch instruction | ALUs, pipeline stages | Branch prediction | Scheduling, out-of-order | Branch predictor learns | Register file, L1 cache |
| OS | GB (RAM) | Interrupts, I/O | CPUs, memory hierarchy | Cache eviction | Scheduler dispatch | Defrag, compaction | Filesystem, swap |
| Database | TB (disk) | Query arrives | OS filesystem, B-trees | WHERE clause | ORDER BY, LIMIT | VACUUM, reindex | The table on disk |
| Backend | app memory | Request arrives | Database, ORM | Auth, validations | Pagination, sorting | Schema migrations | Database write |
| Frontend | viewport | User event | Backend responses, DOM | Conditional rendering | Sort, highlight, focus | User preferences | localStorage, DOM state |
| Container | image layers | Build context | Applications, runtime | .dockerignore, multi-stage discard | Layer ordering for cache hits | Image optimization | Image in registry |
| Kubernetes | cluster | Desired state, metrics | Containers, etcd | Admission controllers, resource quotas | Scheduler: affinity, constraints | Operator reconciliation loops | Cluster state |
| Autoscaler | fleet | CPU, memory, request rate | Kubernetes clusters | Cooldown periods, min/max bounds | Scaling policy: which pool, how much | Policy tuning from history | The running fleet |

The transistor row: one bit, pure threshold gating, no Attend, no Consolidate. The bool store. The autoscaler row: full pipeline across a fleet. Each row between them added capacity, and each time it crossed a threshold, another role filled in.

Nobody designed it this way. Engineers at each level solved their local problem (“hold more items, select among them, rank the survivors”) and arrived at the same pipeline independently. Add storage, and Filter and Attend follow.
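The convergence can be sketched in a few lines, assuming nothing beyond the stdlib: a one-slot store needs only a threshold check, but the moment a bounded store holds more than one item, it needs a ranking over its contents (Attend) and an eviction rule (Filter).

```python
from collections import OrderedDict

class BoolStore:
    """Capacity 1: pure threshold gating. Nothing to rank, nothing to learn."""
    def __init__(self, threshold):
        self.threshold, self.bit = threshold, 0
    def put(self, value):
        self.bit = 1 if value >= self.threshold else 0   # Filter only

class LRUCache:
    """Capacity > 1: an eviction order is an Attend policy over the contents."""
    def __init__(self, capacity):
        self.capacity, self.items = capacity, OrderedDict()
    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)                      # Attend: recency rank
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)               # Filter: evict loser
```

With capacity 2, inserting a, b, c evicts a. The eviction policy exists only because capacity crossed one: add storage, and Filter and Attend follow.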

The biological stack

The same tower in a person’s energy storage. Each level caches the level below.

| Level | Capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| ATP | 1 bond | Substrate arrives | | Enzyme lock-and-key | | | |
| Mitochondrion | many ATP molecules | Pyruvate, O₂ | ATP molecules | Membrane potential threshold | Uncoupling proteins | | ATP output rate |
| Cell | glycogen granules | Glucose, insulin signal | Mitochondria | Metabolic gating (hexokinase) | Energy allocation across processes | Gene expression | Glycogen, protein |
| Liver | ~100 g glycogen | Blood glucose, hormones | Cells, hepatocytes | Glucokinase threshold | Glycogenesis vs gluconeogenesis | Metabolic adaptation | Blood glucose level |
| Adipose / muscle | kg of fat, kg of protein | Insulin, excess energy | Liver, circulating glucose | Lipogenesis threshold | Which depots to mobilize | Set point adjustment | Fat mass |
| Mammal | total reserves | Hunger, satiety signals | Adipose, muscle | Ghrelin, leptin, appetite regulation | Meal choice, macronutrient balance | Metabolic adaptation, microbiome | Body composition |

ATP: one phosphate bond, pure enzyme gating. The bool store. The mammal row: full pipeline with hunger, choice, and metabolic adaptation. Same tower. Capacity grows, roles fill in. Evolution built each level because the one below couldn’t manage energy at the scale above.

Two substrates. Same staircase. Each level’s Cache is the level below, and each added enough capacity for another role to fill in. The shape repeats because the constraint forces it. The constraint also forces it to stop.

The tower has a floor

By induction on storage capacity.

Base case. A Cache with one bit of storage is a boolean. Pass or reject. Selection requires at least two items; one slot has nothing to compare. The only operation is threshold gating: a single if. No Attend (nothing to rank), no Consolidate (nothing to learn). The pipeline collapses to Filter alone.

Inductive step. A Cache at depth d with capacity S may contain a sub-pipeline whose sub-Cache at depth d+1 has capacity S′. Boundary 1 applies: the sub-Cache must fit inside the parent, so S′ < S by pigeonhole. Strictly decreasing.

Termination. Capacity is a natural number. A strictly decreasing sequence of naturals terminates. It reaches 1 bit. The tower has finite depth.
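The termination step can be run as code. A toy bound, not part of the original argument: the slowest possible descent loses one bit per level, so depth is bounded by the top-level capacity.

```python
# Toy version of the termination step: capacities are naturals and strictly
# decrease going down, so the descent reaches 1 bit in finitely many levels.
def max_depth(capacity_bits):
    depth = 1
    while capacity_bits > 1:
        capacity_bits -= 1  # weakest strict decrease: lose one bit per level
        depth += 1
    return depth            # floor reached: the 1-bit bool store

print(max_depth(8))  # 8: depth is bounded by the top capacity
```

Real towers are far shallower, because each level's capacity is usually orders of magnitude below its parent's.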

How deep is the universe’s cache? As deep as physics allows — down to whatever distinction is smallest. A qubit. A Planck bit. The bool store at the bottom of everything.

Three nested pipelines — CPU, ALU, Logic Gate — each Cache zooms into the full pipeline of the level below. At the bottom, transistors reduce to a bool store: 0 or 1. The tower terminates.

The Handshake proves the analogous result for Consolidate: induction on bit budget, with the data processing inequality as the decreasing measure, terminating at passthrough. Cache’s tower uses storage capacity instead, terminating at the bool store. Same structure, different measures. Consolidate is about compression. Cache is about capacity.

Bool stores in the wild

At the floor of every Cache tower, you should find a bool store doing threshold gating. And you do.

Ion channels: open or closed. One bit. Voltage threshold gates molecules through. No ranking, no learning. Pure Filter.

Transistors: on or off. Voltage above threshold passes the signal. Below, it blocks.

MHC binding: fits or doesn’t. Antigen presentation at the molecular level is a shape match — binding affinity is graded, but the groove either holds the peptide or releases it. Ranking among candidates happens one level up, where limited surface slots force selection among the fragments that passed.

Each is a Cache collapsed to a boolean. The prediction: below a bool store, no further self-similarity. You can’t have a sub-pipeline inside an if statement. If you found something smaller than a bool still doing selection, the argument would be falsified. But a bool is the minimum unit of distinction.

The AI stack

The same tower for AI. Read the dim cells.

| Level | Capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Weight | 1 float | Gradient | | Learning rate threshold | | | |
| Neuron | ~hundreds of weights | Input vector | Weights | ReLU, activation | | Backprop (offline, sealed) | Activation output |
| Attention head | ~millions of params | Query, key, value | Neurons | Softmax masking | Attention scores | Training (sealed) | Weighted value |
| Block | attention heads | Residual input | Attention heads | Layer norm | Multi-head selection | Training (sealed) | Block output |
| Model | billions of params | Token sequence | Blocks, KV cache | No input gating | No diversity enforcement | Training (sealed) | Next token |
| Context window | ~128K tokens | User prompt, tool results | Model | Minimal redundancy inhibition | Recency bias, no DPP | Ephemeral; dies with the session | Response |
| Agent | context + tools | Task, codebase | Context windows | File selection heuristics | Context window selection | Skill creation, memory files | Completed task |
| Swarm | fleet of agents | Workload | Agents | Task routing | Load balancing | No shared learning | No collective memory |

The forward pass is well-optimized at the bottom: softmax is genuine Filter, attention scores genuine Attend. But Consolidate is dim at every level. Training is sealed, so the model learns nothing from its conversation and the context window dies with the session. The agent’s memory files are a bandage, not a schema.

Above the model level, almost everything is dim. The context window has minimal redundancy inhibition; every token gets in until the window is full. The agent selects files by heuristic, not competition. The swarm has no shared memory, no collective consolidation.
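The dim Filter cell at the context-window level, as a sketch with hypothetical function names: the naive window admits every chunk until the budget is gone, and even a crude redundancy check changes what survives.

```python
# Sketch of the dim cell: a context window with no Filter beyond capacity,
# versus one with a crude (hypothetical) redundancy check.
def naive_window(chunks, budget):
    window = []
    for c in chunks:
        if len(window) < budget:      # only gate: is there room?
            window.append(c)
    return window

def filtered_window(chunks, budget):
    window = []
    for c in chunks:
        if c in window:               # redundancy inhibition: skip duplicates
            continue
        if len(window) < budget:
            window.append(c)
    return window

print(naive_window(["a", "a", "b", "c"], 3))     # ['a', 'a', 'b']
print(filtered_window(["a", "a", "b", "c"], 3))  # ['a', 'b', 'c']
```

The naive version spends a third of its budget on a duplicate and drops "c" entirely; that is the cost of a missing role.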

The computing stack filled in its dim cells over sixty years. The biology stack filled them in over four billion. The AI stack is a few years old and it shows. The dim cells are the roadmap.

The diagnostic

Active Consolidate within a Cache means there’s at least one more level below. Passthrough means you’ve hit the floor. The query optimizer learns from execution statistics, so it contains another pipeline. Ion channels don’t learn. The gate is the gate.
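The diagnostic reduces to one question: does the gate update itself from outcomes? A sketch under assumed names, with the two poles from the text:

```python
# Sketch of the diagnostic. A floor-level gate is pure passthrough; a gate
# with active Consolidate tunes its own policy, implying a pipeline inside.
class IonChannel:
    """Floor: threshold gating, never changes. The gate is the gate."""
    CONSOLIDATES = False
    def __init__(self, threshold):
        self.threshold = threshold
    def admit(self, voltage):
        return voltage >= self.threshold

class QueryOptimizer:
    """Above the floor: folds execution statistics back into its policy."""
    CONSOLIDATES = True
    def __init__(self, cost_cutoff):
        self.cost_cutoff = cost_cutoff
    def admit(self, plan_cost):
        return plan_cost <= self.cost_cutoff
    def record(self, actual_cost):
        # Consolidate: nudge the cutoff toward observed reality
        self.cost_cutoff = 0.9 * self.cost_cutoff + 0.1 * actual_cost

def has_level_below(cache):
    return cache.CONSOLIDATES   # active Consolidate => more pipeline inside

print(has_level_below(IonChannel(-55)))      # False
print(has_level_below(QueryOptimizer(100)))  # True
```

The class names are illustrative, not an API; the point is the predicate, not the objects it runs on.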

Every thin wrapper that's genuinely CRUD passthrough either stays thin (it was at the floor, with nothing to learn) or grows Filter and Attend logic (it was above the floor, and usage pressure forced the missing roles in). Every ORM starts thin. The ones above the floor never stay that way.

It’s not wrappers all the way down. It’s pipelines — until you hit the bool.


Written via the double loop. More at pageleft.cc.