Caches All the Way Down

Part of the cognition series.

In software, we say “everything’s a wrapper.” An ORM wraps SQL, which wraps disk I/O, which wraps silicon. Each layer exposes the same four verbs (create, read, update, delete) and delegates to the layer below. Wrappers all the way down.

But CRUD is only the Remember interface. Store, retrieve, update, delete: that’s the API to the persistent store. When we say “wrapper” we’re seeing one role out of six and calling it the whole thing.

The rest of the pipeline

SQL has WHERE (Filter) and ORDER BY (Attend). The ORM above it has scopes (Filter) and eager loading (Attend). The API above that has authorization (Filter) and pagination (Attend). The frontend above that has conditional rendering (Filter) and sort/highlight (Attend).

Every layer re-implements the full pipeline over the data from the layer below. We say “just a wrapper” because Remember looks identical at every level. The other roles look different, so we don’t notice the repetition.
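
The repetition is easy to miss in prose and hard to miss in code. A minimal sketch (the data and function names are hypothetical) of the same two roles re-implemented at two adjacent layers:

```python
# Two layers, same two roles. The "database" layer filters (WHERE) and
# attends (ORDER BY, LIMIT); the "frontend" layer re-implements both
# over the rows the layer below handed up.

rows = [
    {"id": 1, "title": "alpha", "views": 90, "published": True},
    {"id": 2, "title": "beta",  "views": 40, "published": False},
    {"id": 3, "title": "gamma", "views": 70, "published": True},
]

def db_layer(rows, limit=10):
    kept = [r for r in rows if r["published"]]         # Filter: WHERE published
    kept.sort(key=lambda r: r["views"], reverse=True)  # Attend: ORDER BY views DESC
    return kept[:limit]                                # Attend: LIMIT

def frontend_layer(rows, query):
    kept = [r for r in rows if query in r["title"]]    # Filter: conditional rendering
    return sorted(kept, key=lambda r: r["title"])      # Attend: client-side sort

print(frontend_layer(db_layer(rows), "a"))
```

The Remember calls (the INSERT below, the localStorage write above) would look nearly identical, which is exactly why the layers read as "just wrappers."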

The computing stack

Digital computing is the clearest case because we built it from the floor up. The entire hardware-software stack is the Cache tower made visible, each level adding capacity until the full pipeline emerges.

| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Transistor | 1 bit | Voltage | | Threshold gate | | | |
| Logic gate | few bits | Input lines | Transistors | Boolean function | | | Output line |
| ALU | word | Operands, opcode | Logic gates, registers | Overflow, flags | Opcode selects operation | | Result register |
| CPU | KB (L1) | Fetch instruction | ALUs, pipeline stages | Branch prediction | Scheduling, out-of-order | Branch predictor learns | Register file, L1 cache |
| OS | GB (RAM) | Interrupts, I/O | CPUs, memory hierarchy | Cache eviction | Scheduler dispatch | Defrag, compaction | Filesystem, swap |
| Database | TB (disk) | Query arrives | OS filesystem, B-trees | WHERE clause | ORDER BY, LIMIT | VACUUM, reindex | The table on disk |
| Backend | app memory | Request arrives | Database, ORM | Auth, validations | Pagination, sorting | Schema migrations | Database write |
| Frontend | viewport | User event | Backend responses, DOM | Conditional rendering | Sort, highlight, focus | User preferences | localStorage, DOM state |
| Container | image layers | Build context | Applications, runtime | .dockerignore, multi-stage discard | Layer ordering for cache hits | Image optimization | Image in registry |
| Kubernetes | cluster | Desired state, metrics | Containers, etcd | Admission controllers, resource quotas | Scheduler: affinity, constraints | Operator reconciliation loops | Cluster state |
| Autoscaler | fleet | CPU, memory, request rate | Kubernetes clusters | Cooldown periods, min/max bounds | Scaling policy: which pool, how much | Policy tuning from history | The running fleet |

The transistor row: one bit, pure threshold gating, no Attend, no Consolidate. The bool store. The autoscaler row: full pipeline across a fleet. Each row between them added capacity, and each time it crossed a threshold, another role filled in.

Nobody designed it this way. Engineers at each level solved their local problem (“hold more items, select among them, rank the survivors”) and arrived at the same pipeline independently. Add storage, and Filter and Attend follow.

The biological stack

The same tower in a person’s energy storage. Each level caches the level below.

| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| ATP | 1 bond | Substrate arrives | | Enzyme lock-and-key | | | |
| Mitochondrion | many ATP molecules | Pyruvate, O₂ | ATP molecules | Membrane potential threshold | Uncoupling proteins | | ATP output rate |
| Cell | glycogen granules | Glucose, insulin signal | Mitochondria | Metabolic gating (hexokinase) | Energy allocation across processes | Gene expression | Glycogen, protein |
| Liver | ~100 g glycogen | Blood glucose, hormones | Cells, hepatocytes | Glucokinase threshold | Glycogenesis vs gluconeogenesis | Metabolic adaptation | Blood glucose level |
| Adipose / muscle | kg of fat, kg of protein | Insulin, excess energy | Liver, circulating glucose | Lipogenesis threshold | Which depots to mobilize | Set point adjustment | Fat mass |
| Mammal | total reserves | Hunger, satiety signals | Adipose, muscle | Ghrelin, leptin, appetite regulation | Meal choice, macronutrient balance | Metabolic adaptation, microbiome | Body composition |

ATP: one phosphate bond, pure enzyme gating. The bool store. The mammal row: full pipeline with hunger, choice, and metabolic adaptation. Same tower. Capacity grows, roles fill in. Evolution built each level because the one below couldn’t manage energy at the scale above.

Two substrates. Same staircase. Each level’s Cache is the level below, and each added enough capacity for another role to fill in. The shape repeats because the constraint forces it. The constraint also forces it to stop.

The tower has a floor

By induction on storage capacity.

Base case. A Cache with one bit of storage is a boolean. Pass or reject. Selection requires at least two items; one slot has nothing to compare. The only operation is threshold gating: a single if. No Attend (nothing to rank), no Consolidate (nothing to learn). The pipeline collapses to Filter alone.

Inductive step. A Cache at depth d with capacity S may contain a sub-pipeline whose sub-Cache at depth d+1 has capacity S′. Boundary 1 applies: the sub-Cache must fit inside the parent alongside the rest of the sub-pipeline, so S′ < S. Strictly decreasing.

Termination. Capacity is a natural number. A strictly decreasing sequence of naturals terminates. It reaches 1 bit. The tower has finite depth.
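
The three steps compress into a loop. A sketch, where halving stands in for any strictly decreasing capacity schedule:

```python
# Walking down the tower: each sub-Cache must fit strictly inside its
# parent, so capacities (in bits) strictly decrease and must reach 1.

def tower_depth(capacity_bits: int) -> int:
    """Count levels until the bool store, halving capacity at each
    step as one illustrative strictly decreasing schedule."""
    depth = 0
    while capacity_bits > 1:
        depth += 1
        next_cap = capacity_bits // 2    # sub-Cache fits inside parent: S' < S
        assert next_cap < capacity_bits  # strictly decreasing, so the loop ends
        capacity_bits = next_cap
    return depth                         # capacity_bits == 1: the bool store

print(tower_depth(8 * 1024))  # a KB-sized cache bottoms out after 13 levels
```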

How deep is the universe’s cache? As deep as physics allows — down to whatever distinction is smallest. A qubit. A Planck bit. The bool store at the bottom of everything.

[Figure: three nested pipelines (CPU, ALU, logic gate). Each Cache zooms into the full pipeline of the level below. At the bottom, transistors reduce to a bool store: 0 or 1. The tower terminates.]

The Handshake proves the analogous result for Consolidate: induction on bit budget, with the data processing inequality as the decreasing measure, terminating at passthrough. Cache’s tower uses storage capacity instead, terminating at the bool store. Same structure, different measures. Consolidate is about compression. Cache is about capacity.

Bool stores in the wild

At the floor of every Cache tower, you should find a bool store doing threshold gating. And you do.

Ion channels: open or closed. One bit. Voltage threshold gates molecules through. No ranking, no learning. Pure Filter.

Transistors: on or off. Voltage above threshold passes the signal. Below, it blocks.

MHC binding: fits or doesn’t. Antigen presentation at the molecular level is a shape match — binding affinity is graded, but the groove either holds the peptide or releases it. Ranking among candidates happens one level up, where limited surface slots force selection among the fragments that passed.
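
All three gates are the same one-line function. A sketch, with hypothetical threshold values:

```python
# A bool store is a Cache collapsed to one bit: threshold gating,
# nothing to rank, nothing to learn.

def transistor(gate_voltage: float, threshold: float = 0.7) -> bool:
    return gate_voltage > threshold   # on/off: pure Filter

def ion_channel(membrane_mv: float, threshold: float = -55.0) -> bool:
    return membrane_mv > threshold    # open/closed: pure Filter

# No Attend (there is no second item to compare against) and no
# Consolidate (no state survives the call).
```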

Each is a Cache collapsed to a boolean. The prediction: below a bool store, no further self-similarity. You can’t have a sub-pipeline inside an if statement. If you found something smaller than a bool still doing selection, the argument would be falsified. But a bool is the minimum unit of distinction.

The AI stack

The same tower for AI. Read the dim cells: the "No" and "sealed" entries, where a role is absent or frozen.

| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Weight | 1 float | Gradient | | Learning rate threshold | | | |
| Neuron | ~hundreds of weights | Input vector | Weights | ReLU activation | | Backprop (offline, sealed) | Activation output |
| Attention head | ~millions of params | Query, key, value | Neurons | Softmax masking | Attention scores | Training (sealed) | Weighted value |
| Block | attention heads | Residual input | Attention heads | Layer norm | Multi-head selection | Training (sealed) | Block output |
| Model | billions of params | Token sequence | Blocks, KV cache | No input gating | No diversity enforcement | Training (sealed) | Next token |
| Context window | ~128K tokens | User prompt, tool results | Model | Minimal redundancy inhibition | Recency bias, no DPP | Ephemeral — dies with the session | Response |
| Agent | context + tools | Task, codebase | Context windows | File selection heuristics | Context window selection | Skill creation, memory files | Completed task |
| Swarm | fleet of agents | Workload | Agents | Task routing | Load balancing | No shared learning | No collective memory |

The forward pass is well-optimized at the bottom: softmax is genuine Filter, attention scores genuine Attend. But Consolidate is dim at every level. Training is sealed, so the model learns nothing from its conversation and the context window dies with the session. The agent’s memory files are a bandage, not a schema.

Above the model level, almost everything is dim. The context window has minimal redundancy inhibition; every token gets in until the window is full. The agent selects files by heuristic, not competition. The swarm has no shared memory, no collective consolidation.
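
What filling in one dim cell might look like: a hypothetical redundancy-inhibition Filter for a context window (the names and the word-overlap similarity measure are invented for illustration), where a chunk competes for entry instead of being appended unconditionally:

```python
# Hypothetical redundancy inhibition for a context window: a new chunk
# must differ enough from what is already cached, or it is rejected.

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, a crude stand-in for a real measure."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def admit(window: list[str], chunk: str, max_overlap: float = 0.8) -> bool:
    """Filter: reject chunks that mostly duplicate an existing entry."""
    if any(jaccard(chunk, held) > max_overlap for held in window):
        return False
    window.append(chunk)
    return True

window: list[str] = []
admit(window, "open the config file and read the port")
admit(window, "open the config file and read the port number")  # near-duplicate: rejected
admit(window, "the server listens on that port")
```

Today's stacks do the `append` without the `if`; every token gets in until the window is full.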

The computing stack filled in its dim cells over sixty years. The biology stack filled them in over four billion. The AI stack is a few years old and it shows. The dim cells are the roadmap.

The diagnostic

Active Consolidate within a Cache means there’s at least one more level below. Passthrough means you’ve hit the floor. The query optimizer learns from execution statistics, so it contains another pipeline. Ion channels don’t learn. The gate is the gate.

Every thin wrapper that’s genuinely CRUD passthrough either stays thin (it was at the floor, with nothing to learn) or grows filter and attend logic (it was above the floor, and usage pressure forced the missing roles in). Every ORM starts thin. The ones above the floor never stay that way.
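
The diagnostic as a sketch (both classes and their interface are hypothetical): a layer whose Consolidate is passthrough sits at the floor; one that updates state from outcomes contains at least one more level.

```python
# Diagnostic sketch: does this Cache learn? A learning Consolidate
# implies a sub-pipeline below; passthrough means you've hit the floor.

class IonChannel:
    def consolidate(self, outcome) -> bool:
        return False                 # passthrough: the gate is the gate

class QueryOptimizer:
    def __init__(self):
        self.stats: dict[str, int] = {}
    def consolidate(self, outcome) -> bool:
        key, cost = outcome
        self.stats[key] = cost       # learns from execution statistics
        return True

def at_the_floor(layer, probe) -> bool:
    return not layer.consolidate(probe)  # no learning: no level below

print(at_the_floor(IonChannel(), None))              # True
print(at_the_floor(QueryOptimizer(), ("scan", 42)))  # False
```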

It’s not wrappers all the way down. It’s pipelines — until you hit the bool.


Written via the double loop. More at pageleft.cc.