The Parts Bin

Part of the cognition series. Builds on The Handshake.

Near-misses

The Natural Framework derives six roles from temporal flow and bounded storage. The Handshake gives each step a contract: precondition and postcondition. The CS textbook is full of operations that almost satisfy them. Almost is the diagnosis.

PageRank is the first specimen. Its postcondition says “ranked by authority” but the Attend contract requires diversity and a bound. It made a lot of money regardless. A broken step doesn’t kill instantly; it compounds.

Google bolted on re-ranking, topic diversity, and freshness signals over two decades: incremental upgrades toward a contract-preserving morphism, one patch at a time.

Quicksort is the second specimen. It satisfies order, the most visible guarantee. But the Attend contract also requires diversity (survivors are dissimilar) and boundedness (output is finite top-k, not a total order). Quicksort is the default because order is the only guarantee most systems measure.
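A minimal sketch makes the near-miss concrete. The scores and item names below are illustrative; the point is that sorting guarantees order, `heapq.nlargest` adds a bound, and neither penalizes redundancy among the winners.

```python
import heapq

# Near-misses of the Attend contract: sorting satisfies order,
# heapq.nlargest satisfies order + bound, but neither enforces
# diversity, so a single cluster can fill the whole slate.

scores = {
    "apple pie":  0.95,
    "apple tart": 0.94,
    "apple cake": 0.93,
    "beef stew":  0.60,
}

total_order = sorted(scores, key=scores.get, reverse=True)  # ordered, unbounded
top3 = heapq.nlargest(3, scores, key=scores.get)            # ordered, bounded

# Both slates are all-apple: order and bound hold, diversity does not.
```

Run the loop again with these winners feeding the next cycle and the apple cluster only tightens its grip, which is the iteration-stability failure described below.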

Most familiar algorithms are near-misses. They satisfy some guarantees but not all. The formal test is iteration stability: run the full loop — Perceive through Remember and back — and observe which postcondition degrades. Diversity is the guarantee iteration kills first, because without repulsion between winners the same cluster dominates every cycle. Noticing which guarantee fails under iteration: that’s the diagnostic power.

Degenerate cases are the other edge. A chatbot has no policy store: nil Filter, nil Attend, nil Consolidate, nil Remember. Token in, token out, same rate. That is passthrough, predicted by the existence proofs when policy is zero.

Union-find has no selection: every element is kept, partitions only merge, the structure grows monotonically. No competitive core, no lossy step, no compression. Both are useful. Neither learns.

Diagnostic resolution

The monad is the container — it doesn’t break. What breaks is a morphism’s postcondition. What cascades is the composition. The derivation forces contracts of this shape, and the induction proof shows they compose forever if they compose once.

Step N+1’s precondition is step N’s postcondition. A correct algorithm with a broken precondition is a correct algorithm that produces garbage.
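The handshake can be sketched as a composition check. The `make_step` and `compose` helpers here are hypothetical, not from any real library; contracts are plain strings for illustration.

```python
# Sketch of the handshake: each step declares a contract, and
# composition is legal only when step N's postcondition satisfies
# step N+1's precondition. Names and contracts are illustrative.

def make_step(name, pre, post, fn):
    return {"name": name, "pre": pre, "post": post, "fn": fn}

def compose(steps):
    """Validate every interface, then return the composed pipeline."""
    for a, b in zip(steps, steps[1:]):
        if a["post"] != b["pre"]:
            raise TypeError(
                f"{a['name']} -> {b['name']}: postcondition {a['post']!r} "
                f"does not satisfy precondition {b['pre']!r}"
            )
    def pipeline(x):
        for step in steps:
            x = step["fn"](x)
        return x
    return pipeline

filter_step = make_step("Filter", "indexed", "selected",
                        lambda xs: [x for x in xs if x > 0])
attend_step = make_step("Attend", "selected", "ranked",
                        lambda xs: sorted(xs, reverse=True)[:3])

pipe = compose([filter_step, attend_step])
```

Composing the steps in the wrong order raises at wiring time, before any garbage is produced, which is exactly the resolution the governance paragraph below says is usually missing.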

Governance operates at zero resolution. It can’t see the interface, so it replaces the whole composition: fire everyone, rewrite from scratch, new system. Expensive. The framework increases resolution: instead of “the system is broken,” the diagnosis is “Attend’s diversity guarantee is missing.” The parts bin increases it further: instead of “Attend is broken,” the prescription is “this is a top-k sort where you need MMR re-ranking.” Swap one operation, same slot, contract restored.

The most efficient fix requires the most precise name. The most feared enemy is one who cannot be named. The handshake is a naming system.

Diagnosing the LLM took a couple of hours. Three layers, six steps each, SOAP notes with six-component plans. That precision came from the framework, not domain expertise in ML. Before germ theory: “the patient has bad air.” After: “streptococcus, here’s penicillin.” The framework is the microscope. The handshake is why the microscope works.

Agent

The framework is the diagnostic manual. The parts bin, once ordered, is the pharmacy. The handshake is why the prescriptions compose. What would it look like to use them systematically?

Describe. A product manager says: “users sign up but never come back.” An agent maps this to the six steps. Cache works. Users arrive and data is stored. Filter is missing. Users get everything, keep nothing. Consolidate is nil. Nothing changes between sessions.

Diagnose. The agent isolates Filter. Then drills deeper: is it the precondition (wrong input from upstream), the operation (wrong mechanism), the postcondition (contract not satisfied), the fidelity (contract satisfied but too lossy), or the scale (right operation, wrong timescale)?

Prescribe. The agent queries the taxonomy. Filters by: matching precondition, matching postcondition, sufficient fidelity, compatible scale. Returns a ranked list of candidate operations from across domains. “Your Filter slot needs: output strictly smaller, criterion applied uniformly. Candidates from the parts bin, ordered by fidelity and cost.”

Validate. The agent checks that the prescribed operation’s postcondition matches the next step’s precondition. If not, it flags the interface mismatch before you build it.
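The prescribe-and-validate steps can be sketched as a query over the parts bin. The catalog entries, contract strings, and fidelity numbers below are illustrative stand-ins, not a real taxonomy.

```python
# Hypothetical parts-bin query: keep catalog entries whose contract
# fits the broken slot, then rank by fidelity. Entries are stand-ins
# for the catalog tables, with contracts reduced to strings.

CATALOG = [
    {"op": "MMR re-ranking", "pre": "scored candidates",
     "post": "ranked, diverse, bounded", "fidelity": 0.9},
    {"op": "Heap top-k", "pre": "scored candidates",
     "post": "ranked, bounded", "fidelity": 0.7},
    {"op": "Bloom filter", "pre": "keyed items",
     "post": "probable subset", "fidelity": 0.6},
]

def prescribe(pre, post_required):
    """Return candidate operations whose contract fits the slot."""
    fits = [e for e in CATALOG
            if e["pre"] == pre and post_required in e["post"]]
    return sorted(fits, key=lambda e: -e["fidelity"])
```

Asking for a diverse slate over scored candidates returns MMR and silently drops heap top-k: same precondition, missing guarantee.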

The doctor doesn’t need to understand category theory. They need to read the contracts: precondition, postcondition, fidelity. The handshake is the pharmacology. The taxonomy is the PDR. The agent is the resident who can look things up fast.

Catalog

The full catalog follows. Each entry is an operation: input in, output out. If the precondition and postcondition match the contract, the operation fits the slot. Filter decides per-item admissibility: does this item pass the criterion? Attend decides how admitted items relate to each other: order, diversity, and bound are slate-level properties.

Perceive (raw → encoded) — the column every system gets right, because nothing else works until it does.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Lexical analysis | Raw byte stream | Token sequence, parseable |
| Parsing (LL/LR) | Token stream, conforms to grammar | AST with explicit structure, traversable |
| JSON parsing | Raw string, well-formed | Structured object, addressable by key |
| A/D conversion | Continuous analog signal | Discrete samples, quantized |
| SIFT descriptor extraction | Raw pixel grid | Keypoint descriptors, matchable |
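The first row can be sketched in a few lines. The token classes (numbers, identifiers, operators) are illustrative; the contract is the point: raw string in, parseable token sequence out.

```python
import re

# Minimal lexical analysis: raw string in, token sequence out.
# Token classes here are illustrative (integers, identifiers,
# single-character operators), not a full language grammar.

TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))")

def lex(src):
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:          # only trailing whitespace remains
            break
        num, ident, op = m.groups()
        if num:
            tokens.append(("NUM", int(num)))
        elif ident:
            tokens.append(("IDENT", ident))
        else:
            tokens.append(("OP", op))
        pos = m.end()
    return tokens
```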

Cache (encoded → indexed) — the most studied column. Idreos (2018) built a periodic table from five design primitives.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Hash indexing | Records with stable keys | Keyed index, exact retrieval by key |
| B-tree index construction | Records with ordered keys | Balanced index, retrieval + range queries |
| Trie insertion | String keys over finite alphabet | Prefix-indexed, retrieval by string or prefix |
| Inverted index construction | Tokenized corpus with document IDs | Posting lists, retrieval by term |
| LSM-tree flush | Sorted runs in memory | Persistent key-value index, retrievable after compaction |
| Skip-list indexing | Ordered entries | Probabilistic index, O(log n) retrieval |
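One cell of this column, sketched: inverted index construction. The corpus shape (doc ID to token list) is an assumption for illustration.

```python
from collections import defaultdict

# Inverted index construction: tokenized corpus with document IDs in,
# term -> sorted posting list out. Retrieval by term is then a dict lookup.

def build_inverted_index(corpus):
    """corpus: dict mapping doc_id -> list of tokens."""
    index = defaultdict(set)
    for doc_id, tokens in corpus.items():
        for token in tokens:
            index[token].add(doc_id)
    # Sorted posting lists support merge-based intersection later.
    return {term: sorted(ids) for term, ids in index.items()}
```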

Filter (indexed → selected, strictly smaller) — gates the data store, where most systems use exact predicates. The derivation proves a gate must exist whenever outputs are a proper subset of inputs.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Predicate selection (WHERE) | Indexed relation + boolean predicate | Subset matching predicate, strictly smaller |
| Range query | Ordered index + interval bounds | Subset within interval, strictly smaller |
| Threshold filtering | Scored items + threshold t | Subset meeting threshold, strictly smaller |
| Regex extraction | String corpus + pattern | Matching spans retained, non-matches discarded |
| k-NN radius pruning | Metric index + query + radius r | Subset within radius, strictly smaller |
| Pareto filtering | Candidates with objective vectors | Non-dominated subset, strictly smaller |
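The last row of the table, sketched. The example assumes higher is better on every objective; candidates are plain tuples for illustration.

```python
# Pareto filtering: keep only non-dominated candidates. A candidate is
# dominated if another candidate is at least as good on every objective
# and strictly better on at least one. Assumes higher is better.

def dominates(a, b):
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_filter(candidates):
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]
```

The output is a strict subset whenever any candidate is dominated, which is exactly the Filter postcondition.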

Attend ((policy, selected) → ranked, diverse, bounded) — reads the policy store: given the survivors, which are worth pursuing? Policy is a function; it routes data. Control separates from data (derived). Most ranking algorithms satisfy order but miss diversity and bound.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| MMR re-ranking | Candidates + relevance scores + similarity measure | Top-k ordered, diversity penalized, bounded |
| DPP top-k selection | Candidates + relevance weights + similarity kernel | Top-k ranked, mutually dissimilar, bounded |
| xQuAD re-ranking | Candidates + relevance + subtopic coverage | Top-k ordered, aspect coverage explicit, bounded |
| Submodular maximization | Candidates + submodular utility (relevance + coverage) | Top-k greedy-ranked, diminishing-return diversity, bounded |
| Diversified beam search | Stepwise expansions + diversity penalty | Top-b retained, non-redundant alternatives, bounded |

Near-misses (diagnostic counterexamples): PageRank satisfies order but not diversity or bound; quicksort delivers a total order, neither bounded nor diversity-aware; heap top-k adds the bound but still has no repulsion between winners.
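A greedy MMR sketch shows all three guarantees landing at once. The relevance scores, similarity function, and `lambda_` weight below are illustrative.

```python
# MMR re-ranking sketch: greedy top-k that trades relevance against
# similarity to already-selected items. lambda_ weights relevance vs.
# redundancy; sim and the scores are illustrative inputs.

def mmr(candidates, relevance, sim, k, lambda_=0.7):
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(c):
            redundancy = max((sim(c, s) for s in selected), default=0.0)
            return lambda_ * relevance[c] - (1 - lambda_) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected  # ordered, diversity-penalized, bounded by k
```

With two near-duplicates and one distinct candidate, a plain top-2 keeps both duplicates; MMR keeps one and reaches for the dissimilar item instead.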

Consolidate (persisted → policy′) — the backward pass. Reads from Remember (which caches the ranked output) and writes to the substrate, reshaping how each forward stage processes on the next cycle. Contains its own inner loop: mutate, score, select. The outer pipe cannot observe this loop directly — it can only notice that Attend’s behavior changed. Babbage’s Difference Engine (1822) is the mechanical precedent: take a sequence of values, compute successive differences, extract the generating polynomial. Ranked examples in, compact rule out. The interface is Consolidate’s type signature; Babbage built it without Attend to feed it.

I-Con (2025) built a periodic table for this column. A blank cell predicted a new algorithm that beat the state of the art.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Gradient descent update | Loss contributions + current weights | Weights updated, future predictions altered |
| Bayesian posterior update | Prior parameters + weighted observations | Posterior compressed, future inference altered |
| K-means update | Weighted points + codebook size k | k prototypes replacing many points, lossy |
| Incremental PCA | Observations in high dimension | Low-rank basis, future projection altered |
| Decision tree induction | Ranked labeled examples | Compact rule set, future classification altered |
| Prototype condensation | Ranked candidates + compression budget | Small exemplar set, lossy approximation for future matching |
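One inner-loop iteration of the K-means row, sketched in 1-D for brevity. Real use would run this to convergence with a library implementation; the contract is the point: many points in, k lossy prototypes out.

```python
# One K-means consolidation step: assign each point to its nearest
# centroid, then recompute centroids as cluster means. 1-D points
# for illustration; empty clusters keep their old centroid.

def kmeans_step(points, centroids):
    k = len(centroids)
    clusters = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
        clusters[nearest].append(p)
    return [sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)]
```

Iterating mutate, score, select inside Consolidate looks exactly like iterating this step: the outer pipe never sees the loop, only the changed prototypes.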

Remember (ranked → persisted) — the last forward stage. Lossless relative to its input: no additional loss at this step. Remember also serves as the cache for Consolidate: ranked outcomes are stored here, and Consolidate reads from them asynchronously. Remember is not a separate store. It is the historically shaped substrate, the part of the medium that carries the system’s past forward. A database row is Remember for the database pipe but Cache for the CRM pipe. A log entry is Remember for the logger but Cache for the monitoring pipe.

If the thing being persisted is a representation rather than the final entity, it’s Cache at this level, not Remember. The discipline: list write operations only.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| WAL append + fsync | Serialized state record | Durable on crash, recoverable next cycle |
| Transaction commit | Validated write set | Persisted, visible for future reads |
| Git object write + commit | Content-addressed objects + manifest | Durable commit graph, retrievable by hash |
| Checkpoint serialization | In-memory model/state | Persisted checkpoint, loadable on next run |
| Copy-on-write snapshot commit | Consistent compressed state image | Persistent snapshot, addressable by version |
| SSTable flush | Immutable key-value run in memory | Durable on-disk run, retrievable by key |
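The first row, sketched. A production WAL frames records with lengths and checksums; this sketch uses newline-delimited text and an illustrative path, keeping only the contract: append, fsync, and the record is recoverable next cycle.

```python
import os

# WAL append sketch: serialize a record, append it, and fsync so the
# write reaches stable storage before the cycle is acknowledged.
# Newline-delimited records are an illustrative simplification.

def wal_append(path, record):
    with open(path, "ab") as f:
        f.write(record.encode() + b"\n")
        f.flush()
        os.fsync(f.fileno())  # force the OS to persist the write

def wal_replay(path):
    with open(path, "rb") as f:
        return [line.decode().rstrip("\n") for line in f]
```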

Grid

The catalog is a list. A list lets you browse. Browsing doesn’t scale. You need an index. The index needs axes.

Take Filter. Two axes, selection semantics vs. error guarantee:

| | Exact | Bounded approximation | Probabilistic |
| --- | --- | --- | --- |
| Predicate | WHERE, range query | Threshold filtering (soft margin) | Bloom filter |
| Similarity | Exact NN pruning | k-NN radius pruning | LSH filtering |
| Dominance | Pareto filtering | ε-dominance filtering | Stochastic dominance |

Every cell fills with a known algorithm. Bloom filter at predicate × probabilistic — the most deployed probabilistic data structure in computing. ε-dominance at dominance × bounded — standard in multi-objective optimization. Stochastic dominance at dominance × probabilistic — decision theory since the 1960s. A clean partition with no awkward fits validates the axes, but a 3×3 where every cell fills on sight is a confirmation, not a prediction.
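The predicate × probabilistic cell, sketched. Sizes, hash count, and the SHA-256-based hashing are illustrative choices; the contract is what matters: false positives possible, false negatives impossible.

```python
import hashlib

# Minimal Bloom filter: the predicate-times-probabilistic cell.
# m bits, k hash positions per item. Membership answers are
# "definitely not" or "probably yes".

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # bit array packed into one int

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits & (1 << pos) for pos in self._positions(item))
```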

Take Attend. Lay operations on output form vs. redundancy control:

| | None | Implicit | Explicit |
| --- | --- | --- | --- |
| Top-k slate | Heap top-k | Beam search | MMR, DPP top-k, xQuAD |
| Single best | argmax | Tournament selection | Simulated annealing, CMA-ES |
| Path/tree | Dijkstra, A* | MCTS | Portfolio solvers |

The right column filled late. CS built ranking algorithms for decades and almost never baked redundancy control into the postcondition. It was bolted on after. Simulated annealing and CMA-ES find single optima by explicitly diversifying the search: temperature schedules force basin-hopping, covariance matrices enforce spread. Portfolio solvers spawn threads with different random seeds; stochasticity encourages divergence, budget kills at deadline, final selection picks the best from a diverse pool. Biological evolution does this with mutation rate as the stochastic dial.
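The portfolio pattern can be sketched in miniature: seeded restarts diverge, a shared budget kills them at the deadline, and final selection picks the best of the pool. The 1-D objective and hill-climber parameters are illustrative.

```python
import random

# Portfolio sketch: several random-restart hill climbers with different
# seeds, a fixed step budget, and a final pick of the best from the
# pool. Divergence comes from the seeds; the objective is illustrative.

def hill_climb(objective, seed, steps=200):
    rng = random.Random(seed)
    x = rng.uniform(-10, 10)
    for _ in range(steps):
        candidate = x + rng.gauss(0, 0.5)
        if objective(candidate) > objective(x):
            x = candidate
    return x

def portfolio(objective, seeds):
    pool = [hill_climb(objective, s) for s in seeds]  # divergent runs
    return max(pool, key=objective)                   # best of the pool
```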

The grids organize the catalog. The axes partition the design space cleanly enough to prescribe: given a broken slot, name the coordinates, look up the candidate. Whether finer grids can do more — find blank cells, predict operations that haven’t been composed yet — is the next question.

Future work

The derivation establishes contracts. The catalog and grids give the pharmacy. Two things are ready now:

Three need formalization work. The existence proofs and types are defined; the composition proofs are sketched:

One needs new theory. The derivation is for linear pipelines only:

Every gap in the bin is an almost that hasn’t been named yet.


Written via the double loop.