cons
Part of the cognition and methodology series. Builds on Functor Wizardry and The Natural Framework.
We all want to do our best work. But it takes so long, and so much of it is repetitive. Even with AI tools, we watch ourselves repeat: same prompts, same debugging patterns, same review cycles. You solved this exact class of issue three months ago, but the solution lives in a chat transcript you’ll never reopen. The agent that helped you has no memory of it either. Where does the experience go between sessions?
Functor Wizardry left this gap open. The pipeline composes, but the circle doesn’t close. No epmem → smem morphism, no monoid. Here’s how it closes.
126 turns, 4 days
I worked on Soar across 24 sessions. The consolidation harness logged every turn — 21,000 action records total, 1,369 in Soar sessions, 126 with Soar-relevant intent. Here’s what those 126 turns actually did:
Perceive the history. Load action records — tool sequences, prompt intents, timestamps. 21,000 entries. Grep for Soar-relevant turns: 126 survive.
Filter the noise. TF-IDF on prompt tokens drops the “yes” and “sure” confirmations, the deploy cycles, the PageLeft crawls. What survives is the arc: research → diagnose → prescribe → implement → write.
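The filter step can be sketched with plain TF-IDF over prompt tokens. The record shape, field names, and threshold here are assumptions, not the harness’s actual schema; the point is that confirmations repeated across the corpus score low and drop out.

```python
import math
from collections import Counter

def tfidf_filter(records, min_score=1.0):
    """Keep turns whose prompts carry information; drop the "yes"/"sure" noise.
    Score = sum over token positions of TF-IDF; low means common across the corpus."""
    docs = [r["prompt"].lower().split() for r in records]
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))  # document frequency
    kept = []
    for rec, doc in zip(records, docs):
        if not doc:
            continue
        tf = Counter(doc)
        score = sum((tf[t] / len(doc)) * math.log(n / df[t]) for t in tf)
        if score >= min_score:
            kept.append(rec)
    return kept

# Hypothetical action records: twenty bare confirmations, one substantive turn.
records = [{"prompt": "yes"}] * 10 + [{"prompt": "sure"}] * 10 + [
    {"prompt": "grep the Soar epmem source for activation decay"}
]
survivors = tfidf_filter(records)  # only the substantive turn survives
```

The confirmations appear in half the corpus, so their one token earns almost no IDF weight; the substantive turn’s tokens are unique to it and clear the bar easily.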
Attend the arc. Rank the survivors into phases and identify the decision points a human can’t skip:
- Research (Mar 19–23): fetch arxiv papers, read Soar source, email Laird. 40+ turns of WebFetch → Read(.pdf) → Grep → Read(.cpp).
- Diagnose (Mar 23–25): map to framework roles, substantiate against source code. Read(.md) → Agent → Grep → Edit(.md).
- Prescribe (Mar 25): look up parts bin, propose candidates. Read(.yml) → Edit(.yml) → Edit(.md).
- Implement (Mar 25): draft PRs against Soar repo. Read → Agent → Bash(gh).
- Write up (Mar 25–27): draft posts, humanize, codex review. Edit(.md) → Skill(humanize) → Skill(codex) → Bash(deploy).
Five phases. Five human decisions: what’s broken, which diagnosis, which algorithm, which PR, which framing. Everything between decisions is tool sequences that repeat.
Transmit the result: this post and the skill spec it implies. Both prose with contracts.
Consolidate: compress the arc into a pipeline that runs next time. The checkpoints are defined, the intermediate documents are identified, and each one only has to satisfy the postcondition that feeds the next step. A consolidated “diagnose foreign system” skill would run:
- Research (automated) → checkpoint: human picks what’s broken → diagnosis document
- Diagnose (automated) → checkpoint: human confirms role mapping → prescription document
- Prescribe (automated) → checkpoint: human selects algorithm → spec document
- Implement (automated) → checkpoint: human reviews PR → code
- Write up (automated) → checkpoint: human edits framing → published post
Each checkpoint produces a prose document. Each document compresses the sessions that preceded it. Each only needs to be good enough for the next phase. The quality bar is the contract, not perfection.
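The checkpoint structure above can be sketched as data: each phase is an automated step, a human decision, and the postcondition its document must satisfy before the next phase runs. The names and toy phases are illustrative, not the actual skill spec.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Phase:
    name: str
    run: Callable[[str], str]             # automated: prior document -> draft
    checkpoint: Callable[[str], str]      # human decision: draft -> approved document
    postcondition: Callable[[str], bool]  # the contract the next phase relies on

def run_pipeline(phases, seed=""):
    doc = seed
    for p in phases:
        draft = p.run(doc)
        doc = p.checkpoint(draft)
        # The quality bar is the contract, not perfection.
        if not p.postcondition(doc):
            raise ValueError(f"{p.name}: document fails its contract")
    return doc

# Toy phases; the checkpoint here just approves the draft unchanged.
approve = lambda draft: draft
phases = [
    Phase("research", lambda d: d + "findings\n", approve, lambda d: "findings" in d),
    Phase("diagnose", lambda d: d + "diagnosis\n", approve, lambda d: "diagnosis" in d),
]
final = run_pipeline(phases)
```

Each document only has to pass its own postcondition, which is exactly the “good enough for the next phase” bar.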
Closing the loop
The loop closes when the output changes the input. A new skill means future sessions run differently. The action store shifts. Next time the pipeline runs, it perceives a different distribution.
If the skill worked, the old pattern vanishes — absorbed. If it didn’t, the old pattern persists and the pipeline proposes again. The pipeline is its own error signal.
- Cycle 0: No skills. 126 raw turns. Pipeline extracts the five-phase arc.
- Cycle 1: Diagnosis skill installed. Next foreign codebase: 5 checkpoints instead of 126 turns. Pipeline perceives the compressed distribution.
- Cycle n: Skills compose. The pipeline finds higher-order patterns — sequences of skills, not sequences of tools. Each cycle’s proposals reflect the current library, not the original raw logs.
The store trends toward fewer recurring patterns. Bad skills create temporary growth, but self-correction absorbs them. The direction is toward the limit.
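“Fewer recurring patterns” can be made measurable: count recurring tool-name n-grams in the action store before and after a skill absorbs a loop. The tool names come from the phases above; the detector itself is a hypothetical sketch, not the harness.

```python
from collections import Counter

def recurring_ngrams(tool_seq, n=3, min_count=2):
    """Tool-name n-grams that repeat across the store: raw material for skill proposals."""
    grams = Counter(tuple(tool_seq[i:i + n]) for i in range(len(tool_seq) - n + 1))
    return {g: c for g, c in grams.items() if c >= min_count}

# Cycle 0: the raw research loop repeats, turn after turn.
before = ["WebFetch", "Read", "Grep", "Read"] * 10
# Cycle 1: a skill absorbed the loop; what recurs now is skill-level, not tool-level.
after = ["Skill:research", "Checkpoint"] * 10

raw_patterns = recurring_ngrams(before)
compressed_patterns = recurring_ngrams(after)
```

The compressed store still recurs, but at the skill level: exactly the higher-order patterns cycle n perceives.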
The monoid
Without the epmem → smem morphism, the three stores don’t compose and you don’t get monoidal structure. With it:
- epmem → smem: Episodes become patterns.
- smem → pmem: Patterns become procedures.
- pmem → epmem: Run the skill on the next codebase. Procedures produce new episodes.
The operation is one full cycle: take an action history and a skill cache, produce an updated skill cache, run from that cache next time. In the limit, grouping cycles doesn’t change the final cache. The identity is null consolidation: nothing learned, cache unchanged. The monoid closes.
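A toy model of the operation, with the cache as a set of named skills. Set union stands in for real consolidation; it is chosen only to make the identity element and the fixed point visible, not as a claim about how the harness stores anything.

```python
def cycle(history, cache):
    """One full consolidation cycle: propose a skill for every recurring
    pattern the current cache does not already absorb."""
    proposals = {pattern for pattern in history if pattern not in cache}
    return cache | proposals

def null_consolidation(history, cache):
    return cache  # identity: nothing learned, cache unchanged

history = {"research-loop", "diagnose-loop", "writeup-loop"}
c1 = cycle(history, set())  # first cycle learns everything recurring
c2 = cycle(history, c1)     # converged: another cycle changes nothing
```

On a converged cache the cycle is a fixed point, and composing with null consolidation on either side leaves the cache unchanged, which is all the monoid laws ask for here.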
What comes free
Idempotency. On a converged skill set, running the full cycle twice on the same action store gives the same result. Two iterations to convergence, same as the slop-detection finding.
Self-correction. A bad skill creates new recurring patterns — workarounds. Next cycle detects them, proposes replacements. The error signal is the action store itself.
Compositionality. Skills compose because they’re endofunctors on the same cache. Composition emerges from the monoid, not from explicit design.
Time compression. The next diagnosis takes hours instead of days. Not because the work disappeared, but because the human attention moved from every turn to five decisions.
But the boundary moves. Each cycle, some of what was Attend becomes Filter. I used to judge every word choice by hand. Then I noticed the patterns: filler words, stacked hedges, choppy rhythm. Those became /humanize, /tighten, /sharpen. My taste, codified. The judgment didn’t disappear. It moved upward: from words to sentences to arc.
That’s the ratchet. cons doesn’t just add skills to the cache. It absorbs Attend into Filter, and the human moves to higher ground.
The first skill specs were loose: sixty lines of agent prompts, codex filter logic, dead-end formats. I was encoding implementation because I couldn’t yet separate judgment from procedure.
After the second run I saw what was wasted. Development stage kept getting re-discovered at every step instead of flowing forward as a constraint. Fan-out logic was identical across agents. Gate questions were multiple-choice when they should have been open-ended — I was constraining my own thinking at the one point where I needed freedom.
Three cycles compressed the implementation detail into sub-skills and pushed the constraints forward. What’s left is five open questions at five gates. The rest is mechanical. Four days became an hour of pure Attend.
This isn’t new. An entire generation of startups ran this loop and called it Lean Startup: build (procedures → episodes), measure (episodes → knowledge), learn (knowledge → procedures).
Product, startup, incubator. You attend to the product until the process stabilizes. Then you attend to the process. Then you attend to the process that produces processes. Each layer shift is a consolidation cycle: the contents become the machinery, and you move up to monitor the machinery instead of operating it.
Written via the double loop. This post is its own demo: the conversation that wrote it performed perceive → filter → attend → transmit on the Soar episodes, and the result is the consolidation record you just read.