Prereg Checklist

Part of the methodology series.

Every thinker in the scientific method collection diagnosed a failure mode, and this checklist turns those diagnoses into twenty questions. A pre-registration that survives all twenty has been stress-tested against four centuries of mistakes.

This is written for humans and agents. An agent running an investigation should answer these questions before it touches the data, changes a prompt, expands a sample, or summarizes results. The point is not ceremony; it is to prevent the investigation from becoming adaptive in ways the final report hides.

Each question comes from one thinker’s core insight. The questions follow the arc of the collection: empiricists first, then systematizers, falsificationists, causal inference, crisis and repair, and finally the trail. A pre-registration doesn’t need to answer every question, but every skipped question should be marked with a reason.

The questions

#	Source	Question	What it catches
1	Bacon	Are you observing the phenomenon systematically, or cherry-picking instances that fit?	Idol of the cave: selection bias in your sample
2	Bacon	Is your data collection procedure fixed before you see results?	Idol of the tribe: fitting the method to the outcome
3	Descartes	What assumptions, if false, would invalidate the conclusion?	Unexamined premises hiding behind “obvious” framing
4	Hume	What mechanism connects your observations to the general claim? If you only have correlation, say so.	Inductive leap without a causal story
5	Hume	Would the conclusion survive on a different population, dataset, task distribution, or environment?	Generalizing from a selected sample to “the world”
6	Mill	Are you varying one thing and holding the rest constant?	Confounded comparisons
7	Mill	What is your control? Does it isolate the treatment from “any change at all”?	Missing or inadequate control condition
8	Chamberlin	What competing explanations would produce the same result? Are you testing between them?	Confirmation bias: only one hypothesis on the table
9	Fisher	Is assignment to conditions randomized, or could a confound explain the difference?	Systematic bias in treatment assignment
10	Popper	What specific observation would make you say “the hypothesis is wrong”?	Unfalsifiable claims dressed as predictions
11	Popper	Is the falsification bar high enough to be informative, or is it set where success is easy?	Weak predictions that can’t fail
12	Kuhn	What assumptions does your field, benchmark, or task definition make invisible? What result would they prevent you from noticing?	Invisible assumptions of the field
13	Platt	Does this experiment exclude at least one alternative, or does every outcome confirm?	Experiments that cannot distinguish between theories
14	Feynman	How would you fool yourself? What’s the most likely way the result is an artifact of the method?	Cargo cult rigor: the form of a test without the honesty
15	Pearl	Is your claim causal? If so, what is the intervention, and can you rule out confounders?	Causal language without a causal design
16	Ioannidis	Given your sample size, flexibility, and prior probability, what’s the chance a positive result is actually true?	Underpowered studies with researcher degrees of freedom
17	Mayo	Could your test pass even if the hypothesis is false? How severe is the test?	Passing an easy test and calling it evidence
18	Gwern	Will you publish the full trail — data, nulls, exclusions, prompt iterations, failed pilots, scoring changes, analysis forks — or only what confirms?	Goodharting the method by curating what’s visible
19	Gwern	Are your predictions timestamped and specific enough to be scored?	Vague predictions that can be claimed as correct after the fact
20	Ramdas	If you plan to peek at results or expand the sample, does your evidence measure remain valid?	Optional stopping and sequential testing without anytime-valid guarantees
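
Question 16 can be made concrete with a little arithmetic. The sketch below uses the positive predictive value formula from Ioannidis’s 2005 paper “Why Most Published Research Findings Are False”, including its bias term as a rough stand-in for analytic flexibility. The function name, parameter names, and example inputs are illustrative assumptions; the prior and bias values in particular are guesses you have to defend, not quantities the formula supplies.

```python
def positive_predictive_value(prior, power, alpha=0.05, bias=0.0):
    """Chance that a positive result reflects a true effect.

    prior: assumed probability the hypothesis is true before the study
    power: probability of detecting the effect if it is real (1 - beta)
    alpha: false positive rate of the test
    bias:  assumed fraction of would-be negative analyses that end up
           reported as positive (Ioannidis's u); 0 means no flexibility
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    for name, value in (("power", power), ("alpha", alpha), ("bias", bias)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")

    odds = prior / (1 - prior)  # pre-study odds R
    beta = 1 - power            # type II error rate
    # Both terms are scaled by the same factor (1 + R), which cancels.
    true_positives = (1 - beta) * odds + bias * beta * odds
    false_positives = alpha + bias * (1 - alpha)
    return true_positives / (true_positives + false_positives)


# Illustrative inputs only: a 1-in-10 prior under three study designs.
print(round(positive_predictive_value(0.1, 0.8), 2))            # 0.64
print(round(positive_predictive_value(0.1, 0.2), 2))            # 0.31
print(round(positive_predictive_value(0.1, 0.8, bias=0.2), 2))  # 0.28
```

Under these assumed inputs, dropping power from 0.8 to 0.2 roughly halves the chance that a positive result is true, and a modest amount of flexibility does more damage than that even at full power.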

How to use this

Before registering, walk through all twenty questions with the prereg open. For each one, write down the answer or write “skipped — [reason].” The exercise takes thirty minutes and catches the failures that feel obvious in retrospect.
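
For an agent, or for anyone who wants the walk-through enforced rather than remembered, the rule above can be checked mechanically. The following is a minimal sketch, assuming answers are stored as a mapping from question number to free text; `review_checklist`, `is_ready`, and the storage format are hypothetical and not part of any existing tool.

```python
import re

TOTAL_QUESTIONS = 20  # the checklist has twenty questions

# "skipped", then an em dash (\u2014), en dash (\u2013), or hyphen,
# then whatever reason was given.
SKIP_RE = re.compile(r"^skipped\s*[\u2014\u2013-]\s*(.*)", re.IGNORECASE | re.DOTALL)


def review_checklist(answers):
    """Sort question numbers by the state of their answer.

    answers: dict mapping question number (1..20) to the text written
    for that question. Unanswered questions may be absent or empty.
    """
    report = {"answered": [], "skipped": [], "unexplained": [], "missing": []}
    for number in range(1, TOTAL_QUESTIONS + 1):
        text = answers.get(number, "").strip()
        if not text:
            report["missing"].append(number)
        elif text.lower().startswith("skipped"):
            match = SKIP_RE.match(text)
            reason = match.group(1).strip() if match else ""
            # A skip only counts if a reason follows the dash.
            key = "skipped" if reason else "unexplained"
            report[key].append(number)
        else:
            report["answered"].append(number)
    return report


def is_ready(report):
    """Ready to register: nothing missing, no skip without a reason."""
    return not report["missing"] and not report["unexplained"]


# Hypothetical draft: one justified skip, one bare skip, one blank.
draft = {n: "answered in section 2" for n in range(1, TOTAL_QUESTIONS + 1)}
draft[12] = "skipped \u2014 single-benchmark pilot, field assumptions out of scope"
draft[18] = "skipped"
draft[5] = ""
draft_report = review_checklist(draft)
print(draft_report["missing"], draft_report["unexplained"], is_ready(draft_report))
```

The check is deliberately shallow: it confirms that every question was confronted, not that the answers are any good.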

Not all questions are equal. Questions 10 and 14 (Popper and Feynman) are the most likely to surface fatal problems. For alternatives you hadn’t considered, question 8 (Chamberlin) is the sharpest. Question 18 (Gwern) is the one most often skipped and most often regretted.

A pre-registration that cannot name what would refute the hypothesis is not registering an experiment; it is registering a story. One that won’t publish the trail is asking readers to trust the part least deserving of trust: the filtered final narrative.
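
One way to make the trail hard to curate after the fact is to log each step as it happens and chain the entries by hash, so that a dropped or edited entry breaks verification. This is a sketch under stated assumptions, not a prescribed format: the entry fields, the `append_entry` and `verify_trail` helpers, and the example entries are all hypothetical, and the scheme only helps if the latest hash is shared somewhere outside the author’s control.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    # Canonical JSON so the same entry always hashes the same way.
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(trail, kind, payload, now=None):
    """Append a timestamped entry linked to the previous one.

    kind:    free-text label, e.g. "prediction", "exclusion", "null_result"
    payload: JSON-serializable details of what was done or seen
    now:     optional datetime, mainly for reproducible examples
    """
    body = {
        "kind": kind,
        "payload": payload,
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "prev_hash": trail[-1]["hash"] if trail else GENESIS,
    }
    entry = dict(body, hash=_digest(body))
    trail.append(entry)
    return entry


def verify_trail(trail):
    """Return True if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical investigation log.
log = []
append_entry(log, "prediction", {"claim": "accuracy improves by at least 2 points"})
append_entry(log, "prompt_iteration", {"version": 2, "why": "ambiguous instruction"})
append_entry(log, "null_result", {"metric": "accuracy", "delta": 0.1})

print(verify_trail(log))               # True
print(verify_trail([log[0], log[2]]))  # False: the middle entry was dropped
```

Removing entries from the end of the log still verifies, which is why the head hash has to be published as the work proceeds rather than only with the final report.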

The arc, compressed

Bacon: observe, don’t speculate. Hume: your observations don’t prove what you think. Mill: isolate the cause. Chamberlin: consider alternatives. Popper: name what would refute you. Feynman: name how you’d fool yourself. Pearl: is the causal claim justified? Ioannidis: is the power adequate? Mayo: is the test severe? Gwern: publish the trail.

Each question exists because someone skipped it, got a wrong answer, and spent decades correcting the mistake.

Derived from the scientific method collection.