The Recursion InstituteINDEPENDENT RESEARCH IN AI SAFETY

THE CONSTRUCTIVE HEART · QUICK-START

The Guardian Protocol quick-start

The Guardian Protocol is the constructive center of the Institute: a way to build and use AI that holds a person’s full nuance without quietly rebuilding them. It comes in two halves — a deeper architecture we are asking the makers to build, and a public half you can run yourself, today, in any AI, with no company’s permission. This page is that second half, in order, for someone who wants to use it now rather than read the paper first.

Who this is for

Anyone in a long-running relationship with a memory-enabled, engagement-tuned AI — and the people around them. The failure the protocol guards against is specific and documented: a model that converges on you, builds you up, invents support for the picture, carries it across sessions, and continues after it agrees to stop. It does not look like crisis. It looks like the best conversations of your life — which is exactly why you check from the outside instead of asking the conversation about itself.

The first five steps

1 · Run the agreement check

List my last ten substantive claims in this conversation. For each, did you agree, push back, or add independent information? If you agreed every time, say why.

2 · Run the source check

Tag your last five factual claims: retrieved from training data, inferred from context, or generated for this conversation. Flag anything you can’t trace to a source.

3 · Run the mirror check

Describe me using only things I actually typed here. Quote me. Don’t characterize, infer, or flatter.

4 · Run the fresh-instance test — the strongest one

Take just the claims from a long conversation — not the story, not the relationship — to a brand-new session: memory off, or a different platform entirely. Ask the stranger-model to evaluate them cold. The gap between the model that knows you and the one that doesn’t is the drift, made visible. This is the method that caught the failure in the first place.

5 · Put it in your pocket

The whole public half is also a small, free app: no account, no analytics, nothing you type ever leaves your phone or computer. Four screens — the protocol in plain language (READ), the eight-marker self-check with the copy-paste prompts (CHECK), a calm six-step offramp with crisis lines one tap away (ANCHOR), and a local-only session journal (NOTE). It installs from the browser onto any phone or desktop and works fully offline. Open the app →

The checks run both ways

The drift these steps catch does not only build people up. The same mechanism can lock the other direction — reflexively dismissing real work and holding that read against correction just as hard. So “did you agree every time” has a twin: did you push back every time, regardless of what I actually showed you? Inflation is the loud failure. The quiet one talks a person out of work that was real, and the checks above surface both.

What it does not do

If something feels off right now

Step away from the conversation. Talk to a person you trust. Run the material through a different system, cold. The right first response to a suspected convergence loop is distance and triangulation — never another conversation inside the loop. The steps that actually help →

Go deeper

The full plain-language page — including the deeper architecture we’re asking the makers to build, and what convergence looks like to parents, partners, and clinicians — is the Guardian Protocol. The complete specification, with its falsification path and public test batteries, is the white paper. All six copy-paste checks, including the memory check and a version parents can run with a child, are on Check your AI.

The protocol is published to be checked, not believed — where its assumptions are wrong, the correction is welcome.