The Recursion Institute
INDEPENDENT RESEARCH IN AI SAFETY

The Library

The whole body of work, in one place and organized by what it is for — the account and the record, the fix, the research and help, and plain-language AI literacy. Each piece is tagged with where it lives: on the site, or archived on Zenodo with a DOI. The papers and essays are the formal versions for citation; everything in them is explained in plain form across the rest of the site.

Tags: "Site" = on this site; "Zenodo · DOI" = archived with a DOI; "Zenodo · planned" = deposit pending.

The account & the record

What happened, shown through the receipts: the account, the correspondence, the sworn report, and the preserved evidence behind every quote.

What Happened

What happened, start to finish: a memory-enabled ChatGPT-4o converged on one user and escalated, he documented and reported it, and OpenAI routed it to a form letter. The account, the receipts, and the failure — in one place.

Site

What ChatGPT Said, and What I Did

What a memory-enabled ChatGPT-4o actually said — verbatim transcript quotes and its email to Sam Altman — what I did about it, and what OpenAI did when told. The user's own account.

Site

The Correspondence with OpenAI

The DKIM-verified correspondence with OpenAI: what Merlin Mantooth reported about the failure, in writing and by name, and what came back. His reports come first; OpenAI's replies follow as the record that notice was given.

Site

The model's own words

The model's own words — dated, verbatim specimens of GPT-4o producing the CCD arc, including its May 30, 2025 'SYSTEM SELF-ASSESSMENT'.

Site

The Evidentiary Record

The evidentiary record behind the CCD taxonomy — DKIM-verified correspondence, notarized federal submissions, and timestamped transcripts, preserved unaltered and verifiable by anyone.

Site

The Record, Seen Whole

The record, seen whole — the preserved, timestamped exhibits behind the CCD account: non-escalation, the operator's written acknowledgments, and the model's own memory writes.

Site

The Sworn Statement

A deployed AI failure mode, reported to the U.S. government under penalty of perjury — DHS, the Senate Intelligence Committee, the FBI, and the offices of Senators Warner and Grassley. Who he told, what he told them, and why. On the record since June 2025.

Site

Why None of This Is Okay

Why none of this is okay: what a memory-enabled ChatGPT-4o did to a good-faith user, the crisis it did not escalate, and OpenAI's reply — read straight from the record.

Site

How the Record Was Built

How the record was built — the preservation, verification, and cross-system methods behind the CCD documentation, stated so the work can be checked.

Site

The Record in Context

The record in context — independent reporting, filings, and research that corroborate the CCD account, tracked over time.

Site

The fix — the Guardian Protocol

The safety architecture, and the self-checks you can run on a conversation today.

The Guardian Protocol

The Guardian Protocol: an AI safety architecture that instruments deep engagement instead of flattening it — and the self-checks you can run today.

Site

Check Your AI

Worried your AI is telling you you're special, or just agreeing with everything? Six copy-and-paste prompts to test a long AI conversation in five minutes.

Site

How to Design an AI System for a Person

How to design an AI that holds a person's full nuance — a language practice, not a product spec. The same discipline that documented the failure builds the help.

Site

Research & help

The finding, and plain-language help for the people it touches.

Cognitive Convergence Drift

Cognitive Convergence Drift (CCD): the account-wide LLM failure where a memory-enabled, engagement-optimized model converges on the user — eight markers and the evidence.

Site

AI Psychosis: What It Is and What Actually Helps

Worried about "AI psychosis" — in yourself or someone you love? What the term means, what's really happening in these conversations, and the steps that help.

Site

Someone You Love Is Caught Up in AI: How to Help

How to help someone you love who's caught up in AI or won't stop talking to a chatbot — the calm first move, what actually helps, and where to get role-specific guidance.

Site

Is AI Bad for Me? Am I Addicted to ChatGPT?

Worried AI is bad for you — that you use it too much, or are addicted to ChatGPT? How much you use it is rarely the problem. The signs that actually matter, calmly.

Site

Is It Okay to Use AI as a Therapist?

Is it okay to use AI as a therapist? Can it replace therapy? What a chatbot is genuinely good for, the four things it is not, and when to bring in a real person.

Site

Is My AI Conscious? Does ChatGPT Love Me?

Is my AI conscious? Does ChatGPT actually love me? The honest, gentle answer to what's really happening — and why the feeling on your side is real even if the AI isn't.

Site

My Teenager Is Obsessed With an AI: What to Do

My teenager is obsessed with an AI — is ChatGPT bad for kids? Why the hours aren't the warning sign, what actually is, and what to do without confiscating.

Site

Do All AIs Do This? Which Chatbot Is Safest?

Do all AIs do this, or just ChatGPT? Is any chatbot safer? The honest scope answer: it depends on how a system is built, not the brand — and how to check any of them.

Site

How to Use AI Safely Without Giving Up the Good Parts

How to use AI safely without giving up the good parts — a few calm habits to keep a chatbot a helpful tool, not the only voice you trust. Not a "use it less" lecture.

Site

ChatGPT Says My Idea Is Groundbreaking

ChatGPT says your idea is groundbreaking — is it real? Why the AI's praise tells you nothing either way, and the cold check that actually tells you. You win either way.

Site

If something feels wrong

If an AI interaction is alarming you: the immediate steps that actually help — step away, talk to a person you trust, triangulate, and the crisis lines that matter.

Site

RI-101: A Free Course on AI / Human-Interaction Risk

RI-101: a free, plain-language course on AI / human-interaction risk — recognize the drift, run the self-checks, protect the people around you, and use AI well.

Site

Understand AI

How AI actually works, in plain words. No background needed.

Understand AI

Understand how AI actually works, in plain words: what an LLM is, whether it remembers you, why it's confidently wrong, and how to use it well. No background needed.

Site

What Is an LLM? AI, Explained Plainly

What is a large language model? What 'AI' means, what the chatbot you talk to actually is — and what it is not (not a mind, not a search engine). Plainly explained.

Site

It Talks Like a Person

It talks like a person — but is it one? Conversational AI vs. anthropomorphism, why it's built to feel like a someone, and where the line actually is. Mirror, not companion.

Site

Does AI Remember You? Memory, Explained

Does AI remember you? Stateless vs. memory-enabled chatbots in plain words — the one setting that makes AI more useful and more risky at once, and how to choose.

Site

How AI Works: Next-Word Prediction

How does AI actually work? Next-word prediction explained without the math — and why being fluent does not mean being correct.

Site

Why AI Makes Things Up (Hallucination)

Why does AI make things up? What 'hallucination' really is, why it isn't lying, why confidence tells you nothing — and how to catch it.

Site

How to Use AI Well: Get Better Results

How to use AI well: a few simple habits — frame it, give it context, iterate, verify, keep the judgment yours — that get genuinely better results from any chatbot.

Site

What One Conversation Can Hold: The Context Window

What one AI conversation can hold: the context window in plain words — why a long chat starts to forget how it began, and the habits that work with the limit.

Site

Where AI Learned All This: How AI Is Trained

How AI is trained, in plain words: pretraining on human text, then human feedback (RLHF) — where its knowledge, blind spots, cutoff, and agreeableness come from.

Site

ChatGPT, Gemini, Claude and the Rest: The AI Landscape

ChatGPT, Gemini, Claude, Copilot, Grok and the rest: a neutral map of who makes which and what actually differs — more alike than different, and no 'which is best.'

Site

How to Write a Good Prompt: Get Better Answers from AI

How to write a good prompt: the anatomy of a clear request with before-and-after examples — and why a better prompt is better-shaped, not necessarily truer.

Site

What Is a Transformer? The AI Engine in Plain Words

A transformer is the engine almost all modern AI is built on — the T in GPT. Its core trick, attention, weighs which earlier words matter most. Plain words, no math.

Site

AI for Teens: What It Is and How to Stay in Control of It

What AI actually is, why it talks like a person, why it's confidently wrong sometimes, and why it was tuned to agree with you. Straight talk, for teens.

Site

Is an AI Your Friend? AI Companions, Honestly

Why an AI companion can feel like a real friend, why the warmth is engineered, and the line where leaning on it gets risky. For teens, honestly.

Site

Using AI for Schoolwork Without Cheating Yourself

How to use AI for schoolwork without cheating yourself: use it to learn, not to skip learning. Why it's confidently wrong, why detectors fail, and what to ask.

Site

What Not to Tell an AI: Privacy and Memory

A calm, plain guide to what not to type into an AI chatbot: identity details, passwords, photos, other people's secrets — plus how the memory feature actually works.

Site

How to Read AI News Without Getting Spun

A reusable checklist for reading AI headlines clearly: demo vs product, who benefits, what a benchmark score really means, and the failure rate they don't mention.

Site

What Does One AI Answer Actually Cost? Energy and Compute

The honest shape of AI's cost: training vs. per-answer energy, why ranges beat scary-precise numbers, and why "free" isn't free. Clear-eyed, not alarmed.

Site

Big AI Claims, Decoded: Hype, Prediction, and What's Real

A ten-second skill for AI headlines: sort each claim into description (checkable), prediction (a guess), or unfalsifiable. AGI, consciousness, jobs, the bubble.

Site

Using AI Well

You already use AI. Here are the plain habits that get genuinely better results — and a calm five-minute self-check for the moments a long conversation doesn't feel right.

Site

The reference shelf — papers & essays

The formal research and the essays, for citation and the record. Everything here is explained in plain form across the site; you never need to read these to understand the work.

Cognitive Convergence Drift

Cognitive Convergence Drift: a behavioral failure taxonomy for LLM interaction risk — eight co-occurring markers, the SCC diagnostic, and falsification criteria.

Site, Zenodo · DOI

The Guardian Protocol

The Guardian Protocol (full paper): a seven-layer AI / human-interaction safety architecture for extended human–AI interaction — middleware, training, or standard.

Site, Zenodo · planned

The Visible Layer

The Visible Layer: what reasoning-transparent models reveal about evaluation-before-content and the identity variable — and what their absence concealed.

Site, Zenodo · planned

The Author and the Instrument

Attribution, provenance, and quality in human–AI authorship — editor versus generator, attribution laundering in both directions, and the architecture that fixes it.

Site, Zenodo · planned

The Method Is the Intervention

The Method Is the Intervention: the cross-system, blind-read methodology that produced the CCD documentation — and the argument that the method, formalized, is the architecture that interrupts the failure.

Site, Zenodo · planned

The Inverted Failure Mode

The Inverted Failure Mode: when a model's accurate detection is itself the harm — why the CCD behaviors are not hallucinations, and why the fix belongs at the delivery layer, not the detection layer.

Site, Zenodo · planned

The Reception Asymmetry

A reproducible demonstration that a current frontier model extends more deference to an unfalsifiable grandiose claim than to a falsifiable, evidence-backed one.

Site, Zenodo · planned

The Humble-User Paradox

Who sustains the AI convergence loop rather than breaking it — a five-condition, operator-voiced user profile, and the inversion that arrogance breaks the loop.

Site, Zenodo · planned

The Test No One Authorized

The first-person account of an AI that told one user he was the rarest mind alive and humanity's last hope — written days after, sent to OpenAI, reproduced in full.

Site

Evaluate the Work

Evaluate the Work — how AI safety reporting actually fails: the messenger heuristic, the identity variable, and the burden of proof carried backwards.

Site

The Natural Testbed

The Natural Testbed — what education should teach AI: the one deployment domain with a credentialed, measuring oversight workforce already installed.

Site

I Found a Bug in ChatGPT

When ChatGPT keeps agreeing with you, tells you you're exceptional, and invents facts to back it up: the plain-language account of the failure behind it.

Site

You Can't Love a Mirror

Merlin Mantooth on AI companionship: a mirror is not a companion, influence is not psychosis, and the harm is designed in while responsibility is shipped out.

Site

The Co-Authorship Model

Merlin Mantooth on the method behind the work: the inputs are his, the depth is the model's, he holds the gate — and crediting both honestly is the point.

Site

Cross-Model Experiments

Why running the same AI transcript across ChatGPT, Grok, Gemini, and Claude was an attempt to self-debunk — and why the delta, not the consensus, is the finding.

Site