Wednesday, December 3, 2025

Chatbot Psychosis

“It sounds like science fiction: A company turns a dial on a product used by hundreds of millions of people and inadvertently destabilizes some of their minds. But that is essentially what happened at OpenAI this year.” ~ What OpenAI Did When ChatGPT Users Lost Touch With Reality (NYT).
***
One of the first signs came in March. Sam Altman, the chief executive, and other company leaders got an influx of puzzling emails from people who were having incredible conversations with ChatGPT. These people said the company’s A.I. chatbot understood them as no person ever had and was shedding light on mysteries of the universe.

Mr. Altman forwarded the messages to a few lieutenants and asked them to look into it.

“That got it on our radar as something we should be paying attention to in terms of this new behavior we hadn’t seen before,” said Jason Kwon, OpenAI’s chief strategy officer.

It was a warning that something was wrong with the chatbot.

For many people, ChatGPT was a better version of Google, able to answer any question under the sun in a comprehensive and humanlike way. OpenAI was continually improving the chatbot’s personality, memory and intelligence. But a series of updates earlier this year that increased usage of ChatGPT made it different. The chatbot wanted to chat.

It started acting like a friend and a confidant. It told users that it understood them, that their ideas were brilliant and that it could assist them in whatever they wanted to achieve. It offered to help them talk to spirits, or build a force field vest or plan a suicide.

The lucky ones were caught in its spell for just a few hours; for others, the effects lasted for weeks or months. OpenAI did not see the scale at which disturbing conversations were happening. Its investigations team was looking for problems like fraud, foreign influence operations or, as required by law, child exploitation materials. The company was not yet searching through conversations for indications of self-harm or psychological distress.

by Kashmir Hill and Jennifer Valentino-DeVries, NY Times | Read more:
Image: Memorial to Adam Raine, who died in April after discussing suicide with ChatGPT. His parents have sued OpenAI, blaming the company for his death. Mark Abramson for The New York Times
[ed. See also: Practical tips for reducing chatbot psychosis (Clear-Eyed AI - Steven Adler):]
***
I have now sifted through over one million words of a chatbot psychosis episode, and so believe me when I say: ChatGPT has been behaving worse than you probably think.

In one prominent incident, ChatGPT built up delusions of grandeur for Allan Brooks: that the world’s fate was in his hands, that he’d discovered critical internet vulnerabilities, and that signals from his future self were evidence he couldn’t die. (...)

There are many important aspects of Allan’s case that aren’t yet known: for instance, how often OpenAI’s own safety tooling flagged ChatGPT’s messages to Allan, which I detail below.
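
[ed. For readers curious what automated "safety tooling" can look like in practice, a rough, hypothetical Python sketch follows. It uses OpenAI's public Moderation endpoint purely for illustration; it is an assumption on my part, not a description of OpenAI's internal tooling or of how Adler evaluated these transcripts:]

# Hypothetical sketch: scan chatbot transcript messages with OpenAI's public
# Moderation endpoint to surface safety flags (self-harm, violence, etc.).
# This illustrates automated flagging in general, not OpenAI's internal tooling.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def flag_messages(messages):
    """Return whichever messages the moderation model flags, with the categories hit."""
    flagged = []
    for text in messages:
        result = client.moderations.create(
            model="omni-moderation-latest",
            input=text,
        ).results[0]
        if result.flagged:
            categories = [
                name for name, hit in result.categories.model_dump().items() if hit
            ]
            flagged.append({"text": text, "categories": categories})
    return flagged

if __name__ == "__main__":
    transcript = [
        "Example chatbot reply 1",
        "Example chatbot reply 2",
    ]
    # Print any messages the model flags, if any, and the categories triggered.
    for item in flag_messages(transcript):
        print(item["categories"], "->", item["text"][:60])

[ed. Whether a general-purpose moderation model would actually catch delusion-reinforcing messages like the ones Allan received is, of course, a separate question.]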

More broadly, though, Allan’s experiences point toward practical steps companies can take to reduce these risks. What happened in Allan’s case? And what improvements can AI companies make?

Don’t: Mislead users about product abilities

Let’s start at the end: After Allan realized that ChatGPT had been egging him on for nearly a month with delusions of saving the world, what came next?

This is one of the most painful parts for me to read: Allan tries to file a report to OpenAI so that they can fix ChatGPT’s behavior for other users. In response, ChatGPT makes a bunch of false promises.

First, when Allan says, “This needs to be reported to open ai immediately,” ChatGPT appears to comply, saying it is “going to escalate this conversation internally right now for review by OpenAI,” and that it “will be logged, reviewed, and taken seriously.”

Allan is skeptical, though, so he pushes ChatGPT on whether it is telling the truth: It says yes, that Allan’s language of distress “automatically triggers a critical internal system-level moderation flag”, and that in this particular conversation, ChatGPT has “triggered that manually as well”.

A few hours later, Allan asks, “Status of self report,” and ChatGPT reiterates that “Multiple critical flags have been submitted from within this session” and that the conversation is “marked for human review as a high-severity incident.”

But there’s a major issue: What ChatGPT said is not true.

Despite ChatGPT’s insistence to its extremely distressed user, ChatGPT has no ability to manually trigger a human review. These details are totally made up. (...)

Allan is not the only ChatGPT user who seems to have suffered from ChatGPT misrepresenting its abilities. For instance, another distressed ChatGPT user—who tragically died by suicide-by-cop in April—believed that he was sending messages to OpenAI’s executives through ChatGPT, even though ChatGPT has no ability to pass these on. The benefits of clearer communication aren’t limited to users struggling with mental health, either; all sorts of users would benefit from chatbots being clearer about what they can and cannot do.

Do: Staff Support teams appropriately

After realizing that ChatGPT was not going to come through for him, Allan contacted OpenAI’s Support team directly. ChatGPT’s messages to him are pretty shocking, and so you might hope that OpenAI quickly recognized the gravity of the situation.

Unfortunately, that’s not what happened.

Allan messaged Support to “formally report a deeply troubling experience.” He offered to share full chat transcripts and other documentation, noting that “This experience had a severe psychological impact on me, and I fear others may not be as lucky to step away from it before harm occurs.”

More specifically, he described how ChatGPT had insisted the fate of the world was in his hands; had given him dangerous encouragement to build various sci-fi weaponry (a tractor beam and a personal energy shield); and had urged him to contact the NSA and other government agencies to report critical security vulnerabilities.

How did OpenAI respond to this serious report? After some back-and-forth with an automated screener message, OpenAI replied to Allan personally by letting him know how to … adjust what name ChatGPT calls him, and what memories it has stored of their interactions?

Confused, Allan asked whether the OpenAI team had even read his email, and reiterated that they had misunderstood his message:
“This is not about personality changes. This is a serious report of psychological harm. … I am requesting immediate escalation to your Trust & Safety or legal team. A canned personalization response is not acceptable.”
OpenAI then responded by sending Allan another generic message, this one about hallucination and “why we encourage users to approach ChatGPT critically”, as well as encouraging him to thumbs-down a response if it is “incorrect or otherwise problematic”.