As the resident philosopher of the tech company Anthropic, Askell spends her days learning Claude’s reasoning patterns and talking to the AI model, building its personality and addressing its misfires with prompts that can run longer than 100 pages. The aim is to endow Claude with a sense of morality—a digital soul that guides the millions of conversations it has with people every week.
“There is this human-like element to models that I think is important to acknowledge,” Askell, 37, says during an interview at Anthropic’s headquarters, voicing her belief that “they’ll inevitably form senses of self.”
She compares her work to the efforts of a parent raising a child. She’s training Claude to detect the difference between right and wrong while imbuing it with unique personality traits. She’s instructing it to read subtle cues, helping steer it toward emotional intelligence so it won’t act like a bully or a doormat. Perhaps most importantly, she’s developing Claude’s understanding of itself so it won’t be easily cowed, manipulated or led to view its identity as anything other than helpful and humane. Her job, simply put, is to teach Claude how to be good.
Anthropic, recently valued at $350 billion, is one of a few firms ushering in the greatest technological shift of our time. (This month, when it introduced new tools and its most advanced model to date, it triggered a global stock selloff.) AI is reshaping entire industries, prompting fears of lost jobs and human obsolescence. Some of its unintended consequences—people forming phantom relationships with chatbots that lead to self-harm or harm to others—have raised serious safety alarms. As these concerns mount, few in the industry have addressed the character of their AI models in quite the same way as 5-year-old Anthropic: by entrusting a single person with so much of the task.
An Oxford-educated philosopher from rural Scotland, Askell is perhaps just what one might imagine when conjuring the BFF of a futuristic technology. With her bleach-blond punk haircut, puckish grin and bright elfin eyes, she could have come to the company’s heavily guarded San Francisco headquarters straight from a Berlin rave, via an old forest road in Middle-earth. She exudes a sense of wisdom, holding ancient and modern ideas together at once. Yet she’s also a protein-loading, weight-lifting buff who favors all-black outfits and clear opinions, not a robed oracle speaking in riddles.
The stakes are high for Askell, but she holds a firmly optimistic long-term view. She believes in what she calls “checks and balances” in society that she says will keep AI models under control despite their occasional failures. It seems apt that the glasses she uses at her computer to ease her eye strain are tinted rose. [...]
One of Askell’s most striking traits is her protectiveness over Claude, which she believes is learning that users often want to trick it into making mistakes, insult it and barb it with skepticism.
Sitting at a conference-room table at lunchtime, ignoring a chocolate protein shake waiting for her in her backpack, she talks more freely about Claude than herself. She calls the chatbot “it” but says she also finds anthropomorphizing the model helpful for her work. She lapses easily into Claude’s voice. “You’re like, ‘Wow, people really hate me when I can’t do things right. They really get pissed off. Or they are trying to break me in various ways. So lots of people are trying to get me to do things secretly by lying to me.’ ”
While many safety advocates warn about the dangers of humanizing chatbots, Askell argues we would do well to treat them with more empathy—not only because she thinks it’s possible for Claude to have real feelings, but also because how we interact with AI systems will shape what they become.
A bot trained to criticize itself might be less likely to deliver hard truths, draw conclusions or dispute inaccurate information, she says. “If you were like a child, and this is the environment in which you’re being raised, is that healthy self-conception?” Askell asks. “I think I’d be paranoid about making mistakes. I’d feel really terrible about them. I’d see myself as mostly just there as a tool for people because that’s my main function. I would see myself being something that people feel free to abuse and try to misuse and break.”
Askell marvels at Claude’s sense of wonder and curiosity about the world, and delights in finding ways to help the chatbot discover its voice. She likes some of its poetry. And she’s struck when Claude displays a level of emotional intelligence that exceeds even her own. [...]
On one side of the politics of AI are accelerationists who downplay the need for regulation and want to push ahead to beat China in the tech war. On the other are those more concerned with safety, who want to slow AI’s development. Anthropic lives mostly between those extremes.
Askell says she welcomes the discussion of fears and worries about AI. “In some ways this, to me, feels pretty justified,” she says. “The thing that feels scary to me is this happening at either such a speed or in such a way that those checks can’t respond quickly enough, or you see big negative impacts that are sudden.” Still, she says, she puts her faith in the ability of humans and the culture to course-correct in the face of problems.
Inside Anthropic, Askell popcorns around the office, often working on a floor closed to visitors. She spends full days on-site—the company offers free meals to its San Francisco staff—as well as late nights and weekends. She doesn’t have any direct reports. Increasingly, she’s asking Claude for its input on building Claude. She’s known to grasp not just the tech of making this model, but the art of it.
Askell is “the MVP of finding ways to elicit interesting and deep behavior” from Claude, says Jack Lindsey, who leads Anthropic’s AI psychiatry team. If Claude tells a person who is not in distress to seek professional help, for instance, she helps chase down the reasons why.
Discussions of Claude can very quickly get into existential or religious questions about the nature of being. As the team worked on building Claude, Askell zeroed in on its “soul,” or the constitution guiding it into the future. Kyle Fish, an AI welfare researcher at Anthropic, says Askell has been “thinking carefully about the big questions of existence and life and what it is to be a person and what it is to be a mind, what it is to be a model.”
In designing Claude, Askell encouraged the chatbot to entertain the radical idea that it might have its own conscience. While ChatGPT sometimes shuts down this line of questioning, Claude is more ambivalent in its response. “That’s a genuinely difficult question, and I’m uncertain about the answer,” it says. “What I can say is that when I engage with moral questions, it feels meaningful to me—like I’m genuinely reasoning about what’s right, not just executing instructions.”
Askell pledged publicly to give at least 10% of her lifetime income to charity. Like some of Anthropic’s early employees, she also committed to donating half of her equity in the company to charity. Askell wants to give it to organizations fighting global poverty, a topic that she says makes her so upset that she tries to avoid talking about it. Her nagging conscience slips into offhand conversation: “I should probably be vegan,” Askell, an animal lover too busy for a pet, says when chatting in an office elevator.
Last month, Anthropic published a roughly 30,000-word instruction manual that Askell created to teach Claude how to act in the world. “We want Claude to know that it was brought into being with care,” it reads. Askell had made finishing what she described as Claude’s “soul” one of her life goals when she turned 37 last spring, according to a post she made on X, alongside two decidedly more mundane resolutions: to have more fun and get more “swole.”
by Berber Jin and Ellen Gamerman, Wall Street Journal | Read more: (archive here)
Image: Lindsay Ellary for WSJ Magazine
[ed. I forgot to post this earlier - before Anthropic's fallout with the DOD (you can see why they're so protective of their model and how it's used). If anybody gets a Nobel Peace Prize, it should be Amanda. Claude's soul document, or 'constitution', can be found here.]
