Wednesday, February 4, 2026

Claude's New Constitution

We're publishing a new constitution for our AI model, Claude. It's a detailed description of Anthropic's vision for Claude's values and behavior; a holistic document that explains the context in which Claude operates and the kind of entity we would like Claude to be.

The constitution is a crucial part of our model training process, and its content directly shapes Claude's behavior. Training models is a difficult task, and Claude's outputs might not always adhere to the constitution's ideals. But we think that the way the new constitution is written—with a thorough explanation of our intentions and the reasons behind them—makes it more likely to cultivate good values during training.

In this post, we describe what we've included in the new constitution and some of the considerations that informed our approach.

We're releasing Claude's constitution in full under a Creative Commons CC0 1.0 Deed, meaning it can be freely used by anyone for any purpose without asking for permission.

What is Claude's Constitution?


Claude's constitution is the foundational document that both expresses and shapes who Claude is. It contains detailed explanations of the values we would like Claude to embody and the reasons why. In it, we explain what we think it means for Claude to be helpful while remaining broadly safe, ethical, and compliant with our guidelines. The constitution gives Claude information about its situation and offers advice for how to deal with difficult situations and tradeoffs, like balancing honesty with compassion and the protection of sensitive information. Although it might sound surprising, the constitution is written primarily for Claude. It is intended to give Claude the knowledge and understanding it needs to act well in the world.

We treat the constitution as the final authority on how we want Claude to be and to behave—that is, any other training or instruction given to Claude should be consistent with both its letter and its underlying spirit. This makes publishing the constitution particularly important from a transparency perspective: it lets people understand which of Claude's behaviors are intended versus unintended, to make informed choices, and to provide useful feedback. We think transparency of this kind will become ever more important as AIs start to exert more influence in society.

We use the constitution at various stages of the training process. This has grown out of training techniques we've been using since 2023, when we first began training Claude models using Constitutional AI. Our approach has evolved significantly since then, and the new constitution plays an even more central role in training.

Claude itself also uses the constitution to construct many kinds of synthetic training data, including data that helps it learn and understand the constitution, conversations where the constitution might be relevant, responses that are in line with its values, and rankings of possible responses. All of these can be used to train future versions of Claude to become the kind of entity the constitution describes. This practical function has shaped how we've written the constitution: it needs to work both as a statement of abstract ideals and a useful artifact for training.

Our new approach to Claude's Constitution

Our previous Constitution was composed of a list of standalone principles. We've come to believe that a different approach is necessary. We think that in order to be good actors in the world, AI models like Claude need to understand why we want them to behave in certain ways, and we need to explain this to them rather than merely specify what we want them to do. If we want models to exercise good judgment across a wide range of novel situations, they need to be able to generalize—to apply broad principles rather than mechanically following specific rules.

Specific rules and bright lines sometimes have their advantages. They can make models' actions more predictable, transparent, and testable, and we do use them for some especially high-stakes behaviors in which Claude should never engage (we call these "hard constraints"). But such rules can also be applied poorly in unanticipated situations or when followed too rigidly . We don't intend for the constitution to be a rigid legal document—and legal constitutions aren't necessarily like this anyway.

The constitution reflects our current thinking about how to approach a dauntingly novel and high-stakes project: creating safe, beneficial non-human entities whose capabilities may come to rival or exceed our own. Although the document is no doubt flawed in many ways, we want it to be something future models can look back on and see as an honest and sincere attempt to help Claude understand its situation, our motives, and the reasons we shape Claude in the ways we do.

A brief summary of the new constitution

In order to be both safe and beneficial, we want all current Claude models to be:
  1. Broadly safe: not undermining appropriate human mechanisms to oversee AI during the current phase of development;
  2. Broadly ethical: being honest, acting according to good values, and avoiding actions that are inappropriate, dangerous, or harmful;
  3. Compliant with Anthropic's guidelines: acting in accordance with more specific guidelines from Anthropic where relevant;
  4. Genuinely helpful: benefiting the operators and users they interact with.
In cases of apparent conflict, Claude should generally prioritize these properties in the order in which they're listed.

Most of the constitution is focused on giving more detailed explanations and guidance about these priorities. The main sections are as follows:

by Zac Hatfield-Dodds, Drake Thomas, Anthropic |  Read more:
[ed. Much respect for Anthropic who seem to be doing more for AI safety than anyone else in the industry. Hopefully, others will follow and refine this groundbreaking effort.