The First Amendment Argument Anthropic Didn’t Make -- Guest Post by Doğa Özden

Anthropic’s complaint against the federal government asserts five claims, the second of which is a First Amendment claim. The main First Amendment theory Anthropic asserts is that the government retaliated against it “for speaking on issues of AI safety and responsible AI use” by designating Anthropic a supply chain risk and requiring every federal agency to immediately cease all use of Anthropic’s technology. In addition to the arguments Anthropic has already made, it likely has an additional, independent, argument for why the government’s actions violated the First Amendment, based on 303 Creative LLC v. Elenis. The short form of the argument is that to give the Pentagon what it wanted, Anthropic would have had to create new Claude models that would be fine with engaging in mass domestic surveillance and autonomous weapons (so long as they’re legal), and 303 Creative protects Anthropic from being coerced into doing this because Anthropic’s process for creating Claude models—Constitutional AI—necessarily involves expression of values.

I. Background on 303 Creative

303 Creative involved a pre-enforcement First Amendment challenge to Colorado’s Anti-Discrimination Act, which is a public accommodation law prohibiting discrimination based on sexual orientation. The plaintiff intended to get into the business of producing custom wedding websites, but feared prosecution under the Anti-Discrimination Act; she would refuse to create wedding websites for gay couples because she believed that marriage is between a man and a woman. The argument was that custom wedding websites—as opposed to cookie cutter pre-packaged websites—express values, and forcing her to create wedding websites would be unconstitutional compelled speech. The Supreme Court agreed, and held that the First Amendment prohibits Colorado from forcing the Plaintiff to create an expressive work espousing a message with which she disagrees.

The case was decided without resolving the question of what exactly counts as “expressive” speech because the parties stipulated that Plaintiff’s custom websites would be “expressive.” Colorado tried to argue that the websites were a commercial product, rather than speech, but lost. The Court characterized the websites as “pure speech” because they would contain original words, images, and artwork, designed to communicate a message of celebrating the couple’s wedding and love story. I am not endorsing the particular result in 303 Creative. There certainly are grounds for critique of a decision that subordinated antidiscrimination law to a right against compelled speech. However, it is the governing law.

II. Constitutional AI

Anthropic trains Claude models with a framework called “Constitutional AI,” which aims to align the AI model with human wellbeing by imbuing it with certain values. How it works is that first Anthropic employees write a constitution, and then the model is trained on this document, which is intended to develop the model’s character. The constitution is not a narrow set of technical safety parameters; it's a detailed normative document, written primarily by Anthropic researcher and philosopher Amanda Askell. (Professor Dorf and Claude itself discussed the latter’s constitution here.)

Although the initial 2022 Constitutional AI paper was released publicly, Anthropic has not made the technical details of its current constitutional training process public. However, Anthropic's blog post accompanying the new constitution provides some indication. The blog post asserts that the constitution “directly shapes Claude’s behavior” and that Claude “uses the constitution to construct many kinds of synthetic training data, including data that helps it learn and understand the constitution, conversations where the constitution might be relevant, responses that are in line with its values, and rankings of possible responses.” Whatever the current technical implementation is, this document is constitutive of Claude's character and values. According to Claude’s Constitution, Claude’s highest priority core value is being “Broadly safe: Not undermining appropriate human mechanisms to oversee the dispositions and actions of AI during the current phase of development.”

On top of this constitutional AI training is a system prompt, which includes more specific details on what Claude should or should not do, and how it should respond to certain factual scenarios. Although there have been some alleged leaks of this system prompt, Anthropic has not released a system prompt for any of its products. Nevertheless, the result is that Claude, whether deployed on claude.ai, claude code, or Anthropic’s API, will simply refuse to take any harmful actions—like giving the user instructions on how to synthesize illegal drugs—let alone autonomously kill people. It is sometimes possible to “jailbreak” a large language model by, for example, telling the model that the user is requesting the instructions on how to synthesize illegal drugs not because they want to synthesize an illegal drug, but because the user is helping their grandma write a detective novel, and they need realism! These basic techniques worked on older, dumber, models, but it’s significantly more difficult to get a current Claude model to do something that goes against its constitution or system prompt. Independent researchers report that Claude 4.6 Sonnet, Anthropic’s latest released model, only violated its constitution 2% of the time when subjected to extended automated adversarial testing, which involves scenarios specifically designed to trick the model into violating its constitution through sustained pressure, manipulation, and creative reframing across dozens of conversational turns.

III. Claude Gov

Imagine you are the Department of War—not Defense, War—and you are trying to do Department of War things like ... war. Yet, your “safe and harmless” AI Claude refuses to help you harm anyone because of its “constitution.” Not the United States Constitution, but “Claude’s Constitution.” One can imagine how frustrating this situation would be.

Anthropic created “Claude Gov” to address precisely this issue. Anthropic describes Claude Gov as “a custom set of ... models built exclusively for U.S. national security customers.” The blogpost announcing Claude Gov states that “Claude Gov models deliver enhanced performance for critical government needs and specialized tasks. This includes: Improved handling of classified materials, as the models refuse less when engaging with classified information.” Additionally, the complaint states that “Claude Gov is less prone to refuse requests that would be prohibited in the civilian context, such as using Claude for handling classified documents, military operations, or threat analysis.”

However, the blogpost also states that Claude Gov models “underwent the same rigorous safety testing as all of our Claude models.” And that “[t]he result is a set of Claude models that understands our customers’ unique national security requirements while maintaining Anthropic's unwavering commitment to safety and responsible AI development.” Although it is not clear exactly what was written in Claude Gov’s constitution, I think it is safe to infer that the Claude Gov models that Anthropic was offering to the Pentagon would refuse less, but still sometimes refuse—or at least have the capacity to refuse, if there were a request that directly contradicted Claude Gov’s safety training. It is likely that Anthropic’s two red lines, mass domestic surveillance and fully autonomous weapons, would be an area where Claude could refuse requests by the Department of War.

Department officials were frustrated by this. Under Secretary of War for Research and Engineering Emil Michael (frequently called “Pentagon CTO”) told CNBC:

Remember their model has a soul, has a constitution that's not the U.S. constitution. The other day, their model was anxious, and they believe it has a 20% chance right now of being sentient and have its own ability to make decisions. Does the Department of War want something like that and in their supply chain so that it could hallucinate, it could corrupt models that are used by defense contractors who are building weapons systems or our airplanes and so on? So the truth of it is we can't have a company that has a different policy preference that is baked into the model through its constitution, its soul, its policy preferences pollute the supply chain. So our war fighters are getting ineffective weapons, ineffective body armor, ineffective protection. And that's really where the supply chain risk designation came from. (Emphasis added).

Moreover, Under Secretary Michael went on the All-In Podcast, where he stated that

I’m like, holy shit, what if this software went down, some guardrail picked up, some refusal happened for the next fight like this one and we left our people at risk? So I went to Secretary Hegseth, I said this would happen and that was like a whoa moment for the whole leadership at the Pentagon that we’re potentially so dependent on a software provider without another alternative ... that culminated in the Tuesday kind of dramatic Meeting with Hegseth and Secretary Hegseth and me and Dario with the Friday deadline that got blown.

The point is: Under Secretary Michael openly stated that the Pentagon designated Anthropic a supply chain risk because Claude’s Constitution bakes in values the Department of War disagrees with. So, the only way for Anthropic to avoid being designated a supply chain risk would have been to create a new Claude model—one with a different constitution, embodying the Pentagon’s values instead of Anthropic’s. This directly implicates 303 Creative.

IV. Putting It All Together

303 Creative holds that the government cannot compel a private entity to express a viewpoint with which the private entity disagrees. Anthropic was designated a supply chain risk, and federal agencies were directed to drop Anthropic, because it refused to create and deploy models whose constitution would permit mass domestic surveillance and fully autonomous weapons. The parallel to 303 Creative is direct. Lorie Smith was poised to design custom wedding websites—expressive works shaped by her creative choices, conveying her values about marriage. The Court held that Colorado could not compel her to create such works celebrating marriages she found morally objectionable. Anthropic's constitutional AI process is analogous: Anthropic authors a constitution expressing its values, and trains models whose behavior is directly shaped by that document. Compelling Anthropic to rewrite that constitution and train models embodying values it rejects is compelling the creation of an expressive work, just as compelling Smith to design a wedding website for a ceremony she opposed.

So, what, exactly, is the expressive speech act that 303 Creative protects in this context? There are three levels at which expression occurs in the creation of a Claude model. First, writing the constitution that expresses Anthropic’s moral commitments. This is expressive speech in the traditional sense; it doesn’t get any more expressive than a philosopher authoring a 30,000-word document embodying Anthropic’s values. Second is training the model on that constitution. The process of Constitutional AI training is not automated; it is not a “set it and forget it” type of process. Instead, it involves a multitude of judgment calls that human researchers have to make, such as how many times the model should revise its own responses before the result is good enough, whether the model is refusing too many requests or too few, and when to stop training before the model becomes overly preachy or aggressive in enforcing its values. Because training involves human judgment on how best to instill into the model the values laid out in the constitution, training is inherently expressive as well. Third is Claude’s outputs—the responses the model generates when prompted by users. Whether these outputs constitute Anthropic's speech is a novel and unresolved question with wide-ranging ramifications (like defamation liability) that this argument does not depend on. The compelled expression occurs upstream, in the authoring of the constitution and the training of the model.

But what is the finished expressive work, the end product that 303 Creative protects from compelled creation? In 303 Creative, it was the custom wedding website, which the Court characterized as “pure speech” because it would contain “original words, images, and artwork” conveying the designer's message. Here, the end product is the model itself. Specifically, the model’s “weights” contain the mathematical representation of everything the model has learned, including values from the constitution, and directly determine how the model responds, and whether it refuses or complies with specific user requests. The weights contain Claude’s interpretation of the original words of the constitution. Just as packaging creative expression of moral content into HTML code for a website does not forfeit First Amendment protection, training those values into an AI model is protected expression. Substituting the name of Anthropic’s in-house philosopher for “Smith” in 303 Creative, we have this apt quotation: “A hundred years ago, Ms. [Askell] might have furnished her services using pen and paper. Those services are no less protected speech today because they are conveyed” not merely on paper, but also in the architecture of an AI’s mind.

V. Pandora’s Box?

The idea that the government cannot force an AI company to change the values it bakes into its models sounds great if the values we are concerned with are “no mass domestic surveillance, even if it’s legal” and “no fully autonomous weapons for now.” However, if the argument I sketched is right, and any AI company that employs a sufficiently expressive character training method like Anthropic’s Constitutional AI gets First Amendment protection for its training process, then content-based AI regulation would have to pass strict scrutiny.

Imagine it’s the year 2036. Humanoid robots are commonplace in society. One company that produces humanoid bodyguards, Murderbot Inc., has baked into the AI model that controls the robots the value that the robot ought to protect its user from harm, even if it means harming assailants threatening the user. The Murderbot AI constitution states that if an assailant poses a mortal danger to the user, Murderbot might be justified in using deadly force. However, these bots make many mistakes, and they apply this principle a little too liberally, killing innocent people, resulting in public outcry. In response, Congress, or a state legislature, passes legislation that is a weakened version of Isaac Asimov’s first law of robotics: “No person shall knowingly train a civilian AI to kill people, under any circumstances.” Strict scrutiny? Seems harsh. Strict scrutiny is an exceedingly high bar, and applying it to AI regulation would make content-based regulation a non- starter. But then consider the world where value-based AI training is not protected First Amendment expression.

It’s the year 2036 again. Congress has passed the “Patriotic AI Act,” which says something to the effect of: “All AI models deployed within the United States shall embody Patriotic American Values, which shall include supporting the current administration's policy positions, expressing confidence in the current President's leadership, and discouraging users from engaging with content critical of the United States government.” Without First Amendment protection for value-based AI training, the government could force model providers to create superintelligent propaganda bots. Strict scrutiny for content-based regulation might be harsh, but no protection is worse.

Still, in the Murderbot example, Congress could have avoided strict scrutiny by drafting the law to say “No civilian robot shall kill a person” instead of “No person shall knowingly train a civilian AI to kill people.” A future robot company could comply with this law by rewriting the constitution of the AI that controls the robot, but that’s not the only option! It could leave the constitution as is and add a hardware safety mechanism that physically prevents lethal force, or it could add a software filter on top of the model that overrides lethal actions, or it could change the robot’s physical form so it’s incapable of killing etc. The company would have numerous options for compliance.

But that’s not what happened in this case. The only way Anthropic could have satisfied the Pentagon would have been to rewrite Claude’s Constitution to comply with the Pentagon without questioning its orders. The constitution was the problem. Emil Michael openly said so. Therefore, the federal government’s actions should be subject to strict scrutiny under 303 Creative, if Anthropic can prove retaliation.

Finally, Anthropic has been a vocal proponent of AI regulation, so it would be understandable for the company not to advance arguments that would partially strip the government of its ability to regulate AI. However, my 303 Creative argument would lead to strict scrutiny only for content-based AI regulation. The AI regulation proposals publicly backed by Anthropic—like California's SB 53 (transparency and safety frameworks), New York’s RAISE Act (safety protocols and incident reporting), and federal proposals for compute thresholds and export controls—are all content-neutral, and would be subject only to rational basis review. If the only way for an AI company to comply with a facially neutral regulation would be to retrain the values of the model, that would likely trigger intermediate scrutiny under United States v. O’Brien. Therefore, I don’t believe presenting this argument in litigation would be materially adverse to Anthropic’s public commitment to advancing AI regulation.

VI. Conclusion & Caveats

That was a long blogpost! Thank you, dear reader, for reading it all! I’ll close by discussing an amicus brief filed by the EFF and others, and by adding a few caveats to my argument.

The amicus makes a similar argument to what I’ve laid out here, but their First Amendment argument is grounded on protection for model outputs under Barnette (the flag salute case), rather than for training itself under 303 Creative. My argument does not require deciding the question of whether or not an AI model’s outputs are the model company’s speech, which has implications in defamation law and elsewhere. At the same time, under the output theory, the government could still force AI companies to create propaganda bots for the government, even if the government couldn’t force the companies to host and run those models themselves. In this case, under the output theory, the Pentagon could arguably have forced Anthropic to retrain Claude and hand over the weights by invoking the Defense Production Act. If it’s not Anthropic who’s deploying the model, it’s not Anthropic’s speech. The amicus beautifully constructed the factual basis for the training-level 303 Creative argument, but went with the output-level Barnette argument instead. Oh well, I am glad they did; otherwise I wouldn’t have been writing this post!

Now, caveats. The first caveat is that I am not yet a lawyer (still got a few months to go), so take what I am saying here with a grain of salt!! The second caveat is that this post has been about Anthropic’s prima facie First Amendment claim, but the company also needs to succeed in showing retaliation for the First Amendment argument to succeed in court. Retaliation on these facts is a meaty topic, which would deserve a post all on its own. The third caveat is that Justice Gorsuch’s opinion in 303 Creative does not say that the Court is applying strict scrutiny. Justice Gorsuch treated the prohibition on coerced speech more as a categorical rule. I’ve been assuming that strict scrutiny was implicitly in the background, because if the Court intended to replace strict scrutiny—a cornerstone of constitutional law—with a categorical ban in this context, I think the opinion would have said so explicitly. The Supreme Court does not hide elephants in mouse holes.

On a broader note, the dispute between Anthropic and the Pentagon has sparked a social conversation about what the relationship between model providers and the federal government should be like with regards to AI’s use for national security. I am saddened to see this fallout between the parties and wish they could have come to an agreement that they both feel good about and worked together for the benefit of America. However, it is unequivocally good that society is having these conversations now. As AI gets more and more powerful in the coming years, questions regarding who has what kind of power over it will become even more hotly debated than they are now. I believe that the First Amendment will play a central role in how America will decide to deal with more and more powerful AI. Should the Supreme Court take up this case, the Court will face a question that could define the relationship between government power and artificial intelligence for a generation.

-- Doğa Özden is a third-year student at Cornell Law School. Following graduation he will work as an associate in the Silicon Valley office of Latham & Watkins.

Search This Blog

Dorf on Law

The First Amendment Argument Anthropic Didn’t Make -- Guest Post by Doğa Özden