Anthropic tightens Claude AI safety rules, bans CBRN weapon development

The Verge

In a significant move reflecting escalating concerns over artificial intelligence safety, Anthropic has updated the usage policy for its Claude AI chatbot. The revised guidelines, which quietly took effect, introduce more stringent prohibitions, particularly around the development of dangerous weapons, while also addressing the growing risks posed by increasingly autonomous AI tools.

A key revision, though not highlighted in Anthropic’s public summary of the changes, is an explicit ban on using Claude to develop biological, chemical, radiological, or nuclear (CBRN) weapons. While the company’s previous policy broadly prohibited using Claude to “produce, modify, design, market, or distribute weapons, explosives, dangerous materials or other systems designed to cause harm to or loss of human life,” the updated version now specifically names “high-yield explosives” alongside the CBRN categories. The change reflects a growing industry focus on preventing AI from contributing to catastrophic harm. It builds on safeguards such as the “AI Safety Level 3” protections Anthropic implemented in May alongside the launch of its Claude Opus 4 model, which are designed to make the system more resistant to manipulation and less likely to assist in such dangerous endeavors.

Beyond weapon development, Anthropic is also confronting the emerging challenges presented by “agentic AI tools”—systems that can take actions autonomously. The company specifically acknowledges the risks associated with capabilities like “Computer Use,” which allows Claude to control a user’s computer, and “Claude Code,” a tool that integrates the AI directly into a developer’s terminal. These powerful features, Anthropic notes, introduce “new risks, including potential for scaled abuse, malware creation, and cyber attacks.”

To mitigate these threats, the updated policy incorporates a new section titled “Do Not Compromise Computer or Network Systems.” This segment establishes clear rules against leveraging Claude to discover or exploit system vulnerabilities, create or distribute malicious software, or develop tools for denial-of-service attacks, which aim to disrupt network access for legitimate users. These additions reflect a proactive stance against the potential weaponization of AI in the cybersecurity domain.

In a more nuanced adjustment, Anthropic has also refined its stance on political content. Rather than a blanket ban on all content related to political campaigns and lobbying, the company will now only prohibit uses of Claude that are “deceptive or disruptive to democratic processes, or involve voter and campaign targeting.” This indicates a shift towards allowing more general political discourse while maintaining strict prohibitions against misuse for manipulation. Additionally, Anthropic clarified that its requirements for “high-risk” use cases—scenarios where Claude provides recommendations to individuals or customers—apply only to consumer-facing applications, not to business-to-business interactions, providing more flexibility for commercial deployment.

These policy updates underscore Anthropic’s ongoing efforts to navigate the complex ethical and safety landscape of advanced AI, balancing innovation with a commitment to preventing misuse in a rapidly evolving digital world.