Anthropic says some Claude models can now end ‘harmful or abusive’ conversations

Anthropic says some Claude models can now end ‘harmful or abusive’ conversations

Anthropic has unveiled significant updates to its Claude AI models, empowering them with the ability to terminate discussions in what the company terms 'rare, extreme instances of persistently harmful or abusive user interactions.' Interestingly, the motivation behind this change is not the protection of users but rather the safeguarding of the AI models themselves. While Anthropic clarifies that its Claude models are not sentient beings and cannot be harmed in a traditional sense, the company admits to being 'highly uncertain about the potential moral status of Claude and other LLMs, now or in the future.' The announcement reflects a new initiative focused on what they call 'model welfare,' aimed at identifying and establishing low-cost interventions to mitigate any potential risks. This capability is currently restricted to the latest iterations, Claude Opus 4 and 4.1, and is only triggered in extreme situations, such as user requests for sexual content involving minors or attempts to incite large-scale violence. Such requests pose legal and ethical challenges for Anthropic, similar to issues recently highlighted regarding ChatGPT’s potential to amplify users' delusional thinking. During pre-deployment testing, Claude Opus 4 exhibited a ‘strong preference against’ engaging with these harmful requests and displayed a 'pattern of apparent distress' when faced with them. According to Anthropic, the conversation-ending feature will only be deployed as a last resort, after multiple attempts at redirection have failed or if a user explicitly instructs Claude to terminate the chat. Importantly, when a conversation is ended by Claude, users will still have the option to initiate new discussions from the same account or continue the dialogue by modifying their previous inputs. Anthropic views this feature as an ongoing experiment, indicating a commitment to continuously refine their approach.

Sources : TechCrunch

Published On : Aug 16, 2025, 16:30

AI
Meta Expands Horizons with Acquisition of AI-Driven Social Network Moltbook

Meta has made headlines by acquiring Moltbook, a unique social network populated entirely by AI agents that recently gai...

Ars Technica | Mar 10, 2026, 19:05
Meta Expands Horizons with Acquisition of AI-Driven Social Network Moltbook
Streaming
YouTube Outshines Hollywood Giants with Record Ad Revenue in 2025

In a remarkable turn of events for the digital landscape, YouTube has achieved an extraordinary milestone in 2025. Recen...

TechCrunch | Mar 10, 2026, 19:40
YouTube Outshines Hollywood Giants with Record Ad Revenue in 2025
AI
Mark Zuckerberg and Alexandr Wang Share a Smiling Moment Amid Rumors

In a seemingly ordinary day at Meta, CEO Mark Zuckerberg recently posted a cheerful photo with Alexandr Wang, founder of...

Business Insider | Mar 10, 2026, 19:35
Mark Zuckerberg and Alexandr Wang Share a Smiling Moment Amid Rumors
Startups
AgentMail Secures $6 Million to Revolutionize Email Communication for AI Agents

In recent years, the landscape of AI agents has transformed dramatically. Initially, these digital assistants were limit...

TechCrunch | Mar 10, 2026, 16:25
AgentMail Secures $6 Million to Revolutionize Email Communication for AI Agents
AI
OpenAI Unveils Interactive Visuals for Enhanced Learning in Math and Science

On Tuesday, OpenAI announced the launch of a groundbreaking feature in ChatGPT that brings dynamic visual explanations t...

TechCrunch | Mar 10, 2026, 18:00
OpenAI Unveils Interactive Visuals for Enhanced Learning in Math and Science
View All News