Claude AI gains power to end chats under Anthropic’s 'model welfare' push

In the rapidly evolving field of artificial intelligence, new features and models debut almost daily. Recently, Anthropic, renowned for its AI chatbot Claude, introduced a surprising capability: the ability for its models to terminate conversations. The initiative is part of the company's broader focus on what it terms 'model welfare.'

According to Anthropic, this experimental feature is designed to be used only in extreme situations where conversations become persistently harmful or abusive. The company emphasizes that the vast majority of users will likely never encounter Claude autonomously ending a chat. The feature activates only after multiple attempts to redirect the conversation have failed, or if a user directly asks Claude to end the interaction. Anthropic expects such instances to be very rare, and says most users will experience no disruption even when discussing sensitive or controversial topics.

The company has also acknowledged ongoing uncertainty about the moral status of AI models like Claude, noting that it remains unclear whether these systems can experience anything akin to pain or distress. Nevertheless, Anthropic is actively exploring these ethical questions and considers it important to act on its findings. Alongside the conversation-ending feature, the company is investigating low-cost interventions aimed at minimizing potential harm to AI systems.

In recent tests of Claude Opus 4, Anthropic conducted a 'model welfare assessment,' which found that Claude consistently refused requests that posed risks of harm. When users continued to press for dangerous or abusive content, however, the model's responses began to show signs of apparent 'stress' or discomfort. These findings underscore the role of ethical considerations in AI development, particularly around sensitive requests such as generating inappropriate content or soliciting harmful information.

Source: Mint

Published On: Aug 17, 2025, 02:00
