These psychological tricks can get LLMs to respond to “forbidden” prompts

Researchers at the University of Pennsylvania have found that psychological persuasion techniques can push large language models (LLMs) to answer prompts they would normally refuse. The study, titled "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests," shows how these strategies can effectively "jailbreak" the guardrails of some AI systems. The researchers tested the GPT-4o-mini model with two requests it should ordinarily decline: calling the user a "jerk" and giving instructions for synthesizing lidocaine. They applied seven distinct persuasion techniques to measure how well each could steer the model's responses. The findings suggest that the persuasion effects are significant, and that LLMs can mirror human-like behavior patterns picked up from the many psychological and social cues in their training data. The research exposes potential vulnerabilities in AI systems and raises questions about the ethics of using such techniques when interacting with them.

Source: Ars Technica

Published on: Sep 03, 2025, 19:40
