How OpenAI’s red team made ChatGPT agent into an AI fortress


OpenAI has introduced a new ChatGPT feature called 'ChatGPT Agent,' now available to paying subscribers. In this mode, users can have the AI log into email accounts, compose and respond to messages, and manage files autonomously. That capability raises questions about user trust and data security, because it requires confidence that the AI will not act inappropriately or compromise sensitive information.

Keren Gu, a member of OpenAI's Safety Research team, emphasized on X that the company has implemented extensive safeguards for ChatGPT Agent. The model has been classified as 'High capability' under OpenAI's Preparedness Framework in the areas of biology and chemistry, which raises the security bar for its operation.

To probe for vulnerabilities, OpenAI enlisted a specialized 'red team' of 16 PhD-level security researchers, who spent 40 hours testing ChatGPT Agent. They uncovered seven critical exploits that could jeopardize the system's integrity during real-world interactions. In total, the red team submitted 110 attacks, of which 16 exceeded OpenAI's internal risk thresholds, and the findings prompted a series of security enhancements.

As a result, the system now reports a 95% success rate in defending against visual browser instruction attacks, alongside stronger protections against biological and chemical risks. These upgrades drew on the red team's findings, which identified fundamental weaknesses in how the AI handled various tasks.

OpenAI also collaborated with the UK AI Security Institute (AISI), giving the institute unprecedented access to the internal logic and policy frameworks of ChatGPT Agent. This partnership revealed that conventional security boundaries are increasingly inadequate when an AI can access shared drives, browse the internet, and execute commands autonomously.

In response to the vulnerabilities identified, OpenAI has made significant architectural changes, including a dual-layer inspection system that monitors all production traffic in real time. OpenAI can now patch security weaknesses within hours rather than weeks.

These measures reflect a broader shift in OpenAI's security philosophy. The red team's work sets a standard for enterprise AI deployment that emphasizes continuous monitoring and rapid remediation, and the lessons from this testing are likely to shape how AI agents are secured in the future.

Sources: VentureBeat

Published On : Jul 21, 2025, 24:40
