GEPA optimizes LLMs without costly reinforcement learning

Researchers from the University of California, Berkeley, Stanford University, and Databricks have unveiled a new AI optimization technique called GEPA that adapts large language models (LLMs) to specialized tasks with remarkable efficiency. Unlike conventional reinforcement learning (RL) approaches, which depend on extensive trial and error, GEPA leverages the models' own language understanding to self-evaluate, diagnose errors, and refine their instructions over time. The method not only surpasses existing techniques in accuracy but also cuts the number of trial runs required by as much as 35 times. For businesses building complex AI solutions, this translates into faster development cycles, lower computational costs, and more reliable applications.

Today's enterprise AI systems frequently involve intricate workflows that chain together multiple LLM modules, databases, code interpreters, and custom logic to tackle advanced tasks such as multi-step research and data analysis. Optimizing these systems has traditionally relied on RL methods such as Group Relative Policy Optimization (GRPO), which take a black-box approach: execute the task, receive a bare success metric (a scalar reward), and incrementally adjust the model's parameters based on that number. The main drawback of RL is its sample inefficiency; it often demands tens of thousands of trial runs to extract meaningful insight, which is impractical for real-world applications that involve costly tool calls.

Lakshya A. Agrawal, a co-author of the study and a doctoral student at UC Berkeley, described the complexity of RL as a significant barrier for many organizations, noting that many teams have resorted to manual prompt engineering because of the cost and difficulty of RL.
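To make the sample-inefficiency point concrete, here is a minimal, hypothetical sketch of the black-box loop described above: the optimizer sees only one scalar reward per rollout and must hill-climb blindly. The `run_task` function and its numbers are invented stand-ins for illustration, not GRPO itself (which updates model weights via policy gradients rather than searching a single parameter).

```python
import random

random.seed(0)

def run_task(params: float, example: str) -> float:
    """Toy stand-in for an expensive rollout that returns a scalar reward.
    In a real RL setup this would be many costly LLM and tool calls."""
    target = 0.7  # hypothetical optimum the optimizer must discover blindly
    return 1.0 - abs(params - target)

def scalar_reward_search(n_rollouts: int) -> float:
    """Black-box search guided only by a scalar reward: each rollout
    yields one number and nothing else, so progress is slow."""
    best_params = 0.0
    best_reward = run_task(best_params, "example")
    for _ in range(n_rollouts):
        # Propose a small random perturbation, clipped to [0, 1].
        candidate = min(1.0, max(0.0, best_params + random.uniform(-0.1, 0.1)))
        reward = run_task(candidate, "example")
        if reward > best_reward:  # the only learning signal available
            best_params, best_reward = candidate, reward
    return best_reward
```

Even on this one-dimensional toy problem, hundreds of rollouts are needed to close in on the optimum; a textual trace explaining *why* a rollout failed would let an optimizer skip most of that search, which is the gap GEPA targets.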
GEPA is designed for optimizing systems built on top-tier models that cannot easily be fine-tuned, letting teams improve performance without managing custom GPU clusters. The central challenge the researchers posed was how to extract the maximum learning signal from every expensive rollout, so that complex AI systems can adapt effectively even in data-limited or budget-constrained settings.

GEPA, short for Genetic-Pareto, addresses this by replacing sparse scalar rewards with rich, natural-language feedback. It captures the entire execution of an AI system, including reasoning steps and tool calls, and serializes it into text that an LLM can read and analyze. GEPA's methodology rests on three pillars: genetic prompt evolution, reflection with natural-language feedback, and Pareto-based selection. The first treats a population of prompts like a gene pool, systematically "mutating" prompts to generate potentially improved versions. The second has the LLM reflect on full execution traces and outcomes after a handful of rollouts, diagnosing problems and rewriting the prompt accordingly. The third, Pareto-based selection, maintains a diverse roster of specialist prompts rather than a single front-runner; this keeps the search from stagnating in suboptimal solutions and raises the odds of finding a prompt that performs well across varied inputs. The researchers also stress the importance of "feedback engineering": surfacing the rich textual details that systems typically discard.

In evaluations across diverse tasks, including multi-hop question answering and privacy-preserving queries, GEPA consistently outperformed GRPO. In one case, optimizing a QA system took roughly three hours with GEPA versus 24 hours with GRPO, while also delivering better performance at significantly lower cost.
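As a rough illustration of the three pillars, the toy sketch below evolves a population of prompts, "reflects" by appending the instruction that a failing example needs, and keeps a Pareto front of specialists. The example tasks, keyword checks, and helper names are all invented for illustration; real GEPA uses an LLM to read full execution traces and rewrite prompts in natural language rather than keyword matching.

```python
import random

random.seed(1)

# Toy tasks: each "example" is solved when the prompt contains its instruction.
EXAMPLES = {"math": "show your work", "cite": "cite sources", "brief": "be concise"}

def evaluate(prompt: str) -> dict:
    """Per-example scores instead of one collapsed scalar reward."""
    return {ex: 1.0 if hint in prompt else 0.0 for ex, hint in EXAMPLES.items()}

def reflect_and_mutate(prompt: str, scores: dict) -> str:
    """'Reflection' in miniature: inspect which examples failed and append
    the instruction that would fix one of them (a stand-in for an LLM
    reading a textual trace and rewriting the prompt)."""
    failed = [ex for ex, s in scores.items() if s < 1.0]
    if not failed:
        return prompt
    return prompt + " " + EXAMPLES[random.choice(failed)]

def pareto_front(population: list) -> list:
    """Keep every prompt that matches the best score on at least one
    example: a diverse pool of specialists, not a single winner."""
    front = set()
    for ex in EXAMPLES:
        best = max(evaluate(p)[ex] for p in population)
        front.update(p for p in population if evaluate(p)[ex] == best)
    return list(front)

def gepa_sketch(generations: int = 5) -> str:
    population = ["You are a helpful assistant."]
    for _ in range(generations):
        children = [reflect_and_mutate(p, evaluate(p)) for p in population]
        population = pareto_front(population + children)
    # Return the prompt with the best aggregate score across all examples.
    return max(population, key=lambda p: sum(evaluate(p).values()))
```

The design choice worth noting is the Pareto front: because a prompt survives by being best on *any* one example, partial specialists are preserved and recombined instead of being discarded for a mediocre average, which is the article's point about avoiding stagnation in suboptimal solutions.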
The study also found that systems optimized with GEPA are more reliable on unseen data, as indicated by a smaller generalization gap. The researchers attribute this to GEPA's rich natural-language feedback, which encourages a genuine understanding of what makes a solution succeed rather than mere pattern-matching on the training data. Another advantage is that GEPA produces shorter instruction-based prompts, which can mean lower latency and lower costs for API-served models. The researchers also explored GEPA as an inference-time strategy, turning the AI from a one-shot answer generator into an iterative problem solver. Agrawal envisions GEPA integrated into CI/CD pipelines, where it could generate and refine multiple optimized versions of code as part of a continuous optimization process. The authors see GEPA as a meaningful evolution in AI development, making high-performing systems accessible to users who have relevant domain expertise but lack the time or inclination to navigate the complexities of RL.

Source: VentureBeat

Published on: Aug 20, 2025, 03:41
