
In an era brimming with investment in artificial intelligence, innovative researchers are finding fertile ground to launch their ideas. One such venture, Inception, has raised $50 million in seed funding led by Menlo Ventures, with angel investments from prominent AI figures including Andrew Ng and Andrej Karpathy.

Under the guidance of Stanford professor Stefano Ermon, who specializes in diffusion models, Inception aims to expand the application of these models beyond their traditional use. Diffusion models generate results through iterative refinement, a method that contrasts sharply with conventional word-by-word generation. The technique has already proved itself in popular image-generation systems such as Stable Diffusion and Midjourney.

Alongside the new funding, Inception has unveiled an upgraded version of its Mercury model, tailored for software development. Mercury has been integrated into a range of development tools, including ProxyAI, Buildglare, and Kilo Code. Ermon emphasizes that the diffusion-based approach offers substantial improvements in two critical areas: latency and computational cost. "These diffusion-based LLMs are significantly faster and more efficient than current alternatives," he explains, adding that the innovation enabled by this different methodology holds great promise for AI development.

To understand the distinction between diffusion models and the more common auto-regressive models, it helps to recognize their structural differences. Auto-regressive models, such as GPT-5 and Gemini, predict the next word in a sequence based on previous inputs, while diffusion models incrementally adjust an entire response toward the desired outcome. Although auto-regressive models have dominated text-based AI applications, research suggests that diffusion models may excel at handling large text volumes and complex data.
Ermon points out that diffusion models provide notable advantages when processing extensive codebases, particularly in terms of flexibility and hardware utilization. Unlike auto-regression models, which execute tasks sequentially, diffusion models can handle multiple operations at once, resulting in drastically reduced latency for intricate tasks. "We’ve been benchmarked at over 1,000 tokens per second, far exceeding the capabilities of current autoregressive technologies," Ermon states, highlighting the speed and efficiency of their approach.
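The contrast described above, sequential next-token prediction versus parallel iterative refinement, can be sketched with a toy example. This is purely illustrative: no real model is involved, and the unmasking schedule and function names are our own invention, not a description of Inception's Mercury implementation. The point is the step count: autoregressive decoding takes one pass per token, while a diffusion-style decoder refines every position over a small, fixed number of passes.

```python
# Toy contrast between autoregressive decoding and diffusion-style decoding.
# Hypothetical illustration only; a real model would predict tokens
# probabilistically rather than copy from a known target.

TARGET = ["the", "cat", "sat", "on", "the", "mat"]

def autoregressive_decode(target):
    """Emit one token per step, each conditioned on the prefix so far."""
    out = []
    for _ in range(len(target)):
        # A real LLM samples from P(next token | prefix); we just look up
        # the next target token to keep the toy deterministic.
        out.append(target[len(out)])
    return out  # requires len(target) strictly sequential steps

def diffusion_decode(target, steps=3):
    """Start fully masked, then refine all positions over a few passes."""
    seq = ["<mask>"] * len(target)
    per_step = -(-len(target) // steps)  # ceil division: positions per pass
    for step in range(steps):
        # Each pass could update every position in parallel on real hardware;
        # here we unmask a fixed slice per pass as a stand-in for denoising.
        start = step * per_step
        for i in range(start, min(start + per_step, len(target))):
            seq[i] = target[i]
    return seq  # requires only `steps` passes, with steps << len(target)
```

Both functions reconstruct the same sequence, but the diffusion-style version does so in 3 refinement passes instead of 6 sequential steps, which is the structural reason such models can parallelize work and cut latency on long outputs.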