
Indian startup Sarvam AI has unveiled a speech model designed to automate multilingual video dubbing, pitching it to creators, educators, and broadcasters who need to translate content efficiently across Indian languages. The launch positions Sarvam as a competitor to established global players such as ElevenLabs.
Launched on February 1, Sarvam Dub is an artificial intelligence system that preserves a speaker’s voice while translating audio into multiple languages. The model includes built-in controls for matching the timing of the original video. "We’re excited to introduce Sarvam Dub, a cutting-edge AI dubbing model that empowers creators to extend the life and reach of their content rapidly," the company stated in its announcement.
Traditionally, dubbing has been a lengthy process requiring translators, voice artists, and studio time. Sarvam says its new model can drastically reduce this time, claiming that "what used to take weeks of scripting, recording, studio time, and publishing effort can now be dubbed in minutes."
Using zero-shot voice cloning and cross-lingual speech models, Sarvam Dub maintains the original speaker's identity even as the language changes. This is particularly challenging in India, where content often spans numerous regional languages and accents. The system also builds duration control directly into speech generation, addressing the unnatural-sounding speech that can result from timing adjustments made in post-production. “High-quality dubbing necessitates intrinsic duration control in speech generation, allowing timing to be shaped as the voice is created rather than modified later,” Sarvam explained.
To validate its performance, the company evaluated over 700 audio samples from 64 speakers across ten Indian languages and English. It reports state-of-the-art results, particularly in cross-lingual scenarios, where voice preservation is most challenging.
Sarvam AI has already begun testing the system in several sectors, including public communication, education, and broadcasting. The startup collaborated with the Indian Institute of Technology (IIT) Madras to dub technical lectures into multiple languages. The Indian Union Budget 2026 was another milestone: it was the first national budget dubbed live using AI, with Finance Minister Nirmala Sitharaman’s speech streamed in both Kannada and Hindi.
Live dubbing adds the challenge of speed. Sarvam claims its engineering team achieved a 6.6-times reduction in latency through optimized model tracing, selective and post-training quantization, and intelligent caching. These enhancements are said to make the system suitable for real-time broadcasting.