
On Tuesday, Tencent unveiled HunyuanWorld-Voyager, an AI model that creates 3D-consistent video sequences from a single image. Users explore the generated scene by steering a virtual camera along a path of their choosing. The model outputs both RGB video and depth information, which enables direct 3D reconstruction without relying on conventional modeling techniques.

HunyuanWorld-Voyager does not create true 3D models, however. It generates 2D video frames that maintain spatial coherence, simulating the movement of a camera through a genuine 3D environment. As the camera moves around objects, their relative positions remain fixed and the perspective shifts naturally, as it would in a real 3D space. The accompanying depth maps can be converted into 3D point clouds for reconstruction purposes.

Each generation yields 49 frames, or about two seconds of video. According to Tencent's announcement, users can chain several clips together to create sequences that last several minutes.

The system takes a single input image along with a user-defined camera trajectory. Through the provided interface, users can specify camera actions such as moving forward, moving backward, or turning.

The model has limitations. Like many models based on the Transformer architecture, it primarily imitates patterns found in its training data, which restricts its ability to generalize to unfamiliar scenarios. To develop Voyager, researchers trained the model on more than 100,000 video clips, including computer-generated scenes from Unreal Engine, effectively teaching it how cameras behave in 3D game environments.
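For readers curious how a depth map can become a 3D point cloud, the general idea is to back-project each pixel through a pinhole camera model. The sketch below is a minimal illustration of that standard technique, not Tencent's code: the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are assumptions introduced for this example, and the announcement does not describe Voyager's actual reconstruction pipeline at this level of detail.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-space 3D points.

    Hypothetical helper for illustration: assumes a pinhole camera,
    with depth[v][u] holding the distance along the optical axis
    for the pixel at column u, row v.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Treat non-positive depth as "no measurement" and skip it.
                continue
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one missing pixel (0.0).
    depth = [[2.0, 2.0], [0.0, 4.0]]
    cloud = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
    for p in cloud:
        print(p)
```

Points from several frames would additionally need each frame's camera pose to be merged into one shared coordinate system; that step is omitted here.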