In a recent internal email, an Nvidia employee raised concerns about the cooling system Microsoft uses for its new Blackwell GPU installation. The exchange took place as Nvidia rolled out its GB200 Blackwell systems in Microsoft's data centers, in response to escalating demand for the computing power needed to train and run AI models. The Blackwell architecture, unveiled in March 2024, delivers roughly double the performance of its predecessor, Hopper, according to Nvidia CEO Jensen Huang at the launch.
The email, written by a member of Nvidia's Infrastructure Specialists (NVIS) team, detailed the installation of two GB200 NVL72 racks, each integrating 72 Nvidia GPUs. Because these densely packed GPUs produce substantial heat, the racks are liquid-cooled. However, the employee described Microsoft's cooling approach as seemingly 'wasteful' because it relies on a less efficient method that uses little water. Despite this criticism, the email acknowledged that the system offers greater flexibility and fault tolerance.
Shaolei Ren, an associate professor of electrical and computer engineering at the University of California, Riverside, put the remark in context, explaining that many data centers deploy a secondary air-cooling system to handle overall heat expulsion. Ren noted that while air cooling is often more energy-intensive, it addresses public concerns about water consumption, an issue that is becoming more prominent as demand for data center resources grows. Microsoft has stressed its commitment to sustainability, aiming to be carbon negative, water positive, and zero waste by 2030. The company has also announced a zero-water cooling design for future data centers, along with work on on-chip cooling technology.
The internal email also highlighted challenges encountered during the Blackwell hardware installation, which is typical in the early phase of a new technology deployment. The staffer wrote that significant time went into creating validation processes and making the steps understandable to people unfamiliar with cluster and system validation. The transition process between Nvidia and Microsoft also needed more firming up than in previous deployments. Despite these challenges, the email indicated that the quality of Blackwell's production hardware had improved over earlier test samples, with both racks achieving a 100% pass rate in specific compute performance tests. An Nvidia spokesperson said that Blackwell systems deliver exceptional performance, reliability, and energy efficiency across a range of computing applications, and that many systems have already been deployed by customers such as Microsoft to meet rising global AI demand.