Powering and cooling AI and accelerated computing in the data room


Artificial intelligence (AI) is here, and it is here to stay. “Every industry will become a technology industry,” according to NVIDIA founder and CEO, Jensen Huang. The use cases for AI are virtually limitless, from breakthroughs in medicine to high-accuracy fraud prevention. AI is already transforming our lives just as it is transforming every single industry. It is also beginning to fundamentally transform data center infrastructure. By Anton Chuchkov, Brad Wilson.

AI workloads are driving significant changes in how we power and cool the equipment that processes data as part of high-performance computing (HPC). A typical IT rack used to draw 5-10 kilowatts (kW), and racks running loads higher than 20 kW were considered high-density, a rare sight outside of very specific applications with narrow reach. Mark Zuckerberg announced that by the end of 2024, Meta will spend billions to deploy 350,000 H100 GPUs from NVIDIA. Rack densities of 40 kW are now at the lower end of what is required to facilitate AI deployments, and densities surpassing 100 kW per rack are expected to become commonplace, at large scale, in the near future.

The transition to accelerated computing will not happen overnight. Data center and server room designers must look for ways to make power and cooling infrastructure future-ready, with considerations for the future growth of their workloads. Getting enough power to each rack requires upgrades from the grid to the rack. In the white space specifically, this likely means high-amperage busway and high-density rack PDUs. To reject the massive amount of heat generated by hardware running AI workloads, two liquid cooling technologies are emerging as primary options: direct-to-chip liquid cooling and rear-door heat exchangers.
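To get a rough feel for why higher rack densities point toward high-amperage busway, here is a minimal sketch of the standard balanced three-phase relation I = P / (√3 · V · PF). The 415 V line-to-line supply and the unity power factor are assumptions chosen for illustration, not figures from the article, and the function name is hypothetical.

```python
import math

def line_current_amps(rack_kw, line_voltage=415.0, power_factor=1.0):
    """Per-phase line current (A) for a balanced three-phase load.

    line_voltage is the line-to-line voltage in volts. Both defaults are
    illustrative assumptions, not values taken from the article.
    """
    watts = rack_kw * 1000.0
    return watts / (math.sqrt(3) * line_voltage * power_factor)

# Rack densities mentioned in the article: legacy, entry-level AI, near-future AI.
for kw in (10, 40, 100):
    print(f"{kw:>3} kW rack -> {line_current_amps(kw):.1f} A per phase")
```

At a fixed voltage the current scales linearly with rack power, so a 100 kW rack draws ten times the current of a 10 kW rack. This sketch ignores breaker derating and redundancy, which a real design would need to account for.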

While direct-to-chip liquid cooling offers significantly higher density cooling capacity than air, it is important to note that there is still excess heat that the cold plates cannot capture. This heat will be rejected into the data room unless it is contained and removed through other means such as rear-door heat exchangers or room air cooling. Interesting read!
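The point about excess heat can be made concrete with a little arithmetic. The sketch below splits a rack's heat load between the cold plates and the room air; the capture fraction of 0.75 is a hypothetical value picked purely for illustration, since the article gives no figure, and the function name is likewise made up.

```python
def residual_air_heat_kw(rack_kw, liquid_capture_fraction):
    """Heat (kW) that the cold plates do not capture.

    liquid_capture_fraction is the share of rack heat removed by
    direct-to-chip cooling. It is an assumed input, not a value
    stated in the article.
    """
    if not 0.0 <= liquid_capture_fraction <= 1.0:
        raise ValueError("liquid_capture_fraction must be between 0 and 1")
    return rack_kw * (1.0 - liquid_capture_fraction)

# Hypothetical example: a 100 kW rack with 75% of its heat captured by liquid.
leftover = residual_air_heat_kw(100, 0.75)
print(f"Heat left for rear-door or room air cooling: {leftover:.1f} kW")
```

Under that assumed split, the leftover heat alone would exceed the 20 kW that used to count as a high-density rack, which is why the residual load still needs rear-door heat exchangers or room air cooling.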

