How will AI change your data centre?
Artificial Intelligence (AI) has been talked about for some time, but since the release of ChatGPT in 2022 the conversation has gathered pace, and if it's not already here, AI is definitely coming!
The use cases for AI appear to be endless, but the one thing they will all have in common is the way they will change the demands on your data centre infrastructure.
AI workloads are driving significant changes in how we power and cool the IT that performs high-performance computing (HPC). Servers are being accelerated with GPUs to support the computing needs of AI models, and these AI chips can require around five times as much power and five times as much cooling capacity in the same space as a traditional server.
With the surge of AI and HPC, data centre operators face the challenge of building a data centre infrastructure that can cope with the high density power and cooling demands of AI.
From large training clusters to small edge inference servers, AI is becoming a larger percentage of data centre workloads. Total data centre power consumption is expected to almost double from 57 GW in 2023 to 93 GW by 2028, but the AI portion of that consumption will grow from 4.5 GW in 2023 to 18.7 GW in 2028, roughly four times its current level and a rise in share from around 8% to 20% of the total.
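As a quick sanity check on those figures, the short Python sketch below simply derives the AI share of total consumption from the 2023 and 2028 estimates quoted above:

    # Projected data centre power consumption (GW) - figures quoted above
    total_2023, total_2028 = 57.0, 93.0
    ai_2023, ai_2028 = 4.5, 18.7

    share_2023 = ai_2023 / total_2023 * 100   # AI share of total in 2023
    share_2028 = ai_2028 / total_2028 * 100   # AI share of total in 2028

    print(f"AI share 2023: {share_2023:.1f}%")             # ~7.9%
    print(f"AI share 2028: {share_2028:.1f}%")             # ~20.1%
    print(f"Growth in AI load: {ai_2028 / ai_2023:.1f}x")  # ~4.2x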
Data centre operators must now consider the impact of these high-density AI applications and start planning how to deploy and manage them. With network latency another key issue, the location of the compute is also expected to change significantly, from 95% centralised and 5% edge in 2023 to 50% centralised and 50% edge by 2028.
That means governments, financial institutions, banking, manufacturing and healthcare organisations will need to deploy many smaller, high-density edge data centres over the next 4-5 years. These high-density demands create many challenges when it comes to power, cooling and management.
Power
These edge data centres will need to cope with rack densities currently estimated to exceed 100 kW per rack, nearly ten times the average load of what is currently considered a high-density edge rack.
Ten years ago, a 300 kW UPS could support 100 network and server racks at an average rack density of 3 kW/rack. Today that same power couldn’t even support the minimum configuration of an NVIDIA DGX SuperPOD (a single 358 kW 10-rack row at 36 kW/rack).
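To put that in perspective, here is a minimal capacity-planning sketch using the figures above (the 3 kW and 36 kW per-rack densities are the averages quoted in this article, not general rules):

    # UPS capacity check against the rack densities quoted above
    ups_capacity_kw = 300.0

    legacy_density_kw_per_rack = 3.0    # typical average density ten years ago
    ai_density_kw_per_rack = 36.0       # NVIDIA DGX SuperPOD row, per the article

    legacy_racks = ups_capacity_kw // legacy_density_kw_per_rack   # 100 racks
    ai_racks = ups_capacity_kw // ai_density_kw_per_rack           # 8 racks

    print(f"Racks supported at 3 kW/rack:  {legacy_racks:.0f}")
    print(f"Racks supported at 36 kW/rack: {ai_racks:.0f}")
    print(f"10-rack SuperPOD row: {10 * ai_density_kw_per_rack:.0f} kW")  # ~360 kW, close to the 358 kW quoted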
Currently the highest-capacity standard off-the-shelf rack PDU is rated at 63 A, but as densities increase, rack PDUs rated up to 160 A and new methods of power distribution are likely to be required.
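The jump from 63 A to 160 A makes sense if you convert PDU current ratings to power. A rough conversion is sketched below, assuming a 400 V three-phase supply and unity power factor; actual voltages, power factors and derating rules vary by installation:

    import math

    def three_phase_kw(current_a, line_voltage_v=400.0, power_factor=1.0):
        """Approximate three-phase power in kW: P = sqrt(3) * V_LL * I * PF."""
        return math.sqrt(3) * line_voltage_v * current_a * power_factor / 1000.0

    print(f"63 A PDU:  ~{three_phase_kw(63):.0f} kW")    # ~44 kW - well short of a 100 kW rack
    print(f"160 A PDU: ~{three_phase_kw(160):.0f} kW")   # ~111 kW - headroom for a ~100 kW rack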
Cooling
Today, air cooling, whether chilled water or direct expansion (DX), is deployed in almost every data centre in the world. But air cooling has its limits and is not suitable for rack densities above 20 kW/rack.
Smaller AI clusters and inference server racks that are configured at 20 kW per rack or less can still be air cooled. For these racks, good airflow management practices (e.g., blanking panels, hot and cold aisle containment) should be followed to ensure more effective and efficient cooling. However, with a single 8-10U AI server consuming 12 kW, it’s easy to exceed this 20 kW threshold.
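As a quick illustration of how quickly that threshold is crossed (a minimal sketch; the 12 kW per-server figure and the 20 kW air-cooling limit are the rule-of-thumb numbers used in this article):

    AIR_COOLING_LIMIT_KW = 20.0   # rule-of-thumb ceiling for air-cooled racks, per the article
    server_power_kw = 12.0        # a single 8-10U AI server, per the article

    for servers_per_rack in (1, 2, 3):
        rack_kw = servers_per_rack * server_power_kw
        verdict = "air-coolable" if rack_kw <= AIR_COOLING_LIMIT_KW else "needs liquid cooling"
        print(f"{servers_per_rack} server(s): {rack_kw:.0f} kW -> {verdict}")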
Adding to this challenge, due to latency limitations, servers in large AI clusters cannot be spread out (to lower rack density). To optimise performance and rack space in the data centre, new data centre cooling technologies will have to be adopted by the data centre operators and IT managers at the edge.
When AI rack densities go above 20 kW, strong consideration should be given to liquid-cooled servers. There are several liquid cooling technologies and architectures. Direct-to-chip (DTC), sometimes called conductive or cold plate, and immersion are the two main categories. Direct-to-chip is currently the preferred choice, as it has better compatibility with existing air cooling and is also easier for retrofit applications. If given the choice, data centre operators should select liquid-cooled servers to improve performance and reduce energy cost, which can offset the investment premium.
Compared to traditional chilled water systems, direct-to-chip liquid-cooled servers have more stringent requirements for water temperature, flow, and chemistry. This means that operators cannot run water directly from a chiller system through a chip’s cold plate. Even if 100% of the servers are direct-to-chip liquid-cooled, there is still a need for supplemental air cooling to cool other equipment, such as network switches, and to remove the residual heat given off by liquid-cooled servers.
Racks
All of these changes to power and cooling technologies trickle down to how the network and server racks accommodate this new technology. As AI servers are getting deeper, there is less space in the back of the server rack to mount rack PDUs and liquid cooling manifolds. As server power densities continue to increase, it will become very difficult if not impossible to accommodate the necessary power and cooling distribution in the back of a standard-width server rack (i.e., 600 mm / 24 in).
Standard server racks lack the space required for AI servers, and common 42U-high racks will likely be too short to accommodate all the servers, switches, and other equipment. For example, a 64-port network switch implies that the rack would house 8 servers, each with 8 GPUs. At this density, and assuming a 5U server height, the servers alone would consume 40U, leaving only 2U of space for other devices.
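A simple U-space budget makes the squeeze obvious. The sketch below uses the figures from the example above (8 servers of an assumed 5U each in a 42U rack; actual server heights vary by vendor):

    rack_height_u = 42
    servers = 8            # one server per 8 GPU ports on a 64-port switch, per the article
    server_height_u = 5    # assumed AI server height, per the article

    used_u = servers * server_height_u     # 40U
    remaining_u = rack_height_u - used_u   # 2U left for switches, PDUs, cabling, etc.

    print(f"Used by servers: {used_u}U, remaining: {remaining_u}U")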
With heavy AI servers, a high-density server rack can weigh over 900 kg (2000 lb). This places a significant load on IT racks and raised floors, both in terms of static and dynamic (rolling) load bearing capacity. IT Racks not rated for these weights may experience deformation to frames, leveling feet, and/or casters. Furthermore, raised floors may not support these heavy racks.
For AI servers, racks will need to be taller, deeper and wider, and support static loads of up to 1800 kg. Data centre floors, and raised floors in particular, should be assessed to ensure they can support the weight of an AI cluster. This is especially important for raised-floor dynamic capacity when moving heavy IT racks around the data centre.
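As a rough illustration of why floors need checking, the sketch below spreads an 1800 kg rack over an assumed 600 mm x 1200 mm footprint; a real assessment should use the actual rack footprint, the point loads through levelling feet or castors, and the floor manufacturer's ratings:

    rack_weight_kg = 1800.0
    footprint_m2 = 0.6 * 1.2    # assumed 600 mm x 1200 mm rack footprint

    distributed_load = rack_weight_kg / footprint_m2   # ~2500 kg per square metre
    print(f"Distributed floor load: ~{distributed_load:.0f} kg/m^2")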
Some manufacturers, such as APC by Schneider Electric, are already developing new network & server racks, rack PDUs and data centre cooling technologies to deliver high-density power and cooling to the IT rack. If you have any questions about planning for AI, or would like to find out more about the challenges you will face in your data centre, contact one of our team today.
For more information about power, cooling and racks for AI applications or to request a quotation, get in touch with Source UPS today!
Call: 01252 692559 or email: info@sourceups.co.uk