As artificial intelligence (AI) takes off in enterprise settings, so will data centre power usage. AI is many things, but power efficient is not one of them.
For data centres running typical enterprise applications, the average power consumption for a rack is around 7kW. Yet it’s common for AI applications to use more than 30kW per rack, according to data centre organisation AFCOM.
That’s because AI requires much higher processor utilisation, and the processors, especially GPUs, are power-hungry. Nvidia GPUs, for example, can run some workloads orders of magnitude faster than a CPU, but they also consume roughly twice as much power per chip. Complicating the issue is that many data centres are already power constrained.
Cooling is also an issue: AI-oriented servers require greater processor density, which means more chips crammed into the box, and they all run very hot. Greater density, along with higher utilisation, increases the demand for cooling as compared to a typical back-office server. Higher cooling requirements in turn raise power demands.
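The gap between those rack figures comes down to simple arithmetic. The sketch below estimates per-rack draw from server counts; all numbers are illustrative assumptions, not vendor specifications.

```python
# Back-of-envelope rack power estimate (illustrative numbers, not vendor specs).

def rack_power_kw(servers_per_rack, gpus_per_server, gpu_watts, other_watts):
    """Approximate rack draw: GPUs plus CPUs, memory, fans, and PSU overhead."""
    per_server = gpus_per_server * gpu_watts + other_watts
    return servers_per_rack * per_server / 1000.0

# A typical enterprise rack vs. a hypothetical dense AI training rack.
enterprise = rack_power_kw(servers_per_rack=20, gpus_per_server=0,
                           gpu_watts=0, other_watts=350)
ai_rack = rack_power_kw(servers_per_rack=4, gpus_per_server=8,
                        gpu_watts=400, other_watts=1500)

print(f"enterprise rack ≈ {enterprise:.0f} kW")  # ≈ 7 kW
print(f"AI rack ≈ {ai_rack:.0f} kW")             # ≈ 19 kW
```

Even this conservative hypothetical configuration lands well past the ~7kW enterprise average; denser GPU servers push a rack past 30kW.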
So what can you do if you want to embrace AI for competitive reasons but the power capacity of your existing facility isn’t up to the high-density infrastructure requirements of AI? Here are some options.
Consider liquid cooling
Fan cooling typically loses viability once a rack exceeds 15kW. Water, however, has 3,000 times the heat capacity of air, according to CoolIT Systems, a maker of enterprise liquid cooling products. As a result, server cabinet makers have been adding liquid pipes to their cabinets, routing coolant to the processors’ heat sinks instead of relying on fans alone.
“Liquid cooling is definitely a very good option for higher density loads,” says John Sasser, senior vice president for data centre operations at Sabey, a developer and operator of data centres. “That removes the messy airflow issue. Water removes a lot more heat than air does, and you can direct it through pipes. A lot of HPC [high performance computing] is done with liquid cooling.”
Most data centres are set up for air cooling, so liquid cooling will require a capital investment, “but that might be a much more sensible solution for these efforts, especially if a company decides to move in the direction [of AI],” Sasser says.
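That heat-capacity figure checks out with standard physical constants. The comparison is by volume, which is what matters when sizing pipes and ducts:

```python
# Why liquid cooling scales: volumetric heat capacity of water vs. air.
# Standard physical constants at roughly room temperature.

water_cp = 4186.0   # J/(kg*K), specific heat of water
water_rho = 1000.0  # kg/m^3, density of water
air_cp = 1005.0     # J/(kg*K), specific heat of air
air_rho = 1.2       # kg/m^3, density of air

water_vol = water_cp * water_rho  # J/(m^3*K)
air_vol = air_cp * air_rho        # J/(m^3*K)

print(f"water carries ~{water_vol / air_vol:.0f}x more heat per unit volume")
```

A cubic metre of water warming by one degree absorbs roughly 3,500 times the heat of a cubic metre of air doing the same, which is why a thin pipe can replace a wall of fans.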
Run AI workloads at lower resolutions
Existing data centres might be able to handle AI computational workloads, but in a reduced fashion, says Steve Conway, senior research vice president for Hyperion Research. Many, if not most, workloads can be run at half or quarter precision rather than at 64-bit double precision.
“For some problems, half precision is fine,” Conway says. “Run it at lower resolution, with less data. Or with less science in it.”
Double-precision floating-point calculations are primarily needed in scientific computing, such as simulations at the molecular level. Double precision is not typically used in AI training or inference on deep learning models because it is not needed. Even Nvidia advocates the use of single- and half-precision calculations in deep neural networks.
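The savings are easy to see in memory terms alone. A quick NumPy comparison of the same tensor at three precisions:

```python
import numpy as np

# The same one-million-element tensor stored at three precisions.
n = 1_000_000
fp64 = np.zeros(n, dtype=np.float64)  # double precision: scientific computing
fp32 = np.zeros(n, dtype=np.float32)  # single precision: common for training
fp16 = np.zeros(n, dtype=np.float16)  # half precision: widely used for DNNs

for arr in (fp64, fp32, fp16):
    print(f"{arr.dtype}: {arr.nbytes / 1e6:.0f} MB")
# float64: 8 MB, float32: 4 MB, float16: 2 MB
```

Halving precision halves the memory footprint and the data traffic that goes with it, which translates into lower power draw for the same model.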
Build an AI containment segment
AI will be a part of your business but not all, and that should be reflected in your data centre.
“The new facilities that are being built are contemplating allocating some portion of their facilities to higher power usage,” says Doug Hollidge, a partner with Five 9s Digital, which builds and operates data centres. “You’re not going to put all of your facilities to higher density because there are other apps that have lower draw.”
The first thing to do is assess the energy supply to the building, Hollidge says. “If you are going to increase energy use in the building, you’ve got to make sure the power provider can increase the power supply.”
Bring in an engineer to assess which portion of the data centre is best equipped for higher density capabilities. Workload requirements will determine the best solution, whether it be hot aisle containment or liquid cooling or some other technology. “It’s hard to give a one-size-fits-all solution since all data centres are different,” Hollidge says.
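Before carving out a high-density segment, the supply-side question can be framed as a simple budget check: does the planned rack load, scaled by the facility’s power usage effectiveness (PUE) to cover cooling and distribution overhead, fit within the building’s feed? The numbers below are hypothetical.

```python
# Hypothetical capacity check for a planned high-density AI segment.
# feed_kw, pue, and rack draws are illustrative assumptions.

def fits_power_budget(feed_kw, pue, rack_kws):
    """rack_kws: per-rack IT draws in kW; PUE scales IT load to facility load."""
    facility_kw = sum(rack_kws) * pue
    return facility_kw <= feed_kw, facility_kw

ok, total = fits_power_budget(feed_kw=1200, pue=1.5,
                              rack_kws=[30] * 10 + [7] * 80)  # 10 AI + 80 standard racks
print(f"total facility draw ≈ {total:.0f} kW, fits: {ok}")
# Ten 30kW AI racks push this hypothetical 1.2MW feed over budget.
```

When the check fails, the options are exactly the ones Hollidge describes: negotiate more supply from the power provider, or shrink the high-density segment.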
Spread out your AI systems
An alternative approach - rather than crowding all your AI systems into one spot hotter than Death Valley in August - is to spread them out among the racks.
“Most of the apps are not high density. They run at 8kW to 10kW, and up to 15kW. You can handle that with air,” says David McCall, chief innovation officer with QTS, a builder of data centres.
In an optimised heterogeneous environment, a colocation provider might have a rack or two in a cabinet to host an HPC or AI environment, with the rest of the racks in the cabinet dedicated to less power-hungry applications, such as databases and back-office apps.
That won't yield a 5kW rack, but it gets a rack closer to 12kW or 15kW, which is an environment that air cooling can handle, McCall says.
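The blending McCall describes is just load balancing across cabinets. A minimal sketch, with illustrative per-server draws:

```python
# Sketch: blending AI servers into cabinets of ordinary servers to keep
# each rack within an air-coolable budget (illustrative numbers).

AIR_COOLING_LIMIT_KW = 15.0  # rough ceiling for fan-based cooling

def mixed_rack_kw(ai_servers, ai_kw_each, std_servers, std_kw_each):
    return ai_servers * ai_kw_each + std_servers * std_kw_each

# All AI gear in one rack vs. spread across racks with standard servers.
dense = mixed_rack_kw(ai_servers=6, ai_kw_each=5.0, std_servers=0, std_kw_each=0.35)
mixed = mixed_rack_kw(ai_servers=2, ai_kw_each=5.0, std_servers=10, std_kw_each=0.35)

print(f"all-AI rack: {dense:.1f} kW (needs liquid cooling)")
print(f"mixed rack:  {mixed:.1f} kW (air-coolable: {mixed <= AIR_COOLING_LIMIT_KW})")
```

Concentrating six 5kW AI servers yields a 30kW rack; spreading two per rack among standard servers keeps each rack in the 12kW-15kW band air cooling can handle.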
Control hot air flow in the data centre
A standard data centre layout is hot aisle/cold aisle, where the cabinets are laid out in alternating rows so that cold air intakes face each other on one front-facing aisle, and hot air exhausts face each other on the alternating back-facing aisle.
That works fine, but access can be tricky if an IT worker needs to get behind a cabinet to work on a server.
The other problem is that air is “messy,” as Sasser put it. Power is often easier to model because it flows through conductors, and you can control (and thus plan and model) where power goes. Air goes where it wants and is hard to control.
Sabey customers that want higher density environments use a hot aisle containment pod to control air flow. The company puts doors at the end of the hot aisle and plastic plates over the top, so heat is directed into a ceiling intake pipe and the barriers keep hot air and cold air from mixing.
"In an air-cooled server world, the advice I give is go with a hot aisle containment environment,” Sasser says. "The other advice I would give is make sure the data centre is tested for air flow, not just modelled for airflow. Modelling is dependent on a lot of variables, and they easily change."
Consider a chimney cabinet
Another way to help manage temperatures in data centres is to use a chimney cabinet. Instead of venting the hot air out the back, a chimney cabinet relies on simple convection to send hot air up into a chimney, which is then connected to an air conditioning vent. Chatsworth Systems is best known for this style of cabinet.
“The air pathway is more constrained this way,” Sasser says. “Since that air pathway is more constrained, you can get greater density into a cabinet than with a hot aisle pod.”
Process data where it resides
Moving data around has a very high energy cost: It can take up to 100 times more energy to move data than it takes to process data, Conway says.
Any form of data movement requires electricity, and that power drain increases with the volume of data – a significant issue for data-intensive AI applications. “You want to move data as rarely and as little distance as you can,” Conway says.
“The solution is not to have to move the data any more or further than is absolutely necessary. So people are striving to put data closer to where it is processed. One thing cloud services providers and people who use cloud services agree on is it doesn’t make sense to move a massive amount of data to a third-party cloud,” he says.
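Published processor energy tables give a feel for the magnitudes Conway is describing. The figures below are order-of-magnitude estimates in picojoules, which vary by chip and process generation, but the ratio is consistent with his “up to 100 times” claim:

```python
# Rough energy comparison: computing on a value vs. fetching it from afar.
# Order-of-magnitude estimates in picojoules (pJ); exact figures vary
# widely by chip, process node, and distance travelled.

ENERGY_PJ = {
    "64-bit floating-point op": 20,
    "read from on-chip cache": 50,
    "read from off-chip DRAM": 2000,
}

flop = ENERGY_PJ["64-bit floating-point op"]
for op, pj in ENERGY_PJ.items():
    print(f"{op:26s} ~{pj:5d} pJ  ({pj / flop:.0f}x a floating-point op)")
```

An off-chip memory fetch costing around 100 times a floating-point operation is why keeping data next to the processors that consume it saves real power at AI scale.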
Consider leasing data centre space
Most of the companies looking to implement AI are corporations that lease data centre space from a data centre operator, Hollidge says. There are some data centre operators that are not capable of handling high density AI computation, but some have transitioned to offering a portion of high density environments for AI.
“You might have to go through a few providers before finding it, but there is more attention being paid to that on the data centre operations side,” Hollidge says. And a third-party data centre provider gives you more growth options.
“Most of the time you are better off entering into a flexible lease that allows you to expand and grow your AI business as opposed to building ground up.”
Wait for next-generation servers
Supercomputers to date haven’t been very data-friendly, Conway says. As supercomputers have gotten bigger, the designs have gotten less data-centric. The result is that more data has to be shuttled between processors, memory, and storage systems. And as discussed above, it costs more power to move data than to process it.
The first exascale systems will come with more accelerators and more powerful interconnects for moving data around. And many innovations that start in supercomputing, including GPUs and storage-class memory (SCM), eventually work their way down to more mainstream servers.
Future servers also will come with a more heterogeneous chip layout; instead of all x86 CPUs, they will include GPUs, FPGAs, and AI accelerators.
And for high speed storage, NVMe-over-Fabric and SCM will become more affordable. Servers are set to change in the coming years, and many of the advances will benefit enterprise AI application environments.