Microsoft to speed up Bing with FPGAs next year
- 13 August, 2014 07:30
It started off as an experiment, but Microsoft now wants to speed up and return more accurate Bing search results with the help of reconfigurable chips called FPGAs (field-programmable gate arrays) in data centers.
Next year, Microsoft will start installing FPGAs in its data centers, hoping to speed up and increase the accuracy and relevancy of search results.
"Increasing capabilities and reducing the cost of Microsoft cloud services relies on increasingly efficient hardware," said Andrew Putnam, a researcher at Microsoft, during the Hot Chips conference in Cupertino, California, on Wednesday.
The FPGAs are intended to help Microsoft deliver text and image search results much faster. Their use will also reduce the number of servers for processing search requests in Microsoft's data centers.
The FPGAs will be used in specially designed servers and will be connected through a specialized network fabric called Catapult.
"Hardware specialization is ... possibly the way to gain efficiency and performance we need to improve data center services. The Catapult reconfigurable fabric offers a flexible, elastic pool of resources where we can build these kind of accelerators," Putnam said.
Catapult started off as the name of an experiment in 2011 to put FPGAs in servers, and as a way to circumvent traditional data center designs. Microsoft was playing with the idea of specialized hardware to tackle processing for Bing, which would be much faster on FPGAs than through traditional CPUs.
"Technology has not been a friend ... in the last handful of years. It becomes more and more difficult to get those kind of efficiency gains and cost reductions by just waiting for technology to get better," Putnam said.
An experimental deployment of FPGAs in 1,632 servers in Microsoft's data center yielded promising results.
"The FPGAs provided two times improvement in throughput and 29 percent reduction in network latency. Half the servers were required to do all the processing," Putnam said. "If we need to improve the efficiency of the architecture, we can do it via hardware specialization."
But the decision to deploy FPGAs took a while. Microsoft initially looked at both FPGAs and ASICs (application-specific integrated circuits). ASICs would be more power-efficient than FPGAs, but they are less flexible when software changes. FPGAs can be quickly reprogrammed, so they can be repurposed much more easily.
"As soon as the software changes, that FPGA accelerator can be updated along with the application change. But the ASIC is going to be a waste of power, or going to hold software development back," Putnam said.
The FPGAs run code to determine query results, which are then sent back to the CPU for delivery to the Web. The faster computing through FPGAs allows for more services to be separated, differentiated and applied.
One service could allow filtering and ranking of results. A separate service within the FPGAs could provide scores and measure relevancy of queries to results. The scores are then sorted and returned as results. It also allows more math and image recognition services to be applied to the results.
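The filter, score and sort stages described above can be sketched in a few lines of Python. This is only a toy illustration of the flow, not Bing's ranking logic: the function names and the term-overlap score are assumptions made for the example.

```python
# Toy sketch of a filter -> score -> sort pipeline.
# All names and the scoring rule are hypothetical, for illustration only.

def filter_docs(query_terms, docs):
    """Keep only documents that share at least one term with the query."""
    return [d for d in docs if query_terms & set(d.lower().split())]

def score(query_terms, doc):
    """Toy relevancy score: fraction of query terms found in the document."""
    words = set(doc.lower().split())
    return len(query_terms & words) / len(query_terms)

def rank(query, docs):
    """Filter the documents, score the survivors, and sort by score (highest first)."""
    terms = set(query.lower().split())
    if not terms:
        return []
    candidates = filter_docs(terms, docs)
    scored = [(score(terms, d), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

if __name__ == "__main__":
    docs = [
        "fpga data center",
        "cooking recipes",
        "fpga search accelerator for data center",
    ]
    for s, d in rank("fpga search", docs):
        print(f"{s:.2f}  {d}")
```

Keeping each stage as a separate function mirrors the article's point that the stages can run as separate services.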
But as the number of FPGAs grows, it could be a challenge reprogramming all of them.
Microsoft's data center administrators hated the idea of FPGAs in servers. There were many reasons for the FPGA model not to fit well in data centers.
"First of all, it's a single point of failure and it really complicates the rack design, the thermals and maintainability. You have an FPGA box that's spitting out a ton of heat, all the other servers are at a different level and it becomes really hard to control," Putnam said.
It's also a security nightmare, as the FPGAs run different operating system images and software than conventional servers. FPGAs on the main network would also create latency problems, limiting the elasticity of a data center, Putnam said.
In 2011, Microsoft designed prototype boards with six Virtex-6 FPGAs and Intel Xeon CPUs. The plan was to put these boards into 1U, 2U and 4U rack-mount servers and slip them into racks across data centers. But that didn't work out well.
So when it reconsidered putting FPGAs in data centers, Microsoft set two design requirements. The FPGAs would go into dedicated servers that could be used for tasks outside processing Bing search. Also, the server couldn't increase the hardware failure rates or require network modifications.
Limitations were also set for Microsoft's server design team: The FPGAs had to be less than 30 percent of the cost of a server, and couldn't draw more than 10 percent of the overall power drawn by the system.
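Those two limits amount to a simple budget check. The sketch below is a minimal illustration: the function name and the example figures are hypothetical, and it assumes the "overall power drawn by the system" includes the FPGA's own draw.

```python
# Hypothetical budget check for the two design limits described above.
# Assumption: system_watts is the total draw of the server, FPGA included.

def within_budget(fpga_cost, server_cost, fpga_watts, system_watts,
                  max_cost_share=0.30, max_power_share=0.10):
    """Return True if the FPGA costs less than 30 percent of the server
    and draws no more than 10 percent of the system's power."""
    if server_cost <= 0 or system_watts <= 0:
        raise ValueError("server_cost and system_watts must be positive")
    cost_ok = fpga_cost < max_cost_share * server_cost
    power_ok = fpga_watts <= max_power_share * system_watts
    return cost_ok and power_ok

if __name__ == "__main__":
    # Illustrative numbers only; the article gives no actual costs or wattages.
    print(within_budget(fpga_cost=500, server_cost=2000,
                        fpga_watts=20, system_watts=250))
```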
And that led to the birth of the Microsoft Open Compute Server, a 1U-tall server with a PCI-Express slot that could squeeze in a single FPGA. The compact server had two eight-core Xeon CPUs, 64GB of DRAM, two solid-state drives, four hard drives total and a 10-gigabit Ethernet slot. The servers were used in the experiment earlier this year.
"Your only slot for adding accelerators was tiny, about 10 centimeters by 10 centimeters," Putnam said.
The FPGA accelerator board was an Altera Stratix V G5 D5 card with 8GB of DDR3 memory, 32MB of flash and eight-lane Mini-SAS connectors. It was plugged into a PCI-Express slot. The FPGA on each server had a heatsink of its own.
The CPUs and other components generated a lot of heat, and the FPGAs were getting baked in a 68 degrees Celsius (154.4 degrees Fahrenheit) inlet. "That's the air we're supposed to cool the [FPGA] with," Putnam remarked.
The FPGAs were built on a specialized network connected through cables. Search requests were bounced from the CPU to a local FPGA, with the request then rerouted across a separate network of FPGAs covering computer vision, math acceleration and other search services. The FPGA network was scalable, and didn't rely on the network connecting the computers.
"From the perspective of a CPU, its local FPGA can handle what actually took a lot of FPGAs," Putnam said. "You can add a bunch of services to this kind of FPGA network. Every CPU thinks it's attached to all of these services."
As the number of FPGAs in a data center piles up, programming them could become a nightmare. Microsoft has ported over filtering, ranking, relevancy and sorting tools so FPGAs remain relevant and are easier to reprogram.
"They are actually going to push [FPGAs] into their data centers, but of course, in order to keep them there, we're going to have to really improve the future of programmability," Putnam said.