Facebook query software crawls data mountains more quickly


Facebook has updated its Presto SQL query engine for faster performance

Facebook engineers have revamped the company's open source Presto query engine to run up to four times faster, making it more practical for very large-scale data warehouse work.

"We have a large number of internal users at Facebook who use Presto on a continuous basis for data analysis. Improving query performance directly improves their productivity, so we thought through ways to make Presto even faster," wrote Dain Sundstrom in a blog post introducing the improvements.

Facebook created Presto to run SQL commands against large amounts of unstructured data, notably data the company kept in Hadoop file systems. Understood by countless database administrators, SQL (Structured Query Language) has long been the cornerstone of relational database systems and data analysis systems.

Last November, Facebook open sourced Presto. As with many of the programs the company has developed in-house, Facebook is hoping others will deploy the code and submit bug fixes and improvements. Organizations could use the software to run data analysis on data sets that would be too large or too expensive to fit into a commercial data warehouse.

To improve performance, Facebook engineers revamped Presto's software for reading data. The software can now ingest data from Hadoop systems directly in columns, rather than reading it in by rows, which required additional time to restructure the data.
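The difference between the two read paths can be pictured with a toy example. This is only a sketch in Python, not Presto's reader or any real Hadoop file format; the sample records and the function names are made up for illustration.

```python
# Toy illustration only: not Presto's reader or a real Hadoop file format.
# The records and all names below are hypothetical.
rows = [
    {"user_id": 1, "country": "NZ", "spend": 12.5},
    {"user_id": 2, "country": "US", "spend": 40.0},
    {"user_id": 3, "country": "NZ", "spend": 7.5},
]


def rows_to_columns(records):
    """Pivot row records into one list per column.

    This is the extra restructuring step a row-by-row read has to pay for
    before a column-oriented engine can work on the data.
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_spend_from_rows(records):
    """Row path: every whole record is visited to use one field."""
    return sum(record["spend"] for record in records)


def total_spend_from_columns(cols):
    """Column path: only the one column the query needs is touched."""
    return sum(cols["spend"])


columns = rows_to_columns(rows)
```

If the data already arrives as columns, the pivot in `rows_to_columns` is never needed, which is the saving the paragraph above describes.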

The software now records the minimum and maximum values of each column, which lets it more quickly locate data that falls within a small range of values. It also evaluates queries more intelligently, using the user's filter terms to shrink the data set being inspected and so save processing time.
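A rough sketch of how minimum and maximum values can be used to skip data follows. It is written in Python and assumes the column is stored in blocks that each carry their own minimum and maximum; the `Block` class, `make_block` and `scan_range` are hypothetical names for illustration, not part of Presto.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A hypothetical chunk of one column, with its value range recorded."""
    values: list
    minimum: int
    maximum: int


def make_block(values):
    """Build a block and note its minimum and maximum at write time."""
    return Block(values=list(values), minimum=min(values), maximum=max(values))


def scan_range(blocks, low, high):
    """Return values in [low, high] and the number of blocks actually read.

    A block whose recorded range cannot overlap the filter is skipped
    without looking at its values.
    """
    matches = []
    blocks_read = 0
    for block in blocks:
        if block.maximum < low or block.minimum > high:
            continue  # the filter cannot match anything in this block
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


blocks = [
    make_block([1, 5, 9]),
    make_block([20, 25, 31]),
    make_block([40, 44, 48]),
]
matches, blocks_read = scan_range(blocks, 22, 30)
```

In this toy run only the middle block is opened, because the filter range of 22 to 30 cannot overlap the other two. The benefit depends on how well the stored data happens to be clustered by value.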

With these improvements, Facebook engineers found that Presto was able to execute single-column reads up to four times faster, as measured against a 600-million-row data set residing on 14 servers, with each server running 16 cores and 64GB of memory.

Presto seems to be gaining popularity outside of Facebook. Argyle uses the software as the basis of its own real-time fraud detection software for telecommunications companies. Presto "gives us a lot of flexibility," said Arshak Navruzyan, Argyle vice president of product management.

Argyle paired Presto with the open source data store Accumulo. Argyle's customers keep terabytes, and even petabytes, of user call data, far too much to hold in traditional data warehouses.

Presto queries serve as the basis of Argyle's alert messaging system, which flags unusual behavior for its customers. For Argyle, Presto provides an interface for user-facing business intelligence software such as Tableau.

Airbnb also uses Presto, in conjunction with its own software, to allow employees to access the company's operational data.

Presto is one of a number of SQL query engines developed for interrogating massive amounts of data across multiple servers in parallel. Cloudera developed Impala for similar reasons. Pivotal also created HAWQ for the same purpose, borrowing technology from its Greenplum database.

