Menu
IBM speeds deep learning by using multiple servers

IBM speeds deep learning by using multiple servers

IBM's Distributed Deep Learning spreads model training across any number of hardware nodes—as long as they’re IBM nodes

For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework.

It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projecs that use those frameworks can then run in parallel across multiple hardware nodes.

IBM claims the speed-up gained by scaling across nodes is nearly linear. One benchmark, using the ResNet-101 and ImageNet-22K data sets, needed 16 days to complete on one IBM S822LC server. Spread across 64 such systems, the same benchmark concluded in seven hours, or 58 times faster.

IBM offers two ways to use DDL. One, you can shell out the cash for the servers it's designed for, which sport two Nvidia Tesla P100 units each, at about $50,000 a head.

Two, you can run the PowerAI software in a cloud instance provided by IBM partner Nimbix, for around $0.43 an hour.

One thing you can't do, though, is run PowerAI on commodity Intel x86 systems. IBM has no plans to offer PowerAI on that platform, citing tight integration between PowerAI's proprietary components and the OpenPower systems designed to support them.

Most of the magic, IBM says, comes from a machine-to-machine software interconnection system that rides on top of whatever hardware fabric is available.

Typically, that's an InfiniBand link, although IBM claims it can also work on conventional gigabit Ethernet (still, IBM admits it won't run anywhere nearly as fast).

It's been possible to do deep-learning training on multiple systems in a cluster for some time now, although each framework tends to have its own set of solutions.

With Caffe, for example, there's the Parallel ML System or CaffeOnSpark. TensorFlow can also be distributed across multiple servers, but again any integration with other frameworks is something you'll have to add by hand.

IBM's claimed advantage is that it works with multiple frameworks and without as much heavy lifting needed to set things up. But those come at the cost of running on IBM's own iron.

This article originally appeared on InfoWorld.


Follow Us

Join the newsletter!

Or

Sign up to gain exclusive access to email subscriptions, event invitations, competitions, giveaways, and much more.

Membership is free, and your security and privacy remain protected. View our privacy policy before signing up.

Error: Please check your email address.

Tags CloudIBMstorageserver

Featured

Slideshows

Leading female front runners of the Kiwi ICT industry honoured at 2019 WIICTA

Leading female front runners of the Kiwi ICT industry honoured at 2019 WIICTA

Reseller News has honoured the leading female front runners of the New Zealand ICT industry at the 2019 Women in ICT Awards (WIICTA) in Auckland. The awards recognised standout individuals across six categories, spanning Entrepreneur, Rising Star, Shining Star, Community, Technical and Achievement. Photos by Gino Demeer.

Leading female front runners of the Kiwi ICT industry honoured at 2019 WIICTA
Reseller News kicks off awards season in 2019 with Judges' Lunch

Reseller News kicks off awards season in 2019 with Judges' Lunch

The 2019 Reseller News Innovation Awards has kicked off with the Judges Lunch in Auckland with 70 judges in the voting panel. The awards will reflect the changing dynamics of the channel, recognising excellence across customer value and innovation - spanning start-ups, partners, distributors and vendors. Photos by Christine Wong.

Reseller News kicks off awards season in 2019 with Judges' Lunch
Reseller News welcomes industry figures for 2019 Hall of Fame lunch

Reseller News welcomes industry figures for 2019 Hall of Fame lunch

Reseller News welcomed 2018 inductees - Chris Simpson, Kendra Ross and Phill Patton - to the third running of the Reseller News Hall of Fame lunch, held at the French Cafe in Auckland. The inductees discussed the changing landscape of the technology industry in New Zealand, while outlining ways to attract a new breed of players to the ecosystem. Photos by Gino Demeer.

Reseller News welcomes industry figures for 2019 Hall of Fame lunch
Show Comments