IBM speeds deep learning by using multiple servers

IBM speeds deep learning by using multiple servers

IBM's Distributed Deep Learning spreads model training across any number of hardware nodes—as long as they’re IBM nodes

For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework.

It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projecs that use those frameworks can then run in parallel across multiple hardware nodes.

IBM claims the speed-up gained by scaling across nodes is nearly linear. One benchmark, using the ResNet-101 and ImageNet-22K data sets, needed 16 days to complete on one IBM S822LC server. Spread across 64 such systems, the same benchmark concluded in seven hours, or 58 times faster.

IBM offers two ways to use DDL. One, you can shell out the cash for the servers it's designed for, which sport two Nvidia Tesla P100 units each, at about $50,000 a head.

Two, you can run the PowerAI software in a cloud instance provided by IBM partner Nimbix, for around $0.43 an hour.

One thing you can't do, though, is run PowerAI on commodity Intel x86 systems. IBM has no plans to offer PowerAI on that platform, citing tight integration between PowerAI's proprietary components and the OpenPower systems designed to support them.

Most of the magic, IBM says, comes from a machine-to-machine software interconnection system that rides on top of whatever hardware fabric is available.

Typically, that's an InfiniBand link, although IBM claims it can also work on conventional gigabit Ethernet (still, IBM admits it won't run anywhere nearly as fast).

It's been possible to do deep-learning training on multiple systems in a cluster for some time now, although each framework tends to have its own set of solutions.

With Caffe, for example, there's the Parallel ML System or CaffeOnSpark. TensorFlow can also be distributed across multiple servers, but again any integration with other frameworks is something you'll have to add by hand.

IBM's claimed advantage is that it works with multiple frameworks and without as much heavy lifting needed to set things up. But those come at the cost of running on IBM's own iron.

This article originally appeared on InfoWorld.

Follow Us

Join the newsletter!


Sign up to gain exclusive access to email subscriptions, event invitations, competitions, giveaways, and much more.

Membership is free, and your security and privacy remain protected. View our privacy policy before signing up.

Error: Please check your email address.

Tags CloudIBMstorageserver



Malwarebytes shoots the breeze with channel, prospects

Malwarebytes shoots the breeze with channel, prospects

A Kumeu, Auckland, winery was the venue for a Malwarebytes event for partner and prospect MSPs - with some straight shooting on the side. The half-day getaway, which featured an archery competition, lunch and wine-tasting aimed at bringing Malwarebytes' local New Zealand and top and prospective MSP partners together to celebrate recent local successes, and discuss the current state of malware in New Zealand. This was also a unique opportunity for local MSPs to learn about how they can get the most out of Malwarebytes' MSP program and offering, as more Kiwi businesses are targeted by malware.

Malwarebytes shoots the breeze with channel, prospects
EDGE 2019: Channel forges new partnerships during evening networking

EDGE 2019: Channel forges new partnerships during evening networking

Partners, vendors and distributors reconnected during a number of social gatherings during EDGE 2019. The first evening saw the channel congregate for a welcome party at the Hamilton Island yacht club, while the main poolside proved to be the perfect stop for a barbecue on the final night.

EDGE 2019: Channel forges new partnerships during evening networking
Show Comments