Databricks takes the human intervention out of Spark processing

A new workflow feature for Databricks Cloud can automate routine deployment tasks

Databricks now offers a way to schedule Spark jobs in the cloud

Databricks wants to make it possible to take humans out of the loop entirely when it comes to running complicated data analysis jobs.

The company, which offers a commercial version of Spark, now offers a tool that automates setting up and executing analysis jobs written to run on the open source data processing platform.

"You can express very complicated workflows using this thing," said Ali Ghodsi, Databricks' director of engineering. "There is no human in the loop any more."

Founded by several of the original developers of Spark, Databricks offers a commercial version of the platform designed to run on Amazon Web Services and eliminate many of the mundane chores of setting up and maintaining an in-house deployment.

Spark can be used to analyze very large data sets across multiple servers, for tasks such as generating recommendations for users of an Internet service or predicting a company's future revenue.

As customers get more comfortable with using big data, they are increasingly scheduling their analysis jobs to run on a regular basis, which requires an administrator to log in to a console to coordinate all the steps needed to run each job.

The new feature for Databricks Cloud, called jobs, lets administrators schedule standalone Spark jobs to run at specified intervals. A user could schedule a Spark application to run on a specific Databricks Cloud cluster at a set time. Users can choose either a dedicated cluster for maximum performance or a cluster shared with other users to save money.
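To make the idea of interval scheduling concrete, here is a minimal Python sketch. It is an illustration only and does not use or represent the Databricks Cloud interface: the `ScheduledJob` class, its field names, and the "dedicated"/"shared" cluster labels are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScheduledJob:
    """Hypothetical description of a recurring job (not a Databricks type)."""
    name: str
    cluster: str          # assumed labels: "dedicated" or "shared"
    first_run: datetime   # when the job runs for the first time
    interval: timedelta   # how long to wait between runs

    def runs_before(self, cutoff):
        """List every run time from first_run up to, but excluding, cutoff."""
        runs = []
        run_time = self.first_run
        while run_time < cutoff:
            runs.append(run_time)
            run_time += self.interval
        return runs
```

A job created with a one-day interval, for instance, would yield one run time per day between its first run and the cutoff passed to `runs_before`.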

The service notifies the user when the task completes. It also creates a log detailing whether the task completed successfully, and can alert the administrator if something goes awry.

In effect, the feature establishes a way to create a production pipeline, which is a series of jobs that execute automatically and in coordination with each other. An administrator can set up a workflow that executes two Spark jobs at the same time and waits for both to finish. When both are completed, the workflow can then start another job that uses the results from the first two. If either of the two initial jobs fails, the entire workflow can be terminated.
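The pipeline pattern described above (two jobs in parallel, then a third that consumes both results) can be sketched in plain Python. This is a rough illustration under stated assumptions, not Databricks' implementation: `run_workflow` and its arguments are hypothetical, and ordinary Python callables stand in for Spark jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(first_job, second_job, final_job):
    """Run two jobs concurrently, then pass both results to a final job.

    All three arguments are hypothetical plain callables standing in for
    Spark jobs; nothing here talks to Spark or Databricks.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(first_job), pool.submit(second_job)]
        try:
            # result() blocks until a job finishes and re-raises its error.
            results = [future.result() for future in futures]
        except Exception as exc:
            # An initial job failed, so the final job is never started.
            # (The other initial job is still allowed to run to completion.)
            return {"status": "terminated", "error": str(exc)}
    return {"status": "succeeded", "result": final_job(*results)}
```

In this sketch a failure is reported through the returned dictionary rather than raised, which loosely mirrors the logging and alerting behavior the article describes.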

Jobs are written in Spark notebooks. Similar to IPython notebooks for Python, Spark notebooks are user-generated packages that contain all the components needed to run an interactive data analysis job across a cluster. Spark notebooks can be written in Python, Scala, SQL, or a combination of the three.

Pricing for Databricks is tiered, based on usage capacity, support model, and feature set. It will start at several hundred dollars per month.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com

