Google and IBM have teamed up to offer a curriculum and support for software development on large-scale distributed computing systems, with six US universities signing up so far.
The program is designed to help students and researchers get experience working on Internet-scale applications, the companies said. This relatively new form of parallel computing, sometimes called cloud computing, hasn't yet caught on in university settings, said Colleen Haikes, an IBM spokeswoman.
"Right now, although the technique is being used in industry, it's not being taught in universities," she said.
IBM and Google are providing hardware, software and services to add to university resources, the two companies said.
The University of Washington signed up with the program late last year. This year, five more schools, including the Massachusetts Institute of Technology, Stanford University and the University of Maryland, have joined the program. The two companies expect to expand the program to other universities in the future.
The program focuses on parallel computing techniques that take computational tasks and break them into hundreds or thousands of smaller pieces to run across many servers at the same time. The techniques allow Web applications such as search, social networking and mobile commerce to run quickly, the companies said in a press release.
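To illustrate the idea in miniature, the following is a minimal Python sketch, not anything taken from the IBM and Google curriculum. It splits one job into pieces, runs each piece on a separate worker and combines the partial results. The function names are hypothetical, and a local thread pool stands in for the many servers a real cluster would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Break a list into at most num_chunks roughly equal pieces."""
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The work done on one piece, independent of every other piece."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, num_workers=4):
    """Run each piece on its own worker, then combine the partial results.

    Assumption: threads in one process stand in for separate servers.
    """
    chunks = split_into_chunks(items, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Because no piece depends on any other, the same pattern can in principle be spread over hundreds or thousands of machines, which is the property the program's techniques rely on.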
IBM and Google have dedicated a cluster of several hundred computers to the program, including Google-donated PCs along with IBM BladeCenter and other servers. The companies expect the cluster to grow to more than 1,600 processors.
The companies call these clusters "clouds." A cloud is a collection of machines that can host a variety of applications, including interactive Web 2.0 applications. Clouds support a broader set of applications than traditional computing grids do, because they allow various kinds of middleware to be hosted on virtual machines distributed across the cloud, Haikes said.
IBM and Google have created several resources for the program, including the following:
- A cluster of processors running open-source implementations of Google's published computing infrastructure, including MapReduce and the Google File System (GFS), from Apache's Hadoop project, a software platform for writing and running applications that process vast amounts of data.
- A Creative Commons-licensed curriculum on parallel computing developed by Google and the University of Washington.
- Open-source software designed by IBM to help students develop programs for clusters running Hadoop. The software works with Eclipse, an open-source development platform.
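
For readers unfamiliar with MapReduce, the programming model named in the list above, here is a rough single-machine sketch in Python of its three stages applied to word counting. It is an illustration only and does not use Hadoop's API. All function names are hypothetical, and the grouping step that a real framework performs across the cluster is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key (done by the framework in a real system)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Chain the three stages; on a cluster, map and reduce calls run in parallel."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

The programmer supplies only the map and reduce steps. In a system such as Hadoop, distributing the input, grouping the intermediate pairs and handling machine failures are left to the platform.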