With its Dallas project, Microsoft seems to be exploring the idea of becoming a data broker. At the Tech Ed conference this week in New Orleans the company discussed how its planned data broker service might operate.
"Dallas is a broker for discovering information," said Adam Wilson, a program manager working on Dallas. The data sets themselves, available by APIs (application programming interfaces), come from a variety of data sources.
Microsoft hopes Dallas will become an "iTunes for data," said Douglas Purdy, Microsoft chief technology officer for data and modelling, speaking in another session.
During his session describing Dallas, however, Wilson did not reveal any details of when the service would be made commercially available. The Community Technology Preview version, running on Microsoft's Azure cloud computing platform, is available now.
The idea of posting raw data sets online so they can be reused by online applications has slowly been gaining traction over the past few years, with such efforts as the US government's Data.Gov repository and the UK's Data.Gov.UK (http://www.data.gov.uk), as well as volunteer efforts such as Open Linked Data.
As with these other efforts, Microsoft is not supplying any of the data itself. Instead, it will serve as a storefront for producers of information sets. Already, Microsoft has gathered a number of sample data sets from the likes of the Associated Press, Infogroup, NASA, Navteq, the United Nations and the Zillow real estate listing company.
While many of these content providers already have their own customer-facing interfaces for their data, Microsoft is hoping they will see Dallas as an easier-to-manage alternative. "If you have a database, you have to think about how to present the data, how to bill for it, how to support it," Wilson said. This service could eliminate a lot of these headaches.
Microsoft will host the data itself, or link back to the original data source. Dallas provides programmers with tools for searching for and through data sets, and for building the APIs that can be used to draw the data.
The potential customers for the Dallas data sets will most likely be organizations that need data for their own applications or services. For instance, a supply company can design an online ordering form that automatically fills in the billing and shipping addresses by drawing that data from the data set of business addresses supplied by Infogroup, a provider of such information. For customers, Dallas also will offer the benefits of a unified interface for searching for data from multiple sources, and a unified format for the data itself.
Microsoft has not worked out pricing, Wilson said, though it could be either a monthly subscription rate, or a fee based on how much data is drawn. Unlike Apple's iTunes, Dallas will not set the prices for the data -- each provider will set its own fees. Like Apple, however, Microsoft will probably take a percentage for running the storefront.
Wilson provided a sample application that shows what can be done with these data sets. Microsoft built a Facebook application that groups all of a Facebook user's US Facebook friends by the cities they live in, and orders them by which cities have the highest crime rates. A data set from the US Federal Bureau of Investigation provides the crime statistics for the country.
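The grouping and ordering that the sample application performs can be sketched in a few lines of Python. This is an illustration only, not Microsoft's code: the function name, the input shapes and the sample values are all assumptions, and the numbers are placeholders rather than real FBI statistics.

```python
from collections import defaultdict


def group_friends_by_city(friends, crime_rates):
    """Group friend names by city, listing the highest-crime cities first.

    friends     -- iterable of (name, city) pairs (hypothetical input shape)
    crime_rates -- dict mapping city name to a crime rate (hypothetical)
    """
    by_city = defaultdict(list)
    for name, city in friends:
        by_city[city].append(name)

    # Cities with no entry in the crime data are sorted to the end.
    ordered = sorted(
        by_city,
        key=lambda city: crime_rates.get(city, float("-inf")),
        reverse=True,
    )
    return [(city, by_city[city]) for city in ordered]


# Placeholder values for illustration only, not real statistics.
friends = [("Ann", "City A"), ("Bob", "City B"), ("Cy", "City A")]
crime_rates = {"City A": 1.0, "City B": 2.0}

for city, names in group_friends_by_city(friends, crime_rates):
    print(city, names)
```

In a real application the friend list would come from Facebook and the crime rates from the FBI data set, each fetched through its own API.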
A developer can form a query by visiting the Dallas site, finding a data set of interest and then specifying, through an interface called the Service Explorer, what information is needed. The Service Explorer translates the request into an OData request, which takes the form of an HTTP address that the application can use to request the data online. The address can then be copied and inserted into an application, which would parse the results.
Using this query as a template, the application itself could then be programmed to alter the request for different query conditions.
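As a rough sketch of what treating the query as a template might look like, the Python function below rebuilds an OData-style address for different conditions. The endpoint URL and the State field are hypothetical, since the article does not give a real Dallas service address; only the `$filter` and `$top` query options come from the OData convention itself.

```python
from urllib.parse import quote, urlencode

# Hypothetical endpoint: the article does not give a real Dallas service URL.
BASE_URL = "https://example.invalid/dallas/CrimeData/CityCrime"


def build_odata_url(base_url, state, top=10):
    """Return an OData-style request URL filtered by a (hypothetical) State field."""
    # Single quotes inside an OData string literal are escaped by doubling them.
    literal = state.replace("'", "''")
    params = {
        "$filter": f"State eq '{literal}'",
        "$top": str(top),
    }
    # Keep "$" and "'" readable in the address; encode spaces as %20.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{base_url}?{query}"


print(build_odata_url(BASE_URL, "Texas", top=5))
```

Changing the arguments yields a new address for each query condition, which the application could then fetch over HTTP and parse.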