Microsoft is developing a connector that will allow Excel users to download and analyse output from Hadoop, potentially opening the open source data processing platform to a much wider audience.
Microsoft is working on the connector with Hortonworks, a Yahoo spinoff that offers a Hadoop distribution and commercial support services.
"What makes this announcement significant is that Microsoft is opening up Apache Hadoop to literally millions of new users," said Hortonworks CTO Eric Baldeschwieler. "There are many more millions of Excel and PowerPivot users that can now derive value from Apache Hadoop using software that is already very familiar to them."
The connector was among several Hadoop-related open source projects that Microsoft and Hortonworks announced yesterday at the O'Reilly Strata Data Conference, being held this week in Santa Clara, California. The two companies formed a partnership last year to adapt Hadoop to the Windows ecosystem.
The connector will be an ODBC (Open Database Connectivity) driver that interacts with Hadoop through the Hive data warehouse system. Users will be able to analyse data downloaded from Hive in Excel, using tools such as Excel PowerPivot.
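For readers unfamiliar with ODBC, the following is a minimal sketch of how a client program might pull rows from Hive through an ODBC driver, roughly what Excel would do behind the scenes. It is illustrative only and not Microsoft's implementation: the data source name `HiveDSN`, the table `web_logs` and the helper functions are hypothetical, and it assumes the third-party `pyodbc` package plus an ODBC data source already configured to point at a Hive server.

```python
# Illustrative sketch only. The DSN "HiveDSN" and table "web_logs" are
# hypothetical; a real setup needs a Hive ODBC driver and a configured DSN.


def build_connection_string(dsn: str) -> str:
    """Return a generic ODBC connection string that refers to a configured DSN."""
    return f"DSN={dsn}"


def build_query(table: str, limit: int) -> str:
    """Return a HiveQL query that samples up to `limit` rows from `table`."""
    # Reject anything that is not a plain identifier, since the table name
    # is interpolated directly into the query text.
    if not table.isidentifier():
        raise ValueError(f"invalid table name: {table!r}")
    return f"SELECT * FROM {table} LIMIT {int(limit)}"


def fetch_rows(dsn: str, table: str, limit: int = 100):
    """Run the sample query over ODBC and return the rows."""
    import pyodbc  # third-party; imported here so the helpers work without it

    # autocommit=True because this sketch assumes the Hive driver does not
    # support ODBC transactions.
    conn = pyodbc.connect(build_connection_string(dsn), autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(build_query(table, limit))
        return cursor.fetchall()
    finally:
        conn.close()
```

With a working data source, `fetch_rows("HiveDSN", "web_logs", 10)` would return up to ten rows that could then be analysed locally, much as Excel users would analyse Hive data in a spreadsheet.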
"Microsoft's ODBC driver for Hive is something they've built so that Excel, Power Pivot and other Microsoft tools can connect to Hadoop (via Hive) more easily," said Shaun Connolly, Hortonworks vice president of corporate strategy. "Hortonworks is focused on working with Microsoft to bring this technology into open source so it can be more widely available to and used by the Apache Hadoop community."
Microsoft has already made some efforts to embed Hadoop in its ecosystem. The company has released a Hadoop connector for SQL Server. It also offers connectivity to an instance of Hadoop on its Azure cloud service, currently as a developer preview.