Microsoft is the latest of the world's top IT vendors to climb aboard the Hadoop 'big data' bandwagon.
The company Wednesday announced it will collaborate with Yahoo spin-off Hortonworks to develop an Apache Hadoop implementation for its Windows Server and Windows Azure platforms.
Under the strategic partnership, Hortonworks will lend its domain expertise to help Microsoft integrate Hadoop into its Windows technology.
Microsoft said it expects to have a preview of a Hadoop-based service for Windows Azure by the end of this year, and one for Windows Server sometime in 2012. The Windows Server Hadoop implementation will work with existing Microsoft BI tools, Microsoft said in a statement.
Uniting the power of cloud with the power of data
Microsoft made the announcement at the PASS Summit, a SQL Server user conference held in Seattle.
The move will help Microsoft customers better manage their 'big data' requirements, said Microsoft corporate vice president Ted Kummert in a statement. "The next frontier is all about uniting the power of the cloud with the power of data to gain insights that simply weren't possible even just a few years ago."
Microsoft's move comes barely a week after Oracle unveiled a Hadoop-based big data appliance, along with a new Oracle NoSQL database and an open source distribution of the R programming language for statistical analysis.
Like Microsoft, Oracle said its Hadoop offering aims to tap a growing enterprise interest in big data analytics.
Just yesterday, IBM announced plans to acquire Platform Computing, a Toronto-based maker of software for managing the large computing clusters on which Hadoop typically runs.
Hadoop, an open-source software framework that supports big data applications, is increasingly attracting the attention of top IT executives for its ability to handle massive volumes of unstructured data like email content, weblogs, clickstream data, audio and video files, and sensor data.
Looking to collect unstructured data
A growing number of companies are looking to collect and analyse such unstructured data to glean new business insights. But to date they have been somewhat hampered by the inherent scalability limitations of conventional relational database management products, which are designed mostly to handle structured, relational data.
Early adopters such as Yahoo, AOL, Google and others have been using Hadoop to store and analyse petabytes of unstructured data. Other enterprise data warehouse technologies have not been able to easily handle such tasks.
Gartner analyst Merv Adrian said Microsoft's alliance with Hortonworks is not surprising.
"Every leading database vendor needs to ensure that customers who want to exploit Big Data don't move any more of their share of wallet away than necessary," he said, "[The] only question was whether they would go it alone or align with someone."
The partnership is a big plus for Hortonworks, which has a very deep pool of experts in the Apache Hadoop technology, he said.
Hortonworks has room to prove itself
Hortonworks, spun out of Yahoo earlier this year, is "fresh out of the blocks," Adrian said.
Cloudera is the current market leader in commercial Hadoop systems, he said.
"It's likely that many Microsoft customers starting to think about Big Data for the first time won't have heard much about [Hortonworks]," Adrian said. "This gives Hortonworks a nice visibility bump."
A Hadoop distribution for Windows should appeal to users with substantial Windows investments, said Stephen O'Grady, an analyst with RedMonk.
"Hadoop has crossed the chasm, so to speak, and become a mainstream technology product," he said.
"As such, it's obviously important that Microsoft have a competitive implementation available for its platform," O'Grady said. "It's clear that Microsoft feels optimization and tuning for the platform is important to ensure its success and to differentiate."