In the never-ending quest for competitive advantage, organisations are turning to large repositories of corporate and external data to uncover trends, statistics and other actionable information to help decide on their next move. Those data sets, along with their associated tools, platforms and analytics, are often referred to as "Big Data," a term that is gaining popularity among technologists and executives alike.
Although decision makers have realised there is value in Big Data, extracting that value has proved elusive in most businesses. That's where IT can help, by creating services that empower researchers to delve into large data stores, perform analytics and discover important trends. In other words, IT will prove to be the catalyst that delivers on the promise of Big Data.
Big Data has already proved its importance and value in several areas. Organisations such as the US National Oceanic and Atmospheric Administration (NOAA), US National Aeronautics and Space Administration (NASA), several pharmaceutical companies and numerous energy companies have amassed huge amounts of data and now leverage Big Data technologies on a daily basis to extract value from them.
NOAA uses Big Data approaches to aid in climate, ecosystem, weather and commercial research, while NASA uses Big Data for aeronautical and other research. Pharmaceutical companies and energy companies have leveraged Big Data for more tangible results, such as drug testing and geophysical analysis. The New York Times has used Big Data tools for text analysis and web mining, while Disney uses them to correlate and understand customer behaviour across its stores, theme parks and web properties.
Big Data plays another role in today's businesses: to comply with government regulations, large organisations increasingly need to maintain massive amounts of structured and unstructured data, from transaction information in data warehouses to employee Twitter messages, and from supplier records to regulatory filings. That need has been reinforced by recent court cases that have encouraged companies to keep large quantities of documents, email messages and other electronic communications, such as instant messaging and IP telephony, that may be required for e-discovery if they face litigation.
Perhaps the biggest challenge facing those pursuing Big Data is finding a platform that can store all current and future information and make it available online for analysis cost-effectively. That means a highly scalable platform. Such platforms consist of storage technologies, query languages, analytics tools, content analysis tools and transport infrastructures. There are many moving parts for IT to deploy and look after.
There are many proprietary and open source sources of these tools, often from startups but also from established cloud technology companies such as Amazon.com and Google. In fact, use of the cloud helps solve the Big Data scalability issue, both for data storage and for computational capability. However, Big Data does not necessarily have to be a "roll your own" type of deployment. Large vendors such as IBM and EMC offer tools for Big Data projects, though their costs can be high and hard to justify.