There is a great deal of hype being generated around the topic of Big Data, but there is no doubt that data volumes across global enterprises are ramping up exponentially.
IDC’s latest estimates put the volume of digital content generated in 2012 at 2.7 zettabytes (ZB), up a staggering 48% from 2011. Over 90% of this information is unstructured (images, videos, MP3 files, and files based on social media). As such, it contains potentially valuable information but is difficult to analyse. As businesses seek to squeeze high-value insights from this data, the research group expects Big Data to become the next "must have" competency in 2012 and to see data and analytics technologies, including in-memory databases and BI tools, move into the mainstream.
Such moves to maximise the value of Big Data stores were recently undertaken at the research and development department of consumer products giant Unilever, which employs over 6,000 specialists stationed in 20 countries. It recently began an eScience project intended to accelerate the company’s access to global information. The programme was conceived to promote the use of public data for biology and informatics research. To underpin the Big Data service, a highly scalable environment that supports parallel computation and heavy data storage demands was required.
Pete Keeley, Unilever Research’s eScience IT lead for cloud solutions, explained that the company decided to partner with genomic data analysis company Eagle Genomics for the project. The programme relies on large-scale cloud-based processing from Amazon Web Services, with Amazon Relational Database Service storing the computation results. Amazon’s cloud storage buckets hold non-relational data as it migrates between various instances.
“Unilever’s digital data programme now processes genetic sequences twenty times faster — without incurring higher compute costs. In addition, its robust architecture supports ten times as many scientists, all working simultaneously,” Keeley said.
Speeding up the generation of reports derived from Big Data was also a priority for Deutsche Postbank in the UK. Its core business in London is corporate lending for commercial real estate (CRE) and, as a result, branch managers need accurate and timely management reports to help them make investment decisions. The financial company reported that a recent Big Data analytics roll-out was required because CRE reporting was becoming increasingly time-consuming and required multiple staff to manually collect data from disparate sources.
In an attempt to address these issues, the company evaluated multiple analytics offerings from companies including Oracle, QlikView and Logica. Ultimately it selected an IBM Smart Analytics system designed to provide a single view of business information so that decisions can be made based on up-to-date information.
Head of IT at Deutsche Postbank, Clarel Sookun, said: "The system has really proved its worth by reducing the time it takes to produce our reports for CRE from hours to minutes and they are 'accurate to the penny'. The IBM system allows us to centrally store actual and reconciled data and gives us the ability to produce further management reporting at a higher quality level."
The success of the system has led to plans to extend it to the bank's finance operations, analysing loan and interest data, and performing 'what-if' scenarios using graphical data.
Sookun added: "The bank is now able to analyse data in different formats and produce subsets of reports as and when required, which used to be a very labour-intensive task. Implementing the system also unexpectedly highlighted inaccuracies in our existing data and we are now confident in consistent, high quality reporting. The beauty of the new system is that it can also integrate with our existing systems."
Attempts to improve access to Big Data recently led to one of the most “infrastructure challenging” projects for global information group Thomson Reuters, which has 55,000 staff spread across 100 countries. The organisation currently uses NetApp storage to support a wide array of business systems and research platforms. The Thomson Reuters shared IT infrastructure and cloud are built on NetApp storage, the VMware vSphere virtualisation platform, and Cisco networking switches. This infrastructure – supporting around 16 petabytes of global data – provides a foundation for Oracle-based content and metadata systems for Thomson Reuters’ Westlaw legal research service, which is used by 98% of the largest firms in the United States.
A recent enhancement of the Westlaw service – to create the WestlawNext offering – required a major infrastructure upgrade for the company. “Our infrastructure strategy for WestlawNext required dramatically scaling shared storage,” said Mick Atton, vice president and chief architect, Thomson Reuters Professional Division.
To facilitate this, the company implemented a shared IT infrastructure and cloud foundation, which allowed it to avoid building a new two-megawatt data centre at an estimated cost of $65m. “NetApp performance and on-the-fly scalability allow us, for example, to search on a significantly larger body of content, as well as to seamlessly handle the unpredictable workloads of a new launch. In just the first year, we scaled to bring on some 6,000 new users, exceeding our first year adoption forecast by 50%,” said Cary Felbab, vice president, technology, Thomson Reuters Professional Division.
For global property consultancy Knight Frank, a key objective of its Big Data strategy was to deploy a system that helps staff combine geospatial data with other sources of information. To facilitate this, working with Microsoft Partner intelligence, Knight Frank upgraded to Microsoft SQL Server 2012. Its integration, reporting, and analysis capabilities, combined with Bing Maps for Enterprise, provided a platform for building a reporting and analysis tool for the London residential development team.
The first step in this process was to develop an application based on Microsoft SharePoint Server 2010 Enterprise to give access to rich data on residential development opportunities in London. As a result, the London team can use maps either to find an existing development site or to add a new one. The filtering and segmentation capabilities of the application can then be used to drive deeper-level reporting and analysis provided by Microsoft SQL Server 2012 Reporting Services and Microsoft SQL Server 2012 Power View.
Given the exponential rate at which corporate data volumes are expanding, it is apparent that CIOs cannot afford to shy away from the analytical and infrastructure issues posed by Big Data. And it is equally apparent that successfully taming the Big Data tiger can deliver compelling business benefits. The recent Economist Intelligence Unit report, The Deciding Factor: Big Data & Decision making, reveals that nine out of 10 business leaders believe data is now the fourth factor of production, as fundamental to business as land, labour and capital. The study, commissioned by Capgemini and conducted among over 600 C-level executives and senior management and IT leaders worldwide, indicates that the use of Big Data has improved businesses’ performance by 26% on average, and that the impact will grow to 41% over the next three years. The majority of companies (58%) claim they will make a bigger investment in Big Data over the next three years.
Two-thirds of executives consider their organisations to be ‘data-driven’, reporting that data collection and analysis underpin their firm’s business strategy and day-to-day decision-making.
“The exploitation of Big Data fuels a step change in the quality of business decision-making,” said Paul Nannetti, global sales and portfolio director, Capgemini.
“But it’s not only through harnessing the many new sources of data that organisations can obtain competitive advantage. It’s the ability to quickly and efficiently analyse that data to optimise processes and decision making in real time that adds the greatest value. In this way, genuinely data-driven companies are able to monitor customer behaviours and market conditions with greater certainty, and react with speed and effectiveness to differentiate from competition.”