Big data analytics firm Splunk has launched a new software product called Hunk, which enables organisations to explore, analyse and visualise data that is stored in Hadoop.
While many organisations have managed to stand up a Hadoop cluster and get data into it, they are still struggling to get value out, according to Splunk, because there are not enough people with the right MapReduce programming skills to pull meaningful data out of Hadoop.
Hunk, which is currently in beta, aims to solve this problem by giving organisations insight into their data assets without the need for custom development, costly data modelling or lengthy batch processing iterations.
“We can point Hunk at a customer's Hadoop cluster, and within an hour they are able to explore and start analysing data. They love that kind of productivity,” said Sanjay Mehta, VP of Product Marketing for Splunk.
Built using Splunk's virtual index technology, Hunk separates the analytics functionality from the storage layer. This means that the Splunk technology stack can be used for interactive exploration, analysis and visualisation of data stored anywhere.
While many Hadoop-based query technologies require a schema to be declared at the point that the user starts collecting data, Splunk has a notion of 'schema-on-the-fly', which means that any interpretation of the data is deferred until the time the user queries it.
“That's a really key concept because if you have an understanding of the schema at the time that you're trying to collect data, then you're immediately imposing some restriction on that data,” said Mehta.
“Data is inherently unstructured, you don't know what it looks like and it can change. So you need to be able to defer those kinds of interpretations until the point where you actually ask the question.”
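The idea of deferring interpretation until query time can be illustrated with a minimal Python sketch. This is not Splunk's or Hunk's implementation: the sample events, the key=value log format and the helper names `extract_fields` and `query` are all assumptions made for illustration. Raw events are kept exactly as collected, and fields are only extracted when a question is asked.

```python
import re

# Raw events are stored exactly as collected; no schema is declared up front.
# These sample lines are invented for illustration.
RAW_EVENTS = [
    "10:01:02 status=200 user=alice bytes=512",
    "10:01:05 status=404 user=bob",
    "10:01:09 level=WARN msg=slow status=200 bytes=2048",
]

# Assumed format: fields appear as key=value pairs somewhere in the line.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_event):
    """Interpret a raw event at query time by pulling out key=value pairs."""
    return dict(KEY_VALUE.findall(raw_event))


def query(events, field, value):
    """Return the raw events whose extracted `field` equals `value`."""
    return [e for e in events if extract_fields(e).get(field) == value]


# The "schema" (which fields exist, how they are typed) is decided here,
# when the question is asked, not when the data was collected.
ok_events = query(RAW_EVENTS, "status", "200")
total_bytes = sum(int(extract_fields(e).get("bytes", 0)) for e in ok_events)
print(len(ok_events), total_bytes)
```

Because nothing is fixed at collection time, the third event can carry extra fields (`level`, `msg`) and the second can omit `bytes` without either breaking ingestion or the query.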
Unlike Hadoop connectors – which extract data from Hadoop, transform it and run operations against it in a data warehouse – Hunk enables organisations to analyse and explore data natively in Hadoop, without having to transform and load it into other systems.
Users can also connect data from external relational databases, using Splunk DB Connect, and correlate this data to spot trends, identify patterns of interest and enrich insights.
Hunk's visualisation tools allow organisations to build graphs and charts to make their data more meaningful, and to create custom dashboards that can be viewed and edited on laptops, tablets or other mobile devices, with access governed by secure role-based controls.
“The goal here is to democratise access to data. What we want to do is lower the barrier for entry for any type of user to be able to gain intelligence from that data,” said Mehta.
Splunk's Hunk software works with the Cloudera, Hortonworks and MapR commercial distributions of Hadoop, as well as the Apache version. Splunk also hopes to extend support to other big data technologies such as MongoDB and Cassandra over time.
Hunk will run in private beta for the next few months, and is expected to become generally available by the end of this year.