Big data's origins lie not with Google or even IBM, but in a 1970s Chilean government effort to move to socialism, or so we learned from the latest New Yorker ("The Money Issue").
The country's Marxist-leaning leader of the time, Salvador Allende, was nationalizing the country's key industries, and so naturally he wanted to build a "hyper-modern information system" that could show government officials how productive the country's factories were, and how happy the entire populace was -- all in real time!
New Yorker writer Evgeny Morozov details how Project Cybersyn's many features -- some actual and some just planned -- anticipated our big data-driven, cloud-computing, hyper-connected, Internet-of-many-things world of today.
The system's operation center sounds like nothing so much as a room-sized business intelligence dashboard. Four screens could show hundreds of pictures and figures, as well as statistical information about factory production, helpfully summarized with up and down arrows. The system could even do predictive analytics of a sort.
If you're wondering how this early system could present all this data in a visual fashion, the answer is, by hand. Graphic artists were hired to update the screens.
In true cloud fashion, each factory would feed this "command center" with pertinent data, such as how many supplies were on hand, or the current rate of production for whatever the factory was making.
Cybersyn even predated the Internet of Things. One planned feature was a living room gadget that allowed each citizen to indicate his or her own level of happiness by way of a dial that ranged from extreme unhappiness to complete bliss. This happiness data would then be returned to central planning, by way of the television airwaves, where it would be tallied to produce a national happiness index.
More broadly though, today's big data systems strikingly resemble Cybersyn in that both attempt to "collect as much relevant data from as many sources as possible, analyze them in real time, and make an optimal decision based on the current circumstances rather than on some idealized projection," Morozov observed.
Despite all this big data at Allende's fingertips, he failed to fend off a (CIA-sponsored) coup d'état, or his own mysterious death shortly thereafter.
As for Cybersyn? The next leader, notorious dictator Augusto Pinochet, had no truck with central planning, preferring to leave economic progress in the hands of a free market, and so died the wildly prescient system.
Big data has been very, very good for New Relic.
"We're seeing a lot of interest in people wanting to use data to run their business better," Jim Gochee, senior vice president of products at New Relic, told us in a telephone interview.
New Relic originally built its business on an application performance management (APM) service, which gives developers and administrators insight into how well (or not) their applications are running.
Since the service required users to place a data-collecting software agent on every app they wanted to monitor, it was a natural extension for the company to also offer data analysis services.
Earlier this year, the company launched a data analysis service called Insights that builds on this APM infrastructure. The company marketed Insights not to administrators but to line-of-business managers.
The approach has seemingly paid off. This week, at New Relic's annual user conference, FutureStack, the company trumpeted that Insights now collects 2.1 trillion system events every month.
That's some big data. And it gets bigger. Insights user queries touch 34 trillion stored system events a day, the company calculates.
"To give you an idea of the scale of this, the entire Twitter stream is less than one percent of what we're inserting into Insights every day," Gochee said.
Much like New Relic, Splunk originally found its success helping administrators and developers monitor system performance and troubleshoot problems, by setting its search engine loose upon a mountain of machine data. And like New Relic, Splunk expanded its technology to offer big data analysis tools for the business leader.
This week, Splunk dove deeper into the big data waters, offering connectivity to data sets in Hadoop and Amazon's Simple Storage Service. "Anywhere there's machine data, that's where we go," said Splunk Chief Technology Officer Todd Papaioannou, during a live interview caught by SiliconAngle during Splunk's own user conference.
Striding over from the business intelligence space, open source BI-software vendor Pentaho certainly kept big data in mind when it developed the latest commercial version of its eponymous BI suite. Pentaho Version 5.2 is out now.
The updated bits come with, among other goodies, a drag-and-drop interface for moving data back and forth between Hadoop and another data source, reports the U.K.'s Computer Weekly.
No stranger to Hadoop, Teradata continues to bridge its data warehouse systems to the Apache data processing platform. Thursday, the company announced its partnership with Cloudera, leaving us to wonder how Teradata's Unified Data Architecture will fit with Cloudera's envisioned Enterprise Data Hub. Call in Pivotal, and we could drop it all in a data lake. Big data buzzphrase joke. You laugh here.
Don't forget, appreciators of data in voluminous portions, next week is the New York Strata + Hadoop World, which is rocking the cavernously large Javits Center this year.
Judging simply from the meeting requests we've been getting from vendors, the conference should generate a lot of insights, or at least some buzz. New machine learning systems? Hub or Lake? Fast data ingestion systems? News from a big cloud vendor? Turn your happiness dial to bliss and come back to find out.