Cleversafe is one of the few enterprise storage technologies that began life as an open source project, but that's not the only thing that makes the company unusual. The organizations interested in Cleversafe's open source storage software aren't dealing with mere gigabytes of data -- they're storing terabytes, petabytes, or even more.
According to CEO Chris Gladwin, Cleversafe's long-term goal is nothing less than to store all of the world's data. And with an aim that lofty, he says, open source is the only way to go. Cleversafe's technology provides fast, reliable, highly available storage, but it does so in an unusual way. While most storage solutions offer reliability through redundancy -- making copies of data across disks, arrays, or geographic locations -- Cleversafe storage is distributed in the true sense of the word.
The technology has its origins in a set of mathematical formulae published in 1979 by cryptographer Adi Shamir, who was then at MIT. It's difficult to describe, but for simplicity's sake, imagine that you take a document and split it up into a set of "slices," each of which contains part of the whole. Then you take each slice and store it on a different server in a different geographic location. With it you also store parts of some of the other slices.
Once you've dispersed your data in this way, you can use special algorithms to re-create the entire document, even if some of the servers go down. All you need is access to any majority of the servers for the full data set to remain available. "It's like parity on steroids," Gladwin says.
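For readers who want to see the "any majority is enough" idea in action, here is a minimal Python sketch of the threshold scheme Shamir described in 1979. It is an illustration only: the article does not say that Cleversafe's software uses exactly this construction, and the names split_secret, recover_secret, and PRIME are invented for the example. Note also that in this scheme every share is as large as the original value, which differs from the "slices" described above.

```python
import secrets

# Assumption for this sketch: a Mersenne prime larger than any secret we split.
PRIME = 2**127 - 1


def split_secret(secret: int, n: int, k: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares so that any k of them can recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        # Evaluate the polynomial at x with Horner's rule, modulo PRIME.
        y = 0
        for c in reversed(coeffs):
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    document = int.from_bytes(b"hello grid", "big")
    shares = split_secret(document, n=5, k=3)  # one share per server
    # Servers 2 and 4 are "down"; the remaining majority still suffices.
    survivors = [shares[0], shares[2], shares[4]]
    print(recover_secret(survivors).to_bytes(10, "big"))
```

With five servers and a threshold of three, any two can fail without the data becoming unavailable, and any two shares on their own reveal nothing about the original value.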
Because of its design, a Cleversafe data store is incredibly reliable. Gladwin says the current configuration of the software offers approximately 12 9s of availability -- equivalent to one hour of downtime for every 1 million years.
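The arithmetic behind such figures is easy to check. The sketch below, which makes the simplifying assumption that servers fail independently, converts a count of nines into years per hour of downtime and estimates how often at least k of n servers are reachable. The function names and the sample numbers (five servers, each up 99 percent of the time) are illustrative and are not taken from Cleversafe.

```python
from math import comb

HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours


def years_per_hour_of_downtime(nines: int) -> float:
    """Years that pass, on average, per hour of downtime at `nines` nines."""
    unavailability = 10.0 ** -nines
    return 1.0 / (unavailability * HOURS_PER_YEAR)


def grid_availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent servers are up,
    when each server is up with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    for nines in (10, 12):
        years = years_per_hour_of_downtime(nines)
        print(f"{nines} nines: about one hour of downtime per {years:,.0f} years")
    # Hypothetical grid: 5 servers, any 3 suffice, each up 99% of the time.
    print(f"3-of-5 grid availability: {grid_availability(5, 3, 0.99):.8f}")
```

By this arithmetic, one hour per million years sits closer to ten nines, while a literal twelve nines would be about one hour per 114 million years, so the quoted figures are best read as approximate. The second function shows where the gain comes from: five servers that are each only 99 percent available can together deliver roughly five nines when any three are enough.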
It's also incredibly secure. No single server contains enough contiguous data to be useful to a thief, and even if one did, strong encryption protects the data slices.
Because a client doesn't need access to all of the servers on the storage grid at once, Cleversafe also potentially offers better performance than a connection to a single, centralized server. Each client can simply choose the set of servers that offers the best network performance to its own location.
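As a rough illustration of that selection step, the Python sketch below picks the k servers with the lowest measured latency. The server names, the latency figures, and the pick_servers function are all hypothetical; the article does not describe how a real client measures or ranks servers.

```python
import heapq


def pick_servers(latency_ms: dict[str, float], k: int) -> list[str]:
    """Return the k servers with the lowest latency, fastest first."""
    if not 0 < k <= len(latency_ms):
        raise ValueError("k must be between 1 and the number of servers")
    # Iterating a dict yields its keys; rank them by their latency value.
    return heapq.nsmallest(k, latency_ms, key=latency_ms.get)


if __name__ == "__main__":
    # Hypothetical round-trip times from one client to a five-server grid.
    measured = {
        "chicago": 12.0,
        "london": 95.0,
        "tokyo": 160.0,
        "dallas": 30.0,
        "sydney": 210.0,
    }
    print(pick_servers(measured, k=3))  # ['chicago', 'dallas', 'london']
```

A client in another region would measure different latencies and so read from a different subset, which is how the load spreads across the grid.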
What's more, it eliminates the need for expensive, high-end storage equipment. Because reliability and availability are built into the design of the storage middleware itself, a Cleversafe grid can be built with cost-effective commodity hardware. "Google has demonstrated that concept with its search application," Gladwin says.
The Cleversafe software currently exists only as an open source project, available under the GNU GPL (General Public License). The company plans to offer support services around a commercial version of the software later this year. Potential customers are likely to include health care and financial services companies, labs, supercolliders, and anyone else who needs to store massive amounts of data.
The really interesting application, however, will be when companies begin to use the Cleversafe software to offer mass storage as a managed service, in the same way that they provide Internet connectivity today. "We're having trouble finding ISPs that don't want to offer this kind of service," Gladwin says. "Same thing with hosting companies."
According to Gladwin, what makes this idea possible is the fact that Cleversafe's technology is open source. "If we held this as a proprietary technology and said, 'We're the one company that's going to control it, trust us' ... I just don't think that that's going to scale," he says.
In Cleversafe's model, no one company holds the keys to your data. No one company owns the protocols, codes, or formulae that give you access to it. The data isn't even stored at a single site. Instead, everyone works together -- efficiently, reliably, and securely -- for the good of each individual.
As Gladwin puts it, "When you can say to the world, 'Here's a way to store your data such that you, the data owner, are in complete control, and no one else can see your data, and no one else can screw it up,' that's a powerful message."