Given that cloud computing is still emerging, it shouldn't come as a surprise that opinions vary widely on the best way to architect the storage. In fact, it seems likely that there is no panacea - different types of private cloud almost always require different approaches.
Or do they?
In a recent interview, Piston Cloud CEO Joshua McKenty asserted flatly that, when it comes to private clouds, the best approach is to integrate the storage with the servers, a setup that offers performance far beyond that of more traditional approaches.
"The right model is a two U server with a bunch of JBOD," McKenty said. "Because you're limited not by the rate of any one drive, but by the number of spindles."
It works, according to McKenty, because direct-attached storage in the servers doesn't have to be routed through a single gatekeeper like a filer or even a SAN switch. Instead of "20 servers all talking to this one filer at a total bandwidth of 10G," he said, each server has its own 10G port, for 200G of aggregate bandwidth. Even if the individual disks are slower and multiple copies of files have to be stored, total performance far exceeds what a more traditional setup can achieve.
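McKenty's arithmetic can be laid out in a few lines of Python. This is only a rough sketch of the comparison he describes: the server count and link speed come from his example, while the function names and the replication factor of three are assumptions made purely for illustration, and the model ignores protocol overhead and disk speed entirely.

```python
def aggregate_bandwidth_gbps(servers: int, port_gbps: float) -> float:
    """Total bandwidth when every server reaches its own disks through its own port."""
    return servers * port_gbps


def shared_filer_share_gbps(servers: int, filer_gbps: float) -> float:
    """Even share per server when all servers contend for one filer link."""
    return filer_gbps / servers


def effective_write_gbps(total_gbps: float, copies: int) -> float:
    """Crude model: every logical write costs `copies` physical writes."""
    return total_gbps / copies


SERVERS = 20      # from McKenty's example
LINK_GBPS = 10.0  # from McKenty's example: 10G per port, or 10G in total to the filer
COPIES = 3        # assumption for illustration; the article gives no replication factor

integrated_total = aggregate_bandwidth_gbps(SERVERS, LINK_GBPS)
filer_per_server = shared_filer_share_gbps(SERVERS, LINK_GBPS)
replicated_writes = effective_write_gbps(integrated_total, COPIES)

print(f"integrated: {integrated_total:.0f}G total, {LINK_GBPS:.0f}G per server")
print(f"shared filer: {LINK_GBPS:.0f}G total, {filer_per_server:.1f}G per server")
print(f"integrated writes with {COPIES} copies: about {replicated_writes:.0f}G")
```

Even with the assumed three-way replication charged against it, the integrated layout in this simplified model keeps several times the bandwidth of the single shared link, which is the point McKenty is making.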
This idea is "heresy" as far as many IT departments are concerned, McKenty said, because they want to use NAS or SAN to handle their storage workloads instead of integrating storage and compute on every node. He described how a NASA operations team tried to replace one of his integrated systems.
"They brought in a $20,000 filer and put it in the bottom of the rack. It had Fibre Channel. And they put in a Brocade switch. And they cabled everything up with Fibre Channel and FCoE. And they spent four days of down time tweaking this thing, and they got 20% of the performance that we'd been getting out of our configuration," he asserted.
But is this integrated JBOD approach just for private clouds that have special NASA-like workloads?
Absolutely not, McKenty said in a follow-up conversation. "We've seen customers use the same architecture for VDI workloads, Web hosting, risk analysis and other compute-heavy things like financial simulations or Monte Carlo simulations."
The fact that storage density and processor speed are growing far more quickly than network capacity - an idea McKenty described as the mobility gap - is a key reason why the distributed model works so well, he says.
"The speed of light trumps your infrastructure. The latency of moving your data is never going to keep up with the growth of how much data we're saving and how much processing we want to do on it," he says. "And yet, if you look at a SAN or NAS architecture, you're always putting all of your storage essentially in one device on the far end of a single wire."
'Stepping back in time 10 or 15 years'
Needless to say, not everyone agrees with McKenty's characterisation of the issues involved.
"What he's talking about doing is sort of like stepping back in time about 10 or 15 years," says Brocade product marketing manager Scott Shimomura. "He's making the argument that the direct-attached storage approach is the best way to ensure performance, and that just isn't the way customers think about their storage today."
He also criticises the idea that one system represents the way forward for nearly all uses.
"The trap that many people fall into is they want to say 'cloud is this' or 'big data is that,' or 'there's one type of infrastructure that's going to solve world hunger,' and the reality is that it depends on the workload," he says. "You're going to have different aspects of an application leverage different parts of your infrastructure."
Sean Kinney, director of product marketing for HP Storage, also disagrees, arguing that the integrated compute/storage approach makes it difficult for users to adapt flexibly to changing workloads.
He also points out that the idea of keeping multiple copies of single files in order to improve access speed flies in the face of the industry trend toward data de-duplication.
"If disk cost is not an issue - which, for almost every IT customer, it is - this might work," Kinney says. "Enterprise IT tends to be fairly cautious. What McKenty is proposing, at this point, looks to be a little too much like a science project."
For his part, the Piston Cloud CEO scoffs at the notion of cost being an issue - compared to the price of specialty storage systems, distributed storage is far more economical. "Converged infrastructure using the hard drives in every server is way less expensive than dedicated filers," he says.
So who's right?
As far as performance goes, McKenty likely has a point, according to Forrester Research Principal Analyst Andrew Reichman.
"Simple storage architectures, I think, end up working better than large arrays in a lot of ways," he says. "The disk that's on-board a server is the fastest disk I/O that you can have, because it's connected to the bus. Part of the problem with storage arrays is that they share capacity across a large number of servers, so if you're talking about an application that doesn't really have a huge amount of storage capacity you probably won't have enough spindles to get good speed."
That said, he notes, the Piston Cloud CEO may have overstated the range of potential use cases. Low data growth and high CPU intensity are the key factors in making a workload suitable for the model, Reichman argues. Nevertheless, sufficiently advanced algorithms for copying and handling data could make such a system highly effective.
On the cost issue, it's difficult to conclusively say whether McKenty's "heretical" ideas hold up, according to researcher Seamus Crehan.
"Disks may be cheaper than buying a storage array system with specialized interfaces (e.g., Fibre Channel). But if you have to buy a lot of redundant disks and there are possibly other associated network and management costs, then this may be mitigated," Crehan says. "It is hard to tell without comparing all of the pieces of specific deployments. Businesses have very diverse sets of IT needs and what makes sense for one may not make sense for another."
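Crehan's caveat amounts to a simple comparison, sketched below in Python. Every price and capacity here is a made-up placeholder, not a figure from the article or from any vendor, and the function and parameter names are hypothetical; the sketch only shows how redundant copies and extra network costs can narrow or erase the gap he describes.

```python
def integrated_cost(usable_tb: float, copies: int,
                    disk_cost_per_tb: float, extra_network_cost: float) -> float:
    """Cost of server-attached disks: raw capacity grows with the number of copies."""
    raw_tb = usable_tb * copies
    return raw_tb * disk_cost_per_tb + extra_network_cost


def array_cost(usable_tb: float, array_cost_per_tb: float, fixed_cost: float) -> float:
    """Cost of a dedicated array: per-TB price plus fixed hardware such as switches."""
    return usable_tb * array_cost_per_tb + fixed_cost


# All values below are hypothetical placeholders chosen only to exercise the model.
USABLE_TB = 100

cheap_case = integrated_cost(USABLE_TB, copies=3, disk_cost_per_tb=40.0,
                             extra_network_cost=3000.0)
costly_case = integrated_cost(USABLE_TB, copies=5, disk_cost_per_tb=40.0,
                              extra_network_cost=3000.0)
array_case = array_cost(USABLE_TB, array_cost_per_tb=120.0, fixed_cost=8000.0)

print(f"integrated, 3 copies: {cheap_case:.0f}")
print(f"integrated, 5 copies: {costly_case:.0f}")
print(f"dedicated array:      {array_case:.0f}")
```

With these placeholder inputs the integrated design is cheaper at three copies and more expensive at five, which is why, as Crehan says, the answer depends on the details of a specific deployment.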
Running the numbers
In any case, Crehan says that companies may be moving in the direction of integration. Sales of the type of switching hardware that would be used in McKenty's framework are on the rise.
"The market segment is enjoying tremendous growth and cloud deployments are an important driver of this. At the same time, despite very strong 4Q11 results, total Fibre Channel switch shipments did not grow in 2011," he says. This could imply waning interest in storage arrays.
Still, Crehan warns that organisational and technical inertia could hamper some companies looking to make an upgrade.
"More traditional enterprises have a lot of existing infrastructure to integrate with new deployments as well as organisational structures (such as separate teams for compute, storage and networking), which make some of these cloud-type implementations more challenging," he says.