It's a cliché but still true that the electricity grid delivers power to us through sockets in the wall. Can storage be supplied the same way? That concept is being advocated by HP, IBM, Network Appliance, Oracle and others. The argument is that computing should be a utility delivered by computing grids through sockets in the wall (see Techworld feature) and that a storage grid should be part of that.
What is a storage grid?
A storage grid is a set of virtualised storage services that function as a single logical resource. But this doesn't help us much: by that definition, a BlueArc NAS box that virtualises disk inside the box can be said to be a storage grid. (See GridToday)
Network Appliance has said that its acquisition of Spinnaker will help it produce storage grids. These are part of general computing grids, which make distributed computing resources function as a single resource. However, NetApp's storage grid vision appears limited to working within the Oracle 10g environment, which is intended to run Oracle and Oracle-approved applications only. This is as restricted as Oracle's RAC - real application clusters - is.
A SAN with a virtualisation box in front of it can be said to be a storage grid. There is nothing new in this.
For storage grids to be new there has to be some kind of logical or virtual storage structure embracing many different kinds of storage and virtualising them all. That, after all, is what a computing grid does: it virtualises computing. Only companies supplying many tiers and types of storage to enterprises will have the expertise and need to develop storage grids embracing their products. Count the possibles: IBM; EMC; HP; StorageTek; perhaps Fujitsu Softek (or will it be Softek shortly?); perhaps Veritas and one or two others. IBM's Storage Tank comes closest to this idea at present, in that it has a global file system and a storage infrastructure that sits in front of all kinds of IBM storage and virtualises it.
Once you have this though, you still have distinct storage serving needs. Some apps want files; others want blocks. A storage grid has to serve files to the apps that want files and blocks to those that want blocks. The grid infrastructure needs NAS access methods and SAN access methods.
It must have more. For a storage grid to function as a storage utility it must be able, surely, to provide backup and archiving services and lifecycle services that move data from one storage medium to another as access and data regulatory compliance needs dictate. But there is a crucial difference between electricity grids and storage grids.
An amp, a volt, an ohm - all are constant ideas, whichever electricity socket you plug your 3-pin plug into. But a byte is not a byte is not a byte: its contents differ. You can have data lifecycle management based, for example, upon simple file access levels and policies that shift data to different storage media if the access rate falls below a threshold. To have information lifecycle management you need to understand what the bytes contain. Otherwise, how will you know that e-mails referring to financial matters should be kept for fifteen years and others deleted after five? The difference between data and information is crucial.
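To make the distinction concrete, here is a minimal Python sketch of the two kinds of rule. Everything in it is an illustrative assumption rather than any supplier's product: the `StoredEmail` record, the `choose_tier` and `retention_years` functions, the "disk" and "tape" tier names, the threshold of ten accesses a month and the keyword list are all hypothetical. The fifteen-year and five-year retention periods are the ones from the example above.

```python
from dataclasses import dataclass

# Hypothetical keyword list; a real classifier would need far more than this.
FINANCIAL_TERMS = {"invoice", "audit", "payment", "tax"}


@dataclass
class StoredEmail:
    body: str
    accesses_per_month: int


def choose_tier(item: StoredEmail, threshold: int = 10) -> str:
    """Data lifecycle rule: looks only at the access rate, never at content."""
    return "disk" if item.accesses_per_month >= threshold else "tape"


def retention_years(item: StoredEmail) -> int:
    """Information lifecycle rule: has to inspect what the bytes contain."""
    words = {w.strip(".,;:!?").lower() for w in item.body.split()}
    return 15 if words & FINANCIAL_TERMS else 5


memo = StoredEmail("Please approve the Q3 audit invoice.", accesses_per_month=1)
lunch = StoredEmail("Lunch on Friday?", accesses_per_month=1)

# Same access rate, so the data rule treats both e-mails identically.
print(choose_tier(memo), choose_tier(lunch))          # tape tape
# Different contents, so the information rule keeps them for different periods.
print(retention_years(memo), retention_years(lunch))  # 15 5
```

The first rule can run anywhere in the storage layer because it needs only access counts. The second needs to read and interpret the e-mail itself, which is where the boundary between storage and application starts to blur.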
Where does the upper edge of a storage grid stop and the lower edge of applications concerned with storage start? Does a storage grid stop with common bytes and offer data lifecycle management, or does it look into byte contents and offer information lifecycle management? It is a moveable feast and the edge location will surely depend upon which supplier is building a storage grid on which products.
Building a storage grid, one that offers more than re-hyped NAS or SAN virtualisation, will be difficult, lengthy and fraught with looming standards issues as soon as you step outside one supplier's homogeneous set of products. Storage grids, like computing grids, are likely to be very significant things in the future.
Experiment with them for specific projects, if you will, but don't expect realistic and well-thought-out storage grids to appear for five years or more. When they do, they will be for enterprises only. If you are going to have petabytes of data across your enterprise, then storage grids are worth tracking. If not, park the concept in your nearest multi-storey hype park for the time being.