Storage consumes electricity to spin disks or stream tapes (or access flash). It also uses electricity to cool enclosures and power the chips used in controllers. How can we reduce the amount of electricity needed?

The simple answer is to reduce the number of spinning disk spindles or tape drives. Actually I'd reckon automated tape devices are very efficient in terms of energy consumed per GB of data stored. A typical tape library will have 3 - 10 drives, a robot and hundreds of cartridges in slots. At any one time ten drives could be spinning, one robot could be moving and 750 plus cartridges could be sitting idle in their slots.

Contrast this with a disk drive array containing, say, 150 drives. Every drive is spinning, and the controllers are drawing power too. The watts consumed per stored GB will be much higher than for a tape library.

Another thing to be said about disk arrays is that, with average utilisation below 50 percent, the equivalent of 75 of those 150 drives is spinning around with empty platters. We're wasting electricity spinning empty disks.

It's not that simple of course. There is some data on every disk generally. Only if you drive disk utilisation up towards 70 - 80 percent can you produce empty disks which can be retired. Storage array virtualisation (think DataCore, StoreAge, IBM's SVC, etc) and thin provisioning (think 3PAR, Pillar and others) are proven techniques for reducing the number of spindles you have.

Even if you do this, as data volumes grow and energy costs rise you will arrive back at square one. What is to be done?

Radical data storage cuts

One answer is to radically cut the amount of data stored on disk, or to find a way to spin down disks that are not being accessed, or both. Both ideas are suited to secondary or nearline storage and to virtual tape libraries (VTL). Unlike primary storage, you don't need lightning-fast access to data in these two cases, so you can afford the slight increase in data access time that both ideas involve.


Radical cutting of stored data size can be achieved with de-duplication (think Diligent, Data Domain and others), which involves a sub-file level search for repeated data strings and the replacement of each repeat by a pointer to a base string. It's a super-efficient form of compression, and de-dupe ratios of 20 - 30:1 have been found to be achievable. The cost is the CPU cycles and time needed to run the de-dupe algorithms when writing data and to reverse them when reading or, rather, reconstructing data.
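To make the pointer idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Diligent, Data Domain or anyone else actually does it: the fixed 4KB chunk size, the use of SHA-256 digests as pointers, and the function names are all assumptions made for the example.

```python
import hashlib

# Assumption for illustration: fixed-size 4KB chunks. Real products
# may use variable-size chunks and quite different matching techniques.
CHUNK_SIZE = 4096


def dedupe_write(data: bytes, store: dict) -> list:
    """Split data into chunks, keep one copy of each unique chunk in
    `store`, and return the list of digests (pointers) that describes
    how to rebuild the original data."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # the base copy is stored only once
        recipe.append(digest)
    return recipe


def dedupe_read(recipe: list, store: dict) -> bytes:
    """Reconstruct the original data by following the pointers."""
    return b"".join(store[digest] for digest in recipe)


def dedupe_ratio(logical_bytes: int, store: dict) -> float:
    """Logical bytes written divided by physical bytes actually kept."""
    physical_bytes = sum(len(chunk) for chunk in store.values())
    return logical_bytes / physical_bytes


if __name__ == "__main__":
    store = {}
    # Ten copies of the same two chunks, as repeated backups might produce.
    backup = (b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE) * 10
    recipe = dedupe_write(backup, store)
    print(len(store), "unique chunks kept")
    print("ratio:", dedupe_ratio(len(backup), store))
```

The write path pays for a hash per chunk and the read path has to follow every pointer, which is the CPU and time cost mentioned above.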

The attraction is obvious. With de-duping, what originally required a 100-disk VTL would now require a 10-20 disk VTL, meaning an 80 percent or more reduction in spindles. It sounds fanciful in the extreme but it appears to be rock-solid technology.


Another solid storage slimming technology is MAID, standing for a Massive Array of Idle Disks (think Copan and Nexsan). The data is stored on disks and only the disks that are needed for I/O are spun up. The rest are spun down. The array controller knows what data is where. When a request comes in for data on a quiescent drive, that drive is spun up.

This means you can pack many more drives into the same space as an array with all-spinning drives because you're not generating anywhere near as much heat. We might assume that 60 - 70 percent of a MAID array's drives are spun down at any one time. So a 100-disk MAID array will only consume the electricity needed to drive and cool 30-40 drives or fewer.
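The spin-up-on-demand behaviour can be sketched as a toy controller. This is a hypothetical model for illustration, not Copan's or Nexsan's design: the `MaidArray` class, its method names and the 300-second idle timeout are all assumptions.

```python
class MaidArray:
    """Toy model of a MAID controller: drives spin only while in use.

    Hypothetical sketch; the idle timeout policy is an assumption.
    """

    def __init__(self, drive_count: int, idle_timeout_s: float = 300.0):
        self.drive_count = drive_count
        self.idle_timeout_s = idle_timeout_s
        # Maps drive id -> time of last I/O, for spinning drives only.
        self.last_access = {}

    def access(self, drive: int, now: float) -> bool:
        """Record I/O on a drive. Returns True if it had to be spun up,
        which is where the extra access latency comes from."""
        if not 0 <= drive < self.drive_count:
            raise IndexError("no such drive")
        had_to_spin_up = drive not in self.last_access
        self.last_access[drive] = now
        return had_to_spin_up

    def spin_down_idle(self, now: float) -> list:
        """Spin down every drive idle for at least the timeout and
        return the ids of the drives that were stopped."""
        idle = [d for d, t in self.last_access.items()
                if now - t >= self.idle_timeout_s]
        for d in idle:
            del self.last_access[d]
        return idle

    def spinning(self) -> int:
        """Number of drives currently drawing spin power."""
        return len(self.last_access)


if __name__ == "__main__":
    array = MaidArray(drive_count=100)
    for drive in range(40):
        array.access(drive, now=0.0)
    print(array.spinning(), "of", array.drive_count, "drives spinning")
```

The trade-off is visible in `access`: a request that lands on a spun-down drive has to wait for it to spin up, which is why the approach suits nearline storage and VTLs better than primary storage.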

De-duped MAID

Of course we could combine these technologies and go much further. Let's envisage a 1,000-drive MAID array but with de-duped data stored on it. The de-duping says that the 1,000 drives are not actually needed. We only really need 200 drives (20 percent of the original, the cautious end of the de-dupe numbers above). Then we use MAID technology and only need 80 drives actually spinning (40 percent of the 200, again using the numbers above).

Moving from a 1,000-drive array to an 80-drive array would result in a massive cut in the energy consumption of that storage.
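For anyone who wants to try other assumptions, the back-of-an-envelope sum can be written down in a few lines of Python. The function name and parameters are made up for this example; the default percentages are the cautious figures used in the text (20 percent of drives kept after de-dupe, 40 percent of those spinning under MAID).

```python
def spinning_drives(raw_drives: int,
                    kept_after_dedupe_pct: int = 20,
                    maid_spinning_pct: int = 40) -> int:
    """Estimate drives left spinning after de-dupe and then MAID.

    Percentages are whole numbers so the arithmetic stays exact;
    each step rounds up to a whole drive.
    """
    # Ceiling division: -(-a // b) rounds up for positive integers.
    after_dedupe = -(-raw_drives * kept_after_dedupe_pct // 100)
    return -(-after_dedupe * maid_spinning_pct // 100)


def reduction_pct(raw_drives: int, spinning: int) -> float:
    """Percentage cut in spinning drives relative to the raw array."""
    return 100 * (raw_drives - spinning) / raw_drives


if __name__ == "__main__":
    spinning = spinning_drives(1000)
    print(spinning, "drives spinning")
    print(reduction_pct(1000, spinning), "percent fewer spinning drives")
```

This counts spindles only. It ignores controller power and the CPU cycles that de-duping itself burns, so the real energy saving would be somewhat less than the drop in spinning drives suggests.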

Sun is reselling Copan MAID arrays and working with FalconStor to develop its VTL Plus virtual tape library for enterprise use. FalconStor has a deal with Copan and FalconStor has de-dupe technology. Sun has a keen eye on data centre energy costs. It's easy to see what might develop here.

This is all back-of-an-envelope product speculation but it looks pretty compelling against a scenario of rising energy costs limiting data centre expansion.

What's more, there are no alternative technologies in sight at all. What signs would indicate that the picture I've painted above might actually happen? There are two: EMC, HDS, IBM and NetApp taking up de-duping technology, and the same companies taking up MAID technology.

What about tape?

Tape stores data in offline cartridges and has the best watts per stored GB of any storage technology, except paper-based ones. It's also cheaper to buy than disk on a per-stored-GB basis. I'd say tape is about to get a fresh lease of life.

PS. The same point of view applies to optical disk technology such as Plasmon's UDO.