The principles of storage virtualisation are well understood by now: pool your disk storage together and present the various physical drive arrays as a single, coherent pool of storage from which logical units (LUNs) can be constructed and assigned, with no restriction that the physical disk underlying a LUN has to sit on any one controller.
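As a rough illustration of the idea - a minimal sketch rather than any vendor's actual implementation, with the class names, extent size and controller labels invented for the example - a virtual LUN can be thought of as an ordered list of fixed-size extents drawn from whichever back-end controllers have free capacity:

```python
# Illustrative sketch only - not any vendor's implementation. Capacity from
# several back-end controllers is carved into fixed-size extents; a virtual
# LUN is simply an ordered list of extents that may span controllers.

EXTENT_MB = 16   # hypothetical extent size

class ManagedDisk:
    """A physical LUN presented by some back-end controller."""
    def __init__(self, controller, size_mb):
        self.controller = controller
        self.free_extents = [(controller, i) for i in range(size_mb // EXTENT_MB)]

class StoragePool:
    def __init__(self, managed_disks):
        self.mdisks = managed_disks

    def create_virtual_lun(self, size_mb):
        """Build a LUN from free extents, regardless of which controller owns them."""
        needed, extents = size_mb // EXTENT_MB, []
        while len(extents) < needed:
            progress = False
            for mdisk in self.mdisks:                  # stripe across controllers
                if mdisk.free_extents and len(extents) < needed:
                    extents.append(mdisk.free_extents.pop(0))
                    progress = True
            if not progress:
                raise RuntimeError("pool exhausted")
        return extents                                 # the virtual-to-physical mapping

pool = StoragePool([ManagedDisk("ESS-1", 1024), ManagedDisk("DS4000-1", 1024)])
lun = pool.create_virtual_lun(256)                     # extents come from both arrays
```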

You get better storage utilisation and easier management, because the virtualisation function masks the details of individual storage controllers and their individual ways of doing mirroring, copying and so forth. But - and this is a really large but - it is very hard for virtualisation devices to virtualise across different vendors' arrays, because they all carry out failure processing differently.

It's quite easy, says IBM's chief virtualisation architect Steve Legg, to virtualise across heterogeneous storage controllers when carrying out reads and writes. Blocks go back and forth like clockwork. It's when they don't that the problems start.

Legg architected IBM's SAN Volume Controller (SVC). He could have placed it as a host-based system, the way Veritas' Volume Manager is implemented - an approach HP has recently chosen. Or he could have bolted it onto a storage array, much as Hitachi Data Systems has done with its Universal Storage Platform.

Instead, it was placed as an in-band appliance in the storage fabric, for several reasons. One was that a server- or host-based approach uses up host mips. You can't guarantee virtualisation responsiveness if the host may be dedicating its mips to something else when a storage request comes along. Legg says, "I can't be deterministic about my service levels if I do it there."

Storage fabric edge virtualisation issue
A storage array-based virtualisation function can't see beyond its own array, and it has another problem associated with it.

Do you have a specific virtualisation function for each class of array: high-end, mid-range and low-end? If you do, then that's three products, which adds cost and complexity. In Legg's own words: "If you weld the SVC to the side of ESS you have to buy a Shark to have it - a bit of a hurdle (for FastT owners). If I weld it to the side of a DS4000 I have a separate version. I end up with multiple versions."

It's simpler to have one virtualisation box in the storage fabric to get over these issues. Legg says, "It's partly heterogeneity and partly entry cost. It was a conscious decision not to do things the TagmaStore way."

Legg said, "All we do is sit in the middle (of the SAN) as a traffic cop. An in-band topology - quite deliberately."

The hosts accessing the SAN see it as a storage controller; the storage controllers see it as a host. Things are neatly zoned. Because it is built out of clustered nodes arranged as active:active pairs of systems, it can provide redundant components - protecting against component failure - as well as linearly scalable growth. And it can see the individual storage controllers and virtualise across them. This is what IBM's SVC looks like: xSeries systems implemented as a cluster of active:active pairs.
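To make the topology concrete, here is a hedged sketch - purely illustrative, not SVC's actual design or code, with all class and node names made up - of an active:active pair sitting in-band in the I/O path. Either node can service a request, the pair looks like one controller to the hosts and like a host to the back-end controllers, and losing one node does not interrupt service:

```python
# Illustrative sketch only - not SVC code. An I/O group is a pair of nodes
# in active:active configuration sitting in-band between hosts and arrays.

class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def forward_io(self, extent, request):
        # To the back-end controller this node looks like a host: it translates
        # the virtual address and forwards the read/write to the owning array.
        controller, index = extent
        return f"{self.name}: {request} -> {controller} extent {index}"

class IOGroup:
    """A clustered node pair; hosts see the pair as a single storage controller."""
    def __init__(self, node_a, node_b):
        self.nodes = (node_a, node_b)

    def submit_io(self, extent, request):
        for node in self.nodes:            # preferred node first, partner on failure
            if node.alive:
                return node.forward_io(extent, request)
        raise IOError("both nodes in the I/O group are down")

pair = IOGroup(Node("node-1"), Node("node-2"))
pair.nodes[0].alive = False                # simulate a component failure
print(pair.submit_io(("ESS-1", 0), "read 4KB"))   # the partner still services it
```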

The Karenina principle
But what happens if a disk drive in a controller's array fails to respond to a request? What should the virtualisation function do then? Legg says that what Tolstoy wrote in the opening of his novel Anna Karenina is pertinent: "All happy families are more or less like one another; every unhappy family is unhappy in its own particular way."

Storage controllers from different vendors deal with failing drives - 'are unhappy' - in their own particular ways. They have different wait periods, varying retry counts and so forth. What makes heterogeneous virtualisation hard is understanding how different vendors' storage controllers deal with failure and error recovery, and handling it so that the upstream hosts see consistent behaviour.

According to Legg, "The complicated bit of virtualising across storage vendors is understanding the detailed error recovery. They all do it quite differently. That's what you qualify when you add support for a new storage controller."
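A rough way to picture what that qualification work buys you - again a hedged sketch rather than IBM's method, with the controller names and timeout/retry figures below invented purely for illustration - is a per-vendor error-recovery policy that the virtualisation layer applies before reporting one consistent outcome to the host:

```python
# Illustrative sketch only - not IBM's code. Each vendor's controller has its
# own timeout and retry behaviour, so the virtualisation layer keeps a
# per-vendor policy and maps every outcome onto one consistent result.

import time

# Hypothetical, made-up figures purely to show the shape of the problem.
ERROR_RECOVERY_POLICY = {
    "IBM-ESS":  {"timeout_s": 30, "retries": 2},
    "HDS-9900": {"timeout_s": 60, "retries": 4},
    "HP-EVA":   {"timeout_s": 45, "retries": 3},
}

def read_with_recovery(controller_type, do_read):
    """Apply the vendor-specific policy, but report failure to the host uniformly."""
    policy = ERROR_RECOVERY_POLICY[controller_type]
    for attempt in range(policy["retries"] + 1):
        try:
            return do_read(timeout=policy["timeout_s"])
        except TimeoutError:
            time.sleep(1)                 # a vendor-appropriate back-off would go here
    # Whatever the back-end did, the host always sees the same, well-defined error.
    raise IOError("medium error: virtual LUN read failed")
```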

Legg said that when IBM wanted to add heterogeneity to the SVC, "We contacted other array vendors. HDS said we could borrow Thunder and Lightning arrays. HP said we could buy their arrays. EMC didn't return our call." SVC now supports EMC, HDS, HP and IBM storage arrays.

SVC Status
Legg said, "There are 1,000 SVC implementations world-wide. Over 800 are in production and using it in anger." Customers include BT and Canon Fotango. SVC has been in the field for 18 months and version 4 of the software has just been released. New functionality is planned for 2005, when there should be two software releases.

It's clear that storage virtualisation is rapidly maturing into usable and reliable technology. The debate over its placement - host-based, storage controller-based or fabric-based - is by no means over. But the vendors are staking out their positions, and customers are becoming better able to weigh up the pros and cons of each approach.