There's suddenly a lot of talk about disk and storage virtualisation.

EMC is building software that will run on a SAN fabric switch, virtualise storage and move data between servers without applications noticing (using technology from its VMware acquisition).

IBM will ship its new Virtualization Engine in IBM servers and storage this year. This will simplify management of mixed environments of IBM and non-IBM systems and include the ability to run tens, even hundreds of Linux servers in partitions on one mainframe. So, suddenly, a four-processor system is a "40-way" system running one or multiple operating system types or versions at the same time.

Virtualisation of servers makes one physical server appear to be many virtual servers. Disk virtualisation makes many drive arrays appear to be one single big disk. What is going on?

What is virtualisation?
Virtualisation generally means making many things appear to be one, or making one thing appear to be many.

Disk virtualisation, virtual memory and computer clusters are all examples of "many into one" virtualisation. In disk virtualisation a set of different physical disk arrays is made to appear as a single big disk thanks to a dedicated computer system sitting between them and the servers accessing them.

The dedicated box could be running DataCore's SANsymphony software, for example. It receives requests from servers which are addressed to what the servers think are actual disk volumes. The software re-maps these requests onto the actual disks in the separate arrays. (RAID arrays also use virtualisation. A file appears to be written to one volume but can be striped across many disks.)
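The re-mapping step can be pictured with a small sketch. It is only an illustration, not how SANsymphony or any other product works: the class name, the array names and sizes, and the simple end-to-end concatenation of arrays are all assumptions made for the example.

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualVolume:
    """Hypothetical model: several arrays presented as one run of blocks."""

    def __init__(self, arrays):
        # arrays: list of (name, size_in_blocks) pairs, joined end to end
        self.names = [name for name, _ in arrays]
        # ends[i] is the first virtual block number after array i
        self.ends = list(accumulate(size for _, size in arrays))

    @property
    def size(self):
        return self.ends[-1]

    def remap(self, virtual_block):
        """Translate a virtual block number into (array name, local block)."""
        if not 0 <= virtual_block < self.size:
            raise ValueError("block outside the virtual volume")
        index = bisect_right(self.ends, virtual_block)
        start = self.ends[index - 1] if index else 0
        return self.names[index], virtual_block - start


volume = VirtualVolume([("array_a", 1000), ("array_b", 500), ("array_c", 2000)])
print(volume.size)         # 3500
print(volume.remap(1200))  # ('array_b', 200)
```

The server only ever sees block numbers 0 to 3499; which array actually holds block 1200 is known to the virtualisation layer alone.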

In a virtual memory system, a computer's operating system (OS) swaps parts of application code in memory out to a swap file on disk to make room for other applications. When the first application runs, it sees its whole address space as being in memory, in a kind of virtual memory. If there is a program jump to a part of its code which is actually on disk then the application doesn't break; instead the OS halts the application, swaps out part of some other application's code, and fetches the wanted code back into memory from disk. Then it resumes the halted application.
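A toy simulation shows the swap-in, swap-out cycle. It is a sketch under stated assumptions: the page labels are invented, memory holds just two pages, and the choice of which page to swap out (the least recently used one) is an assumption, since real operating systems use a variety of policies.

```python
from collections import OrderedDict


class PagedMemory:
    """Hypothetical model of an OS keeping a few pages in physical memory."""

    def __init__(self, frames):
        self.frames = frames           # how many pages fit in memory
        self.resident = OrderedDict()  # pages in memory, least recently used first
        self.faults = 0                # times a page had to be fetched from disk

    def access(self, page):
        """Touch a page; return the page swapped out to make room, or None."""
        if page in self.resident:
            self.resident.move_to_end(page)  # already in memory: mark as recent
            return None
        self.faults += 1                     # page is on disk: the app is halted
        evicted = None
        if len(self.resident) >= self.frames:
            evicted, _ = self.resident.popitem(last=False)  # swap out oldest
        self.resident[page] = None           # fetch the wanted page into memory
        return evicted


memory = PagedMemory(frames=2)
evictions = [memory.access(p) for p in ["app1:0", "app2:0", "app1:0", "app1:1"]]
print(evictions)      # [None, None, None, 'app2:0']
print(memory.faults)  # 3
```

The last access is the interesting one: the first application jumps to a page that is not in memory, so a page belonging to the second application is swapped out to make room for it.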

In a cluster of computers, client PCs or terminals refer to applications on a server. They "think" they are referring to an application on a named server, but the cluster control software farms their request out to any one of a number of servers on the cluster. It represents the cluster as one big server and accessing clients don't need to know any details of actual servers on the cluster.
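In code, the idea looks roughly like this. The sketch assumes the simplest possible policy, handing requests to each server in turn; real cluster software typically also weighs load and availability. The cluster and node names are made up.

```python
from itertools import cycle


class Cluster:
    """Hypothetical cluster front end: clients see one name, not the nodes."""

    def __init__(self, name, servers):
        self.name = name
        self._next_server = cycle(servers)  # round-robin over the real servers

    def handle(self, request):
        # The client addressed the cluster; the control software picks the node.
        server = next(self._next_server)
        return f"{server} handled {request}"


cluster = Cluster("apps.example", ["node1", "node2", "node3"])
replies = [cluster.handle(f"request {i}") for i in range(4)]
print(replies[0])  # node1 handled request 0
print(replies[3])  # node1 handled request 3
```

Clients only ever call the cluster by its one name; the list of nodes behind it can change without them noticing.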

One-to-many virtualisation
With IBM mainframes, the OS can use memory partitioning technology to make one mainframe run hundreds of different "instances" of Linux and an application. Each Linux instance "thinks" it has the whole mainframe to itself. The beauty of this is that it is very scalable and Linux instances can be cloned on demand to fulfil requests coming in over a network.

Disk partitions are the same kind of thing, though much simpler to create and manage. On desktops, it's possible to partition a disk into seemingly separate drives, a C drive, a D drive and so on. Actually they are all on one physical drive but only the OS knows about that. Applications using the drives assume that each logical drive is a complete and separate drive.
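The bookkeeping behind partitions is simple enough to sketch. This is an illustration only: the class, the block counts and the back-to-back layout are assumptions, and real partition tables carry rather more detail.

```python
class PartitionedDisk:
    """Hypothetical model: one physical disk presented as several drives."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.partitions = {}  # drive letter -> (first physical block, length)
        self._next_free = 0   # partitions are laid out back to back

    def add_partition(self, letter, length):
        if self._next_free + length > self.total_blocks:
            raise ValueError("not enough space left on the physical disk")
        self.partitions[letter] = (self._next_free, length)
        self._next_free += length

    def physical_block(self, letter, block):
        """Translate a block on a logical drive to a block on the real disk."""
        start, length = self.partitions[letter]
        if not 0 <= block < length:
            raise ValueError("block outside the logical drive")
        return start + block


disk = PartitionedDisk(total_blocks=1000)
disk.add_partition("C", 600)
disk.add_partition("D", 400)
print(disk.physical_block("C", 0))  # 0
print(disk.physical_block("D", 0))  # 600
```

Each logical drive starts counting at block 0, which is why an application can treat it as a complete and separate disk.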

Like Russian dolls, this form of virtualisation changes to "many into one" virtualisation inside the disk enclosure. There, two, three, four or five platters are made to appear as one disk drive by the drive controller.

What are EMC and IBM doing?
EMC's VMware originally provided the ability to run many virtual servers on one physical server, the "one into many" virtualisation. If you are going to move applications from one virtual server to another, even from one physical server to another, as the technology Veritas gained by acquiring Ejasent does, then you need to be able to alter the storage access so that the networked storage now links to the new application instance with no break in service. The application is moved or started ("instantiated" in the jargon) on a new (virtual or physical) server and storage has to be provisioned for it. Once the application moves, the storage access has to move with it.

The storage router that EMC is building will help achieve this. IBM's Virtualization Engine will also help achieve it. EMC's software is said to be deliverable in the second half of 2005.

IBM plans to roll out Virtualization Engine technologies and services across its server and storage products later this year. They will appear first in IBM's new iSeries servers, expected in the second quarter.