Simple SANs have a single hop, or link, between server and storage device, meaning just two ports on one switch are involved. The switch simply passes frames straight through from port to port. This is unlike a shared Ethernet segment, where a hub repeats every signal around the LAN rather than switching between discrete ports. As SANs grow larger, more switches are needed and they have to be joined by Inter-Switch Links (ISLs). These use up ports and can become overloaded if traffic isn't carefully planned.

Over-use of ISLs can also add latency to fabric traffic, so much so that SAN links time out. When a link fails or an administrator alters the zoning, the routing tables in the SAN switches have to be updated. Such rebuild traffic can interrupt the SAN's performance. It doesn't take long in the general scheme of things, but the more ports and switches there are, the more rebuild work there is to do. In bad cases it means that web transactions requiring SAN access might fail.

Director switches from Brocade and McData get over this by having many more ports than other switches and by having guaranteed-capacity links between those ports. A 64-port director can replace four 16-port switches, and it performs better than those four switches meshed together with ISLs.
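A rough back-of-the-envelope sketch shows the port cost. The Python below is purely illustrative, not any vendor's tool, and assumes a single ISL per switch pair; real fabrics often trunk several ISLs per pair, which consumes even more ports:

    from itertools import combinations

    def usable_ports_meshed(num_switches, ports_per_switch, isls_per_pair=1):
        # Device-facing ports left after fully meshing small switches with ISLs.
        pairs = len(list(combinations(range(num_switches), 2)))
        total_ports = num_switches * ports_per_switch
        # Each ISL ties up a port at both ends.
        return total_ports - 2 * pairs * isls_per_pair

    print(usable_ports_meshed(4, 16))  # 52 of the 64 ports are left for devices
    # A 64-port director keeps all 64 ports for devices, with no ISLs between them.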

As SANs grow larger still, a core-edge topology can emerge, with front-end edge switches linking servers to the central director(s) and back-end edge switches linking multiple storage devices to the directors.

This naturally increases the number of hops a message needs as it traverses the SAN from front-end edge switch to director to back-end edge switch. However, it is a much better alternative than building the same SAN from smaller switches linked together, because message latencies are more predictable and controllable.
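A small, hypothetical sketch of the hop arithmetic (the switch names are invented for illustration) shows why core-edge is more predictable: any edge-to-edge path is a fixed two ISL hops via the director, whereas cascading the same small switches gives paths that vary and lengthen as the fabric grows.

    from collections import deque

    def worst_case_hops(links, start):
        # Breadth-first search for the longest shortest path from one switch.
        graph = {}
        for a, b in links:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        return max(dist.values())

    # Core-edge: every edge switch hangs off the director.
    core_edge = [("edge1", "director"), ("edge2", "director"),
                 ("edge3", "director"), ("edge4", "director")]
    # The same four switches cascaded together instead.
    cascade = [("edge1", "edge2"), ("edge2", "edge3"), ("edge3", "edge4")]

    print(worst_case_hops(core_edge, "edge1"))  # 2 hops, whichever pair you pick
    print(worst_case_hops(cascade, "edge1"))    # 3 hops worst case, and growing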

One downside is that a SAN rebuild can take a relatively long time because of the sheer number of ports and zones involved. If just one zone is altered or one port fails, the whole SAN can be temporarily paralysed by a rebuild.

A way around this is to double up SAN devices and build for resilience by avoiding single points of failure. Naturally, the acquisition cost of the SAN and the burden of managing it immediately become greater.

A different route
An alternative route to building larger SANs is to use SAN routers. These are specialised switches, such as Brocade's Fabric AP7420, which connect SANs together into a logical SAN (LSAN). Instead of building out one large SAN, you build smaller SAN islands and connect them via SAN routers; Cisco and McData have similar devices. Such a router appears to an individual SAN as just another port. The router connects that port through itself to a port on the connected SAN, and the device or server attached there appears in the first SAN as a phantom device, reached over what could almost be characterised as a virtual private network (VPN) link.

In effect the SAN router is used to set up special zones connecting ports on two or more individual SANs. The routing adds layer-three networking on top of the original single-fabric, layer-two switching. This is analogous to what has happened over time with Ethernet and other networks: intelligence is added above layer two to make the network perform better, be simpler to manage and be more resilient to component failure.

If a port or link fails in an individual SAN within this routed structure, a rebuild is still necessary, but its effects are limited to that SAN's own fabric. They don't propagate across the router to other connected SANs. Failures and zone alterations are thus limited in scope, and the overall SAN withstands them better. Rebuild 'storms' cannot affect connected SAN elements on the far side of a router.
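A toy model makes the point; the classes below are hypothetical and not any vendor's API. Each island is its own rebuild domain, and the router connects islands without merging those domains.

    # Hypothetical model of rebuild scope in a routed SAN.
    class Fabric:
        def __init__(self, name, ports):
            self.name, self.ports = name, ports
            self.rebuilds = 0

        def change_zone(self):
            # A zoning change (or port failure) forces a rebuild of this fabric only.
            self.rebuilds += 1
            print(f"{self.name}: rebuilding routing tables across {self.ports} ports")

    class SanRouter:
        # Connects fabrics at 'layer three'; never forwards rebuild traffic.
        def __init__(self, *fabrics):
            self.fabrics = fabrics

    islands = [Fabric("island-A", 64), Fabric("island-B", 48)]
    router = SanRouter(*islands)

    islands[0].change_zone()              # only island-A rebuilds
    print([f.rebuilds for f in islands])  # [1, 0] - island-B is untouched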

Two or more SANs connected by routers now have a second level to their fabrics: a router-to-router layer sitting above the individual switched fabrics.

There is another potential benefit. If you join SAN islands together into a single fabric by adding a director or more lower-level switches, then every element in the resulting SAN has to have a unique name. This can involve a lot of administration and planning. The routing approach obviates the need for it: each SAN island remains logically separate from the others, and its naming is self-contained.
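Another small sketch, with invented names, illustrates the difference: merging islands into one fabric forces a renaming exercise wherever names clash, while routed islands can keep overlapping names because each namespace stays local.

    # Illustrative only: element names must be unique when fabrics merge,
    # but can overlap freely when islands are merely routed together.
    island_a = {"switch1", "host_oracle", "array1"}
    island_b = {"switch1", "host_exchange", "array1"}  # re-used names, harmless in isolation

    def merge_fabrics(*islands):
        clashes = set.intersection(*islands)
        if clashes:
            raise ValueError(f"rename before merging, duplicates: {sorted(clashes)}")
        return set.union(*islands)

    try:
        merge_fabrics(island_a, island_b)  # single-fabric approach: renaming work needed
    except ValueError as err:
        print(err)

    routed = {"island-A": island_a, "island-B": island_b}  # routed: namespaces stay separate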

Layer four functions
Suppliers have built SAN routing switches by wiring certain functions in at ASIC level but also by using general-purpose x86 processors. This means they have a general application platform sitting in the SAN. It could be used to carry out what we could call layer four network functions: things such as volume management or storage virtualisation could be located on it.

For example, sharing of files in the network-attached storage sense could be achieved by running a NAS head function on such a SAN application platform. Another example is to have a tape virtualisation function running on the platform. One instance of it could be used by many servers attached to the SAN. Alacritus, for example, is working with Brocade to port its virtual tape software to Brocade's Fabric AP7420.

These ideas seem appropriate to the concept of a storage utility.

We might think of non-routed SANs as dumb networks. By adding intelligence we can create more resilient storage networks and have a routed infrastructure above the basic switch port-to-port links. We can also add storage applications and enable the SAN to carry out these functions, releasing server CPU cycles for business applications.

Of course, with Ethernet, we already have a ready-made infrastructure, in theory. But somebody, somewhere, needs to build a pure iSCSI SAN using Ethernet and demonstrate that it can provide performance as reliable and predictable as a Fibre Channel SAN, with a significantly lower cost of ownership.

Until that is done, it's probable that iSCSI will only be used to extend access to Fibre Channel SANs rather than to build pure Ethernet SANs with multi-level network elements.