For a long time now, we’ve had a choice when buying network infrastructure: chassis-based units, where you slot multiple switch cards into a big metal box, or standalone units that hook together with cables. Well, this concept has now made its way into server infrastructure too: although we’re used to having collections of separate servers, we can now implement them by slotting cards into a chassis.

The latter concept is called a “blade server” architecture, presumably as a result of some inebriated marketing bod deciding some years ago that “blade” sounded sexier than “card” or “long, thin circuit board”. Blade servers are today’s Big New Thing but, unlike much that earns that label, the noises being made about them seem genuinely positive – so let’s have a look at what they are and why they’re such a good idea.

The old way
In a traditional server room you have a collection of separate computers. Each unit has its own power supplies, cooling units, memory, hard disk, processor, network connector(s) and keyboard/video/mouse (“KVM”) outlets. To connect the servers together, you link network cables into a hub/switch infrastructure, and each machine connects either to its own keyboard, video and mouse or to a KVM concentrator that lets you control multiple computers from a single set of I/O kit. In larger installations you may choose to separate the disk storage from the servers, in which case you’ll either run external disk arrays connected directly to the servers via SCSI or Fibre Channel, or you’ll have some kind of SAN infrastructure into which both the servers and the disk packs connect, probably by Fibre Channel but possibly by iSCSI.

Identifying the extraneous stuff
The first noticeable thing about a typical server setup is that no matter how good your cable management is, you have a lot of electric string connecting everything together. In a set-up with servers that have even a modest degree of resilience you have two power cords per machine, plus at least one LAN connection (and maybe two – you might have a second connection dedicated to out-of-band backups, for instance). In even the most modest server farm, this can get very messy, very quickly.
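To put some rough numbers on that spaghetti, here’s a back-of-envelope sketch in Python. It’s purely illustrative: the two-power-cords, one-or-two-LAN-links and one-KVM-run-per-machine figures are the assumptions from the paragraph above, not measurements of any real installation.

```python
# Rough cable count for a rack of standalone servers, using the per-machine
# assumptions from the text (dual power, one or two LAN links, one KVM run).

def cables_per_rack(servers, lan_links_per_server=2, kvm_runs_per_server=1):
    power = servers * 2                      # two power cords each, for resilience
    lan = servers * lan_links_per_server     # production LAN plus, say, a backup LAN
    kvm = servers * kvm_runs_per_server      # run to the KVM concentrator
    return power + lan + kvm

for n in (6, 12, 24):
    print(f"{n} servers -> roughly {cables_per_rack(n)} cables to manage")
```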

The second issue with the old approach, though, is that you have a lot of duplicate kit. Each server probably has a pair of fans, a pair of power supplies, its own set of sockets for KVM, and so on. All this equipment costs money and draws a tangible quantity of power (and, of course, kicks out a tangible amount of heat).

The third issue is one of space. Although today’s equipment can be relatively frugal, space-wise – for instance by being rack-mountable – there’s still a noticeable space requirement once you get chunky servers with external disk packs.

Chucking away the bits we don’t need
If you have a modern laptop computer – one which docks into a little desktop "cradle", into which you can connect the LAN cable, the mouse, an external screen, a power block, and so on – you’ll have noticed that all these services are transmitted between the cradle and the computer via a little multi-pin connector that’s just three or four inches long.

It’s not a huge leap of logic, therefore, to figure that instead of having servers with their own connections, we could build little modules that hold the stuff each server needs (the CPU, RAM, etc) but leave out the KVM/LAN connectors and the power supplies. We then build a metal box to house these shared features, and provide the link between chassis and server module through small form-factor connectors. By sharing things like power provision, we’re instantly saving space – and reducing the number of bits that might break. We can still have redundant components such as power supplies, but instead of needing two per server we might have just three or four power supplies for a chassis of half-a-dozen servers.
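As a rough feel for the saving, the sketch below compares dedicated redundant supplies per standalone server with a small shared pool in a chassis. The pool size is an assumption lifted from the “three or four supplies for half-a-dozen servers” figure above, not a vendor specification.

```python
# Back-of-envelope PSU count: per-server redundancy versus a shared chassis pool.
# The pool-sizing numbers are illustrative assumptions only.

def standalone_psus(servers, psus_per_server=2):
    return servers * psus_per_server         # every box carries its own pair

def chassis_psus(psus_for_full_load=3, spares=1):
    return psus_for_full_load + spares       # one shared pool, plus a spare

n = 6
print(f"{n} standalone servers: {standalone_psus(n)} power supplies")
print(f"{n} blades in one chassis: {chassis_psus()} power supplies")
```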

Interconnections
Right, so what we’ve achieved so far is to take all the bulky and/or duplicated bits of our servers, and to put them in the chassis instead of the server module. Aren’t we just putting all our eggs in one basket? After all, 10 servers in a chassis still need 10 (or 20) LAN connections!

Well, yes and no. Of course servers need network connections. The thing is, though, that network connections in a server farm serve two purposes: communications from one server to another, and communications between servers and stuff outside the farm. So let’s stuff a network switch into the chassis. Although we’d normally picture a 20-port switch as a box or a card (blade) with 20 bits of wire going into 20 UTP ports, our blades are already connected into the chassis. For the servers to talk to each other, then, all we need is a switching processor to direct the traffic – what we effectively have is a switch with no ports! We can make it a layer 3 switch, do VLANs, QoS, whatever we wish – all the facilities of a switch without cluttering it up with physical ports.
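To make the “switch with no ports” idea concrete, here’s a toy model of the backplane fabric: blades register with a slot and a VLAN, and the switching logic forwards traffic between them with no physical ports involved. It’s a deliberately simplified sketch – no real switch works from a Python dictionary – and the class and method names are invented for the example.

```python
# Toy model of the chassis's internal "port-less" switch: blades attach to the
# backplane by slot number, and frames are forwarded according to VLAN membership.

class BackplaneSwitch:
    def __init__(self):
        self.vlan_of = {}                    # blade slot -> VLAN id

    def attach(self, slot, vlan=1):
        self.vlan_of[slot] = vlan            # "plugging in" is just registration

    def forward(self, src, dst, payload):
        # deliver only between blades on the same VLAN, as a layer 2 switch would
        # (inter-VLAN routing, QoS and the rest are left out of this sketch)
        if self.vlan_of.get(src) == self.vlan_of.get(dst):
            return f"slot {dst} receives {payload!r}"
        return "dropped: different VLANs"

switch = BackplaneSwitch()
switch.attach(1, vlan=10)
switch.attach(2, vlan=10)
switch.attach(3, vlan=20)
print(switch.forward(1, 2, "app traffic"))   # delivered
print(switch.forward(1, 3, "app traffic"))   # dropped
```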

Of course, we’ll still need connections to the outside world. What we do, therefore, is stuff in a card to provide a small number (typically between one and four) of uplinks for connecting the chassis to the rest of the world.

At the same time, we can consider what we’re going to do about connecting to external disk or tape arrays – and we arrive at the same answer. Just as we’d have a network adaptor with LAN uplinks, why not have (say) a Fibre Channel adaptor to link our chassis with our SAN? And, of course, there’s nothing to prevent us from implementing local storage by setting aside part of the space in the chassis for disks, and providing an internal SAN switch just as we did with our internal LAN switch.

Making the most of close coupling
The final step we can take with blade-based servers is to exploit the fact that we effectively have: (a) a bunch of identical servers; and (b) a set of shared connections, disks, power supplies and so on. Aside perhaps from the serial number engraved electronically into the CPU, there’s nothing physically tying a role to any particular server module. That is, if a server dies and we want to commission a different server to take its place, we don’t have to walk to the server room and move (say) the LAN cable from one machine to another – we can, in theory, do it all electronically.
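To make “do it all electronically” concrete: a server’s identity can be treated as a small bundle of data – network identity, boot volume and so on – that the chassis management layer points at whichever slot it likes. The sketch below is an illustrative assumption of how that might be modelled; the ServerProfile object and its field names are invented, not any vendor’s management API.

```python
# Toy illustration: a server's identity lives in a profile, not in cabling.
from dataclasses import dataclass

@dataclass
class ServerProfile:
    name: str
    mac_address: str        # LAN identity presented to the internal switch
    boot_volume: str        # volume on shared storage the blade boots from

# profiles are assigned to physical slots entirely in software
slots = {1: ServerProfile("web01", "02:00:00:00:00:01", "vol-web01"),
         2: None}           # slot 2 holds an idle spare blade

def recommission(failed_slot, spare_slot):
    """Move the failed blade's identity to the spare - no cables touched."""
    slots[spare_slot] = slots[failed_slot]
    slots[failed_slot] = None

recommission(failed_slot=1, spare_slot=2)
print(slots)                # web01 now answers from slot 2
```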

Blade server systems are, therefore, generally able to do clever things like having hot-spare blades that can take over in the event of a problem – or even provide additional resources when a particular application wants them. Just as we have LAN, power and KVM provided by our magic little connector between the chassis and each blade, why not have a high-speed proprietary bus which can dump an OS image onto a spare blade in seconds and kick it into life to take the place of a failing or failed colleague?
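Sketched in Python, the hot-spare idea might look something like the routine below. It’s a simulation under stated assumptions – the health check, the imaging step and the slot bookkeeping are all stand-ins for whatever a real chassis management layer provides.

```python
# Illustrative hot-spare failover: when a blade stops responding, image a spare
# with the failed blade's OS and bring it into service. Pure simulation.

def failover(active, spares, images, healthy):
    """active: {slot: role}; spares: idle slots; images: {role: OS image name};
    healthy: callable reporting whether a slot is still responding."""
    for slot, role in list(active.items()):
        if not healthy(slot):
            spare = spares.pop()              # grab a hot-spare blade
            print(f"{role}: slot {slot} failed - imaging slot {spare} "
                  f"with {images[role]} and booting it")
            del active[slot]
            active[spare] = role              # the spare takes over the role
    return active

# simulated chassis in which slot 3 has died
active = {1: "web01", 3: "db01"}
print(failover(active, spares=[7],
               images={"web01": "web.img", "db01": "db.img"},
               healthy=lambda slot: slot != 3))
```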

Any reason not to do it?
Are there any drawbacks? Well, yes – and again, they’re just the same as in the networking world. Number one is that all your eggs are in one basket: although you can provide multiple PSUs, you generally have a shared chassis bus, and if that dies, you’re stuffed. The counter to this, of course, is that these single points of failure are just bits of metal and wires – there are no moving parts, so if it works OK on day one it’s unlikely to die on day 1,000. Another issue is, of course, that if you need your servers to be in physically different places, there’s no point considering the blade approach.

Finally, blade servers are likely to appeal most to those building biggish server farms. If you’re implementing only a couple of servers, the chassis has a tangible cost of its own, so you’ll pay more for a chassis plus a couple of blades than you would for a pair of standalone machines. As your server count increases, though, you’re only buying the blades, and the average cost per server falls as the chassis cost is spread across more of them.
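The crossover point is easy to illustrate with some throwaway arithmetic. Every price in the sketch below is invented for the sake of the example – plug in real quotes before drawing any conclusions from it.

```python
# Where does a chassis full of blades become cheaper per server than
# standalone boxes? All prices are hypothetical, purely for illustration.

CHASSIS = 8000         # one-off cost of the chassis (invented figure)
BLADE = 2500           # per-blade cost (invented figure)
STANDALONE = 3500      # comparable standalone server (invented figure)

for n in (2, 6, 12, 24):
    per_blade_server = (CHASSIS + n * BLADE) / n     # chassis cost amortised
    cheaper = "blades" if per_blade_server < STANDALONE else "standalone"
    print(f"{n:2d} servers: ~{per_blade_server:,.0f} each as blades "
          f"vs {STANDALONE:,} standalone -> {cheaper}")
```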

Summary
The benefits of blade servers over standalone units are, then, precisely the same as the benefits of blade-based network switches over standalone ones. You have fewer bits that can go wrong, you have a close, high-speed interconnect between units, you save space, you save on power provision and heat dissipation, you don’t have electronic network spaghetti, and you can provide for failure or extra processing power via hot-spares and automated failover. And although they’re fairly pointless for those who have only a handful of servers, or those who have a lot of servers that need to be in different places, in all but these cases it’s well worth considering the blade approach.