Content Switching is an umbrella term for a variety of traffic manipulation techniques that get data to users in a way that optimises the use of your network - not necessarily by serving the data from where they might expect.

There are two main facets to Content Switching - caching (which we've already looked at to some degree in Web Caching) and load balancing - although there are enhancements to these, such as pre-positioning and content routing, that we'll dig into at a later date.

First, though, the basics of Content Switching as it applies to server load balancing.

Topologies

The reasons for installing content switches are pretty simple: you want to add performance, scalability and resilience to your server farm. Rather than have one server that's struggling to keep up with demand and is a single point of failure, you want several, all capable of serving the same content to your users - and transparent to them. But you don't want those users to have to know to connect to different server addresses. So you have one virtual IP address (VIP) that your users know and that your DNS server gives out, but several real addresses (RIPs) assigned to the actual server interfaces, which only the content switch knows about. That way, if one server fails, the content switch simply stops sending it traffic, and your users never notice. Similarly, adding a server is easy: all you have to do is tell the switch how to distribute user traffic to it.
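
To make the VIP/RIP relationship concrete, here's a minimal sketch in Python - the addresses and structure are invented for the example, not taken from any particular vendor's configuration:

```python
# Illustrative sketch only: one advertised VIP fronting a pool of RIPs.
vip = "203.0.113.10"  # the single address users (and DNS) know about

# Real server addresses, with a health flag the switch maintains
rips = {
    "10.0.0.11": {"healthy": True},
    "10.0.0.12": {"healthy": True},
    "10.0.0.13": {"healthy": False},  # failed: the switch stops sending it traffic
}

def candidate_servers():
    """Servers the switch is currently willing to send sessions to."""
    return [ip for ip, state in rips.items() if state["healthy"]]

print(candidate_servers())  # ['10.0.0.11', '10.0.0.12']
```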

There are two main ways that you physically integrate your content switch into your network. Either it’s inline between your servers and the rest of the network, where your users are, or it's off to one side, out of the direct data flow - often known as a one-armed or single-armed deployment.

If it's installed inline, you may have your servers plugged directly into the content switch, with an uplink from there to a LAN switch or router; or you might connect your servers to the LAN switch and just use two ports on the content switch to connect it inline between the LAN switch and the rest of your data infrastructure - you'll need fewer ports on your content switch in that case, though no less throughput. There is actually a third way here: a content switching blade installed in your LAN switch, utilising the port density and switching capacity of the switch you've already paid for.

If the content switch is inline in the client-to-server data flow, it's relatively easy for it to intercept the data, translate the destination VIP into a RIP and send the traffic off to the best server. If it's a single-armed deployment, there are a few options and tweaks needed to make sure the traffic flows the way you need it to. Either way, how does the switch choose the 'best' server?

Load Balancing Algorithms - Layer 4

Without having to be aware of the actual content that the servers are hosting, content switches can load balance on a variety of metrics. This is the most basic form of load balancing - we'll cover content-aware balancing and persistence in the next article.

Whichever method you use, the switch keeps track of what's happening in a table of active sessions. It needs to do this so that it knows how to route a reply packet back from the server to the client. When the first packet arrives from a client, the switch notes the source and destination IP addresses and ports. It chooses a server, NATs the packet so that it's now addressed to the real server rather than the virtual one, and sends it off. When the reply comes back, and for any subsequent traffic in that flow, the switch looks the packet up in the table to see what to do with it: replies are NATted back so the client sees the VIP, and further client packets keep going to the same server.
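
A rough sketch of that session table in Python (the names here, like choose_server, are invented for illustration - a real switch does this in hardware or firmware, not as a script):

```python
import random

RIPS = ["10.0.0.11", "10.0.0.12"]  # real server addresses
VIP = "203.0.113.10"               # the advertised virtual address

# (client_ip, client_port, dst_port) -> RIP chosen for that flow
sessions = {}

def choose_server():
    """Stand-in for whichever balancing metric is configured."""
    return random.choice(RIPS)

def client_packet(src_ip, src_port, dst_port):
    """Client -> VIP: pick a server on the first packet of a flow,
    then NAT every later packet in the flow to the same RIP."""
    key = (src_ip, src_port, dst_port)
    if key not in sessions:
        sessions[key] = choose_server()
    return sessions[key]  # destination is rewritten from the VIP to this RIP

def server_reply(client_ip, client_port, dst_port):
    """Server -> client: the table lookup identifies the flow, so the
    source can be rewritten from the RIP back to the VIP."""
    return sessions[(client_ip, client_port, dst_port)]

rip = client_packet("192.0.2.7", 51200, 80)
assert server_reply("192.0.2.7", 51200, 80) == rip  # same flow, same server
```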

Metrics

Least Connections: the switch simply sends a service request to the server that currently has the fewest sessions open to it. It's not immediately obvious, but this does actually take account of differences in processing power between servers: the more powerful ones will tend to finish sessions more quickly, and so will be sent more new sessions than the other, less powerful servers.
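
As a sketch (with invented session counts), least connections is just a minimum over the switch's per-server session tally:

```python
# Least Connections, sketched: pick the server with the fewest open
# sessions. The counts are invented example data.
open_sessions = {"10.0.0.11": 42, "10.0.0.12": 17, "10.0.0.13": 23}

def least_connections():
    return min(open_sessions, key=open_sessions.get)

print(least_connections())  # 10.0.0.12
```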

Round Robin: the switch simply sends sessions to each server in turn. This isn't a good one to choose if you have servers of different capacities, as the switch takes no account of that: it will quite happily send the same number of user sessions to your brand spanking new, top-of-the-range UltraSPARC IV as to the Pentium III desktop that you decided to turn into a web server just because you had it lying about spare.
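
Round robin is even simpler - in sketch form, with the same invented addresses:

```python
from itertools import cycle

# Round Robin, sketched: each new session goes to the next server in turn,
# regardless of how powerful each machine actually is.
servers = cycle(["10.0.0.11", "10.0.0.12", "10.0.0.13"])

for _ in range(4):
    print(next(servers))  # .11, .12, .13, then back to .11
```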

Weighting: most switches give you the ability to add a weighting factor to the above algorithms, so that you can better control the allocation of resources. The switch will still use a round robin process, for instance, but you can specify that twice as many sessions get directed to your high-end server as to your low-end one.
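
One simple way to picture weighting is to repeat each server in the rotation according to its weight - a minimal sketch, with invented weights:

```python
from itertools import cycle

# Weighted round robin, sketched: expand the rotation according to the
# weights, so the weight-2 server receives twice as many sessions.
weights = {"10.0.0.11": 2, "10.0.0.12": 1}  # high-end box vs old desktop

pool = cycle([ip for ip, w in weights.items() for _ in range(w)])

for _ in range(6):
    print(next(pool))  # .11, .11, .12, .11, .11, .12
```

Production implementations usually interleave the picks more smoothly than this, but the two-to-one proportion is the point.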

So far there's been nothing about providing any level of persistence, so a user who makes several discrete requests to a Web server could easily be pointed at a different server each time (although the packets making up each individual request will all be sent to the same one). It is possible to make sure certain users go to specific servers based on IP address, so there's still no need to look further than Layer 4.

IP Address Hashing: by taking part or all of the source or destination address and running it through a hashing algorithm, the switch can choose between the available real servers based on either where the request is coming from or where it's going to. So all users on a particular subnet, for example, can be directed towards 'their own' server. It may be possible, depending on the manufacturer's implementation, to extend this to port numbers too.
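
A sketch of the idea, hashing on the source /24 so a whole subnet lands on one server (the modulo-of-the-address 'hash' here is the simplest possible choice, not any particular vendor's algorithm):

```python
import ipaddress

servers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def pick_by_source_ip(client_ip, prefix=24):
    """Map a client's /24 onto one server, so a whole subnet
    consistently gets 'their own' machine."""
    subnet = ipaddress.ip_network(f"{client_ip}/{prefix}", strict=False)
    return servers[int(subnet.network_address) % len(servers)]

print(pick_by_source_ip("192.0.2.7"))
print(pick_by_source_ip("192.0.2.99"))  # same /24, so the same server
```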

The content switch may also be able to do health checking on the servers and feed this information into its algorithm, either by just pinging each server or by using application checks, although the latter moves us into the realm of Layer 7, content-aware switching. We'll look at that next, but for many scenarios the methods described above are sufficient. Layer 4 switching is relatively easy nowadays: since the IP information always appears at the same point in the packet, ASIC designers have been able to build chips that read this information in hardware, so it's very fast. This means that if this level of switching suits your needs, you won't have to install a more complex - and therefore more expensive - switch.
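
As a footnote to the health-checking point, a basic Layer 4 check can be as simple as seeing whether a TCP connection to each server's service port succeeds - a minimal sketch, with invented addresses:

```python
import socket

# A very simple Layer 4 health check, sketched: can we open a TCP
# connection to the service port? A real content switch runs checks
# like this continuously and feeds the results into its algorithm.
def tcp_healthy(ip, port=80, timeout=2.0):
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

rips = ["10.0.0.11", "10.0.0.12"]
healthy = [ip for ip in rips if tcp_healthy(ip)]
```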