When users access data via their web browsers, whether from public Internet web servers or a corporate Intranet serving up application data in a more user-friendly manner, much of what they pull across the network is actually pretty static. And since fronting information with web servers encourages pages full of moving icons, images and fancy banners, we're talking about sizeable amounts of data moving about.

That isn't much of an issue over the LAN, where bandwidth is plentiful, but it can be rather painful if you're at the far end of a bandwidth-constrained WAN. Caching isn't new, but some of the clever stuff that makes it transparent to your users, and ensures you only cache the parts of a page that can afford to be cached, is worth investigating. We'll also look at how to tell whether it's working and you're actually saving bandwidth.

Building the Right Network
To optimise WAN bandwidth, the idea is to install caches at your remote sites so that some of the data your users request is served from there rather than pulled across the WAN. You'll also want to do this transparently, so you don't have to configure your users' PCs with proxy settings pointing at your caches. Your caches may be inline, sitting between your users and the outside world, in which case they must recognise which traffic can and cannot be cached; or they may sit off to one side, in which case you'll need a redirection mechanism running on your network.
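Either way, the per-flow decision comes down to the same question: is this traffic worth sending to a cache? Here's a minimal sketch of that logic, in Python for illustration; the port set and bypass network are invented values, not taken from any particular product.

    import ipaddress

    # Decide, per flow, whether traffic should be redirected to a cache
    # engine or passed straight through. Values below are illustrative.
    CACHEABLE_PORTS = {80}                              # plain HTTP only
    BYPASS_NET = ipaddress.ip_network("10.1.0.0/16")    # e.g. internal servers

    def redirect_to_cache(dst_ip: str, dst_port: int) -> bool:
        """Return True if this flow should go to a cache engine."""
        if ipaddress.ip_address(dst_ip) in BYPASS_NET:
            return False                                # leave internal traffic alone
        return dst_port in CACHEABLE_PORTS              # otherwise, cache HTTP

    print(redirect_to_cache("192.0.2.10", 80))    # True  -> send to cache
    print(redirect_to_cache("192.0.2.10", 443))   # False -> pass through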

For example, Foundry's ServerIron switches run Transparent Cache Switching (TCS), which redirects users' HTTP traffic to cache servers. The ServerIron uses the responses coming back from the caches as keepalives - if it stops seeing replies from a cache, it stops sending traffic to it.
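The keepalive idea is simple enough to sketch. This is a rough illustration of the principle, not Foundry's actual implementation; the class shape and timeout value are assumptions.

    import time

    # Track when each cache last answered; stop selecting any cache that
    # has been silent for longer than the timeout.
    class CachePool:
        def __init__(self, caches, timeout=5.0):
            self.timeout = timeout
            self.last_seen = {c: time.monotonic() for c in caches}

        def response_seen(self, cache):
            # Called whenever a reply comes back from a cache.
            self.last_seen[cache] = time.monotonic()

        def healthy_caches(self):
            now = time.monotonic()
            return [c for c, t in self.last_seen.items()
                    if now - t < self.timeout]

    pool = CachePool(["cache-1", "cache-2"])
    pool.response_seen("cache-1")
    print(pool.healthy_caches())   # both listed until the timeout lapses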

Cisco uses its Web Cache Communication Protocol (WCCP) on its routers (and switches) to perform a similar task. There's a bit more intelligence in this setup, in that WCCP also runs on the caches (assuming they're Cisco ones, although WCCP is also supported by the likes of Squid). Routers and cache engines become aware of each other and form a 'service group'. Once the service group has been established, one of the cache engines is designated to determine load assignments among the cache engines and pass this information to the routers. Multicast is used so that multiple routers can be used for resilience, and traffic to any TCP port, not just port 80, can be sent to the caches.
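WCCPv2 spreads load using a 256-bucket redirection table: the designated cache assigns buckets to cache engines, and the router hashes each packet (typically on destination IP) into a bucket to pick the cache. The sketch below illustrates that hash-assignment idea in simplified form; the real protocol's hash function and messaging are more involved.

    import hashlib

    # Simplified WCCP-style bucket assignment: 256 buckets split evenly
    # across the available cache engines.
    CACHES = ["cache-1", "cache-2", "cache-3"]
    BUCKETS = [CACHES[i % len(CACHES)] for i in range(256)]

    def pick_cache(dst_ip: str) -> str:
        bucket = hashlib.md5(dst_ip.encode()).digest()[0]   # value 0-255
        return BUCKETS[bucket]

    print(pick_cache("192.0.2.10"))   # the same IP always maps to the same cache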

Caching the Right Data
When the cache sees the user's request, it checks whether it already has that content. If it does, it sends it straight to the user; otherwise it accesses the destination server, retrieves the data, sends it to the client and keeps a copy itself in case of future requests.
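In code, that hit-or-miss logic is only a few lines. Here's a bare-bones sketch using an in-memory dict as the store; real cache engines add freshness checks, eviction and disk storage on top of this.

    import urllib.request

    store = {}

    def get(url: str) -> bytes:
        if url in store:                         # cache hit: serve locally
            return store[url]
        with urllib.request.urlopen(url) as r:   # cache miss: go to the origin
            body = r.read()
        store[url] = body                        # keep a copy for next time
        return body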

To ensure that out-of-date information isn't sent by the cache to the user, HTTP has the notion of freshness. HTTP/1.1 allows content authors to specify how long content may be cached: it may be non-cacheable, cacheable, or cacheable until an explicit expiry date. There's also a conditional request mechanism, IMS (If-Modified-Since) - it's what your browser uses when you hit the refresh button - which lets a cache ask the server whether a page has changed, and so confirm that it's okay to serve its stored copy without downloading the content afresh.
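Both mechanisms are easy to see in miniature. Below is a sketch of each: honouring an explicit lifetime (Cache-Control: max-age), and falling back to a conditional GET with If-Modified-Since, where the origin answers 304 Not Modified if the stored copy is still good. The function names are illustrative, not from any particular cache.

    import time
    import urllib.error
    import urllib.request
    from email.utils import formatdate

    def is_fresh(fetched_at: float, cache_control: str) -> bool:
        # e.g. "public, max-age=3600" -> fresh for an hour after fetching
        for token in cache_control.split(","):
            token = token.strip()
            if token.startswith("max-age="):
                return time.time() - fetched_at < int(token[len("max-age="):])
        return False   # no explicit lifetime: revalidate with the server

    def revalidate(url: str, fetched_at: float) -> bool:
        # Ask the origin whether the object changed since we fetched it.
        req = urllib.request.Request(url)
        req.add_header("If-Modified-Since", formatdate(fetched_at, usegmt=True))
        try:
            urllib.request.urlopen(req)
            return False                   # 200: content changed, refetch it
        except urllib.error.HTTPError as e:
            return e.code == 304           # 304: our stored copy is still valid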

These freshness checks and expiry dates apply to each object on a page separately, so the most likely scenario is that a user gets the static content (navigation bars, photo images, most text and so on) from a cache engine, while the more active parts of the screen (stock prices, exchange rates or an inventory count) come over the WAN from the actual web server, and the page is assembled dynamically on screen.
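To make that concrete, here's what the per-object headers for such a page might look like; the URLs and values are invented for illustration.

    # Static assets carry long lifetimes and come from the cache;
    # volatile data is marked no-cache and always crosses the WAN.
    page_objects = {
        "/img/logo.gif":      "max-age=86400",   # static: cache for a day
        "/css/site.css":      "max-age=86400",
        "/data/stock-prices": "no-cache",        # volatile: always revalidate
    }

    for obj, cache_control in page_objects.items():
        source = "origin server" if "no-cache" in cache_control else "local cache"
        print(f"{obj:22} -> served from {source}")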

Bypassing the Cache
You can set filters so that traffic to or from specific addresses doesn't go to the cache engines. There are also instances where a client needs to communicate directly with the web server rather than the cache - for instance, where the server authenticates clients by IP source address - in which case a cache can take itself out of the request path to the server.
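A bypass filter is essentially an address-list check ahead of the redirect decision. This sketch shows the idea; the example networks are made up.

    import ipaddress

    # Flows matching a source or destination on the bypass lists go
    # straight to the server, untouched by the caches.
    BYPASS_SOURCES = {ipaddress.ip_network("10.2.3.0/24")}      # special clients
    BYPASS_DESTS   = {ipaddress.ip_network("203.0.113.0/24")}   # auth-sensitive servers

    def bypass_cache(src_ip: str, dst_ip: str) -> bool:
        src, dst = ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip)
        return (any(src in net for net in BYPASS_SOURCES) or
                any(dst in net for net in BYPASS_DESTS))

    print(bypass_cache("10.2.3.7", "198.51.100.1"))   # True: client bypasses the cache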

Pre-Positioning
This strays into the realms of a full-blown content delivery model, so we'll cover it in depth in a later article, but it's possible to push content onto some cache engines before anyone asks for it. Aimed more at the likes of video streams than plain web pages, data can be pre-positioned over the WAN out of normal office hours, so that the next morning, when everyone in a branch office wants to watch that new corporate video, it's already there and the WAN bandwidth won't take the hit at peak utilisation.
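At its simplest, pre-positioning is just a scheduled job that warms the cache overnight. A minimal sketch, assuming a job fired out of hours (say, from cron at 3am) and an invented URL list:

    import urllib.request

    PREPOSITION_LIST = [
        "http://intranet.example.com/video/corporate-update.mpg",
        "http://intranet.example.com/docs/annual-report.pdf",
    ]

    def preposition(store: dict):
        # Pull each large object into the branch cache while the WAN is quiet.
        for url in PREPOSITION_LIST:
            with urllib.request.urlopen(url) as r:
                store[url] = r.read()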

Calculating Bandwidth Savings
If you're going to deploy caching, it's to speed things up for your users and to save bandwidth, so make sure it actually makes a difference. Run some baseline tests before installing the caches, timing downloads at various peak times, and repeat them once the caches are in place. You should also see your bandwidth requirements decrease: if you monitor the traffic crossing your WAN router interface for a few days once the caches are in place and fully populated, you should see a significant drop in the I/O rate. If you don't, something is wrong and you need to investigate further.
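The arithmetic is back-of-the-envelope stuff: compare byte counts over comparable periods before and after deployment. The figures below are invented for illustration.

    # Week's WAN traffic on the same interface, before and after caching.
    before_bytes = 42_000_000_000    # baseline, pre-caching
    after_bytes  = 27_000_000_000    # caches in place and fully populated

    saving = (before_bytes - after_bytes) / before_bytes
    print(f"WAN traffic reduced by {saving:.0%}")   # ~36% in this example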