One of the problems facing organisations that use a great deal of remote management and monitoring is that management-related traffic can place a surprisingly significant additional load on the network. Although one can mitigate this to a certain extent using devices such as RMON probes, and the proprietary data concentrators that come with management or monitoring packages, this only goes part of the way to addressing the problem. One answer is to build a separate network to carry the management-related traffic - to do so-called "out of band" (OOB) management.
OOB simply means that the management traffic isn't flowing along the same bits of string as your corporate traffic and is therefore not occupying bandwidth that could be used by data that's making the company money. In an IP world, this basically means setting up a new IP subnet and using this to carry the additional traffic.
The simplest way to proceed is to fit each server you wish to manage with a separate network adaptor and connect that adaptor to your OOB management subnet. This network could use completely separate switches or, if your existing switches can be partitioned to provide "virtual" separate switches and have sufficient spare ports, you can use those as a less expensive alternative. For SNMP-manageable devices on the network, you'd give each device an IP address in the management subnet and connect it to your management LAN.
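As a rough illustration of the addressing side of this, the sketch below checks which devices have an interface in the management subnet and which don't. The subnets, device names and addresses are all invented for the example; substitute your own addressing plan.

```python
import ipaddress

# Hypothetical addressing plan: both subnets are assumptions for illustration.
PRODUCTION_NET = ipaddress.ip_network("10.0.0.0/16")
MANAGEMENT_NET = ipaddress.ip_network("192.168.100.0/24")


def classify(address: str) -> str:
    """Return which network an interface address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in MANAGEMENT_NET:
        return "management"
    if ip in PRODUCTION_NET:
        return "production"
    return "unknown"


def lacks_oob_interface(interfaces: dict[str, list[str]]) -> list[str]:
    """List the devices that have no interface on the management subnet."""
    return sorted(
        name
        for name, addresses in interfaces.items()
        if not any(classify(a) == "management" for a in addresses)
    )


# Hypothetical inventory: device name -> addresses of its interfaces.
devices = {
    "fileserver": ["10.0.1.10", "192.168.100.10"],  # second adaptor on OOB subnet
    "mailserver": ["10.0.1.20"],                    # production interface only
}

print(lacks_oob_interface(devices))
```

Anything this reports still needs a second adaptor, or will have to be managed over the production network.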
Not enough ports
There will, in most installations, be devices that you can't connect to the OOB subnet for some reason. Say, for instance, you have a firewall with one "internal" port that lives on the production LAN and one "external" port that connects to the Internet - it's usually not possible either to add an interface or to configure the internal interface so that it can exist on both the production and OOB LANs. In these cases, you simply have to live with the fact that the only way to manage such devices is via the production network.
If you have a number of locations connected by WAN links, you have a variety of options for dealing with OOB management in the wide area. To extend the concept of separating management and production traffic into the WAN, the old-fashioned approach would be to deploy remote data collection devices, such as RMON probes, and to use some kind of dial-up technology (generally ISDN) to transport the data as required (and to provide the OOB link when reconfiguring devices from afar using SNMP or proprietary protocols). With the growing availability of ADSL and cable modem links, it's becoming increasingly feasible to use these always-on technologies in place of dial-up, at greater speeds and lower cost than traditional ISDN links.
VLANs and bandwidth management – a compromise
Depending on the type and specification of the equipment in your network, it may be possible to implement a "virtual" OOB management network using VLANs and quality-of-service guarantees within the existing LAN, without the need to install further hardware. Imagine, for instance, that both your switches and the NICs in your servers support IEEE 802.1Q VLAN tagging (along with the 802.1p priority bits carried in the same tag), and that the switch infrastructure can be configured to provide bandwidth guarantees to the various VLANs.
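To make the tagging a little more concrete, here is a minimal sketch of the four-byte IEEE 802.1Q tag that a VLAN-aware NIC or switch inserts into an Ethernet frame. The tag carries a 12-bit VLAN ID plus the 3-bit priority field defined by 802.1p. The VLAN number and priority value used here are assumptions chosen for the example, not recommendations.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_vlan_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: 16-bit TPID followed by 16-bit TCI."""
    if not 1 <= vlan_id <= 4094:  # 0 and 4095 are reserved
        raise ValueError("VLAN ID must be in the range 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    # TCI layout: 3 bits priority, 1 bit drop-eligible, 12 bits VLAN ID.
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> tuple[int, int, int]:
    """Return (vlan_id, priority, dei) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF, tci >> 13, (tci >> 12) & 1


MGMT_VLAN = 100  # hypothetical VLAN number for the management network

tag = build_vlan_tag(MGMT_VLAN, priority=2)
print(tag.hex())
print(parse_vlan_tag(tag))
```

In practice the NIC driver and the switch do this for you; the point is simply that the VLAN membership and the priority travel together in every tagged frame, which is what lets the switches treat management traffic differently.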
It's no problem to create a new VLAN and to drop the servers into that VLAN without the need to add extra adaptors. You may decide, for instance, that your 1Gbit/s infrastructure can be split into a 990Mbit/s production LAN (which, if you've spent all that money on QoS-capable kit, will probably be split up further) and a 10Mbit/s OOB management network. The extent to which you can implement such bandwidth shaping depends entirely on the capabilities of your existing equipment. But if you have spent significant sums on top-notch kit the chances are you can extract at least some extra value from its facilities in this way.
If we think about it for a moment, what we achieve through this VLAN approach is actually what we wanted to achieve in the first place. Generally, people implement OOB management not because they want no management traffic at all on the production network, but because they want to be able to control how much management traffic flows over it. That is, it's the unpredictability of the management traffic levels (particularly when something goes pear-shaped and zillions of Syslog messages and SNMP traps are hurtling about) that motivates the introduction of an OOB management network.
What we're out to prevent is the situation behind that old network manager adage: "This thing would be working fine if it weren't gummed up with all those damned error messages". By using QoS tools to contain and place limits on the management traffic, we're ensuring that the production network's performance is consistent, even in times of crisis, and we don't even give a stuff that (in our example) we've set aside a whole 10Mbit/s of our bandwidth for management, even if it goes unused 98% of the time. After all, it guarantees that when we want to reconfigure something, the bandwidth is there to let the SNMP messages get through. Anyway, if we did give a stuff we'd simply configure it as a variable-rate channel with a cap of 10Mbit/s instead of a fixed 10Mbit/s chunk.
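The capped channel can be modelled with a token bucket, which is one common way of expressing a rate limit. The sketch below is only a model of the behaviour, not switch configuration: the 10Mbit/s rate comes from the example above, while the burst allowance and the message size are assumptions. Time is passed in explicitly so the result is repeatable.

```python
class TokenBucket:
    """Model of a rate cap: tokens are bits, refilled at rate_bps."""

    def __init__(self, rate_bps: float, burst_bits: float) -> None:
        self.rate_bps = rate_bps
        self.capacity = burst_bits
        self.tokens = burst_bits  # start with a full burst allowance
        self.last = 0.0           # time of the previous call, in seconds

    def allow(self, size_bits: int, now: float) -> bool:
        """Return True if a message of size_bits may be sent at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never beyond the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False


# 10Mbit/s cap; the 1Mbit burst allowance is an assumption.
bucket = TokenBucket(rate_bps=10_000_000, burst_bits=1_000_000)

# A crisis: 1,000 messages of 1,500 bytes (12,000 bits) arrive at once.
MESSAGE_BITS = 1_500 * 8
sent_in_storm = sum(bucket.allow(MESSAGE_BITS, now=0.0) for _ in range(1_000))
print(sent_in_storm, "of 1000 messages passed during the storm")

# A second later the allowance has refilled, so management traffic flows again.
print(bucket.allow(MESSAGE_BITS, now=1.0))
```

However hard the storm blows, the management traffic never takes more than its allowance, and whatever it doesn't use remains available to production traffic.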
Other uses for OOB networks
In general, OOB networks mean increased cost. They do, of course, help us provide performance guarantees, but the cost can still be hard to justify because most of the time everything will be working fine and the management traffic probably won't have a massive adverse effect on the system. We can increase the value of such networks, though, by considering running other services on them, not least our networked backup systems.
Backups are bandwidth-intensive but reasonably short-lived, and for this reason they're generally run overnight. This isn't necessarily the nicest way to run them, though - we may actually prefer to run a small "snapshot" backup once per hour instead of running a single big one overnight. But we're precluded from doing this because the network starts to crawl when the backup runs. We can therefore consider shifting ancillary functions, such as backups, to the management network and running them at times that are suited to the business, not just the network. (Obviously, there are server loading and open file issues to consider here, but you get the idea.)
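Before committing to hourly snapshots, it's worth doing the arithmetic on whether a snapshot will actually fit in the hour. The back-of-envelope sketch below does just that; the snapshot size, the link speeds and the 90% efficiency figure are all assumptions, so plug in your own numbers.

```python
def transfer_seconds(size_bytes: int, link_bps: float, efficiency: float = 0.9) -> float:
    """Estimate the time to move size_bytes over a link of link_bps.

    efficiency is the assumed fraction of the raw rate left after
    protocol overhead; 0.9 is a guess, not a measurement.
    """
    if link_bps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_bps must be positive and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bps * efficiency)


SNAPSHOT_BYTES = 2 * 10**9  # hypothetical 2GB hourly snapshot
WINDOW_SECONDS = 3600       # each snapshot must finish before the next starts

for label, bps in [("10Mbit/s", 10e6), ("100Mbit/s", 100e6)]:
    secs = transfer_seconds(SNAPSHOT_BYTES, bps)
    verdict = "fits" if secs <= WINDOW_SECONDS else "does not fit"
    print(f"{label}: {secs / 60:.1f} minutes, {verdict} in the hourly window")
```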
In many ways the relationship between the production network and the management network could be likened to that between a real-time network (production) and a non-real-time one (management), where the production traffic is unimpeded as far as possible by non-production traffic. So you could, for instance, start to ship traffic between your corporate email servers using the OOB link, as a few seconds here and there don't matter for email communications. The same logic applies to synchronisation messages between your NDS or AD servers, and moving them keeps that traffic off the production network too.
The technology behind setting up an out-of-band management network is straightforward, as it uses largely the same technologies and protocols as the production network. Yet it provides a means for ensuring greater reliability of the production network. The cost can often be justified by using it to carry not just management traffic but also other non-production stuff, for which instant delivery isn't critical but which might otherwise get in the way of production data. Although some devices might not be physically able to connect directly to the management network, at least you've reduced the traffic on the production network by moving those devices that could connect to the OOB network. As the devices you couldn't move reach the end of their lives, you can ensure that their replacements are less restrictive, connection-wise.