A significant part of network and system management is knowing what's going on when things are pretty normal. Yes, it can all get very exciting (well, that's one way of putting it) when there's a major issue and everyone works round the clock to get it fixed, and there's a certain satisfaction in coaxing the network back to life, but hopefully that doesn't happen all that often. Although it's not always as visible, you can make just as big a difference to your company's business by managing the network in the good times too.
Knowing the utilisation patterns, what traffic types are on the network, and where the traffic flows concentrate can help you give your users better performance, save the company money, and reduce the firefighting you have to do.
To analyse your network's behaviour you'll need stats. First, decide what it is you're interested in finding out. If you just collect as much information as you can and produce reports with no clear goal in mind, you'll be wasting your time.
Most people want to know about network and resource utilisation for capacity planning, as this tends to be where the obvious quick wins are. For example, when do you need to start thinking about upgrading WAN links? Do your routers need more memory? Is the disk space filling up on your servers? Long-term growth trending will also help you spot anything strange going on, such as an unexpected hike in usage that might indicate a problem or a major change in working practices.
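As a rough illustration of growth trending, the Python sketch below fits a straight line to utilisation samples and estimates how long it will be before a link crosses a planning threshold. The function name, the sample figures and the 70 percent threshold are all made up for the example, and real traffic rarely grows in a perfectly straight line, so treat the result as a prompt to look closer rather than a forecast.

```python
def days_until_threshold(samples, threshold):
    """Estimate days from the last sample until utilisation reaches threshold.

    samples: list of (day_number, utilisation_percent) pairs (hypothetical data).
    Returns None if the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one day")

    # Ordinary least-squares straight line: y = intercept + slope * x
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    intercept = mean_y - slope * mean_x

    if slope <= 0:
        return None  # no growth, so the threshold is never reached

    crossing_day = (threshold - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, crossing_day - last_day)


# Monthly average utilisation of an imaginary WAN link, in percent.
wan_samples = [(0, 40.0), (30, 45.0), (60, 50.0), (90, 55.0)]
print(days_until_threshold(wan_samples, threshold=70.0))
```

With these invented figures the link is growing by about five percentage points a month, so the estimate comes out at roughly three months of headroom.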
Decide which devices you want to monitor. It may be practical to monitor every interface on every device, or you may prefer to pick just the ones you consider most important.
You may want more than just raw utilisation numbers though. Where is all that traffic going? Traffic profiles have changed since the days when everyone connected to a mainframe and practically nothing else. Your network may have been ideal when all the traffic flowed into and out of a central site, but might now be adding delays and complexities when most traffic is peer to peer between sites. Locating the traffic flows may pinpoint new bottlenecks and a need for reoptimisation.
Similarly you should be aware of the users and applications that use up most of your network resources. If 90 percent of your traffic is for one particular application, then you had better tune your network to that.
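To make the "90 percent of your traffic" idea concrete, here is a minimal Python sketch that totals bytes per application (or per host) and ranks them by share. The record layout, with "app", "host" and "bytes" fields, is an assumption made for the example; whatever your probes or flow exports actually produce will need mapping into something similar.

```python
from collections import Counter


def traffic_share(records, key):
    """Rank the values of `key` by their percentage share of total bytes.

    records: iterable of dicts with a "bytes" field plus the field named
    by `key` (a hypothetical layout, not any particular tool's format).
    """
    totals = Counter()
    for record in records:
        totals[record[key]] += record["bytes"]

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    # most_common() returns entries sorted by byte count, largest first.
    return [(name, 100.0 * count / grand_total)
            for name, count in totals.most_common()]


# Invented sample records for illustration only.
flows = [
    {"app": "http", "host": "10.0.0.5", "bytes": 600},
    {"app": "http", "host": "10.0.0.9", "bytes": 300},
    {"app": "smtp", "host": "10.0.0.5", "bytes": 100},
]

for app, percent in traffic_share(flows, "app"):
    print(f"{app}: {percent:.1f}%")
```

Calling the same function with "host" as the key gives you top talkers instead of top applications.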
Another aspect that's specifically related to performance management is monitoring the network for response time and packet drops. This shows what your users are experiencing. It's more of a troubleshooting procedure, but again, trending these figures will tell you if something's running out of steam before you run into problems.
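One way of turning raw probe results into figures you can trend is sketched below in Python. It assumes you already have a list of round-trip times in milliseconds, with None standing in for a dropped probe; how you gather them is up to you. The function name and the choice of median and 95th percentile are illustrative rather than prescriptive.

```python
import math
from statistics import median


def summarise_probes(rtts_ms):
    """Summarise one polling period's probe results.

    rtts_ms: list of round-trip times in milliseconds, with None for
    each probe that got no reply (an assumed input format).
    """
    if not rtts_ms:
        raise ValueError("no probe results to summarise")

    answered = sorted(r for r in rtts_ms if r is not None)
    loss_percent = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)

    if not answered:
        return {"loss_percent": loss_percent, "median_ms": None, "p95_ms": None}

    # Nearest-rank 95th percentile: the smallest value with at least
    # 95 percent of the answered probes at or below it.
    rank = math.ceil(0.95 * len(answered))
    return {
        "loss_percent": loss_percent,
        "median_ms": median(answered),
        "p95_ms": answered[rank - 1],
    }


# Invented results from ten probes, two of which were dropped.
print(summarise_probes([10, 12, None, 11, 50, 13, 12, 11, 10, None]))
```

Storing one such summary per polling period gives you a series you can graph alongside utilisation, so a creeping median or a rising loss figure shows up well before users start complaining.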
Tools of the trade
You may already have management stations that poll devices for information. Utilisation stats can be retrieved by polling interfaces and CPUs at regular intervals, using the likes of MRTG (which is free) or HP OpenView (which isn't) to access MIB values.
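If you're curious what a polling tool does with those MIB values, the Python sketch below shows the basic arithmetic: take two readings of an interface octet counter (such as ifInOctets), work out the difference, and express it as a percentage of the link speed. The function name and sample numbers are invented for illustration, the SNMP polling itself isn't shown, and the code assumes the counter wraps at most once between polls.

```python
def utilisation_percent(octets_start, octets_end, interval_seconds,
                        link_bps, counter_bits=32):
    """Average link utilisation between two octet-counter readings.

    counter_bits is 32 for ifInOctets/ifOutOctets, or 64 for the
    high-capacity ifHCInOctets/ifHCOutOctets counters.
    """
    if interval_seconds <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")

    # Counters wrap back to zero at 2**counter_bits, so take the
    # difference modulo that value (valid for a single wrap only).
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)

    bits_transferred = delta_octets * 8
    return 100.0 * bits_transferred / (interval_seconds * link_bps)


# Two imaginary readings taken five minutes apart on a 1 Mbit/s link.
print(utilisation_percent(0, 3_750_000, 300, 1_000_000))
```

On fast links a 32-bit counter can wrap more than once in a five-minute poll, which is the usual reason for polling more often or reading the 64-bit counters instead.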
For more detailed information, such as top talkers or protocols in use, you probably have to invest in some strategically placed RMON probes. If you have Cisco hardware, NetFlow can provide similar information. You can get a snapshot of this sort of information using a network analyser, but these tend to be too expensive to leave permanently in place. You'll get some other ideas in Network Management for Free.
Analysing the data
You've decided what information you need, and gathered it. Now what? Reports can be as fancy or otherwise as you want. Importing the data into a spreadsheet and producing a series of graphs will present the information clearly and allow you to compare shifts in behaviour over time.
Alternatively you can go with a Web-based report generator designed for capacity planning and long-term trending, such as Netscout's Performance Manager. This provides tabbed reports, with headline-type executive summaries that can be zoomed in on to show situations to watch, and more detailed information on response times, applications and hosts, so that readers can get just the level of detail they need.
However you present the information, pay attention to what it's telling you. Make sure you understand why the top talkers are the ones they are. The way the traffic flows round your network shouldn't really be a surprise, so if you're seeing flows where you don't expect them, find out what's causing them. And make sure you keep tabs on how traffic levels are growing so you have plenty of warning if you need to start thinking about upgrading.