It's a common scenario, according to Burton Group senior analyst Eric Siegel, and not one that will be solved by throwing more technology at the issue - especially considering the state of networks today. Wireless, VoIP, virtualisation, service-oriented architecture (SOA), Web services, unified communications and other emerging technologies make it impossible for staff to diagnose network performance problems as they did in the past, with clear lines drawn between IT groups.
The merging of applications with networks on LANs and WANs to perform sophisticated transactions makes troubleshooting performance problems a bit like looking for a needle in a haystack.
"The network is now the backplane for multicomputing systems, whereas in the past all the processing happened on the mainframe," Siegel explains. "Now everything is all intertwined and intermeshed, and when things don't work well, it shows that some control over the environment has been lost at a time when tolerance for something being down is non-existent."
To solve this dilemma, however, Siegel does not suggest network managers talk to a handful of vendors to hear about their latest and greatest features. Rather, he says, IT shops can look a bit closer to home to find the technology, as well as the human capital, they need to assemble a team of all-star staff to tackle the environment's 20 top network and application-performance issues. This team could use the measurement, monitoring and management technologies already deployed in most IT environments to gather the information it needs to become a group of advanced problem solvers.
"You probably already have all the tools out there collecting the data you need to really isolate problems, and look around because the skills are most likely in-house as well," Siegel says. "Instead of spending $1 million on new software, put $100,000 of staff hours and human capital toward building a framework of people for rapid problem-identification and diagnosis."
The situation is common: senior network technicians are often relegated to fire-fighting because junior staff can't fix the most common problems as they come into the network operations centre. If organisations can discipline themselves to put their five or six smartest, most experienced network, application, storage, systems and other technicians in a group that measures the normal state of these intertwined systems, then when problems do occur, lower-level staff can more easily identify who is responsible for solving them, Siegel says.
Instead of the entire IT staff swarming to solve problems as they arise, this group would be dedicated to getting to the root of the issue and quickly determining if the problem lies on the network or with an application, for example.
"It's rare to find one person who is a specialist in everything. And these types of problems aren't happening enough for companies to pay to have one of these experts always on staff, but you can create the skills you need from the people you have to address the problems when they arise," Siegel says. "It's not about book-learning; those on staff with years of experience could be part of a team of heavy-duty end-to-end diagnosticians."
If it seems unlikely that an IT organisation could create a Network A-Team in-house, it could instead contract such services from experts for hire. Companies that teach network diagnosis also work on contract and can come in when a problem has existing IT staff stumped. Even then, it is a good idea for companies to measure the normal state of existing network components - everything from DNS to SOA applications - to help the problem-solving along. With this baseline in place, network organisations will be able to address problems more quickly as they arise.
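As a rough illustration of what a normal-state baseline can look like in practice, the sketch below summarises a set of healthy measurements and flags readings that fall far outside them. It is not a method Siegel describes: the function names, the three-standard-deviation threshold and the sample DNS lookup times are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise normal-state measurements as a mean and standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_abnormal(value, baseline, tolerance=3.0):
    """Return True if a reading is more than `tolerance` standard
    deviations away from the baseline mean (threshold is an assumption)."""
    return abs(value - baseline["mean"]) > tolerance * baseline["stdev"]


# Hypothetical DNS lookup times in milliseconds, gathered while healthy.
dns_baseline = build_baseline([20, 22, 19, 21, 23, 20, 22])

print(is_abnormal(21, dns_baseline))  # a typical reading
print(is_abnormal(95, dns_baseline))  # a reading far outside the normal range
```

The same comparison could be kept per component, so that lower-level staff can see at a glance which part of the environment has drifted from its normal state.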
Whether the diagnostic staff is in-house or external, Siegel says IT organisations should start compiling the following information:
* Contact information for all diagnostic staff
* The location, necessary passwords and other operations information for all diagnostic tools
* Skeleton plans to be fleshed out by the incident managers during the course of each incident
* A standardised format for recording diagnostic steps and their results.
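The last item, a standardised format for recording diagnostic steps and their results, could be as simple as a small structured record. The sketch below is one possible shape, not a format from Siegel or Burton Group: every class name, field name and sample value is hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class DiagnosticStep:
    """One action taken during an incident and what it showed."""
    action: str
    tool: str
    result: str
    performed_by: str
    timestamp: str = field(default_factory=_utc_now)


@dataclass
class IncidentRecord:
    """All diagnostic steps for a single incident, in the order taken."""
    incident_id: str
    category: str
    steps: list = field(default_factory=list)

    def log(self, action, tool, result, performed_by):
        """Append a diagnostic step to the record."""
        self.steps.append(DiagnosticStep(action, tool, result, performed_by))

    def to_json(self):
        """Serialise the record so that any team can read it."""
        return json.dumps(asdict(self), indent=2)


# Hypothetical incident used only to show the record in use.
record = IncidentRecord(incident_id="INC-001", category="network")
record.log("Checked reachability of the DNS server", "ping",
           "No packet loss", "on-call engineer")
print(record.to_json())
```

Recording every step in the same shape lets whoever picks up an incident see what has already been tried and ruled out.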
"Flexibility and quick, categorisation-based incident analysis is the key to rapid incident-handling," Siegel says. "Enterprises must also drive to create an organisational structure oriented toward co-operation and a systems view of rapid diagnosis."