We had been, in the main, happy with our ATM network, with its 155Mbps backbone and its mixture of 10/100Mbps switches and 10Mbps hub-stacks. It had been in operation for some years. Now, for the reasons acknowledged by the rest of the planet, it was time to upgrade to Gigabit Ethernet.
We spent 18 months arguing, rationalising and begging for money to overhaul the network. We eventually managed to convince the powers that be to part with a not inconsiderable amount of money. Following a complex tendering process, we selected an equipment vendor and a reseller.
We planned a staged migration. In the first phase we would build a new core and also migrate more than twenty departments onto it. We would have preferred to undertake this exercise early in the summer vacation but, naturally, by the time we had obtained the money and purchased the equipment we were into August with ‘clearing’ looming on the horizon.
Assisted by an able engineer from ADA networks we built the new core. All appeared to go well – extremely well. We named our core switches after the Simpson family: Homer, Marge, Bart and Lisa. We now had a super-fast, high-performance core.
Busy on the border
It all went without a hitch, in the way that network migrations usually don’t. It came as rather a shock when, three days later, there appeared to be a module failure on one of the core routers, on a module that provided routing to many of our departmental subnets. A quick examination of the equipment indicated that there was a hardware failure. We quickly replaced the module and all appeared to be well… except that it wasn’t; we started getting complaints about the network being slow. Frantic phone calls were made to the reseller and to the vendor; diagnostic procedures were initiated.
At the same time we noticed that our border firewall was rather busy, as opposed to sitting there filing its nails as it usually does. The firewall is a big, powerful Netscreen, but it was suddenly working overtime. Closer inspection revealed that we were experiencing ‘wormage’. Earlier in the summer we thought we had escaped an influx of Blaster, but now we were being nobbled, at the most inopportune moment, by Welchia (also known as Nachi). Over the course of the next few hours it became evident that numerous machines on the campus had been infected.
Our equipment vendor swore that their kit would be unaffected by the worm. We had no choice but to embark on days of pulling the network apart and rebuilding it. I am sure that every network ‘techie’ has experienced this kind of scenario; you start with a lovely resilient ring topology, and then gradually strip it down to a bare minimum ‘twig’ architecture in a vain attempt to isolate the problem. No matter what we did, we couldn’t isolate it.
“Is it the worm?” we would ask periodically. “No!” The vendor was convinced that there was no problem; no one else on the planet had reported anything. Obviously we were jinxed. Hours turned into days of going round in circles and achieving nothing. ADA networks were exceptionally helpful, supplying spare equipment, expertise and support.
Finally, after more than a week, the vendor admitted that its equipment was, after all, affected by the worms. We were exhausted and flat. It was, however, good to know that we had been right all along, because this sort of occurrence makes you start to question your technical ability and kills your confidence.
We finally had a meeting with the vendor and our reseller. The vendor kindly offered to provide us with some high-end kit, which was not subject to the same problem, until they could fix the routers we had purchased. We gladly accepted. They have since released a patch for our kit, which is currently under test.
We seem to have been terribly unlucky with our timing. Looking on the bright side, we have learnt a lot about the equipment and are now confident that problems will be solved.
Vanessa is a network manager at Royal Holloway College.