Do you ever feel that, despite your best efforts, things just conspire against you? You really make the effort to design and build your network for resilience, you dual-home all your switches, provide two WAN links into all your remote sites, even insist that the Service Providers run their tail circuits in separate ducts into the buildings, and yet, through no fault of your own, you end up with a single point of failure that isolates several of your sites. How can this happen?

Let me give you a recent example. A large financial organisation I’ve done some work for was replacing the WAN infrastructure for a sizable section of its operation. Each site was to be connected to the WAN via dual links, completely resilient, provided by two separate telcos. The finance house had gone out to tender and awarded the contract, via a systems integrator, to two Service Providers.

One was a global telecommunications provider originally based in the US. Let’s call them Telco A. A for AT&T, I guess you could say.

The other was again a provider with a global presence, but this time originating from the UK. Telco B, let’s say. As in BT, just for example.

The only slight problem was that Telco A, despite being a huge company with a massive global presence, didn’t actually have a very large PoP footprint in the UK outwith the Southeast. So Telco A subcontracted some of the circuits for its customer to a third telco that covered more of the UK. This would be Telco C. C as in Cable and Wireless, maybe? (Don’t you just love it when a story fits together like this!)

This is where it starts to get complicated. Telco C has lots of PoPs, but rather than dig up the streets, it buys its last-mile tail circuits wholesale from Telco B in lots of places. More than that, it also uses part of Telco B’s infrastructure to connect these tails back to its own sites.

Meanwhile Telco B, in trying to fulfil this pretty large order from the finance house, is running short of capacity in some places, so has gone to Telco C to provide some short-term extra bandwidth.

Are you still with me?

Now, while it’s always nice to see competing companies enter into the spirit of cooperation, the outcome of all this behind-the-scenes jiggery-pokery was that a local access PoP failure at Telco B took down both tail circuits to some of the customer’s remote branches and left them completely cut off. A later infrastructure problem at Telco C affected both Telco A and Telco B connectivity and hit another set of customer sites. The customer didn’t even know Telco C was in the equation, and wouldn’t have been pleased to find out: they had consciously left Telco C out of the tender process after falling out with them.
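To make the hidden overlap concrete, here is a rough Python sketch of how you might check for it, assuming you could actually get hold of the supplier relationships (which, as this story shows, is the hard part). The two supplier maps, their names, and both helper functions are my own illustrative assumptions, not anything the telcos or the systems integrator provided.

```python
# Hypothetical supplier maps: each provider -> the carriers it relies on.
# The edges are illustrative, loosely following the story above.
NORTHERN_BRANCH = {
    "Telco A": {"Telco C"},  # A subcontracts the circuit to C
    "Telco C": {"Telco B"},  # C buys its last-mile tail from B
    "Telco B": set(),        # B uses its own infrastructure here
}
CAPACITY_SHORT_SITE = {
    "Telco A": {"Telco C"},
    "Telco C": set(),
    "Telco B": {"Telco C"},  # B borrows short-term bandwidth from C
}


def underlying_carriers(provider, depends_on):
    """Return the provider plus every carrier it transitively relies on."""
    seen = set()
    stack = [provider]
    while stack:
        current = stack.pop()
        if current in seen:  # already visited; also guards against cycles
            continue
        seen.add(current)
        stack.extend(depends_on.get(current, ()))
    return seen


def shared_carriers(link_providers, depends_on):
    """Carriers common to every link into a site: potential single points of failure."""
    closures = [underlying_carriers(p, depends_on) for p in link_providers]
    return set.intersection(*closures)


site_links = ["Telco A", "Telco B"]  # the two "separate" contracted providers
print(sorted(shared_carriers(site_links, NORTHERN_BRANCH)))      # ['Telco B']
print(sorted(shared_carriers(site_links, CAPACITY_SHORT_SITE)))  # ['Telco C']
```

Any non-empty result means the two supposedly independent links share a carrier somewhere underneath. The check is only as good as the supplier map you feed it, and that map is exactly what the customer never gets to see.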

What comeback does the customer have in these cases? There’s nothing to specifically say that Telco A and B have to provide every inch of the network on their own infrastructure—in fact that would be pretty unusual. After all, you may have your broadband service from any one of a number of ISPs, but if it’s being provided over a landline, chances are you’re on the BT network for at least part of it.

Telcos probably don’t even know exactly how they are going to provision all the services when they respond to a bid request for several hundred or several thousand sites. And once they’re into the detailed implementation planning, they probably won’t share that level of detail with the customer. There’s something not right here; I just don’t quite know how to avoid it.