Halifax Bank of Scotland, the UK's largest mortgage and savings bank, suffered a problem common to large companies: It had an enormously complex storage area network (SAN) that left IT managers blind to the source of brownouts and outages.

"The fabric itself was a bit of a black hole. All we could see was throughput and megabytes per second. That was not enough. We needed more detail," Richard Briggs, the bank's senior storage technical infrastructure developer, said at the Storage Networking World conference in San Diego.

The bank's infrastructure consists of 3,000 Fibre Channel ports split across two data centres in West Yorkshire, England, more than 300 Sun E15K servers running Solaris, 149 Brocade switches and a petabyte of storage capacity on Hitachi Data Systems Lightning 9900 V Series and Thunder 9500 V Series arrays. Symantec's Veritas volume management software is used across all of the bank's storage systems.

The bank, which has 67,000 employees and 20 million customers, also needed a precise storage testbed to ensure upgrades to its production site would not cause shutdowns, so it built an exact duplicate of its production SAN, right down to the Brocade Communications Systems switches and Hitachi Data Systems arrays.

While the mini-SAN was an expensive venture, it has been indispensable for ensuring smooth hardware and software upgrades to the production site, said Briggs, who is also chairman of the Brocade UK User Group.

Finisar monitoring technology

To find the source of its brownouts and outages, the bank rolled out monitoring technology from Finisar, which let it pinpoint problems and take them back to vendors for help in solving them.

"It was a little scary at first," said Briggs, who admitted that seeing details of his infrastructure's performance opened his eyes to much needed improvements.

Finisar's NetWisdom SAN performance product is deployed both as software on an application server and as an appliance that sits in the SAN's fabric and monitors I/O traffic.

The NetWisdom product also comes with a protocol analyser tool for testing and designing SANs.

"If there's a fault, we can identify what part of the SAN had it," said Dave Buse, general manager of network tools at Finisar.

The analyser essentially captures live data traces in the host bus adapter environment. The traces can then be sent to a vendor, whose engineers can use them to debug systems, Buse said.

The trace functionality proved its worth when the bank ran into a data corruption problem while replicating from its primary site to its disaster recovery site.

The bank's switch vendor, array vendor and telecommunications provider said their systems were working flawlessly, Buse said.

"They had them in an off-site conference room and said we need to solve this problem," said Paul Hansen, vice president of marketing at Finisar. "They sent traces to all three, and the storage vendor said it is us."