Nationwide Mutual Insurance consolidated 14 distributed storage-area networks into four, housed in two production data centers, in only 90 days, using just four full-time storage experts for the actual migration. The result: by simplifying its SAN architecture, Nationwide doubled the number of terabytes its administrators can manage.
Alan Grantham, a storage architect at Nationwide, in Columbus, Ohio, told users at Storage Networking World yesterday that the massive storage consolidation, which was completed a year ago, came off without a glitch because of one thing: planning.
Grantham said he wasn't allowed to reveal the project's cost or discuss return on investment. But he spoke candidly about barring senior IT management from his data center during the project, calculating the risk involved in consolidating a storage infrastructure, and the importance of testing the new SAN against every failure you have had in the past before going live. The project also involved consolidating 102 Fibre Channel switches into 20 director-class switches from Cisco Systems Inc.
According to Grantham, there are three paths for going live on a new storage architecture: a live migration, a partial migration and an off-line migration.
Live migrations represent a very high risk to businesses because any operator error can create an outage, Grantham said.
A partial migration, which uses multipath software to disable one data path at a time in order to redirect data flow to the new SAN, requires a well-documented environment and is also very high risk because one error resulting in a disabled path creates an outage on a host.
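The risk Grantham describes in a partial migration can be illustrated with a small pre-check: before taking one path out of service, confirm the host still has another working path. The sketch below is illustrative only; the function name, the path names and the dictionary layout are assumptions, not part of Nationwide's process or of any multipath product.

```python
def safe_to_disable(paths, target):
    """Return True if `target` can be disabled without cutting off the host.

    `paths` is a hypothetical mapping of path name -> True if the path is
    currently active. Disabling `target` is treated as safe only when at
    least one other path is active.
    """
    if target not in paths:
        raise KeyError(f"unknown path: {target}")
    # Count the active paths that would remain after disabling the target.
    remaining = [name for name, active in paths.items()
                 if name != target and active]
    return len(remaining) > 0


# Hypothetical host with two paths, one of them already down.
healthy_host = {"fabric_a": True, "fabric_b": True}
degraded_host = {"fabric_a": True, "fabric_b": False}

print(safe_to_disable(healthy_host, "fabric_a"))
print(safe_to_disable(degraded_host, "fabric_a"))
```

In the degraded case, disabling the one remaining active path would leave the host with no route to storage, which is the outage scenario described above.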
Nationwide chose to perform an off-line migration, where host servers are brought down and then reconnected to the new SAN. While it requires a business outage, "it's the least complex, and it's often the fastest," Grantham said.
"We were doing 100 servers in five hours," he said. "Most environments want to be at that stage."
But in order to perform an off-line migration, business unit leaders must be sold on the benefits and reduced risks.
Grantham said a major roadblock for any SAN project is winning business-unit approval, and the way to do that is to sell the units on the business benefits.
"If you do not sell your business units, consolidation is incredibly hard. I probably went out and met with 60 large-scale business units and said, 'This is what we're doing, this is why, this is your benefit and this is what happens if you don't do it,' " he said.
Grantham also strongly suggested creating a data center cable strategy to ensure resiliency, so that no single floor-tile accident, for example, can take out an entire network. He also recommended documenting the entire storage infrastructure, even if you have a good storage resource management tool.
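The documentation check he recommends amounts to comparing what the records say against what is physically cabled. A minimal sketch of that comparison follows; the host names and the function are hypothetical, and no real SRM tool's export format is assumed.

```python
def find_undocumented(documented, observed):
    """Return hosts seen on the floor that are missing from the records.

    Both arguments are iterables of host names (hypothetical data).
    Names are compared case-insensitively after trimming whitespace.
    """
    known = {name.strip().lower() for name in documented}
    seen = {name.strip().lower() for name in observed}
    # Set difference: everything observed that the records do not list.
    return sorted(seen - known)


# Hypothetical inventories: one from the records, one from a rack walk.
srm_inventory = ["db01", "app01", "app02"]
rack_walk = ["DB01", "app01", "app02", "batch07"]

print(find_undocumented(srm_inventory, rack_walk))  # ['batch07']
```

Any name the function returns is a server that is cabled in but not on the list, and so a candidate for follow-up before migration night.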
"I have a very good SRM tool. However, I still have my guys walk through racks of servers counting Fibre cables and saying this server is not on our list," he said. "We found 11 servers not documented anywhere on our SAN, and some of them were 64-processor servers running critical business [processes]."
The No. 1 killer of all projects is lack of funding, Grantham said. If, as an IT manager, you think it's going to cost a certain amount, fight for that amount, he said.
Grantham also echoed a common mantra of IT project managers: "Communications, communications, communications."
Make sure you have a clear division of responsibilities, and form a small dedicated team, he said. At Nationwide, a team made up of Grantham, two engineers and one vendor contractor performed the migration.
"Break migration down to monkey-friendly tasks. Have easy-to-follow processes, check lists, and you have plug-in cable," Grantham said. "Yes, it's a pain. But on the night of the migration, they won't be questioning what they're doing."