It isn't often that you have access to your local neighbourhood data centre, literally a few steps down the block. But I did. I live in the USA, in a residential area of St. Louis called the Central West End, and I pass by the offices of the Regional Justice Information Service (REJIS) almost every day. When I learned that it was going to be moving its data centre, I knew that I had to be there for the actual move.
That was before I found out about the background check and how the move was taking place during the middle of the night. But I am getting ahead of the story.
In the process of reporting on the move, I got to see some terrific best practices about how to pick up your servers and minimise downtime, too. (See the best practices tips scattered throughout this story.)
Moving a data centre isn't easy under the best of circumstances. And no matter how hard you plan, there are still things that you don't think about, like a brand new elevator that wasn't working.
REJIS is an interesting enterprise: It was founded in 1976 to provide IT services to the public sector in the St. Louis area. The organisation now handles more than 200 different government clients for applications development and it supports more than a dozen different programming languages.
Tip: "Put staff at various remote locations during the move," says Eric Gorham, director of IT for the organisation. That way, services can be tested as they are restored -- without affecting the main data centre staff.
It does batch and online processing, and hosted facility and server management. REJIS has more than $15 million in annual revenue, and most of its customers are county and local government criminal justice agencies. But it also provides data connections to the National Park Service police who are based at St. Louis' most noticeable landmark, the Gateway Arch.
REJIS also supports more than 1,000 mobile devices that are in local police cars. There are about 150 employees in the building, most of them, as you can imagine, in IT-related jobs. REJIS has about 100 Intel-based servers, mostly Dells, and an IBM eSeries, too.
Given REJIS's client base, it has all sorts of connectivity to its clients. Its complex network comprises Frame Relay, T1s, ISDN, cable modems, MultiProtocol Label Switching, fibre and even dial-up. All of these links are encrypted, as you might imagine given the sensitivity of the data that traverses these networks. And all of these connections had to be moved from their old wiring closet to the new one next door.
With all this connectivity, REJIS needed to take some extra steps to ensure that all the communications lines would work after the move.
REJIS also has a background investigation system and implemented the first automated fingerprint system in Missouri. I got to experience that first-hand -- to enter its data centre during the move, I had to be checked out. This was one database that I didn't want any hits on and, luckily, I passed.
"When a cop pulls someone over on the street and runs a check on their plate and their driving licence, you can get over 20 different responses from various law enforcement databases," says Eric Gorham, director of IT for the organisation. "We can then organise this information for the officers in their patrol car."
REJIS had outgrown its 30-year-old data centre, located in the basement of the office building.
Tip: Don't forget about cables and other consumable but necessary supplies, too. Order plenty of extras and see if you can negotiate a liberal return policy for the unused ones.
The old space didn't have enough floor space or cooling capacity, and REJIS also wanted new disaster recovery features. When it was time to expand, it decided to build a new data centre next door, in a former parking lot, and double its floor space in the process. The offices in the old building are still being used; the data centre is the only occupant of the new building.
The new data centre would accommodate hot and cold aisles for distributed servers, rather than the old mainframe designs of the last century. "We had 12-in. raised floors that were getting crowded and reducing our airflow," says Gorham. "The new data centre has 24-in. floors so we don't have to worry about hot spots anymore." The new data centre also provides more floor and rack space to grow the managed-hosting part of the service, and it is a more secure facility in case of fire, earthquakes, potential flooding, severe wind events and tornadoes.
A basement isn't a good place for a data centre. "We are in an earthquake zone and our old data centre was below grade, not to mention being underneath a five-story building. We have had some water there before, and it could have been worse," says Gorham. "Plus, there was a lot of dead wiring, too, that was blocking airflow. We needed a better disaster recovery plan and having a new data centre in a separate building will help."
Plus, the old building didn't have a loading dock, making deliveries of new computer gear difficult. Finally, REJIS wanted to act as a backup location for the main state data centre, located a few hours away in Jefferson City.
All these features were incorporated into the new building.
Tip: During the actual move, make sure at least one person has a master contact list of every staffer's home, mobile and pager numbers, as well as contact information for key clients and vendors.
The move was scheduled for the beginning of Memorial Day weekend [a US bank holiday], to give IT staffers an extra day in case they had problems bringing everything back online and because most of their clients who work normal daytime, weekday hours wouldn't be around. It was scheduled to take place in the middle of the night, at 0200, during the lightest user load on their system.
This meant some extra planning: "At two in the morning, you aren't going to find an extra five-foot cable, so we bought many extra supplies just in case," says Gorham. "Happily, we can return the unused ones to our vendors."
Now, I am not a night person. Usually, I am in bed by 2130. But to get this story, I set my alarm and arrived shortly before 0100 on Saturday 26 May.
Despite the shortness of my commute, I am probably the last person to get to REJIS's office. There are already about 50 people crowded into the common break room. And, already, something has gone wrong. The new elevator in the new building isn't working, and Frank, the elevator technician on duty that night, has been called in to debug it.
The total distance for the move from the old to new buildings is less than 100 feet -- but it is up two flights of stairs and down a couple of halls. Hence, the need for the working elevator.
When Frank heard where he had to go, he was a bit worried. "I am going to pull up to this building in the middle of the night and every cop in St. Louis is going to be there," he told me as he was sitting on the floor of the service room, trying to debug the elevator circuitry. "Sure enough, here I am at this high-tech cop shop."
He is sitting on the floor because there is a huge pile of manuals next to him. "The manuals aren't well written, and the documentation on the error message that I am getting is very obscure," he says. Sounds a lot like what most of us have had to deal with in computer manuals.
While Frank puzzles over the manuals, Gorham is downstairs in the break room, giving everybody their instructions for the night. We are sitting next to a large stack of our own -- a large collection of pizzas, cookies (carefully home-baked and labelled by flavour by some volunteer), sodas and plenty of coffee. It is going to be a long night ahead and the movers need their fuel. "If you are going to ask people to come to work in the middle of the night, you want to make sure that they are going to be fed and keep them happy while they are working," says Gorham.
To get people juiced, Gorham draws a few random names for prizes -- this being St. Louis, more than two dozen people have donated pairs of Cardinals baseball tickets, and he has also purchased several restaurant gift certificates. Then we all go to work.
Tip: Use triage to determine the most mission-critical machines to move first.
At 0200, Frank gets the elevator going and the moving crew begins to take apart the control consoles that are needed in the new building. The crew starts to roll the IBM e-Server and virtual tape system across the floor and upstairs into the new building. Everything will sit on top of WorkSafe Technologies' ISO-Base seismic isolation platforms that can dampen vibrations in case of earthquakes.
As each machine is taken off the old rack, it goes by Chassidy's station. She is in charge of blowing out and clearing the accumulated dust with a reverse vacuum.
REJIS isn't moving everything out of the basement; some of it is being recycled or tossed. Staffers are dismantling two robotic tape libraries that have outlived their usefulness, along with 10,000-plus tapes. "Our new library has the same storage with just 1,500 tapes and is compatible with the state data centre formats, making it better for disaster recovery purposes," says Gorham. And REJIS plans on sending a huge pile of bus and tag cables and other old equipment to EPC Recycling.
Within a couple of weeks, the basement will be completely empty of equipment and cabling, and even the raised floor will be removed. "We don't have any current plans for the basement, but are thinking about a few ideas for it," says Gorham.
By 0230, "all of our red servers are in place," says Gorham. "We thought it was going to take longer." REJIS tagged each server by colour, with red marking the most critical machines (law enforcement servers, for example) that needed the least downtime. REJIS also has a lot of non-law enforcement work, such as general data processing for a local city government -- running payroll jobs and processing other HR data -- that was triaged as not quite so mission critical.
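For readers who want to plan a similar triage, the colour scheme can be sketched in a few lines of Python. This is only an illustration, not REJIS's actual tooling: the article names red as the most critical colour, so the "yellow" and "green" tiers, the server names and the `move_order` helper are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Assumed tiers: only "red" (most critical) comes from the article;
# "yellow" and "green" are hypothetical lower-priority colours.
PRIORITY = {"red": 0, "yellow": 1, "green": 2}


@dataclass
class Server:
    name: str    # hypothetical server name
    colour: str  # triage colour written on the server's tag


def move_order(servers):
    """Return the servers sorted so the most critical colour moves first.

    sorted() is stable, so servers of the same colour keep their
    original relative order.
    """
    return sorted(servers, key=lambda s: PRIORITY[s.colour])


servers = [
    Server("payroll-01", "green"),
    Server("lawenf-01", "red"),
    Server("hr-01", "yellow"),
    Server("lawenf-02", "red"),
]

for position, server in enumerate(move_order(servers), start=1):
    print(position, server.colour, server.name)
```

Keeping the priorities in one small lookup table means the move order can be changed on the night without touching the server list itself.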
By 0358, the huge IBM eSeries servers are connected and back online. "I'm happy about the limited downtime," says Gorham. "We were really only down for 118 minutes for mission-critical services."
Tip: Tag every server with information about its IP address, old and new rack locations and server name. Place the tags near where the cable is connected on the back. When you cut the cables, you still have the tags to guide where to hook up the new wires.
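The tag in this tip is really a small record with four fields, and it can be generated from an inventory rather than written by hand. Here is a minimal Python sketch under that assumption; the `ServerTag` class, the label layout, the rack naming and the sample values are all hypothetical and do not come from REJIS.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerTag:
    """The four pieces of information the tip says to put on each tag."""
    server_name: str
    ip_address: str
    old_rack: str
    new_rack: str

    def __post_init__(self):
        # Catch a mistyped address before the label is printed;
        # ipaddress.ip_address raises ValueError on malformed input.
        ipaddress.ip_address(self.ip_address)

    def label(self) -> str:
        """One-line text for a label stuck near the rear cable connection."""
        return (
            f"{self.server_name} | {self.ip_address} | "
            f"old rack {self.old_rack} | new rack {self.new_rack}"
        )


# Sample values only: 192.0.2.10 is from a range reserved for documentation.
tag = ServerTag("lawenf-01", "192.0.2.10", "B-04", "A-01")
print(tag.label())
```

Validating the address when the tag is created is a cheap way to catch typos before anyone is standing behind a rack at two in the morning.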
Seeing that things are going well, I head home for a few hours' rest. I am back by 0630 and walk into the server room to see only one cable lying across the floor: "We just had one bad connection that we needed to fix," says Gorham. That is it; all other systems are up and running.
By 0700, as breakfast is served in the break room, most of the crew is finishing up and getting ready to head home. But not before a few lucky ones are picked at another random lottery to get some Cardinals tickets as their reward for a good night's work.
Looking back with the benefit of a couple of weeks of hindsight, Gorham says he is very satisfied with the overall move. "I would have done some of the building's system tests earlier on -- such as making sure that major electrical and air conditioning systems were working, and maybe we should have had the elevator guy on site," he says. "And it would have been nicer if we had more new racks to spread out our servers."
But that's about it. All in all, it was a good night for a move.
David Strom is the former editor-in-chief at both Tom's Hardware.com and Network Computing magazine, and is now a podcaster, blogger, public speaker and freelance writer for numerous IT publications and The New York Times, among others. His blog is located at http://strominator.com and he can be reached at [email protected]