At The Hartford Financial Services Group, virtualisation has gone mainstream in a big way. The USA-based insurer has deployed 700 virtual servers and another 1,300 virtual PCs on 50 host systems. Bruno Janssens, senior architect of infrastructure architectural services, says the company plans to roll out another 300 to 400 virtual machines this year.
Although the insurer has realised benefits, Janssens and other early mainstream adopters say it's easy to get into trouble if you don't plan properly. From capacity planning to hardware configurations to scaling virtual machines in the enterprise, the pitfalls can derail major projects if you're not careful. Virtualisation, Janssens says, is not a one-size-fits-all solution. "Everything is not going to be able to be virtualised," he says.
Where virtualisation fits, however, the pay-off can be substantial. About 45 percent of data centre virtualisation projects involve server consolidation, according to consultancy IDC. Those projects on average have shrunk the server installed base by 20 percent, resulting in capital and operating cost savings that have delivered a 25 percent return on investment, says IDC analyst John Humphreys.
Now those organisations are leveraging live migration tools such as VMware's VMotion as well as dynamic load-balancing tools such as VMware's Distributed Resource Scheduler to improve the management and administration of those virtual machines. "By 2010, we think the majority of spending on virtual servers will be for high availability, business continuity and utility computing," Humphreys says.
But the majority of organisations are still ramping up major virtualisation projects in data centres. "While we can say that 60 percent to 70 percent of people are using [virtualisation], a lot of those are two or three or four servers in development and testing environment," says Bob Gill, an analyst at research firm TheInfoPro. As these organisations move into production deployments, unexpected issues can arise. "You can run into traps," says Janssens. Here are some of the more common missteps.
Ad hoc capacity planning
Planning is the key to a successful virtualisation project. "If you don't have an objective and don't plan correctly, you could really find yourself in a heap of trouble. You're putting all of your eggs in one basket, and you're sharing resources," says Matt Dattilo, vice president and CIO at PerkinElmer in Waltham, Mass.
Currently, 245 out of the health sciences company's 560 servers -- about 45 percent -- run within virtual machines.
Organisations that take an ad hoc approach to planning virtual server application workloads can quickly run into resource bottlenecks. "You have to do a classification of your applications before you can even start thinking about virtualisation," says The Hartford's Janssens. For each application, memory, processor and I/O needs must be fully understood to effectively determine which applications are good candidates for virtualisation and which will play well with others on the same physical machine, he says.
That means both measuring resource-utilisation rates over time and testing how specific virtual machines will work when combined on the same hardware. "Take the time to do that analysis of workloads that fit together [rather than] just throwing five or six virtual machines on the next physical box," says Philip Borneman, assistant director of IT for the city of Charlotte. About half the city's 300 servers are virtualised on just over a dozen hosts.
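As a rough illustration of that kind of workload analysis, the sketch below adds up CPU samples for a set of candidate workloads, period by period, and compares the combined peak with a headroom ceiling. The workload names, sample values and 70 percent ceiling are invented for the example and do not come from any of the organisations quoted. It also ignores memory, I/O and hypervisor overhead, all of which a real assessment would need to cover.

```python
# Hypothetical CPU utilisation samples (percent of one host's capacity),
# one value per measurement period. All names and numbers are assumptions.
samples = {
    "mail":     [5, 6, 30, 35, 8, 5],
    "payroll":  [2, 3, 4, 40, 45, 3],
    "intranet": [10, 12, 15, 14, 12, 10],
}

CEILING = 70  # assumed maximum combined utilisation we are willing to accept


def combined_peak(workloads):
    """Return the highest combined utilisation across all sample periods."""
    totals = [sum(period) for period in zip(*workloads.values())]
    return max(totals)


def sum_of_averages(workloads):
    """Return the sum of each workload's average utilisation."""
    return sum(sum(values) / len(values) for values in workloads.values())


peak = combined_peak(samples)
average = sum_of_averages(samples)
print(f"sum of averages: {average:.1f}%")
print(f"combined peak:   {peak}%")
print("fits under ceiling" if peak <= CEILING else "exceeds ceiling")
```

With these made-up figures the averages suggest a comfortable fit, yet two of the workloads peak in the same period and the combined load overruns the ceiling. That is why measuring over time matters more than comparing average utilisation rates.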
"Mixing the wrong pieces will sub-optimise the use of the hardware platform dramatically," says David Rossi, managing partner at Sapien LLC, a hosted application service provider that runs its entire set of hosted service offerings on virtual servers.
And even if applications are potentially compatible, misallocating resources within virtual machines can cause trouble. Each virtual machine adds overhead, and performance may suffer as virtual systems vie for physical resources such as memory, processor cycles and network bandwidth. "You need to build limits into each virtual machine, how much it can grow," says Janssens. "If you don't stop that, it will take over everything. That can be a real danger."
In some cases, the expected results may not pan out in the real world. Dattilo says when it comes to resource utilisation, sometimes two plus two equals five. He had a series of applications that consistently posted utilisation rates of two percent to 10 percent. "When we started piling them onto an ESX host, we saw a spike in CPU utilisation" when certain applications within virtual machines were all put onto the same host, he says. "We were hoping it would be all linear, but it ended up not being that way."
But Dattilo's team had a plan of attack and tested first. Many organisations do not. "A lot of this is almost being done in the dark," says TheInfoPro's Gill, whose firm regularly surveys IT executives on virtualisation and other subjects.
"We ask, 'How do you characterise what gets collocated together?' Most say that they simply add virtual servers, [then] they see problems and back off. It's not like people go in cognisant upfront as to what they're going to run into," says Gill.
VMware and others offer consulting services and capacity planning tools that can help. "There are tools and techniques to aid you in your migration to a virtual environment, such as ... from VMware and the Free Software Foundation," says Charles R. Whealton, senior systems engineer at a large European pharmaceutical company. The problem, says Gill, is that "people are either not aware of them or they're not using them."
Creating hardware conflicts
Virtual machines don't always play well with every physical server configuration, warns George Scangas, lead IT infrastructure analyst at Welch Foods. Early on, the juice maker installed a network interface card into a server, and it didn't get along with VMware. "We had to shut down 10 [virtual] servers to pull it out and put another one in," he says. Now Scangas pays close attention to the vendor's hardware compatibility list and tries to stick with it whenever possible.
Application hardware requirements can also be a problem. Some vertical market applications use their own device drivers and are only compatible with a short list of hardware. That hardware may not be compatible with the virtual machine host, says Tristan Lyonnet, a virtualisation consultant based in France.
Hardware homogeneity can also be a factor in virtual server environments. VMware's live migration tool, VMotion, can move virtual machines only between physical servers that use CPUs from the same processor family. "Even if the chip sets are different, it won't let you do it," says Scangas. That limitation also affects VMware's DRS, which relies on VMotion to migrate workloads among up to 16 physical hosts.
Kevin Thurston, director of IT infrastructure at PerkinElmer, says he started with Intel-based systems and then was limited when he wanted to switch to AMD's Opteron line. It's also important not to buy systems that use a processor family that's nearing the end of its life. "It's good to standardise on one platform. But also make sure you can grow it," he says. Thurston went with 2.8GHz Xeon processors initially, but those have been discontinued. PerkinElmer kept its Intel-based server farm but has built pools of virtual servers at other sites using AMD-based systems.
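One way to keep the same-family constraint visible during planning is to group the host inventory by processor family before deciding where virtual machines may move. The sketch below is only a planning aid under stated assumptions: the host names and family labels are hypothetical, and it does not reproduce the compatibility checks VMotion itself performs, which are defined by the vendor.

```python
from collections import defaultdict

# Hypothetical host inventory: (host name, CPU family). Illustrative only.
hosts = [
    ("esx01", "Intel Xeon"),
    ("esx02", "Intel Xeon"),
    ("esx03", "AMD Opteron"),
    ("esx04", "AMD Opteron"),
    ("esx05", "Intel Xeon"),
]


def migration_pools(inventory):
    """Group hosts so that every pool contains a single CPU family."""
    pools = defaultdict(list)
    for name, family in inventory:
        pools[family].append(name)
    return dict(pools)


def can_migrate(inventory, source, target):
    """Return True if both hosts are labelled with the same CPU family."""
    families = dict(inventory)
    return families[source] == families[target]


for family, members in migration_pools(hosts).items():
    print(f"{family}: {', '.join(members)}")
print(can_migrate(hosts, "esx01", "esx03"))
```

Grouping hosts this way mirrors the approach of keeping one farm on a single processor line and building separate pools on another.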
Specifying the correct amount of processing power, memory and other resources in the physical equipment supporting virtual machines is critical, users say. "Our first implementation was a little short on hardware resources," Thurston acknowledges. The specification, developed by VMware's own consultants, called for plenty of RAM but not enough CPU resources for PerkinElmer's particular mix of virtual machines. "When we started, the virtual machines were chewing up a lot of CPU cycles," he says.
Bryan Peterson, principal systems engineer at The University of Utah Health Care, had the opposite problem: plenty of CPU power but not enough memory. He started out with dual-processor ProLiant servers with 4GB of RAM but could only get eight virtual machines per host system. Peterson quickly moved to an 8GB configuration as the consolidation platform for his Windows-based servers; each physical server now supports 21 virtual machines.
Storage can be yet another challenge. To use VMotion, Peterson needed to configure storage on a clustered file system, which meant configuring a SAN. But the 15GB per virtual server he had initially allocated became woefully inadequate. "People kept requesting 50 or 100GB, [and] the database team was asking for 250GB" for its servers, he says, adding that they quickly ran out of space. "When we went live, we had a terabyte [of storage], which we gobbled up in a matter of weeks," he says. "Then we had to budget for more." That was a big and unexpected cost, according to Peterson.
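The gap between the planned and requested allocations is easy to see with a little arithmetic. The sketch below uses the figures quoted above (a one-terabyte pool and allocations of 15, 50, 100 and 250GB); treating a terabyte as 1,024GB and ignoring file system and snapshot overhead are simplifying assumptions.

```python
# Figures quoted in the article: 1 TB of initial storage and per-server
# allocations of 15, 50, 100 and 250 GB. Treating 1 TB as 1,024 GB is an
# assumption, as is ignoring file system and snapshot overhead.
POOL_GB = 1024
ALLOCATIONS_GB = [15, 50, 100, 250]


def servers_that_fit(pool_gb, per_server_gb):
    """Return how many whole allocations of the given size fit in the pool."""
    return pool_gb // per_server_gb


for size in ALLOCATIONS_GB:
    print(f"{size:>3} GB per server -> {servers_that_fit(POOL_GB, size)} servers")
```

Under these assumptions the pool holds 68 virtual servers at 15GB each, but only 10 at 100GB and 4 at 250GB, which helps explain how a terabyte could disappear within weeks.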
Falling into application support gaps
Although it's less of a problem today than it was a year ago, many application vendors are still reluctant to support their products when running on a virtual machine. It pays to check first. Even today, says Peterson, many clinical application providers aren't up to speed on the technology. "We follow their suggestions because we don't want to run into support problems down the road."
Bigger applications can also be a problem from a support standpoint. "We were gung-ho to virtualise our Citrix environment and ran into some opposition by the Citrix team that manages that environment," says Peterson. The main reason: Citrix Systems didn't recommend its use with VMware, he says. However, a Citrix spokesperson responded that the vendor does support VMware and has for some time.
But in testing, Peterson and the Citrix team found another issue: they were only able to allocate 15 Citrix clients per virtual machine.
"If [the Citrix team] could only get 15 users per Citrix VM, they would have to create a whole bunch more Citrix VMs" than they would need on standard physical servers, he says. Peterson says his group plans to "revisit that and do some more testing."
The Hartford is also testing Citrix on VMware. Janssens says the company has 3,500 Citrix users using about 1,800 published applications residing on 325 servers. To get workloads up, he recommends working with the latest version of ESX Server. "ESX 3.0, which supports 64 bit, can take much bigger workloads," he says.
Getting ahead of your expertise
While it's possible to get up to speed quickly with virtualisation by bringing in outside help, IT still needs to support it. "You have to have people working on this who are very knowledgeable," Janssens says, and the rapid pace of adoption can get ahead of the IT organisation's learning curve.
In many organisations, staffers have already spent time with products like VMware Server and VMware Workstation, but enterprise-grade virtualisation products such as ESX Server are a different animal. "Plan on six months for you to play with it in a test environment before you go to production," Janssens says. In the meantime, he says, consider bringing in a partner that can help with planning, assessment and deployment.
Borneman recommends sending staff for training -- if you can find it. Training "is still not cheap and it's not everywhere, but it is improved," he says.
Computerworld reporter Patrick Thibodeau contributed to this story.