The maturation of virtualisation solutions in the x86 and UNIX realm has opened the door to an endless array of choices. More choices offer organisations greater flexibility, but they can also introduce confusion and complexity.
Every virtualisation technology operates in a slightly different manner, and every IT environment differs in turn, with its own operating patterns, technical composition, and business constraints. Because of this, there will probably never be one ideal virtualisation technology for every IT scenario. It is therefore better to focus resources on choosing the right technology for a specific situation than on chasing the ever-elusive “perfect” solution.
There are several factors that influence the choice of virtualisation software.
Mobility and motioning
Motioning enables running applications to move between physical servers without disruption. Available through technologies such as VMware’s VMotion, XenMotion, and partition mobility on IBM POWER6 LPARs, motioning has the potential to transform capacity management.
However, it’s not without its problems. Motioning can introduce volatility and create vexing challenges for management groups tasked with incident management and compliance. To gauge whether motioning is a good option in a given environment, organisations first need to analyse maintenance windows, the consistency of workload patterns, and disaster recovery strategies.
When multiple applications are combined on a single physical platform, their maintenance windows become intermingled. This can easily create scenarios in which no window of time is available for hardware maintenance. The same problem arises for software freezes.
The ability to motion virtual machines can alleviate this problem by allowing workloads to be moved off a server so it can be taken offline for scheduled maintenance or software updates. Without motioning in place, the proper initial placement of applications on virtual hosts is extremely important. In either case, making the right placement decisions is critical, since the mere act of motioning may itself constitute a change that violates a software freeze.
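The maintenance-window intersection problem described above can be sketched as a simple interval check. The application names, days, and hours below are hypothetical; a real analysis would draw windows from a CMDB or change calendar.

```python
# Hypothetical weekly maintenance windows per application:
# (day_of_week, start_hour, end_hour), hours in 24h local time.
windows = {
    "billing": ("Sun", 1, 5),
    "crm":     ("Sun", 2, 6),
    "reports": ("Sat", 22, 24),
}

def common_window(apps, windows):
    """Return the (day, start, end) interval still free for hardware
    maintenance across all co-placed apps, or None if their windows
    no longer intersect on a shared host."""
    days = {windows[a][0] for a in apps}
    if len(days) > 1:
        return None                          # windows fall on different days
    day = days.pop()
    start = max(windows[a][1] for a in apps)  # latest common opening
    end = min(windows[a][2] for a in apps)    # earliest common closing
    return (day, start, end) if start < end else None

print(common_window(["billing", "crm"], windows))     # ('Sun', 2, 5)
print(common_window(["billing", "reports"], windows)) # None
```

Combinations that return None are exactly the placements where motioning (or a different initial placement) becomes necessary to preserve a hardware maintenance window.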
Consistency of workload patterns
The advantages of motioning vary widely depending on the level of volatility in workload patterns. Motioning can be very useful for leveraging spare capacity when workloads are highly volatile, but those benefits diminish in low-volatility scenarios.
Organisations can analyse ideal placements based on the variations in utilisation patterns for each day, for the week as a whole, or both. If patterns do not vary widely from day to day, a static placement may be sufficient and the volatility of motioning can be avoided. If the patterns differ significantly from day to day, a more dynamic solution is warranted.
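The day-to-day comparison above can be approximated with a simple volatility measure. This is a minimal sketch: the utilisation samples and the 10% threshold are assumptions for illustration, not a recommended cut-off.

```python
import statistics

# Hypothetical peak CPU utilisation (%) for one workload, sampled
# once per day over two weeks.
daily_peaks = [41, 43, 40, 44, 42, 39, 41, 42, 40, 43, 41, 40, 44, 42]

def volatility(samples):
    """Coefficient of variation of daily peaks: a rough proxy for how
    much placement headroom changes from day to day."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Low day-to-day variation suggests static placement is enough; high
# variation is where motioning can usefully reclaim spare capacity.
if volatility(daily_peaks) < 0.10:
    print("static placement likely sufficient")
else:
    print("consider dynamic placement via motioning")
```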
Disaster recovery strategy
If application-level replication or hot spares are part of the disaster recovery plan, motioning may undermine these efforts. For example, one might inadvertently place a production server in the same locale as its disaster recovery counterpart. To avoid such pitfalls, organisations should undertake a detailed analysis of disaster recovery strategies, roles, cluster strategies, cluster roles, and replication structures.
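The anti-affinity constraint described above lends itself to an automated check. The VM names, sites, and pairs below are hypothetical; in practice this data would come from the DR plan and the placement engine.

```python
# Hypothetical placement plan: VM -> physical site, plus each
# production VM's disaster-recovery counterpart.
placement = {
    "erp-prod": "dc-east",
    "erp-dr":   "dc-east",   # violation: same site as its prod VM
    "web-prod": "dc-east",
    "web-dr":   "dc-west",
}
dr_pairs = [("erp-prod", "erp-dr"), ("web-prod", "web-dr")]

def dr_violations(placement, dr_pairs):
    """List production/DR pairs that ended up in the same locale,
    which would defeat the replication strategy."""
    return [(p, d) for p, d in dr_pairs if placement[p] == placement[d]]

print(dr_violations(placement, dr_pairs))  # [('erp-prod', 'erp-dr')]
```

Running such a check before every motioning event (not just at initial placement) is what keeps a dynamic environment from silently eroding the DR design.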
Overhead and scalability
There are numerous aspects of an operational model that may affect the success of virtualisation. These include the way I/O is handled, the maximum number of CPUs per VM, and the way vendors license their software on the platform. Organisations can address these overhead and scalability concerns by considering the following factors.
Software components such as database servers that are I/O intensive may be better suited to virtualisation technologies that don’t use virtual device drivers, as these device drivers “tax” the CPUs with every I/O transaction they perform, causing the system to hit its limits before it otherwise would. Techniques such as VMware’s raw device mapping also provide higher efficiency in this area, but the use of such features prevents motioning.
To determine the best approach, organisations can use a strategy-specific overhead model that adds up CPU utilisation numbers based on the I/O activity on the physical servers. This is an easy way to catch any workload types that are unsuitable for a given virtualisation solution.
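The overhead model described above can be sketched as a simple linear tax on measured I/O. The cost constant and the sample figures are assumptions for illustration; real models would be calibrated per virtualisation technology.

```python
# Hypothetical overhead model: virtual device drivers cost some CPU
# per unit of I/O, so projected utilisation = native CPU + I/O tax.
CPU_COST_PER_KIOPS = 4.0   # % of a CPU per 1,000 IOPS (assumed figure)

def projected_cpu(native_cpu_pct, iops):
    """Estimate post-virtualisation CPU utilisation by adding an
    I/O-proportional overhead to the measured native utilisation."""
    return native_cpu_pct + CPU_COST_PER_KIOPS * (iops / 1000.0)

# A database server at 55% CPU doing 9,000 IOPS:
print(projected_cpu(55.0, 9000))  # 91.0 -> flags an I/O-heavy outlier
```

Workloads whose projected utilisation approaches the host ceiling are the candidates for raw device mapping or for staying on physical hardware.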
Non-compute-intensive applications
Theoretically, it is possible to place many non-compute intensive applications together on a virtual host. However, there are many factors that may limit the scalability of this scenario. Moreover, pinpointing which factors are constraining the environment can be tricky.
The first step is to apply a CPU “quantisation” model. If a technology uses a rigid virtual-CPU-per-physical-CPU model, the number of virtual systems is limited by the number of CPUs. While this issue is fading as fractional models that allow allocation of fractions of physical resources become available, it is still prudent to be aware of the constraint to prevent unpleasant surprises.
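The rigid versus fractional contrast can be made concrete with a small sketch. The CPU counts and the quarter-CPU share are hypothetical inputs, not figures from any particular product.

```python
import math

def max_vms(physical_cpus, vcpus_per_vm, fractional=False, share=1.0):
    """Upper bound on VM count under a CPU quantisation model.
    Rigid model: each virtual CPU pins a whole physical CPU.
    Fractional model: each virtual CPU takes only `share` of a CPU."""
    if fractional:
        return math.floor(physical_cpus / (vcpus_per_vm * share))
    return physical_cpus // vcpus_per_vm

# 8 physical CPUs, 1 vCPU per VM:
print(max_vms(8, 1))                               # 8  (rigid)
print(max_vms(8, 1, fractional=True, share=0.25))  # 32 (quarter-CPU shares)
```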
Memory is a more complex part of the equation. Applications that aren’t doing very much will often utilise the same amount of memory as similar applications that are more active. Combining even a small number of these applications can quickly tax the memory capacity of the target system while making very little impact on CPU utilisation.
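A toy packing exercise shows how memory binds long before CPU for these idle-but-memory-resident applications. The host capacities and per-application figures below are assumptions for illustration.

```python
# Hypothetical host and a set of mostly idle applications: each needs
# little CPU but holds a near-constant working set in memory.
HOST_CPU_PCT = 100.0
HOST_MEM_GB = 32.0
apps = [{"cpu": 3.0, "mem": 6.0}] * 10   # 3% CPU, 6 GB each

def fit_count(apps, cpu_cap, mem_cap):
    """Pack apps in order until either resource runs out, and report
    which one binds first."""
    cpu = mem = 0.0
    n = 0
    for a in apps:
        if cpu + a["cpu"] > cpu_cap or mem + a["mem"] > mem_cap:
            break
        cpu += a["cpu"]
        mem += a["mem"]
        n += 1
    binding = "memory" if mem + apps[0]["mem"] > mem_cap else "cpu"
    return n, binding

print(fit_count(apps, HOST_CPU_PCT, HOST_MEM_GB))  # (5, 'memory')
```

Here the host exhausts its memory at five applications while its CPU sits at 15% utilisation, which is the imbalance the paragraph above warns about.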
The scalability of the underlying architecture also complicates matters. Some platforms become unstable when running too many images, regardless of what those images are doing. Others leverage robust backplane interconnects, caching models, and high context-switching capacity to host the maximum number of virtual machines without compromising reliability. To determine whether moving to “fat nodes” makes sense, organisations should factor in platform scalability and the workload blend.
Software licensing models
Some applications are not supported on specific virtualisation technologies. Even when support isn’t an issue, however, software licensing models can play a major role in the RoI gained from virtualisation. If applications are licensed per physical server, for example, this drastically reduces any potential gains from virtualisation efforts, leaving businesses to seek a physical configuration that will support the workload, typically by abandoning vertically scaled infrastructures in favour of smaller, commoditised servers deployed in a horizontally scaled fashion.
Organisations often have little guidance for ensuring security in a physical-to-virtual transition, as there are no published guidebooks or best practices for securing a virtual environment. The following are key considerations to minimise risk.
Mixing security zones in a virtual environment is a bad idea, since most virtualisation technologies don’t provide for strong enough security isolation models. For example, it wouldn’t make sense to place systems that are connected to sensitive internal networks on the same physical host as those connected to a DMZ.
In addition, many virtualisation solutions have administrator-level roles that allow viewing of the disk images of all the virtual machines. This introduces huge vulnerabilities by allowing sensitive security zones to be bridged. The problem is exacerbated in virtualisation solutions that have internal network switches controlling traffic between VMs on the same physical host. These technologies allow virtual systems to completely bypass the established port-level firewall filters, deep packet inspection, and QoS rules governing traffic in the environment, opening it up to threats that cannot be detected by network-level security tools.
Many virtualisation technologies allow access to information stored in offline virtual images by simply mounting them as disk images. While this gives users added convenience, it also introduces considerable liabilities, making it easy for someone to walk off with a hard drive. In addition, this makes it all the more important to be careful when virtualising any application that leaves residual data in temp files or other local storage.
Savvy businesses run “what if” scenarios to determine the most cost-effective virtualisation solution for their environment. This includes taking into consideration licensing costs, implementation expenses, and hardware and software savings. The cost of implementation, for example, varies widely based on the cost of transitioning to next-generation servers and storage, application engineering, and so on.
Moreover, if application-level changes are required, costs can skyrocket. As a general rule, if functional or user acceptance level testing is involved, RoI quickly disappears.
Another consideration is the type of hardware in use. Virtualising high-density physical infrastructures such as blades vastly reduces the server footprint, but the cost of associated cooling systems may outweigh the benefits. Likewise, the use of fat nodes and large vertically scaled servers offers high scalability and efficiency but comes with higher up-front costs. Commoditised rack-mounted servers are simple and easy to deploy, but since they share fewer hardware components they may be less effective in offering economies of scale.
An important and often unexpected issue arising from virtualisation is the lack of viable chargeback models. This means virtual environments must be designed so that they do not cross departments. If charge-backs are in place, the solution must provide a way to obtain accurate utilisation information to ensure equitable billing for compute resources.
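Where charge-backs are in place, the pro-rata billing described above is straightforward once per-VM utilisation can be measured. The departments, cost, and CPU-hour figures below are hypothetical.

```python
# Hypothetical chargeback: apportion a host's monthly cost across
# departments in proportion to measured VM CPU-hours.
HOST_MONTHLY_COST = 3000.0
cpu_hours = {"finance": 120.0, "marketing": 60.0, "hr": 20.0}

def chargeback(cost, usage):
    """Split `cost` pro rata by each department's measured usage."""
    total = sum(usage.values())
    return {dept: round(cost * h / total, 2) for dept, h in usage.items()}

print(chargeback(HOST_MONTHLY_COST, cpu_hours))
# {'finance': 1800.0, 'marketing': 900.0, 'hr': 300.0}
```

The hard part in practice is not the arithmetic but obtaining trustworthy per-VM utilisation data from the virtualisation platform in the first place.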
The sheer volume of virtualisation offerings makes choosing the right solution a confusing and often overwhelming task. Having the foresight to analyse the key factors affecting the virtualisation effort enables businesses to avoid critical missteps. Moreover, conducting “what if” scenarios that analyse business and technical constraints, from security to workloads, drives more informed decision-making that will ultimately mean the difference between success and failure in this complex era.
Andrew Hillier is co-founder and CTO of CiRBA. You can reach him through www.cirba.com.