There appears to be a challenger to 'VM sprawl' as the scourge of virtualisation success - a problem I call 'VM stall'.

We know about 'VM sprawl': because new virtual machines are so easy to deploy, organisations can end up with more VMs than they can handle, or even use. This has the potential to cause severe problems for availability, performance, compliance, costs, security, and more.

However, I am seeing more and more evidence of this new phenomenon I think of as 'VM stall', the tendency for virtualisation deployments to stall once the 'low-hanging fruit' has been converted (typically around 20-30 percent of servers).

I think it happens more or less like this...

In general, organisations start virtualisation deployments by converting relatively low-risk, low-impact systems - dev/test servers, Web servers, file servers, internal applications - to virtualisation. With a big impact, great results, and reasonably fast and easy implementation, virtualisation is a great hit with IT and business owners. This may even spawn a 'virtual first' initiative, where all new server requests are deployed as virtual servers by default.

However, when faced with the next step, converting the remaining existing servers - including tier 1 business services, customer-facing environments, enterprise-wide systems, third-party applications, multi-platform services, and composite applications - virtualisation projects often stall.

I was interested to see the notion of VM stall confirmed in some new research into virtualisation coming out of Prism Microsystems, a software vendor in the SIEM market.

One of the most interesting outcomes in this research was again the low penetration of server virtualisation within each organisation. As the chart below shows, most organisations have still virtualised less than a third of their production servers.

What's more, fully 15 percent have not even started to virtualise their production servers at all!

It might seem that this is really at odds with 'the common wisdom' that sees virtualisation as mature, ubiquitous, commoditised and even passé. We hear so much about virtualisation, how it has been a top priority for years, about how everyone is deploying virtualisation. For example:

  • The IBM Global CIO Study 2009 in September showed 76 percent of 2500 global CIOs are undergoing or planning virtualisation projects.
  • The Gartner 2010 CIO Survey in January reported that virtualisation is the top priority for over 1,500 global CIOs (up from number 3 the previous year).
  • In January, CDW's Server Virtualisation Life Cycle Report (registration required) found that 90 percent of respondents have implemented server virtualisation at some level.
  • As far back as 2008, EMA research showed 75 percent of enterprises were using virtualisation for production use cases.
  • The Prism Microsystems report the chart above comes from states that 85 percent of their sample have adopted virtualisation to some degree.

I am even starting to hear that virtualisation is set to be irrelevant, becoming nothing more than just a stepping stone to cloud.

However, despite the widespread adoption of virtualisation as a percentage of organisations, it is consistently still very low as a percentage of production servers.

Indeed, this is not the only recent (and not so recent) research study to highlight this issue. Over time, CIOs have reported a persistent difficulty in expanding their virtualisation deployments beyond the initial 20-30 percent of servers. For example:

  • Around six months ago, Gartner reported that "only 16 percent of workloads are running in virtual machines today."
  • Research from EMA has found that the average organisation has virtualised only around 25 percent of servers (and retired just 17 percent).
  • The CDW Server Virtualization Life Cycle Report cited above showed that just 34 percent of the average organisation's total server infrastructure consists of virtualised servers.
  • A CIO and HP survey in October 2009 reported that on average just 38 percent of mission-critical business services have been virtualised by companies with virtualisation projects.
  • Forrester Research from May this year (conducted for CA) shows that the average enterprise has virtualised only around 30 percent of their servers.

At a time when so many organisations are experiencing VM sprawl, it seems hard to believe that VM stall is such an issue. Yet time and again we see that organisations find it difficult to 'get over the hump' of the initial 20-30 percent of servers, and difficult to move from low-risk/low-impact servers to high-risk/high-impact services.

If this were just a point-in-time observation, then VM stall might not exist. The low penetration rate may just be a point in the deployment cycle.

However, VM stall appears to be a longitudinal effect, as it has been holding many deployments at around 20-30 percent of servers for several years. If I recall correctly, something resembling VM stall was cited as an issue in EMA research as far back as 2008, and again in 2009. The CDW virtualisation lifecycle research also reinforces the potential for long-term VM stall. In it, even organisations that self-report as "fully deployed" for server virtualisation have only virtualised 37 percent of their servers. So while many organisations see VM stall as a short-term delay to virtualisation rollout, many others are seeing VM stall as a permanent situation.

I see many possible causes for VM stall. For example:

Risk aversion - High-risk, high-impact services have more stakeholders, more politics, larger and more distributed infrastructures, greater cost of failure and downtime, reduced or non-existent third-party support, and maximum management attention, among many other risk factors. The risk of failure may be too great, and the newest technology is always blamed for any new problems. Without new ways to address continuity, availability, performance, cost allocation, and other business requirements, conversion risk may be enough to stall virtualisation deployment.

Resourcing - With around 20-30 percent of servers converted, virtualisation staffing starts to become a real challenge. As I talked about recently with my great mate, David Marshall, staff and skills shortages put a real throttle on virtualisation deployments, especially as virtualisation starts to scale. Not only is demand for virtualisation skills still high, but supply continues to lag. Plus, the problem is getting worse, not better. Without the resources and skills to go forward, there is often little alternative to VM stall.

Scalability - With one (typically small) team trying to manage a quarter of the entire server workload, staff from the virtualisation project team simply cannot handle further virtualisation deployment. In some cases, the virtualisation technology itself does not scale well either; in others, the management tools do not scale. Throwing more bodies at the problem is rarely the answer -- after all, nine women cannot make a baby in one month. So organisations end up with VM stall almost by default, as they find that they need to fundamentally change their processes and technologies to enable further virtualisation growth.

Manageability - New IT management issues come up as the scale and risk of virtualisation deployment increase. Enterprise virtualisation needs new approaches to performance assurance, process automation, VM mobility, continuity planning, security and audit, software compliance, OEM support, configuration compliance, and more. The importance of manageability is greatly magnified for high-risk/high-impact services, but few (if any) organisations seem to have virtualisation-aware management tools that can scale to enterprise-class virtualisation deployments. Again, VM stall happens almost by default, as IT tries to figure out enterprise-class manageability.

There may be more or different causes, but whatever the reasons, there is little doubt in my mind that VM stall exists. It is not universal -- indeed, every study shows that a decent percentage of organisations are able to power through it -- but for the majority of organisations, it appears to be very real. I have personally seen many enterprises going through it. More and more research continues to support it. For affected organisations, it is a significant problem, too, because stalled virtualisation deployment means the highly desirable outcomes of virtualisation -- OpEx reduction, improved continuity, greater IT and business agility, energy cost reduction, ROI, etc. -- either stall as well, or even start to backslide.

Whether VM stall represents as big a problem as VM sprawl, time will tell; but it is certainly a significant and growing challenge to the success of virtualisation -- and a fundamental driver for better virtualisation management.

Andi Mann is the vice president of product marketing for CA Technologies' Virtualization and Automation Business Unit.