As more companies begin to move their applications into the cloud, many are finding out that not all applications are created equal. By that, I mean not all applications are tailor-made to migrate into the cloud.
Part of the problem for enterprise organisations is the workload demand of certain applications: they need either more processing cores or more memory than traditional cloud architectures can accommodate. While virtualisation technology is normally the answer to many cloud-related questions, this time virtualisation has to be used and thought of in a different way in order to meet the challenge.
When we typically talk about virtualisation, we do so in terms of partitioning or making many smaller virtual machines out of a single larger physical machine.
In doing so, we optimise the workload of a single physical machine, but what happens when you need to go the other way? What happens when you need to optimise the workload demand of an application that requires more processor cores or more memory than a single physical machine has to offer?
ScaleMP is one company providing such a technology. It describes its new virtualisation technology as virtualisation for aggregation rather than consolidation.
To find out more, I spent some time with Shai Fultheim, the founder and president of ScaleMP, and we talked about the differences between scaling up and scaling out with virtualisation.
InfoWorld: You and I were talking about different ways of viewing server virtualisation. Can you explain these a bit more?
ScaleMP: The best analogy for the two ways to view server virtualisation is to think of how IT, for years, has viewed storage. There are two ways to purchase and implement storage: large storage arrays, and smaller in-system storage such as JBOD or NAS.
In the former, companies have partitioned large storage arrays for particular workloads, users, or departments. The single large array provides them an easier way to manage and distribute storage. In the latter, there have been storage management tools available to concatenate the distributed smaller arrays to appear as a large pool or single storage resource. This is exactly what is now occurring in server virtualisation.
Most IT professionals are familiar with server virtualisation for consolidation, which, like the first example in storage, consolidates workloads and increases the utilisation rates of a single x86 system. This type of virtualisation is focused on scaling out, or horizontal scale, and provides compute resources to transactional or stateless applications that make up the majority of IT data centre workloads.
The latter form of virtualisation, or what we call virtualisation for aggregation, provides the concatenation virtualisation for those workloads that require more processing power and more memory than one system can provide. In the traditional server world, this is called scaling systems up, or vertical scale. This capability takes numerous small servers and creates large virtual systems with tens or hundreds of CPUs and single system RAM in excess of 4TB.
InfoWorld: How then does virtualisation for aggregation work, and do you find it is best suited to specific environments or workloads?
ScaleMP: Server virtualisation for aggregation solves three fundamental problems facing high-performance IT today: it is cost-effective; it simplifies cluster installation and management; and it can aggregate and then redistribute cloud resources for high-performance or large-memory workloads.
Today the largest virtual SMP is created from up to 16 discrete x86 server nodes. This creates a virtual SMP with 128 cores and 4TB of RAM. However, with the introduction of the new Nehalem EX 8-core processors, the size of the virtual SMP grows dramatically.
In addition, ScaleMP will shortly announce support for 32-node systems. Both of these advances mean that by the end of the year customers will be able to build low-cost SMPs with thousands of cores and close to a hundred terabytes of RAM.
On a per-core basis, these systems will be significantly lower cost than a proprietary SMP but will obviously be larger than any but the very largest supercomputers. The revolutionary thought in all this is that the cost of building the most powerful system configurations will drop dramatically, and any IT organisation with cloud infrastructure will be able to aggregate a supercomputer on the fly.
The applications targeted by virtualisation for aggregation are those that get better performance with additional processors or those that require files over 64GB. Combining, or aggregating, multiple x86 servers and their CPUs and memory creates a large virtual machine that enables IT to run compute-intensive applications such as those found in financial analytics, data warehousing, genomics, visualisation, simulation, computer-aided engineering, electronic design, and analysis.
The current solution supports up to 16 x86 server nodes in any configuration. From a software programming perspective, the virtualisation layer supports all variants of Linux running unmodified from their original distribution. In other words, Red Hat Linux runs right out of the box. Once you boot the OS, the system configuration command will expose up to 128 cores and 4TB of memory depending on what you have in each node. The applications will then run unmodified on top of the OS just as though you have a native SMP.
InfoWorld: If you would, explain why someone would virtualise a server for SMP.
ScaleMP: Every enterprise has a high-performance computing need. Big CPU or memory workloads are present, for example, in almost every analytics-driven or data-warehousing environment. In addition, every research organisation has computer-aided design, engineering, or visualisation workloads.
Traditionally these workloads have run on large proprietary symmetric multiprocessing (SMP) systems. These dedicated SMP systems are the dinosaurs of IT: they are expensive, often purchased for dedicated applications or workloads, and difficult to repurpose when they become obsolete.
In the absence of an alternative, many customers have been forced to move their SMP code (OpenMP) to message-passing-optimised code such as MPI. They've done this in order to take advantage of the price and performance benefits of commodity x86 servers. The x86 architecture performs much better than the proprietary RISC architectures, and x86 systems often cost less than 20 per cent as much as equivalent proprietary systems.
The only way to do this with boxes that simply don't have native SMP scale is to create a high-performance cluster. The benefits of a low-cost, high-performance cluster infrastructure, however, come at the cost of manageability. Cluster solutions are very complex: each node has its own OS instance, a clustered file system is required, and, in most cases, moving the code from OpenMP to MPI is not trivial.
Virtualisation for aggregation solutions solves all of these problems.
InfoWorld: And what are some of the benefits that can be enjoyed?
ScaleMP: A virtual SMP enjoys the benefits of the commodity low-cost, high-performance infrastructure without requiring costly software modifications - it will support both OpenMP and MPI codes equally well - but unlike a cluster solution, there is no need for multiple instances of the OS, a clustered file system, separate I/O management, etc.
All of the system node resources are virtualised and optimised to be managed as a single instance. This benefit is so significant that existing high-performance cluster customers have virtualised existing clusters even when doing so offered no cost or performance benefit. What they do enjoy is vastly simplified system management.
Another distinct value for customers considering an x86-based virtual SMP solution is that they can always repurpose the nodes. Because this solution utilises off-the-shelf servers, when the SMP is upgraded or repurposed, the nodes can easily be broken down and reused in another part of the infrastructure.
A cloud solution, for example, is designed specifically to do this: create a dynamic SMP from cloud system resources, run a large SMP workload, and then disaggregate the SMP back into the cloud system pool. This not only provides a tremendous amount of flexibility for cloud IT infrastructure, but it also dramatically improves cloud ROI by extending the number and kind of applications that are able to run in the infrastructure.
InfoWorld: What should companies consider before deploying virtualisation for aggregation?
ScaleMP: The most prevalent reason today for customers to deploy an aggregation solution is simply to take advantage of the power and low-cost infrastructure of commodity systems instead of making a traditional purchase of a large proprietary SMP. However, as customers get a better understanding of their workloads, more and more customers are trying to address these workloads in infrastructure that was never built to run SMP codes.
As enterprise IT looks to deploy cloud infrastructure, it is immediately faced with these applications and workloads. IT can either tell the particular internal customers to purchase their own IT solutions, or search for ways to improve the utility of the existing infrastructure. Since the cloud is enabled by virtualisation, using virtualisation to create a large SMP is an obvious and congruent decision.
The secondary decision for customers looking to deploy virtualisation for aggregation is to determine the application requirements for optimal performance. Does the application perform better with a large number of processing cores? How many? Do application file sizes exceed in-the-box maximums? Application parameters are pretty critical for systems like visualisation. While additional processing cores are beneficial, most of these systems are restricted by available RAM. It's not uncommon to see multiterabyte files rendered in real time. It's critical to get system sizing right for the application -- shape the system to the workload rather than the workload to the system.
InfoWorld: And what are the best practices that should be considered?
ScaleMP: An organisation should identify its target applications before deployment and configure its hardware accordingly. By clearly identifying the target workloads, systems can be properly optimised to suit the particular needs of the application. By using virtualisation technology, customers have the flexibility to create cost-effective "unbalanced" systems: many processing cores with a small amount of RAM, or few processors with a very large amount of RAM. These kinds of systems may be higher performing and, more importantly, far more cost-effective than a balanced system for specific workloads.
InfoWorld: Can virtualisation for aggregation and virtualisation for consolidation be used together? And if so, what is the result?
ScaleMP: These two distinct virtualisation technologies -- one for partitioning and one for aggregation -- seem to address very different use cases: one for multiple parallel workloads and one for high-performance computing, or HPC workloads.
However, these two distinct uses of virtualisation are not mutually exclusive and can be used in tandem to enable a very powerful set of server resource capabilities -- capabilities that empower end-users to shape compute resources to fit their workloads on the fly.
As already mentioned, virtualisation for partitioning and aggregation already reside in the same infrastructure when it comes to cloud deployments. As customers demand different workload services, IT administrators can provision whatever system resources are necessary, within a single system or across a set of systems, to provide the optimal performance and SLAs for their end-user.
InfoWorld: What could this lead to? Is there a future where there can be a virtual machine running on top of a virtual machine?
ScaleMP: Yes. There are two obvious use cases, but as the technology matures there will certainly be more. The first use case places many small VMs on top of a large virtual machine: essentially, a hypervisor for aggregation creates a large SMP out of multiple nodes, and a hypervisor for partitioning then splits that system into multiple VMs. Provisioning smaller VMs from this larger pool provides a better platform for load balancing, dynamic system provisioning, and hardware migration. Ultimately, IT can also dynamically add and remove resources in the pool. In VMware's world, this is called vMotion. There is no equivalent technology to provide this functionality for competing hypervisors; running partitioning VMs on top of a large SMP would provide it.
The second use case creates one large VM out of many small VMs -- partitioning many servers into multiple virtual machines and then aggregating some of those VMs into a larger virtual SMP system. This circumstance may develop when an IT manager has deployed multiple virtual servers taking up a certain amount of CPU power, but is also setting aside CPU capacity in case the need arises. By running a hypervisor for aggregation, an IT administrator can collect and aggregate all this available yet unused computing resource across the data centre to create a new larger or customised system that can be utilised for another workload.
Many organisations have begun to deploy their on-premise resources in a cloud-like fashion. And to realise the full benefits of their private cloud, they want the flexibility to deploy any kind of workload on the fly. With server virtualisation for consolidation, IT could partition its servers to run more workloads. But now, if IT has a workload that requires more CPU power or memory than is found in any given box, it can aggregate resources and provision a virtual SMP. And as Fultheim describes in the two VM-on-top-of-VM use cases, you can begin to see how server virtualisation for aggregation can make the cloud much more elastic.