Virtualisation is a trend that has been gaining popularity for several years. Adoption has increased particularly over the last two to three years, thanks to a growing need to cut costs.
Virtualisation works by taking lots of machines and combining them onto one piece of hardware, where each copy of the operating system runs within its own virtual machine. The question is often asked, "How does this affect the overall performance of my database-centric applications?" The answer is that we have done a lot of benchmarking and testing, and have found that it has a significant effect. To understand why this is, we have to look at why people want to virtualise in the first place. Industry analysts Gartner and IDC agree that the typical server is only utilised 10 to 20 percent of the time. Therefore, 80 to 90 percent of the time that machine is doing nothing other than warming the room.
The promise of virtualisation is to take those spare computing cycles and put them into production in a way that delivers a return on your investment in that hardware. We have seen companies that have put as many as 120 virtual machines on a single piece of hardware. However, it is important to understand the impact of virtualisation on data access. Data access for any application, or more specifically retrieving data from the database, can be very CPU and memory intensive, and accounts for as much as 75 to 95 percent of all the time spent in a data-centric application.
When the CPU is idle 80 percent of the time, you have spare cycles that can offset bad algorithms, bad data access code, or a bad JDBC driver. That cushion disappears when utilisation rises to 80 or 90 percent. Despite the benefits of virtualisation, it has limits, particularly where scalability is concerned. Once the limits of the hardware are tested, inefficient drivers or code can create a bottleneck, and scalability rapidly decreases.
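The arithmetic behind that lost cushion can be sketched roughly as follows. This is a deliberately simple model, not a capacity-planning tool: it assumes the workloads' CPU demand simply adds up on the shared host and it ignores hypervisor overhead. The function name, the 15 percent per-server figure (the midpoint of the 10 to 20 percent range quoted above), the 90 percent ceiling and the 50 percent inefficiency penalty are all illustrative assumptions.

```python
def max_vms(per_vm_percent, overhead_percent=100, ceiling_percent=90):
    """Estimate how many workloads fit on one host before CPU hits the ceiling.

    per_vm_percent:   CPU each workload used on its own dedicated server.
    overhead_percent: hypothetical cost of the data access layer relative to
                      an efficient one (100 = baseline, 150 = 50% more CPU
                      for the same work).
    ceiling_percent:  host utilisation we are not prepared to exceed.

    Integer arithmetic is used so the result is exact. The model assumes
    demand adds linearly and ignores hypervisor overhead.
    """
    # Largest n such that n * per_vm_percent * overhead_percent / 100 <= ceiling_percent
    return (ceiling_percent * 100) // (per_vm_percent * overhead_percent)


efficient = max_vms(15)                          # efficient data access code
inefficient = max_vms(15, overhead_percent=150)  # same work, 50% more CPU
print(efficient, inefficient)                    # prints "6 4"
```

Under these assumptions, a data access layer that burns 50 percent more CPU costs a third of the host's capacity: four workloads fit instead of six. On dedicated servers the same inefficiency would have gone unnoticed, because each machine had idle cycles to absorb it.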
For example, an enterprise might have an application that performs its data access from an application server connected to Oracle, and that runs fine with 100 users. We have seen many times that once that environment is virtualised and excess CPU and memory are no longer available, the application suddenly starts under-performing. It usually turns out that either the data access code (Hibernate, JDBC, .NET, ODBC, etc.) is not written efficiently, or some piece of middleware, typically a driver, is written inefficiently and uses too much CPU or too much memory. The result is that within the virtualised environment, the excessive use of memory, CPU and disk (and sometimes the network) becomes a big issue.
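One common form of inefficient data access code is issuing a separate query per row where a single query would do. The sketch below illustrates the pattern using Python's built-in sqlite3 module purely as a stand-in for the Oracle and JDBC stack described above; the tables, column names and function names are hypothetical, and an in-memory database hides the network round trips that make the pattern far more expensive against a real server.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            total INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 101)],
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, 101)],
    )
    return conn


def totals_row_by_row(conn):
    """Inefficient: one query for the ids, then one more query per customer."""
    statements = 1
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    totals = {}
    for customer_id in ids:
        row = conn.execute(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        statements += 1
        totals[customer_id] = row[0]
    return totals, statements


def totals_single_query(conn):
    """More efficient: let the database aggregate in a single statement."""
    rows = conn.execute(
        "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id"
    ).fetchall()
    return dict(rows), 1


conn = build_demo_db()
slow_totals, slow_statements = totals_row_by_row(conn)
fast_totals, fast_statements = totals_single_query(conn)
print(slow_totals == fast_totals, slow_statements, fast_statements)
```

Both functions return the same totals, but the first executes 101 statements and the second executes one. Each extra statement costs CPU in the application, the driver and the database, which is exactly the kind of overhead that goes unnoticed until spare capacity is no longer available.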
Ten or 15 years ago, when hardware was slower and more expensive, code had to be better written, or at least more efficient. This requirement disappeared when hardware became so much cheaper and faster. With virtualisation, those same problems become important again: code has to be better, algorithms must be better, and the database middleware must be better. In fact, every component of the stack has to be better if the fruits of virtualisation are to be realised. The cost of inefficient techniques becomes clear when you consider the scalability that is lost, since scalability is usually the reason for virtualising in the first place. The good practices discussed in The Data Access Handbook are even more essential for successful virtualisation.
Progress DataDirect Technologies’ Vice President of Research and Development, Rob Steward, uses his 15 years of experience developing database access middleware to oversee the development and strategy of DataDirect’s data connectivity products. He is also responsible for developing DataDirect Sequelink, DataDirect XQuery and DataDirect Shadow. Rob has spoken on data access performance at many industry events, including Microsoft PDC, Devscovery, WinDev and Virtualization World. Rob is also co-author of The Data Access Handbook.