How to benchmark virtual machine performance

I am trying to perform a fair comparison of XenServer vs ESX and one comparison I would like to make is performance with multiple VMs. Does anyone know how to go about benchmarking VM performance in a fair way?
On each server I would like to run a fixed number of XP/Vista VMs (for example 8) and have some measure of how quickly each one runs when under load. Ideally I would like some benchmark of the overall system (CPU/Memory/Disk/Network) rather than just one aspect.
It seems to me this is actually a very tricky thing to do in a way that yields meaningful results, so I would be grateful for any suggestions!
I would also be interested to see any existing reports or comparisons that have been published (preferably independent rather than vendor-biased!)

As a general answer, VMware (together with other virtualization vendors in the SPEC Virtualization sub-committee) has put together a hypervisor benchmarking suite called VMmark that is available for download. The VMmark website discusses why this benchmark may be useful for comparing hypervisors, including an FAQ and a whitepaper describing the benchmark.
That said, if you are looking for something very specific (e.g., how it will perform under your workload), you may have to roll your own variants of VMmark, especially if you are not trying to do the sorts of things that VMmark benchmarks (e.g., web servers, database servers, file servers). Nonetheless, the methodology behind its development should be of interest.
Disclaimer: I work for VMware, though not on VMmark.

I don't see why you can't use common benchmarks inside the VMs: WinSAT, Passmark, Futuremark, SiSoftware, etc. Run the same set of VMs on each host and see how it goes.
As an aside, benchmarks that don't closely match your intended usage may actually hinder your evaluation. Depending on the importance of getting this right, you may have to build your own to make it relevant.
Why do you want to bench?
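If you do end up building your own, even a small script run simultaneously inside each guest gives you numbers you can compare across hosts. A rough sketch (plain Python; nothing here is hypervisor-specific, and the loop and buffer sizes are arbitrary, so tune them to your hardware):

    import time

    def timed(label, fn):
        # Run fn once and report wall-clock time as seen inside the guest.
        start = time.perf_counter()
        fn()
        print(f"{label}: {time.perf_counter() - start:.2f} s")

    def cpu_work():
        # Integer-heavy loop; scales with the CPU time the hypervisor gives this VM.
        total = 0
        for i in range(20_000_000):
            total += i * i
        return total

    def memory_work():
        # Allocate ~200 MB and copy it repeatedly to exercise memory bandwidth.
        block = bytearray(200 * 1024 * 1024)
        for _ in range(10):
            bytes(block)

    if __name__ == "__main__":
        timed("cpu", cpu_work)
        timed("memory", memory_work)

Run it in all the guests at once (and again in a single idle guest) and you get a crude but comparable picture of how each hypervisor behaves under contention.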

How about some anecdotal evidence?
I'm going to assume this is a test environment, because you're wanting to benchmark on XP/Vista. Please correct me if I'm wrong.
My current test environment is about 20 VMs with varying OSes (2000/XP/Vista/Vista64/Server 2008/Server 2003) in different configurations on a dual quad-core Xeon machine with 8 GB of RAM (looking to upgrade to 16 GB soon). The slowest machines of all are the Vista ones, primarily due to heavy disk access (and that is with Windows Defender disabled).
Recommendations
- Hardware RAID. Too painful to run Vista VMs otherwise.
- More RAM.
If you're benchmarking and looking to run Vista VMs, I would suggest focusing on benchmarking disk access. If there are going to be performance differences elsewhere, I doubt they will be significant.
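A quick, repeatable way to measure that inside each guest is a small script that writes and reads back a test file with an fsync, so the hypervisor's storage path is actually exercised. A rough sketch (Python; the file size is just an example, and the read pass may be served from the guest's page cache unless the file is larger than guest RAM):

    import os
    import time

    TEST_FILE = "disk_bench.tmp"      # created in the current directory
    BLOCK = b"x" * (1024 * 1024)      # 1 MB per write
    BLOCKS = 256                      # ~256 MB total

    def write_speed():
        start = time.perf_counter()
        with open(TEST_FILE, "wb") as f:
            for _ in range(BLOCKS):
                f.write(BLOCK)
            f.flush()
            os.fsync(f.fileno())      # push the data through the virtual disk, not just the cache
        return BLOCKS / (time.perf_counter() - start)

    def read_speed():
        # May be satisfied from the guest's page cache if the file fits in RAM.
        start = time.perf_counter()
        with open(TEST_FILE, "rb") as f:
            while f.read(len(BLOCK)):
                pass
        return BLOCKS / (time.perf_counter() - start)

    if __name__ == "__main__":
        print(f"write: {write_speed():.1f} MB/s")
        print(f"read:  {read_speed():.1f} MB/s")
        os.remove(TEST_FILE)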

I recently came across VMware's ESX Performance documentation - http://www.vmware.com/pdf/VI3.5_Performance.pdf. There is quite a bit in there on improving performance and benchmarking.

Related

Linux/Windows performance comparison of MATLAB against another library using containers

I need to compare the performance of MATLAB and an open-source alternative (e.g. numpy/scipy) on particular matrix problems on both Windows and Linux. It has been expressly asked that the comparison be executed strictly on the same hardware platform, for obvious comparison reasons.
My question is, would using containers (Windows and Linux distro images) satisfy this requirement?
I believe setting up two VMs would satisfy the requirement, and using containers would be much less of a hassle and would make the tests easily reproducible on any machine, but I'm not too familiar with their architecture or how they access their host's hardware.
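The kind of test I have in mind, on the numpy side, is along these lines (a rough sketch; the matrix sizes are arbitrary, and the MATLAB side would time the equivalent operations):

    import time
    import numpy as np

    def best_of(fn, repeats=5):
        # Take the best of several runs to smooth out container/VM noise.
        return min(run_once(fn) for _ in range(repeats))

    def run_once(fn):
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        a = rng.standard_normal((2000, 2000))
        b = rng.standard_normal((2000, 2000))

        print(f"matmul: {best_of(lambda: a @ b):.3f} s")
        print(f"solve:  {best_of(lambda: np.linalg.solve(a, b)):.3f} s")
        print(f"svd:    {best_of(lambda: np.linalg.svd(a, compute_uv=False)):.3f} s")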
Thanks in advance for any help.
Essentially, when talking about performance tests, you'd want your test machines to be as close as possible to the final end-use machine.
While virtualization and containerization might offer different advantages in terms of ease of setup, reusability and whatever else, they also introduce a degree of distortion in the metrics obtained in such tests due to their architecture and implementation.
I don't know the exact degree to which a performance test run in a Windows container would differ from one run directly on a Windows machine, or whether such a difference is trivial, but the first paragraph answers my question.

Testing perceived performance

I recently got a shiny new development workstation. The only disadvantage of this is that the desktop apps I'm developing now run very, very fast, and so I fear that parts of the code that would be annoyingly slow on end users' machines will go unnoticed during my testing.
Is there a good way to slow down an application for testing? I've tried searching around, but all of the results I've been able to find seem pretty fiddly to set up (e.g., manually setting up a high-priority CPU-bound task on the same CPU core as the target app, or running a background process that rapidly interrupts and resumes the target app), and I don't know if the end result is actually a good representation of running on a slower computer (with its slower CPU, slower RAM, slower disk I/O...).
I don't think that this is a job for a profiler; I'm interested in the user's perception of end-to-end performance rather than in where the time goes for particular operations.
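For concreteness, the fiddly kind of setup I was referring to looks roughly like this (a sketch using psutil, which supports CPU pinning on Windows and Linux; the PID is passed in and the core number is arbitrary, and I'm still not convinced the result resembles a genuinely slower machine):

    import multiprocessing
    import sys

    import psutil  # third-party: pip install psutil

    def busy_loop():
        # Burn CPU forever; this competes with the app under test for its core.
        while True:
            pass

    if __name__ == "__main__":
        target_pid = int(sys.argv[1])            # PID of the app under test

        # Confine the app under test to a single core...
        psutil.Process(target_pid).cpu_affinity([0])

        # ...and start a CPU hog pinned to the same core so they fight for time.
        hog = multiprocessing.Process(target=busy_loop, daemon=True)
        hog.start()
        psutil.Process(hog.pid).cpu_affinity([0])

        input("CPU hog running; press Enter to stop and restore normal speed...")
        # Give the app all cores back afterwards.
        psutil.Process(target_pid).cpu_affinity(list(range(psutil.cpu_count())))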
Set up a virtual machine, give it as little RAM as needed, and you can also have it use 1, 2, or more CPUs. I like VirtualBox myself; install your app and test with different RAM configurations.
Personally, I'd get an old used crappy computer that is typical of what the users have and test on that. It should be cheap and you will see pretty fast how bad things are.
I think the only way to deal with this is through proper end-user testing, i.e. get yourself a "typical" system for testing and use that to identify any perceptible performance bottlenecks.
You can try out either Virtual PC or VMWare Player/Workstation, load an OS onto it, and then throttle back the resources. I know that with any of those tools you can reduce the memory to whatever you'd like. You can also specify the number of cores you want to use. You might even be able to adjust the clock speed in VMWare Workstation... I'm not sure.
I upvoted SQLMenace's answer, but I also think that profiling needs to be mentioned: no matter how quickly the code is executing, you'll still see what's taking the most time. If you find yourself with some free time, I think profiling and investigating the results is a good way to spend it.

Analogs of Intel's Cluster OpenMP

Are there analogs of Intel Cluster OpenMP? This library simulates a shared-memory machine (like an SMP or NUMA system) while running on a distributed-memory machine (like an Ethernet-connected cluster of PCs).
The library allows OpenMP programs to be started directly on a cluster.
I am searching for:
- libraries that allow multithreaded programs to run on a distributed cluster,
- or libraries (replacements for e.g. libgomp) that allow OpenMP programs to run on a distributed cluster,
- or compilers capable of generating cluster code from OpenMP programs, besides Intel C++.
The keyword you want to be searching for is "distributed shared memory"; there's a Wikipedia page on the subject. MOSIX, which became openMOSIX, which is now being developed as part of LinuxPMI, is the closest thing I'm aware of; but I don't have much experience with the current LinuxPMI project.
One thing you need to be aware of is that none of these systems work especially well, performance-wise. (Maybe a more optimistic way of saying it is that it's a tribute to the developers that these things work at all). You can't just abstract away the fact that accessing on-node memory is very very different from memory on some other node over a network. Even making local memory systems fast is difficult and requires a lot of hardware; you can't just hope that a little bit of software will hide the fact that you're now doing things over a network.
The performance ramifications are especially important when you consider that OpenMP programs you might want to run are almost always going to be written assuming that memory accesses are local and thus cheap, because, well, that's what OpenMP is for. False sharing is bad enough when you're talking about different sockets accessing a common cache line - page-based false sharing across a network is just disastrous.
Now, it could well be that you have a very simple program with very little actual shared state, and a distributed shared memory system wouldn't be so bad -- but in that case I've got to think you'd be better off in the long run just migrating the problem away from a shared-memory-based model like OpenMP towards something that'll work better in a cluster environment anyway.

Platforms for running memcached

Is there any reason in particular why it's recommended to run memcached on a Linux server? Is it really that bad an idea to run it on a Windows Server box? What about an OS X Server box?
The biggest reason that I read is about TCO. In other words, for each windows box that we run memcached on, we have to buy a copy of Windows Server and those costs add up. The thing is that we have several servers that have older processors but a lot of RAM - perfect for memcached use. All of these boxes already have Windows Server 2003 installed on them, so there's not really much savings to installing Linux. Are there any other compelling reasons to use Linux?
This question is really "What are the advantages of Linux as a server platform?" I'll give a few of the standard answers:
- Easier to administer remotely (no need for RDP, etc.). Everything can be scripted or done via the CLI.
- Distributions like the Ubuntu LTS (Long Term Support) versions guarantee security updates for years with zero software cost. Updates can easily be installed via the command line, and generally don't require a reboot.
- Higher performance. Linux is generally considered to offer "more bang for the buck" on a given piece of hardware. This is generally due to lower resource requirements.
- Lower resource requirements. Linux runs just great on 256 MB or less of RAM, and on very small CPUs.
- Breadth of available software & utilities.
- It's free. (As in beer.)
- It's free. (As in freedom.) This means you can see, change, and file bugs against the code you're running, and talk directly with the developers.
Remember that TCO includes the amount of time that you (the administrator) spend maintaining the machine. Linux has a lower TCO because it's easier to maintain, and you can spend your time doing something other than administering a server...
Almost all of the FAQs and HOWTOs are written from a Linux point of view. Memcached was originally created only for Linux; the ports came later. There is a port to Windows, but it's not yet in the official memcached distribution. Memcached on Windows is still guerrilla style. For example, there is no memcached for x64 Windows.
As for memcached on Mac OS X servers: a niche of a niche of a niche.
There doesn't seem to be any technical disadvantage to running it on Windows. It's mainly a cost thing. If the licenses are just sitting around unused, there's probably no disadvantage at all. I do recall problems with memory leaks in some older Windows APIs, particularly the TCP stuff, but presumably that is all fixed in modern Windows.
If you are deploying memcached you probably have a fairly significant infrastructure (many, many machines already deployed). Even if you are dedicating new machines to memcached, you'll want to run some other software on them for system management, monitoring, hardware support etc. This software may be customised by your team for your infrastructure.
Therefore, your OS platform choice will be guided by what your operations team and hardware vendor will support for use in production.
The cost of a few Windows licences is probably fairly immaterial and you probably have a bulk subscription already - in fact the servers may be ordered with Windows licences already on them.
Having said that, you will definitely want a 64-bit OS if you're running memcached - using a 32-bit OS is not clever and will mean that most of your RAM cannot be used (you'll be limited to around 3 GB depending on the OS).
I'm assuming that if you're deploying memcached, you'll be doing so on hardware with LOTS of ram - it's pretty pointless otherwise, after all.

Database/Web servers as a Virtual Machines vs Bare Metal?

I manage a database (Oracle 8i) and web server (IIS) for about 50 simultaneous users on average, with a theoretical limit of 100 simultaneous users. A mid-level system.
We just got a dual-socket, quad-core Xeon beast with 16 GB RAM and SAS RAID 10, and I am exploring the possibility of taking these two separate servers and merging them into two virtual machines, both running on the new server (Windows Server 2008 Hyper-V).
1) In general, what are the performance penalties (as well as any gotchas and hidden consequences) of running both the database and web server as virtual machines on one mega server vs running them on two separate slower boxes? Is it a big NO-NO, or is it something worth trying for a mid-level system that will never need to scale?
2) What are the general performance penalties (in percentage) and gotchas for virtualizing just the database server? We run Oracle 8i (but are considering moving to MS SQL Server).
3) If only stress tests can determine a reasonable answer, what would be the easiest way to test these scenarios (tools / configuration)?
Thanks in advance for any generous knowledge-sharing.
If you are looking to do this, I would check Microsoft's site and their best practices on how to do it. There is a podcast on Deep Fried Bytes that talks about how the Microsoft.com site is set up to use virtual servers and some of their practices for implementing it. They don't seem to suffer performance penalties from how they run it, but I am not certain of the details (it also covers how they use server virtualization the way a real organization would, rather than a company with unlimited amounts of money to throw at a problem).
I believe this is the podcast:
http://deepfriedbytes.com/podcast/episode-8-behind-the-scenes-at-microsoft-com/
With regards to databases, see this question:
Virtualized SQL Server: Why not?
Note that this is specific to SQL Server, but many of the same principles will apply to Oracle.
As for web servers, virtualization is a great idea. It can make it easier to increase reliability and scalability.
I think at your level of concurrent user connections, and the power of the machine, you won't have too many performance issues running SQL Server on a VM.
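If you want numbers rather than opinions for your question 3), even a crude concurrent load test against the web tier, run before and after virtualizing, will tell you most of what you need to know. A minimal sketch using only the Python standard library (the URL and concurrency figures are placeholders for your own app):

    import concurrent.futures
    import statistics
    import time
    import urllib.request

    URL = "http://your-iis-server/typical-page"   # placeholder: a representative page of your app
    USERS = 50                                    # roughly your average concurrent load
    REQUESTS_PER_USER = 20

    def one_user(_):
        # Each simulated user fetches the page repeatedly and records its latency.
        latencies = []
        for _ in range(REQUESTS_PER_USER):
            start = time.perf_counter()
            with urllib.request.urlopen(URL, timeout=30) as resp:
                resp.read()
            latencies.append(time.perf_counter() - start)
        return latencies

    if __name__ == "__main__":
        with concurrent.futures.ThreadPoolExecutor(max_workers=USERS) as pool:
            all_latencies = [t for user in pool.map(one_user, range(USERS)) for t in user]

        all_latencies.sort()
        print(f"requests:        {len(all_latencies)}")
        print(f"median latency:  {statistics.median(all_latencies) * 1000:.0f} ms")
        print(f"95th percentile: {all_latencies[int(len(all_latencies) * 0.95)] * 1000:.0f} ms")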
We have a mix of VMware ESX VMs and bare-metal OSes running app, web, and DB servers, and without a doubt the most heavily loaded DBMS is on a bare-metal machine (quad-proc, quad-core, etc.). All the little guys, though, live on VMs, and we haven't noticed any problems (even using iSCSI over GigE).
One thing to consider is that you won't get any fault tolerance out of a single setup like this, because a CPU failure will bring down the entire box, taking your whole app with it.
More info on SQL Server HA and Hyper-V, just FYI:
http://blogs.technet.com/andrew/archive/2008/11/10/sql-server-2008-hyper-v-and-high-availability.aspx
Be aware that Oracle has its own guidelines on running in a virtual machine.
The product I work with utilizes Oracle on the back-end, and for heavy use, the overhead of a VM has had negative effects on it.
8i is well past EOL, and was around before virtualization was a Big Thing(tm), so moving to a newer edition of Oracle might also be a good plan at the time you migrate to virtualization.
Oracle blog article on 11g in a VM - http://blogs.oracle.com/MingMan/2007/11/oracle_database_11g_successful.html
If you're concerned about timing, also be aware of known clock-drift issues in hypervisors, and available fixes (either from the OS or virtualization vendors).
I recently came across an article dealing with Virtualization Security. I thought it would be worth mentioning here.
