There has always been interest in running Citrix XenApp (formerly Citrix Presentation Server) workloads on the VMware Virtual Infrastructure platform. With the advent of multi-core systems, purchasing decisions are driven towards systems with 4-16 cores. However, using this hardware effectively is difficult due to limited scaling of the XenApp application environment. In addition to the usual benefits of virtualization, these scaling issues make running XenApp environments on ESX even more compelling.
We recently ran some performance tests to understand what can be expected in terms of performance for a virtualized XenApp workload. The results show that ESX runs common desktop applications on XenApp with reasonable overhead compared to a native installation, and with significantly better performance than XenServer. We hope this data will help provide guidance when XenApp environments are transitioned from physical hardware to a virtualized environment.
Together with partners, we have been developing a desktop workload for over a year. The workload has been tested extensively on virtual desktop infrastructure (VDI) environments with one user per virtual machine (VM). VDI results have been presented and published in numerous locations (e.g., http://www.vmware.com/resources/techresources/1085, VMworld 2008 presentation VD2505 with Dell-EqualLogic). Great attention was paid to selecting the most relevant applications as well as to specifying the right types and amount of work each should do. Many other Terminal Services-style benchmarks fail to be representative of actual desktop users. Porting the workload from a VDI environment to the XenApp environment was straightforward.
XenApp was run in a single 14 GB, 2-vCPU VM booted with Windows Server 2003 x64. The hypervisors used were ESX 3.5 U3 and XenServer 5. The VMs for both had the appropriate tools/drivers installed. The XenServer VM had the Citrix XenApp optimization enabled. For comparison, the tests were also run natively with the OS restricted to the same hardware resources. The hardware was an HP DL585 with 4 quad-core 2210 MHz “Barcelona” processors and 64 GB of memory. Rapid Virtualization Indexing (RVI) was enabled.
The test consists of 22 operations, always executed in the following order:
Operation | Description
IE_OPEN_2 | Open Internet Explorer
IE_ALBUM | Browse photos in IE
EXCEL_OPEN_2 | Open Excel file
EXCEL_FORMULA | Evaluate formula in Excel
EXCEL_SAVE_2 | Save Excel file
FIREFOX_OPEN | Open Firefox
FIREFOX_CLOSE | Close Firefox
ACROBAT_OPEN_1 | Open PDF file
ACROBAT_BROWSE_1 | Browse PDF file
PPT_OPEN | Open PowerPoint file
PPT_SLIDESHOW | Slideshow in PowerPoint
PPT_EDIT | Edit PowerPoint file
PPT_APPEND | Append to PowerPoint file
PPT_SAVE | Save PowerPoint file
WORD_OPEN_1 | Open Word file
WORD_MODIFY_1 | Modify Word file
WORD_SAVE_1 | Save Word file
IE_OPEN_1 | Open Internet Explorer
IE_APACHE | Browse Apache doc in IE
EXCEL_OPEN_1 | Open Excel file
EXCEL_SORT | Sort column in Excel
EXCEL_SAVE_1 | Save Excel file
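Each run of the test produces one latency per operation in the list above. As a rough illustration of how such a run could be reduced to a single number, here is a minimal Python sketch. It assumes the metric is simply the sum of the 22 per-operation latencies and that any failed operation invalidates the whole run; the exact formula used in our tests is not spelled out here, and the names `OperationResult` and `total_latency` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

EXPECTED_OPERATIONS = 22  # the test always executes 22 operations


@dataclass
class OperationResult:
    name: str          # operation label, e.g. "IE_OPEN_2"
    latency_s: float   # measured latency in seconds
    succeeded: bool    # whether the operation completed


def total_latency(results: List[OperationResult]) -> Optional[float]:
    """Sum the latencies of one run, or return None if the run is invalid.

    Assumption: the metric is a plain sum, and a run only counts when
    all 22 operations are present and every one of them succeeded.
    """
    if len(results) != EXPECTED_OPERATIONS:
        return None
    if not all(r.succeeded for r in results):
        return None
    return sum(r.latency_s for r in results)


# Made-up data for illustration only, not measurements from the tests.
ok_run = [OperationResult(f"OP_{i}", 1.5, True) for i in range(EXPECTED_OPERATIONS)]
failed_run = ok_run[:-1] + [OperationResult("EXCEL_SAVE_1", 1.5, False)]

print(total_latency(ok_run))      # valid run: a single total in seconds
print(total_latency(failed_run))  # invalid run: one operation failed
```

Returning `None` rather than a partial sum keeps a failed run from looking artificially fast.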
The two horizontal lines labeled “QoS” denote the Native latency for 35 and 38 users. Either may be considered a reasonable maximum Quality of Service for latency. They correspond to somewhat less than and somewhat more than half of the available CPU resources, respectively (see the CPU figure below), which is a commonly used target for XenApp. At higher utilizations, not only does the latency increase rapidly, but operations may start to fail. We required that all operations succeed (just like a real user expects!) for a test to be deemed successful. The points where the QoS lines cross the ESX and XenServer curves give the number of users that can be supported with the same total latency. Normalizing by the number of Native users (35 or 38) gives the fraction of Native users each virtualization product can support at the given total latency.
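The crossing-point calculation described above can be sketched in a few lines of Python. This is only an illustration: it assumes linear interpolation between measured points, the function name `users_at_latency` is hypothetical, and the numbers are made up rather than taken from our measurements.

```python
from typing import Sequence


def users_at_latency(users: Sequence[float],
                     latency: Sequence[float],
                     target: float) -> float:
    """Interpolate the user count at which a latency curve reaches `target`.

    Assumes `users` is sorted ascending and `latency` does not decrease
    as users are added. Linear interpolation is an assumption of this sketch.
    """
    points = list(zip(users, latency))
    for (u0, l0), (u1, l1) in zip(points, points[1:]):
        if l0 <= target <= l1:
            if l1 == l0:
                return float(u0)
            return u0 + (target - l0) * (u1 - u0) / (l1 - l0)
    raise ValueError("target latency is outside the measured range")


# Made-up curves (user count, total latency in seconds), NOT the real data.
native_users = [30, 35, 40]
native_latency = [100.0, 110.0, 130.0]
virt_users = [25, 30, 35]
virt_latency = [95.0, 105.0, 125.0]

# The QoS line is the Native total latency at a chosen user count (here 35).
qos_users = 35
qos_latency = native_latency[native_users.index(qos_users)]

# Where the virtualized curve crosses the QoS line, and the normalized fraction.
supported = users_at_latency(virt_users, virt_latency, qos_latency)
fraction_of_native = supported / qos_users

print(f"{supported:.2f} users, {fraction_of_native:.1%} of Native")
```

The same function can be applied once per hypervisor curve and once per QoS line (35 and 38 Native users) to produce each normalized fraction.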
ESX consistently supports about 86% of the native number of users, while XenServer supports about 77%. Shown below is the average CPU utilization during the second through fifth iterations of the last user, given as a percentage of a single core. Perfmon was used for Native, esxtop for ESX, and xentop for XenServer. ESX uses less CPU than XenServer no matter how the comparison is made, whether for a given number of users or for a given total latency:
XenApp and other products that virtualize applications are prime candidates to be run in a VM. These results show that ESX can do so efficiently compared to using a physical machine. This was shown with a benchmark that represents a real desktop workload, uses a metric that includes the latencies of all operations, and requires that all operations complete successfully. Furthermore, ESX supports about 13% more users than XenServer at a given latency while using less CPU.