Virtualizing XenApp on XenServer 5.0 and ESX 3.5
There has always been interest in running Citrix XenApp (formerly Citrix Presentation Server) workloads on the VMware Virtual Infrastructure platform. With the advent of multi-core systems, purchasing decisions are driven towards systems with 4-16 cores. However, using this hardware effectively is difficult due to limited scaling of the XenApp application environment. In addition to the usual benefits of virtualization, these scaling issues make running XenApp environments on ESX even more compelling.
We recently ran some performance tests to understand what performance can be expected from a virtualized XenApp workload. The results show that ESX runs common desktop applications on XenApp with reasonable overhead compared to a native installation, and with significantly better performance than XenServer. We hope this data will help provide guidance when XenApp environments are transitioned from physical hardware to a virtualized environment.
Together with partners, we have been developing a desktop workload for over a year. The workload has been tested extensively on virtual desktop infrastructure (VDI) environments with one user per virtual machine (VM). VDI results have been presented and published in numerous locations (e.g., http://www.vmware.com/resources/techresources/1085, VMworld 2008 presentation VD2505 with Dell-EqualLogic). Great attention was paid to selecting the most relevant applications as well as to specifying the right types and amount of work each should do. Many other Terminal Services-style benchmarks fail to be representative of actual desktop users. Porting the workload from a VDI environment to the XenApp environment was straightforward.
XenApp was run in a single VM with 14 GB of memory and 2 vCPUs, booted with Windows Server 2003 x64. The hypervisors used were ESX 3.5 U3 and XenServer 5.0. Both VMs had the appropriate tools/drivers installed, and the XenServer VM had the Citrix XenApp optimization enabled. For comparison, the tests were also run natively with the OS restricted to the same hardware resources. The hardware was an HP DL585 with 4 quad-core 2210 MHz “Barcelona” processors and 64 GB memory. Rapid Virtualization Indexing (RVI) was enabled.
The test consists of 22 operations, always executed in the following order:
IE_OPEN_2          Open Internet Explorer
IE_ALBUM           Browse photos in IE
EXCEL_OPEN_2       Open Excel file
EXCEL_FORMULA      Evaluate formula in Excel
EXCEL_SAVE_2       Save Excel file
FIREFOX_OPEN       Open Firefox
FIREFOX_CLOSE      Close Firefox
ACROBAT_OPEN_1     Open PDF file
ACROBAT_BROWSE_1   Browse PDF file
PPT_OPEN           Open PowerPoint file
PPT_SLIDESHOW      Slideshow in PowerPoint
PPT_EDIT           Edit PowerPoint file
PPT_APPEND         Append to PowerPoint file
PPT_SAVE           Save PowerPoint file
WORD_OPEN_1        Open Word file
WORD_MODIFY_1      Modify Word file
WORD_SAVE_1        Save Word file
IE_OPEN_1          Open Internet Explorer
IE_APACHE          Browse Apache doc in IE
EXCEL_OPEN_1       Open Excel file
EXCEL_SORT         Sort column in Excel
EXCEL_SAVE_1       Save Excel file
The two horizontal lines labeled “QoS” denote the Native latency for 35 and 38 users. Either of these may be considered a reasonable maximum Quality of Service for latency. They correspond to somewhat less than and somewhat more than half of the available CPU resources, respectively (see the CPU figure below); half is a commonly used target for XenApp. At higher utilizations, not only does the latency increase rapidly, but operations may start to fail. We required that all operations succeed (just like a real user expects!) for a test to be deemed successful. The points where the QoS lines cross the ESX and XenServer curves give the number of users that can be supported with the same total latency. Normalizing by the number of Native users (35 or 38) gives the fraction of Native users each virtualization product can support at the given total latency.
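The crossing-point calculation described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used for these tests: it assumes the latency curve is available as (users, total latency) samples and that linear interpolation between neighboring samples is acceptable. The function names and the sample numbers are hypothetical and are not measured data.

```python
from typing import Sequence


def users_at_latency(users: Sequence[float],
                     latency: Sequence[float],
                     qos: float) -> float:
    """Interpolate the user count at which a latency curve first reaches `qos`.

    `users` and `latency` are parallel samples of one curve, ordered by
    increasing user count. Linear interpolation is an assumption of this sketch.
    """
    points = list(zip(users, latency))
    for (u0, l0), (u1, l1) in zip(points, points[1:]):
        if l0 <= qos <= l1:
            if l1 == l0:
                # Flat segment exactly at the QoS level: take its left end.
                return float(u0)
            return u0 + (qos - l0) * (u1 - u0) / (l1 - l0)
    raise ValueError("QoS line does not cross the sampled curve")


def fraction_of_native(virtual_users: float, native_users: float) -> float:
    """Normalize a virtualized user count by the Native user count."""
    return virtual_users / native_users


if __name__ == "__main__":
    # Made-up curve for illustration only (not results from this study).
    sample_users = [10, 20, 30]
    sample_latency = [1.0, 2.0, 4.0]
    supported = users_at_latency(sample_users, sample_latency, qos=3.0)
    print(supported, fraction_of_native(supported, native_users=50))
```

Here the QoS level plays the role of the Native latency at 35 or 38 users, and the returned fraction corresponds to the normalized user count discussed in the text.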
ESX consistently supports about 86% of the native number of users, while XenServer supports about 77%. Shown below is the average CPU utilization during the second through fifth iterations of the last user, given as a percentage of a single core. Perfmon was used for Native, esxtop for ESX, and xentop for XenServer. ESX uses less CPU than XenServer whether the comparison is made for a given number of users or for a given total latency.
XenApp and other products that virtualize applications are prime candidates to be run in a VM. These results show that ESX can do so efficiently compared to using a physical machine. This was shown with a benchmark that represents a real desktop workload, uses a metric that includes the latencies of all operations, and requires that all operations complete successfully. Furthermore, ESX supports about 13% more users than XenServer at a given latency while using less CPU.