A few weeks ago, Johanna Holopainen blogged about virtualizing voice and other real-time applications. I want to continue that discussion by looking at the performance requirements for real-time applications in a virtualized environment.
First, let’s be clear about the term “real time,” because it means different things in different settings. Computer scientists use “hard real-time” for a system that must respond within a given time frame to avoid catastrophic results: think pacemakers, anti-lock brakes, aircraft control systems. Real-time is about predictability: low latency and low jitter. Voice over IP (VoIP) tolerates more jitter than true real-time applications, so it can be thought of as “soft real-time” or “near real-time”: nobody dies if a few voice packets arrive late (although if you’re responsible for a large VoIP implementation, too much latency may not be healthy for your career). In the rest of this blog, when I say “real-time,” think “near real-time.”
Real-time and virtualization are somewhat at odds with each other. Virtualization spreads computing resources across multiple virtual machines as a way to increase utilization and flexibility. Real-time applications dedicate specific resources to reduce computing overhead and ensure low latency. Not too long ago, running real-time applications in a virtualized environment was risky. However, as we will see, it’s different now, thanks to developments in hypervisor design and hardware-assisted memory management.
First-generation hypervisors, VMware’s included, focused on IT efficiency, resource utilization, availability and other business-oriented benefits appropriate to enterprise applications. While they achieved these goals in spectacular fashion, extremely low latency and real-time performance were not possible in these early hypervisor versions. That had to do primarily with all of the functions that the hypervisor had to perform. While the impact of these functions was small enough not to affect the performance of enterprise applications, it was too large for real-time workloads.
Starting with vSphere 4, VMware improved the performance of the hypervisor and worked with Intel and AMD to incorporate virtualization functions in their CPUs so that real-time applications can run virtualized. For voice, these improvements have cut latency by a factor of as much as 4 to 5. I’ll talk more about hypervisor design technology in a future blog, but here’s a white paper that covers the subject in detail.
Hardware-assisted Virtualization Technology (VT)
When x86 hypervisors first came to market, they had to use specialized techniques to allow multiple operating systems to execute privileged code on the physical cores hosting the virtual machines. VMware’s foundation technology for this is Binary Translation (BT). BT solved the problem of letting multiple virtual machines share the same physical CPUs, and VMware then worked with leading x86 processor vendors to improve the performance of executing this privileged code. The resulting hardware support is called Virtualization Technology (VT), available as Intel VT-x and AMD-V. VT is a key enabler for hosting real-time applications.
Hardware-assisted Memory Management
Memory management is one of the primary bottlenecks affecting the performance of real-time virtual machines. Intel and AMD have added hardware assistance for memory management to their server CPU architectures. This feature, called Extended Page Tables (EPT) in Intel Nehalem processors and Rapid Virtualization Indexing (RVI) in AMD processors, lets the CPU translate guest memory addresses in hardware instead of relying on page tables that the hypervisor maintains in software. That reduces memory management overhead, which cuts latency for real-time applications.
With VoIP, a common real-time application, the average latency matters less than the maximum, or worst-case, latency. Qualitative testing (testing with real people giving their feedback) found that as long as the worst-case latency stayed under 100 ms, VoIP call quality was still perceived to be good.
The figure below shows benchmark test results for two different Intel CPUs, the Intel 5450 (Harpertown) and the Intel 5560 (Nehalem) with EPT, running a VoIP test workload. Both tests used a 4-vCPU VM with a VMXNET3 network adapter on vSphere 4. The improvement with EPT is impressive: worst-case latency is well below the 100-millisecond level required for good call quality (note the 20X scale difference between the two diagrams).
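To make the average-versus-worst-case point concrete, here is a minimal Python sketch. It assumes you already have per-packet latency measurements in milliseconds. The function name `summarize_latency`, the constant name and the sample values are all hypothetical and are not taken from the benchmark; only the 100 ms threshold comes from the testing described above.

```python
from statistics import mean

# Threshold from the qualitative testing described in the text.
WORST_CASE_BUDGET_MS = 100.0


def summarize_latency(samples_ms):
    """Return average and worst-case latency, and whether the worst case fits the budget."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    worst = max(samples_ms)
    return {
        "average_ms": mean(samples_ms),
        "worst_case_ms": worst,
        # Call quality is judged on the worst case, not the average.
        "within_budget": worst < WORST_CASE_BUDGET_MS,
    }


# Hypothetical samples: mostly fast, with a single long stall.
samples = [12.0, 15.0, 11.0, 14.0, 180.0, 13.0]
report = summarize_latency(samples)
print(report)
```

In this made-up data set the average stays well under 100 ms, yet the single 180 ms stall pushes the worst case over the threshold, which is exactly the situation an average would hide.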
So here’s what you need to know about running voice applications in a virtual environment:
- The hypervisor scheduler must be optimized for real-time (which is the case for vSphere 4.0 and later)
- The processor must have hardware-assisted memory management (EPT for Intel, RVI for AMD)
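If you want a quick way to check the processor requirement, the sketch below shows one possible approach. It assumes a Linux machine, where the kernel lists CPU features in /proc/cpuinfo (`vmx` for Intel VT-x, `svm` for AMD-V, `ept` for Intel EPT, and `npt` for AMD nested paging, which is RVI). This is not how vSphere itself reports these features, and the function name and sample text are hypothetical.

```python
def virtualization_flags(cpuinfo_text):
    """Report hardware virtualization support found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: features appear on a "flags" line; some kernels also
        # list VMX features on a separate "vmx flags" line.
        if sep and key.strip() in ("flags", "vmx flags"):
            flags.update(value.split())
    return {
        "hardware_virtualization": bool(flags & {"vmx", "svm"}),
        "hardware_memory_management": bool(flags & {"ept", "npt"}),
    }


# Hypothetical excerpt; on a real Linux host you would read /proc/cpuinfo instead.
sample = "model name\t: Example CPU\nflags\t\t: fpu vme vmx ept vpid\n"
result = virtualization_flags(sample)
print(result)
```

On a real Linux host, the text could come from `open("/proc/cpuinfo").read()`. Keep in mind that firmware settings can also disable these features, so treat the result as a hint rather than a guarantee.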
In case you think this is just a theoretical discussion, check out Mitel Virtual Solutions, a family of unified communications (UC) solutions optimized for vSphere environments. Mitel is using virtual appliances to good advantage—and that’s another topic for a future blog.
Have your own experience running real-time applications on VMware? Tell us about it.