
Comparing Fault Tolerance Performance & Overhead Utilizing VMmark v1.1.1

VMware Fault Tolerance (FT), based on vLockstep technology and available with
VMware vSphere, easily and efficiently provides zero downtime and zero data
loss for your critical workloads. FT provides continuous availability in the
event of server failures by creating a live shadow instance of the primary
virtual machine on a secondary system. The shadow VM (or secondary VM), running
on the secondary system, executes sequences of x86 instructions identical to
those of the primary VM, with which it proceeds in vLockstep. As a result, a
catastrophic failure of the primary system causes an instantaneous failover to
the secondary VM that is virtually indistinguishable to the end user. While FT
technology is certainly compelling, some potential users express concern about
possible performance overhead. In this article, we explore the performance
implications of running FT in realistic scenarios by measuring an FT-on
environment based on the heterogeneous workloads found in VMmark, the
tile-based mixed-workload consolidation benchmark from VMware®.
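To make the lockstep idea concrete, here is a minimal conceptual sketch in Python. It is not how vLockstep is implemented; `ToyVM`, `run_in_lockstep`, and the two-operation instruction set are hypothetical names invented for illustration. It only shows why a secondary that executes the same instruction sequence as the primary can take over without losing state.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """A deterministic toy machine (hypothetical): state changes only via instructions."""
    name: str
    register: int = 0
    executed: int = 0

    def execute(self, instruction):
        op, value = instruction
        if op == "add":
            self.register += value
        elif op == "mul":
            self.register *= value
        else:
            raise ValueError(f"unknown op: {op}")
        self.executed += 1


def run_in_lockstep(instructions, fail_primary_after=None):
    """Feed the same instruction stream to both VMs and return the VM that ends up active.

    If fail_primary_after is set, the primary stops before executing the
    instruction at that index and the secondary becomes the active VM.
    """
    primary, secondary = ToyVM("primary"), ToyVM("secondary")
    active = primary
    for index, instruction in enumerate(instructions):
        if active is primary and index == fail_primary_after:
            # Simulated primary failure: the secondary already holds identical state.
            active = secondary
        if active is primary:
            primary.execute(instruction)
        secondary.execute(instruction)
    return active


program = [("add", 5), ("mul", 3), ("add", 2), ("mul", 4)]
no_failure = run_in_lockstep(program)
with_failure = run_in_lockstep(program, fail_primary_after=2)
```

In this toy model both runs finish with the same register value, whether or not the primary fails partway through, which is the property the end user experiences as an uninterrupted workload.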

Figure 1: High Level Architecture of VMware Fault Tolerance

Pic1

Environment Configuration :

System under Test: 2 x Dell PowerEdge R905
CPUs: 4 Quad-Core AMD Opteron 8382 (2.6GHz) / 4 Quad-Core AMD Opteron 8384 (2.7GHz)
Memory: 128GB DDR2 Reg ECC
Storage Array: EMC CX380
Hypervisor: VMware ESX 4.0
Application: VMmark v1.1.1
Virtual Hardware (per tile): 8 vCPUs, 5GB memory, 62GB disk

  • VMware Fault Tolerance currently supports only 1 vCPU VMs and requires
    specific processors for enablement. For the purposes of our
    experimentation, our VMmark Database and MailServer VMs were set to run
    with 1 vCPU only. For more information on FT and its requirements see
    here.
  • VMmark is a benchmark intended to measure the performance of
    virtualization environments so that customers can compare platforms. It is
    also useful in studying the effect of architectural features. VMmark
    consists of six workloads (Web, File, Database, Java, Mail and Standby
    servers). Multiple sets of workloads (tiles) can be added to scale the
    benchmark load to match the underlying hardware resources. For more
    information on VMmark see here.

 

Test Methodology :

An initial performance baseline was established by running VMmark from 1 to 13
tiles on the primary system with Fault Tolerance turned off for all workloads.
FT was then enabled for the MailServer and Database workloads, because customer
feedback suggested they were the applications most likely to be protected by
FT. The performance tests were then executed a second time and compared to the
baseline performance data.
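The run plan described above can be sketched as a small Python script that enumerates every tile count for both configurations and totals the virtual hardware on the primary system. The per-tile figures (8 vCPUs, 5GB memory, 62GB disk) come from the environment configuration; the `Run` class and `build_run_matrix` function are hypothetical names, and the 16-core host figure assumes each system has 4 quad-core CPUs.

```python
from dataclasses import dataclass

# Per-tile virtual hardware, taken from the environment configuration.
VCPUS_PER_TILE = 8
MEMORY_GB_PER_TILE = 5
DISK_GB_PER_TILE = 62

# Assumption: each host has 4 quad-core CPUs, i.e. 16 physical cores.
HOST_CORES = 4 * 4


@dataclass(frozen=True)
class Run:
    tiles: int
    ft_enabled: bool
    vcpus: int
    memory_gb: int
    disk_gb: int
    vcpus_per_core: float


def build_run_matrix(max_tiles=13):
    """List the FT-off baseline runs first, then the FT-on runs, for 1..max_tiles tiles."""
    runs = []
    for ft_enabled in (False, True):
        for tiles in range(1, max_tiles + 1):
            vcpus = tiles * VCPUS_PER_TILE
            runs.append(
                Run(
                    tiles=tiles,
                    ft_enabled=ft_enabled,
                    vcpus=vcpus,
                    memory_gb=tiles * MEMORY_GB_PER_TILE,
                    disk_gb=tiles * DISK_GB_PER_TILE,
                    vcpus_per_core=vcpus / HOST_CORES,
                )
            )
    return runs


runs = build_run_matrix()
largest = runs[-1]
```

Under these assumptions the largest run (13 tiles) places 104 vCPUs, 65GB of memory, and 806GB of disk on the primary system, or 6.5 vCPUs per physical core, which gives a sense of how far toward saturation the test scales.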

 

Results :

The results in Table 1 show the performance and efficiency of VMware Fault
Tolerance. In the table, “FT-on Secondary %CPU” indicates the total CPU
utilized by the secondary system under test. Note that the default ESX 4.0,
High Availability, and Fault Tolerance settings were used for our workload, so
these results should be considered ‘out of the box’ performance for this
configuration. Finally, the secondary system’s %CPU is much lower than the
primary system’s because it is running only the MailServer and Database
workloads, as opposed to the six workloads running on the primary system.

Table 1:

Pic2b

You can see that as we scaled both configurations toward saturation, the
overhead of enabling VMware Fault Tolerance remained surprisingly consistent,
with an average delta in %CPU used of 7.89% over all of the runs. ESX was also
able to achieve very comparable scaling for both the FT-on and FT-off
configurations. It isn’t until the FT-on configuration nears complete
saturation, a scenario most end users will never see, that we start to see any
real discernible delta in scores.
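For readers who want to reproduce this kind of summary on their own runs, the following Python sketch averages the per-run difference in primary %CPU between the FT-on and FT-off configurations. It assumes the delta is a simple difference in percentage points, which is our reading of the figure rather than a stated definition. The function names are hypothetical, and the sample values are placeholders, not the measurements from Table 1.

```python
from statistics import mean


def cpu_deltas(ft_off_cpu, ft_on_cpu):
    """Per-run difference (FT-on minus FT-off) in %CPU, in percentage points."""
    if len(ft_off_cpu) != len(ft_on_cpu):
        raise ValueError("both configurations need one %CPU value per run")
    return [on - off for off, on in zip(ft_off_cpu, ft_on_cpu)]


def mean_cpu_delta(ft_off_cpu, ft_on_cpu):
    """Average %CPU overhead of enabling FT across all runs."""
    return mean(cpu_deltas(ft_off_cpu, ft_on_cpu))


# Placeholder values for illustration only; these are NOT the Table 1 results.
ft_off = [10.0, 20.0, 30.0, 40.0]
ft_on = [17.0, 28.0, 38.0, 49.0]
average = mean_cpu_delta(ft_off, ft_on)
```

Looking at the individual deltas alongside the mean is worthwhile, since a consistent overhead across tile counts is the stronger claim and an average alone could hide a spike near saturation.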

Note that these performance and overhead statements may not hold for
dissimilar workloads and systems under test. For this configuration, however,
our testing shows that the modest overhead is well justified by the advantage
of having Mail servers and Database servers truly protected, without fear of
end-user interruption.

It’s a tough world out there; you never know when the next earthquake, power
outage, or trip over a power cord will strike. It’s nice to know that your
critical workloads are not only safe, but running at high efficiency. The
ability of VMware Fault Tolerance technology to provide quick and efficient
protection for your critical workloads makes it a standout in the datacenter.

All information in this post regarding future directions and intent are
subject to change or withdrawal without notice and should not be relied on in
making a purchasing decision of VMware’s products. The information in this post
is not a legal obligation for VMware to deliver any material, code, or
functionality. The release and timing of VMware’s products remains at VMware’s
sole discretion.