By any measure, FT (Fault Tolerance) is a ground-breaking technology. Introduced a year ago with vSphere 4.0, FT enables an application to continue uninterrupted even after a complete and catastrophic physical server failure.
But how does it work? What does it require of the network, and what impact does it have there? Dan Scales, Mike Nelson, and Ganesh Venkitachalam, from the VMware engineering team that brought FT to life, recently published an in-depth, 27-page technical report, “The Design and Evaluation of a Practical System for Fault-Tolerant Virtual Machines.” The paper digs into the FT protocol, the implementation issues for network I/O and disk I/O, benchmarks, and an assortment of other topics. It really is a fascinating read.