VMware introduced a groundbreaking new technology with vSphere, Fault Tolerance (FT), that promised to give IT professionals some sanity after work. Personally, though, I am disappointed by how little Fault Tolerance really offers.
Basically, Fault Tolerance is a feature that runs a shadow copy of a VM on a different host. If the host running the primary VM dies, the shadow VM becomes the primary VM and a new shadow VM is started on a different host. It sounds wonderful, but there are certain things that are often overlooked. According to VMware, “the Primary VM captures all nondeterministic events and sends them across a VMware FT logging network to the Secondary VM. The Secondary VM receives and then replays those nondeterministic events in the same sequence as the Primary VM, typically with a very small lag time. As both the Primary and Secondary VMs execute the same instruction sequence, both initiate I/O operations. However, the outputs of the Primary VM are the only ones that take effect: disk writes are committed, network packets are transmitted, and so on. All outputs of the Secondary VM are suppressed by ESX. Thus, only a single virtual machine instance appears to the outside world.”
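To make the record-and-replay idea in that quote a bit more concrete, here is a toy sketch in Python. It is not how ESX implements FT; `ToyVM`, `run_pair`, and the in-memory `log` list are made-up names that stand in for the two VMs and the FT logging network. The only point is that two deterministic machines fed the same events in the same order end up in the same state, while only one of them is allowed to emit output.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM: state changes only through apply()."""
    name: str
    emit_outputs: bool  # False models the hypervisor suppressing all output
    state: int = 0
    sent: list = field(default_factory=list)

    def apply(self, event: int) -> None:
        # Both VMs run the same deterministic "instruction" on each event.
        self.state = (self.state * 31 + event) % 1_000_003
        if self.emit_outputs:
            # Only the primary's output reaches the outside world.
            self.sent.append(f"packet:{self.state}")


def run_pair(num_events: int, seed: int = 0):
    rng = random.Random(seed)
    primary = ToyVM("primary", emit_outputs=True)
    secondary = ToyVM("secondary", emit_outputs=False)
    log = []  # stands in for the FT logging network

    for _ in range(num_events):
        event = rng.randrange(256)  # a nondeterministic input, e.g. an interrupt
        log.append(event)           # the primary captures the event...
        primary.apply(event)

    for event in log:               # ...and the secondary replays it in order
        secondary.apply(event)

    return primary, secondary


primary, secondary = run_pair(5)
print(primary.state == secondary.state)        # True
print(len(primary.sent), len(secondary.sent))  # 5 0
```

In this toy version the secondary replays the whole log after the fact; per the quote, the real thing replays continuously with a very small lag.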
What is the cost?
- You have to have vSphere Advanced at a minimum.
- FT requires that the hosts for the Primary and Secondary VMs use the same CPU model, family, and stepping. Approved CPU list:
  - Intel: 3100 Series, 3300 Series, 5200 Series (DP), 5400 Series, 7400 Series, 3400 Series (Lynnfield), 3500 Series, 5500 Series
  - AMD: 1300 and 1400 Series, 2300 and 2400 Series (DP), 8300 and 8400 Series (MP)
- You need a dedicated Fault Tolerance network (vSwitch and connectivity). Each host must have a VMotion NIC and a Fault Tolerance logging NIC configured, and the two must be on different subnets.
- The primary and secondary ESX hosts and virtual machines have to be in an HA-enabled cluster.
- Make sure you have no requirement to use DRS for FT-protected virtual machines.
- The hosts running the primary and the secondary VM must be on the same ESX/ESXi build.
- VMware Consolidated Backup (VCB) is not supported with Fault Tolerance.
- Virtual machines must have eager-zeroed thick disks.
- Storage VMotion is not supported for FT-protected VMs.
- Snapshots are not supported for FT-protected VMs.
- The virtual machines must NOT use more than 1 vCPU.
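If you want to sanity-check an environment against this list before committing to FT, a small script helps. Below is a rough sketch in Python that covers only a subset of the items above (CPU match, build match, HA, subnets, vCPU count, disk type, snapshots). The `Host` and `VM` classes, their field names, and the sample values are all made up for illustration; nothing here talks to vCenter or uses a VMware API, so you would have to fill in the data yourself.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass
class Host:
    # All fields are hypothetical; populate them from your own inventory.
    cpu: tuple              # (model, family, stepping)
    build: str              # ESX/ESXi build identifier
    ha_enabled: bool        # host is in an HA-enabled cluster
    vmotion_subnet: str     # e.g. "10.0.1.0/24"
    ft_logging_subnet: str  # e.g. "10.0.2.0/24"


@dataclass
class VM:
    vcpus: int
    eager_zeroed_thick: bool
    has_snapshots: bool


def ft_blockers(vm: VM, primary: Host, secondary: Host) -> list:
    """Return a list of reasons this VM/host pair fails the checklist."""
    problems = []
    if primary.cpu != secondary.cpu:
        problems.append("CPU model/family/stepping differ between hosts")
    if primary.build != secondary.build:
        problems.append("ESX/ESXi builds differ between hosts")
    for label, host in (("primary", primary), ("secondary", secondary)):
        if not host.ha_enabled:
            problems.append(f"{label} host is not in an HA-enabled cluster")
        vmotion = ip_network(host.vmotion_subnet)
        ft_log = ip_network(host.ft_logging_subnet)
        if vmotion.overlaps(ft_log):
            problems.append(f"{label} host: VMotion and FT logging share a subnet")
    if vm.vcpus > 1:
        problems.append("VM has more than 1 vCPU")
    if not vm.eager_zeroed_thick:
        problems.append("VM disks are not eager-zeroed thick")
    if vm.has_snapshots:
        problems.append("VM has snapshots")
    return problems


# Placeholder values only; not real CPU steppings or build numbers.
host_a = Host(("Intel", "5500 Series", "stepping-1"), "build-A", True,
              "10.0.1.0/24", "10.0.2.0/24")
host_b = Host(("Intel", "5500 Series", "stepping-1"), "build-A", True,
              "10.0.1.0/24", "10.0.2.0/24")
host_c = Host(("Intel", "5400 Series", "stepping-2"), "build-B", True,
              "10.0.1.0/24", "10.0.1.0/24")

good_vm = VM(vcpus=1, eager_zeroed_thick=True, has_snapshots=False)
bad_vm = VM(vcpus=2, eager_zeroed_thick=False, has_snapshots=True)

print(ft_blockers(good_vm, host_a, host_b))  # []
for problem in ft_blockers(bad_vm, host_a, host_c):
    print("-", problem)
```

Returning a list of problems instead of a plain yes/no means you see every blocker at once, which is handy given how many of them there are.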
There are more restrictions and requirements, but these are some of the general ones. One point to note here is that if the primary VM experiences a blue screen, so will the secondary VM, because the secondary replays exactly what the primary does. So in essence, FT really only provides protection from hardware failure of the host. If all the requirements don’t turn you off and all the restrictions don’t make you upset, then go for it. I am personally not impressed with this technology yet, or maybe I haven’t found a need to implement it yet. Or maybe I am spoiled by the higher standards VMware has gotten me used to.