HA Admission control – Percentage of cluster resources

I am sure we are all aware of why HA is all important and awesome to have. It helps you to finish your coffee, smoke your cigarette before rushing towards a server that just went down. Ok maybe not that but you get the idea right. Another thing to keep in mind regarding HA is the admission control policy. I like to call this the policy that saves you from yourself. Basically it keeps check of how many resources are available and how many will be needed for a failover to happen. It keeps you honest and ensures that the HA’s promise is not broken.

As we already know there are three types of Admission Control Policies to choose from:

  • Host failures cluster  tolerates
  • Percentage of cluster resources reserved as failover spare capacity
  • Specify a failover host
“Host failure cluster tolerates” creates slots which at times could create issues specially if you only have a a few VMs with High CPU counts and memory reservation. Of course you can look at advanced settings that could address this and Duncan can tell you all there is to know about this. The second option which is selecting a percentage of resources is my personal favorite specially due to the flexibility that it provides. We will go over that in a little bit. The last option which lets you specify a failover host is the one thats rarely used and rightly so. After all why would you want a host to just sit there and wait until something goes wrong?
As you may have already noticed, vSphere 5 gives you the option to specify a percentage of failover resources for both CPU and memory. Prior to vSphere 5, this was not the case. I think this is an excellent addition and our clusters will now be more flexible then ever.

25% is whats placed in there by default and what this really means is the 25% of your total CPU and total memory resource across the entire cluster is reserved for your cluster. So in other words, if you have an 8 node cluster, 25% of your resources or resources equal to two host (assuming its a balanced cluster) are reserved for an HA incident. If this happens to be a 32 node cluster and if this is a balanced cluster, resources that equate to 8 nodes will be reserved as 8 is 25% of 32. So keep that in mind before deciding what number to put there. You can’t reserve more than 50% of your resources.

Below is how the resources are calculated for the hosts:

The total host resources available for virtual machines is calculated by adding the hosts’ CPU and memory resources. These amounts are those contained in the host’s root resource pool, not the total physical resources of the host. Resources being used for virtualization purposes are not included. Only hosts that are connected, not in maintenance mode, and have no vSphere HA errors are considered.

So how do you know how much head room do yo have left in the cluster? On your cluster summary tab, you will notice there is no longer a place for you to look at slot size as this method does not use slot sizes. It basically gives you a simple view of how much room you have left.

The Current CPU Failover Capacity is computed by subtracting the total CPU resource requirements from the total host CPU resources and dividing the result by the total host CPU resources. The Current Memory Failover Capacity is calculated similarly.

In vSphere 5, vSphere HA uses the actual reservations of the virtual machines. If a virtual machine does not have reservations, meaning that the reservation is 0, a default of 0MB memory and 32MHz CPU is applied.

So assuming you went with the default of 25% for each resource, 0% as current failover capacity is something you should hope never to see. You are seeing that in my screenshot (above) because my cluster happens to be empty and has no hosts. Lets, say you went ahead and turned on a few VMs and your cluster shows something like below, (98% CPU and 95% memory), this is something to be happy about. This basically means you have 98% of CPU available and 95% of memory available in your cluster.

There is one thing to keep in mind, though 98% of my CPU and 95% of my memory appear under my current failover capacity, this does not account for the 25% of whats reserved for an HA incident. At least thats what I was able to see by the few tests that I ran. What this means is that I can only power on VMs that account for no more than 98-25 = 73% of CPU and 95-25=70% of memory thats free in the cluster. For everything else HA should try to save me from myself.

Let’s look at a quick example to see how these numbers are calculated:

  • The Configured Failover Capacity is set to 25% for both CPU and memory.
  • Cluster is comprised of three hosts, each with 9GHz and 24GB of memory.
  • There are 4 powered-on virtual machines in the cluster with the following configs (assume overhead is 100mb for all VMs in this case):
    • VM1 needs 2GHz and 1GB (no reservation)
    • VM2 needs 2GHz and 2GB (2GB reserved)
    • VM3 needs 1GHz and 2GB (2GB reserved)
    • VM4 needs 3GHz and 6GB (1GHz and 2GB reserved)

So what does our cluster have? Our cluster has 9GHz+9GHz+9GHz = 27GHz of CPU and 24GB+24GB+24Gb=72GB of memory. (These amounts are those contained in the host’s root resource pool, not the total physical resources of the host).

How much resources are we using with our four VMs that are powered on?

Memory = VM reservation + overhead = 0+100+2048+100+2048+100+2048+100= 6544MB = 6.4GB

Note we only used 2048 for VM4 even though it had 6GB configured. Thats because it only had 2GB reserved. Also, VM1 had no reservation so only overhead was used.

CPU = If no reservation use 32MHz for vSphere 5 = 32MHz+32MHz+32MHz+1GHz= 1.096GHz

So what is our current failover capacity?

Memory = (72GB – 6.4Gb)/72= 91%

CPU = (27GHz-1.096GHz)/27= 95.94%=96%

Wow, that is a lot of cluster resources left. Now lets take 25% off from our numbers to come up with exactly how many VMs can we power on before HA starts screaming back with an error.

Memory = 91- 25 = 66%

CPU = 96-25 = 71%

Now keep in mind, selecting the percentage for admission control policy isn’t going to solve all your problems. But I do think that this setting is far better than complex slot sizes and what not. This gives one a simple view of how much room you have in your cluster without messing around with slot sizes. However, unlike cluster host tolerates setting where you can simply add hosts like crazy, using the percentage method may require you to revisit your percentages as you add or remove hosts. At the same time it also gives you more flexibility. So next time you are setting a cluster, think about whats important to you.

 

HA and Admission Control

I have seen admission control being used without really understanding how it impacts your cluster and your available resources. While configuring admission control on a cluster the other day, I started thinking how this really works. The concept is pretty simple. According to VMware:

Slot size is comprised of two components, CPU and memory. VMware HA calculates these values.

The CPU component by obtaining the CPU reservation of each powered-on virtual machine and selectingthe largest value. If you have not specified a CPU reservation for a virtual machine, it is assigned a defaultvalue of 256 MHz (this value can be changed using the das.vmCpuMinMHz advanced attribute.)

The memory component by obtaining the memory reservation (plus memory overhead) of each poweredon virtual machine and selecting the largest value

HA relies on slot sizes and in the current version of ESX/i, if no reservations are used, the default slot sizes are 256 MHz and the memory overhead. Now keep in mind, if you happen to have a VM which has a reservation of 4GB, now all of a sudden your slot size has become 256 MHz and 4GB in memory. Basically now you have less slots to place your VMs and admission control will make it to where you can’t power on more VMs than what can be accommodated according to your host failures cluster tolerates setting. Basically HA will look for your worst case CPU and memory reservation to come up with the slot size. All that I just mentioned should be common knowledge.

Let’s assume you have a cluster of 3 hosts and VMs with no reservation, HA is turned on, host failures cluster tolerates is 1, admission control is enabled and your isolation response is set to shutdown. For simplifying things lets assume your cluster is balanced where each hosts has 10GHz CPU and 24GB of memory. Your cluster has a total of 30GHz CPU and 72GB of memory. The total number of VMs running is 60 and none of them have any reservation. Lets also assume your slot size is 256 MHz and 300MB (overhead). So how many slots do you have? You have 30000/256 = 117 in CPU and 72000/300 = 240 in memory. You always pick the lowest number and according to what we calculated above, you have 117 slots available on this cluster.

Let’s assume a host fails and now we only have 20GHz and 48GB left in our cluster. We now have 20000/256 = 78 and 48000/300= 160, which means we have only 78 slots available now. So you have 78 slots and 60 VMs (1 VM/slot), should all your VMs power on? No, because your cluster still has Host Failures Cluster Tolerates set to 1 and admission control is enabled. It’s important to understand how admission control really works. According to VMware:

With the Host Failures Cluster Tolerates policy, VMware HA performs admission control in the following way:

1 Calculates the slot size.A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.

2 Determines how many slots each host in the cluster can hold.

3 Determines the Current Failover Capacity of the cluster.This is the number of hosts that can fail and still leave enough slots to satisfy all of the powered-on virtual machines.

4 Determines whether the Current Failover Capacity is less than the Configured Failover Capacity (provided by the user).If it is, admission control disallows the operation.

So according to that, even though your cluster has enough slots to run all your VMs, but because your host failures cluster tolerates is set to 1, admission control has to make sure it only runs the load it can afford to run in case of another host failure. Basically admission control knows there are 78 slots available but it has to keep in mind that in case of another host failure it will only have 39. Because host failures cluster tolerates is set to 1, admission control will only allow 39 slots to be accommodated. So once HA realizes that 39 slots have been taken, it will not allow anymore power on. It’s saving you from yourself.

I will not throw in other complications like memory reservations or an unbalanced cluster (hosts with different resources) and how to handle that yet just to keep it simple. I do plan to post about why reservation would be a bad idea at the VM level and ways to get around the conservative slot sizes. HA and admission control are awesome tools to have, but if you don’t plan intelligently, you will soon begin to hate them.