HA and Admission Control

I have seen admission control used without a real understanding of how it impacts a cluster and its available resources. While configuring admission control on a cluster the other day, I started thinking about how this really works. The concept is pretty simple. According to VMware:

Slot size is comprised of two components, CPU and memory. VMware HA calculates these values.

The CPU component by obtaining the CPU reservation of each powered-on virtual machine and selecting the largest value. If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 256 MHz (this value can be changed using the das.vmCpuMinMHz advanced attribute).

The memory component by obtaining the memory reservation (plus memory overhead) of each powered-on virtual machine and selecting the largest value.

HA relies on slot sizes, and in the current version of ESX/i, if no reservations are used, the default slot size is 256 MHz for CPU and the memory overhead for memory. Keep in mind that if you happen to have a single VM with a 4GB memory reservation, your slot size suddenly becomes 256 MHz and 4GB of memory. You now have fewer slots to place your VMs in, and admission control will prevent you from powering on more VMs than can be accommodated according to your Host Failures Cluster Tolerates setting. In short, HA takes your worst-case CPU and memory reservations to come up with the slot size. All of this should be common knowledge.
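To make that concrete, here is a minimal sketch of the slot-size calculation. The VM list is made up for illustration; the only real inputs are each VM's reservations and memory overhead, with das.vmCpuMinMHz defaulting to 256 MHz:

```python
# Sketch of how HA derives the slot size from powered-on VMs.
# The VM data below is hypothetical.

DEFAULT_CPU_MHZ = 256  # das.vmCpuMinMHz default, used when no CPU reservation is set

# (cpu_reservation_mhz, mem_reservation_mb, mem_overhead_mb)
powered_on_vms = [
    (0, 0, 300),     # no reservations: only the overhead counts
    (0, 0, 300),
    (0, 4096, 300),  # one VM with a 4GB memory reservation
]

cpu_slot = max(cpu or DEFAULT_CPU_MHZ for cpu, _, _ in powered_on_vms)
mem_slot = max(mem + ovh for _, mem, ovh in powered_on_vms)

print(f"Slot size: {cpu_slot} MHz / {mem_slot} MB")  # 256 MHz / 4396 MB
```

Notice how the single 4GB reservation drags the memory component of the slot size up for the entire cluster.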

Let’s assume you have a cluster of 3 hosts and VMs with no reservations, HA is turned on, Host Failures Cluster Tolerates is set to 1, admission control is enabled, and your isolation response is set to shutdown. To simplify things, let’s assume your cluster is balanced, where each host has 10GHz of CPU and 24GB of memory. Your cluster then has a total of 30GHz of CPU and 72GB of memory. The total number of VMs running is 60 and none of them have any reservations. Let’s also assume your slot size is 256 MHz and 300MB (overhead). So how many slots do you have? You have 30000/256 = 117 slots in CPU and 72000/300 = 240 slots in memory. You always pick the lower number, so according to the math above, you have 117 slots available on this cluster.
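A quick back-of-the-envelope version of that math, using the numbers straight from the example (this mirrors the simplified view; strictly speaking HA computes slots per host and sums them):

```python
# Cluster slot count = min(CPU slots, memory slots).
total_cpu_mhz = 3 * 10_000  # 3 hosts x 10GHz
total_mem_mb = 3 * 24_000   # 3 hosts x 24GB

cpu_slot_mhz, mem_slot_mb = 256, 300  # slot size from the example

cpu_slots = total_cpu_mhz // cpu_slot_mhz  # 117
mem_slots = total_mem_mb // mem_slot_mb    # 240
print(min(cpu_slots, mem_slots))           # 117 slots for the cluster
```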

Now let’s assume a host fails, leaving only 20GHz and 48GB in our cluster. We now have 20000/256 = 78 slots in CPU and 48000/300 = 160 slots in memory, which means we have only 78 slots available. So you have 78 slots and 60 VMs (1 VM per slot); should all your VMs power on? No, because your cluster still has Host Failures Cluster Tolerates set to 1 and admission control is enabled. It’s important to understand how admission control really works. According to VMware:

With the Host Failures Cluster Tolerates policy, VMware HA performs admission control in the following way:

1 Calculates the slot size. A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.

2 Determines how many slots each host in the cluster can hold.

3 Determines the Current Failover Capacity of the cluster. This is the number of hosts that can fail and still leave enough slots to satisfy all of the powered-on virtual machines.

4 Determines whether the Current Failover Capacity is less than the Configured Failover Capacity (provided by the user). If it is, admission control disallows the operation.

So even though your cluster has enough slots to run all your VMs, because Host Failures Cluster Tolerates is set to 1, admission control has to make sure it only runs the load it can afford in case of another host failure. Admission control knows there are 78 slots available, but it also knows that after another host failure there would only be 39. Because Host Failures Cluster Tolerates is set to 1, admission control will only allow 39 slots to be occupied. Once HA sees that 39 slots have been taken, it will not allow any more power-ons. It’s saving you from yourself.
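Here is a minimal sketch of those four steps, using the numbers from the two-host scenario above (a balanced cluster is assumed, so every host holds the same number of slots):

```python
# Sketch of the Host Failures Cluster Tolerates admission-control check.
# Assumes a balanced cluster: every host holds the same number of slots.

def current_failover_capacity(slots_per_host: int, hosts: int, powered_on: int) -> int:
    """Most hosts that can fail while still leaving a slot per powered-on VM."""
    for failures in range(hosts, -1, -1):
        if (hosts - failures) * slots_per_host >= powered_on:
            return failures
    return 0

SLOTS_PER_HOST = 39       # 78 slots across the 2 surviving hosts
CONFIGURED_FAILOVER = 1   # Host Failures Cluster Tolerates

for powered_on in (39, 40):
    capacity = current_failover_capacity(SLOTS_PER_HOST, 2, powered_on)
    allowed = capacity >= CONFIGURED_FAILOVER
    print(f"{powered_on} VMs -> failover capacity {capacity}, power-on allowed: {allowed}")
# 39 VMs -> failover capacity 1, power-on allowed: True
# 40 VMs -> failover capacity 0, power-on allowed: False
```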

I will not throw in other complications like memory reservations or an unbalanced cluster (hosts with different resources) and how to handle them yet, just to keep things simple. I do plan to post about why reservations at the VM level can be a bad idea, and about ways to get around the conservative slot sizes. HA and admission control are awesome tools to have, but if you don’t plan intelligently, you will soon begin to hate them.

HA for MSCS VMs in vSphere

A few days ago, I was complaining about not knowing why HA has to be disabled in an MSCS setup on vSphere. It turns out that only DRS needs to be disabled; HA is still supported, according to KB article 1037959. If I read it correctly, even in a cluster-across-boxes (CAB) type of setup, where you have to use physical compatibility mode, HA is still supported. DRS is not supported in any vSphere MSCS setup, for the reasons I discussed in a previous post. Although the MSCS user guide for 4.1 suggests that you can set DRS to partially automated for MSCS machines, the PDF also mentions that migration of these VMs is not recommended. And as the table below suggests, DRS is not supported either.

(Table from KB article 1037959)

So, what does support for HA really mean? If you only have a two-node ESX/i cluster with an MSCS CAB setup, HA support will not affect you because of the anti-affinity rules. However, if your ESX/i cluster is bigger than two nodes, HA can be leveraged, and a dead MSCS VM can be restarted on a different host while still complying with the anti-affinity rule that has been set. For an MSCS cluster-in-a-box (CIB) setup, HA can be leveraged on even a two-node ESX/i cluster: when host one dies, host two finds itself spinning up the two partners in crime. One thing to note is that all of this is only possible if the storage (both the boot VMDK and the RDM/shared disk) is presented to all the hosts in the cluster. I can’t imagine why anyone would not do that to begin with.

Again, only a two-node MSCS cluster is supported so far. With HA being supported for MSCS VMs, I guess one can certainly benefit from the added redundancy. If you think this is being too redundant, just don’t use the feature and disable HA for the MSCS VMs in your environment. I would highly recommend disabling HA for the two VMs if they are part of an MSCS CAB setup in a two-node ESX/i cluster.

vSphere client for iPad (Review)

I was very excited about getting the iPad 2 this year, and one of the first things I started looking for was the vSphere client that VMware was supposed to make for the iPad. After standing in line, and with the help of my friend, I was finally able to get my hands on Apple’s new tablet. For the next two days I religiously searched for the vSphere client for the iPad, but was disappointed not to find it. Just this past Sunday, I was talking to a friend who asked me if I had tried out the iPad app for vSphere. So I started searching again, and it turns out I gave up searching 3-4 days before it was finally released (March 17th, 2011). After feeling left out, I finally downloaded it and took it for a spin.

You will need to download the vCMA and the vSphere Client for iPad, and of course a vSphere environment and an iPad. Once you have fired up your vCMA, be sure to change the password for the vCMA appliance. This is not a requirement, but if you plan on allowing remote access to your vCMA appliance, you may not want to leave it with the default password that is known by the masses. You can manage your vCMA appliance at http://YourIP:5480. I would also assign the vCMA a static IP.

Once you have assigned the IP to the vCMA, go to Settings on your iPad, tap on “vSphere Client”, and enter the IP of your vCMA in the “Web Server” field.
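Before pointing the iPad app at the appliance, it is worth confirming the vCMA is actually answering. A trivial sanity check, assuming Python is handy (the IP below is a placeholder for your vCMA’s static IP):

```python
# Quick check that the vCMA management UI is reachable on port 5480.
import urllib.request

VCMA_IP = "192.168.1.50"  # placeholder: replace with your vCMA's static IP

try:
    urllib.request.urlopen(f"http://{VCMA_IP}:5480", timeout=5)
    print("vCMA management UI is answering")
except Exception as exc:
    print(f"vCMA not reachable: {exc}")
```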

Duplicate MACs in vCenter

I don’t have a lot of experience with Hyper-V, but I have worked with people who do. After hearing their horror stories, I don’t envy acquiring that sort of experience. Speaking of horror stories, my favorite is the one where a co-worker told me about a Hyper-V environment they had set up that was generating duplicate MAC addresses. I was amused, but in the back of my mind I started wondering whether this was possible in a vSphere setup. Yes, it is.

UID

vCenter assigns MAC addresses using the unique ID that’s assigned to it under Administration > vCenter Server Settings > Runtime Settings.

This unique ID can be set to a value between 0 and 63. If you have two or more vCenters running with the same instance ID, it’s only a matter of time before you start seeing MAC conflicts in your environments.

vCenter assigns MAC addresses using a simple formula: 00:50:56:(80 hex + UID):XX:XX. So if your vCenter ID is 45, your VMs’ MACs should look like 00:50:56:ad:XX:XX. As you can tell, the fourth byte is what can help you identify which vCenter was used to create the VM. The fifth and sixth bytes are the ones that are usually edited if you have a need to assign a MAC yourself instead of letting vCenter do it for you.
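A one-liner sketch of that formula, if you want to check what prefix a given vCenter UID produces (a hypothetical helper, not part of any VMware tool):

```python
# Derive the vCenter-assigned MAC prefix from the instance unique ID.
def vcenter_mac_prefix(uid: int) -> str:
    if not 0 <= uid <= 63:
        raise ValueError("vCenter unique ID must be between 0 and 63")
    return f"00:50:56:{0x80 + uid:02x}"

print(vcenter_mac_prefix(45))  # 00:50:56:ad
print(vcenter_mac_prefix(37))  # 00:50:56:a5
```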

Another interesting thing I noticed: there were three different types of MAC addresses across my VMs. I saw:

VM1 00:50:56:ad:c2:3F

VM2 00:0C:29:73:B1:2F

VM3 00:50:56:a5:d2:6F

It turns out that VM1 was created on my new vCenter with a UID of 45 (ad = 80 hex + 45), VM2 was created directly on an ESXi host, and VM3 was created on a different vCenter with a UID of 37 (a5 = 80 hex + 37).
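Working backwards is just as easy. Here’s a small sketch that classifies those three MACs (the 00:0C:29 prefix is what ESX/i uses for host-generated MACs):

```python
# Classify a VM MAC: vCenter-assigned, host-generated, or something else.
def decode_mac(mac: str) -> str:
    octets = mac.lower().split(":")
    if octets[:3] == ["00", "50", "56"]:
        fourth = int(octets[3], 16)
        if 0x80 <= fourth <= 0xBF:  # 0x80 + UID, with UID in 0..63
            return f"vCenter-assigned, vCenter UID {fourth - 0x80}"
        return "statically assigned in the 00:50:56 range"
    if octets[:3] == ["00", "0c", "29"]:
        return "host-generated (VM created directly on an ESX/i host)"
    return "not a VMware MAC"

for mac in ("00:50:56:ad:c2:3F", "00:0C:29:73:B1:2F", "00:50:56:a5:d2:6F"):
    print(mac, "->", decode_mac(mac))
# 00:50:56:ad:c2:3F -> vCenter-assigned, vCenter UID 45
# 00:0C:29:73:B1:2F -> host-generated (VM created directly on an ESX/i host)
# 00:50:56:a5:d2:6F -> vCenter-assigned, vCenter UID 37
```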

Though these little things may not seem to matter much, it’s important to know how it all comes together. It will be very helpful when you find yourself in a situation where your VMs have identical MACs: someone forgot to set a unique ID for their vCenter.

 

OS X on vSphere

As vSphere 4 begins to show its age and speculation about the next version begins, there have been reports that with vSphere 5, VMware may support OS X as a guest OS on non-Apple hardware. If Apple has really backed down and this news is indeed true, it will be interesting to see how rapidly Apple’s OS makes its way into datacenters across the globe.

In the past, I have witnessed requests for Apple’s OS, but they mostly got squashed due to Apple’s dedicated hardware requirement. If these reports are correct, I think Apple will benefit from reaching the market it lost in the 90s.