When I started blogging, my goals were pretty simple. I wanted a place to keep notes, and at the same time try to help out someone else who might be looking for the same type of information. In the process I managed to learn so many new things, retain them better, and hopefully help someone else out as well. This post will probably go under the category of "I have no clue what on earth is going on, and perhaps someone out there does."
I will admit, I haven’t used VUM (vSphere Update Manager) for some time, or as often as I used to, since I don’t work on the operations side of the house anymore. The other day, I started to remediate a host in an HA/DRS-enabled cluster with admission control turned on, and the remediation failed.
I figured that must be because the host could not be placed in maintenance mode due to the admission control settings. However, when I looked at what the cluster was doing, I was a bit surprised. There was no way anything should have gotten in the way of this host being placed in maintenance mode; the cluster was that underutilized.
Naturally, I tried to place the host into maintenance mode myself and had no issues there, so I am not exactly sure why VUM wasn’t able to do the same. Finally, I figured the host was now in maintenance mode and I might as well go ahead and get it remediated. No luck there either; I was slapped with the same exact error on the status bar, and in the host events I saw the following.
Now I was totally confused. Why does VUM care about this at all? The host is already in maintenance mode, meaning it should be out of the equation for HA, or even admission control for that matter. All that needs to happen here is for the patches to be installed, yet VUM keeps complaining about admission control being enabled in the cluster this host resides in, even though admission control and HA won’t consider this host’s resources until it comes out of maintenance mode. I would like to point out that I was trying to remediate the host, not the cluster itself.
The next step, obviously, was to disable admission control, and the remediation went fine. I also tried taking a host out of the cluster, and that remediation went fine as well. But I am still not sure why VUM refused to patch a host that was already in maintenance mode. Perhaps someone in the community can shed some light on this. Maybe I am missing the obvious here and simply over-analyzing, but this has bugged me for a few days, so I decided to post the question and be sure rather than make assumptions that may not be true.
By the way, this happened to me on both vCenter 4.1 and vCenter 5. According to the 4.1 admin guide:
When you update vSphere objects in a cluster with DRS, VMware High Availability (HA), and VMware Fault Tolerance (FT) enabled, you should temporarily disable VMware Distributed Power Management (DPM), HA admission control, and FT for the entire cluster.
Certain features might cause remediation failure. If you have VMware DPM, HA admission control, or Fault Tolerance enabled, you should temporarily disable these features to make sure that the remediation is successful.
Update Manager does not remediate clusters with active HA admission control.
So according to the documentation, admission control must be disabled. Below is the reason for that, from the same source:
Admission control is a policy used by VMware HA to ensure failover capacity within a cluster. If HA admission control is enabled during remediation, the virtual machines within a cluster might not migrate with vMotion.
I thought it was important to capture these statements, as they spell out why having admission control enabled during remediation could be troublesome.
So the issue is clearly Admission Control being turned on, and according to the documentation it must be disabled. However, the rationale behind that requirement is to ensure the host can be placed in maintenance mode. In my case the host was already in maintenance mode, which leads me to believe that VUM still checks the Admission Control setting on the cluster and fails the remediation if it is enabled.
One more thing to add to the mix: if you are running a Cisco Nexus 1000V, upgrading the VEM will fail unless admission control is disabled on the cluster, according to this KB article. Note that this particular issue only occurs for VEM-related updates. It is worth pointing out that my tests included both types of hosts, one running a 1000V and one running a standard switch, and both hosts were managed by the same vCenter.
Again, the purpose of this post is to share what I experienced, in the hope that someone can either explain why this happens or point toward a possible misconfiguration or fix. My best explanation is that even though the host may already be in maintenance mode, VUM still checks the cluster’s Admission Control setting and simply fails the remediation if it is enabled (unless you check the box to disable it when remediating). If that is the case, perhaps future releases will make this check smarter. If the purpose of the check is to make sure the host can be placed into maintenance mode without violating the admission control policy, then VUM should be verifying exactly that, rather than simply failing whenever Admission Control is turned on.
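To make my hypothesis concrete, here is a small Python sketch. This is purely illustrative and is in no way actual VUM code or its API; it just contrasts the behavior I observed (fail whenever admission control is enabled on the cluster) with the check I would expect (ignore admission control when the host is already in maintenance mode):

```python
# Illustrative model only -- NOT actual VUM logic. The function names
# and return strings are made up for this sketch.

def observed_check(admission_control_enabled, host_in_maintenance_mode):
    """What VUM appears to do: fail remediation whenever the cluster
    has HA admission control enabled, regardless of whether the host
    is already in maintenance mode."""
    if admission_control_enabled:
        return "remediation failed"
    return "remediation proceeds"

def expected_check(admission_control_enabled, host_in_maintenance_mode):
    """What I would expect: a host already in maintenance mode is out
    of the equation for HA, so admission control should not block
    patching it."""
    if host_in_maintenance_mode:
        return "remediation proceeds"
    if admission_control_enabled:
        # Ideally VUM would instead verify whether the host can enter
        # maintenance mode without violating the admission control
        # policy, rather than failing outright.
        return "remediation failed"
    return "remediation proceeds"

# The case from this post: host manually placed in maintenance mode,
# admission control still enabled on the cluster.
print(observed_check(True, True))   # -> remediation failed
print(expected_check(True, True))   # -> remediation proceeds
```

Under this model, the only scenario where the two checks disagree is exactly the one I hit: admission control on, host already in maintenance mode.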
UPDATE: Please read this follow up post as well.