Good Ole P2V

This year I havent had much opportunity to blog due to the type of work I have been involved in. However, as soon as I find the opportunity I grab it and this one happens to be about P2V. I recently had to use P2V after a long time for a POC I cant share details about at this time. Having not done a P2V in ages, it will be safe to say that my confidence was sort of crushed from some of my initial tests. Here are some of the things I learned.

If your source machine, your P2V server (where convertor is installed), your vCenter and your target hosts have any firewalls between them, please go to this this link and make sure all the communication channels are open.

In my tests there were firewalls in play and we opened the ports accordingly. Once the ports were open we did a P2V and everything seems to have worked after some issues we faced with SSL. Once the SSL was taken out of the equation everything seems to have worked just fine.  The machine was imported to the target vCenter and powered on with no issues. Seemed pretty standard. However our later tests kept failing which was quite frustrating. The job would fail after a few seconds of being submitted with the following error:

“An error occured while opening a virtual disk. Verify that the Converter server and the running source machines have a network access to the source and destination ESX/ESXi hosts.”

BTW the error above is my least favorite as it could really mean so many things. The key is always in the log files. After looking through the log files of our repeated failed attempts, we noticed something interesting in the worker log file:

2014-10-29T11:18:21.192-04:00 [03652 info ‘task-2’] Reusing existing VIM connection to 10.194.0.120
2014-10-29T11:18:21.207-04:00 [03652 info ‘task-2’] GetManagedDiskName: Get virtual disk filebacking [Cluster-1] PhysicalServer/PhysicalServer.vmdk
2014-10-29T11:18:21.207-04:00 [03652 info ‘task-2’] GetManagedDiskName: updating nfc port as 902
2014-10-29T11:18:21.207-04:00 [03652 info ‘task-2’] GetManagedDiskName: get protocol as vpxa-nfc
2014-10-29T11:18:21.207-04:00 [03652 info ‘task-2’] GetManagedDiskName: Get disklib file name as vpxa-nfc://[Cluster-1] PhysicalServer/PhysicalServer.vmdk@ESXi4:902!52 c3 66 97 b0 92 a0 93-38 cd b9 4a 17 f8 e0 00
2014-10-29T11:18:33.048-04:00 [03652 info ‘task-2’] Worker CloneTask updates, state: 4, percentage: 0, xfer rate (Bps): <unknown>
2014-10-29T11:18:33.048-04:00 [03652 info ‘task-2’] TargetVmManagerImpl::DeleteVM
2014-10-29T11:18:33.048-04:00 [03652 info ‘task-2’] Reusing existing VIM connection to 10.194.0.120
2014-10-29T11:18:33.064-04:00 [03652 info ‘task-2’] Destroying vim.VirtualMachine:vm-810174 on 10.194.0.120
2014-10-29T11:18:34.062-04:00 [03652 info ‘task-2’] WorkerConvertTask: Generating Agent Task bundle for task with id=”task-1″.
2014-10-29T11:18:44.467-04:00 [03652 info ‘task-2’] WorkerConvertTask: Retrieving agent task log bundle to “C:WindowsTEMPvmware-tempvmware-SYSTEMagentTask-task-1-vmepkywh.zip”.
2014-10-29T11:18:44.483-04:00 [03652 info ‘task-2’] WorkerConvertTask: Bundle successfully retrieved to “C:WindowsTEMPvmware-tempvmware-SYSTEMagentTask-task-1-vmepkywh.zip”.
2014-10-29T11:18:44.483-04:00 [03652 error ‘Default’] Task failed:

One thing that stood out was the hostname that I have marked above in red. We were exporting the physical machine to a host in Cluster-1. When going through the wizard it allows you to select an individual host inside a cluster however that doesn’t really mean much. Our cluster was an 8 node cluster and we noticed in every failed attempt the job was trying to pick any of the 8 hosts in the cluster and not the one we selected (ESXi8). Also, right before submitting the job we noticed that though a particular host was selected as the target, at the summary page the target location was the cluster and not the individual host.

So remember I mentioned we have firewalls between our hosts, vcenter, source machine and P2V server? When we opened the ports we figured that we will only need access to a single ESXi host as the P2V wizard was allowing us to select an individual host in the cluster. As it turns out because this host was in a cluster all hosts in that cluster will need the communication channel open as it was evident through the worker log file. Once we made that change, the P2V worked like a champ over and over. Another option would have been to take our target host out of the cluster.

So, how did we get it to work the first time? I blame that on luck. We went back to the log files and confirmed the first time it actually worked, it happened to have selected the host we opened the ports for. I should have bought the lottery ticket instead :(.  Luckily we tested more than once and were able to confirm all that needs to be in place for this to work. I may have known this sometime ago but being away from P2V for sometime it was refreshing, challenging and rewarding to finally get it working. Convertor version was 5.5.1 build-1682692.

Lesson learned? Do lots of testing.