Microsoft Failover Clusters with shared Physical RDM disks = Slow ESXi Boot

Posted: August 23, 2012 in Uncategorized

In doing some final testing and handover for upgrading Ports of Auckland’s VMWare hosts to ESX5i, we discovered that on boot the host was appearing to hang at loading the vmw_vaai_cx module. We had already successfully rebooted this test blade a few times, but had just re-enabled the FC ports to allow it access to the SAN volumes again. Suspecting this was related, due to the module name, to the storage I did a quick bit of research and found this article.

The long and the short of it: If you are using Microsoft Failover Clusters with shared volumes (e.g. quorum disks or data disks) in VMWare, by using the Physical RDM mode, then you need to flag to each host that has access to those volumes that they are permanently reserved. This stops the host from repeatedly attempting to get a reservation and inspect the volume. In our case we have 15 RDM volumes so the boot time was fairly excessive.

For each volume you need to get the naa ID of the volume and then run an esxcli command on each host for each volume.

A neat wayto get a list of all RDM disks and their naa ID’s is to use this powershell command:

Get-VM | Get-HardDisk -DiskType “RawPhysical”,”RawVirtual” | Select Parent,Name,DiskType,ScsiCanonicalName,DeviceName

Then you just need to identify the naa ID’s that are connected to more than one VM to identify the likely MSCS candidates.

To flag the volumes you then just need to run esxcli (I prefer the powercli version which makes it easy to apply the command to a group of hosts):

$esxcli= get-esxcli -VMHost ESXhost

$$false, “naa.60030d90544d5353514c322d46455247“, $true)


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s