Archive for the ‘virtualisation’ Category

For the last three or four days I had been repeatedly receiving a vCenter alarm from two of our new hosts, both reporting that vmnic0 had lost connectivity. The initial investigation confirmed that the physical NIC was up and passing traffic, a review of the host logs showed no errors, and the physical upstream switch had no record of the link going down.
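For anyone chasing the same symptom, the link state is quick to confirm from the host itself. A minimal sketch of the checks, assuming ESXi 5.x esxcli syntax, run over SSH and guarded so it is a no-op anywhere else:

```shell
# Quick state check from the host shell, assuming ESXi 5.x esxcli syntax;
# guarded so it does nothing on a machine that isn't an ESXi host.
NIC="vmnic0"
if command -v esxcli >/dev/null 2>&1; then
  esxcli network nic list            # link state of every physical NIC
  esxcli network nic get -n "$NIC"   # driver and link detail for the alarmed NIC
fi
```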


Previous Status: Green

New Status: Red

 Alarm Definition:

([Event alarm expression: Lost Network Connectivity; Status = Red] OR [Event alarm expression: Restored network connectivity to portgroups; Status = Green] OR [Event alarm expression: Lost Network Connectivity to DVPorts; Status = Red] OR [Event alarm expression: Restored Network Connectivity to DVPorts; Status = Green])

 Event details:

Lost network connectivity on virtual switch “vSwitch0”. Physical NIC vmnic0 is down. Affected portgroups: “Vmotion”, “Management Network”.

The alert claimed the loss of connectivity was affecting two portgroups which didn’t even have this pNIC as their active adapter, while the portgroups that did use this adapter as active were not listed.

It then became apparent that the alerts were being sent exactly one hour apart. Smelling a rat, I restarted the vCenter service, and so far the alerts have stopped. I have yet to find a root cause for these erroneous alerts, or any KB article that fits the problem, but it was only occurring on the new IBM HS23E blades running ESXi 5, not on recently built HS22 blades.


For the last few days I’ve been smashing my head against a wall trying to build a vSphere 5 lab inside an ESXi 5 host.

At first blush the whole thing seemed a simple exercise: get a basic virtual two-host setup going with a SANsymphony virtual server as a shared storage array. I figured I’d start with two virtual switches, one for management and vMotion and one for iSCSI, so that I could try out Host Profiles and Auto Deploy initially.

Try as I might, the iSCSI connection would not work with a second VMkernel port bound to iSCSI. Bind iSCSI to the initial management port and flatten the whole network configuration down to one network, and iSCSI would connect to the target. Looking at netstat I could see the SYN sent on the host and the SYN received on the target, but that was it. Sure that I was getting things right inside the lab, I turned to Google on nested ESXi and found this article.

Sure enough, I needed to make a key change to the virtual switches on the physical ESXi host:

Configure a vSwitch and/or Port Group to have Promiscuous Mode enabled

What I don’t fully understand is why. Right now my focus is on studying for the VCP5 exam, but once that is out of the way I’ll update this post with an explanation, because it makes no sense to me why it’s necessary.
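The same change can also be made from the host shell. A sketch, assuming ESXi 5.x esxcli syntax and a hypothetical outer vSwitch named vSwitch0, guarded so it is a no-op off an ESXi host:

```shell
# Enable promiscuous mode on the outer (physical) host's vSwitch; syntax
# assumed from ESXi 5.x, and the vSwitch name is hypothetical.
VSWITCH="vSwitch0"
if command -v esxcli >/dev/null 2>&1; then
  esxcli network vswitch standard policy security set \
    --vswitch-name="$VSWITCH" --allow-promiscuous=true
  # Nested-ESXi guides also commonly recommend allowing forged transmits:
  esxcli network vswitch standard policy security set \
    --vswitch-name="$VSWITCH" --allow-forged-transmits=true
  # Verify the resulting policy:
  esxcli network vswitch standard policy security get --vswitch-name="$VSWITCH"
fi
```

The commonly cited explanation, for what it’s worth, is that the nested ESXi’s own VMs transmit from MAC addresses the outer vSwitch never learned (it only knows the nested host VM’s vNIC MAC), so without promiscuous mode those frames get filtered.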


VAAI Unmap

Posted: March 30, 2012 in SAN, virtualisation, VMWARE

Chad muses here on being more open about providing information to partners and customers, and then inadvertently proves it by succinctly identifying that Unmap for VAAI returns in Update 1, but is no longer automatic; it’s now a vmkfstools command.
Probably the only way to easily reclaim space on thin-provisioned volumes right now without the long delays or timeouts.
No doubt VMware’s engineering effort will continue and we’ll see it added back as an automatic process in a later update.
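For reference, a hypothetical example of the manual reclaim, assuming the 5.0 Update 1 vmkfstools syntax; the datastore name is made up, and the guard makes it a no-op off an ESXi host:

```shell
# Manual VAAI UNMAP reclaim, assumed 5.0 U1 syntax; datastore name is made up.
DATASTORE="datastore1"
if command -v vmkfstools >/dev/null 2>&1; then
  # -y takes a percentage of the datastore's free space to reclaim; it works
  # by inflating a temporary file on the volume, so run it out of hours.
  ( cd "/vmfs/volumes/$DATASTORE" && vmkfstools -y 60 )
fi
```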

I ran into an interesting symptom of presenting snapshotted VMFS volumes from my production vSphere 4 clusters to my isolated lab environment, which is a ground-up build of vSphere 5.
Looking at the resource allocations for virtual machines imported into the inventory, I found that the machines had memory limits applied that matched the configured vRAM setting. Checking my production environment, I confirmed no limits had been applied there, and new virtual machines created in my lab were set to unlimited.

I’m not sure why this occurs; hopefully it doesn’t occur in an upgrade!
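One way to sweep for stray limits from the host shell is to look in the .vmx files directly, since on my 5.x hosts the memory limit lands there as sched.mem.max (unlimited shows as -1 or the key being absent). A hypothetical sketch:

```shell
# Hypothetical sweep, run on an ESXi host: print any .vmx that carries a
# sched.mem.max entry, i.e. a memory limit. Path layout assumed from a
# standard /vmfs/volumes/<datastore>/<vm>/ structure.
for vmx in /vmfs/volumes/*/*/*.vmx; do
  [ -f "$vmx" ] || continue             # skip if the glob matched nothing
  grep -H 'sched.mem.max' "$vmx" || true   # no output means no limit set
done
```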

Anyone looking at SANsymphony should read this article if they are considering stretching mirrored SANsymphony volumes across two data centers.

We’ve implemented this very scenario at the port, with a slight difference around mitigating the impact if, God forbid, a full split-brain disaster should occur. We’re currently using vSphere 4, so we can’t utilize the new vSphere 5 features that allow stretched ESX clusters either.

Oliver Krehan’s article deals with a vSphere 5 stretched ESX cluster scenario and is a must-read. He does suggest one VMFS volume per VM, which I’m always loath to use, as it totally negates the point of VMFS. I would add that doing everything possible to avoid a split-brain scenario at the design phase is critical. I’m putting together a post soon to cover stretched SANsymphony clusters.

SANsymphony implements a primary/secondary node structure, meaning it will direct IO down the primary path (ignoring ALUA, which is a whole different discussion) unless that node’s paths are dead.

Effectively, at the port we have created two ESX clusters, one in each data center, both mapped to all the VMFS volumes available at both data centers. Virtual machines are then run on the cluster in the same data center as the volume’s primary node. This means that in normal operation IO stays within the data center and only crosses the ISL links during a failure. In a full split-brain scenario, where no LAN or SAN connectivity is available between data centers, changes should only be made at the primary node and not at the secondary.

There are a couple of limitations. One is that a complete failure of a data center requires manual intervention (or a partially or fully automated script) to restart virtual machines on the other cluster. The other is that careful management of virtual machine placement is needed to ensure virtual machines stay aligned with the correct cluster and VMFS volume.
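The placement rule itself is mechanical: a VM is aligned when its cluster sits in the same data center as the primary SANsymphony node of its VMFS volume. A toy sketch of that check (all names invented; real inputs would come from vCenter and the SANsymphony console):

```shell
# Toy alignment check; VM names and site labels are made up for illustration.
check_alignment() {
  vm=$1
  vm_site=$2        # data center of the cluster the VM runs in
  primary_site=$3   # data center of its volume's primary SANsymphony node
  if [ "$vm_site" != "$primary_site" ]; then
    echo "MISALIGNED: $vm runs in $vm_site but its volume's primary node is in $primary_site"
  fi
}

check_alignment web01 DC-A DC-A   # aligned: prints nothing
check_alignment db02  DC-A DC-B   # prints a MISALIGNED warning
```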

EMC ProSphere

Posted: March 23, 2012 in EMC, SAN, virtualisation, VMWARE

If you’re managing any reasonably sized VMware farm and using EMC storage, then this is for you.

The biggest management headache we face as virtualisation admins is identifying storage performance issues in what can be a complicated environment. I’ve always lamented the lack of a tool that can discover all the components, link them together, and present one view. By the looks of it, EMC have finally delivered. I haven’t tried it out yet, but once I have I’ll post my thoughts.

Hint, hint, EMC: maybe develop this further to work with other vendors’ arrays?