By guest blogger Christian Wickham, Technical Account Manager for South Australia and the Northern Territory, and for Local Government and Councils in Western Australia, Victoria and New South Wales, at VMware Australia and New Zealand
An often forgotten VMware benefit – Pick and Mix
In my interactions with new and existing VMware customers, it strikes me how often people forget what I consider to be one of the best benefits of a VMware-based clustered solution. It is often pleasant news for people planning or implementing a new VMware virtualisation solution to find that they can mix different hardware from different vendors in the same cluster. You might be wondering why this is so good – here are some of the benefits as I see them:
- You can mix servers from Dell, HP, IBM, Cisco or whoever you want – this means you can select the hardware that provides the best balance of features for the price. So, if you get a great deal on Dell “pizza box” rack-mount servers one day, and then want to invest in Cisco blades for high memory density the next – go ahead. The compute resources are abstracted from the virtual machines, so you can mix and match the underlying servers.
- If you want to expand your cluster onto new server hardware, it does not need to match the existing hardware – it just needs to be ‘similar’ (see the sections below on EVC and storage).
- You can migrate from old hardware to new hardware without downtime (depending upon your license), and, as above, you can run the old and new hardware in parallel for as long as you want.
- Storage controllers (Host Bus Adapters – HBAs) can be different in the same cluster. You can have an Emulex 4Gbps dual-port Fibre Channel PCIe card in one host and a QLogic 8Gbps mezzanine card in a blade, and they can be in the same cluster. They just need to have the same storage presented to them consistently (the first sketch after this list shows a quick way to check that).
- As with storage HBAs, your network interface cards can differ too – different manufacturers, LOM (LAN on Motherboard) or add-in cards. The same points as above about consistent access apply.
- You can mix in multiple SANs and (providing you have the vSphere license for Storage vMotion) move VMs between them – allowing you to move running VMs from old SANs to new (faster, cheaper, more reliable) ones, including between vendors (see the second sketch below).
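To make the “consistent presentation” point concrete, here is a minimal sketch using the pyvmomi Python bindings for the vSphere API. The vCenter address, credentials and the cluster name “Cluster01” are placeholders, and this is a sketch rather than a supported tool. It prints each host’s HBA models and NIC drivers – which are free to differ – and flags any datastore that is not visible to every host in the cluster, which is the one thing that does need to stay consistent.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder vCenter address and credentials.
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    clusters = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in clusters.view if c.name == "Cluster01")  # placeholder cluster name

    datastores_seen = {}
    for host in cluster.host:
        # The HBA and NIC hardware can happily differ from host to host.
        hbas = [hba.model for hba in host.config.storageDevice.hostBusAdapter]
        nics = [nic.driver for nic in host.config.network.pnic]
        datastores_seen[host.name] = {ds.name for ds in host.datastore}
        print(host.name, "| HBAs:", hbas, "| NIC drivers:", nics)

    # Every host in the cluster should see the same shared datastores.
    all_ds = set.union(*datastores_seen.values())
    for name, seen in datastores_seen.items():
        missing = all_ds - seen
        if missing:
            print(name, "cannot see:", sorted(missing))
finally:
    Disconnect(si)
```

And since the last point mentions Storage vMotion, here is a similarly hedged sketch of relocating a running VM’s disks to a datastore on a new SAN. The VM name “app01”, the datastore name “NewSAN-LUN01” and the connection details are again placeholders, and it assumes a license edition that includes Storage vMotion.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    vms = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in vms.view if v.name == "app01")              # placeholder VM name
    dss = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
    target = next(d for d in dss.view if d.name == "NewSAN-LUN01")   # placeholder datastore name

    # Relocate the VM's storage while it stays powered on (Storage vMotion).
    spec = vim.vm.RelocateSpec(datastore=target)
    WaitForTask(vm.RelocateVM_Task(spec))
finally:
    Disconnect(si)
```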
The value to your own business will vary, but for SMEs and organisations with smaller capital budgets, being able to gradually increase the size of a cluster, or to re-use hardware that previously ran a physical server, is a significant benefit. For larger businesses too, the ability to transition to new hardware, or even to try out a new vendor or technology, is a capability that is not available from other software virtualisation providers.
A few years ago, before I started working for VMware, I deployed a physical SQL Server 2008 R2 failover cluster with shared storage – just two nodes. When the company grew, we needed to add a new node and hit a roadblock: the HBAs in the existing nodes were no longer available. Microsoft’s requirements for Failover Clustering (the Server 2012 requirements are here: http://technet.microsoft.com/en-us/library/jj612869.aspx ) specify that “all elements of the storage stack should be identical […] host bus adapter (HBA), HBA drivers, and HBA firmware—that are attached to cluster storage be identical.” Although we had bought the same physical servers, as per Microsoft’s recommendation that “We recommend that you use a set of matching computers”, the Validate Configuration Wizard would not accept a differing HBA. A rip and replace was required, causing lengthy over-the-weekend downtime and a very scary moment for me when I crossed the ‘point of no return’ where compute and storage were separated and I had to trust that I would get access to the volumes again.
Microsoft’s Hyper-V clustering depends upon Windows Failover Clustering, so it requires upfront planning and procurement of matching hardware to suit not only the current needs of the business, but any future growth – for many years…
In a VMware-based vSphere cluster, these barriers do not prevent expansion or modification of the existing cluster. I have many customers who have upgraded from ESX 3.0 to ESXi 5.1 and completely changed all the underlying hardware (in one case from 2Gbps FC and 1Gbps Ethernet, dual-core CPUs and 36GB RAM – to 8Gbps FC and 10Gbps Ethernet, 8-core CPUs and 256GB RAM) – all without the scary rip-and-replace weekends and ‘point of no return’ moments.
Although vSphere would let you run a cluster on servers, HBAs and network cards that are completely different, best practice is to keep the hardware as similar as possible. This makes fault-finding easier and helps ensure that the VM experience is consistent. When putting in new hardware, please do make sure you visit the VMware Hardware Compatibility List page (http://www.vmware.com/go/hcl) to look up what the hardware vendor has tested and recommends for each version of vSphere. The manufacturers contribute to the VMware HCL from their own testing, so you can be assured that this testing is done by the hardware vendor and not by VMware’s software people.
If you mix older CPUs with newer CPUs, or mix FC or Ethernet speeds, this can have an impact on VM performance. Imagine your 4 vCPU VM is running on a host with the latest 3 GHz CPUs and is vMotioned to a host with 1.6 GHz CPUs – performance can suffer. It is also possible that a VM might be using CPU features of a newer generation of CPU, which will not be available when it moves to a host that does not have those instruction sets. To cope with this scenario, starting with ESX 3.5 Update 2 released in August 2008, VMware introduced Enhanced vMotion Compatibility – EVC (http://kb.vmware.com/kb/1003212) – where the cluster can be configured to mask off the newer features of modern CPUs so that VMs do not attempt to use them. When a VM is powered on, the hypervisor presents the available CPU features as per the EVC setting, ensuring consistent CPU features across the cluster and therefore CPU compatibility for vMotion.

Two important points to note for people who steadily upgrade their clusters, taking out older hosts and putting in newer hardware: 1) the EVC level should be raised to match the lowest CPU capabilities remaining in the cluster at that time, and 2) the VMs don’t get the additional features until they go through a full cold power cycle – a guest reboot is not enough (not much of an impact, considering the VMs gain the extra features with just that power cycle).

A question I often get asked about EVC relates to masking features ‘down’ to a lower generation of CPUs. EVC does not decrease the CPU clock speed or remove the other improvements that Intel or AMD have made to accelerate their latest CPUs, so if you put a new host with the latest generation CPU into a cluster that has EVC enabled, VMs will still benefit from the speed of the newer CPU – just none of the additional features, as those will be masked by EVC.
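If you want to see where a cluster sits, the EVC baseline is easy to read programmatically. The sketch below (again pyvmomi, with placeholder connection details and cluster name) prints the cluster’s active EVC mode and the highest EVC mode each host’s CPUs could support – a quick way to spot when the older hosts have gone and the EVC level can safely be raised.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder vCenter address and credentials.
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    clusters = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in clusters.view if c.name == "Cluster01")  # placeholder cluster name

    # The EVC baseline currently enforced on the cluster (None if EVC is disabled).
    print("Cluster EVC mode:", cluster.summary.currentEVCModeKey)
    for host in cluster.host:
        # maxEVCModeKey is the newest baseline this host's CPUs could support; if every
        # host reports something newer than the cluster key, the EVC level can be raised.
        print(host.name, "supports up to:", host.summary.maxEVCModeKey)
finally:
    Disconnect(si)
```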
vMotion between AMD and Intel CPUs is still not possible while the VM is powered on; you will need to power off the VM and sometimes clear CPU masks manually. As this reduces some of the benefits of features such as DRS, we recommend that your clusters contain CPUs from a single vendor – although you don’t have to!
In a couple of the points above, I mention that the features depend upon your license. One other benefit that people often forget is that with ESXi, the installed software is the same no matter what license you have. So, you can start with the entry-level “Essentials Plus” license that gives you basic vMotion and clustering capabilities, and then, by simply changing the license key, enable features such as Storage vMotion and DRS without needing to restart a single service or server.
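As an illustration of how lightweight that license change is, here is a hedged pyvmomi sketch that adds a new key to vCenter and assigns it to a host, assuming the LicenseManager and LicenseAssignmentManager calls behave as described in the public vSphere API documentation. The key, host name and connection details are all placeholders; no service restart is involved, and the newly licensed features simply become available.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder vCenter address and credentials.
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    host = next(h for h in hosts.view if h.name == "esxi01.example.local")  # placeholder host name

    new_key = "XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"        # placeholder license key
    lic_mgr = content.licenseManager
    lic_mgr.AddLicense(licenseKey=new_key)            # register the key with vCenter
    lic_mgr.licenseAssignmentManager.UpdateAssignedLicense(
        entity=host._moId, licenseKey=new_key)        # assign it to the host, no restart needed
finally:
    Disconnect(si)
```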
Investing in a VMware-based environment enables a business to get started and then grow as its needs eventuate. You can re-use existing hardware, even if it is mismatched, providing it can access the same shared storage and networking, apply the lowest license tier, and then grow, and grow.