Technical

Understanding the Impacts of Mixed-Version vCenter Server Deployments

(Update – April 5, 2024: While this post is older, the content, strategies, and limitations discussed here continue to be applicable to vSphere 7 and vSphere 8).

There are a few questions that come up quite often regarding vCenter Server upgrades and mixed-versions that we would like to address. In this blog post we will discuss and attempt to clarify the guidance in the vSphere Documentation for Upgrade or Migration Order and Mixed-Version Transitional Behavior for Multiple vCenter Server Instance Deployments. This doc breaks down what happens during the vCenter Server upgrade process and describes the impacts of having components – vCenter Server and Platform Services Controller (PSC), running at different versions during the upgrade window. For example, once you get some vCenter Server instances upgraded, say to 6.5 Update 1, you won’t be able to manage those upgraded instances from any 5.5 instances. While most of the functionality limitations manifest themselves when upgrading from 5.5 to 6.x, there could also be some quirks in environments running a mix of 6.0 and 6.5. There are a couple of additional questions that seem to arise from this doc so let’s see if we can address them.

The Upgrade Process

I’m not going to go through the entire process here, but it is important to understand the basics of how a vCenter Server upgrade works. Remember that there are two components to vCenter Server – the Platform Services Controller (PSC) which runs the vSphere (SSO) Domain and vCenter Server itself. For a vCenter Server upgrade, the vSphere Domain and all PSCs within it, must be upgraded first. Once that is complete, then the vCenter Servers can be upgraded. Obviously, if you have a standalone vCenter Server with an embedded PSC, this is a much simpler proposition. But, for those requiring external PSCs because of other requirements such as Enhanced Linked Mode, just remember the PSCs need to be upgraded first.

Mixed-Version Upgrade Phases

The other important point to make here is that upgrading by site is not supported. Looking at the above example, you can see there are two sites each with an external PSC and a vCenter Server. It is a common that a customer would like to upgrade an entire site, test, and then move onto the next site. Unfortunately, this is not supported and all PSCs within the vSphere Domain across all sites must be upgraded first.

Mixed-Version Support

Now, on to the questions mentioned earlier. The first question is, “Can I run vCenter Servers and Platform Services Controllers (PSCs) of different versions in my vSphere Domain?”  The answer here is yes, but only during the time of an upgrade. VMware does not support running different versions of these components under normal operations within a vSphere Domain. The exact verbiage from the article is, “Mixed-version environments are not supported for production. Use these environments only during the period when an environment is in transition between vCenter Server versions.” So, do not plan on running different versions of vCenter Server and PSC in production on an ongoing basis.

The second question is then, “How long can I run in this mixed-version mode?” This question is a bit tougher to answer. There is no magic date or time bomb when things will just stop working. This is really more of a question of understanding the risks and knowing how problems may affect the environment should something go wrong while in this mixed-version state.

The Risks

An example of one such risk would be if you were upgrading to vSphere 6.5 from 5.5. Let’s say you had your vSphere Domain (i.e. PSCs) and one vCenter Server already upgraded leaving you with 1 or more vCenter Server 5.5 instances. Imagine that something happens leaving a vCenter Server 5.5 completely wiped out. You could restore that vCenter Server 5.5 instance and be back in production as long as you have a good, current backup. If the backup you need to restore from was taken prior to the start of the vSphere Domain upgrade, you would not be able to use it to restore. The reason for this is that the vCenter Server instance that you would be restoring is expecting a 5.5 vSphere Domain and the communication between that restored vCenter Server instance and the 6.5 PSC would not work. An alternative to this would be to rollback the entire vSphere Domain and any other vCenter Servers that were upgraded.

Another risk would be if we are unable to restore that instance because the backups were bad (it does happen) or you couldn’t accept the outcome of losing the data since that backup was taken.  The result here is that you would be forced to rebuild that vCenter Server instance and re-attach all the hosts. This may not be desirable because this new vCenter Server instance would have a new UUID and all of the hosts, VMs, and other objects would also have new moref IDs. This means that any backup tools or monitoring software would see these as all net new objects and you would lose continuity of backups or monitoring. You also would have to rebuild the vCenter Server instance as 6.5 which also may not be desirable because you may have an application or other constraint that requires a specific version of vCenter Server. If you rebuild the instance as 6.5 you may break that application.

Finally, let’s consider the possibility of having a PSC failure instead of losing a vCenter Server. What happens?  Normally, you could easily repoint a vCenter Server instance to another external PSC within the same SSO Site. However, this would not be possible if the vCenter Server is not running the same version as the PSC you are attempting to repoint to. For example, if you had a vCenter Server 5.5 or 6.0 and they were pointing to a 6.5 PSC (because it has already been upgraded), if that PSC failed you would not be able to repoint that vCenter Server to another PSC. Remember that all PSCs must be upgraded first so all PSCs should be running 6.5 already. The only way to recover from this scenario is to restore or redeploy the failed PSC which may take longer than repointing.

Recommendations

So, give the above scenarios, what do we tell a customer who asks, “My upgrade plan spans multiple sites over multiple months. How should I plan my upgrade?” Here are our recommendations:

  1. Minimize the upgrade window
  2. Follow the upgrade documentation
  3. Take full backups before, during, and after the upgrade
  4. Check the interop matrices and test the upgrade first

The first recommendation is to minimize the upgrade window as much as possible.  We understand that there’s only so much you can do here, but it is important to reduce the amount of time you’ll be running different versions of vCenter Server (and PSC) in the same vSphere Domain. The second recommendation is to, no matter how tempting to do otherwise, upgrade the entire vSphere Domain (SSO Instances and PSCs) first as is called out in the vSphere Documentation. It is not supported to upgrade everything in one site and then move onto the next. You must upgrade all SSO Instances and PSCs in the vSphere Domain, across ALL sites and locations, first. Third, make sure you have good backups every step of the way. While snapshots can be a path to a quick rollback, when dealing with SSO, PSCs, and vCenter Server they don’t always work. Taking a full backup ensures the ability to restore to a known clean state. Last, and certainly not least, do your interoperability testing and test the upgrade in a lab environment that represents your production environment as much as possible.

Emad has a great 3-part series on upgrades (Part 1, Part 2, Part 3) so be sure to check it out prior to testing and beginning your upgrade. Also know and understand the risks and impacts of problems during the upgrade process. Finally, know how the upgrade process is going to affect all of the yet-to-be-upgraded parts of your environment and have good rollback and mitigation plans if any issues come up.