vSphere 6.5 Upgrade Considerations Part-3

Update 10/3/17: By request, a Requirements and Decisions section has been added for the example scenario, and the blog post has been updated to reflect these changes.

We started this series by discussing the concepts and processes that make up a successful vSphere 6.5 upgrade. We then took what we learned in the first blog post and applied it to a particular scenario: upgrading an environment from vSphere 5.5 to 6.5. Now let’s focus on how to upgrade an environment from vSphere 6.0 to 6.5 Update 1. We will walk through another example scenario, which may or may not match your particular environment. The key takeaway is understanding the upgrade process and concepts so you can apply them to your own environment.

Scenario

The Awesome Sticker company has five datacenter sites: three in the US and two in Europe. Each site is running vSphere 6.0 using the vCenter Server Appliance (VCSA) in an external deployment model for Enhanced Linked Mode. There are three vSphere Single Sign-On (SSO) domains: US, Europe, and VDI. Unlike the other deployments, the VDI environment uses an embedded Windows vCenter Server. vSAN is the company’s primary storage. The company also uses SRM for disaster recovery, currently only in the US datacenters.

Management has tasked Kyle, their new systems engineer, with upgrading the environment to vSphere 6.5. Kyle has already verified the current Dell hardware at each site is compatible with vSphere 6.5 from the hardware compatibility list. He was also proactive and started whiteboarding the upgrade path of each product.

Environment Discovery

Being new to the environment, Kyle wanted to take some time to assess it. The current environment is running vSphere 6.0 Update 3. Reviewing the release notes, Kyle noticed that upgrades from 6.0 Update 3 to vSphere 6.5 releases prior to Update 1 are not supported. An avid reader of the vSphere blog, Kyle learned that vSphere 6.5 Update 1 is now available and supports upgrades from vSphere 6.0 Update 3, so he can proceed with upgrading. Kyle is also planning to set up a meeting with the business owners of each department. In the meantime, he started noting his observations about the environment:

  • Each site only has one Platform Services Controller (PSC).
  • vCenter Servers at each site are not properly backed up.
  • There is a Nexus 1000v running in one of the European data centers.
  • Windows vCenter Server used for VDI environment.
  • SRM is only used in the US datacenter, no DR solution in the Europe sites.
  • Virtual Standard Switches used for host management and vMotion.
  • No centralized management of ISOs.
  • Portgroup names are inconsistent and need to be cleaned up.

Requirements and Decisions

External Deployment with single PSC

As noted previously, Kyle observed that there is only a single external PSC at each site, which could be a single point of failure. Kyle is considering changing to multiple PSCs per site. This all comes down to the availability needs of the company and knowing what role the PSC plays in the environment. The PSC provides authentication and management of the vSphere SSO domain. We also can’t talk about availability without mentioning Service Level Agreements (SLAs). The discussion Kyle is having with the different business units (BUs) in the company is a must and will help determine RPO and RTO requirements. All this information will factor into whether a secondary PSC at each site is actually required or just overhead. Let’s consider the PSC’s role in the vSphere SSO domain and what happens when a PSC is not online.

  • Users cannot log in to or manage the vCenter Servers registered to the PSC that is not online. A single PSC can have as many as 15 vCenter Servers registered in a vSphere 6.5 Update 1 SSO domain (not recommended). A vCenter Server can only be registered with one PSC at a time (1:1).
  • Workloads are still running, but no new changes or deployments can take place.
  • External PSC is required only for enhanced linked mode, otherwise, use an embedded deployment.
  • Repointing across sites is not supported in vSphere 6.5 – only intra-site. This requires having a secondary PSC at each site.
  • A load balancer provides automatic failover for vCenter Server and other products using PSC for authentication. The trade-off is the added complexity.
  • A load balancer is optional for external deployments without vCenter Server High Availability (VCHA).
  • An external deployment with VCHA requires a load balancer for the PSCs as it’s not aware of the manual repoint. Also if protecting vCenter Server, PSC should be protected as well.
  • Manual repoint of vCenter Server and other products using PSC for authentication via cmsso-util. The trade-off is manual intervention.
  • A linear topology within the vSphere SSO domain is easier to manage and adds no extra overhead on the PSCs. The recommendation is to create a ring by adding a replication agreement using vdcrepadmin to have a secondary data path.
  • During patching or upgrades, repointing can reduce downtime of vCenter Server.
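
The manual repoint option above can be sketched in a few lines of Python. This is only an illustration, not a supported tool: it builds the `cmsso-util repoint --repoint-psc` command that would be run on the vCenter Server, and refuses targets in a different site to reflect the intra-site restriction. The `Psc` type, the site labels, and the hostnames are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Psc:
    """A Platform Services Controller and the SSO site it belongs to (hypothetical model)."""
    fqdn: str
    site: str


def build_repoint_command(vcenter_site: str, target: Psc) -> list[str]:
    """Return the cmsso-util arguments for repointing a vCenter Server to `target`.

    Raises ValueError for a cross-site target, since repointing across
    sites is not supported in vSphere 6.5.
    """
    if target.site != vcenter_site:
        raise ValueError(
            f"{target.fqdn} is in site {target.site!r}, not {vcenter_site!r}; "
            "only intra-site repointing is supported"
        )
    return ["cmsso-util", "repoint", "--repoint-psc", target.fqdn]


# Example with made-up hostnames: repoint a London vCenter Server to the secondary London PSC.
command = build_repoint_command("London", Psc("psc2.london.example.com", "London"))
print(" ".join(command))
```

Keeping the site check in code mirrors the reason a secondary PSC is needed at each site: without one, there is no valid repoint target when the local PSC fails.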

vCenter Server Backups

In case of a vCenter Server failure, it is important to ensure a proper backup is available to recover from. The Awesome Sticker company has a backup team that uses a third-party backup product for all workloads. vCenter Server and PSC were never included as part of the backup solution and schedule.

  • Kyle is working with the backup team to ensure their product supports VADP (VMware vSphere Storage APIs – Data Protection) and can take image-level backups.
  • vCenter Server Appliance 6.5 has built-in file-based backup and restore. This supports both embedded and external PSC deployments.
    • Backups can be taken while the VCSA or PSC is running; no quiescing is needed.
    • Restores use the VCSA 6.5 ISO that was used for deployment; no agents are needed.
    • REST APIs are available for automation.
  • The Windows vCenter Server for VDI will use the third-party backup solution for image-level backups until it is migrated.
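
As a rough sketch of what automating the file-based backup through the REST API might look like, the Python below builds (but does not send) a backup job request. The endpoint path, the `piece` field names, the `common` and `seat` part names, and the `vmware-api-session-id` header reflect the vSphere 6.5 appliance API as best understood here and should be checked against the API reference for your build. The function name and all hostnames and credentials are hypothetical.

```python
from __future__ import annotations

import json
import urllib.request

# Transfer protocols listed for the VCSA 6.5 file-based backup.
SUPPORTED_PROTOCOLS = {"FTP", "FTPS", "HTTP", "HTTPS", "SCP"}


def build_backup_request(
    vcsa_host: str,
    session_id: str,
    protocol: str,
    location: str,
    user: str,
    password: str,
    include_stats: bool = False,
) -> urllib.request.Request:
    """Build a POST request that would start a file-based backup job.

    Nothing is sent here; pass the result to urllib.request.urlopen once
    the session ID and the backup target have been verified.
    """
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported backup protocol: {protocol}")
    parts = ["common"]          # assumed name of the mandatory inventory/config part
    if include_stats:
        parts.append("seat")    # assumed name of the optional stats/events/tasks part
    body = {
        "piece": {
            "location_type": protocol,
            "location": location,
            "location_user": user,
            "location_password": password,
            "parts": parts,
        }
    }
    return urllib.request.Request(
        url=f"https://{vcsa_host}/rest/appliance/recovery/backup/job",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "vmware-api-session-id": session_id,
        },
        method="POST",
    )


# Example with placeholder values only.
request = build_backup_request(
    "vcsa.london.example.com", "session-id-placeholder",
    "SCP", "backup.example.com/vcsa/london", "backup-user", "not-a-real-password",
)
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it easy to review the payload in change control before any job is actually started.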

Network

Kyle noticed there is a Nexus 1000v running in one of the European datacenters. In reading the vSphere 6.5 Update 1 release notes, he discovered this will be the last release to support a third-party switch such as the Cisco Nexus 1000v. Kyle has decided to take the opportunity to migrate to the vSphere Distributed Switch (VDS). He will also use this time to clean up a few other network items.

  • He will use the Nexus 1000v to VDS migration tool to move this environment to a vSphere Distributed Switch (VDS).
  • Kyle wrote a PowerCLI script to find any empty portgroups in the environment and plans to have them decommissioned.
  • The portgroups will be renamed to something meaningful, since each is currently listed as "portgroup" with a random number.
  • vSphere 6.5 requires unique names across all VDSs and distributed portgroups in the same network folder. Prior versions of vSphere allowed duplicate names (see KB 2147547).
  • Since Kyle is migrating to a VDS, he also plans to migrate the management and vMotion portgroups from a VSS to the VDS.
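
The two portgroup checks in the list above (empty portgroups and names that collide within a network folder) are simple to express over an inventory export. The sketch below is not Kyle's PowerCLI script; it is a hypothetical Python equivalent that assumes the inventory has already been exported into records with a name, a network folder, and a count of connected VMs.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Portgroup:
    """One exported portgroup record (hypothetical schema)."""
    name: str
    folder: str
    connected_vms: int


def empty_portgroups(portgroups: list[Portgroup]) -> list[str]:
    """Names of portgroups with no connected VMs: decommission candidates."""
    return [pg.name for pg in portgroups if pg.connected_vms == 0]


def duplicate_names(portgroups: list[Portgroup]) -> list[tuple[str, str]]:
    """(folder, name) pairs that appear more than once in the same network folder."""
    counts = Counter((pg.folder, pg.name) for pg in portgroups)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up inventory for illustration.
inventory = [
    Portgroup("portgroup-4821", "network/london", 0),
    Portgroup("portgroup-1177", "network/london", 12),
    Portgroup("portgroup-1177", "network/london", 3),
    Portgroup("portgroup-1177", "network/berlin", 5),
]
print(empty_portgroups(inventory))
print(duplicate_names(inventory))
```

Running both checks before the upgrade gives a rename and decommission list up front, rather than discovering name conflicts partway through.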

Content Management

The company’s ISOs are all over the place and have become a management nightmare. Kyle noticed that some are placed on different vSAN datastores. The business owners also have several stored on their local machines, taking up large amounts of space. Kyle has been looking at creating a centralized library and having the different business owners subscribe. He has come up with the following list:

  • Implement a Content Library at each datacenter in a subscription model.
  • vSphere 6.5 offers a Content Library ISO file option for installing new operating systems and applications on VMs directly from a Content Library. This will help ensure that ISOs don’t get copied to a datastore or local machine.
  • Business owners can now check their Content Library before downloading an ISO to avoid duplicates.
  • After ISOs, Kyle will look at VM templates. VM templates can be managed by Content Library, but currently only in OVF format.
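
Before importing the scattered ISOs into a Content Library, it can help to know which files are byte-for-byte duplicates. The following is a minimal sketch of one way to do that with SHA-256 hashes. It is not part of vSphere or Content Library, and it assumes the ISO locations are reachable as ordinary directories; the function names are made up for this example.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large ISOs are never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_isos(roots: list[Path]) -> dict[str, list[Path]]:
    """Map each SHA-256 digest to the ISO files sharing it, keeping only duplicates."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for root in roots:
        for iso in sorted(Path(root).rglob("*.iso")):
            by_hash[sha256_of(iso)].append(iso)
    return {digest: paths for digest, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicate_isos` with the list of directories to scan returns groups of identical files, so only one copy of each needs to be uploaded to the library.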

Upgrade Order

Kyle opened a proactive SR with VMware Global Support Services (GSS) as the first step in getting ready for his upcoming upgrade. Since there are three different vSphere SSO domains, Kyle has decided that he will tackle the smallest footprint and work his way up. His plan of attack starts with the VDI environment, then the European datacenters, and finally the US datacenters.

VDI Environment

Kyle took a backup before starting the VDI environment upgrade.

Note: Run the following script here to get an estimated time of how long the migration process will potentially take. Add buffer to the estimated time in case of issues.

  1. The embedded vCenter Server 6.0 U3 will need to be migrated to 6.5 U1. All of the components (PSC, VC, VUM) reside on one virtual machine and will be migrated to a VCSA with embedded PSC. This will be done using the migration tool included in the VCSA 6.5 Update 1 ISO.
  2. Upgrade the Horizon View server from 7.0 to 7.2. Make sure to upgrade the Horizon View components in the correct order.
  3. Use VUM to upgrade the ESXi hosts for the VDI environment.
  4. Upgrade VMware Tools, then upgrade the Horizon Agent in the virtual desktops.
  5. Configure the vCenter Server backup protocol (FTP, FTPS, HTTP, HTTPS, SCP). The native backup can be run manually or automated using the APIs.

European Datacenters

  1. Use the Nexus 1000v to VDS migration tool and complete network cleanup tasks.
  2. Back up the PSCs and vCenter Servers prior to upgrading. Yes, you can take a snapshot, but there is no need to shut down all the PSCs to do so. Remember the PSCs are multi-master, so all the information is replicated across the vSphere SSO domain.
  3. Upgrade all the PSCs in the vSphere SSO domain from 6.0 U3 to 6.5 U1 first. In this case, we only have two, but will be adding additional PSCs later. Mount the VCSA 6.5 U1 ISO, select the upgrade option from the installer menu, and go through the wizard to upgrade the PSCs. The environment is now in mixed mode, so it is important to upgrade all the vCenter Server Appliances within the vSphere SSO domain as soon as possible. There is no enforced time limit on mixed mode, but it is better to get both vCenter Server components (PSC and vCenter Server) on the same version from a troubleshooting perspective and to gain all the new functionality in vCenter Server 6.5.
  4. Upgrade all vCenter Server Appliances within the vSphere SSO domain from 6.0 U3 to 6.5 U1 by mounting the VCSA 6.5 U1 ISO and selecting the upgrade option from the installer menu.
  5. Since there are only two PSCs, one at each site, add a secondary PSC at each site and then use vdcrepadmin to clean up the replication agreements once done. This will also give Kyle the ability to manually repoint within each site in case of a PSC failure. Although he liked the idea of automated failover using a load balancer, he didn’t want the added complexity it brings.
    • Original PSC #1 in the London site is connected to original PSC #1 in the Berlin site.
    • Deploy a new secondary PSC #2 in the London site; its replication partner is the original PSC #1 in the London site.
    • Deploy a new secondary PSC #2 in the Berlin site; its replication partner is the original PSC #1 in the Berlin site.
    • Use vdcrepadmin to create a new replication agreement from PSC #2 in the Berlin site to PSC #1 in the London site.
    • Clean up the old replication agreement between original PSC #1 in the London site and original PSC #1 in the Berlin site.
  6. Upgrade vSAN.
  7. Upgrade SRM.
  8. Upgrade ESXi hosts using VUM.
  9. Upgrade VMware Tools and VM compatibility.
  10. Upgrade VMFS.
  11. Configure the vCenter Server backup protocol (FTP, FTPS, HTTP, HTTPS, SCP). The native backup can be run manually or automated using the APIs. You only need to back up one of the PSCs in the vSphere SSO domain, but there is no harm in backing all of them up. Restoring a PSC is only required if they all go down; otherwise, just deploy a new one.
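
The replication changes in step 5 come down to two vdcrepadmin calls once the new PSCs are deployed (the installer sets each new PSC's initial replication partner). As a hedged sketch, the Python below only assembles those command lines. The `createagreement`, `removeagreement`, and `showpartners` operations and the `-2`, `-h`, `-H`, `-u`, `-w` flags are recalled from the VMware KB on vdcrepadmin and should be verified there; the hostnames, function names, and password placeholder are hypothetical.

```python
from __future__ import annotations

# Location of the tool on the appliance, as documented for the VCSA.
VDCREPADMIN = "/usr/lib/vmware-vmdir/bin/vdcrepadmin"


def agreement_command(
    action: str, source: str, partner: str, password: str, user: str = "administrator"
) -> list[str]:
    """Build a two-way create/remove replication agreement command."""
    if action not in ("createagreement", "removeagreement"):
        raise ValueError(f"unexpected action: {action}")
    return [VDCREPADMIN, "-f", action, "-2",
            "-h", source, "-H", partner, "-u", user, "-w", password]


def show_partners_command(host: str, password: str, user: str = "administrator") -> list[str]:
    """Build the command that lists a PSC's current replication partners."""
    return [VDCREPADMIN, "-f", "showpartners", "-h", host, "-u", user, "-w", password]


def europe_plan(password: str) -> list[list[str]]:
    """Commands for the European SSO domain, in the order described above."""
    london1 = "psc1.london.example.com"   # hypothetical hostnames
    berlin1 = "psc1.berlin.example.com"
    berlin2 = "psc2.berlin.example.com"
    return [
        agreement_command("createagreement", berlin2, london1, password),
        agreement_command("removeagreement", london1, berlin1, password),
        show_partners_command(london1, password),
    ]


for command in europe_plan("<sso-admin-password>"):
    print(" ".join(command))
```

Creating the new agreement before removing the old one means the two sites are never left without a replication path between them.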

US Datacenters

  1. Complete network cleanup tasks.
  2. Back up the PSCs and vCenter Servers prior to upgrading. Yes, you can take a snapshot, but there is no need to shut down all the PSCs to do so. Remember the PSCs are multi-master, so all the information is replicated across the vSphere SSO domain.
  3. Upgrade all the PSCs in the vSphere SSO domain from 6.0 U3 to 6.5 U1 first. In this case, we only have three, but will be adding additional PSCs later. Mount the VCSA 6.5 U1 ISO, select the upgrade option from the installer menu, and go through the wizard to upgrade the PSCs. The environment is now in mixed mode, so it is important to upgrade all the vCenter Server Appliances within the vSphere SSO domain as soon as possible. There is no enforced time limit on mixed mode, but it is better to get both vCenter Server components (PSC and vCenter Server) on the same version from a troubleshooting perspective and to gain all the new functionality in vCenter Server 6.5.
  4. Upgrade all vCenter Server Appliances within the vSphere SSO domain from 6.0 U3 to 6.5 U1 by mounting the VCSA 6.5 U1 ISO and selecting the upgrade option from the installer menu.
  5. Add a secondary PSC at each site. Since there is only one PSC at each site today, we will have to use vdcrepadmin to clean up some replication agreements once we are done.
    • Original PSC #1 in the Seattle site is connected to original PSC #1 in the Austin site.
    • Original PSC #1 in the Austin site is connected to the original PSC #1 in the Tampa site.
    • Deploy a new secondary PSC #2 in the Seattle site; its replication partner is the original PSC #1 in the Seattle site.
    • Deploy a new secondary PSC #2 in the Austin site; its replication partner is the original PSC #1 in the Austin site.
    • Deploy a new secondary PSC #2 in the Tampa site; its replication partner is the original PSC #1 in the Tampa site.
    • Use vdcrepadmin to create a new replication agreement from PSC #2 in the Tampa site to PSC #1 in the Seattle site for the ring.
    • Create a new replication agreement between the newly deployed PSC #2 in the Seattle site and PSC #1 in the Austin site.
    • Create a new replication agreement between the newly deployed PSC #2 in the Austin site and PSC #1 in the Tampa site.
    • Clean up the old replication agreement between original PSC #1 in the Seattle site and the original PSC #1 in the Austin site.
    • Clean up the old replication agreement between the original PSC #1 in the Austin site and the original PSC #1 in the Tampa site.
    • Use vdcrepadmin to validate that you have a nice linear topology.
  6. Upgrade vSAN.
  7. Upgrade SRM.
  8. Upgrade ESXi hosts using VUM.
  9. Upgrade VMware Tools and VM compatibility.
  10. Upgrade VMFS.
  11. Configure the vCenter Server backup protocol (FTP, FTPS, HTTP, HTTPS, SCP). The native backup can be run manually or automated using the APIs. You only need to back up one of the PSCs in the vSphere SSO domain, but there is no harm in backing all of them up. Restoring a PSC is only required if they all go down; otherwise, just deploy a new one.
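
With six PSCs and several agreements being added and removed in step 5, it is worth sanity-checking the intended end state on paper before touching vdcrepadmin. The sketch below models replication agreements as undirected links and classifies the result. It assumes the original PSC at each US site is PSC #1; the node names and helper functions are invented for illustration and nothing here talks to a real PSC.

```python
from __future__ import annotations

from collections import defaultdict


def link(a: str, b: str) -> frozenset[str]:
    """A two-way replication agreement between two PSCs."""
    return frozenset((a, b))


def classify(agreements: set[frozenset[str]]) -> str:
    """Classify a set of agreements as 'ring', 'linear', 'partitioned', or 'other'."""
    neighbours: dict[str, set[str]] = defaultdict(set)
    for pair in agreements:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    if not neighbours:
        return "other"
    # Depth-first walk to confirm every PSC is reachable from the first one.
    start = next(iter(neighbours))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for peer in neighbours[node] - seen:
            seen.add(peer)
            stack.append(peer)
    if seen != set(neighbours):
        return "partitioned"
    degrees = [len(peers) for peers in neighbours.values()]
    if all(degree == 2 for degree in degrees):
        return "ring"
    if degrees.count(1) == 2 and all(degree <= 2 for degree in degrees):
        return "linear"
    return "other"


def us_plan() -> tuple[set[frozenset[str]], set[frozenset[str]]]:
    """Return the (before, after) agreement sets for the US SSO domain."""
    before = {link("seattle-1", "austin-1"), link("austin-1", "tampa-1")}
    after = set(before)
    # New secondary PSCs join via their local primary.
    after |= {link("seattle-1", "seattle-2"), link("austin-1", "austin-2"),
              link("tampa-1", "tampa-2")}
    # New cross-site agreements, including the one that closes the ring.
    after |= {link("tampa-2", "seattle-1"), link("seattle-2", "austin-1"),
              link("austin-2", "tampa-1")}
    # Old agreements between the original PSCs are cleaned up last.
    after -= {link("seattle-1", "austin-1"), link("austin-1", "tampa-1")}
    return before, after


before, after = us_plan()
print(classify(before), "->", classify(after))
```

In this model the starting topology is a simple chain, and after the listed steps every PSC has exactly two partners, so a single PSC failure still leaves a replication path between the remaining nodes.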

Validation

Kyle kept the business owners in the loop during the entire upgrade process. This started with the initial meeting to explain what the upgrade process was and what it would mean to them. The meetings also served as a vehicle to interview the business and get a sense of their current issues and requirements. Kyle was also proactive in having the business owners test their applications in a vSphere 6.5 Update 1 lab environment. This was documented in the details of the change control Kyle submitted, which also included a rollback plan and the validation testing the business owners would perform. Once the business owners signed off on their applications, Kyle was able to move to the next environment.

Conclusion

Kyle was tasked with upgrading an environment from vSphere 6.0 U3 to vSphere 6.5 Update 1. Before starting, he did his research and got up to speed on what’s new with vSphere 6.5 Update 1. He also watched the new vCenter Server light board videos to learn about vCenter Server architecture, deployment, and availability as part of this upgrade process. Since vCenter High Availability was not a requirement, Kyle was able to add a secondary PSC at each site without a load balancer, reducing complexity. He backed up prior to starting the upgrade process and had a recovery/backout plan in case he needed to revert. The key takeaway is understanding the upgrade process and concepts and applying them to your own environment. Happy upgrading!