Tips for Upgrading to vSphere 6.5 in a Large-Scale Environment

by: VMware Staff II, Cloud SRE Architect Tom Ralph

As a first adopter of VMware products under the VMware on VMware program, IT was able to deploy the vSphere 6.5 beta, release candidate, and general availability versions before they were released to the public.

Our goal in this deployment was twofold. First, we wanted to take advantage of vSphere’s new features, enhancements, and benefits, such as greater scalability, better performance, and improved management capabilities, as soon as possible.

Second, we sought to try out vSphere in a production environment, then report software bugs and share our feedback with the R&D teams to help strengthen the final product. We also share what we learn with customers as we deploy or upgrade to new software versions.

This deployment required a lot of upfront planning because our IT environment is large and highly complex. It’s easy to upgrade a standalone vSphere environment. However, when your IT operations consist of 6 data centers, 22+ cloud instances, and 175+ vCenter Server instances, things get complicated quickly. Our vSphere environment includes cross-vCenter NSX, Site Recovery Manager paired sites, and other VMware and third-party applications, so we had to perform the upgrade in phases.

Why vSphere 6.5

vSphere 6.5 offered a multitude of new features that helped improve the efficiency of our environment. These features have significantly improved how we manage our operations and our service delivery. Here are some of the features that stand out:

  • The vCenter Server Appliance is the first VMware appliance to run on Photon OS, a Linux operating system (OS) optimized for virtualization. Photon offers a threefold performance gain over Windows and a significant reduction in boot and restart times, with the API responding in less than five minutes after a reboot. Even better, the Appliance does not require a separate OS license.
  • vSphere 6.5 gave the Appliance not only feature parity with vCenter Server on Windows, but also an array of new, exclusive vCenter features. These include integration with VMware Update Manager, which makes deployment and configuration a snap; an improved appliance management interface for performing simple tasks; and an integrated backup and restore mechanism that eliminates the need for third-party backup solutions.
  • We’ve all gone through the hassle of having a component or the vCenter OS break and the management console become unavailable. vSphere 6.5 has a native vCenter High Availability (HA) solution with an active/passive/witness architecture. This means that vCenter is no longer a single point of failure, and the HA solution provides a five-minute recovery time objective (RTO) for disaster recovery. This HA capability does not depend on shared storage, raw device mappings (RDMs), or external databases.
  • Patching is much easier now. The OS and vCenter Server Appliance patches are maintained and provided by VMware in a single package.
  • We can now quickly deploy vCenter using an open virtual appliance (OVA) wizard that provisions the OS, database, and vCenter application. This eliminates the need, as on Windows, to install the OS, patches, database, and application separately. Even better, the vPostgres database has been tuned for vCenter, so no separate SQL Server or Oracle database is required. All database issues are addressed by VMware.

Upgrade Process – Lessons Learned

These new features are important, but the real story lies in the actual upgrade process. We followed the traditional phased deployment: a test lifecycle followed by stage, then production implementation. This approach helped us gain confidence, with the knowledge that production is always different. We did not proceed to the next phase until we resolved all the issues of the current stage.
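The rule of not advancing until every issue in the current phase is resolved can be expressed as a simple gate. The following is a minimal sketch, not the tooling we used; the Phase type and the next_phase function are hypothetical names introduced for illustration, and only the phase order (test, stage, production) comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Phase order as described in the article.
PHASES = ["test", "stage", "production"]


@dataclass
class Phase:
    """Hypothetical record of one upgrade phase and its unresolved issues."""
    name: str
    open_issues: List[str] = field(default_factory=list)


def next_phase(current: Phase) -> Optional[str]:
    """Return the name of the next phase, or None after production.

    Raises RuntimeError while the current phase still has open issues,
    which is the gate: no promotion until everything is resolved.
    """
    if current.open_issues:
        raise RuntimeError(
            f"{current.name}: resolve {len(current.open_issues)} open issue(s) first"
        )
    idx = PHASES.index(current.name)
    return PHASES[idx + 1] if idx + 1 < len(PHASES) else None
```

Calling next_phase(Phase("test")) returns "stage", while a phase with any entry in open_issues raises instead of advancing.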

During every test, we made sure we knew the procedures, understood and verified the impact, and resolved issues before we moved to the next stage. We followed the same procedures in stage, with attention to the same areas, but recognized that we would encounter new issues. Here are a few tips to keep in mind before and during the upgrade process:

  • Familiarize yourself with the products by testing, testing, testing to gain confidence and understand the steps and issues encountered during the upgrade.
  • Be aware of the “scale” factor: some issues only manifest at scale.
  • Account for the previous version history.
  • Read the Release Notes & Upgrade Guide to understand product capabilities, product support, limitations, and known issues.

Performance is an important area to verify in stage because the stage environment is so similar to production. Before we upgraded production, we made sure we had a contingency plan to avoid prolonged downtime and failures.

We took a component approach, starting with the NSX Manager. We automated the upgrades for NSX, Platform Services Controller (PSC), and vSphere hosts. Using VMware vSphere features, we were able to migrate our VMs between hosts to achieve a zero-downtime upgrade for all VMs and their services.
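The idea of upgrading hosts one at a time while VMs keep running elsewhere can be sketched as a planning step. This is an illustrative simulation only: it calls no VMware API, the function name plan_rolling_upgrade is hypothetical, and the per-host VM capacity is an assumed simplification of real placement constraints.

```python
from typing import Dict, List, Tuple


def plan_rolling_upgrade(
    hosts: Dict[str, List[str]], capacity: int
) -> List[Tuple[str, Dict[str, str]]]:
    """Plan a host-by-host upgrade in which every VM is moved off a host
    before that host is upgraded.

    hosts maps host name -> VM names; capacity is the assumed maximum
    number of VMs per host. Returns (host, {vm: destination}) steps.
    Raises RuntimeError if a VM cannot be placed anywhere.
    """
    placement = {h: list(vms) for h, vms in hosts.items()}  # working copy
    plan = []
    for host in sorted(placement):
        moves = {}
        for vm in list(placement[host]):
            # Candidate destinations: any other host with spare capacity.
            targets = [
                h for h in placement
                if h != host and len(placement[h]) < capacity
            ]
            if not targets:
                raise RuntimeError(f"no capacity to evacuate {vm} from {host}")
            # Prefer the least-loaded host; break ties by name.
            dest = min(targets, key=lambda h: (len(placement[h]), h))
            placement[host].remove(vm)
            placement[dest].append(vm)
            moves[vm] = dest
        # The host is now empty and could be upgraded and rebooted.
        plan.append((host, moves))
    return plan
```

A plan that raises before any host is touched is the useful outcome here: it signals that the cluster lacks the headroom to evacuate a host, which is better discovered in planning than during the upgrade window.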

Top Migration Considerations

We learned several lessons during this upgrade. The lessons learned below are based on the upgrade in our environment and may not apply to yours.

Lesson 1: Verify the interoperability matrix for VMware solutions, third-party solutions, hardware, firmware, and drivers to make sure they are compatible with vCenter 6.5. To retain the same VM names for the vCenter/PSC VMs, we renamed the existing 6.0 VMs to avoid conflicts. We verified the connectivity of plug-ins before and after the upgrade to make sure none were missing.
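Comparing the plug-in list captured before the upgrade with the list captured afterward is a set difference. The sketch below assumes the two lists have already been exported by whatever means your environment provides; the function name diff_plugins and the plug-in identifiers in the usage note are hypothetical.

```python
from typing import Iterable, List, Tuple


def diff_plugins(
    before: Iterable[str], after: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare plug-in identifiers recorded before and after an upgrade.

    Returns (missing, added): plug-ins that disappeared and plug-ins
    that are new, each sorted for stable reporting.
    """
    before_set, after_set = set(before), set(after)
    missing = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return missing, added
```

For example, diff_plugins(["com.example.backup", "com.example.storage"], ["com.example.storage"]) reports "com.example.backup" as missing, which would be the cue to investigate before declaring the upgrade complete.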

Lesson 2: DNS and NTP are critical to the upgrade process. We made sure that DNS was available throughout the upgrade and that name resolution was fast enough. We also used the same set of NTP servers across the vSphere domain to avoid time drift. We recommend using static IP addresses for both vCenter and PSC.
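Both checks lend themselves to a small pre-flight script. The following is a rough sketch using only the Python standard library: the one-second threshold is an assumption (the text above does not define "fast enough"), the function names are hypothetical, and the NTP configuration per node is assumed to have been collected separately.

```python
import socket
import time
from collections import Counter
from typing import Callable, Dict, Iterable, List


def check_dns(
    hostnames: Iterable[str],
    max_seconds: float = 1.0,  # assumed threshold, tune for your environment
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Return {hostname: problem} for names that fail or resolve slowly."""
    problems = {}
    for name in hostnames:
        start = time.monotonic()
        try:
            resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[name] = f"resolution failed: {exc}"
            continue
        elapsed = time.monotonic() - start
        if elapsed > max_seconds:
            problems[name] = f"slow resolution: {elapsed:.2f}s"
    return problems


def ntp_mismatches(ntp_by_node: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return nodes whose NTP server set differs from the most common set."""
    sets = {node: frozenset(servers) for node, servers in ntp_by_node.items()}
    if not sets:
        return {}
    reference, _ = Counter(sets.values()).most_common(1)[0]
    return {node: sorted(s) for node, s in sets.items() if s != reference}
```

The resolve parameter exists so the check can be exercised with a stub resolver; in normal use the default performs a real lookup. Server order is ignored when comparing NTP configurations, since only the set of servers matters for drift.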

Lesson 3: A few additional tips we picked up along the way:

  • Faster upgrades. Large data sets can slow down the upgrade, so avoid migrating historical performance data, events, alarms, or other statistical information.
  • Automate where possible. With Auto Deploy, you can easily upgrade and downgrade hosts by simply rebooting them.
  • Verify monitoring. Verify that your monitoring solutions are configured for the new services and OS in version 6.5.
  • Test existing scripts. Make sure to test and update existing API/SDK and PowerCLI scripts.
  • Get to know the VMware Knowledge Base (KB). The KB has tons of information on vSphere and vCenter migrations and upgrades.

Stay tuned for more information from VMware IT on how we solve challenges that are part of managing a private cloud.

To learn more, attend VMworld 2018 session VIN1180, How VMware IT Upgraded to VMware vSphere 6.7 without End-User Impact.

VMware on VMware blogs are written by IT subject matter experts sharing stories about our digital transformation using VMware products and services in a global production environment. Contact your sales rep or [email protected] to schedule a briefing on this topic. Visit the VMware on VMware microsite and follow us on Twitter.