By Romain Decker
VMware introduced a new component with vSphere 6, the Platform Services Controller (PSC). Coupled with vCenter, the PSC provides several core services, such as Certificate Authority, License service and Single Sign-On (SSO).
Multiple external PSCs can be deployed to serve one or more services, such as vCenter Server, Site Recovery Manager or vRealize Automation. When deploying the Platform Services Controller for multiple services, availability of the Platform Services Controller must be considered. In some cases, having more than one PSC deployed in a highly available architecture is recommended. When configured in high availability (HA) mode, the PSC instances replicate state information between each other, and the external products (vCenter Server for example) interact with the PSCs through a load balancer.
This post covers the configuration of an HA PSC deployment using the NSX-v 6.2 load balancing feature.
Due to the relationship between vCenter Server and NSX Manager, two different scenarios emerge:
- Scenario A where both PSC nodes are deployed from an existing management vCenter. In this situation, the management vCenter is coupled with NSX which will configure the Edge load balancer. There are no dependencies between the vCenter Server(s) that will use the PSC in HA mode and NSX itself.
- Scenario B where there is no existing vCenter infrastructure (and thus no existing NSX deployment) when the first PSC is deployed. This is a classic “chicken and egg” situation, as the NSX Manager that is actually responsible for load balancing the PSC in HA mode is also connected to the vCenter Server that uses the PSC virtual IP.
While scenario A is straightforward, you need to respect a specific order for scenario B to prevent any loss of connection to the Web client during the procedure. The solution is to deploy a temporary PSC in a temporary SSO site to do the load balancer configuration, and to repoint the vCenter Server to the PSC virtual IP at the end. Both paths are summarized in the workflow below.
NSX Edge supports two deployment modes: one-arm mode and inline mode (also referred to as transparent mode). While inline mode is also possible, the NSX load balancer will be deployed in one-arm mode in our situation, as this model is more flexible and because we don’t require full visibility into the original client IP address.
Description of the environment:
- Software versions: VMware vCenter Server 6.0 U1 Appliance, ESXi 6.0 U1, NSX-v 6.2
- NSX Edge Services Gateway in one-arm mode
- Active/Passive configuration
- VLAN-backed portgroup (distributed portgroup on DVS)
- General PSC/vCenter and NSX prerequisites validated (NTP, DNS, resources, etc.)
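The DNS prerequisite can be checked before starting. The following is a minimal sketch, not part of the original procedure: it assumes the node and VIP names are resolvable from the machine running it, and the `resolve_all` helper is a hypothetical name introduced here for illustration.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve_all(
    fqdns: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Map each FQDN to its IPv4 address, or None when resolution fails."""
    results: Dict[str, Optional[str]] = {}
    for name in fqdns:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results
```

Calling `resolve_all(["psc-01a.sddc.lab", "psc-vip.sddc.lab"])` returns a dictionary in which any name mapped to `None` still needs a DNS record. The `resolver` parameter exists only so the function can be exercised without a live DNS server.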
To offer SSO in HA mode, two PSC servers have to be installed with NSX load balancing them in active/standby mode. Active/active mode is currently not supported by PSC.
Because of the way SSO operates, it is not possible to configure it as active/active. The workaround for the NSX configuration is to use an application rule and to configure two different pools (with one PSC instance in each pool). The application rule will send all traffic to the first pool as long as the pool is up, and will switch to the secondary pool if the first PSC is down.
The following is a representation of the NSX-v and PSC logical design.
Each step number refers to the above workflow diagram. You can take snapshots at regular intervals to be able to roll back in case of a problem.
Step 1: Deploy infrastructure
This first step consists of deploying the required vCenter infrastructure before starting the configuration.
A. For scenario A: Deploy two PSC nodes in the same SSO site.
B. For scenario B:
- Deploy a first standalone Platform Services Controller (PSC-00a). This PSC will be used temporarily during the configuration.
- Deploy a vCenter instance against the PSC-00a just deployed.
- Deploy NSX Manager and connect it to the vCenter.
- Deploy two other Platform Services Controllers in the same SSO domain (PSC-01a and PSC-02a) but in a new site. Note: vCenter will still be pointing to PSC-00a at this stage. Use the following options:
Step 2 (both scenarios): Configure both PSCs as an HA pair (up to step D in KB 2113315).
Now that all required external Platform Services Controller appliances are deployed, it’s time to configure high availability.
A. PSC pairing
- Download the PSC high availability configuration scripts from the Download vSphere page and extract the content to /ha on both PSC-01a and PSC-02a nodes. Note: See KB 2107727 to enable the Bash shell so that files can be copied to the appliances with SCP.
- Run the following command on the first PSC node:
python gen-lb-cert.py --primary-node --lb-fqdn=load_balanced_fqdn --password=<yourpassword>
Note: The load_balanced_fqdn parameter is the FQDN of the PSC virtual IP of the load balancer. If you don’t specify the --password option, the default password will be “changeme”. For example:
python gen-lb-cert.py --primary-node --lb-fqdn=psc-vip.sddc.lab --password=brucewayneisbatman
- On the PSC-01a node, copy the content of the directory /etc/vmware-sso/keys to /ha/keys (a new directory that needs to be created).
- Copy the content of the /ha folder from the PSC-01a node to the /ha folder on the additional PSC-02a node (including the keys copied in the step before).
- Run the following command on the PSC-02a node:
python gen-lb-cert.py --secondary-node --lb-fqdn=load_balanced_fqdn --lb-cert-folder=/ha --sso-serversign-folder=/ha/keys
Note: The load_balanced_fqdn parameter is the FQDN of the load balancer address (or VIP). For example:
python gen-lb-cert.py --secondary-node --lb-fqdn=psc-vip.sddc.lab --lb-cert-folder=/ha --sso-serversign-folder=/ha/keys
Note: If you’re following KB 2113315, don’t forget to stop the configuration here (end of section C in the KB).
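To avoid typos when running the pairing commands on both nodes, the two invocations can be generated from a single set of values. This is a minimal sketch that only assembles the command lines shown above; `gen_lb_cert_command` is a hypothetical helper, and it uses no flags beyond those already listed in this post.

```python
import shlex
from typing import List, Optional


def gen_lb_cert_command(
    lb_fqdn: str,
    primary: bool,
    password: Optional[str] = None,
    ha_dir: str = "/ha",
) -> List[str]:
    """Build the gen-lb-cert.py argument list using only the flags shown above."""
    cmd = ["python", "gen-lb-cert.py"]
    if primary:
        cmd += ["--primary-node", f"--lb-fqdn={lb_fqdn}"]
        if password is not None:
            # Omitting --password leaves the script's default in place.
            cmd.append(f"--password={password}")
    else:
        cmd += [
            "--secondary-node",
            f"--lb-fqdn={lb_fqdn}",
            f"--lb-cert-folder={ha_dir}",
            f"--sso-serversign-folder={ha_dir}/keys",
        ]
    return cmd


# Print both command lines for review before running them on each node.
print(shlex.join(gen_lb_cert_command("psc-vip.sddc.lab", primary=True, password="changeme")))
print(shlex.join(gen_lb_cert_command("psc-vip.sddc.lab", primary=False)))
```

Building the argument list rather than a single string keeps each value intact if it is later passed to `subprocess.run` on the appliance.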
Step 3: NSX configuration
An NSX edge device must be deployed and configured for networking in the same subnet as the PSC nodes, with at least one interface for configuring the virtual IP.
A. Importing certificates
Enter the configuration of the NSX edge services gateway on which to configure the load balancing service for the PSC, and add a new certificate in the Settings > Certificates menu (under the Manage tab). Use the content of the previously generated /ha/lb.crt file as the load balancer certificate and the content of the /ha/lb_rsa.key file as the private key.
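Before pasting the two files into the NSX dialog, it can help to confirm that each one really contains what is expected. The sketch below is an illustration only: it checks the PEM block labels and does not validate the certificate or verify that the key matches it. The function names are hypothetical.

```python
import re
from typing import List

# A PEM block: BEGIN line, body, END line carrying the same label.
PEM_BLOCK = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]+)-----\s+.+?\s+-----END (?P=label)-----",
    re.DOTALL,
)


def pem_labels(text: str) -> List[str]:
    """Return the labels of all well-formed PEM blocks found in text."""
    return [match.group("label") for match in PEM_BLOCK.finditer(text)]


def looks_like_cert_and_key(cert_text: str, key_text: str) -> bool:
    """True when cert_text holds a certificate and key_text holds a private key."""
    has_cert = "CERTIFICATE" in pem_labels(cert_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in pem_labels(key_text))
    return has_cert and has_key
```

A typical use would be to read /ha/lb.crt and /ha/lb_rsa.key with `open(path).read()` and pass both strings to `looks_like_cert_and_key`, which catches the common mistake of swapping the two fields.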
B. General configuration
Enable the load balancer service and logging under the global configuration menu of the load balancer tab.
C. Application profile creation
An application profile defines the behavior of a particular type of network traffic. Two application profiles have to be created: one for HTTPS protocol and one for other TCP protocols.
Table: settings for the HTTPS application profile and the TCP application profile (rows include Enable Pool Side SSL and Configure Service Certificate).
Note: The other parameters shall be left with their default values.
D. Creating pools
NSX load balancer virtual servers of type HTTP/HTTPS perform a web protocol sanity check on their backend server pools. We do not want that sanity check applied to the backend server pools of the TCP virtual server. For that reason, different pools must be created for the PSC HTTPS virtual IP and TCP virtual IP.
Four pools have to be created: two different pools for each virtual server (with one PSC instance per pool). An application rule will be defined to switch between them in case of a failure: traffic will be sent to the first pool as long as the pool is up, and will switch to the secondary pool if the first PSC is down.
Note: While you could use a custom HTTPS health check, I selected the default TCP monitor in this example.
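The four-pool layout can be written down as data, which makes the naming convention explicit. This is a minimal sketch: the pool names follow the ones used in the application rules of this post, while the `Pool` class and `build_pools` function are hypothetical and do not call any NSX API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Pool:
    name: str            # pool name as referenced by the application rules
    member: str          # the single PSC node in this pool
    virtual_server: str  # "http" or "tcp"


def build_pools(nodes: Sequence[str], kinds: Sequence[str] = ("http", "tcp")) -> List[Pool]:
    """Create one pool per (virtual server type, PSC node) pair."""
    return [
        Pool(name=f"pool_{node}-{kind}", member=node, virtual_server=kind)
        for kind in kinds
        for node in nodes
    ]


pools = build_pools(["psc-01a", "psc-02a"])
for pool in pools:
    print(pool.name, "->", pool.member)
```

Keeping one member per pool is what lets the application rule treat "pool has no server up" as "this PSC is down".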
E. Creating application rules
The application rules contain the logic that performs the failover between the pools (for each virtual server), corresponding to the active/passive behavior of the PSC high availability mode. The ACL checks whether the primary PSC is up; if the first pool is not up, the rule switches to the secondary pool.
The first application rule will be used by the HTTPS virtual server to switch between the corresponding pools for the HTTPS backend servers pool.
# Detect if pool "pool_psc-01a-http" is still UP
acl pool_psc-01a-http_down nbsrv(pool_psc-01a-http) eq 0
# Use pool "pool_psc-02a-http" if "pool_psc-01a-http" is dead
use_backend pool_psc-02a-http if pool_psc-01a-http_down
The second application rule will be used by the TCP virtual server to switch between the corresponding pools for the TCP backend servers pool.
# Detect if pool "pool_psc-01a-tcp" is still UP
acl pool_psc-01a-tcp_down nbsrv(pool_psc-01a-tcp) eq 0
# Use pool "pool_psc-02a-tcp" if "pool_psc-01a-tcp" is dead
use_backend pool_psc-02a-tcp if pool_psc-01a-tcp_down
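The decision made by the two rules above can be modeled in a few lines, which is handy for reasoning about failover before testing it on the Edge. This sketch is a simulation, not NSX or HAProxy code. It assumes that the first pool is the default pool of each virtual server, so traffic goes there whenever the rule does not match; the function names are hypothetical.

```python
from typing import Dict

PoolHealth = Dict[str, Dict[str, bool]]  # pool name -> {member name: is healthy}


def nbsrv(pool_health: PoolHealth, pool: str) -> int:
    """Count healthy members in a pool, like the nbsrv() check in the rule."""
    return sum(1 for healthy in pool_health[pool].values() if healthy)


def select_pool(pool_health: PoolHealth, primary: str, secondary: str) -> str:
    """Use the secondary pool only when the primary pool has no server up."""
    primary_down = nbsrv(pool_health, primary) == 0  # the "acl" line
    return secondary if primary_down else primary    # the "use_backend" line


def render_rule(primary: str, secondary: str) -> str:
    """Render the application rule text for one pair of pools."""
    return "\n".join([
        f'# Detect if pool "{primary}" is still UP',
        f"acl {primary}_down nbsrv({primary}) eq 0",
        f'# Use pool "{secondary}" if "{primary}" is dead',
        f"use_backend {secondary} if {primary}_down",
    ])


health = {
    "pool_psc-01a-http": {"psc-01a": False},  # simulate PSC-01a being down
    "pool_psc-02a-http": {"psc-02a": True},
}
print(select_pool(health, "pool_psc-01a-http", "pool_psc-02a-http"))
print(render_rule("pool_psc-01a-tcp", "pool_psc-02a-tcp"))
```

Note that the rule only checks the primary pool: if both PSC nodes are down, traffic is still sent to the secondary pool and fails there.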
F. Configuring virtual servers
Two virtual servers have to be created: one for HTTPS protocol and one for the other TCP protocols.
Table: settings for the HTTPS Virtual Server and the TCP Virtual Server (IP address: the address corresponding to the PSC virtual IP).
* Although this procedure is for a fresh install, you could target the same architecture with SSO 5.5 being upgraded to PSC. If you plan to upgrade from SSO 5.5 HA, you must add the legacy SSO port 7444 to the list of ports in the TCP virtual server.
Step 4 (both scenarios)
Now it’s time to finish the PSC HA configuration (step E of KB 2113315). Update the endpoint URLs on PSC with the load_balanced_fqdn by running this command on the first PSC node.
python lstoolHA.py --hostname=psc_1_fqdn --lb-fqdn=load_balanced_fqdn --lb-cert-folder=/ha --user=Administrator@vsphere.local
Note: psc_1_fqdn is the FQDN of the first PSC-01a node and load_balanced_fqdn is the FQDN of the load balancer address (or VIP). For example:
python lstoolHA.py --hostname=psc-01a.sddc.lab --lb-fqdn=psc-vip.sddc.lab --lb-cert-folder=/ha --user=Administrator@vsphere.local
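An easy mistake at this step is to pass the VIP name as --hostname. The sketch below assembles the same command line and refuses that case. It is an illustration only: `lstool_ha_command` is a hypothetical helper, and the equality check is a convenience added here, not something the script itself is known to do.

```python
import shlex
from typing import List


def lstool_ha_command(
    psc_fqdn: str,
    lb_fqdn: str,
    ha_dir: str = "/ha",
    user: str = "Administrator@vsphere.local",
) -> List[str]:
    """Build the lstoolHA.py argument list using only the flags shown above."""
    if psc_fqdn.lower() == lb_fqdn.lower():
        # --hostname must name the first PSC node, not the load balancer VIP.
        raise ValueError("psc_fqdn and lb_fqdn must be different names")
    return [
        "python",
        "lstoolHA.py",
        f"--hostname={psc_fqdn}",
        f"--lb-fqdn={lb_fqdn}",
        f"--lb-cert-folder={ha_dir}",
        f"--user={user}",
    ]


print(shlex.join(lstool_ha_command("psc-01a.sddc.lab", "psc-vip.sddc.lab")))
```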
A. Scenario A: Deploy any new production vCenter Server or other components (such as vRA) against the PSC Virtual IP and enjoy!
B. Scenario B
The situation is the following: The vCenter is currently still pointing to the first external PSC instance (PSC-00a), and two other PSC instances are configured in HA mode, but are not used.
Since vSphere 6.0 Update 1, it is possible to move a vCenter Server between SSO sites within a vSphere domain (see KB 2131191 for more information). In our situation, we have to re-point the existing vCenter, which is currently connected to the external PSC-00a, to the PSC virtual IP:
- Download and replace the cmsso-util file on your vCenter Server using the actions described in KB 2113911.
- Re-point the vCenter Server Appliance to the PSC virtual IP in the final site by running this command:
/bin/cmsso-util repoint --repoint-psc load_balanced_fqdn
Note: The load_balanced_fqdn parameter is the FQDN of the load balancer address (or VIP). For example:
/bin/cmsso-util repoint --repoint-psc psc-vip.sddc.lab
Note: This command will also restart vCenter services.
- Move the vCenter services registration to the new SSO site. When a vCenter Server is installed, it creates service registrations that it issues to start the vCenter Server services. These service registrations are written to a specific site of the Platform Services Controller (PSC) that was used during the installation. Use the following command to update the vCenter Server services registrations (you will be prompted for the parameters).
After the command, you end up with the following.
- Log in to your vCenter Server instance by using the vSphere Web Client to verify that the vCenter Server is up and running and can be managed.
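Before logging in, a quick reachability test against the PSC virtual IP can rule out a load balancer problem. This is a minimal sketch in the spirit of a TCP monitor: it only confirms that a TCP connection can be opened. Port 443 is an assumption based on the HTTPS virtual server, and `tcp_port_open` is a hypothetical helper.

```python
import socket


def tcp_port_open(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True when a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False
```

For example, `tcp_port_open("psc-vip.sddc.lab")` returning `False` suggests checking the virtual server and pool status on the Edge before troubleshooting vCenter itself.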
In scenario B, you can always re-point to the previous PSC-00a if you cannot log in or if you get an error message. When you have confirmed that everything is working, you can remove the temporary PSC (PSC-00a) from the SSO domain with this command (KB 2106736):
cmsso-util unregister --node-pnid psc-00a.sddc.lab --username Administrator@vsphere.local --passwd VMware1!
Finally, you can safely decommission PSC-00a.
Note: If your NSX Manager was configured with Lookup Service, you can update it with the PSC virtual IP.
Romain Decker is a Senior Solutions Architect member of Professional Services Engineering (PSE) for the Software-Defined Datacenter (SDDC) portfolio – a part of the Global Technical & Professional Solutions (GTPS) team.