This blog post explains how to set up Workspace Portal in a High Availability (HA) cluster using the internal database. It is based on KB article 2094258.
Creating a Workspace Portal cluster is very simple: you clone your first Workspace Portal VM instance and you have a cluster. It does require a load balancer, and the Workspace Portal's fully qualified domain name (FQDN) must point to that load balancer.
This blog post provides detailed step-by-step instructions on how to set up Workspace Portal in an HA cluster using the Workspace Portal's internal database.
Before we start, there are a couple of prerequisites that need to be sorted out.
Prerequisites
1. Deploy your first instance of the Workspace Portal VM. This first VM will be referred to as the Master node throughout this guide.
2. Configure the FQDN for your Workspace Portal implementation.
NOTE: If you are having issues changing the FQDN please have a look at this blog post.
3. To set up Workspace Portal in a highly available cluster using its internal database you must adjust the VM hardware. Each node requires a minimum of 8 GB RAM and 4 vCPUs. Since we will soon clone our master node, it is easier to apply these hardware requirements on the first VM before the clone operation.
4. Enable SSH access for root. This step is optional, but since much of the configuration is done over SSH it is much easier if you allow SSH access for the root user. If needed, you can always turn it off again once the setup is done.
On your master node, edit /etc/ssh/sshd_config and change PermitRootLogin from no to yes. Press Esc, then type :wq and press Enter to save and exit.
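Alternatively, if you prefer a non-interactive edit, a sed one-liner like the following should work (assuming the line currently reads PermitRootLogin no and is not commented out):
sed -i 's/^PermitRootLogin no/PermitRootLogin yes/' /etc/ssh/sshd_config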
Restart the SSH daemon using the command:
/etc/init.d/sshd restart
Clone your master node
Once all prerequisites are done, it is time to clone your master node.
1. Clone your Master node VM
Give your clone a unique name, and do not power on the new VM once cloning is completed.
2. Edit the clone’s vApp Properties.
When the cloning operation has completed, edit the vApp Properties to provide a unique hostname and IP address. Make sure DNS is configured with correct forward and reverse (A and PTR) records for the new clone.
Power on your clone.
Disable web server temporarily
On all nodes in your cluster, run the command:
service horizon-workspace stop
Allow SSH access for the postgres user and allow connectivity to the postgres server
Perform these steps on all nodes.
1. Add the postgres user to the wheel group so that SSH can be used to connect as that user.
usermod -A wheel postgres
NOTE: If you get an error message when running the command and you have copied and pasted it, retype the hyphen before the A. The hyphen (-) sometimes gets converted to an en dash (–) when copying and pasting.
2. The next command opens the firewall so the postgres server can accept connections from the slaves/master as needed.
iptables -I INPUT -p tcp --dport 5432 -m state --state NEW,ESTABLISHED -j ACCEPT
3. To preserve this rule after a reboot, add port 5432 to the TOMCAT_tcp_all section in /usr/local/horizon/conf/iptables.cfg, inside the double quotes and separated by a space.
vi /usr/local/horizon/conf/iptables.cfg
Edit and replace the line:
TOMCAT_tcp_all="443 80 8443 8080 6443 5443"
with:
TOMCAT_tcp_all="443 80 8443 8080 6443 5443 5432"
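To confirm that the firewall rule is active, a quick check (assuming the rule was added as in step 2) is:
iptables -L INPUT -n | grep 5432
You should see an ACCEPT rule for tcp dpt:5432.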
4. Create the archive directory by running the command:
mkdir -p /var/vmware/vpostgres/9.2-archive
5. Change ownership of the archive folder:
chown postgres:users /var/vmware/vpostgres/9.2-archive
6. Set a password for the postgres user:
passwd postgres
Repeat steps 1-6 on all nodes.
Configure vPostgres to listen on all network interfaces for the master node
Perform these steps only on the Master node.
1. Edit postgresql.conf:
vi /db/data/postgresql.conf
Locate the line listen_addresses and replace localhost with *.
Save and exit the postgresql.conf file.
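For reference, after the change the line should look like this:
listen_addresses = '*'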
2. Edit the pg_hba.conf file
vi /db/data/pg_hba.conf
Ensure that all md5 entries on the local lines are replaced with trust.
3. Add an entry for the master node and for each slave node. For example:
host all all <master_ip>/32 trust
host all all <slave_ip>/32 trust
(I suggest using spaces to separate the fields in each entry.)
Save and exit the pg_hba.conf file.
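As a concrete (hypothetical) example, with a master at 10.0.0.10 and a slave at 10.0.0.11 the entries would be:
host all all 10.0.0.10/32 trust
host all all 10.0.0.11/32 trust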
4. Restart vPostgres by running the command:
/etc/init.d/vpostgres restart
Create a Replication User on the master node
Again perform these steps only on the Master node.
1. Switch to user postgres
su - postgres
2. Add the user replicate to vPostgres by running the command:
/opt/vmware/vpostgres/current/share/create_replication_user replicate
Enter a password at the prompt. This is the password assigned to the newly created replication role in Postgres.
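If you want to sanity-check that the role was created with replication rights (this is not part of the KB steps, and the psql path shown is the typical vPostgres location), you can run the following as the postgres user:
/opt/vmware/vpostgres/current/bin/psql -U postgres -c "SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'replicate';"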
Set Rights on the Replication User
Perform these steps on the Master node.
1. Make sure you are logged in as root. Edit the pg_hba.conf file.
vi /db/data/pg_hba.conf
At the bottom of the file, add the line:
hostnossl replication replicate 0.0.0.0/0 md5
Save and exit the file.
2. Edit the postgresql.conf file:
vi /db/data/postgresql.conf
Uncomment and set the following parameters:
wal_level = hot_standby
wal_sync_method = fsync
max_wal_senders = 5
wal_keep_segments = 16
replication_timeout = 60s
hot_standby = on
Save and exit the postgresql.conf file.
3. Restart the vPostgres server.
/etc/init.d/vpostgres restart
Configure the Slave node
Repeat these steps on all Slave nodes.
1. Access your clone as root and switch to the postgres user:
su - postgres
2. Execute this command:
/opt/vmware/vpostgres/current/share/run_as_replica -h <insert master host name> -b -W -U replicate
Enter the password assigned to the postgres user and then the password for the replicate role. You will be asked three questions; answer yes to all of them. The command also takes quite a bit of time, since it downloads the data from the master.
Fingerprint = yes
Enable WAL archiving = yes
Replace all data on this host = yes
3. Edit the file /db/data/recovery.conf
vi /db/data/recovery.conf
Add the following line:
trigger_file = '/var/vmware/vpostgres/current/pgdata/trigger.txt'
Exit and save the recovery.conf file.
NOTE: If you need to run the slave in master mode on failover, just touch the trigger file (touch <filename>, where the filename in the example above is /var/vmware/vpostgres/current/pgdata/trigger.txt) and ensure that the master node's vPostgres is down. Do not create the trigger file without a real failover, or the slave will start behaving as a master.
Do not switch the slave over to act as vPostgres master until all the steps in this document are completed.
Verify that vPostgres replication is working
Perform these steps on the Master node.
1. Access your master node as root and then change user to postgres.
su - postgres
2. Execute this command:
cd /opt/vmware/vpostgres/current/share
followed by:
./show_replication_status
You should see output similar to this:
sync_priority |      slave      | sync_state | log_receive_position | log_replay_position | receive_delta | replay_delta
--------------+-----------------+------------+----------------------+---------------------+---------------+--------------
            0 | slave-host-name | async      | 0/9000000            | 0/9000000           | 0             | 0
(1 row)
This indicates synchronization is configured and working.
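As an additional check (a sketch, assuming the typical vPostgres psql path), you can query pg_stat_replication directly; a connected slave shows up as one row with state streaming:
/opt/vmware/vpostgres/current/bin/psql -U postgres -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"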
Update the postgres DB entries on all slaves and the master
Perform these steps on all nodes.
As the user root, edit the runtime-config.properties file:
vi /usr/local/horizon/conf/runtime-config.properties
Find the entry datastore.jdbc.url. The right-hand side reads jdbc:postgresql://localhost/saas?stringtype=unspecified.
Change localhost to the hostname or IP address of the master node.
In the same file, find datastore.jdbc.external and set the right-hand side to true, so it looks like:
datastore.jdbc.external=true
Press Esc and enter :wq to save and exit.
Make sure you perform this step on all nodes.
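For reference, with a hypothetical master hostname of portal-master.example.local the two edited lines would look something like this:
datastore.jdbc.url=jdbc:postgresql://portal-master.example.local/saas?stringtype=unspecified
datastore.jdbc.external=true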
Start the Workspace Service on all nodes
As user root, start the Workspace Service on all nodes.
service horizon-workspace start
Verify your setup
Verify that the cloned VM has the correct database settings. You can access an individual node's settings at https://<nodes_unique_hostname>:8443.
Verify that your clone is not configured to synchronize the directory. Only one Workspace Portal instance should sync the directory.
Test Failover
Once you have completed the setup it is recommended to test a failover and then a failback to the initial Master node. First, let's test failover.
Shutdown Master node and trigger failover to Slave node
1. Make sure your Master node is shut down (simulating a failure).
2. Initiate failover by running this command on the slave node:
touch /var/vmware/vpostgres/current/pgdata/trigger.txt
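To verify the promotion, you can check that PostgreSQL has renamed recovery.conf to recovery.done (standard PostgreSQL behavior when the trigger file is detected):
ls /db/data/recovery.*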
Change database URL on Slave node
Edit runtime-config.properties on the slave node:
vi /usr/local/horizon/conf/runtime-config.properties
Find the entry datastore.jdbc.url and change <OLD_MASTER_HOSTNAME/IP> in jdbc:postgresql://<OLD_MASTER_HOSTNAME/IP>/saas?stringtype=unspecified to the hostname or IP address of the slave node (the new master).
Restart Workspace Server
Restart the Workspace Service on the slave nodes:
service horizon-workspace restart
Verify that you can log in to Workspace Portal
Verify that you can access the Workspace Portal. On the System Information and Health dashboard you should see the new database setting, and the slave node should be running with an all-green status.
Steps to bring up the original master as master again
First we will make the old master a slave node. In this section, the old slave is the current database node (due to the failover) and the old master is our original, first Workspace Portal node.
1. Ensure that the postgres account has been added to the wheel group on the old slave:
usermod -A wheel postgres
2. Run these commands on the old master (the current slave):
su - postgres
/opt/vmware/vpostgres/current/share/run_as_replica -h <current_master_host> -b -W -U replicate
Ensure that you provide the password for the postgres role in the database, and then for the postgres user on the shell. Answer YES to all questions.
3. On the old master (still as the postgres user), run these commands to validate replication:
cd /opt/vmware/vpostgres/current/share
./show_replication_status
4. Ensure that the runtime-config.properties of all nodes point to the current master node (the old slave).
vi /usr/local/horizon/conf/runtime-config.properties
Find the entry datastore.jdbc.url and change <OLD_MASTER_HOSTNAME/IP> in jdbc:postgresql://<OLD_MASTER_HOSTNAME/IP>/saas?stringtype=unspecified to the hostname or IP address of the old slave node (the new master).
Convert the old master back to master status
1. On the old slave (current master) run these commands:
su - postgres
/opt/vmware/vpostgres/current/share/run_as_replica -h <old_master_host> -b -W -U replicate
This turns the old slave back into a slave node and makes its database read-only. Again, provide the passwords and answer YES to all questions.
2. On all nodes (as the root user), stop the Workspace Server:
/etc/init.d/horizon-workspace stop
3. Remove the read-only state of the old master's database. As root, run this command on the old master:
rm -f /db/data/recovery.conf
4. Restart vPostgres on the old master:
/etc/init.d/vpostgres restart
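To confirm that the old master is no longer in recovery (read-only) mode, a quick check (again assuming the typical vPostgres psql path) is:
/opt/vmware/vpostgres/current/bin/psql -U postgres -c "SELECT pg_is_in_recovery();"
It should return f (false) on a node acting as master.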
5. Point the runtime-config.properties of all nodes to the DB on the old master:
vi /usr/local/horizon/conf/runtime-config.properties
Find the entry datastore.jdbc.url and change <OLD_SLAVE_HOSTNAME/IP> in jdbc:postgresql://<OLD_SLAVE_HOSTNAME/IP>/saas?stringtype=unspecified to the hostname or IP address of the old master node.
6. On all nodes, start Workspace Server
/etc/init.d/horizon-workspace start
This returns master status to the old master, and because we ran run_as_replica on the then-current master, that node becomes a replica (and read-only) again.
Make sure the UI (the Workspace service) is down while you do this.
On all slave nodes, make sure the recovery.conf file contains the trigger_file entry. After a full failover and failback sequence you must edit the recovery.conf file on the slave node that acted as master during the failover and re-add that entry, as shown below. Failing to do this will prevent another failover from happening.
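For example, re-add the same line used earlier:
trigger_file = '/var/vmware/vpostgres/current/pgdata/trigger.txt'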
If there are multiple nodes and they all run the DB, they can point to either one (master or slave). Their DB setup does not need to change, but their runtime-config.properties must be updated during the failover and the failback.
Conclusion
Congratulations! You should now have a running Workspace Portal cluster using the internal database. We have successfully performed a failover to our slave node and a failback to our original master.