Workspace Portal 2.1 HA cluster using internal database

This blog post explains how to set up Workspace Portal in a High Availability (HA) cluster using the internal database. It is based on KB article 2094258.

Creating a Workspace Portal cluster is very simple: you clone your first VM instance of the Workspace Portal and you have a cluster. It does require a load balancer, and the Workspace Portal's fully qualified domain name (FQDN) must point to that load balancer.

This blog post is a detailed, step-by-step guide to setting up Workspace Portal in an HA cluster using the Workspace Portal's internal database.

Before we start, there are a couple of prerequisites that need to be sorted out.

Prerequisites

1. Deploy your first instance of the Workspace Portal VM. This first VM will be referred to as the Master node throughout this guide.

2. Configure the FQDN for your Workspace Portal implementation.

NOTE: If you are having issues changing the FQDN, have a look at this blog post.

3. To set up Workspace Portal in a highly available cluster using its internal database, you must adjust the VM hardware: each node requires a minimum of 8 GB RAM and 4 vCPUs. Since we will soon clone our master node, it's easiest to apply the hardware requirements to the first VM before the clone operation.

4. Enable SSH access for root. This step is optional, but since much of the configuration is done over SSH, it is much easier if you allow SSH access for the root user. If needed, you can always turn it off again once the setup is done.

On your master node, edit /etc/ssh/sshd_config and change PermitRootLogin from no to yes. Press Esc, then type :wq and press Enter to save and exit.
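After the edit, the relevant line in sshd_config should simply read:

PermitRootLogin yes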

Restart SSH daemon using command:

/etc/init.d/sshd restart

Clone your master node

Once all prerequisites are done, it's time to clone your master node.

1. Clone your Master node VM

Give your clone a unique name, and make sure not to power on the new VM once the cloning is completed.

2. Edit the clone’s vApp Properties.

When the cloning operation has completed, edit the vApp Properties and provide a unique hostname and IP address. Make sure DNS is configured with correct forward and reverse (A and PTR) records for the new clone.

Power on your clone.
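If you want to sanity-check the DNS records for the clone, a couple of dig queries will do (the hostname and IP address below are placeholders for your own values):

dig +short workspace-clone.example.com
dig +short -x 192.168.1.52

The first query should return the clone's IP address and the second its FQDN.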

Disable web server temporarily

Run command:

service horizon-workspace stop

on all nodes in your cluster.

Allow SSH access for the postgres user and allow connectivity to the postgres server

Perform these steps on all nodes.

1. Add the postgres user to the wheel group so that SSH can be used to connect as that user.

usermod -A wheel postgres

NOTE: If you get an error message when running the command and you copied and pasted it, try replacing the dash (–) with a regular hyphen (-). The hyphen sometimes gets converted to a dash during copy and paste.

2. The next command allows the postgres server to communicate with the slave/master nodes as needed:

iptables -I INPUT -p tcp --dport 5432 -m state --state NEW,ESTABLISHED -j ACCEPT

3. To preserve this rule after a reboot, add port 5432 to the TOMCAT_tcp_all section of /usr/local/horizon/conf/iptables.cfg, inside the double quotes, separated by a space.

vi /usr/local/horizon/conf/iptables.cfg

Replace the line:
TOMCAT_tcp_all="443 80 8443 8080 6443 5443"
with:
TOMCAT_tcp_all="443 80 8443 8080 6443 5443 5432"

4. Create the archive directory by running command:

mkdir -p /var/vmware/vpostgres/9.2-archive

5. Change the ownership of the archive folder:

chown postgres:users /var/vmware/vpostgres/9.2-archive

6. Set a password for the postgres user:

passwd postgres

Repeat steps 1-6 on all nodes.

Configure vPostgres to listen on all network interfaces for the master node

Perform these steps only on the Master node.

1. Edit postgresql.conf:

vi /db/data/postgresql.conf

Locate the listen_addresses line and replace localhost with *.

Save and exit the postgresql.conf file.
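After the edit, the line in postgresql.conf should look like this:

listen_addresses = '*'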

2. Edit the pg_hba.conf file

vi /db/data/pg_hba.conf

Ensure that all md5 entries for local connections are replaced with trust.

3. Add an entry for the master node and for each slave node. For example:

host            all             all            <master_ip>/32       trust
host            all             all            <slave_ip>/32          trust
(Use spaces to separate the fields.)

Save and exit the pg_hba.conf file.
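Taken together, the relevant part of pg_hba.conf would look roughly like this (the IP addresses are placeholders for your own master and slave nodes):

local   all   all                       trust
host    all   all   192.168.1.51/32     trust
host    all   all   192.168.1.52/32     trust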

4. Restart vPostgres by running command:

/etc/init.d/vpostgres restart

Create a Replication User on master node

Again perform these steps only on the Master node.

1. Switch to the postgres user:

su - postgres

2. Add the user replicate to vPostgres by running command:

/opt/vmware/vpostgres/current/share/create_replication_user replicate

Enter a password at the prompt. This is the password assigned to the newly created replication role in postgres.

Set Rights on the Replication User

Perform these steps on the Master node.

1. Make sure you are logged in as root, then edit the pg_hba.conf file:

vi /db/data/pg_hba.conf

At the bottom of the file, add the line:

hostnossl           replication          replicate      0.0.0.0/0          md5

Save and exit the file.

2. Edit the postgresql.conf file:

vi /db/data/postgresql.conf

Edit the following parameters and make sure they are active (uncommented):

wal_level = hot_standby
wal_sync_method = fsync
max_wal_senders = 5
wal_keep_segments = 16
replication_timeout = 60s
hot_standby = on

Exit and save the postgresql.conf file

3. Restart the vPostgres server.

/etc/init.d/vpostgres restart

Configure the Slave node

Repeat these steps on all Slave nodes.

1. Access your clone as root and switch to the postgres user:

su - postgres

2. Execute this command:

/opt/vmware/vpostgres/current/share/run_as_replica -h <insert master host name> -b -W -U replicate

Enter the password assigned to the postgres user, and then the password for the replicate role. You will be asked three questions; answer "yes" to all of them. The command also takes quite a while, as it downloads the data from the master.

Fingerprint = yes
Enable WAL archiving = yes
Replace all data on this host = yes

3. Edit the file /db/data/recovery.conf

vi /db/data/recovery.conf

add the following line:

trigger_file = '/var/vmware/vpostgres/current/pgdata/trigger.txt'

Exit and save the recovery.conf file.

NOTE: If you need to run the slave in master mode during a failover, simply touch the trigger file (touch <filename>, where the filename in the above example is /var/vmware/vpostgres/current/pgdata/trigger.txt) and ensure that the master node's vPostgres is down. Do not create the trigger file without a real failover, or the slave will start behaving as a master.

Do not switch a slave over to become the vPostgres master until all the steps in this document have been completed.

Verify that vPostgres replication is working

Perform these steps on the Master node.

1. Access your master node as root and then change user to postgres.

su - postgres

2. Execute this command:

cd /opt/vmware/vpostgres/current/share

followed by:

./show_replication_status

You should see output similar to this:

sync_priority |      slave      | sync_state | log_receive_position | log_replay_position | receive_delta | replay_delta
--------------+-----------------+------------+----------------------+---------------------+---------------+-------------
            0 | slave-host-name | async      | 0/9000000            | 0/9000000           |             0 |            0
(1 row)

This indicates synchronization is configured and working.

Update entries for postgres DB on all slaves and master

Perform these steps on all nodes.

As the user root, edit the runtime-config.properties file:

vi /usr/local/horizon/conf/runtime-config.properties

Find the entry datastore.jdbc.url and, in the value jdbc:postgresql://localhost/saas?stringtype=unspecified, change "localhost" to the hostname or IP address of the master node.

In the same file, find datastore.jdbc.external and change its value to "true", so that it reads:
datastore.jdbc.external=true
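With a placeholder master hostname, the two edited lines would end up looking like this:

datastore.jdbc.url=jdbc:postgresql://master-node.example.com/saas?stringtype=unspecified
datastore.jdbc.external=true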

Press Esc and enter :wq to save and exit.

Make sure you perform this step on all nodes.

Start the Workspace Service on all nodes

As user root, start the Workspace Service on all nodes.

service horizon-workspace start

Verify your setup

Access the System Diagnostics Dashboard again. Verify the database settings and that you now have multiple nodes listed. It may take a couple of minutes before all nodes have successfully registered.

Verify database setup on clone

Verify that the cloned VM has the correct database settings. You can access an individual node's settings at https://<nodes_unique_hostname>:8443.
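If you prefer checking from the shell, a quick grep of the runtime configuration file edited earlier shows which database each node is pointing at:

grep datastore.jdbc /usr/local/horizon/conf/runtime-config.properties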

Verify clone is not syncing directory

Verify that your clone is not configured to synchronize the directory. Only one Workspace Portal instance should sync the directory.

Test Failover

Once you have completed the setup, it is recommended to test a failover and then fail back to the initial Master node. First, let's test the failover.

Shut down the Master node and trigger failover to the Slave node

1. Make sure your Master node is shut down (simulating a failure).

2. Initiate failover to the slave node by running this command on the slave node:

touch /var/vmware/vpostgres/current/pgdata/trigger.txt

Change database URL on Slave node

Edit runtime-config.properties on the slave node:

vi /usr/local/horizon/conf/runtime-config.properties

Find the entry datastore.jdbc.url and, in the value jdbc:postgresql://<OLD_MASTER_HOSTNAME/IP>/saas?stringtype=unspecified, replace the old master's hostname/IP with the hostname or IP address of the slave node (the new master).
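If you would rather not edit the file by hand, a sed one-liner along these lines performs the same substitution (both hostnames are placeholders for your own values):

sed -i 's#//old-master.example.com/#//new-master.example.com/#' /usr/local/horizon/conf/runtime-config.properties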

Restart Workspace Server

Restart the Workspace Service on the slave node.

service horizon-workspace restart

Verify you can log in to Workspace Portal

Verify that you can access the Workspace Portal. On the System Information and Health Dashboard you should see the new database setting, and that the slave node is running with an all-green status.

Steps to bring up the original master as master again

First we will make the old master a slave node. In this section, the old slave is the current database node (due to the failover), and the old master is our original, first Workspace Portal node.

1. Ensure that the postgres account has been added to the wheel group on the old slave:

usermod -A wheel postgres

2. Run these commands on the old master (the current slave):

su - postgres
/opt/vmware/vpostgres/current/share/run_as_replica -h <current_master_host> -b -W -U replicate

Ensure that you provide the password for the postgres role on the database, and then for the postgres user on the shell. Answer YES to all questions.

3. On the old master (still as the postgres user), run these commands to validate replication:

cd /opt/vmware/vpostgres/current/share
./show_replication_status

4. Ensure that the runtime-config.properties file on all nodes points to the current master node (the old slave):

vi /usr/local/horizon/conf/runtime-config.properties

Find the entry datastore.jdbc.url and, in the value jdbc:postgresql://<OLD_MASTER_HOSTNAME/IP>/saas?stringtype=unspecified, replace the hostname/IP with the hostname or IP address of the old slave node (the current master).

Convert the old master back to master status

1. On the old slave (current master) run these commands:

su - postgres
/opt/vmware/vpostgres/current/share/run_as_replica -h <old_master_host> -b -W -U replicate

This changes the old slave back into a slave node and makes all its databases read-only. Again, provide the password and answer YES to all questions.

2. On all nodes (as the root user), stop the Workspace Service:

/etc/init.d/horizon-workspace stop

3. Remove the read-only state of the old master's database. As root, run this command on the old master:

rm -f /db/data/recovery.conf

4. Restart vPostgres on old master:

/etc/init.d/vpostgres restart

5. Point the runtime-config.properties of all nodes to use the database on the old master:

vi /usr/local/horizon/conf/runtime-config.properties

Find the entry datastore.jdbc.url and, in the value jdbc:postgresql://<OLD_SLAVE_HOSTNAME/IP>/saas?stringtype=unspecified, replace the hostname/IP with the hostname or IP address of the old master node.

6. On all nodes, start the Workspace Service:

/etc/init.d/horizon-workspace start

This gives master status back to the old master, and because we ran run_as_replica on the then-current master, that node is now a replica (and read-only). Make sure the Workspace UI (the horizon-workspace service) is down while you do this.

On all slave nodes, make sure the recovery.conf file contains the trigger_file details. After a full failover and failback sequence, you must update the recovery.conf file on the slave node that acted as master during the failover; failing to do so will prevent another failover from happening.
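A quick way to confirm, using the path from the recovery.conf step earlier:

grep trigger_file /db/data/recovery.conf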

If there are multiple nodes and they all run the database, they can point to either one (master or slave). Their database setup does not need to change, but their runtime-config.properties needs to be updated during the failover and the failback.

Conclusion

Congratulations! You should now have a running Workspace Portal cluster using the internal database. We have successfully failed over to the slave node and failed back to the original master.