Introduction
As mentioned in the earlier post, VMware Cloud on AWS is an on-demand service that enables customers to run applications across vSphere-based cloud environments with access to a broad range of AWS services.
Powered by VMware Cloud Foundation, this service integrates vSphere, vSAN and NSX along with VMware vCenter management, and is optimized to run on dedicated, elastic, bare-metal AWS infrastructure. ESXi hosts in VMware Cloud on AWS reside in an AWS Availability Zone (AZ) and are protected by vSphere HA.
The paper Migrating Oracle Workloads to VMware Cloud on AWS describes the deployment and migration options, along with best practices, for migrating Oracle Standalone and Oracle RAC on VMware on-premises (vSphere with traditional storage or VMware HCI vSAN) to Stretched Clusters for VMware Cloud on AWS, using the approach below:
- Validate functionality of the current on-premises RAC setup
- Migrate DR RAC ‘prddg’ from on-premises Site B to Stretched Cluster for VMware Cloud on AWS
- Take advantage of the Stretched Cluster for VMware Cloud on AWS using the multi-AZ functionality by
  - Adding new nodes to the migrated DR RAC ‘prddg’
  - Creating a new Oracle RAC ‘vmcrac’
This post focuses on how to provide Site level HA along with Infrastructure level HA to an Oracle RAC on Stretched Clusters for VMware Cloud on AWS using vSphere Tags and Attributes.
Architecture Diagram – On Premise and VMware Cloud on AWS
As described in the paper above, there are two components to the solution architecture:
- On-premises vSphere cluster (Site A and Site B)
- Stretched Clusters for VMware Cloud on AWS
The on-premises setup has a two-site vSphere cluster configuration:
- Site A runs Production workloads
- Site B runs Dev, Test and Disaster Recovery (DR) workloads
- Both Site A and Site B are in Hybrid Linked mode.
- Site A and Site B vSphere clusters have access to dedicated storage
Site A comprises a 4-node vSphere 6.7 cluster:
- Each ESXi server is a Dell PowerEdge R730xd Rack Server with 2 sockets, 14 cores each, with Intel® Xeon® Processor E5-2695 v3 at 2.30GHz with Hyper-Threading Technology and 384GB of RAM.
- All ESXi servers have access to NAS storage
Site B comprises a 4-node vSphere 6.7 cluster:
- Each ESXi server is a Dell PowerEdge R630 Rack Server with 2 sockets, 14 cores each, with Intel Xeon Processor E5-2680 v4 at 2.40GHz with Hyper-Threading Technology and 256GB of RAM.
- All ESXi servers have access to NAS storage
The Stretched Clusters for VMware Cloud on AWS setup has the following configuration:
- A 6-node stretched cluster for VMware Cloud on AWS is set up across two AZs, with 3 servers in AZ us-west-2b and 3 servers in AZ us-west-2c.
- Each ESXi server is an Amazon EC2 i3p.16xlarge with 2 sockets, 18 cores each, with Intel Xeon Processor E5-2686 v4 at 2.30GHz with Hyper-Threading Technology and 512GB of RAM.
- Storage is provided by the HCI vSAN instance.
More information on this can be found here.
Anti-Affinity rules for Oracle RAC VMs
An important aspect of any Oracle RAC configuration on a VMware SDDC is to ensure that Oracle RAC VMs are not scheduled on the same ESXi server, as that negates the HA proposition that VMware SDDC has to offer, i.e. Infrastructure level HA provided by VMware SDDC to complement the Application level HA provided by Oracle RAC.
To achieve that, we need to specify VM anti-affinity rules for RAC VMs, which force the VMs to remain apart during failover actions.
More details on VM Anti-Affinity can be found here.
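As a rough illustration of what the anti-affinity rule is meant to guarantee, the Python sketch below checks a VM-to-host placement map and reports any ESXi server running more than one RAC VM. The VM and host names are hypothetical, and the function is only a model of the constraint, not a description of how DRS evaluates rules.

```python
from collections import defaultdict


def hosts_with_multiple_rac_vms(placement):
    """Return {host: [vm, ...]} for each host that runs more than one RAC VM.

    `placement` maps a RAC VM name to the ESXi host it currently runs on.
    An empty result means the anti-affinity constraint is satisfied.
    """
    vms_by_host = defaultdict(list)
    for vm, host in placement.items():
        vms_by_host[host].append(vm)
    return {host: sorted(vms) for host, vms in vms_by_host.items() if len(vms) > 1}


# Hypothetical names, for illustration only.
separated = {"rac-vm-1": "esxi-1", "rac-vm-2": "esxi-2"}
colocated = {"rac-vm-1": "esxi-1", "rac-vm-2": "esxi-1"}

print(hosts_with_multiple_rac_vms(separated))
print(hosts_with_multiple_rac_vms(colocated))
```

In the second placement a single host failure would take down both RAC nodes at once, which is exactly the situation the rule is there to prevent.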
High Availability across Sites
In the case of an Extended RAC cluster across multiple sites or AZs, the best practice is to spread the RAC VMs across the sites so that all RAC VMs do not land on the same site. This provides Site level HA in addition to the Infrastructure and Application level HA.
More details on Oracle Extended RAC can be found here.
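The site-level requirement can be sketched in the same spirit. The Python example below, which assumes a hypothetical host-to-site map, checks whether the RAC VMs occupy at least two sites. It also shows that a placement can keep the VMs on different hosts and still fail the site-level check.

```python
def sites_in_use(vm_to_host, host_to_site):
    """Return the set of sites (or AZs) that currently run at least one RAC VM."""
    return {host_to_site[host] for host in vm_to_host.values()}


def has_site_level_ha(vm_to_host, host_to_site):
    """True when the RAC VMs are spread over at least two sites."""
    return len(sites_in_use(vm_to_host, host_to_site)) >= 2


# Hypothetical hosts and sites, for illustration only.
host_to_site = {"esxi-1": "site-1", "esxi-2": "site-1", "esxi-3": "site-2"}

spread = {"rac-vm-1": "esxi-1", "rac-vm-2": "esxi-3"}
same_site = {"rac-vm-1": "esxi-1", "rac-vm-2": "esxi-2"}  # different hosts, one site

print(has_site_level_ha(spread, host_to_site))
print(has_site_level_ha(same_site, host_to_site))
```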
On-Premise – RAC VM Anti-Affinity setup on VMware SDDC
As mentioned above, Site A comprises a 4-node vSphere 6.7 cluster as shown below. This is a traditional deployment within a site (not an extended cluster scenario).
The 2 RAC VMs, ‘prdrac01’ and ‘prdrac02’, of the production 2-node Oracle RAC ‘prdrac’ are shown below. RAC VMs ‘prdrac01’ and ‘prdrac02’ are running on two different ESXi servers.
RAC VM prdrac01
RAC VM prdrac02
We can create VM-Host affinity rules to specify whether or not the members of a selected virtual machine DRS group can run on the members of a specific host DRS group.
The “MUST” Anti-Affinity rules for RAC VMs ‘prdrac01’ and ‘prdrac02’ are set up as shown below. The ‘Must not run on hosts in group’ rule specifies that virtual machines in a VM Group must not run on hosts in a Host Group.
More information about the VM Anti-Affinity rules can be found here.
What’s New September 6th, 2018 (SDDC Version 1.5)
As of September 6th, 2018 (SDDC Version 1.5), the new features now available for VMware Cloud on AWS include:
Compute Policies
Compute Policies enable customers to define VM placement constraints as preferential policies in their SDDC by leveraging inventory tags. In a multi-cluster environment, a single policy can be defined to constrain the placement of tagged VMs using the following capabilities:
- Simple VM-Host Affinity
  This capability constrains the placement of tagged VMs on specifically tagged hosts in each cluster, thereby circumventing the need to define rules on a per-cluster basis.
- VM-VM Anti-Affinity
  This policy allows the user to specify anti-affinity relations between a group of VMs. These groups of VMs are identified using vSphere tags. The policy automatically applies to all the VMs that have the tags specified in the policy. DRS will try to ensure that all the VMs in the vCenter that have the policy’s VM tag are preferably placed on separate hosts.
- Disable DRS vMotion
  This policy allows the user to specify that a virtual machine not be migrated away from the host on which it was powered on, unless the host is placed into maintenance mode.
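To make the tag-driven scoping of a policy concrete, here is a minimal Python sketch of how a VM-VM Anti-Affinity policy might select its VMs purely by tag. The `AntiAffinityPolicy` class, the tag names and the VM names are hypothetical stand-ins; the sketch models only the selection step and does not attempt to reproduce the preferential placement that DRS performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntiAffinityPolicy:
    """Hypothetical stand-in for a VM-VM Anti-Affinity compute policy."""
    name: str
    vm_tag: str


def vms_covered(policy, vm_tags):
    """Return the sorted names of VMs that carry the policy's VM tag.

    `vm_tags` maps a VM name to the set of tag names attached to it.
    """
    return sorted(vm for vm, tags in vm_tags.items() if policy.vm_tag in tags)


# Hypothetical inventory, for illustration only.
policy = AntiAffinityPolicy(name="example-anti-affinity", vm_tag="tag-db")
vm_tags = {
    "vm-a": {"tag-db"},
    "vm-b": {"tag-db", "tag-other"},
    "vm-c": {"tag-other"},
}

print(vms_covered(policy, vm_tags))
```

Because membership is derived from tags, tagging a further VM brings it under the policy without editing the policy itself.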
More information on this can be found here.
vSphere Tags and Attributes
Tags and attributes allow you to attach metadata to objects in the vSphere inventory to make it easier to sort and search for these objects.
A tag is a label that you can apply to objects in the vSphere inventory. When you create a tag, you assign that tag to a category. Categories allow you to group related tags together. When you define a category, you can specify the object types for its tags, and whether more than one tag in the category can be applied to an object.
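The relationship between categories, tags and objects can be modelled in a few lines of Python. This is a sketch under stated assumptions: the `Category` and `Inventory` classes and all names are hypothetical and do not correspond to any vSphere API. It shows only the two category settings described above, the allowed object types and whether an object may carry more than one tag from the category.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    object_types: frozenset  # object types the category's tags may be attached to
    multiple_tags: bool      # may one object carry several tags of this category?


@dataclass
class Inventory:
    # Maps an object name to a list of (category name, tag name) pairs.
    assignments: dict = field(default_factory=dict)

    def attach(self, category, tag, obj_name, obj_type):
        """Attach `tag` to an object, enforcing the category's settings."""
        if obj_type not in category.object_types:
            raise ValueError(f"{category.name} cannot be applied to {obj_type}")
        current = self.assignments.setdefault(obj_name, [])
        already_has = any(cat == category.name for cat, _ in current)
        if already_has and not category.multiple_tags:
            raise ValueError(f"{obj_name} already has a tag from {category.name}")
        current.append((category.name, tag))


# Hypothetical category that allows one tag per virtual machine.
category = Category("example-category", frozenset({"Virtual Machine"}), multiple_tags=False)
inventory = Inventory()
inventory.attach(category, "tag-one", "vm-a", "Virtual Machine")
print(inventory.assignments)
```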
For vSphere Tags and Attributes, VMware Cloud on AWS supports the same set of tasks as an on-premises SDDC.
More information on this can be found here.
VMware Cloud on AWS Setup
The Stretched Clusters for VMware Cloud on AWS setup is a 6-node Stretched Cluster set up across two AZs, with 3 servers in AZ ‘us-west-2b’ and 3 servers in AZ ‘us-west-2c’.
The Stretched Cluster has ESXi hosts that span the two AWS Availability Zones. The fault domains listed correlate to the given AWS Availability Zone names.
Stretched Clusters for VMware Cloud on AWS – RAC VM Anti-Affinity & HA across AZs
In this case, the Stretched Clusters for VMware Cloud on AWS has 2 Fault Domains / AZs, ‘us-west-2b’ and ‘us-west-2c’, so the Oracle RAC VMs form an Extended RAC setup across multiple AZs.
The following needs to be set up to ensure Site level HA in addition to Infrastructure and Application level HA:
- Anti-Affinity rules for RAC VMs, so that no 2 RAC VMs land on the same ESXi server (Infrastructure level HA)
- RAC VMs spread across AZs, so that all RAC VMs do not land in the same AZ (Site level HA)
High-level steps for Anti-Affinity rule setup for RAC VMs
To enforce Anti-Affinity rules for RAC VMs for Infrastructure level HA, the high-level steps are:
- Create a new Category ‘Category-RAC-VM-AntiAffinity’ for RAC VMs with Associable Object Type as ‘Virtual Machine’
- Create a new Tag ‘Tag-RAC-VM-AntiAffinity’ for RAC VMs and associate this Tag with Category ‘Category-RAC-VM-AntiAffinity’
- Apply Tag ‘Tag-RAC-VM-AntiAffinity’ to all RAC VMs
- Enforce the RAC VM Anti-Affinity policy by creating a new Compute Policy ‘RAC VM – Anti Affinity’
  - Compute policy has policy type set to ‘VM – VM anti affinity’
  - Category set to ‘Category-RAC-VM-AntiAffinity’
  - Tag set to ‘Tag-RAC-VM-AntiAffinity’
Create a new Category ‘Category-RAC-VM-AntiAffinity’
Create a new Tag ‘Tag-RAC-VM-AntiAffinity’ for RAC VMs
Apply Tag ‘Tag-RAC-VM-AntiAffinity’ to RAC VM ‘prddg01’
Apply Tag ‘Tag-RAC-VM-AntiAffinity’ to RAC VM ‘prddg02’
We can see the tag ‘Tag-RAC-VM-AntiAffinity’ associated with RAC VM ‘prddg01’
We can see the tag ‘Tag-RAC-VM-AntiAffinity’ associated with RAC VM ‘prddg02’
Enforce RAC VM Anti Affinity policy by creating a new Compute Policy ‘RAC VM – Anti Affinity’
The new Compute Policy ‘RAC VM – Anti Affinity’ is created as shown below
As we can see from the above, we were able to enforce Anti-Affinity rules for RAC VMs, thereby providing Infrastructure level HA.
High-level steps for RAC VM setup across Multi AZ
The high-level steps for spreading RAC VMs across multiple sites or AZs are shown below.
- Assign tags to ESXi servers in both AZs (us-west-2b and us-west-2c)
  - Create 2 new host Categories, one category for each AZ
    - ‘Category-FaultDomain-us-west-2b’ with Associable Object Types as ‘Host’
    - ‘Category-FaultDomain-us-west-2c’ with Associable Object Types as ‘Host’
  - Create 2 new host Tags, one tag for each AZ
    - ‘Tag-Host-us-west-2b’ with category as ‘Category-FaultDomain-us-west-2b’
    - ‘Tag-Host-us-west-2c’ with category as ‘Category-FaultDomain-us-west-2c’
  - Assign the appropriate Tag to every ESXi server in both AZs
    - ‘Tag-Host-us-west-2b’ is associated with servers in ‘us-west-2b’
    - ‘Tag-Host-us-west-2c’ is associated with servers in ‘us-west-2c’
- Spread RAC VMs across 2 AZs
  - Create a new Category ‘Category-RAC-VM-AZ’ with a multi-valued Associable Object Type set to both ‘Host’ and ‘Virtual Machine’
  - Create 2 new Tags, ‘Tag-RAC-VM-AZ1’ and ‘Tag-RAC-VM-AZ2’, both associated with ‘Category-RAC-VM-AZ’
  - Apply Tag ‘Tag-RAC-VM-AZ1’ to RAC VM1
  - Apply Tag ‘Tag-RAC-VM-AZ2’ to RAC VM2
  - Create 2 new Compute Policies
    - Compute Policy ‘CP-RAC-VM-AZ1’ for ‘us-west-2b’ with VM tag set to (Category-RAC-VM-AZ,Tag-RAC-VM-AZ1) and Host tag set to (Category-FaultDomain-us-west-2b,Tag-Host-us-west-2b)
    - Compute Policy ‘CP-RAC-VM-AZ2’ for ‘us-west-2c’ with VM tag set to (Category-RAC-VM-AZ,Tag-RAC-VM-AZ2) and Host tag set to (Category-FaultDomain-us-west-2c,Tag-Host-us-west-2c)
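The effect of the steps above can be sketched in Python as a lookup from VM tag to host tag to hosts. The tag and policy names come from the steps above and the VM names ‘prddg01’ and ‘prddg02’ from earlier in this post. The ESXi host names and the `preferred_hosts` function are hypothetical, and the sketch only models which hosts each policy points a VM at, not how DRS acts on the policy.

```python
# Host tag per ESXi server. Host names are hypothetical; tag names follow the steps above.
host_tags = {
    "esxi-2b-1": "Tag-Host-us-west-2b",
    "esxi-2b-2": "Tag-Host-us-west-2b",
    "esxi-2b-3": "Tag-Host-us-west-2b",
    "esxi-2c-1": "Tag-Host-us-west-2c",
    "esxi-2c-2": "Tag-Host-us-west-2c",
    "esxi-2c-3": "Tag-Host-us-west-2c",
}

# AZ tag per RAC VM.
vm_tags = {"prddg01": "Tag-RAC-VM-AZ1", "prddg02": "Tag-RAC-VM-AZ2"}

# Each VM-Host affinity policy pairs a VM tag with a host tag.
policies = {
    "CP-RAC-VM-AZ1": ("Tag-RAC-VM-AZ1", "Tag-Host-us-west-2b"),
    "CP-RAC-VM-AZ2": ("Tag-RAC-VM-AZ2", "Tag-Host-us-west-2c"),
}


def preferred_hosts(vm, vm_tags, host_tags, policies):
    """Return the sorted hosts that the policies associate with this VM's tag."""
    vm_tag = vm_tags[vm]
    wanted = {host_tag for policy_vm_tag, host_tag in policies.values()
              if policy_vm_tag == vm_tag}
    return sorted(host for host, tag in host_tags.items() if tag in wanted)


for vm in sorted(vm_tags):
    print(vm, preferred_hosts(vm, vm_tags, host_tags, policies))
```

Since the two host sets do not overlap, the two RAC VMs are steered to different AZs, which is the Site level HA goal.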
Assign tags to ESXi servers in all AZs
Create ‘Category-FaultDomain-us-west-2b’ with Associable Object Types as ‘Host’
Create ‘Category-FaultDomain-us-west-2c’ with Associable Object Types as ‘Host’
Both Categories can be seen as shown below
Create Tag ‘Tag-Host-us-west-2b’ with category as ‘Category-FaultDomain-us-west-2b’
Create Tag ‘Tag-Host-us-west-2c’ with category as ‘Category-FaultDomain-us-west-2c’
Both Tags can be seen as shown below
Assign ‘Tag-Host-us-west-2b’ to all servers in ‘us-west-2b’.
Assign ‘Tag-Host-us-west-2c’ to all servers in ‘us-west-2c’.
Spread RAC VMs across 2 AZs
Create a new Category ‘Category-RAC-VM-AZ’ with a multi-valued Associable Object Type set to both ‘Host’ and ‘Virtual Machine’
Below shows a Category ‘Category-RAC-VM-AZ’ with a multi-valued Associable Object Type set to both ‘Host’ and ‘Virtual Machine’
Create a new Tag ‘Tag-RAC-VM-AZ1’ associated with Category ‘Category-RAC-VM-AZ’
Create a new Tag ‘Tag-RAC-VM-AZ2’ associated with Category ‘Category-RAC-VM-AZ’
Both Tags ‘Tag-RAC-VM-AZ1’ and ‘Tag-RAC-VM-AZ2’ are shown below.
Apply Tag ‘Tag-RAC-VM-AZ1’ to RAC VM ‘prddg01’
Apply Tag ‘Tag-RAC-VM-AZ2’ to RAC VM ‘prddg02’
RAC VM ‘prddg01’ with ‘Tag-RAC-VM-AZ1’ is shown below
RAC VM ‘prddg02’ with ‘Tag-RAC-VM-AZ2’ is shown below
Create Compute Policy ‘CP-RAC-VM-AZ1’ for ‘us-west-2b’ with VM tag set to (Category-RAC-VM-AZ,Tag-RAC-VM-AZ1) and Host tag set to (Category-FaultDomain-us-west-2b,Tag-Host-us-west-2b)
Create Compute Policy ‘CP-RAC-VM-AZ2’ for ‘us-west-2c’ with VM tag set to (Category-RAC-VM-AZ,Tag-RAC-VM-AZ2) and Host tag set to (Category-FaultDomain-us-west-2c,Tag-Host-us-west-2c)
Both Compute Policies ‘CP-RAC-VM-AZ1’ and ‘CP-RAC-VM-AZ2’ are shown below.
Each RAC VM, ‘prddg01’ and ‘prddg02’, is associated with:
- Tag ‘Tag-RAC-VM-AZn’ [n=1,2], associated with Category ‘Category-RAC-VM-AZ’, to spread the RAC VMs across multiple AZs so that all RAC VMs do not land in the same AZ
- Tag ‘Tag-RAC-VM-AntiAffinity’, associated with Category ‘Category-RAC-VM-AntiAffinity’, to enforce Anti-Affinity rules for the RAC VMs
Both rules are compliant, as shown below.
RAC VM ‘prddg01’
RAC VM ‘prddg02’
As we can see from the above, we were able to spread Oracle RAC VMs across multiple sites or AZs to provide Site level HA.
Conclusion
Using vSphere Tags and Attributes, we are able to provide Site level HA along with Infrastructure level HA to an Oracle RAC on Stretched Clusters for VMware Cloud on AWS with 2 Fault Domains / AZs, ‘us-west-2b’ and ‘us-west-2c’, by:
- Setting up Anti-Affinity rules for RAC VMs, so that no 2 RAC VMs land on the same ESXi server (Infrastructure level HA)
- Spreading RAC VMs across AZs, so that all RAC VMs do not land in the same AZ (Site level HA)
The paper Migrating Oracle Workloads to VMware Cloud on AWS describes the deployment and migration options, along with best practices, for migrating Oracle Standalone and Oracle RAC on VMware on-premises (vSphere with traditional storage or VMware HCI vSAN) to Stretched Clusters for VMware Cloud on AWS.
All Oracle on vSphere white papers, including Oracle on VMware vSphere / VMware vSAN / VMware Cloud on AWS best practices, deployment guides and the workload characterization guide, can be found at Oracle on VMware Collateral – One Stop Shop.