Author Archives: Louis Liu

vCloud Availability for Cloud-to-Cloud DR 1.5 Reference Architecture

Overview

The vCloud Availability Cloud-to-Cloud DR solution provides replication and failover capabilities for vCloud Director workloads at both VM and vApp level.

VMware vCloud Availability for Cloud-to-Cloud DR Reference Architecture (PDF format here)

This post presents the reference architecture for vCloud Availability for Cloud-to-Cloud DR 1.5. The product allows tenant and service provider users to protect vApps between different virtual data centers within a vCloud Director environment and across different vCloud Director based clouds.

The architecture diagram illustrates the solution components required between a cloud provider’s two data centers, each backed by a different vCloud Director cloud management platform. It also shows the network flow directions and port numbers required for communication among the components in the vCloud Availability for Cloud-to-Cloud DR solution. The architecture supports symmetrical replication operations between cloud environments.

The service operates through a VMware Cloud Provider Program, and each installation provides recovery for multiple cloud environments. The vCloud Availability for Cloud-to-Cloud DR provides:

  • Self-service protection and failover workflows per virtual machine (VM).

  • Single installation package as a Photon-based virtual appliance.

  • The capability of each deployment to serve as both source and recovery vCloud Director instance (site). There are no dedicated source and destination sites.

  • Symmetrical replication flow that can be started from either the source or the recovery vCloud Director site.

  • Replication and recovery of vApps and VMs between vCloud Director sites.

  • Using a single-site vCloud Availability for Cloud-to-Cloud DR installation, you can migrate vApps and VMs between Virtual Data Centers that belong to a single vCloud Director Organization.

  • Secure Tunneling through a TCP proxy.

  • Integration with existing vSphere environments.

  • Multi-tenant support.

  • Built-in encryption or encryption and compression of replication traffic.

  • Support for multiple vCenter Server and ESXi versions.

Architecture Explained

When you deploy this solution from the OVA file in a production environment, do not choose the “Combined” configuration type. Instead, choose the “Manager node with vCloud Director Support” configuration (icon # 6 in the RA). Its description reads “The H4 Management Node. Deploy one of these if you need to configure replications to/from vCD”. H4 represents the vCloud Availability Replicator or Manager, and C4 represents the vCloud Availability vApp Replication Service or Manager. With this configuration type, the OVA installs three vCAV components together in a single appliance:

  1. vCloud Availability Cloud-to-Cloud DR Portal (icon # 5 in the RA)
  2. vCloud Availability vApp Replication Manager (icon # 4 in the RA)
  3. vCloud Availability Replication Manager (icon # 3 in the RA)

The three components above sit inside the white rectangle (icon # 6) in the reference architecture diagram. All communication between them happens internally and never leaves the appliance. For example, the vCloud Availability vApp Replication Manager uses REST API calls to the vCloud Availability Replication Manager to perform the required replication tasks.

  1. vCloud Director
    • With vCloud Director, cloud providers can build secure, multi-tenant private clouds by pooling infrastructure resources into virtual data centers and exposing them to users through web-based portals and programmatic interfaces as fully automated, catalog-based services.
  2. vCloud Availability Replicator Appliance
    • For production deployments, you deploy and configure one or more dedicated vCloud Availability Replicator appliances. The appliance exposes the low-level HBR primitives as REST APIs.

  3. vCloud Availability Replication Manager
    • A management service operating on the vCenter Server level. It understands the vCenter Server level concepts for starting the replication workflow for the virtual machines. It must have TCP access to the Lookup Service and to all the vCloud Availability Replicator appliances in both the local and remote sites.
  4. vCloud Availability vApp Replication Manager
    • Provides the main interface for the Cloud-to-Cloud replication operations. It understands the vCloud Director level concepts and works with vApps and virtual machines using vCD API calls.
  5. vCloud Availability C2C DR Portal
    • It provides tenants and service providers with a graphical user interface for managing the vCloud Availability for Cloud-to-Cloud DR solution. It also provides overall system and workload information.
  6. Manager node with vCloud Director Support
    • Single appliance that contains the following services:
      • vCloud Availability Cloud-to-Cloud DR Portal
      • vCloud Availability vApp Replication Manager
      • vCloud Availability Replication Manager
  7. vCenter Server with Platform Services Controller
    • The PSC provides common infrastructure services to the vSphere environment. Services include licensing, certificate management, and authentication with VMware vCenter Single Sign-On.
  8. vCloud Availability Tunnel Appliance
    • This solution requires each component on the local site to have bidirectional TCP connectivity to each component on the remote site. If bidirectional connections between sites are a problem, you can configure Cloud-to-Cloud Tunneling, in which case you must provide connectivity between the vCloud Availability Tunnel appliances on each site. Tunneling simplifies provider networking setup by channeling all incoming and outgoing traffic for a site through a single point.
  9. Network Address Translation
    • You must set an IP address and port in the local site that is reachable from remote sites and forward it to port 8048 on the private address of the vCloud Availability Tunnel appliance, for example by using destination network address translation (DNAT).
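
As a rough way to verify the connectivity requirements described for the Tunnel appliance and the DNAT rule, the sketch below tries to open a TCP connection to a given address and port. It is only an illustration: the function name and the host name in the usage comment are hypothetical, and a successful TCP handshake does not prove that the vCloud Availability service behind the port is healthy.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (hypothetical public address that DNAT forwards to the tunnel appliance):
#   tcp_port_reachable("tunnel.remote-site.example", 8048)
```

Running the check from each site against the other site’s published address gives a quick indication of whether the firewall and DNAT rules allow traffic in both directions.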

Coexistence

  1. Based on the product release notes, vCloud Availability for Cloud-to-Cloud DR 1.5 and vCloud Availability for vCloud Director 2.X can be installed and can operate together in the same vCloud Director environment. You can protect virtual machines either by using vCloud Availability for Cloud-to-Cloud DR 1.5 or vCloud Availability for vCloud Director 2.X.
  2. vCloud Availability for Cloud-to-Cloud DR 1.5 and vCloud Director Extender 1.1.X can be installed and can operate together in the same vCloud Director environment. You can migrate virtual machines either by using vCloud Availability for Cloud-to-Cloud DR 1.5 or vCloud Director Extender 1.1.X.

Interoperability

  • vSphere Hypervisor (ESXi) –  5.5 and above
  • vCenter Server – 6.0, 6.5 and 6.7
  • vCloud Director for Service Providers – 8.20, 9.0, 9.1 and 9.5

* Visit the VMware Product Interoperability Matrices website to check the latest supported product versions.

Notes

  • A comprehensive vCloud Availability Cloud-to-Cloud DR Design and Deploy Guide, published by my colleague Avnish Tripathi, is available here. It contains detailed design guidelines for this solution.
  • The official VMware vCloud Availability for Cloud-to-Cloud DR documentation is here.

Virtual Machine Performance Metrics in VMware vCloud Director 9.0

Starting with VMware vCloud Director® 5.6, service providers have been able to configure vCloud Director to store metrics that it collects on virtual machine performance and resource consumption. Data for historic metrics is stored in a Cassandra and KairosDB database.

VMware Cloud Providers™ can set up a database schema to store basic VM historical performance and resource consumption metrics (CPU, memory and storage), which are collected every 5 minutes (with 20-second granularity) by a StatsFeeder process running on the vCloud Director cells. These metrics are then pushed to a Cassandra NoSQL database cluster with KairosDB persistent storage.

However, this implementation has several limitations, including the following:

• Uses Kairos on top of Cassandra, with an extra layer to maintain
• Supports only outdated versions: KairosDB 0.9.1 and Cassandra 1.2.x/2.0.x
• VMware vCenter Server® does not provide metrics for NFS-based storage
• Difficult to control the size of the performance data because there is no TTL setting
• Lack of SSL support
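
To put the collection interval and the missing TTL in perspective, here is a back-of-the-envelope sketch of how many data points accumulate. It assumes that every 20-second sample is stored as one data point per metric per VM; the VM count, the number of metrics per VM, and the retention period in the usage comment are made-up inputs for illustration, not figures from vCloud Director.

```python
COLLECTION_INTERVAL_S = 5 * 60  # metrics are collected every 5 minutes
SAMPLE_GRANULARITY_S = 20       # each collection carries 20-second samples
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_collection() -> int:
    """Samples for one metric contained in a single 5-minute collection."""
    return COLLECTION_INTERVAL_S // SAMPLE_GRANULARITY_S


def samples_per_day() -> int:
    """Samples for one metric of one VM accumulated over a day."""
    return SECONDS_PER_DAY // SAMPLE_GRANULARITY_S


def samples_retained(vm_count: int, metrics_per_vm: int, retention_days: int) -> int:
    """Data points held at steady state (assumption: one point per sample)."""
    return vm_count * metrics_per_vm * samples_per_day() * retention_days


# Usage with hypothetical inputs: 1000 VMs, 10 metrics each, kept for 15 days.
#   samples_retained(1000, 10, 15)  -> 648000000
```

Without a retention limit the last factor grows without bound, which is why the lack of a TTL setting makes the data size hard to control.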

With vCloud Director 9.0, VMware has made the following enhancements:

• Provides hybrid mode (you can still choose to use KairosDB)
• Uses a native Cassandra schema and supports Cassandra 3.x
• Uses SSL
• Uses vCloud Director entity IDs to tag data in Cassandra instead of Moref/VC-id
• Adds the CMT command to configure a Cassandra cluster

 

After the service provider has successfully implemented this VM performance metrics collecting mechanism, vCloud Director tenant users can directly view their VM’s performance chart from within their vCloud Director 9.0 tenant HTML5 user interface. Service providers are no longer required to use the API call for this purpose, enabling them to offer this benefit to their customers in a much simpler way.

To configure basic VM metrics for vCloud Director 9.0, follow the steps in “Install and Configure Optional Database Software to Store and Retrieve Historic Virtual Machine Performance Metrics” in the vCloud Director 9.0 Installation and Upgrade Guide here. In this version, the configuration file does not need to be generated first. Simply follow the documented steps and everything will automatically be done for you.

If you issue the cell-management-tool configure-metrics --metrics-config /tmp/metrics.groovy command described here, you might hit a problem adding the schema (as shown in the following screen capture), in which vCloud Director 9.0 cannot start up normally and stops at the com.vmware.vcloud.metrices-core process.

Perform steps 1 through 3 below before running the cell-management-tool cassandra command in step 4. Otherwise, the command tries to add the same schema again, which causes the error:

1. Remove the keyspace on Cassandra (log in as cassandra or another superuser account):
# cqlsh -u cassandra -p cassandra
cqlsh> DROP KEYSPACE vcloud_metrics;

2. Edit the content of the /tmp/metrics.groovy file to:

configuration {
}

3. Run the following command:
# cell-management-tool configure-metrics --metrics-config /tmp/metrics.groovy

4. Run the following command (replace with your Cassandra user and IPs):
# cell-management-tool cassandra --configure --create-schema --cluster-nodes ip1,ip2,ip3,ip4 --username cassandra --password 'cassandra' --ttl 15 --port 9042
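
If you script this step, a small helper can assemble the command from your own node list and credentials. The following is a minimal sketch, not an official tool: the function names are hypothetical, it uses only the options shown in step 4, and it only builds the command line without running it.

```python
import shlex
from typing import List, Sequence


def build_cassandra_configure_command(
    nodes: Sequence[str],
    username: str,
    password: str,
    ttl: int = 15,
    port: int = 9042,
) -> List[str]:
    """Build the argument list for the step 4 command (hypothetical helper)."""
    if not nodes:
        raise ValueError("at least one Cassandra node address is required")
    return [
        "cell-management-tool", "cassandra",
        "--configure", "--create-schema",
        "--cluster-nodes", ",".join(nodes),  # comma-separated, no spaces
        "--username", username,
        "--password", password,
        "--ttl", str(ttl),
        "--port", str(port),
    ]


def as_shell_line(argv: Sequence[str]) -> str:
    """Quote each argument so the line can be pasted into a shell safely."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Usage with placeholder addresses; pass the list to subprocess.run on a cell
# if you want to execute it instead of printing it:
#   print(as_shell_line(build_cassandra_configure_command(
#       ["ip1", "ip2", "ip3", "ip4"], "cassandra", "cassandra")))
```

Building an argument list rather than a single string avoids quoting problems when the password contains shell metacharacters.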

Notes:

• See the latest vCloud Director 9.0 release notes here for supported vCloud Director Cassandra versions:
– Cassandra 2.2.6 (deprecated for new installations. Supported for legacy upgrades still using KairosDB)
– Cassandra 3.x (3.9 recommended)

• See the vCAT blog at https://blogs.vmware.com/vcat/2015/08/vmware-vcloud-director-virtual-machine-metric-database.html for detailed VM metrics explanations.

• The service provider can implement a more advanced tenant-facing performance monitoring solution by using the VMware vRealize® Operations Manager™ Tenant App for vCloud Director, which gives tenant administrators visibility into their vCloud Director environment. For more information, go to https://marketplace.vmware.com/vsx/solutions/management-pack-for-vcloud-director.

• There is no need to set up an additional load balancer in front of a Cassandra cluster. Cassandra’s Java driver balances requests across the Cassandra nodes on its own.
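
The idea behind client-side load balancing can be shown with a toy example. The sketch below is not the Cassandra Java driver’s actual policy or API; it is a hypothetical round-robin picker that only illustrates how a client that knows every node address can spread requests itself, without a separate load balancer.

```python
from itertools import cycle
from typing import Sequence


class RoundRobinNodes:
    """Toy client-side balancer: hands out known node addresses in turn."""

    def __init__(self, nodes: Sequence[str]) -> None:
        if not nodes:
            raise ValueError("at least one node address is required")
        # cycle() repeats the node list endlessly in the original order.
        self._nodes = cycle(list(nodes))

    def next_node(self) -> str:
        """Return the node that should receive the next request."""
        return next(self._nodes)


# Usage with placeholder addresses:
#   picker = RoundRobinNodes(["ip1", "ip2", "ip3"])
#   [picker.next_node() for _ in range(5)]
#   -> ['ip1', 'ip2', 'ip3', 'ip1', 'ip2']
```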