Tag Archives: vCloud Director

vCloud Director with Virtual SAN Sample Use Case

This brief, high-level implementation example provides a sample use case for VMware Virtual SAN in a vCloud Director for Service Providers environment.

As outlined in the illustration below, each Provider Virtual Data Center / Resource Cluster has been configured with a Virtual SAN datastore that meets the capability requirements set out in the Service Level Agreement (SLA) for that tier of service.

In this example, the service provider is deploying three tiers of offerings: Gold, Silver and Bronze. The compute consolidation ratio and Virtual SAN capability, based on the disk group configuration and storage policy, define how the offering will perform for a consumer. In addition, although not shown in the configuration below, the service provider employs Network I/O Control (NIOC) and QoS to ensure that an appropriate balance of network resources is assigned to each tier of service. This requires three separate tiered VLANs for Virtual SAN traffic (Gold, Silver and Bronze), with traffic priorities configured accordingly.

The exact disk configuration will vary depending on hardware manufacturer and provider SLAs.

Logical Design Overview

[Figure: Logical Design Overview]

The full VMware technology solution stack is illustrated below.

[Figure: VSAN with vCD]

The above figure shows how the solution is constructed on VMware technologies. The core vSphere platform provides the storage capability through Virtual SAN, which in turn is abstracted by vCloud Director. The Virtual SAN disk group configuration across the hosts, together with the storage policy configured at the vSphere level, defines the performance and capacity capabilities of the distributed datastore, which are in turn used to define the SLAs for this tier of the cloud offering.

As illustrated above, the vSphere resources are abstracted by vCloud Director into a Provider Virtual Data Center (PvDC). These resources are then carved up into individual Organization Virtual Data Centers (vDCs) that are assigned to tenant organizations. As a result, the vApps that reside within the Organization vDCs inherit the Virtual SAN storage capability defined by the service provider.

Typically, although it is outside the scope of this discussion, tiered service offerings are defined by more than just storage capability. A service provider will also use vCPU consolidation ratios, levels of guaranteed memory, network resources, backups and so on to define the SLAs.

As I develop this use case for the service providers I’m working with, I will update this article further.

VMware vCloud Director Virtual Machine Metric Database

This article is a preview of a section from the Hybrid Cloud Powered Automation and Orchestration document that is part of the VMware vCloud® Architecture Toolkit – Service Providers (vCAT-SP) document set. The document focuses on architectural design considerations for obtaining the VMware vCloud Powered service badge, which guarantees a true hybrid cloud experience for VMware vSphere® customers. The service provider requires validation from VMware that their public cloud fulfills the following hybridity requirements:

  • Cloud is built on vSphere and VMware vCloud Director®
  • vCloud user API is exposed to cloud tenants
  • Cloud supports Open Virtualization Format (OVF) for bidirectional workload movement

This particular section focuses on a new feature of vCloud Director—virtual machine performance and resource consumption metric collection, which requires deployment of an additional scalable database to persist and make available a large amount of data to cloud consumers.

Virtual Machine Metric Database

As of version 5.6, vCloud Director collects virtual machine performance metrics and provides historical data for up to two weeks.

Table 1. Virtual Machine Performance and Resource Consumption Metrics

Retrieval of both current and historical metrics is available through the vCloud API. The current metrics are retrieved directly from the VMware vCenter Server™ database with the Performance Manager API. The historical metrics are collected every 5 minutes (with 20-second granularity) by a StatsFeeder process running on the cells and are pushed to persistent storage: a Cassandra NoSQL database cluster with the KairosDB database schema and API.

The following figure depicts the recommended VM metric database design. Multiple Cassandra nodes are deployed in the same network. KairosDB runs on each node and provides an API endpoint that the vCloud cells use to store and retrieve data. For high availability and load balancing, all KairosDB instances sit behind a single virtual IP address, which is configured by using the cell management tool as the VM metric endpoint.

Figure 1. Virtual Machine Metric Database Design
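
As a rough illustration of how a client could read stored metrics back through the virtual IP, the following sketch builds a request for the KairosDB REST API. It relies on the `/api/v1/datapoints/query` endpoint with a `start_relative` window; the host name, port, metric name, and helper function names are placeholders chosen for this example and are not taken from the vCAT document.

```shell
#!/bin/sh
# Sketch: build a KairosDB query for the last N hours of one metric.
# The virtual IP, port, and metric name used below are placeholders.

# Print the base URL of the KairosDB REST API behind the virtual IP.
kairos_url() {
    echo "http://$1:$2/api/v1"
}

# Print a JSON query body for the given metric name and number of hours.
query_body() {
    printf '{"start_relative":{"value":%s,"unit":"hours"},"metrics":[{"name":"%s"}]}\n' "$2" "$1"
}

# Send the query (arguments: host port metric hours).
# Call this only against a reachable KairosDB endpoint.
query_metric() {
    query_body "$3" "$4" | curl -s -X POST -d @- "$(kairos_url "$1" "$2")/datapoints/query"
}

# Show what would be sent without contacting any server.
kairos_url metrics-vip.example.com 8080
query_body cpu.usage.average 1
```

Because every KairosDB instance sits behind the same virtual IP, a client built this way does not need to know which Cassandra node ends up serving the request.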

Design Considerations

  • Currently only KairosDB 0.9.1 and Cassandra 1.2.x/2.0.x are supported.
  • Minimum cluster size is three nodes (the cluster size must be equal to or larger than the replication factor). Use a scale-out rather than a scale-up approach because Cassandra performance scales linearly with the number of nodes.
  • Estimate I/O requirements based on the expected number of VMs, and correctly size the Cassandra cluster and its storage.

n … expected number of VMs
m … number of metrics per VM (currently 8)
t … retention (days)
r … replication factor

Write I/O per second = n × m × r / 10
Storage = n × m × t × r × 114 kB

For 30,000 VMs, the estimate is 72,000 write IOPS and 3,288 GB of storage (worst-case scenario with 6 weeks, or 42 days, of data retention and a replication factor of 3; the storage figure counts 1 GB as 1024 × 1024 kB).
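
The arithmetic behind this estimate can be checked with a short shell sketch. The function names `write_iops` and `storage_gb` are illustrative and not part of any VMware tool. The sketch assumes that 1 GB is counted as 1024 × 1024 kB, which is the convention that reproduces the 3288 GB figure.

```shell
#!/bin/sh
# Sizing sketch for the VM metric database (illustrative helper names).
# n = expected number of VMs, m = metrics per VM,
# t = retention in days, r = replication factor.

# Write I/O per second = n * m * r / 10  (arguments: n m r)
write_iops() {
    echo $(( $1 * $2 * $3 / 10 ))
}

# Storage = n * m * t * r * 114 kB, reported in GB  (arguments: n m t r)
# Assumption: 1 GB = 1024 * 1024 kB, rounded to the nearest GB.
storage_gb() {
    kb=$(( $1 * $2 * $3 * $4 * 114 ))
    echo $(( (kb + 524288) / 1048576 ))
}

# Worst case from the text: 30,000 VMs, 8 metrics, 42 days, replication factor 3.
echo "Write IOPS: $(write_iops 30000 8 3)"
echo "Storage GB: $(storage_gb 30000 8 42 3)"
```

Changing the arguments shows how strongly retention and replication factor drive the storage requirement, since both multiply the total directly.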

  • Enable Leveled Compaction Strategy (LCS) on the Cassandra cluster to improve read performance.
  • Install JNA (Java Native Access) version 3.2.7 or later on each node because it can improve Cassandra memory usage (no JVM swapping).
  • For heavy read utilization (many tenants collecting performance statistics) and availability, VMware recommends increasing the replication factor to 3.
  • Recommended size of one Cassandra node: 8 vCPUs (more CPU improves write performance), 16 GB RAM (more memory improves read performance), and 2 TB storage (each node backed by separate LUNs/disks with high IOPS performance).
  • KairosDB does not enforce a data retention policy, so old metric data must be regularly cleared with a script. The following example deletes one month’s worth of data:

#!/bin/bash
# Deletes one month of VM metric data from KairosDB, one day at a time.
# Requires bash (arrays, [[ ]]) and GNU date.

if [ "$#" -ne 4 ]; then
    echo "Usage: $0 host port month year"
    exit 1
fi

# Refuse to delete data when the month starts less than 6 weeks (42 days) ago.
DAYS=$(( ( $(date -ud 'now' +'%s') - $(date -ud "${4}-${3}-01 00:00:00" +'%s') ) / 60 / 60 / 24 ))
if [[ $DAYS -lt 42 ]]; then
    echo "The month to delete must start at least 6 weeks ago"
    exit 1
fi

# Fetch all metric names and keep only the VM metrics (cpu, mem, disk, net, sys).
METRICS=( $(curl -s -k "http://$1:$2/api/v1/metricnames" -X GET | sed -e 's/[{}]//g' | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | tr -d '[":]' | sed 's/results//g' | grep -w "cpu\|mem\|disk\|net\|sys") )
echo "${METRICS[@]}"

for var in "${METRICS[@]}"; do
    for day in $(seq 1 31); do
        # Skip days that do not exist in this month (for example, day 31 of a 30-day month).
        date -d "$3/$day/$4" > /dev/null 2>&1 || continue
        # KairosDB expects timestamps in milliseconds since the epoch.
        STARTDAY=$(( $(date -d "$3/$day/$4" +%s%N) / 1000000 ))
        ENDDAY=$(( $(date -d "$3/$day/$4 +1 day" +%s%N) / 1000000 ))
        echo "Deleting $var for $3/$day/$4"
        echo '
        {
          "metrics": [
          {
            "tags": {},
            "name": "'${var}'"
          }
          ],
          "cache_time": 0,
          "start_absolute": "'${STARTDAY}'",
          "end_absolute": "'${ENDDAY}'"
        }' > /tmp/metricsquery
        curl "http://$1:$2/api/v1/datapoints/delete" -X POST -d @/tmp/metricsquery
    done
done

rm -f /tmp/metricsquery > /dev/null 2>&1

Note: The space gains will not be seen until the delete marker column (tombstone) expires and data compaction occurs. The tombstone grace period is 10 days by default, but you can change it by setting the gc_grace_seconds property on the affected Cassandra tables (column families).
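
As a small worked example of the grace period, the sketch below converts days into a gc_grace_seconds value and prints, without executing it, a CQL statement that would apply a shorter period. It assumes the value is set as a table property through CQL, and the keyspace and table names (`kairosdb`, `data_points`) are assumptions about a default KairosDB installation; verify them against your own schema before use.

```shell
#!/bin/sh
# Convert a tombstone grace period in days to a gc_grace_seconds value.
gc_grace_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# The default of 10 days corresponds to 864000 seconds.
echo "10 days = $(gc_grace_seconds 10) seconds"

# Hypothetical example: print (not execute) a CQL statement that lowers the
# grace period to 5 days. The keyspace and table names are assumptions.
KEYSPACE="kairosdb"
TABLE="data_points"
echo "ALTER TABLE ${KEYSPACE}.${TABLE} WITH gc_grace_seconds = $(gc_grace_seconds 5);"
```

A shorter grace period frees space sooner, but it should stay longer than the interval between repairs so that deleted data cannot reappear from a replica that missed the delete.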

  • KairosDB v0.9.1 uses the QUORUM consistency level for both reads and writes. Quorum is calculated as (replication factor / 2) + 1, rounded down, and that number of replica nodes must be available for both reads and writes. Data is assigned to nodes through a hash algorithm, and every replica is of equal importance. The following table provides guidance on replication factor and cluster size configurations.

Table 2. Cassandra Configuration Guidance
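
To make the quorum rule concrete, the following sketch computes the quorum size and the number of replicas that can be unavailable for a few replication factors. It uses Cassandra's rule of quorum = floor(replication factor / 2) + 1. The helper names are illustrative only.

```shell
#!/bin/sh
# Quorum sketch (illustrative helper names).

# Quorum = floor(replication factor / 2) + 1
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Replicas that can be unavailable while QUORUM reads and writes still succeed.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for rf in 1 2 3 5; do
    echo "RF=$rf quorum=$(quorum "$rf") tolerated failures=$(tolerated_failures "$rf")"
done
```

With a replication factor of 2, the quorum is still 2, so losing a single replica already blocks QUORUM operations. A replication factor of 3 is the smallest value that tolerates one unavailable replica.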

vCAT Software Tools

The vCloud Software Tools document provides the reader with instructions on how to use some of the tools that are available for implementing, managing and reporting on vCloud environments.

vCloud Director Audit provides automated report generation against a vCloud Director deployment, providing you with details on how the vCloud environment is configured, provisioned and being consumed by the tenants of the cloud.

vCloud Provisioner provides an automated framework for describing the configuration requirements of the cloud being provisioned and for executing that description against a vCloud Director implementation, automatically creating all of the underlying vCloud objects necessary to deliver the services described.

Cloud Cleaner allows you to provision a vCloud Director instance on top of a vSphere environment, perform any testing or evaluation you wish to carry out, and then clean up anything performed against the vSphere environment by vCloud Director, returning the vSphere environment to a pre-vCloud Director state. This is useful when evaluating vCloud Director, or when educating yourself on how vCloud Director is implemented on an underlying vSphere environment.