vCloud Director for Service Providers (VCD-SP) and RabbitMQ Security

Let us start with what RabbitMQ is and how it fits into vCloud Director for Service Providers (VCD-SP).

RabbitMQ provides robust messaging for applications, in particular vCloud Director for Service Providers (VCD-SP). Messaging describes the sending and receiving of data (in the form of messages) between systems. Messages are exchanged between programs or applications, similar to the way people communicate by email, but with selectable guarantees on delivery, speed, security and the absence of spam.

A messaging infrastructure (a.k.a. message-oriented middleware or enterprise service bus) makes it easier for developers to create complex applications by decoupling individual program components. Rather than communicating directly, components exchange data through the messaging infrastructure. The components need know nothing about each other’s status, availability or implementation, which allows them to be distributed over heterogeneous platforms and turned off and on as required.

In a vCloud Director for Service Providers deployment, VCD-SP uses the open standard AMQP protocol to publish messages associated with Blocking Tasks or Notifications. AMQP is the wire protocol natively understood by RabbitMQ and many similar messaging systems. It defines the wire format of messages and specifies the operational details of how messages are published and consumed. VCD-SP also uses AMQP to communicate with extension services (see http://goo.gl/xZ9gkL). vCloud Director for Service Providers API extensions are implemented as services that consume API requests from a RabbitMQ queue. The API request (an HTTP request) is serialized and published as an AMQP message. The API implementation consumes the message, performs the business logic and then replies with an AMQP message. In order to publish and consume messages, you need to configure your RabbitMQ exchange and queues.

A RabbitMQ server, or _broker_, runs within the vCloud Director for Service Providers network environment; for example, it can be deployed into the vSphere installation underlying VCD-SP as a virtual appliance, or vApp. Clients (in this case the cells of the VCD-SP infrastructure itself, as well as other applications interested in notifications) connect to the RabbitMQ broker. Such clients then publish messages to, or consume messages from, the broker. The RabbitMQ broker is written in the Erlang programming language and runs on the Erlang virtual machine. Notes on Erlang-related security and operational issues are presented later in this vCAT-SP blog.

 

The Base Operating System Hosting the RabbitMQ Broker

Securing the RabbitMQ broker in a vCloud Director for Service Providers environment begins with securing the base operating system of the computer (bare metal or virtualized) on which Rabbit runs. Rabbit runs on many platforms, including Windows and multiple versions of Linux. As of this writing, commercial versions of RabbitMQ are sold by VMware as part of the vFabric suite and supported on Windows and RPM-based Linux distributions in the Fedora/RHEL family, as well as in a tar.gz-packaged Generic Linux edition. Please see http://docs.gopivotal.com/rabbitmq/index.html for purchasing details.

It is generally recommended in a vCloud Director Service Provider (VCD-SP) deployment that a Linux distribution of RabbitMQ be used.  VMware expects to eventually provide a pre-packaged vApp with a Linux installation, the necessary Erlang runtime, and a RabbitMQ broker, although this form factor is not yet officially released. The VMware RabbitMQ virtual appliance undergoes, as part of its build process, a security hardening regime common to VMware-produced virtual appliances.

If a customer is deploying RabbitMQ on a Linux distribution of their own choosing, whether running on a bare-metal OS or as part of a virtual appliance they have created themselves, VMware’s security team recommends adopting the NSA operating systems security guidelines and the DISA Security Technical Implementation Guides (STIGs) listed under Sources/References at the end of this post for securing the base operating system in question.

The hardening discipline applied to the VMware-produced RabbitMQ virtual appliance is based on the DISA STIG recommendations.

 

General networking concerns

Exposing the AMQP traffic that flows between vCloud Director for Service Providers cells and other interested applications in one’s cloud infrastructure outside of the private networks meant for cloud management can expose a VCD-SP provider to security threats. Messages are published on an AMQP broker like RabbitMQ whenever something in vCloud Director for Service Providers changes, and thus may include sensitive information. AMQP ports should therefore be blocked at the network firewall protecting the DMZ to which vCloud cells are connected. Code that consumes AMQP messages from the broker must also be connected to the same DMZ. Any such piece of code should be controlled, or at least audited to the point of trustworthiness, by the service provider.

It is also worth mentioning that AMQP is not exposed to any Cloud tenants and is only used by the Service Provider.

The Erlang runtime

What is Erlang?

Erlang is a programming language developed and used by Ericsson in its high-end telephony and data routing products. The language and its associated virtual machine support several features leveraged by RabbitMQ, including:

  • support for highly concurrent applications like RabbitMQ
  • built-in support for distributed computing, thus enabling easier clustering of RabbitMQ systems
  • built-in process monitoring and control, for ensuring that a RabbitMQ broker’s subsystems remain running and healthy
  • Mnesia: a performant distributed database
  • high-performance execution.

That RabbitMQ is written in Erlang matters relatively little to a system administrator responsible for deploying, configuring and securing the broker, with only a few small exceptions:

  • Erlang distribution has certain open port constraints.
  • Erlang distribution requires a special “cookie” file to be shared between hosts participating in distributed Erlang communication; this cookie must be kept private.
  • Some RabbitMQ configuration files are represented with Erlang syntax, of which one must be mindful when placing delimiters (like ‘[‘, ‘{‘, and ‘)’) and certain punctuation marks (notably the comma and the period).

 

Running Erlang securely for RabbitMQ

When clustered, RabbitMQ is a distributed Erlang system, consisting of multiple Erlang virtual machines communicating with one another.  Each such running virtual machine is called a *node*.  In such a configuration, the administrator must be aware of two basic Erlang ideas: the Erlang port mapper daemon, and the Erlang node magic cookie.

 

epmd:  The Erlang port mapper daemon

The Erlang port mapper daemon is started automatically on every host where an Erlang node (such as a RabbitMQ broker) is started. The appearance of a process called ‘epmd’ is not to be viewed with alarm. The Erlang virtual machine itself is called ‘beam’ or ‘beam.smp’ and at least one of these will be seen on a machine running the RabbitMQ server. The Erlang port mapper daemon listens by default on TCP port 4369, so the host system’s firewall should leave this port open.

 

Node magic cookies

Each Erlang node (as defined above) has its own magic cookie, which is an Erlang atom contained in a text file. When an Erlang node tries to connect to another node (this could be a pair of RabbitMQ brokers connecting in a clustered RabbitMQ implementation, or the rabbitmqctl utility connecting to a broker to perform some administrative function upon it) the magic cookie values are compared. If the values of the cookies do not match, the connected node rejects the connection.

A node magic cookie on a system should be readable only by those users under whose id Erlang processes that need to communicate with one another are expected to run.  The Unix permissions of cookie files should typically be 400 (read-only by user).

For most versions of RabbitMQ, cookie creation and installation is handled automatically during installation.  For an RPM-based Linux distribution of RabbitMQ such as that for RHEL/Fedora the cookie will be created and deposited in /var/lib/rabbitmq, called ‘.erlang.cookie’ and given permissions 400 as described above.
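The permission requirement above can be checked with a few lines of shell. The following is only a sketch: the function name cookie_mode_ok is made up for illustration, and it assumes the GNU coreutils version of stat that ships with RHEL/Fedora.

```shell
# Hypothetical helper (not part of RabbitMQ): succeeds only when the given
# file exists and its permission bits are exactly 400.
# Assumes GNU coreutils `stat`, as found on RHEL/Fedora.
cookie_mode_ok() {
    cookie_file="$1"
    # Return 2 when the file is missing, so callers can tell
    # "missing" apart from "wrong permissions" (return 1).
    [ -f "$cookie_file" ] || return 2
    mode=$(stat -c '%a' "$cookie_file") || return 2
    [ "$mode" = "400" ]
}
```

On an RPM-based installation this could be called as cookie_mode_ok /var/lib/rabbitmq/.erlang.cookie, with a non-zero exit status treated as a reason to review the file.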

Rabbit server concepts

Rabbit security: the OS-facing side

OS user accounts

RPM-based Linux

In an RPM-based Linux distribution such as the vFabric release of RabbitMQ or the RabbitMQ virtual appliance, the Rabbit server runs as a daemon, started by default at OS boot time.  On such a platform the server is set up to run as system user ‘rabbitmq’.  The Mnesia database and log files must be owned by this user.  More will be said about these files in subsequent sections.

To change whether the server starts at system boot time use:

$ chkconfig rabbitmq-server on

or:

$ chkconfig rabbitmq-server off

An administrator can start or stop the server with:

$ /sbin/service rabbitmq-server stop|start|restart

 

Network ports

Unless configured otherwise, the RabbitMQ broker will listen on the default AMQP port of 5672. If the management plugin is installed to provide browser-based and HTTP API-based management services, it will listen on port 55672 (RabbitMQ 3.0 and later use port 15672 instead).

*Any firewall configuration should be certain to open these two ports.*

Strictly speaking, only port 5672 needs to be open for VCD-SP to work. Open port 55672 only if you want to expose the management interface to the outside world.

As noted above, the Erlang port mapper daemon port, TCP 4369, must also be open.
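As an illustration of the port list above, the sketch below prints (but does not apply) iptables rules that admit traffic to these ports from a single trusted network only. The function name and the example subnet are assumptions made for this example, and any real rule set should be reviewed against your own firewall policy.

```shell
# Hypothetical helper: print iptables rules for the RabbitMQ ports named in
# this post, restricted to one source network. Nothing is applied; the output
# is meant to be reviewed by an administrator first.
rabbit_firewall_rules() {
    mgmt_net="$1"
    ports="5672 4369"            # AMQP and the Erlang port mapper daemon
    if [ "$2" = "with-mgmt" ]; then
        ports="$ports 55672"     # management plugin, only when it must be reachable
    fi
    for port in $ports; do
        echo "iptables -A INPUT -p tcp -s $mgmt_net --dport $port -j ACCEPT"
    done
}
```

For example, rabbit_firewall_rules 10.0.0.0/24 prints two rules, and adding with-mgmt as a second argument prints a third rule for the management port.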

 

Rabbit security: The broker-facing side

When considering the security of the RabbitMQ broker itself, it is helpful to separate two concerns: the face Rabbit shows to the outside world (how communication with clients can optionally be authenticated and secured against eavesdropping), and the way RabbitMQ’s internal structures, such as exchanges, queues and the bindings between them that determine message routing, are governed.

For the former consideration, a RabbitMQ broker can be configured to communicate with clients using the SSL protocol.  This can provide channel security for client-broker communications and optionally the verification of the identities of communicating parties.

 

TLSv1.2 and RabbitMQ in vCloud Director for Service Providers (VCD-SP)

In the context of vCloud Director for Service Providers (VCD-SP), the administrator can configure VCD-SP to use secure communication based on TLSv1.2 when sending messages to the AMQP broker. VCD-SP can also be configured to verify the certificate presented by the broker in order to authenticate the broker’s identity.

To enable secured communication, log in to VCD-SP as a system administrator. In the ‘Administration’ section of the user interface, open the ‘Blocking Tasks’ page and select the ‘Settings’ tab. In the ‘AMQP Broker Settings’ section there is a checkbox labelled ‘Use SSL’. Turn this option on. You can then choose either to accept all certificates (turn the ‘Accept All Certificates’ option on) or to verify presented certificates.

To configure verification of the broker’s certificate, either create a Java KeyStore in JCEKS format that contains the trusted certificate(s) used to sign the broker’s certificate, or upload the certificate directly if it is in PEM format. In the same ‘AMQP Broker Settings’ section, use the ‘Browse’ button for either a single SSL certificate or an SSL key store. If you upload a key store, you must also provide the SSL key store password. If neither a key store nor a certificate is uploaded, the default JRE truststore is used.
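One possible way to build such a JCEKS key store is with the keytool utility that ships with the JRE. The sketch below wraps the import in a function; the function name, the alias rabbitmq-ca, the file names and the password are placeholders chosen for illustration, not values required by VCD-SP.

```shell
# KEYTOOL can be overridden, for example with "echo keytool", to preview the
# command without running it.
KEYTOOL="${KEYTOOL:-keytool}"

# Hypothetical helper: import the CA certificate that signed the broker's
# certificate into a JCEKS key store. Arguments: PEM file, key store file,
# key store password (all placeholders).
make_amqp_truststore() {
    ca_pem="$1"
    store="$2"
    store_pass="$3"
    $KEYTOOL -importcert -noprompt \
        -alias rabbitmq-ca \
        -file "$ca_pem" \
        -keystore "$store" \
        -storetype JCEKS \
        -storepass "$store_pass"
}
```

A call such as make_amqp_truststore cacert.pem amqp-trust.jceks 'a-strong-password' would produce a file that can then be uploaded as the SSL key store, together with its password.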

 

Securing RabbitMQ AMQP communication with SSL

Full documentation on setting up the RabbitMQ broker’s built-in SSL support can be found at: http://www.rabbitmq.com/ssl.html

The documentation at this site covers:

  • the creation of a certificate authority using OpenSSL and the generation of signed certificates for both the RabbitMQ server and its clients.
  • enabling SSL support in RabbitMQ by editing the broker’s config file (for its location on a specific Rabbit platform see http://www.rabbitmq.com/configure.html#configuration-file)
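For orientation, a broker configuration with an SSL listener might look like the file generated below. This is a sketch based on the option names used in the RabbitMQ SSL guide; the function name, the listener port 5671 and the certificate file names are assumptions for this example, and the guide linked above remains the authority for your broker version.

```shell
# Hypothetical helper: write a minimal rabbitmq.config that enables an SSL
# listener. Arguments: output file, directory holding the PEM files.
# Port 5671 and the file names cacert.pem, cert.pem and key.pem are placeholders.
write_rabbit_ssl_config() {
    out_file="$1"
    cert_dir="$2"
    # The file is a single Erlang term: note the matching brackets and the
    # terminating period on the last line.
    cat > "$out_file" <<EOF
[
  {rabbit, [
    {ssl_listeners, [5671]},
    {ssl_options, [{cacertfile, "$cert_dir/cacert.pem"},
                   {certfile,   "$cert_dir/cert.pem"},
                   {keyfile,    "$cert_dir/key.pem"},
                   {verify,     verify_peer},
                   {fail_if_no_peer_cert, false}]}
  ]}
].
EOF
}
```

Writing to a scratch location first, for example write_rabbit_ssl_config /tmp/rabbitmq.config /etc/rabbitmq/ssl, lets you inspect the result before it replaces the live configuration file.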

 

Broker virtual hosts and RabbitMQ users

A RabbitMQ server internally defines a set of AMQP users (with passwords), which are stored in its Mnesia database.  *NOTE:* A freshly installed RabbitMQ broker starts life with a user account called ‘guest’ and endowed with the password ‘guest’.  We recommend that this password be changed, or this account deleted when RabbitMQ is first set up.

A RabbitMQ broker’s resources are logically partitioned into multiple “virtual hosts.”  Each virtual host provides a separate namespace for resources such as exchanges and queues.  When clients connect to a broker, they specify the virtual host with which they plan to interact at connection time.  A first level of access control is enforced at this point, with the server checking whether the user has sufficient permissions to access the virtual host.  If not, the connection is rejected.

RabbitMQ offers _configure_, _read_, and _write_ permissions on its resources.  Configure operations create or destroy resources, or modify their behavior.  Write operations inject messages into a resource, and read operations retrieve messages from a resource.

It is important to note that VCD-SP requires all of these permissions to be granted to its AMQP user.

Details on RabbitMQ virtual hosts, users, access control and permissions can be found here:

http://www.rabbitmq.com/admin-guide.html

The setting of permissions using the ‘rabbitmqctl’ utility is described in:

http://www.rabbitmq.com/man/rabbitmqctl.1.man.html#Access%20control

One should stick to a policy of least privilege in the granting of permissions on broker resources.
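One way to reconcile least privilege with the full permissions VCD-SP needs is to give VCD-SP its own user on a dedicated virtual host and to remove the default guest account. The sketch below uses standard rabbitmqctl subcommands; the function name is hypothetical, and the user, password and virtual host names are whatever you choose to pass in.

```shell
# RABBITMQCTL can be overridden, for example with "echo rabbitmqctl", to
# preview the commands without touching a broker.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Hypothetical helper: create a dedicated virtual host and user for VCD-SP,
# grant that user configure, write and read permissions (in that argument
# order) on every resource in that virtual host only, then delete 'guest'.
setup_vcd_amqp_user() {
    user="$1"
    password="$2"
    vhost="$3"
    $RABBITMQCTL add_vhost "$vhost" &&
    $RABBITMQCTL add_user "$user" "$password" &&
    $RABBITMQCTL set_permissions -p "$vhost" "$user" ".*" ".*" ".*" &&
    $RABBITMQCTL delete_user guest
}
```

Because the permissions are scoped to one virtual host, other virtual hosts on the same broker stay out of reach of the VCD-SP account.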

 

The rabbitmqctl utility

The rabbitmqctl utility (analogous to apachectl or tomcatctl) is one of the primary points of contact for administering RabbitMQ. On Linux systems a man page for rabbitmqctl is typically available, specifying its many options. The contents of this page can also be found online at:

http://www.rabbitmq.com/man/rabbitmqctl.1.man.html

 

The Rabbit broker:  Where things are and how they should be protected

The following are true for a RabbitMQ server installed on an RPM-based Linux distribution such as RHEL/Fedora.  Permissions are given for top level directories where named.  Data files within them may have more liberal permissions set, particularly group/other authorized to read/write.

 

Erlang cookie

Ownership:    rabbitmq/rabbitmq

Permissions:  400

Location: /var/lib/rabbitmq/.erlang.cookie

 

RabbitMQ logs

Ownership:    rabbitmq/rabbitmq

Permissions:  755

Location: /var/log/rabbitmq/

|– rabbit@localhost-sasl.log

|– rabbit@localhost.log

|– startup_err

`– startup_log

 

Mnesia database location, plugins and message stores

Ownership:    rabbitmq/rabbitmq

Location: /var/lib/rabbitmq/mnesia

|– rabbit@localhost

|   |– msg_store_persistent

|   `– msg_store_transient

`– rabbit@localhost-plugins-expand

 

Configuration files location and permissions

RabbitMQ’s main configuration file, as well as the environment variables that influence its behavior, are documented here: http://www.rabbitmq.com/configure.html

Note that the contents of the rabbitmq.config file are an Erlang term, and it is thus important to be mindful of delimiters and line ending symbols, so as not to produce a syntactically invalid file that will prevent RabbitMQ from starting up.

 

Privileges required to run broker process and rabbitmqctl

Ownership:    root/root

Permissions:  755

Location: /usr/sbin/rabbitmqctl

The rabbitmqctl utility must be run as root, and maintain ownership and permissions as above.

The broker can be started, stopped, restarted or status checked by an administrator running:

$ /sbin/service rabbitmq-server stop|start|restart|status

 

Sources/References

VMware vFabric Cloud Application Platform (with purchase links for commercial RabbitMQ):

http://info.vmware.com/content/12834_index?src=PaidSearch_Google_amer-us_ENG_vFabric_vFab_Brand_Search&gclid=CLuOp7e84asCFTAaQgodJzlEQw

NSA operating systems security guidelines: http://www.nsa.gov/ia/guidance/security_configuration_guides/operating_systems.shtml

US DoD Information Assurance Support Environment Security Technical Implementation Guides for operating systems: http://iase.disa.mil/stigs/os/index.html#

RabbitMQ broker configuration: http://www.rabbitmq.com/configure.html

RabbitMQ administration guide: http://www.rabbitmq.com/admin-guide.html

RabbitMQ broker/client SSL configuration guide: http://www.rabbitmq.com/ssl.html

RabbitMQ configuration file reference: http://www.rabbitmq.com/configure.html#configuration-file

Configuring access control with rabbitmqctl: http://www.rabbitmq.com/man/rabbitmqctl.1.man.html#Access%20control

Rabbitmqctl man page: http://www.rabbitmq.com/man/rabbitmqctl.1.man.html

 

Authored by Michael Haines – Global Cloud Practice

Special thanks to Radoslav Gerganov and Jerry Kuch for their help and support.

VMware vCloud Director Virtual Machine Metric Database

This article is a preview of a section from the Hybrid Cloud Powered Automation and Orchestration document that is part of the VMware vCloud® Architecture Toolkit – Service Providers (vCAT-SP) document set. The document focuses on architectural design considerations to obtain the VMware vCloud Powered service badge, which guarantees true hybrid cloud experience for VMware vSphere® customers. The service provider requires validation from VMware that their public cloud fulfills hybridity requirements:

  • Cloud is built on vSphere and VMware vCloud Director®
  • vCloud user API is exposed to cloud tenants
  • Cloud supports Open Virtualization Format (OVF) for bidirectional workload movement

This particular section focuses on a new feature of vCloud Director—virtual machine performance and resource consumption metric collection, which requires deployment of an additional scalable database to persist and make available a large amount of data to cloud consumers.

Virtual Machine Metric Database

As of version 5.6, vCloud Director collects virtual machine performance metrics and provides historical data for up to two weeks.

Table 1. Virtual Machine Performance and Resource Consumption Metrics

Retrieval of both current and historical metrics is available through the vCloud API. The current metrics are retrieved directly from the VMware vCenter Server™ database with the Performance Manager API. The historical metrics are collected every 5 minutes (with 20-second granularity) by a StatsFeeder process running on the cells and are pushed to persistent storage: a Cassandra NoSQL database cluster with the KairosDB database schema and API. The following figure depicts the recommended VM metric database design. Multiple Cassandra nodes are deployed in the same network. KairosDB runs on each node and also provides an API endpoint for vCloud cells to store and retrieve data. For high availability and load balancing, all KairosDB instances are behind a single virtual IP address, which is configured by the cell management tool as the VM metric endpoint.

Figure 1. Virtual Machine Metric Database Design

Design Considerations

  • Currently only KairosDB 0.9.1 and Cassandra 1.2.x/2.0.x are supported.
  • Minimum cluster size is three nodes (must be equal to or larger than the replication factor). Use a scale-out rather than a scale-up approach because Cassandra performance scales linearly with the number of nodes.
  • Estimate I/O requirements based on the expected number of VMs, and correctly size the Cassandra cluster and its storage.

n … expected number of VMs
m … number of metrics per VM (currently 8)
t … retention (days)
r … replication factor

Write I/O per second = n × m × r / 10
Storage = n × m × t × r × 114 kB

For 30,000 VMs, the I/O estimate is 72,000 write IOPS and 3288 GB of storage (worst-case scenario if data retention is 6 weeks and replication factor is 3).
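The two formulas above are easy to evaluate in shell arithmetic. The sketch below reproduces the 30,000-VM figures; the function names are made up for this example, and the kB-to-GB conversion (1 GB = 1024 × 1024 kB, rounded to the nearest GB) is an assumption chosen because it matches the 3288 GB quoted above.

```shell
# Write I/O per second = n * m * r / 10
# Arguments: n (VMs), m (metrics per VM), r (replication factor)
write_iops() {
    echo $(( $1 * $2 * $3 / 10 ))
}

# Storage = n * m * t * r * 114 kB, reported in GB.
# Arguments: n (VMs), m (metrics per VM), t (retention in days), r (replication factor)
# Assumes 1 GB = 1024 * 1024 kB and rounds to the nearest GB.
storage_gb() {
    kb=$(( $1 * $2 * $3 * $4 * 114 ))
    echo $(( (kb + 524288) / 1048576 ))
}

write_iops 30000 8 3       # prints 72000
storage_gb 30000 8 42 3    # prints 3288
```

Changing the retention argument from 42 days to a shorter period shows directly how much storage a tighter retention policy would save.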

  • Enable Leveled Compaction Strategy (LCS) on the Cassandra cluster to improve read performance.
  • Install JNA (Java Native Access) version 3.2.7 or later on each node because it can improve Cassandra memory usage (no JVM swapping).
  • For heavy read utilization (many tenants collecting performance statistics) and availability, VMware recommends increasing the replication factor to 3.
  • Recommended size of 1 Cassandra node: 8 vCPUs (more CPU improves write performance), 16 GB RAM (more memory improves read performance), and 2 TB storage (each backed by separate LUNs/disks with high IOPS performance).
  • KairosDB does not enforce a data retention policy, so old metric data must be regularly cleared with a script. The following example deletes one month’s worth of data:

#!/bin/bash
# Deletes one calendar month of KairosDB metric data, one day at a time.
# Uses bash features (arrays, [[ ]]), so it must run under bash rather than plain sh.

if [ "$#" -ne 4 ]; then
    echo "Usage: $0 host port month year"
    exit 1
fi

# Refuse to delete data younger than the 6-week (42-day) retention period.
DAYS=$(( ( $(date -ud 'now' +'%s') - $(date -ud "${4}-${3}-01 00:00:00" +'%s') ) / 60 / 60 / 24 ))
if [[ $DAYS -lt 42 ]]; then
    echo "Refusing to delete: the requested month began less than 6 weeks ago"
    exit 1
fi

# Fetch all metric names and keep only the VM metric families.
METRICS=( $(curl -s -k "http://$1:$2/api/v1/metricnames" -X GET | sed -e 's/[{}]//g' | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | tr -d '[":]' | sed 's/results//g' | grep -w "cpu\|mem\|disk\|net\|sys") )
echo "${METRICS[@]}"

for var in "${METRICS[@]}"
do
  for date in $(seq 1 30)
  do
    # KairosDB expects timestamps in milliseconds since the epoch.
    STARTDAY=$(($(date -d "$3/$date/$4" +%s%N)/1000000))
    end=$((date + 1))
    # Skip this day if the following day does not exist in the month.
    if date -d "$3/$end/$4" > /dev/null 2>&1; then
       ENDDAY=$(($(date -d "$3/$end/$4" +%s%N)/1000000))
       echo "Deleting $var from $3/$date/$4 to $3/$end/$4"
       echo '
       {
          "metrics": [
          {
            "tags": {},
            "name": "'${var}'"
          }
          ],
          "cache_time": 0,
          "start_absolute": "'${STARTDAY}'",
          "end_absolute": "'${ENDDAY}'"
       }' > /tmp/metricsquery
       curl "http://$1:$2/api/v1/datapoints/delete" -X POST -d @/tmp/metricsquery
    fi
  done
done

rm -f /tmp/metricsquery > /dev/null 2>&1

Note: The space gains will not be seen until data compaction occurs and the delete marker column (tombstone) expires. This is 10 days by default, but you can change it through the gc_grace_seconds property of the affected Cassandra tables.

  • KairosDB v0.9.1 uses the QUORUM consistency level for both reads and writes. Quorum is calculated as (replication factor / 2) + 1, rounded down, and for both reads and writes a quorum of replica nodes must be available. Data is assigned to nodes through a hash algorithm and every replica is of equal importance. The following table provides guidance on replication factor and cluster size configurations.
Table 2. Cassandra Configuration Guidance
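The quorum arithmetic can be sketched in a couple of lines, assuming Cassandra's documented rule that a quorum is the replication factor divided by two, rounded down, plus one. The function names are made up for this example.

```shell
# Number of replicas that must respond at the QUORUM consistency level,
# assuming Cassandra's rule: floor(replication factor / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of replicas that can be unavailable while QUORUM reads and writes
# still succeed for a given replication factor.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

quorum 3               # prints 2
tolerated_failures 3   # prints 1
```

With a replication factor of 1 or 2, tolerated_failures prints 0, which is one reason a replication factor of 3 on at least three nodes is the smallest configuration that survives the loss of a replica.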

 

VMware vCloud Architecture Toolkit (vCAT) is back!

Introducing VMware vCloud Architecture Toolkit for Service Providers (vCAT-SP)

The current VMware vCloud® Architecture Toolkit (3.1.2) is a set of reference documents that help our service provider partners and enterprises architect, operate and consume cloud services based on the VMware vCloud Suite® of products.

As the VMware product portfolio has diversified over the past few years with the introduction of new cloud automation, cloud operations and cloud business products, plus the launch of VMware’s own hybrid cloud service, VMware vCloud Air™, VMware service provider partners now have many more options for designing and building their VMware powered cloud services.

VMware has decided to create a new version of vCAT specifically focused on helping guide our partners in defining, designing, implementing and operating VMware based cloud solutions across the breadth of our product suites. This new version of vCAT is called VMware vCloud Architecture Toolkit – Service Providers (or vCAT-SP).

What are we attempting to achieve?

What VMware intends to do through the new vCAT-SP is provide prescriptive guidance to our partners on what is required to define, design, build and operate a VMware based cloud service… aligned to the common service models that are typically deployed by our partners. This will include core architectures and value-add products and add-ons.

VMware vCAT-SP will be developed using the architecture methodology shown in the following graphic. This methodology draws on service models, use cases, functional and non-functional requirements, and implementation examples that have been validated in the real world.

Architecture Methodology

 

Which implementation models will be covered?

The new vCAT-SP initially focuses on two implementation models: Hybrid Cloud Powered and Infrastructure as a Service (IaaS) Powered. These in turn align to common cloud service models, such as Hosting, Managed Private Cloud, and Public/Hybrid Cloud.

Hybrid Cloud Powered

To become hybrid cloud powered, the service provider’s cloud infrastructure must meet the following criteria:

  • The cloud service must be built with VMware vSphere® and VMware vCloud Director for Service Providers.
  • The vCloud APIs must be exposed to the cloud tenants.
  • Cloud tenants must be able to upload and download virtual workloads packaged with Open Virtualization Format (OVF) version 1.0.
  • The cloud provider must have an active rental contract of 3600 points or more with an aggregator.

This implementation model is typically used to build large-scale multi-tenant public or hybrid cloud solutions offering a range of IaaS, PaaS or SaaS services to the end customers.

Infrastructure as a Service Powered

To design an Infrastructure as a Service powered cloud infrastructure, the solution must meet the following criteria:

  • The cloud service must be built with vSphere.
  • The cloud provider must have an active rental subscription with an aggregator.

This implementation model is typically used to build managed hosting and managed private cloud solutions with varying levels of dedication through compute, storage and networking layers, again offering a range of IaaS, PaaS and SaaS services to the end customers.

The vCloud Architecture Toolkit provides all the required information to design and implement a hybrid powered or IaaS powered cloud service, and to implement value-added functionality for policy based operations management, software-defined network and security, hybridity, unified presentation, cloud business management, cloud automation and orchestration, software-defined storage, developer services integration etc.

For more information please visit: vcloudairnetwork.com

Modular and iterative development framework

Modularity is one of the key principles in developing the new vCAT-SP architecture framework. Our modular approach makes the architecture easier to iterate upon: smaller building blocks can be checked out of the architecture, have their impact assessed against other components, be updated, and then be re-inserted into the architecture with minimal impact to the larger solution landscape.

What will vCAT-SP contain?

VMware vCAT-SP provides the following core documents:

Introductory Documents

Within this section there will be a document map, which details all the available documents and document types that are contained within vCAT-SP. There will also be an introduction document that provides the partners with guidance on how to get the most out of vCAT-SP as a consumer.

Service Definitions

The service definition document(s) provide the information needed to create an effective service definition document. They contain use cases, SLAs, OLAs, business drivers, and the like, that are required to build a hybrid cloud powered or IaaS powered cloud service. The initial vCAT-SP efforts will focus on the hybrid cloud powered service definition, with IaaS Powered following shortly after.

Architecture Documents

The vCAT-SP architecture documents detail the logical design specifics, architecture options available to the designing architect, design considerations for availability, manageability, performance, scalability, recoverability, security, and cost.

Implementation Examples

The implementation example documents detail an end-to-end specific implementation of a solution aligned to an implementation model and service definition. These documents highlight which design decisions were taken and how the solution meets the use cases and requirements identified in a service definition.

Additionally, there will be implementation examples for pluggable value-added services that are developed through the VMware vCloud Air Network, for example Disaster Recovery as a Service (DRaaS). These components can be plugged into the core architecture.

Emerging Tools, Solutions and Add-Ons

This area is not just for documentation; it also allows the team to capture and store useful software tools and utilities, such as scripts, plugins and workflows, which can be used to enhance a particular implementation model. One example is how a cloud platform can present cloud-native applications, such as Project Photon. The development of these documents and add-ons will be iterative and not aligned to the core documentation releases.

The following figure shows the map of documentation currently planned. This is subject to change.

Document Map

When can I get a copy of vCAT-SP?

We are planning to launch the first PDF-based release of vCAT-SP on www.vmware.com/go/vcat around the VMworld EMEA time frame, and we will be publishing in web format shortly afterwards… so watch this space!

VMware vCAT-SP will be developed iteratively, with a published road-map. This will be in-line with our major software releases where possible, to ensure there is effective service and solution focused architectural guidance available to VMware service provider partners as close to GA dates as possible.

Who is the vCAT-SP development team?

The Global Cloud Practice – vCloud Air Network team, led by Dan Gallivan, is a team of specialist service provider-focused cloud architects that work throughout the vCloud Air Network within the VMware Cloud Services Business Unit.

The team is global, with many years of experience helping our service provider partners build world-class cloud products based on VMware software. The team also contains five certified VCDX architects and three members of the VMware CTO Ambassadors program.

Over the next couple of months we will be releasing frequent technical preview blogs across the technology domains as we approach VMworld EMEA.

 

Be sure to subscribe to the vCAT blog, vCloud blog, follow @VMwareSP on Twitter or ‘like’ us on Facebook for future updates.

 

Related VMware vFabric Reference Architecture released

Now available, the highly anticipated vFabric Reference Architecture!!!

Customers, partners and VMware field can leverage the practical templates, examples and guiding principles authored by VMware’s vFabric experts to design, implement and run their own integrated vFabric middleware technology suite.

Download the vFabric Reference Architecture at www.vmware.com/go/vFabric-Ref-Arch.

vCAT 3.1 released with videos

vCAT 3.1 was released several weeks ago in time for PEX. We have added updates to vCenter Chargeback and vCloud Connector. Please see the release notes for specific details on content change.

vCAT 3.1 now includes two sets of videos in support of the vCloud Architecture Toolkit.
a) Executive videos – We have provided several short executive briefs on VMware Validated Architectures, vCAT, our use internally of vCAT, and alignment of our Cloud Infrastructure Management (Layer 1) set of technologies. Currently we have postings from Pat Gelsinger, Ray O’Farrell, Bogomil Balkanski, Scott Aronson, and Mark Egan.
b) Subject Matter Expert (SME) videos – We have included 10 videos covering the Document Center tool for viewing the documentation, and one for each of the 9 document areas.

VMware Press release
VMware Press will be publishing the vCAT 3.1 release for those wishing a printed copy.

As always, we welcome your feedback on how we can improve on our vCAT.next release. Please send feedback to ipfeedback@vmware.com.

vCloud Director Service Builder and the vCloud Director workflow run service

In vCAT 3 we described how to leverage vCloud Director blocking tasks and notifications to extend vCloud Director with new capabilities. As part of the workflow examples document, we provided the notification package as an implementation example that leverages vCenter Orchestrator's rich library of workflows.

vCloud Director 5.1 introduced a new API extensions feature that allows a cloud provider to extend the vCloud API by developing services that provide functionality not available in the vCloud API. These API extensions have been covered in detail on Christopher Knowles’s theclouds.ca blog. Thomas Kraus wrote about implementing a specific service leveraging a vCenter Orchestrator workflow on his Cloud Actual blog.

Now it is my turn to release a tool to create custom services leveraging any vCenter Orchestrator workflow as a service operation. “vCloud Director Service Builder” is a wizard-based workflow that lets you create new services and their operations in a few clicks.

In addition, once a service operation has been started, the included “vCloud Director workflow run” service lets you manage the workflow life cycle.

Get vCloud Director Service Builder and the vCloud Director workflow run service and find out more information in the vCenter Orchestrator Communities.

Enforce System Wide CPU/Memory limits in vCloud Director

Have you ever wished you could prevent users from powering on VMs in your vCD environment with 4 or more CPUs? How about preventing VMs with more than 8 GB of memory from powering up? There may be performance benefits to enacting such limitations depending on the number of CPUs and cores the physical hosts in the underlying cluster have available.

While neither of the above items is possible with any basic settings, such control can be enforced in your vCloud Director environments. Laying out every step in detail to accomplish this task is beyond the scope of today’s post, but the basic steps are as follows:

  • Configure an AMQP server
  • Enable the “Start vApp (Deploy from api)” blocking task
  • Specify a vCenter Orchestrator workflow to subscribe to the queue
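
To make the last step a little more concrete, here is a minimal sketch, in Python rather than an actual vCenter Orchestrator workflow, of the kind of decision logic the subscriber could apply once it has looked up the VMs in the vApp being started. The VmSpec class, the function names and the "resume"/"abort" return values are hypothetical and chosen only for illustration. Receiving the AMQP message and calling back into vCloud Director are deliberately left out.

```python
from dataclasses import dataclass

# Limits taken from the examples in this post: block VMs with 4 or more
# CPUs, or with more than 8 GB of memory.
MAX_CPUS = 3
MAX_MEMORY_MB = 8 * 1024


@dataclass
class VmSpec:
    """Hypothetical summary of one VM inside the vApp being started."""
    name: str
    num_cpus: int
    memory_mb: int


def violations(vms, max_cpus=MAX_CPUS, max_memory_mb=MAX_MEMORY_MB):
    """Return a human-readable reason for every limit a VM exceeds."""
    reasons = []
    for vm in vms:
        if vm.num_cpus > max_cpus:
            reasons.append(
                f"{vm.name}: {vm.num_cpus} CPUs exceeds limit of {max_cpus}")
        if vm.memory_mb > max_memory_mb:
            reasons.append(
                f"{vm.name}: {vm.memory_mb} MB exceeds limit of {max_memory_mb} MB")
    return reasons


def decide(vms):
    """Return ("resume", []) if the vApp may start, else ("abort", reasons)."""
    reasons = violations(vms)
    return ("abort", reasons) if reasons else ("resume", [])


if __name__ == "__main__":
    # Example vApp: one VM within the limits, one over both limits.
    vapp = [VmSpec("web01", 2, 4096), VmSpec("db01", 4, 16384)]
    action, reasons = decide(vapp)
    print(action)
    for reason in reasons:
        print(" -", reason)
```

In a real deployment the outcome of decide() would drive whether the blocking task is resumed or aborted, with the reasons passed back as the message shown to the user.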

If you find this concept interesting and feel you would benefit from such a solution, please leave me feedback as a comment to the Workflow Examples document in the Orchestrator Community.

Impact of “Greenfield” on the Organizational Constructs

 How does the Organizing for Cloud Operations section of the Operating VMware vCloud document apply to a new cloud operations organization for a “greenfield” VMware® vCloud deployment? In other words, what if we’re unencumbered by a legacy IT organization and its processes? This question has come up during several customer conversations recently. Admittedly, we’ve focused our work in this area on establishing cloud operations in the context of an existing IT organization.  By the nature and newness of the vCloud Tenant Operations organizational construct, it holds whether it’s a new organization implementation or an addition to an existing IT organization. What is impacted, though, is the Cloud Infrastructure Operations Center of Excellence (COE).   

The way I look at the impact is that the Cloud Infrastructure Operations COE and key members of its ecosystem combine to become the new vCloud operations organization. Now, that is a bit of an oversimplification, but at its essence this is the case. As usual, the “devil is in the details.” A couple of key details I’d like to touch on are vCloud operations processes and vCloud operations role skillsets.

vCloud operations processes are the easier of the two. Standing up a new vCloud operations organization unencumbered by existing IT operations processes means you have the advantage (luxury?) of starting from scratch. The processes can be purpose-built for vCloud operations, taking advantage of the capabilities of new vCloud management tools and their impact on operations processes. Examples include taking advantage of the VMware vCenter™ Operations Manager™ impact on the event, incident, problem process cycle, moving to policy-driven compliance with vCenter Configuration Manager™, or setting up Change Management with the goal of pre-approved, standard changes for capabilities such as VMware vSphere® vMotion®, HA, or DRS. How nice would that be?

The impact on role skillsets of implementing a cloud operations organization unaffected by a legacy IT organization for a greenfield vCloud deployment is particularly interesting to me. I believe this opens up some impactful possibilities. I’m thinking here of individuals responsible for physical networking and physical storage, for example. Creating these roles anew for cloud operations affords the opportunity for them to become experts in virtual networking and virtual storage as well. They become the vCloud networking SME and vCloud storage SME, a very powerful combination. In the version where the Cloud Infrastructure Operations COE is part of a legacy IT organization, the virtual networking and virtual storage expertise would reside within the COE Cloud Architect role. The Cloud Architect would regularly interact with the “champions” from the physical networking and storage teams, but there would be a separation of expertise. This clearly isn’t as efficient, but it is necessary when implementing within the context of a legacy IT organization. This would apply to other ecosystem functional groups as well.

That said, what I just described would equally apply to how the Cloud Infrastructure Operations COE and its ecosystem would change as the vCloud infrastructure scales out and becomes the primary IT infrastructure environment used. But transitioning the Cloud Infrastructure Operations COE and its ecosystem is far more challenging than creating a purpose-built cloud infrastructure operations organization from scratch.

Finally, what are the implications of these roles in a purpose-built vCloud infrastructure operations organization from a specialist versus generalist perspective? This is another debate that, while I wouldn’t say it’s raging, is certainly the topic of some interesting conversations. I certainly have my views on this topic, but I’ll save those for another post.

vCAT Workflow Examples – What would you like to see?

The current release of vCAT features a few different workflow examples for use with vCenter Orchestrator. Those examples were based on the needs of VMware and various client projects we had worked on.

In considering what to include as examples, we had to be sure that we could make the processes generic enough to fit into any organization’s environment. The three released examples are:

Each of the examples above may be used as is, but they are commonly used as starting points for larger custom projects around vCloud Director.

We want to hear your ideas on how to make the above packages better as well as ideas for new packages. Please post your ideas as comments to the vCAT Workflow Examples document post in the communities here: http://communities.vmware.com/docs/DOC-20230

 

vCAT Software Tools

The vCloud Software Tools document provides the reader with instructions on how to use some of the tools that are available for implementing, managing and reporting on your vCloud cloud environments.

vCloud Director Audit provides automated report generation against a vCloud Director deployment, providing you with details on how the vCloud environment is configured, provisioned and being consumed by the tenants of the cloud.

vCloud Provisioner provides an automated framework for describing the configuration requirements of the cloud being provisioned and for executing that description against a vCloud Director implementation, automatically creating all of the underlying vCloud objects necessary to deliver the services described in the provisioner description.

Cloud Cleaner allows you to provision a vCloud Director instance on top of a vSphere environment, perform any testing or evaluation you may wish to carry out, and then clean up anything performed against the vSphere environment by vCloud Director, returning the vSphere environment to a pre-vCloud Director state. This is useful when evaluating vCloud Director, or when educating yourself on how vCloud Director is implemented on an underlying vSphere environment.