
Monthly Archives: June 2010

System Provisioning in Cloud Computing: From Theory to Tooling (Part I)

By Steve Jin, VMware engineer

 

This entry was reposted from DoubleCloud, a blog for architects
and developers on virtualization and cloud computing.

Cloud computing is an evolutionary technology because it doesn’t change
the computing stack at all. It simply distributes the stack between the
service providers and the users. In that sense, it is not as impactful as
virtualization, which introduced a new hypervisor layer into the
computing stack and fundamentally changed people’s perception of computing
with virtual machines.

But if you look closely at the latest IaaS clouds, they do leverage
virtualization as a way to effectively and efficiently deploy systems. Inside
one virtual machine, the computing stacks remain the same as before: from OS to
middleware to application.

Keep in mind that the application is the end while the OS and middleware
are the means. Customers care about applications more than the underlying
infrastructure. As long as the infrastructure can support the applications,
whatever the infrastructure might be is technically fine. The question then
shifts to economics: whatever is the most cost-effective infrastructure wins.
That’s why Linux has gained more share in the cloud than in traditional IT shops.

To reach the end, you have to go through the means. In an IaaS cloud, you have
to install the underlying OS and middleware before you can run your
application. With a PaaS cloud, you can avoid that work by focusing on
application provisioning alone.

OS Provisioning

Remember, the software stack inside a virtual machine doesn’t change. It
needs the OS, middleware and application installed and configured before
the application can work. Because the software stack is layered, installation
has to proceed from the bottom up: you install the OS first, then the
middleware, and lastly the applications.

Although it happens in a cloud, it’s not a new problem. There are
already some tools available for you to solve the problem in traditional
computing environments:

* Kickstart (http://kickstart-tools.sourceforge.net/).
"This tool allows you to automate most of the Red Hat Linux installation
including language selection, network configuration, keyboard selection, boot
loader installation, disk partitioning, mouse selection, X Window System
configuration, etc. A system administrator needs to create a single file
containing the answers to all the questions that would normally be asked during
a typical Red Hat installation." (A minimal sketch of such an answer file
appears after this list.)

* Jumpstart (http://en.wikipedia.org/wiki/Jumpstart_(Solaris)).
"It’s used to automate the installation of Solaris OS."

* Cobbler (https://fedorahosted.org/cobbler/).
"Cobbler is a Linux installation server that allows for rapid
setup of network installation environments. It glues together and automates
many associated Linux tasks so you do not have to hop between lots of various
commands and applications when rolling out new systems, and, in some cases,
changing existing ones. With a simple series of commands, network installs can
be configured for PXE, reinstallations, media-based net-installs, and
virtualized installs (supporting Xen, qemu, KVM, and some variants of VMware).
Cobbler uses a helper program called 'koan' (which interacts with Cobbler) for
reinstallation and virtualization support."

* OpenQRM(http://www.openqrm.com/).
"openQRM is a very comprehensive and flexible Open Source
Infrastructure Management Solution. Its fully pluggable architecture focuses on
automatic, rapid- and appliance-based deployment, monitoring,
high-availability, cloud computing and especially on supporting and conforming
multiple virtualization technologies.
openQRM is a
single-management console for the complete IT-Infrastructure and provides a
well defined API which can be used to integrate third-party tools as additional
plugins. This provides companies with a highly scalable system that supports
small companies as well as global businesses who have large server base,
multi-os & high-availability requirements."

* xCAT (http://xcat.sourceforge.net/). "xCAT is
DataCenter Control. It allows you to: Provision Operating Systems on physical or virtual machines:
Centos5.X, SLES[10-11], RHEL5.X, Fedora[9-11], AIX, Windows Server 2008,
Cloning or scripted installation methods. Remotely Manage Systems: Integrated
Lights-out management, remote console, and distributed shell support. Quickly
set up and control Management node services: DNS, HTTP, DHCP, TFTP. xCAT offers
complete and ideal management for HPC clusters, RenderFarms, Grids, WebFarms,
Online Gaming Infrastructure, Clouds, Datacenters, and whatever tomorrow's
buzzwords may be. It is agile, extendable, and based on years of system
administration best practices and experience."
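
As a concrete illustration of the Kickstart approach mentioned above, here is a minimal sketch, in Python, that renders a Kickstart-style answer file from a template. The directives shown are common Kickstart keywords, but the exact set depends on the Red Hat release, and the file name and values here are purely hypothetical:

# Sketch only: render a minimal Kickstart "answer file" from a template.
# The directives are classic Red Hat Kickstart keywords; check the docs for
# your release (newer releases require a closing %end after %packages).

KS_TEMPLATE = """\
install
lang en_US.UTF-8
keyboard us
rootpw {root_password}
timezone --utc America/New_York
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@base
"""

def write_kickstart(path, root_password):
    """Write a file that answers the installer's questions ahead of time."""
    with open(path, "w") as f:
        f.write(KS_TEMPLATE.format(root_password=root_password))

if __name__ == "__main__":
    write_kickstart("ks.cfg", root_password="changeme")  # hypothetical values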

Middleware Provisioning

After getting the OS ready, you can go ahead with middleware
installation. The available tools allow you to describe the target configuration of
the software and then they take care of the rest. It’s much like a policy-driven
process.
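
To make the idea concrete, here is a toy sketch in Python of that declarative, converge-to-desired-state model. This is not the API of any of the tools listed below (Puppet and its peers each have their own languages); the package and service names are made up, and the commands assume a Red Hat-style node:

import subprocess

# Toy illustration of declarative provisioning: state what the node should
# look like, then converge toward it. Real tools (Puppet, Chef, CFEngine, ...)
# have far richer models; the names below are hypothetical.
DESIRED_STATE = {
    "packages": ["ntp", "openssh-server"],   # must be installed
    "services": ["ntpd", "sshd"],            # must be running
}

def is_installed(pkg):
    """True if the RPM database reports the package as present."""
    return subprocess.call(["rpm", "-q", pkg],
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL) == 0

def converge(state):
    """Compare the node to the desired state and fix only what has drifted."""
    for pkg in state["packages"]:
        if not is_installed(pkg):
            subprocess.check_call(["yum", "-y", "install", pkg])
    for svc in state["services"]:
        subprocess.call(["service", svc, "start"])   # harmless if already running

if __name__ == "__main__":
    converge(DESIRED_STATE)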

* Puppet (http://www.puppetlabs.com/).
"Puppet’s declarative language describes your system configuration, allowing you
to easily reproduce any configuration on any number of additional systems.
Additionally, Puppet can help establish and enforce approved system
configurations, automatically correcting systems that drift from their baseline.
Puppet provides an audit trail of all your systems, which can easily be kept in
version control for compliance purposes. Organizations are increasingly taking
advantage of Puppet’s support of a variety of operating systems. Whether you
are supporting Linux (Red Hat, CentOS, Fedora, Debian, Ubuntu, SuSE), or Unix
OS’es (Solaris, BSD, OS X), Puppet can fulfill your requirements. Although
Puppet evolved primarily to support Unix-like OS’es, Windows support is planned
in the near future."

* Chef (http://wiki.opscode.com/display/chef/Home).
"Chef is a systems integration framework, built to bring the
benefits of configuration management to your entire infrastructure. With Chef,
you can: manage your servers by writing code, not by running commands (via
Cookbooks); integrate tightly with your applications, databases, LDAP
directories, and more (via Libraries); easily configure applications that
require knowledge about your entire infrastructure ("What systems are running
my application?" "What is the current master database server?")"

* SmartFrog (http://wiki.smartfrog.org/wiki/display/sf/SmartFrog+Home).
"SmartFrog is a powerful and flexible Java-based software framework for
configuring, deploying and managing distributed software systems. SmartFrog
helps you to encapsulate and manage systems so they are easy to configure and
reconfigure, and so that they can be automatically installed, started and shut
down. It provides orchestration capabilities so that subsystems can be started
(and stopped) in the right order. It also helps you to detect and recover from
failures."

* CFEngine (http://www.cfengine.org/).
"Cfengine ensures that you have the proper packages installed, that
configuration files are correct and consistent, that file protections are
correct, and that processes are running (or not) in accordance with policy.
Cfengine closes security holes, hardens your systems, and makes sure that
critical daemons stay running. It monitors performance and reacts to what it
monitors. You tell Cfengine what promises you want it to keep, and the agent
does the work. Cfengine runs natively on all common platforms, including Linux,
Unix, Macintosh and Windows. It also has support for virtualization platforms.
Cfengine is supported by a community of expert and novice users, and a
commercial enterprise of qualified Mission Specialists. Cfengine can play a
major role in solving almost any system administration issue, with hands-free
automation (see our Solutions guide and the standard library resources in the
Policy Starter Kit), and we are constantly working to make automation simpler,
without over-simplifying."

* Bcfg2 (http://trac.mcs.anl.gov/projects/bcfg2).
"Bcfg2 helps system administrators produce a consistent, reproducible, and
verifiable description of their environment, and offers visualization and
reporting tools to aid in day-to-day administrative tasks. It is the fifth
generation of configuration management tools developed in the Mathematics and
Computer Science Division of Argonne National Laboratory. It is based on an
operational model in which the specification can be used to validate and
optionally change the state of clients, but in a feature unique to Bcfg2, the
client's response to the specification can also be used to assess the
completeness of the specification. Using this feature, Bcfg2 provides an
objective measure of how good a job an administrator has done in specifying the
configuration of client systems. Bcfg2 is therefore built to help
administrators construct an accurate, comprehensive specification."

Application Provisioning

With the right system configuration in place, it’s time to install the
applications. So why not use the same tools we used for the OS and middleware? Do
we need yet another set of tools?

It depends. You can use the same middleware tools to install some
applications; after all, middleware looks like just another application to the
OS. The difference comes down to whether your application is stable and whether
you need to customize it per node. Tools like Puppet are a good fit for stable
applications that can be deployed the same way across all nodes. If your
application is still a work in progress and you need the flexibility to tweak
it, you need more specialized application provisioning tools.

The big technical difference between application and middleware provisioning
tools is that application tools push the application to the nodes and remotely
change anything as needed. The process is procedural.

The middleware provisioning tools normally have agents on the nodes to
pull the software based on the prescribed configuration files. The process is
declarative.

Beyond the “push” and “pull” difference, the application provisioning
tools can also manage the lifecycles of applications (sometimes called
services) distributed across different nodes with a single command or line of
code. Given that they are essentially remote command-dispatching frameworks,
application provisioning tools can do almost anything; if there has to be a
limitation, it’s your imagination. So if you develop your own applications, you
will most likely need application provisioning tools.
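
Before looking at the tools, here is a bare-bones sketch of the "push" model described above: the provisioning host connects to each node over SSH and runs commands procedurally. The host names, paths and commands are hypothetical; real tools such as Capistrano, ControlTier, Fabric and Func wrap this pattern with command dispatching, error handling and reusable recipes. The "pull" model is the mirror image: an agent on each node reads the declared configuration (like the converge sketch in the middleware section) and brings the node into line on its own schedule.

import subprocess

NODES = ["web01.example.com", "web02.example.com"]   # hypothetical nodes

def push(command):
    """Run the same shell command on every node over SSH (procedural push)."""
    for node in NODES:
        print("[%s] $ %s" % (node, command))
        subprocess.check_call(["ssh", node, command])

if __name__ == "__main__":
    push("tar xzf /tmp/myapp.tar.gz -C /opt/myapp")   # push out a new build
    push("/etc/init.d/myapp restart")                 # bounce the service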

Let’s see what tools are out there:

* Capistrano (http://www.capify.org/index.php/Capistrano). "Capistrano is an open source tool for running scripts on multiple
servers; its main use is deploying web applications. It automates the process
of making a new version of an application available on one or more web servers,
including supporting tasks such as changing databases. Capistrano is written in
the Ruby language and is distributed using the RubyGems distribution channel.
It is an outgrowth of the Ruby on Rails web application framework, but has also
been used to deploy web applications written using other frameworks, including
ones written in PHP. Capistrano is implemented primarily for use on the bash
command line. Users of the Ruby on Rails framework may choose from many Capistrano
recipes; e.g. to deploy current changes to the web application or roll back to
the previous deployment state. (http://en.wikipedia.org/wiki/Capistrano)"

* ControlTier. "ControlTier is a community driven, cross-platform
software system used to coordinate application service management activities
across multiple nodes and application tiers…The Command Dispatcher is a core
function of the ControlTier software that provides the mechanism to send commands
over the network seamlessly to the correct Nodes. This facility is used
whenever you run a command or script, via the command-line (ctl or ctl-exec) or
via Jobcenter."

* Fabric (http://www.fabfile.org). "Fabric is a Python
library and command-line tool for streamlining the use of SSH for application
deployment or systems administration tasks. It provides a basic suite of
operations for executing local or remote shell commands (normally or via sudo)
and uploading/downloading files, as well as auxiliary functionality such as
prompting the running user for input, or aborting execution. Typical use
involves creating a Python module containing one or more functions, then
executing them via the fab command-line tool."

* Func (https://fedorahosted.org/func/). "Func
allows for running commands on remote systems in a secure way, like SSH, but
offers several improvements. Func allows you to manage an arbitrary group of
machines all at once. Func automatically distributes certificates to all
"slave" machines. There's almost nothing to configure. Func comes
with a command line for sending remote commands and gathering data. There are
lots of modules already provided for common tasks. Anyone can write their own
modules using the simple Python module API. Everything that can be done with
the command line can be done with the Python client API. The hack potential is
unlimited. You'll never have to use "expect" or other ugly hacks to automate
your workflow. It's really simple under the covers. Func works over XMLRPC and
SSL. Since func uses certmaster, any program can use func certificates, latch
on to them, and take advantage of secure master-to-slave communication. There
are no databases or crazy stuff to install and configure. Again, certificate
distribution is automatic too."
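
To show what "creating a Python module containing one or more functions" looks like in practice, here is a minimal fabfile sketch against the classic Fabric 1.x API of that era (later Fabric releases changed the API). The host names and paths are hypothetical; you would run it with the fab command-line tool, for example "fab deploy":

# fabfile.py -- minimal sketch using the Fabric 1.x API; hosts and paths are made up
from fabric.api import env, run, sudo, cd

env.hosts = ["web01.example.com", "web02.example.com"]   # nodes to deploy to

def deploy():
    """Update the code on each web server and restart the application."""
    with cd("/opt/myapp"):
        run("git pull")                     # refresh the working copy
    sudo("/etc/init.d/myapp restart")       # restart the service as root

def uptime():
    """Trivial ad-hoc task: check how long each node has been up."""
    run("uptime")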

VMware engineer Steve Jin is the author of VMware VI and vSphere SDK (Prentice
Hall), founder of the open source VI Java API, and chief blogger at
DoubleCloud.org.


VMware vCloud Architect Massimo Re Ferre’ talks public cloud

Following our Orange Business Services announcement, their annual conference,
Orange Business Live, took place in Amsterdam from June 15th to 17th. Weren’t able to make it to

Below is a clip from the event, featuring an Orange representative
interviewing VMware vCloud Architect Massimo Re Ferre’. Re Ferre’ explains how VMware
technology provides a secure public cloud for customers, and the advantages of
being a VMware service provider.

The main highlights:

* Re Ferre’ stresses the point that VMware is a technology enabler for service
provider partners, allowing them to implement an IaaS layered architecture.

* The IaaS layer for Orange Business Services provides a true public cloud
offering based on virtualization tools they’re already using, such as vSphere.

* Re Ferre’ says virtualization is the cornerstone because it is imperative for
flexibility and agility in the cloud, enabling cloud abstraction.

* Virtualization introduces concepts that go beyond running different operating
systems. For example, encapsulation allows a user to turn a service or
application into a single file, making it easy to move that workload into the
cloud.

VMware provides Orange Business Services and other service providers with
“infrastructure plumbing” in the cloud, enabling them to host customers
securely. Another advantage for partners is the ability to add services,
templates or other value atop the VMware foundation. This is the VMware concept
of federating infrastructures: giving both our service providers and customers
the freedom to federate internal infrastructures with outside compute power in
the public cloud.

Re Ferre’ also discusses
VMware’s recent acquisitions of SpringSource and Zimbra, and the role they play
in the company’s PaaS and SaaS strategies. Check out the full video here!

Cloud and the New IT Pillars

By Massimo Re Ferre’  

Staff Systems Engineer – vCloud Architect

 

I have used one of my recent posts in some offline discussions about the use
and penetration of virtualization in some accounts. In this post I’d like to
expand a bit on that. I will start with a picture that summarizes, with a
different graphic (but the same core concepts), what I was trying to argue in
the post I referred to above. In this case, I think a picture is worth 1,000
words:


vCloud and the new IT pillars

Specifically, I want to position where cloud infrastructures are going to fit
into an organization. While this picture is really focused on internal
(enterprise) deployments, it also maps how service and hosting providers are
going to shape their offerings for their end-users (more specifically, on the
right-hand side the traditional hosting business, and on the left-hand side the
new virtual servers/cloud business).

 

To make a long story short, most enterprises will have to accommodate – like it
or not – these four platform pillars (right to left):

 

* Proprietary platform: essentially all non-x86 platforms.
Think of mainframes and the AS/400 as prime examples. While many may not refer
to Unix as a proprietary platform, I believe it is.

 

* Physical x86 platform: traditional Windows and
Linux deployments on physical servers. This is the typical old way where a
single OS image maps to a dedicated physical server.  Many customers still have physical server deployments as
part of their regular practice. Sometimes this is required; sometimes they do
this simply out of an “irrational fear of virtualization technologies”.

 

* Virtualized x86 platform: this is the first
deployment policy for many organizations. Think of VMware VI3 or VMware vSphere
deployments. This has proved to work well for the last five to six years and it’s
an established good practice. As mentioned in the post I referred to at the
beginning, the level of penetration may vary depending on many factors.

 

* Cloud platform
(IaaS): this is the new potential player in your infrastructure and it’s
probably going to support the less critical and more dynamic environments you
have to deal with on a daily basis (test and development is one common example;
there are many others).

 

One could spend hours commenting on this slide, but I’ll try to be brief about
some key points that you need to digest (in my opinion).

 

First and foremost, there is clearly a trend where the left pillars are taking
over some of the workloads of the right pillars. And this trend is consistent
across the board: x86 physical deployments are cannibalizing proprietary
platforms. It’s the same pattern for virtualized deployments eating up typical
x86 physical workloads (more and more we hear about a “virtual first” policy).
Last but not least, expect the new player, cloud infrastructure, to cannibalize
most of the traditional virtualized x86 deployments. Stopping the trends I am
describing here will be as difficult as trying to stop a moving train with your
fingertips: good luck.

 

At this point
you may wonder why, given that cloud infrastructures build on top of and
leverage virtual infrastructures, I am calling out two specific and separate pillars.
That’s a good point, especially because it’s true that clouds (specifically
IaaS clouds) build on top of hardware virtualization. In fact I argued just
this point in the other post I referenced at the beginning of this blog: since
the first cloud instantiation will tend to trade off the complexity of many
tuning options for a better and easier end-user experience, we expect some
workloads that require a bit of tuning and visible infrastructure layout
options to remain on more traditional vSphere types of deployment.

 

If you think
about this, cloud is all about agility and with agility comes less control
(i.e. tuning). As time goes by these two pillars will converge. The Cloud
pillar will take over the traditional virtualization pillar. Indeed, I expect
that most of these tunings and controls will no longer be needed because of the
additional automation and auto-tuning concepts that cloud-related technologies
offer. Last but not least, let’s not forget that Cloud technologies will also
mature over time and will fill holes we see in the first wave of cloud
technologies.

 

Finally, I’d like to touch briefly on management. I think we all need to be
very pragmatic here. I know many customers are looking for the nirvana of “one
tool to manage them all”. The fact of the matter is that the more you try to
normalize these pillars under the same management umbrella, the more of each
pillar’s benefits you must sacrifice. I had an interesting discussion lately
with a colleague at VMware, and I think he hit the nail on the head when he
said, “They want to have one tool because they think it’s more efficient. It’s
not. It’s more efficient, and more effective, to run two tools that manage two
systems well than to run one tool that manages ten systems poorly.”

 

In my previous
IT life I was in the business of trying to homogenize heterogeneous
virtualization platforms under a single management umbrella so I have to
(strongly) agree with my colleague’s statement. In fact, these pillars are very
different in the way you manage them. This is true not only from a technology
perspective but also, and even more so, from a process perspective. For
example, the process to request a partition on a legacy Unix system may be
totally different than the process required to instantiate a new physical
server, which in turn is totally different than the process to request a new
vSphere virtual machine. To complicate things more, the Cloud pillar, by its very
definition, doesn’t require any process whatsoever to instantiate a new
workload from the self-service portal.

 

Try to
homogenize this with common processes, a single management umbrella, and a
single pane of glass. The moment you think you have done it, you wake up all
sweaty.

I am not making the case that your application or service will not span
different pillars. You may very well have your scale-out web front-end on a
dynamic cloud pillar and your scale-up back-end database on a more tunable
virtualized pillar, or any other combination. After all, the concept of
application layer tiering isn’t that new in this industry. If you think about
it, we have just added another interesting pillar (Cloud) to a picture that we
have been using for the last ten years. This is not going to shake our world,
but it is going to make it much better.

Are hypervisors cloud commodities?

By Massimo Re Ferre’

Staff Systems Engineer – vCloud Architect

There
have been a number of discussions in the industry in the last few years about
whether hypervisors are (becoming) a commodity and whether the value is (or
will be) largely driven by the management and automation tools on top of them.
To be honest, I have conflicting sentiments about this. On one hand I tend to
agree. If you look at how the industry is shaping pricing schemas around these
products, that's the general impression – all major hypervisors are free and by
definition one could argue that they are a commodity.

On
the other hand, this doesn't really match my definition of commodity. I'd
define a commodity technology as something that has reached a "plateau of
innovation" where there is very little to differentiate from comparable
competitor technologies. This pattern typically drives prices down and adoption
up (in a virtuous cycle) because users focus more on costs rather than on
technology differentiation. The PC industry is a good example of this pattern.

Is
this what is happening with hypervisor technologies? Hell no. I think there is
no one on this planet who thinks that deploying OS images on dedicated physical
servers is faster, more flexible and in general better than deploying them on a
virtualized host. Yet virtualization usage, in the industry, is broad but
not deep
and it's usually around 30 percent (on average) within most
organizations. And these technologies are widely available for free (ESXi,
Hyper-V, XenServer and KVM)!

So
if everybody agrees that there is a problem with the current physical server deployment
model, and that there are free technologies available to download from the Internet
that can address the problem, why are organizations only confident enough to put 30
percent of their workloads on these hypervisors? Can someone explain this? My
take is that there may be a number of concerns around support and licensing. But
the industry has matured and made huge progress on this front in the last few
years (Oracle being one of the few exceptions unfortunately). I bet that a
large chunk of that 70 percent of server deployments is not virtualized simply
because of technology concerns such as stability, performance, scalability, security
and so forth. Where there are technology concerns or technology limitations then
there is space for innovation (or education to raise awareness).

The
fact that the industry is moving to a model where the hypervisor is free and
the management tools are the source of revenue tells a partial story to me. The
technology story behind the scenes is quite different. The reality is that
there are multiple ways to look at hypervisors and their use cases. If you view
the hypervisor as the thin software layer that allows you to consolidate five
servers on a single box… well I am with you. At 10 Km/hour there is little
difference between a Ferrari and a Fiat (even though the Ferrari is still damn
cool). If you, instead, view the hypervisor as the foundation for private and
public clouds where multi-tenancy, security, flexibility, performance consistency
and predictability, integrity and scalability are not optional
characteristics… well then there is a difference indeed.

You
may argue that you can achieve most of these characteristics using the proper
management and automation tools that sit on top of bare metal hypervisors. But
the fact is that the policies at the management layer are only as good and reliable
as the hypervisor used to implement and enforce them. Yes, you could put a
Ferrari engine on a Fiat and have the best driver (Michael Schumacher or
Fernando Alonso) pushing it at 330 Km/hour! And everything may be great up
until the moment when you hit the brakes and find out that it will take you 1,500
meters to stop it (if you don't hit a wall before).

Similarly, one could say that the real "value" of an airplane is its cockpit,
with all the automation that goes into it. Again, you can put the autopilot on
and all is good, but at the end of the day the autopilot (and all the other
automation technologies in the cockpit) only instructs the "basic" airplane
technologies (thrust reversal, flaps, etc.) to do the real job. And I can
assure you that you will want these technologies to be as good, reliable and
secure as possible! Always remember that it's not the autopilot and all the
slick automation that happens in the cockpit that keeps you flying at 33,000
feet – it's the wings.

I am mixing metaphors here and perhaps digressing. Going back to our lovely
"commodity" hypervisors discussion, one of the things that has always struck me
is how powerful the networking subsystem inside ESX is. It's just amazing:
out-of-the-box and easy-to-use support for distributed virtual switches,
redundancy (both at the physical and logical level), multiple failover and
balancing algorithms on a per-PortGroup basis, traffic shaping, security built
in via the VMsafe APIs, and tons of other parameters and features that you can
leverage and tune based on your specific requirements. And what you have seen
so far is really just the foundation of what's happening in terms of injecting
more cloud-oriented and multi-tenancy support. We are working on some cool
stuff that will be coming out in the future. I personally spent the last three
months digging into those things, and the potential there is phenomenal. I
can't talk about this in detail today, but it's pretty clear that here we are
not talking about just setting up 10 Windows VMs on a physical server and
letting them connect to a flat L2 segment sharing a single Ethernet cable. I
can't wait to talk more about what we have in the works and to prove to you
that, just like you can't build a castle on sand, you can't build an Enterprise
Cloud on a limited hypervisor.