

How to Avoid Virtual Machine Sprawl in the Cloud Age

Steve Jin, VMware R&D

This entry was reposted from DoubleCloud.org, a blog for architects and developers on virtualization and cloud computing.

Technology can be a lot like fashion, with quickly shifting trends. Once we embraced big iron, but after the mainframe age the industry moved into the client/server age, where we soon found ourselves with too many servers to manage. So we consolidated them, not back to the mainframe age, but onto hypervisors. One physical server could now run multiple virtual machines.

Server consolidation solved a big problem and resulted in big cost savings. From a management point of view, however, it does not actually reduce the number of servers to manage in your enterprise. To some extent, it worsens the problem! It can be so easy and inexpensive to create a new virtual machine that you end up with many more servers than you really want, or can effectively manage. This problem exists not only in private clouds but also in the public cloud.

According to VMware CEO Paul Maritz in his keynote at VMworld 2010, the number of virtual machines exceeded the number of physical machines in 2009, and will reach 10 million by the end of this year. This is definitely great news for the virtualization software industry, but it is also a challenge moving forward.

So how do you solve the problem of virtual machine sprawl or, even better, prevent it from happening? I discuss several approaches one by one here.

Better management

The sheer number of servers poses a big challenge for management. You have to configure and manage every additional server. This effort is proportional to the number of servers, with or without virtualization.

While we cannot reduce the work done behind the scenes, we can definitely ease the interaction and reduce the management complexity.

Here are several approaches:

1. Grouping. Putting related servers in a container group and managing the container can significantly reduce the complexity. You can hide the servers unless you want to dig down for more details. vSphere vApp is a good example.

2. Generalization. It’s a special type of grouping in which you can define generic behavior/settings while allowing group members to override any of them. You can think of the port group in vSphere networking as an example.

3. Aspect-oriented management. Instead of looking at every aspect of each entity, you look at one aspect across all the entities, so you only have to handle one thing at a time. Need an example? Think about the profile management in vSphere Client. People handle complexity better when they focus on one aspect at a time.

4. Automation. Automating routine tasks and operations through scripting and programming is also critical. No one can afford to click and type through a GUI for everything in a large-scale system.
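
To make the grouping, generalization, and automation ideas more concrete, here is a minimal Python sketch. It is not vSphere code: the `Server` and `ServerGroup` classes, the setting names, and the server names are all hypothetical, chosen only to show group-level defaults, per-member overrides, and one scripted action applied to every member.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    # Settings that only this server needs (hypothetical keys).
    overrides: dict = field(default_factory=dict)


@dataclass
class ServerGroup:
    name: str
    # Generic settings shared by every member of the group.
    defaults: dict
    members: list = field(default_factory=list)

    def effective_settings(self, server):
        # Generalization: group defaults apply unless a member overrides them.
        return {**self.defaults, **server.overrides}

    def apply(self, action):
        # Automation: run one action against every member of the group.
        return {s.name: action(s, self.effective_settings(s)) for s in self.members}


group = ServerGroup(
    name="web-tier",
    defaults={"vlan": 100, "memory_mb": 2048},
    members=[Server("web-01"), Server("web-02", overrides={"memory_mb": 4096})],
)


def describe(server, cfg):
    return f"{server.name}: vlan={cfg['vlan']} mem={cfg['memory_mb']}MB"


report = group.apply(describe)
for line in report.values():
    print(line)
```

You manage the group and its defaults; an individual server only shows up when it needs to differ from the rest.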

Different architectures

Server sprawl is a problem mainly because every server is different. We have to keep each one even though only a very small piece of its user data is unique, not to mention the different applications it runs. What if we could externalize user data and make all the servers the same? This way we don’t need to care much about particular virtual machines, and we can trash them anytime we want. When appropriate, we can also recycle used virtual machines.

This brings in big benefits – not only the ease of provisioning and lifecycle management but also simplification of system backup, auditing, etc.
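
A rough sketch of what externalized user data could look like, in Python. Everything here is an assumption made for illustration: `ExternalStore` is an in-memory stand-in for whatever shared database or network storage you would really use, and `StatelessServer` is a hypothetical name for an otherwise identical, disposable virtual machine.

```python
class ExternalStore:
    """In-memory stand-in for a shared store that lives outside any VM."""

    def __init__(self):
        self._data = {}

    def load(self, user):
        # Return a copy so servers never hold a reference to stored state.
        return dict(self._data.get(user, {}))

    def save(self, user, data):
        self._data[user] = dict(data)


class StatelessServer:
    """A server that keeps no user data of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, user, key, value):
        # Read state from the external store, change it, and write it back.
        data = self.store.load(user)
        data[key] = value
        self.store.save(user, data)
        return data


store = ExternalStore()

vm_a = StatelessServer("vm-a", store)
vm_a.handle("alice", "theme", "dark")
del vm_a  # trash the virtual machine; nothing unique is lost

vm_b = StatelessServer("vm-b", store)  # any identical server can take over
result = vm_b.handle("alice", "lang", "en")
print(result)
```

Because the servers are interchangeable, backup and auditing only need to cover the external store rather than every virtual machine.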

Any disadvantages? You have to re-think your platform architecture. Legacy applications have to be re-architected and rebuilt, which may or may not be worth the effort depending on the project and its scope.

Follow this blog for future architectural tips. I will blog more about this soon.

Merging virtual machines

Virtualization consolidates servers as virtual machines but does not break the boundaries between them. The consolidated machines continue to run exactly the same way as before, which is why the migration process is seamless and painless.

How about a step further? We can break the boundary of virtual machines and merge multiple virtual machines into one. This may sound crazy but it’s feasible for some types of virtual machines.

Let’s pick a concrete example. Say we have several virtual machines serving web content. They can easily be combined into one virtual machine by copying files and configuring virtual host features in a web server like Apache. From a user’s perspective, there is no noticeable difference.
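
As an illustration of the web server case, the Python sketch below generates Apache name-based virtual host blocks for sites that used to live on separate virtual machines. `VirtualHost`, `ServerName`, and `DocumentRoot` are standard Apache directives; the host names, directory paths, and the `render_vhost` helper are hypothetical, and a real merge would also need to copy the content and adjust DNS.

```python
def render_vhost(server_name, document_root, port=80):
    """Render one name-based virtual host block for Apache."""
    return "\n".join([
        f"<VirtualHost *:{port}>",
        f"    ServerName {server_name}",
        f'    DocumentRoot "{document_root}"',
        "</VirtualHost>",
    ])


# Each entry was previously served by its own virtual machine (hypothetical names).
sites = {
    "docs.example.com": "/var/www/docs",
    "blog.example.com": "/var/www/blog",
}

config = "\n\n".join(render_vhost(name, root) for name, root in sites.items())
print(config)
```

The web server picks the right document root from the host name in each request, so one virtual machine can answer for all the merged sites.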

The disadvantage is that we lose the built-in isolation that virtual machines provide. Sometimes you care about isolation; sometimes you don’t. When you don’t, you should definitely consider this.

Technically, merging may be difficult, if not impossible. For example, if you have applications running on Windows, the tight coupling with the system registry may prevent you from moving applications around. In that case you would want an application virtualization technology like VMware ThinApp, so that you can easily move apps by simply copying the application directory.

Of course, you don’t want to merge the virtual machines manually. A good merging tool would be a big help here. It’s also an opportunity for some entrepreneurs to start a company, because I don’t see such a tool available today.

Steve Jin is author of VMware VI & vSphere SDK (Prentice Hall), founder of open source VI Java API, and is the chief blogger at DoubleCloud.org.