VMware Security & Compliance Blog

Monthly Archives: April 2011

2010’s Trend and Risk Report from a VMware Perspective

Hi everyone, Rob Babb here. Yes, there are two Robs on VMware’s security specialist team, but aside from the name it’s very difficult to get us confused in person! At any rate, I wanted to take the opportunity to discuss a new report from one of our security vendor partners, IBM. The report I’m talking about is X-Force’s 2010 Trend and Risk Report, which was released on April 1st, 2011. In the spirit of full disclosure: prior to coming to VMware in 2008, I was with ISS and IBM in various roles both pre- and post-acquisition, and much of the report was done by some good friends and former co-workers. In the report they go through all of the information they collected over the previous year on the disclosed vulnerabilities in the computing industry. The 2010 mid-year report was the first time they brought up the virtualization layer, and in the full-year report they’ve expanded on that a little more.

The reason I want to bring up this section is that I believe it contains key information that each of us must examine in more depth to understand how it applies to our own organization. That examination takes what sounds like a very scary, insecure world and adds the context that reveals there’s more than meets the eye. Specifically, we need to look at a few key areas and dissect the information:

  1. Not all vendors in the virtualization space are created equal.
  2. There are different types of hypervisor architectures, and each has its own potential set of vulnerabilities.
  3. Some exploits are targeted at common open-source components while others are targeted at proprietary code.
  4. When you review many of the XFDB’s CVSS scores, you see that the exploitability component of the temporal score is listed as “unproven”. This is an important point because we frequently see a disconnect between what is announced as a “guest escape” and what can actually inject arbitrary code to run on the host.
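To make the fourth point concrete, here is a minimal sketch of how an “unproven” exploitability rating pulls a score down. It assumes the scores follow CVSS version 2, whose specification defines the temporal multipliers used below. The base score of 7.2 is a made-up illustration, not a value taken from the XFDB.

```python
# CVSS v2 temporal multipliers, as defined in the CVSS v2 specification.
EXPLOITABILITY = {"U": 0.85, "POC": 0.90, "F": 0.95, "H": 1.00, "ND": 1.00}
REMEDIATION_LEVEL = {"OF": 0.87, "TF": 0.90, "W": 0.95, "U": 1.00, "ND": 1.00}
REPORT_CONFIDENCE = {"UC": 0.90, "UR": 0.95, "C": 1.00, "ND": 1.00}


def temporal_score(base_score, exploitability, remediation_level, report_confidence):
    """Return the CVSS v2 temporal score, rounded to one decimal place."""
    score = (
        base_score
        * EXPLOITABILITY[exploitability]
        * REMEDIATION_LEVEL[remediation_level]
        * REPORT_CONFIDENCE[report_confidence]
    )
    return round(score, 1)


# Hypothetical base score of 7.2 for illustration only.
# Unproven exploit (U), official fix available (OF), confirmed report (C):
print(temporal_score(7.2, "U", "OF", "C"))  # 5.3
# Same base score with a widely available exploit (H) and no fix (U):
print(temporal_score(7.2, "H", "U", "C"))   # 7.2
```

The same vulnerability drops from 7.2 to 5.3 once the lack of a proven exploit and the availability of a vendor fix are taken into account, which is why the temporal fields deserve a look before reacting to the headline number.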

 

Vendor-specific considerations

So, on the subject of different vendors, it’s very important to understand where each of the key players stands in its development of security-related technology. This applies both to protecting the hypervisor and to protecting the guest OSes. We at VMware have built a wide range of protections into the hypervisor over the years. Some of those key protections, or core isolation principles, are responsible both for the day-to-day running of guest OSes and for preventing one OS from taking over another OS or the host OS, aka a guest escape. Some of those technologies are:

  • CPU
    • Each VM runs under its own VMM, and those VMMs don’t communicate with one another
    • Trapping and Translating of privileged instructions
    • Emulation and Binary Translation
    • Hardware pipeline support for the No-Execute (NX) and Execute-Disable (XD) bits
  • Memory
    • Hardware Translation Lookaside Buffers (TLB)
    • Shadow Page Tables
    • Memory zero-ing on assignment to a Guest
    • ASLR of vmkernel processes
  • Storage
    • VMFS exclusive locking on running VMDK files
    • Abstraction layer of datastore volume from guest OS
  • Networking
    • Layer-2 attack immunity (VLAN hopping, double encapsulation, brute-forced multicast, etc.)
    • Forged MAC detection
    • vSwitches act separately from one another and do not share data
    • Just like physical switches, virtual switches are designed to only forward packets. They never execute the payload contents of those packets.
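As a rough illustration of the last two networking ideas (forged MAC detection, and forwarding without ever executing payloads), here is a toy model. It is not VMware’s implementation: the `Frame` and `ToyVSwitchPort` names are hypothetical, and the logic only sketches the general technique of comparing a frame’s source MAC against the address assigned to the virtual NIC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: bytes  # opaque to the switch; never parsed or executed


class ToyVSwitchPort:
    """Hypothetical model of one virtual switch port (illustration only)."""

    def __init__(self, assigned_mac, reject_forged=True):
        self.assigned_mac = assigned_mac.lower()
        self.reject_forged = reject_forged
        self.dropped = 0

    def transmit(self, frame):
        """Return the frame if it may be forwarded, or None if it is dropped."""
        if self.reject_forged and frame.src_mac.lower() != self.assigned_mac:
            # Source MAC does not match the address assigned to this vNIC.
            self.dropped += 1
            return None
        # Forward the frame unchanged; the payload bytes are only copied.
        return frame


port = ToyVSwitchPort("00:50:56:aa:bb:cc")
ok = port.transmit(Frame("00:50:56:AA:BB:CC", "ff:ff:ff:ff:ff:ff", b"hello"))
forged = port.transmit(Frame("00:50:56:00:00:01", "ff:ff:ff:ff:ff:ff", b"hello"))
print(ok is not None, forged is None, port.dropped)
```

The point of the sketch is that the check happens at the port, outside the guest, so a compromised guest OS cannot simply turn it off.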

And those are just some of the key items built into an out-of-the-box vSphere deployment, some of which differentiate us from our competition. We’ve been doing many of those things since 2003 when ESX first came out. On top of that, we’ve been building in many other protections over the years. Some of these protect the hypervisor itself, while others are industry-standard security technologies that have been adapted for the virtualized x86 realm. Some of the more advanced technologies are:

  • ESXi
    • Smaller footprint, less to patch, fewer running services
    • PXE boot or stateless boot
    • Lockdown mode
    • Trusted Platform Module (TPM) support
    • Self-checking diagnostics on boot-up
  • vShield
    • Edge – Perimeter security device
    • App – vNIC level inline network firewall
    • Endpoint – Anti-Virus offload mechanism
  • VMsafe & vShield APIs
    • A set of APIs that lets security vendors interface directly with the hypervisor, allowing the existing security industry ecosystem to evolve towards virtualized security appliances.

All of these things come together to form a very secure and reliable platform on which to build virtual data centers. They are also key components in moving towards the industry vision of cloud computing. Without these protections you would have no way to decouple your security protection from your networking infrastructure.

Hypervisor Architecture

The second thing to keep in mind is that there are two primary virtualization architectures used today for x86 computing. The first is the Type-2 architecture, where you run a generic multi-purpose OS (Windows, Linux, OS X) as your “Host OS” and, on top of that, a software application that provides the x86 virtualization layer for your guest OSes. In the Type-2 architecture you inherit all the security ramifications you normally would for that generic OS. The principles of isolation and separation for VMs at the hypervisor level are unchanged, but because the hypervisor has to go through the Host OS to access the physical layer, the potential for exploit is much greater in the Type-2 scenario.

By contrast, a Type-1 hypervisor is what’s known as a bare-metal installation. There is no generic OS running between the hypervisor and the physical resources, and because of this the hypervisor can have a very intimate tie-in to the hardware. Type-1 systems are not all created equal, however. Hyper-V and Citrix Xen both use what’s known as a parent partition (aka Dom0) as a localized management environment. That parent partition typically acts as a funnel for much of the communication between the guest OSes and the physical resources. Because of that, it is a frequent target for attackers looking to intercept and maliciously change information mid-stream. On ESX we have something similar, known as our Console OS (COS); however, our COS does not act as a funnel for access to resources and is instead treated more as a mechanism for administrators to change and manage the host system.

Exploits against Local Host Management Consoles

As with the Hyper-V parent partition and Xen Dom0, the COS is a potential source of attack, as it has some privileged access to control the hypervisor. Over the years, roughly 90% of all our patches were related to the COS. Because of this enormous security footprint, we decided several years ago to begin migrating away from having a COS altogether. ESXi has no COS; instead of a 2+GB footprint of potentially exploitable software, the entire ESXi code base is less than 100MB. One of the most significant premises in computer security research is that the smaller the code base (read: attack surface), the fewer security vulnerabilities exist. ESXi gives us a great lead over our competition in this realm. We no longer have a ton of generic third-party code. Instead, the majority of our ESXi hypervisor is written in-house at VMware and is purpose-built to run only virtualization. This approach has also forced us to think in new ways about how to properly deploy hypervisors as part of the infrastructure and to control that infrastructure remotely, only over secure API channels. To that end we’ve developed a plethora of APIs, and access mechanisms over those APIs, to make managing a vSphere environment more centralized in nature. The hypervisor itself is moving to more of a stateless plug-and-play deployment model where the central command-and-control enforces policy and compliance standards for the whole of the datacenter.

Exploitability of Vulnerabilities

Now, why have I spent so much time talking about core isolation and hypervisor architecture types? I told you this was going to be about the X-Force report and its discussion of virtualization. Well, it comes back to my fourth point. Because the X-Force report looks at the virtualization industry as a whole, its authors are forced to lump a bunch of things together. But that doesn’t help my customers, who want to know the true threat and risk to their virtual environment. So the purpose of laying out all that foundational knowledge is to let us truly analyze the threats of 2010 as they relate to vSphere.

Some of you may be familiar with X-Force’s huge repository of information, called the XFDB. Having come from ISS, I know that DB pretty well and used it extensively in my past roles there in Support and QA. I don’t have as much access to that data as I used to, so I was forced to do my queries and analysis with a rudimentary set of public tools, but nonetheless I was able to gather some metrics. By using the X-Force Database Search at http://xforce.iss.net/ you too can do the same type of searching I did. I started by pulling all the vulnerabilities that contained “ESX or ESXi”. That yielded 106 results between 1/1/2010 and 12/31/2010. After I got those results, I used some XPath querying to try to categorize them in terms of severity and so on. I want to put them in the context of the types of attacks that the X-Force team uses to describe virtualization attack vectors. A look over the attacks shows that the majority apply to generic open-source libraries, which means they would mostly target the ESX COS, and not ESXi.
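For anyone who wants to try a similar categorization, here is a minimal sketch of the idea using the limited XPath support in Python’s standard library. The XML layout, element names, and entries below are hypothetical placeholders; they are not the format the X-Force search returns and not real vulnerability records.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical export format with placeholder entries (not real XFDB data).
SAMPLE_EXPORT = """
<results>
  <vuln id="example-1">
    <title>Placeholder entry 1</title>
    <risk>High</risk>
    <component>COS third-party library</component>
  </vuln>
  <vuln id="example-2">
    <title>Placeholder entry 2</title>
    <risk>Medium</risk>
    <component>COS third-party library</component>
  </vuln>
  <vuln id="example-3">
    <title>Placeholder entry 3</title>
    <risk>Medium</risk>
    <component>hypervisor</component>
  </vuln>
</results>
"""


def count_by(root, field):
    """Tally the text of every <field> child found under ./vuln."""
    return Counter(el.text.strip() for el in root.findall(f"./vuln/{field}") if el.text)


def titles_with_risk(root, risk):
    """Use an XPath predicate to select entries with a given risk level."""
    return [v.findtext("title") for v in root.findall(f"./vuln[risk='{risk}']")]


root = ET.fromstring(SAMPLE_EXPORT.strip())
print(count_by(root, "risk"))
print(count_by(root, "component"))
print(titles_with_risk(root, "High"))
```

Grouping by the affected component as well as by severity is what lets you separate issues in generic libraries from issues in the hypervisor itself.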

I plan to release a follow-up post in a few weeks that goes further through my analysis of these exploits. As part of that process I’ll be exploring how each exploit operates, what part of the environment it targets, and what it gives the attacker. I hope the resulting information will help to frame our security discussions with many more facts and much less ambiguity.

With all that said, I’ll bid you all a great week and we look forward to having these security discussions with all of you!

 

Security FAQ: My vShield Endpoint SVM is not responding, what do I do?

Hey, Rob Randell here again. A new feature that we will be sprinkling into the security blog is entries about interesting or frequently asked questions that we feel deserve more explanation, for more than just the person who asked.

Recently we had a question come up a few times as to the resiliency of the vShield Endpoint SVM and what happens if it fails or if the app itself stops responding.  Specifically, the question is: “What kind of availability capabilities do we have for the vShield Endpoint SVM?”

The issue, obviously, is that if the SVM does fail, the VMs it protects with AV scanning will be vulnerable to viruses while it is down. Because of this, we’ve built in “health monitoring” of all of its components through standard vCenter Events and Alerts. These events can trigger an alert in vCenter, which in turn can trigger an action. This is well documented in the vShield Admin Guide starting on page 81. That said, we thought it would be worthwhile to discuss this in deeper detail to bring it to folks’ attention.

The vShield Endpoint SVM that is provided by our partners is constantly monitored by the vShield Manager. If for some reason the SVM stops responding, the vShield Manager will send an event to vCenter that will trigger an alarm. The screenshot below shows the prebuilt alarm for alerting on the status of the SVM appliance itself.

[Screenshot: Alarm Settings - General]

These alarms can be used to perform a number of actions, such as sending a notification email or SNMP trap, resetting the SVM, or rebooting the VM. In addition, the host can be put into maintenance mode, which will force all VMs to migrate to other hosts in the same resource container that have working SVMs providing protection. An alarm can even be configured to run a command. For example, because the SVM is stateless, a standby SVM can be configured (by cloning the original SVM after registration) to take over in case of a failure. This can be accomplished through a script that runs when the alarm is triggered. This allows us to minimize the downtime of an SVM and to be notified when an issue like this arises, so it can be responded to very quickly. The screenshot below shows a subset of the list of actions that can be taken.

[Screenshot: Alarm Settings - Actions]
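As a rough sketch of what the “run a command” action might call, the script below decides how to respond based on the alarm’s new status. It assumes vCenter passes alarm details to the script as `VMWARE_ALARM_*` environment variables with status values such as “Red” and “Yellow”, so check the variable names against your vCenter version. The action strings it returns are hypothetical placeholders; actually powering on a standby SVM would be done with whatever automation tooling you already use.

```python
import os


def choose_action(env):
    """Map the alarm status found in the environment to a response.

    The variable names are assumptions about what vCenter exports to
    alarm scripts; the returned action strings are placeholders.
    """
    new_status = env.get("VMWARE_ALARM_NEWSTATUS", "").lower()
    target = env.get("VMWARE_ALARM_TARGET_NAME", "unknown")
    if new_status == "red":
        # SVM is not responding: hand over to the cloned standby SVM.
        return f"activate-standby-for:{target}"
    if new_status == "yellow":
        # Degraded but alive: notify an administrator only.
        return f"notify-only:{target}"
    return "no-action"


if __name__ == "__main__":
    # When run by the alarm, the decision is printed for logging purposes.
    print(choose_action(os.environ))
```

Keeping the decision logic in a small function like this makes it easy to test the script before wiring it to a production alarm.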

So in short, there are a number of options for building resiliency and redundancy into the deployment of the vShield Endpoint SVMs. Expect more of these FAQ-type posts in the future on the VMware Security Blog.

vSphere 4.1 Security Hardening Guide released

VMware would like to announce the availability of the final release of the vSphere 4.1 Security Hardening Guide.  The Introduction section describes the scope, structure, recommendation levels, and other aspects of the guide in more detail.  Please read this section first before diving into the rest of the guide, as it provides important context.

Although this version of the guide can be considered “final” and appropriate for use in production environments, we recognize that there is always room for improvement. We will continue to welcome comments and corrections, and we will publish updated versions of the guide from time to time as feedback accumulates. This feedback will, of course, also be incorporated into the hardening guide for future releases of vSphere.

The vSphere 4.1 Security Hardening Guide has been posted to the VMware Communities in the “Security and Compliance” area, in the Documents tab. Please provide feedback in the Comments area.