
Thoughts on Visibility, Context, and Control after listening to Chris Young’s keynote from RSA 2012

At RSA last week in San Francisco, Chris Young of Cisco commanded the stage. He held the audience on the edge of their seats in anticipation, and he said all the right things. Well, that is, he said all the right things to make Cisco sound perfectly positioned. And can we really fault them for being so network-centric?

He did make some excellent points. Chris said, “I believe that visibility and context-aware enforcement are two of the things we all need the most in security,” and I totally agree. You obviously can’t take action against an attack if you can’t visualize it, and how do you know whether it’s legitimate without the context? In fact, this has been the premise of IPS and Anomaly Detection tools for over a decade. And yes, those are definitely network-based tools.

Have you ever stopped to consider WHY those tools are network-based? It’s because the network is where the concept of an inline tap was developed. It was easy for an engineer to take the signal off an Ethernet cable and pipe it into an analysis tool without actually interfering with the connection. That made it much easier to monitor many communicating systems without the pain of attaching to each new system as it was added to the network.

The concept of a tap may have been born out of networking, but it can now be applied to a wide range of other technologies. The same idea has been in use with software agents for at least a decade as well, shimming or tapping the CPU, memory, storage, and networking stacks inside an OS. Those agents work fairly well, but in recent years they have succumbed to attacks from malware designed to disable them upon takeover of an OS. The reason malware has been successful against a “host agent” but not against a network-based tool comes down to the context of execution. When a piece of malware takes over an OS, it has already taken control beyond the scope of what was originally designed. To say it another way, the malware plays with no rules or makes up its own rules on the fly, while the security software and the OS will only ever adhere to the rules they know. So agent-based tools are always at a disadvantage.

So with that context in mind, the security world has been split between network-centric tools and software-agent tools, and because of the inherent disadvantage of software agents we’ve seen an uptick in network-specific tools over the last several years. To underscore this point, Chris also said, “the network is becoming the only constant source of intelligence we can rely on and the only control point we can depend on.” Unfortunately, this is where I have to disagree with Chris and Cisco’s approach of relying only on network-centric tools. What he failed to acknowledge is another form of tap that is available and in use today. This technology, like agent-based solutions, can intercept many different forms of data streams, such as CPU, memory, network, or storage. However, this solution does NOT have the problem agent-based tools have, and instead leverages the transparent inspection nature of a network-based tap. Sounds like the best of both worlds, right?

So what is this tapping tool that Chris neglected to acknowledge? Why, of course, it’s a hypervisor! Yes, that’s right: it’s the core competency of virtualization, and what I’m describing is an added benefit that has been overlooked by others for many years but that we at VMware have invested heavily in for nearly half a decade now. Every form of data processing that ever happens inside a VM passes through the hypervisor, and all of that data is available to be inspected for any conceivable reason. And we’ve already been creating access methods for the security industry to use for the last 4 years. These could be APIs like VMSafe or EPSEC for partners to use, or even our own vShield suite of technology.

Even Cisco is using some of these technologies, like vNetwork and DVFilter, to do its own inspection and enforcement of the kind Chris is advocating. In fact, while their implementation gains access to these data streams in the hypervisor, they insist on moving the inspection back into their network-centric tools via the Nexus 1000V and their Virtual Security Gateway (VSG).

The problem with that approach is that the depth of these protection tools is typically not comprehensive across all of the different threat vectors. What we need to do as an industry is work on ways to better integrate these tools and adopt them more rapidly. The unfortunate truth is that each security vendor has a core competency, and they let that small set of protection tools dictate the direction of their portfolio and development efforts. Our adversaries, meanwhile, recognize none of these limits, play with no rules, and exploit our unwillingness to properly implement our defense-in-depth-and-breadth strategies. As a call to action, we should learn to embrace each of our various tool sets, make it easier for our customers to use our tools in conjunction with one another, and even, someday, create an open management framework for shared policy constructs.

We don’t need to focus on the network and the minimal set of inspection points it has to offer in the traditional security model. Instead we should focus on the hypervisor and the nearly infinite, simultaneous inspection points now available. Only this level of cooperation will allow us to take off our stack-specific blinders, Visualize the true threat landscape, and apply the proper Context when implementing our Control boundaries in this new evolution of IT we call Cloud.

 –

Rob

Rob Babb is a Senior Systems Engineer on the Security and Compliance Specialist team at VMware. 

Defense in Depth. Who needs it?

Greetings from San Francisco and the RSA conference! It's been a great week of sessions, meetings, and vendor presentations. While listening to all these great talks, one theme kept striking me as something I believe we're all seeing in the industry: misconfiguration and mismanagement.

Defense in depth is the process of adding layers of protection to secure an area. It's not a new concept at all. In fact, it has been the mantra of the security industry for at least a decade, if not longer, and well before that it was a widely used military tactic. Usually, this strategy is meant to prevent an attacker from exploiting one of your security tools, like a firewall or IPS, and bypassing what would otherwise be a single layer of protection. Now, I agree that adding layers of well-thought-out protection makes a ton of sense when those layers are all managed properly, updated regularly, and the threat you're trying to address is an attacker bypassing your security device through its own vulnerabilities. The problem, though, is that in nearly every breach we've heard about in the news, that is not the case. More often than not, the attacker used a valid communication pathway to reach a publicly available service, like a web server, and then used an exploit in the code of that application or an unpatched service to gain access into the infrastructure. On top of that, the attacker is usually able to reach other systems in the corporation by simply pivoting from the system they just compromised.

So, why is this possible? Well, it seems that administrators are spending the majority of their time just "keeping the lights on", meaning our security and operations teams are stretched far too thin. What suffers when administrators don't have enough time? Almost every single time, the existing product or server suffers so that the schedule is kept for the new project(s) being worked on. In other words, instead of keeping existing servers patched and up to date in a timely manner, or keeping your security tools functioning properly to begin with, administrators are working on new things for the corporation. It also means you need to protect your security teams from becoming the "Jack of all trades, master of none."

This is a horrible conundrum for an IT organization. So, back to my title question: the answer is "it depends". I would argue it makes more sense to have one layer of really well-implemented security and process before adding other layers. If you can't make that one layer work properly 100% of the time, then you need to go back and make it work. This may not be the answer the business owners want to hear, but it's the answer they MUST hear.

Far too often I see companies move to technologies like DLP, Anomaly Detection, GRC, etc., before they've even mastered how to deploy and use an IDS or IPS. That's like trying to jump hurdles in the Olympics before you've mastered walking. The other thing I can tell you is that with Cloud/ITaaS/'X'aaS coming to a datacenter near you soon, the world of security is getting both more complicated and simpler all at the same time. We're seeing customers collapse network domains in favor of reducing VLANs, but at the same time they are implementing tools like vShield App that allow for network segmentation at layers 2 through 4. These new tools make designing networks a more complicated thought process up front, but in the long run they simplify the management of those networks.

We face a series of similar paradoxical situations in the future of virtualized security and compliance technologies. It's incumbent upon us all to learn these tools quickly, act on their proper implementation, and by all means make sure to test our designs before we move on to that next layer of our Defense in Depth strategy. I know the temptation and push is strong to add the latest and greatest, but it'll do you no good if it's implemented incorrectly.

 –

Rob

 

 

Rob Babb is a Senior Systems Engineer on the Security and Compliance Specialist team at VMware. 

“Let’s get out of the weeds”

As part of VMware’s Security & Compliance Specialist team, we’re brought in to speak about a very wide range of concepts, extending from CPU architecture all the way up to traditional tools like firewalls, IPSes, anti-virus, and many others. Usually there’s some type of compliance question or concern driving the need to have a security conversation. And what most people don’t explicitly realize is that a discussion about security, whether physical or computer, always distills down to the lowest common denominator: trust.

The concept of trust is an interesting notion. Trust is usually a faith- or belief-based emotion, and the hope we hold for one another is that in matters of science and technology that trust is based on empirical evidence and well-informed reasoning. So, obviously, education is often our best method of helping customers build that trust in our products.

Often the questions I receive are not about virtualized security products, like vShield, or the various APIs that have been developed. Instead the focus is most often on the vSphere platform itself. The reason for this is mainly a lack of accurate, sufficiently detailed information available in the market. For several years VMware did a great job of building a secure architecture for vSphere but did not focus on publicizing those design decisions, not because they weren’t important but because it was not a conversation our customers were expressing a need to have with us. Obviously, as customers move through their own unique virtualization journey into Phase 2, Business Production, they are tackling security and compliance concerns around the more mission-critical applications and data that are beginning to be virtualized. These conversations are also a precursor to what needs to be resolved before a company invests in a private, public, or hybrid “cloud” solution, since it all relates back to how well a company can trust the technological controls that have been put in place.

Since I am so often asked questions about vSphere that tell me the asker does not trust vSphere, or any hypervisor platform, I frequently have a discussion about what I call “building a pyramid of trust”. Like any structure, the foundation is the most important part, because without a well-formed base, in this case of knowledge, it is highly unlikely the pieces layered on top will be stable enough to support further layers. In my pyramid, the base consists of the core constructs of virtualization. These are the Core Isolation Principles that describe exactly how the hypervisor is designed to separate itself from the virtual machines and what keeps each VM separate from the others. Should these principles be violated, so would the isolation described by the very definition of virtualization.

To help explain the core principles, I break the functions of the hypervisor into four key areas: CPU, Memory, Storage, and Networking. Each of these describes a physical function that is abstracted into the VMs themselves. The ways in which this abstraction occurs are key to fully grasping how we’ve developed our platform from the ground up with security in mind. It shows through in how we isolate specific CPU instructions, how our memory is layered, abstracted, and allocated, in the storage platform, and, most importantly, in the protections guarding against remote exploit and arbitrary code execution. All of these things build the defense-in-depth techniques that layer security in a virtualized environment.

Many security practitioners have built their careers focusing on higher-level security concepts, and their primary attention was never much directed at the physical hardware interfaces themselves. Much in the same way that server admins were not familiar with centralized storage and networking when we taught them how to virtualize over the last 10+ years, we are now helping security admins break down their traditional barriers of understanding and grasp all of these other disciplines in the context of their day-to-day activities.

The interesting part is the resistance we face in educating security teams about all of these technologies and helping to build their trust in the technology. The experience thus far has shown that the typical US corporation is full of cliché terminology, which we’ve already known for years. Dilbert, The Office, SNL, all have made us laugh for hours at what we have become. Even with all this exposure to the ludicrousness of business clichés, I was taken aback a few weeks ago when an attendee at a meeting said we needed to “get out of the weeds”. It was obvious from that one statement that this person was not able to see the foundation of the pyramid being built. They were not willing to connect the dots and see how the information being presented could answer all of their questions. Instead, they were letting preconceived notions, founded on misinformation and FUD in the market, limit their ability to absorb the material in an educational context.

I don’t blame this person for their comment. In the day and age we live in, time is precious and things happen so quickly that it’s hard to keep up with changes in business without sacrificing too much personal time. We’re constantly being asked to make value judgments on which information is worth absorbing and when it’s time to move on. For some of us, the thread of patience is stretched to the breaking point already.

After a few days had passed, the meeting organizer came back to me and said how grateful they were to have had the conversation. They said the discussions that were sparked, both during our meeting and in the days following, had led to some very positive decisions, largely because of the comment made by that one individual to “get out of the weeds”. That was a key indicator for many other attendees that their co-worker was resistant to change and, to use another cliché, “unable to see the forest for the trees”.

This is not an unusual situation for us. In fact, it has become the norm for our team to hold initial education meetings followed a week or two later by another meeting to review the information again. The reason is that we’ve got to come back, reinforce, and inspect that foundation of the pyramid so our audience fully builds their trust in our solution. We’re having great success in this education endeavor, and we look forward to meeting with you and your teams in the future.

 –

Rob

 

 

Rob Babb is a Senior Systems Engineer on the Security and Compliance Specialist team at VMware. 

Follow-up Analysis from “2010’s Trend and Risk Report from a VMware Perspective”

As promised in my blog back in April (sorry it took so long!), here is a follow-up examination of the vulnerabilities announced in 2010 as they relate to ESX and ESXi. As I mentioned before, there were 106 different CVEs that came out of the X-Force DB with the search I pulled. I tried to be as thorough as possible and focused only on vSphere’s ESX and ESXi as opposed to the entirety of our product lines. I also found that several of the items returned were actually for Workstation and were only picked up because a combined VMware Security Advisory (VMSA) covered multiple patches. So all of the stats have been updated to reflect the new total of 99. Of those 99, only 7 applied to VMware’s own code, and of those, only 3 applied to ESXi’s code. Another interesting stat: 1 of those 3 applied only to version 4.1 of ESXi, and the other 2 applied only to version 4.0. Lastly, 6 of those 7 applied to ESX versions 3.0 and 3.5, and 2 of those 6 applied to ESXi version 3.5. That’s a lot of numbers, but what it tells me is that by moving to ESXi as a sole platform, we’re moving in the right direction, because there are fewer patches overall and fewer patches in each subsequent version.
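
For anyone who wants to reproduce this kind of tally from the CSV linked at the end of this post, here is a minimal Python sketch of the approach. The column names used below ("affected_code" and "esx_versions") are hypothetical placeholders of my own, not the actual headers in that file, so adjust them to whatever the CSV really contains.

```python
import csv
from collections import Counter

def tally(csv_path):
    """Count advisories by code owner and by ESX/ESXi version.

    NOTE: "affected_code" and "esx_versions" are hypothetical column
    names used for illustration; match them to the real CSV headers.
    """
    by_owner = Counter()
    by_version = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            by_owner[row["affected_code"]] += 1            # e.g. "VMware" vs "third-party"
            for ver in row["esx_versions"].split(";"):     # e.g. "ESXi 4.0;ESX 3.5"
                by_version[ver.strip()] += 1
    return by_owner, by_version

if __name__ == "__main__":
    owners, versions = tally("xfdb_query.csv")
    print("By code owner:", dict(owners))
    print("By ESX/ESXi version:", dict(versions))
```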

 
So, looking at the breakdown of the announcements in the chart below, you’ll notice a pretty even split between Linux-specific exploits for the ESX COS and open-source libraries used heavily in the ESX COS. There were also a lot of Java vulnerabilities announced last year. The common thread here is that these hit us just as they did every other vendor in the industry using those tools, and they are very common tools: OpenSSL, glibc, Java, the Linux kernel, etc. The fact that we’re not unique here is an important factor to consider, since you would have these same types of patches on other platforms.

 

[Chart 1: Breakdown of the 2010 ESX/ESXi vulnerability announcements by affected component]

The next important set of stats to look at is the classification. X-Force uses CVSS, just like many other vendors, to assign their risk ranking. One thing that was interesting, though: an item like vmware-libraries-code-execution (57663) was assigned a “High Risk” classification, but its CVSS base score was only 4.4 and its temporal score 3.3. Meanwhile, vmware-web-requests-spoofing (57312) was given a “Medium Risk” rating despite higher CVSS scores on both measures. It’s these interesting skews that mean we need to consider the real impact as it directly relates to each individual organization and deployment. Below is a breakdown of those risk ratings, but knowing what we do now, I would recommend taking them with a grain of salt. One point to consider is that inside VMware we use CVSS for scoring, but we have also added a few components for calculating potential severity in a virtual environment. These are values that the original CVSS spec never considered. For example, we score a reported escape of any kind as a 10 (the highest value), which means it receives the most critical severity, and we do the same for a few other possible attack vectors. That is how laser-focused we are on making sure we produce the most secure products we can for our customers.

 

[Chart 2: Breakdown of the vulnerabilities by X-Force risk rating]
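
One quick way to spot skews like the 57663/57312 example is to compare every pair of entries and flag the cases where the item with the higher CVSS base score carries the lower vendor risk label. Here is a minimal sketch, again using hypothetical column names ("cve_name", "risk_rating", "cvss_base") that you would need to map to the real data:

```python
import csv
from itertools import combinations

# Rank the vendor risk labels so they can be compared against numeric CVSS scores.
RISK_ORDER = {"Low Risk": 0, "Medium Risk": 1, "High Risk": 2}

def find_skews(csv_path):
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    skews = []
    for a, b in combinations(rows, 2):
        # Order each pair by CVSS base score ("cvss_base" is a hypothetical column name).
        hi, lo = (a, b) if float(a["cvss_base"]) >= float(b["cvss_base"]) else (b, a)
        # A skew: the entry with the higher CVSS score has the *lower* vendor risk label.
        if RISK_ORDER[hi["risk_rating"]] < RISK_ORDER[lo["risk_rating"]]:
            skews.append((hi["cve_name"], lo["cve_name"]))
    return skews

if __name__ == "__main__":
    for higher_cvss, lower_cvss in find_skews("xfdb_query.csv"):
        print(f"{higher_cvss} has the higher CVSS score but a lower risk rating than {lower_cvss}")
```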

Lastly, I think it’s important to look at the sensationalism in the security research industry. Frequently we hear of “exploits” of virtualization, but there is usually a lack of evidence as to the critical nature of the vulnerability. In typical general-purpose OSes we’re used to seeing buffer overflows that are able to fully compromise the entire computing environment, and that perception continues to pervade the rest of the computing industry as if it were gospel. Yet the environment created by a Type-1 hypervisor is drastically different from the direct hardware access a regular OS has. There are multiple layers of abstraction, which I previously discussed, and those layers build on one another to create defense-in-depth computing for the virtual environment. It is to this that I attribute the results in the chart below: greater than 86% of all of those vulnerabilities were listed as “Unproven”, and all 7 of the VMware vulnerabilities I listed before fell into that category.

[Chart 3: Breakdown of the vulnerabilities by exploitability status]
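
As a quick sanity check on that figure, a few lines against the same CSV will give you the exact share; as before, the "exploitability" column name is a hypothetical placeholder for whatever the real header turns out to be.

```python
import csv

with open("xfdb_query.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# "exploitability" is a hypothetical column name; substitute the real header.
unproven = sum(1 for r in rows if r["exploitability"].strip().lower() == "unproven")
print(f"{unproven} of {len(rows)} ({unproven / len(rows):.0%}) listed as Unproven")
```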

Now, I won’t sit here and tell you not to be vigilant in your examination of newly announced CVEs; in fact, I’ll tell you to examine each one of them very closely and thoroughly. I also won’t say that someone won’t someday come out with exploit code and release it in the wild, because that would be giving you a false sense of security. This is software, and all software has its defects from time to time. What I will tell you is how important it is to stay up to date on ESX and ESXi. In fact, I’m going to drop ESX from that sentence entirely, because soon we will be moving to an ESXi-only offering and your sole option will be the more secure ESXi platform.

We need the industry to drop the stigma that’s been adopted over the last 20 years of OS patching. It’s incredibly unfortunate that we’ve all grown up in IT in an age when certain vendors’ updates required the “early adopters” to test them for six months or more before large IT shops felt comfortable applying them. We at VMware have a very different approach. We want you to update early and often and stay on top of every update that comes out. When you move to the ESXi platform, I would seriously encourage you to start looking at stateless implementations of deploying ESXi, where every time you reboot the host it automatically comes up with the latest and greatest update. This will help you reduce the patching/updating time for your VMware vSphere environments as a whole, and you’ll be better off for it in the end, assured that your hosts are running the latest and greatest for both features and security!

 

 

If you'd like to do your own analysis or review the same CVEs I did, here is a CSV file with links to each of them in the X-Force DB.  Download XFDB Query

2010’s Trend and Risk Report from a VMware Perspective

Hi everyone, Rob Babb here. Yes, there are two Robs on VMware’s security specialist team, but aside from the name it’s very difficult to confuse us in person! At any rate, I wanted to take an opportunity to discuss a new report from one of our security vendor partners, IBM. The report I’m talking about is X-Force’s 2010 Trend and Risk Report, which was released on April 1st, 2011. In the spirit of full disclosure: prior to coming to VMware in 2008, I was with ISS and IBM in various roles both pre- and post-acquisition, and much of the report was done by some good friends and former co-workers. In the report they go through all of the information they’ve collected for the previous year on all of the disclosed vulnerabilities in the computing industry. The 2010 mid-year report was the first time they brought up the virtualization layer, and in the full-year report they’ve expanded on that a little bit more.

The reason I want to bring up this section is that I believe there is some very key information each of us must examine in more depth to fully realize how it applies to our organization. In that examination we take what sounds like a very scary, insecure world and provide the contextual analysis to reveal that there’s more than meets the eye. What I mean is that we need to look at a few key areas and dissect the info:

  1. Not all vendors in the virtualization space are created equal.
  2. There are different types of hypervisor architectures, and each has its own potential set of vulnerabilities.
  3. Some exploits are targeted at common open-source components while others are targeted at proprietary code.
  4. When you review many of the XFDB’s CVSS scores, you see that the exploitability element of their temporal score is listed as “unproven”. This is an important point, because frequently we see a disconnect between what is announced as a “guest escape” and what actually injects arbitrary code to run on the host.

 

Vendor-specific considerations

So, on the subject of different vendors, it’s very important to understand where each of the key players is in their development of security-related technology. This applies to protecting both the hypervisor and the guest OSes. We at VMware have built in a ton of protections for the hypervisor over the years. Some of those key protections, or core isolation principles, are responsible both for the day-to-day running of guest OSes and for preventing one OS from taking over another OS or the host, a.k.a. a guest escape. Some of those technologies are:

  • CPU
    • Each VM runs under its own VMM, and those VMMs don’t communicate with one another
    • Trapping and Translating of privileged instructions
    • Emulation and Binary Translation
    • Hardware pipeline support for the No-Execute (NX) and Execute-Disable (XD) bits
  • Memory
    • Hardware Translation Lookaside Buffers (TLBs)
    • Shadow Page Tables
    • Memory zeroing on assignment to a guest
    • ASLR of VMkernel processes
  • Storage
    • VMFS exclusive locking on running VMDK files
    • Abstraction layer of datastore volume from guest OS
  • Networking
    • Layer 2 attack immunity (VLAN double-encapsulation, brute-forced multicast, etc.)
    • Forged MAC detection
    • vSwitches act separately from one another and do not share data
    • Just like physical switches, virtual switches are designed to only forward packets. They never execute the payload contents of those packets.

And that’s just some of the key items built into an out-of-the-box vSphere deployment, some of which differentiate us from our competition. We’ve been doing many of those things since 2003, when ESX first came out. On top of that, we’ve been building many other protections in over the years. Some of these protect the hypervisor itself, while others are industry-standard security technologies that have been adapted for the x86 virtualized realm. Some of the more advanced technologies are:

  • ESXi
    • Smaller footprint, less to patch, fewer running services
    • PXE boot or stateless boot
    • Lockdown mode
    • Trusted Platform Module (TPM) support
    • Self-checking diagnostics on boot-up
  • vShield
    • Edge – Perimeter security device
    • App – vNIC level inline network firewall
    • Endpoint – Anti-Virus offload mechanism
  • VMSafe & vShield APIs
    • A set of APIs for security vendors to directly interface with the hypervisor, which allows the existing security industry ecosystem to evolve towards virtualized security appliances.

All of these things come together to build a very secure and reliable platform on which virtual data centers can be built. They are also key components in moving towards the industry vision of cloud computing. Without these protections you would have no way to decouple your security protection from your networking infrastructure.

Hypervisor Architecture

The second thing to keep in mind is that there are two primary virtualization architectures used today for x86 computing. The first is the Type-2 architecture, where you run a generic multi-purpose OS (Windows, Linux, OS X) as your “host OS” and on top of that you run a software application package that provides the x86 virtualization layer for your guest OSes. In the Type-2 architecture you have all the security ramifications you normally would for that generic OS. The principles of isolation and separation for VMs at the hypervisor level are unchanged, but because the hypervisor has to go through the host OS to access the physical layer, there is a much greater potential for exploit in the Type-2 scenario.

By contrast, a Type-1 hypervisor is what’s known as a bare-metal installation. There is no generic OS running between the hypervisor and the physical resources, and because of this, the hypervisor can have a very intimate tie-in to the hardware. Type-1 systems are not all created equal, though. Hyper-V and Citrix Xen both use what’s known as a parent partition (a.k.a. Dom0) as a localized management environment. That parent partition typically acts as a funnel for much of the communication between the guest OSes and the physical resources, and because of that it is a frequent target for attackers looking to intercept and maliciously change information mid-stream. On ESX we have something similar, known as our Console OS (COS); however, our COS does not act as a funnel for access to resources and is instead treated more as a mechanism for administrators to change and manage the host system.

Exploits against Local Host Management Consoles

As with the Hyper-V parent partition and Xen Dom0, the COS is a potential source of attack, as it has privileged access to control the hypervisor. Over the years we saw on the order of ~90% of all our patches relate to the COS. Because of this enormous security footprint, we decided several years ago to begin migrating away from having a COS altogether. ESXi has no COS; instead of a 2+ GB footprint of potentially exploitable software, the entire ESXi code base is less than 100 MB. One of the most significant premises in computer security research is that the smaller the code base (read: attack surface), the fewer security vulnerabilities exist. ESXi gives us a great lead in this realm ahead of our competition. We no longer have a ton of generic third-party code; instead, the majority of our ESXi hypervisor is written in-house at VMware and is purpose-built to run only virtualization. This approach has also forced us to think in new ways about how to properly deploy hypervisors as part of the infrastructure and to control that infrastructure remotely over secure API channels only. To that end we’ve developed a plethora of APIs, and access mechanisms over those APIs, to make managing a vSphere environment more centralized in nature. The hypervisor itself is moving to more of a stateless, plug-and-play deployment model where central command-and-control enforces policy and compliance standards for the whole of the datacenter.

Exploitability of Vulnerabilities

Now, why have I spent so much time talking about core isolation and hypervisor architecture types? I told you this was going to be about the X-Force report and its discussion of virtualization. Well, it comes back to my fourth point. Because the X-Force report looks at the virtualization industry as a whole, they are forced to lump a bunch of things together. But that doesn’t do any good for my customers, who want to know the true threat and risk to their virtual environment. So the purpose of laying out all that foundational knowledge up to this point is so we can truly analyze the threats of 2010 as they relate to vSphere.

Some of you may be familiar with X-Force’s huge repository of information, the XFDB. Having come from ISS, I know that DB pretty well and used it often in my past roles there in Support and QA. I don’t have as much access to that data as I used to and was forced to do my queries and analysis with a rudimentary set of public tools, but nonetheless I was able to gather some metrics. By using the X-Force Database Search at http://xforce.iss.net/ you too can do the same type of searching I did. I started by pulling all the vulnerabilities that contained “ESX or ESXi”. That yielded 106 results between 1/1/2010 and 12/31/2010. After I got those results, I used some XPath querying to try to categorize them in terms of severity and such, and to put them in the context of the types of attacks the X-Force team uses to describe virtualization attack vectors. A look over the attacks shows that the majority apply to generic open-source libraries, which would mostly target the ESX COS, not ESXi.
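
For the curious, the sketch below shows the general shape of that rudimentary tooling: save the XFDB search-result pages locally and pull the fields out with XPath. The XPath expressions and class names here are purely illustrative guesses, not the actual markup of xforce.iss.net, so inspect the HTML you save and adjust accordingly.

```python
from collections import Counter
from lxml import html

def categorize(saved_pages):
    """Tally saved XFDB search-result pages by risk level.

    The XPath expressions below are illustrative only; the real page
    structure will differ, so adapt them to the HTML you actually saved.
    """
    counts = Counter()
    for path in saved_pages:
        tree = html.parse(path)
        for entry in tree.xpath('//div[@class="vuln-entry"]'):          # hypothetical markup
            risk = entry.xpath('string(.//span[@class="risk-level"])')  # hypothetical markup
            counts[risk.strip() or "Unknown"] += 1
    return counts

if __name__ == "__main__":
    print(categorize(["xfdb_results_page1.html", "xfdb_results_page2.html"]))
```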

I plan to release a follow-up blog in a few weeks that goes further through my analysis of these exploits. As part of that process I’ll be exploring how each exploit operates, what part of the environment it targets, and what a successful exploit provides to the attacker. I hope the resulting information will help frame our security discussions with many more facts and much less ambiguity.

With all that said, I’ll bid you all a great week and we look forward to having these security discussions with all of you!