We recently posted a paper titled, Network Segmentation in Virtualized Environments on vmware.com that discusses and describes three virtualized trust zone configurations and some best practices for secure deployment.
So, what’s a trust zone? It’s part of a network (a network segment) within which traffic flows relatively freely. Traffic in and out of the trust zone is subject to stronger restrictions. Good examples are DMZs, or web/application/database zones between which we would put some form of firewalling.
The idea of consolidating a DMZ to a single host (one of the scenarios described in the paper) has stirred some opinions in the VMware Communities. The subject of security always does.
I thought one of the replies to the ongoing discussion was worth reposting. This post is from our own Serge Maskalik (aka vSerge on communities). You can read the rest of the thread here to get the context of the discussion, but the points about L2 attacks I think stand on their own (and hence why I reposted them here)…
These are really good questions, and there are a number of considerations with regards to using VLANs and how to properly secure L2 environments to reduce your attack surface area. To say that VLANs aren't secure and can't be used for DMZ usage isn't fair - the reality is that there lots of very secure VLAN implementation in production networks since the early part of this decade, especially seen in service provider networks. When you go to a Savvis, Global Crossing, AT&T, etc - you get a VLAN + CIDR block and datacenters' tenants are split up this way across the access layer. I recall building out the GlobalCenter datacenters in the late 90s/early part of this decade (these are now Savvis through Exodus acquisition), and the flat edge network which was Catalyst 5500s with shared broadcast domains became Catalyst 6500s or 7600s or comparable solutions by other vendors with VLAN segregation by customer with VLAN counts in 1k+ range per datacenter. That was almost 10 years ago and we now see lots of large and small enterprise networks heavily leverage VLANs to reduce numbers of physical NICs, simplify physical topology, reduce port density requirements on the switching edge, provide more configuration flexibility, reach large consolidation ratios by having more VMs run on smaller number of ESX servers in collapsed DMZ+Internal environments, etc.
The following is a little bit of information about L2 attacks that folks often talk about and how to put some controls in place to prevent them.
1. CAM flooding or MAC flooding. Switches use content-addressable memory which contain VLAN/PORT/MAC-ADDRESS tables for looking up egress ports as frames are forwarded. These are the forwarding tables for the switches and they have limits in size. The CAM tables are populated by looking at the source MAC on a frame and creating a CAM entry that records which port maps to what source MAC. This attack type tries to overrun the table by generating large numbers of frames with different MACs, to the point that there is no more room in the CAM to store the MAC entries. When the CAM can no longer be populated, the switch will act like a hub and flood frames to all ports except the one the frame came in on (to prevent loops). To avoid this, there are features like setting the max number of MAC entries per port - in most cases you only need one per NIC. By setting this configuration, you get rid of this risk. Secondly, you have to evaluate the risk of such attack. The attacker has to penetrate into the DMZ, own a host within the DMZ or be already on a segment close to the DMZ to run this attack. This attack could not occur if there are intermediate routers in the path, since a mac rewrite occurs on those nodes. It's a good idea to limit your L2 broadcast domains and the diameter of the switched network to avoid propagation of these type of issues.
2. VLAN Cross-talk Attacks (or VLAN Hopping) - on Cisco switches, the dot1q trunks pass all tags be default. When you configure the ESX host to uplink via a dot1q trunk, and guest tagging is allowed, it's conceivable that a rogue guest can generate frames for VLANs it should not be a part off. Avoid enabled guest tagging and monitor your vSwitch configuration activity for such things. Another way to hop VLANs is to spoof dynamic trunk configuration frames from a host; protocols like these are used by vendors to automatically configure 802.1q trunks to set up and allow VLANs. To avoid this, configure switch ports passing tagged frames explicitly to be trunks and to explicitly forward specific tags. Also, don't allow for unplugged ports on the switches to remain in a VLAN used by important assets - put them into an unused VLAN to avoid the possibility of someone plugging in to a port and getting access to the VLAN. Avoid using default VLANs (like VLAN1 on Ciscos).
3. ARP spoofing - this is where a host on the same segment as other hosts modify the ARP table on the edge router/gateway to point to the attacker's MAC and are able to redirect traffic to themselves. This can be done using ARP request or Gratuitous ARP mechanisms. This is a bit tougher to defend against, but can happen regardless of whether you are using VLANs or not.
4. Spanning-Tree attacks - this is where attackers could cause a DoS and bring down the L2 network section by generated malicious STP BDPUs and become root bridges or confuse the protocol to block specific ports. This can happen regardless of usage of VLANs, plus features like bpdu-guard and root-guard help prevent this type of stuff.
5. VRRP or HRSP tampering - break the failover protocol for the default gateway, take over the gateway MAC yourself, etc.
6. Starve out the DHCP address range - not as big of deal of DMZ, unless you are using DHCP for servers.
We on the vShield Zones teams recognize these issues and try to provide visibility to VMs and flows destined/sourced to and from VMs from a network perspective. Using Zones, you can see an ARP spoofing attack from a VM or a physical host on a segment and remediate the issue. Security best practices claim that you need visibility into L2 to deal with these type of issues, so in addition to providing firewalling functionality, we spent a lot of time on providing microflow visibility.
Also, we are seeing lots of customers using vShield Zones to isolate and segment clusters to provide dual-purpose for DMZ and internal server VMs usage using VLANs + vShield Zones isolation. We will be posting papers on this front and there will be examples at VMworld on how this can work. We are seeing three major use cases in the context of this:
1. Isolated/Segment DMZ in a dedicated set of ESX hosts or cluster with multiple trust zones provided by the vShield Zones.
2. Fully collapsed DMZ where the cluster or set of ESX hosts are shared by internal VMs and Internet-facing VMs.
3. Branch office environments where there may be some VMs hosted with Internet access, some for internal server usage and VDI as well.