Note: this post was developed jointly by Justin Pettit of VMware and Mark Pearson of HP, with additional content from VMware’s Martin Casado and Bruce Davie.
A recent Network Heresy post “Of Mice and Elephants” discussed the impact long-lived flows (elephants) have on their short-lived peers (mice). A quick summary is that, in a datacenter, it is believed that the majority of flows are short-lived (mice), but the majority of packets belong to long-lived flows (elephants). Mice flows tend to be bursty and latency-sensitive, whereas elephant flows tend to transfer large amounts of data, with per-packet latency being of less concern. These elephants can fill up network buffers, which can introduce latency for mice.
At the HP 2013 Discover Conference, HP and VMware demonstrated a technology preview of detecting and handling elephant flows in an overlay network. The demonstration featured the joint HP-VMware solution announced at VMworld 2013. VMware NSX provided an overlay network using HP switches as the underlay along with the HP VAN SDN controller. Through controller federation interfaces, the overlay and the underlay co-operated to mitigate the effects of the elephant flows on the mice. The solution shows the power of integration between network virtualization and SDN solutions.
The NSX controller configures a modified version of Open vSwitch running in the hypervisors to begin detecting and reporting elephant flows. Since Open vSwitch forwards traffic from virtual machines and handles tunneling, it is in a unique position at the edge to handle elephant flows. It is able to identify the logical traffic of the virtual machines and correlate it to the physical traffic on the wire.
An elephant flow is identified by its throughput and duration, both of which are configurable parameters. The intuition here is that an elephant flow, by definition, sends enough data to congest the network, and must do so for long enough to make mitigation worthwhile. Open vSwitch monitors its fastpath flow table to identify flows that exceed the configured parameters. Once it identifies an elephant in the logical space, Open vSwitch stores the corresponding physical addresses in its OVSDB database.
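To make the two thresholds concrete, here is a minimal Python sketch of the kind of check described above. It is not the Open vSwitch implementation: the `FlowStats` structure, the `is_elephant` function, and the threshold values are all hypothetical, and the use of average throughput over the flow's lifetime is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Counters for one flow-table entry (hypothetical structure)."""
    byte_count: int    # total bytes seen on this flow
    first_seen: float  # time of the first packet, in seconds
    last_seen: float   # time of the most recent packet, in seconds


def is_elephant(stats: FlowStats, min_rate_bps: float, min_duration_s: float) -> bool:
    """Return True if the flow has lasted long enough AND is fast enough.

    Assumption: throughput is the average bit rate over the flow's lifetime.
    """
    duration = stats.last_seen - stats.first_seen
    if duration <= 0 or duration < min_duration_s:
        return False  # too short-lived to be worth mitigating
    avg_rate_bps = stats.byte_count * 8 / duration
    return avg_rate_bps >= min_rate_bps


# Illustrative values only; the real thresholds are configurable.
mouse = FlowStats(byte_count=1_000, first_seen=0.0, last_seen=0.01)
elephant = FlowStats(byte_count=10_000_000, first_seen=0.0, last_seen=2.0)

print(is_elephant(mouse, min_rate_bps=10e6, min_duration_s=1.0))
print(is_elephant(elephant, min_rate_bps=10e6, min_duration_s=1.0))
```

Requiring both conditions keeps a short burst from being flagged, which matches the intuition that mitigation is only worthwhile for flows that persist.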
Using the OVSDB protocol, an NSX Elephant Agent monitors the Open vSwitches’ databases and receives a trigger when an elephant flow is added or removed. The agent uses an API (which was jointly developed by HP and VMware) to inform the HP VAN SDN Controller of changes.
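The add/remove triggering can be pictured as a set difference between the previous and current contents of the database. The Python sketch below illustrates that idea only. It does not speak the OVSDB protocol, and `ElephantAgent`, `diff_elephants`, and the `notify` callback are hypothetical names that stand in for the real agent and the jointly developed API.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical flow key: (source IP, destination IP) of the tunnel.
Flow = Tuple[str, str]


def diff_elephants(previous: Set[Flow], current: Set[Flow]) -> Tuple[Set[Flow], Set[Flow]]:
    """Return (added, removed) flows between two snapshots."""
    return current - previous, previous - current


class ElephantAgent:
    """Tracks the last known set of elephants and reports changes."""

    def __init__(self, notify: Callable[[str, Flow], None]) -> None:
        self._known: Set[Flow] = set()
        self._notify = notify  # stand-in for the call to the underlay controller

    def on_update(self, current: Iterable[Flow]) -> None:
        snapshot = set(current)
        added, removed = diff_elephants(self._known, snapshot)
        for flow in sorted(added):
            self._notify("add", flow)
        for flow in sorted(removed):
            self._notify("remove", flow)
        self._known = snapshot


events = []
agent = ElephantAgent(lambda action, flow: events.append((action, flow)))
agent.on_update([("17.0.0.1", "17.0.0.2")])  # elephant appears
agent.on_update([])                          # elephant goes away
print(events)
```

Reporting only the differences means the underlay controller hears about each elephant once when it appears and once when it ends, rather than on every database update.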
For example, assume that virtual machines VM1 and VM2 are running on hypervisors HV1 and HV2, respectively. The virtual machines are using addresses from the 192.168.0.x private address space and are connected by a logical switch. The hypervisors are using addresses in the 17.0.0.x address space and are physically connected through a set of HP switches. NSX configures a VXLAN tunnel between the hypervisors to realize this logical topology on top of the physical topology.
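The relationship between the logical and physical addresses in this example can be sketched as a simple lookup. The host addresses below are assumptions made for illustration: the post only gives the address ranges, plus 192.168.0.2 for VM2, and `tunnel_for` is a hypothetical helper rather than anything in NSX or Open vSwitch.

```python
from typing import Dict, Tuple

# Assumed placement: which hypervisor address hosts which VM address.
VM_TO_HYPERVISOR: Dict[str, str] = {
    "192.168.0.1": "17.0.0.1",  # VM1 on HV1 (addresses assumed)
    "192.168.0.2": "17.0.0.2",  # VM2 on HV2 (HV address assumed)
}


def tunnel_for(logical_src: str, logical_dst: str,
               vm_to_hv: Dict[str, str]) -> Tuple[str, str]:
    """Map a logical (VM-to-VM) flow to the tunnel endpoints that carry it."""
    return vm_to_hv[logical_src], vm_to_hv[logical_dst]


# Traffic from VM1 to VM2 appears on the wire as HV1-to-HV2 tunnel traffic.
print(tunnel_for("192.168.0.1", "192.168.0.2", VM_TO_HYPERVISOR))
```

This is why a flow observed between two virtual machines can be reported to the underlay in terms of hypervisor addresses.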
As long as there are only mice flows, nothing changes in the network:
vm1# while /bin/true; do wget http://192.168.0.2/file.1kb ; done
However, when we introduce an elephant:
vm1# wget http://192.168.0.2/file.10mb
Open vSwitch quickly detects the elephant flow and identifies it in its database. The following command reads the “elephant-flows” column from the database:
You’ll notice that the reported flow is using the hypervisor addresses. This is because Open vSwitch has identified the particular VXLAN tunnel that is carrying the elephant. A change to the “elephant-flows” column in the database triggers the NSX Elephant Agent to react and notify the HP SDN Controller.
The HP VAN SDN Controller, running an application called Converged Control, receives elephant notifications from the NSX Elephant Agent. It speaks to the physical switches under its control using standards-based SDN protocols, and it generates flows that give special handling to the elephants identified by NSX.
A number of actions are possible, such as:
- Install flow-specific rate meters to regulate elephant flows.
- Place elephant flows into different queues from mice. The different queues can be configured to be drained with different weights, effectively allocating a guaranteed share of the link to the mice.
- Route elephants differently from mice. For example, mice can be routed using standard hashing, while elephants can be adaptively routed.
- Send elephant flows along a separate physical network, such as an optical network, that is more suitable for slow-changing, bandwidth-intensive traffic.
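To see how queue weights translate into a guaranteed share, consider the arithmetic of a weighted scheduler under full load: each queue gets link capacity in proportion to its weight. The Python sketch below works through that calculation. The queue names, the weights, and the 10 Gb/s link speed are illustrative assumptions, not values from the demonstration.

```python
from typing import Dict


def guaranteed_shares(link_bps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Minimum bandwidth per queue when every queue is backlogged.

    Each queue's share is link_bps * weight / sum(weights).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: link_bps * w / total for name, w in weights.items()}


# Assumed example: mice weighted 3:1 over elephants on a 10 Gb/s link.
shares = guaranteed_shares(10_000_000_000, {"mice": 3, "elephants": 1})
print(shares)
```

With these assumed weights the mice are guaranteed three quarters of the link even when elephants are saturating their own queue. When the mice are idle, a work-conserving scheduler would let elephants use the spare capacity.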
In the HP-VMware technology preview, a number of mice flows were run. An elephant flow was then introduced, which brought the throughput down for the mice. When elephant flow detection was enabled in NSX, the elephant flow was quickly identified and the physical HP switch was configured to apply a rate meter to it, which brought the mice flow throughput back to the previous rates.
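For readers unfamiliar with rate meters, a token bucket is one common way to build one. The Python sketch below is a generic illustration of that technique and makes no claim about how the HP switches implement their meters. The class name, parameters, and numbers are all hypothetical.

```python
class TokenBucketMeter:
    """Generic token-bucket meter: packets pass only while tokens remain."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate_bytes_per_s = rate_bps / 8  # refill rate in bytes per second
        self.burst_bytes = burst_bytes        # bucket capacity
        self.tokens = float(burst_bytes)      # start with a full bucket
        self.last = 0.0                       # time of the previous packet

    def allow(self, now: float, packet_bytes: int) -> bool:
        """Refill for the elapsed time, then admit the packet if it fits."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # over the configured rate: drop (or mark) the packet


# Assumed numbers: 8 kb/s (1000 bytes/s) with a 1500-byte burst allowance.
meter = TokenBucketMeter(rate_bps=8000, burst_bytes=1500)
results = [
    meter.allow(0.0, 1500),  # fits in the initial burst
    meter.allow(0.0, 1500),  # bucket is empty
    meter.allow(1.0, 1000),  # one second of refill covers it
    meter.allow(1.0, 1),     # bucket is empty again
]
print(results)
```

Applying such a meter only to the identified elephant caps its share of the link, which is consistent with the mice recovering their earlier throughput in the demonstration.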
The demonstration of VMware’s and HP’s joint solution to elephant flow handling points the way to a number of interesting future possibilities. For example, HP’s IMC (Intelligent Management Center) platform could be used in combination with NSX to monitor and visualize both overlay tunnels and the physical infrastructure underpinning those tunnels.
The HP-VMware technology preview has shown the value of decoupling detection of elephant flows from the action taken on those flows. Having shown the viability of detection in the vswitch, we are developing alternative approaches to vswitch-based detection, which would not require tracking per-flow statistics. Similarly, we are also exploring other actions in the underlay beyond rate meters. By decoupling identification from remediation, we allow each part of the solution to evolve independently.
We encourage you to download our solution brief to read more about the joint HP-VMware solution.
Justin and Mark