Schuberg Philis Deploys VMware NSX

Summary

  • Application Roll Out Reduced from Weeks to Minutes
  • VMware NSX Enables Better Agility, Flexibility and Security

Recently I had the opportunity to speak with the team at Schuberg Philis about their successful production deployment of VMware NSX. As background, Schuberg Philis is an innovative business technology company and an important player in the field of mission critical outsourcing services. The company serves customers across financial services, retail suppliers and utilities, and therefore must comply with the highest international risk management and corporate governance standards, while remaining flexible to evolving customer needs.

The adoption of VMware NSX based network virtualization has transformed the way Schuberg Philis runs its IT. In order to provide 100 percent functional uptime of its customers' critical applications, Schuberg Philis continuously optimizes its infrastructure and processes. However, the company increasingly saw its network as a barrier to greater business agility.

To solve this challenge and accelerate application rollout, Schuberg Philis implemented a software-defined data center environment and deployed VMware NSX. Schuberg Philis is taking advantage of the VMware NSX platform's flexibility, security and agility to accelerate the deployment of applications to customers. Its customers now have easy access to the flexibility of the cloud within a certified, auditable environment that includes built-in controls and security.

Funs Kessen, cloud architect at Schuberg Philis, explained, "The process for spinning up new applications for customers used to take weeks to complete. Now we can do it in a little more than 18 minutes. This allows our customers to respond more quickly to business requirements and opportunities."

By fully automating the process, Kessen and his team can offer Schuberg Philis customers complete access to the flexibility of the cloud within a certified environment, complete with all controls and security built in, and fully auditable.

Kessen noted, "With VMware NSX in our software-defined data center, we can focus on applications, and not on the infrastructure."

Follow Schuberg Philis on Facebook, Twitter, YouTube and Google+

Roger

OVS Fall 2014 Conference: Observations and Takeaways

Last week we hosted the Open vSwitch 2014 Fall Conference, which was another great opportunity to demonstrate our continued investment in leading open source technologies. To get a sense of the energy and enthusiasm at the event, take a quick view of this video we captured with attendees.

I’ve been thinking about the key takeaways from everything I saw and everyone I spoke with.

First, there’s huge interest in Open vSwitch performance, both in terms of measurement and improvement. The talks from Rackspace and Noiro Networks/Cisco led me to believe that we’ve reached the point where Open vSwitch performance is good enough on hypervisors for most applications, and often faster than competing software solutions such as the Linux bridge.

Talks from Intel and from Luigi Rizzo at the University of Pisa demonstrated that, by bypassing the kernel entirely through DPDK or netmap respectively, we have not yet reached the limits of software forwarding performance. Based on a conversation I had with Chris Wright from Red Hat, this work is helping the Linux kernel community look into reducing the overhead of the kernel, so that we can see improved performance without losing the functionality the kernel provides.

Johann Tönsing from Netronome also presented a talk describing all the ways that Netronome’s NPU hardware can accelerate OpenFlow and Open vSwitch; I’ve talked to Johann many times before, but I had never realized how many different configurations their hardware supports, so this was an eye-opening talk for me.

Next, enhancing Open vSwitch capabilities at L4 through L7 is another exciting area. Our own Justin Pettit was joined by Thomas Graf from Noiro to talk about the ongoing project to add support for NAT and tracking L4 connections, which is key to making Open vSwitch capable of implementing high-quality firewalls. A later talk by Franck Baudin from Qosmos presented L7 enhancements to this capability.

The final area that I saw highlighted at the conference is existing applications for Open vSwitch today. Peter Phaal from InMon, for example, demonstrated applications for sFlow in Open vSwitch. I found his talk interesting because although I knew about sFlow and had talked to Peter before, I hadn't realized all of the varied uses for sFlow monitoring data. Vikram Dham also showed his uses for MPLS in Open vSwitch, and Radhika Hirannaiah presented her use case for OpenFlow and Open vSwitch in traffic engineering.

I want to thank all of our participants and the organizing committee for helping to put together such an amazing event.

Ben

State of the State for Open vSwitch

This week, VMware will be hosting the Open vSwitch 2014 Fall Conference, with more than 200 attendees and nearly two dozen talks on a variety of subjects from key participants.  The full schedule is available here, and we'll be doing a wrap-up of some of the takeaways from the conference a bit later.

For the uninitiated, Open vSwitch is a production quality, multilayer virtual switch licensed under the open source Apache 2.0 license.  It is designed to enable massive network automation through programmatic extension, while still supporting standard management interfaces and protocols (e.g. NetFlow, sFlow, IPFIX, RSPAN, CLI, LACP, 802.1ag).  In addition, it is designed to support distribution across multiple physical servers, similar to VMware's vDS or Cisco's Nexus 1000V. See the full feature list here.

For more information on OVS, I encourage you to check out the OVS website.

In the meantime, read about the latest Open vSwitch developments in this post on Network Heresy by OVS core contributors Justin Pettit, Ben Pfaff, and Ethan Jackson.

Accelerating Open vSwitch to “Ludicrous Speed”

Roger

Free Seminar – Advancing Security with the Software-Defined Data Center

We’re excited to take to the road for another edition of our VMware Software-Defined Data Center Seminar Series. Only this time, we’ll be joined by some great company.

VMware & Palo Alto Networks invite you to a complimentary, half-day educational event for IT professionals interested in learning how Palo Alto Networks and VMware are transforming data center security.

Thousands of IT professionals attended our first SDDC seminar series earlier this year in more than 20 cities around the globe. Visit VirtualizeYourNetwork.com to browse the presentations, videos, and other content we gathered.

This free seminar will highlight:

  • The Software-Defined Data Center approach
  • Lessons learned from real production customers
  • Using VMware NSX to deliver never-before-possible data center security and micro-segmentation

Who should attend?

People who will benefit from attending this session include:

  • IT, Infrastructure and Data Center Managers
  • Network professionals, including CCIEs
  • Security & Compliance professionals
  • IT Architects
  • Networking Managers and Administrators
  • Security Managers and Administrators

Agenda

  • 8:30 a.m. Registration & Breakfast
  • 9:00 a.m. VMware: Better Security with Micro-segmentation
  • 10:00 a.m. Palo Alto Networks: Next Generation Security Services for the SDDC
  • 11:00 a.m. NSX & Palo Alto Networks Integrated Solution Demo
  • 11:45 a.m. Seminar Wrap-up
  • 12:00 p.m. Hands-on Workshop
  • 1:30 p.m. Workshop Wrap-up

Check out the schedule and register. Space is limited.

Learn more at http://info.vmware.com/content/26338_nsx_series

Roger

Talking Tech Series: VMware NSX Edge Scale Out with Equal-Cost Multi-Path Routing

This post was written by Roie Ben Haim and Max Ardica, with a special thanks to Jerome Catrouillet, Michael Haines, Tiran Efrat and Ofir Nissim for their valuable input.

****

Modern data center design is changing, driven by shifting consumer habits on mobile devices, the number of new applications that appear every day, and exponential growth in end-user browsing. Planning a new data center requires meeting certain fundamental design guidelines. The principal goals in data center design are scalability, redundancy and high bandwidth.

In this blog we will describe the Equal Cost Multi-Path functionality (ECMP) introduced in VMware NSX release 6.1 and discuss how it addresses the requirements of scalability, redundancy and high bandwidth. ECMP has the potential to offer substantial increases in bandwidth by load-balancing traffic over multiple paths as well as providing fault tolerance for failed paths. This is a feature which is available on physical networks but we are now introducing this capability for virtual networking as well. ECMP uses a dynamic routing protocol to learn the next-hop towards a final destination and to converge in case of failures. For a great demo of how this works, you can start by watching this video, which walks you through these capabilities in VMware NSX.

Scalability, Redundancy and ECMP

To keep pace with the growing demand for bandwidth, the data center must meet scale out requirements, which provide the capability for a business or technology to accept increased volume without redesign of the overall infrastructure. The ultimate goal is avoiding the “rip and replace” of the existing physical infrastructure in order to keep up with the growing demands of the applications. Data centers running business critical applications need to achieve near 100 percent uptime. In order to achieve this goal, we need the ability to quickly recover from failures affecting the main core components. Recovery from catastrophic events needs to be transparent to end user experiences.

ECMP with VMware NSX 6.1 allows you to use up to a maximum of eight ECMP paths simultaneously. In a specific VMware NSX deployment, those scalability and resilience improvements are applied to the "on-ramp/off-ramp" routing function offered by the Edge Services Gateway (ESG) functional component, which allows communication between the logical networks and the external physical infrastructure.

ECMP Image 1

External users' traffic arriving from the physical core routers can use up to eight different paths (E1-E8) to reach the virtual servers (Web, App, DB).

In the same way, traffic returning from the virtual servers hits the Distributed Logical Router (DLR), which can choose up to eight different paths to get to the core network.

How is the path determined?

NSX for vSphere Edge Services Gateway device:

When a traffic flow needs to be routed, a round-robin algorithm picks one of the links as the path for all traffic of that flow; sending every packet of the flow down the same path keeps the packets in order. Once the next hop has been selected for a particular source IP and destination IP pair, it is stored in the route cache, and all subsequent packets of that flow follow the same path.

There is a default IPv4 route cache timeout of 300 seconds. If an entry is inactive for this period of time, it becomes eligible for removal from the route cache. Note that these settings can be tuned for your environment.

Distributed Logical Router (DLR):

The DLR chooses a path based on a hash of the source IP and destination IP.
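As a purely illustrative sketch of these two selection strategies (not the actual NSX implementation; the uplink addresses and the choice of MD5 as the hash function are made up for the example), the following Python snippet contrasts the ESG-style cached round-robin choice with the DLR-style hash of source and destination IP:

import hashlib
import itertools

next_hops = ["192.168.10.1", "192.168.10.2", "192.168.10.3"]  # hypothetical ESG uplinks

# ESG-style selection: round-robin for new flows, then cached per (src, dst) pair
_rr = itertools.cycle(next_hops)
_route_cache = {}

def esg_next_hop(src_ip, dst_ip):
    key = (src_ip, dst_ip)
    if key not in _route_cache:
        _route_cache[key] = next(_rr)   # new flow: take the next uplink in round-robin order
    return _route_cache[key]            # existing flow: reuse the cached path (idle expiry omitted)

# DLR-style selection: stateless hash of source and destination IP
def dlr_next_hop(src_ip, dst_ip):
    digest = hashlib.md5(f"{src_ip}-{dst_ip}".encode()).hexdigest()
    return next_hops[int(digest, 16) % len(next_hops)]

print(esg_next_hop("192.168.100.86", "172.16.10.10"))
print(dlr_next_hop("192.168.100.86", "172.16.10.10"))

Either way, every packet of a given flow is pinned to one path, which keeps packets in order while still spreading different flows across all equal-cost next hops.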

What happens in case of a failure on one of the Edge devices?

Working with ECMP requires a dynamic routing protocol: OSPF or BGP. If we take OSPF as an example, the main factor influencing the traffic outage experience is the tuning of the OSPF timers.

OSPF neighbors exchange Hello messages; the Hello interval determines how often an OSPF Hello is sent.

A second timer, the Dead interval, determines how long to wait before declaring an OSPF neighbor down, and it is the main factor influencing convergence time. The Dead interval is usually four times the Hello interval, but the OSPF (and BGP) timers can be set as low as 1 second for the Hello interval and 3 seconds for the Dead interval to speed up traffic recovery. With those values, a failed neighbor is detected within roughly 3 seconds.

ECMP Image 2

In the example above, the E1 NSX Edge has failed; the physical routers and the DLR detect E1 as dead when the Dead timer expires and remove their OSPF neighborship with it. As a consequence, the DLR and the physical router remove the routing table entries that originally pointed to the next-hop IP address of the failed ESG.

As a result, all corresponding flows on the affected path are re-hashed through the remaining active units. It’s important to emphasize that network traffic that was forwarded across the non-affected paths remains unaffected.

Troubleshooting and visibility

With ECMP it's important to have introspection and visibility tools in order to troubleshoot potential points of failure. Let's look at the following topology.

ECMP Image 3

A user outside our Data Center would like to access the Web Server service inside the Data Center. The user IP address is 192.168.100.86 and the web server IP address is 172.16.10.10.

This user traffic hits the physical router (R1), which has established OSPF adjacencies with E1 and E2 (the Edge devices). As a result, R1 learns how to get to the Web server from both E1 and E2 and ends up with two active paths towards 172.16.10.10. R1 picks one of the paths to forward the traffic to the Web server and advertises the user network subnet 192.168.100.0/24 to both E1 and E2 via OSPF.

E1 and E2 are NSX for vSphere Edge devices that also establish OSPF adjacencies with the DLR. E1 and E2 will learn how to get to the Web server via OSPF control plane communication with the DLR.

From the DLR perspective, it acts as a default gateway for the Web server. This DLR will form an OSPF adjacency with E1 and E2 and have 2 different OSPF routes to reach the user network.

From the DLR we can verify OSPF adjacency with E1, E2.

We can use the command: “show ip ospf neighbor”

 

ECMP Image 4

 

From this output we can see that the DLR has two Edge neighbors: 192.168.100.3 and 192.168.100.10. The next step will be to verify that ECMP is actually working.

We can use the command: “show ip route”

 

ECMP Image 5

 

The output from this command shows that the DLR learned the user network 192.168.100.0/24 via two different paths, one via E1 = 192.168.10.1 and the other via E2 = 192.168.10.10.

Now we want to display all the packets which were captured by an NSX for vSphere Edge interface.

In the example below, in order to display the traffic passing through interface vNic_1 that is not OSPF protocol control traffic, we need to type this command: "debug packet display interface vNic_1 not_ip_proto_ospf"

We can see an example with a ping running from host 192.168.100.86 to host 172.16.10.10.

 

ECMP Image 6

 

If we would like to display the captured traffic to a specific IP address, 172.16.10.10, the capture command would look like: "debug packet display interface vNic_1 dst_172.16.10.10"

 

ECMP Image 7

 

Useful CLI for Debugging ECMP

  • To check which ECMP path is chosen for a flow
    • debug packet display interface IFNAME
  • To check the ECMP configuration
    • show configuration routing-global
  • To check the routing table
    • show ip route
  • To check the forwarding table
    • show ip forwarding

Useful CLI for Dynamic Routing

  • show ip ospf neighbor
  • show ip ospf database
  • show ip ospf interface
  • show ip bgp neighbors
  • show ip bgp

ECMP Deployment Consideration

ECMP currently implies stateless behavior. This means that there is no support for stateful services such as the Firewall, Load Balancing or NAT on the NSX Edge Services Gateway. The Edge Firewall is automatically disabled on the ESG when ECMP is enabled: in the current NSX 6.1 release, the Edge Firewall and ECMP cannot be turned on at the same time on an NSX Edge device. Note, however, that the Distributed Firewall (DFW) is unaffected by this.

For more in-depth information, you can also read our VMware® NSX for vSphere (NSX-V) Network Virtualization Design Guide.

About The Authors

Roie Ben Haim works as a professional services consultant at VMware, focusing on design and implementation of VMware's software-defined data center products. Roie has more than 12 years of experience in data center architecture, with a focus on network and security solutions for global enterprises. An enthusiastic M.Sc. graduate, Roie holds a wide range of industry-leading certifications, including Cisco CCIE x2 #22755 (Data Center and Security), Juniper Networks JNCIE – Service Provider #849, VMware vExpert 2014, VCP-NV and VCP-DCV. Follow his personal blog at http://roie9876.wordpress.com/

Max Ardica is a senior technical product manager in VMware's networking and security business unit (NSBU). Certified as VCDX #171, his primary task is helping to drive the evolution of the VMware NSX platform, building the VMware NSX architecture and providing validated design guidance for the software-defined data center, specifically focusing on network virtualization. Prior to joining VMware, Max worked for almost 15 years at Cisco, covering different roles from software development to product management. Max also holds a CCIE certification (#13808).

Automating a Multi-Action Security Workflow with VMware NSX

This post was written by VMware's John Dias (VCP-DCV), Sr. Systems Engineer, Cloud Management Solutions Engineering Team, and Hadar Freehling, Security & Compliance Systems Engineer Specialist.

***

Through a joint effort with Hadar Freehling, one of my esteemed peers here at VMware, we co-developed a proof-of-concept workflow for a network security use case.  Hadar created a short video showing and explaining the use case, but in summary this is a workflow that reacts to and remediates a security issue flagged by third-party integration with VMware NSX. In the video, TrendMicro is used but it could be any other partner integration with vShield Endpoint.

Here’s what happens:

  • A virus is detected on a VM and is quarantined by the AV solution
  • The AV solution tags the VM with an NSX security tag
  • VMware NSX places the VM in a new Security Group, whose network policies steer all VM traffic through an intrusion prevention system (IPS)
  • vCenter Orchestrator (vCO) monitors the security group for changes and when a VM is added
    • a snapshot of the VM is taken for forensic purposes
    • a vSpan session (RSPAN) is set up on the Distributed Virtual Switch to begin capturing inbound/outbound traffic on the VM
    • once the VM has been removed from the security group, the vSpan session is removed

Watch the video below for a walk-through by Hadar:

You will note that a portion of the workflow is handled natively by VMware NSX (the Security Tag reaction and Security Group policy), while the snapshot and RSPAN are done via the vCO workflow.
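The real automation lives in the vCO package, but as a rough illustration of what the forensic snapshot step boils down to, here is a minimal pyVmomi sketch; the vCenter address, credentials and VM name are placeholders, and error handling is omitted:

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import ssl

context = ssl._create_unverified_context()             # lab only: skip certificate verification
si = SmartConnect(host="vcenter.example.com",           # placeholder vCenter and credentials
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.VirtualMachine], True)
    # Placeholder VM name; in the real workflow the VM comes from the security group event
    vm = next(v for v in view.view if v.name == "quarantined-vm-01")
    view.Destroy()
    # Forensic snapshot: no memory dump, no quiescing
    vm.CreateSnapshot_Task(name="forensics",
                           description="Taken after the NSX security tag was applied",
                           memory=False, quiesce=False)
finally:
    Disconnect(si)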

If you are interested in exploring this capability, I have provided the vCO workflow package for download. This is provided as-is and you should fully test it (and modify as needed) before using in your environment.

Assuming you have VMware NSX, vShield Endpoint and some third party integration already set up, you will need the following:

  • vCO 5.5.2
  • The NSX plugin for vCO (installed and configured)
  • The REST plugin with your NSX manager added as a REST host
  • vCenter plugin configured

The workflow package includes a good number of “helper” workflows which you will not need to run directly. The master workflow is in the root folder Security Reaction and is named “Set up VM Forensics RUN THIS” (just in case you had any doubt as to which one to run).

The Security Reaction Master Workflow

Running the master workflow will prompt you for three items:

  • The NSX Security Group to monitor – This is why the NSX plugin is required, so that you can browse the vCO managed objects and locate the desired Security Group.
  • A time to sleep in seconds – The master workflow will run continuously until manually stopped and will use a REST call to NSX to get the current membership of the Security Group (a rough sketch of this polling loop follows the list). We have no recommendation on this poll time, although in testing we used 5-10 seconds. It would have been better to use some external event to kick off the vCO workflow, but we could not find a way to do this from NSX. It may be possible to do via the partner solution, but we wanted this workflow package to be "partner neutral."
  • Destination IPv4 address – This is the destination for the RSPAN (or vSpan session in vSphere API terms).  The vSpan session is created with some defaults (for example sampling rate, normal traffic allowed, etc).  If you want to change any of those properties, you will need to modify the Helper workflow named “Configure encapRemoteMirrorSource vSpan Session on DVS” (modify the “Create Port Mirror” script task).
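As mentioned in the poll-time item above, here is a rough sketch of that polling loop written in Python rather than vCO. It assumes the NSX-v REST endpoint that returns a security group's virtual machine translation (/api/2.0/services/securitygroup/{objectId}/translation/virtualmachines); the manager address, credentials, group ID and the XML element names in the parser are all placeholders or assumptions to verify against your NSX version:

import time
import xml.etree.ElementTree as ET
import requests

NSX_MANAGER = "https://nsx-manager.example.com"   # placeholder address
SECURITY_GROUP_ID = "securitygroup-10"            # placeholder objectId of the monitored group
AUTH = ("admin", "password")                      # placeholder credentials
POLL_SECONDS = 10                                 # the "time to sleep" input discussed above

def current_members():
    # Assumed NSX-v endpoint returning the VMs currently translated into the security group;
    # element names in the parsing below are assumptions, check your NSX version's response.
    url = (f"{NSX_MANAGER}/api/2.0/services/securitygroup/"
           f"{SECURITY_GROUP_ID}/translation/virtualmachines")
    resp = requests.get(url, auth=AUTH, verify=False)   # lab only: certificate check disabled
    resp.raise_for_status()
    root = ET.fromstring(resp.text)
    return {node.findtext("vmId") for node in root.iter("vmnode")}

known = current_members()
while True:
    time.sleep(POLL_SECONDS)
    members = current_members()
    for vm_id in members - known:
        # This is where the real workflow takes the forensic snapshot and sets up the RSPAN
        print(f"VM {vm_id} joined the security group")
    known = members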

Also note that this workflow doesn’t support VMs with multiple vNICs. Specifically, it will only create an RSPAN that includes the first vNIC found on a VM.  You can modify the Helper workflow “Implement Forensics” and adjust the script task “Prep for Mirror Creation” so that the additional NICs (if any) are added to the sourcePorts array. It’s something we intended to fix but forgot about until after our final testing and video production – so as they say in the textbooks “this is left as an exercise for the reader.”

Of course, there are many other actions that can be taken besides setting up an RSPAN and getting a snapshot. This solution can be extended to practically any task required during such an event such as creating a ticket in your service desk software, spinning up additional workloads to replace the compromised VM, sending emails, guest OS file system operations…all of these and more can be accomplished using vCO in conjunction with NSX.

 

Using Differentiated Services to Tame Elephants

This post was co-authored by Justin Pettit, Staff Engineer, Networking & Security Business Unit at VMware, and Ravi Shekhar, Distinguished Engineer, S3BU at Juniper Networks.

********************

As discussed in other blog posts and presentations, long-lived, high-bandwidth flows (elephants) can negatively affect short-lived flows (mice). Elephant flows send more data, which can lead to queuing delays for latency-sensitive mice.

VMware demonstrated the ability to use a central controller to manage all the forwarding elements in the underlay when elephant flows are detected.  In environments that do not have an SDN-controlled fabric, an alternate approach is needed.  Ideally, the edge can identify elephants in such a way that the fabric can use existing mechanisms to treat mice and elephants differently.

Differentiated services (diffserv) were introduced to bring scalable service discrimination to IP traffic. This is done using Differentiated Services Code Point (DSCP) bits in the IP header to signal different classes of service (CoS). There is wide support in network fabrics to treat traffic differently based on the DSCP value.

A modified version of Open vSwitch allows us to identify elephant flows and mark the DSCP value of the outer IP header.  The fabric is then configured to handle packets with the “elephant” DSCP value differently from the mice.
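Open vSwitch performs this marking on the outer header of the tunnel in its datapath, but to make the encoding itself concrete, here is a small Python sketch that sets a DSCP value on an ordinary UDP socket via the IP TOS byte on Linux. The DSCP value, destination address and port are arbitrary examples, not anything prescribed by OVS or the fabric:

import socket

ELEPHANT_DSCP = 8                            # example code point; use whatever your fabric matches on
tos = ELEPHANT_DSCP << 2                     # DSCP is the upper 6 bits of the former TOS byte; low 2 bits are ECN

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)   # Linux: outgoing IPv4 packets now carry this DSCP
sock.sendto(b"sample payload", ("198.51.100.10", 4789))  # placeholder destination; 4789 is the VXLAN port

The fabric then only needs to match that code point and place such packets in a different queue or policy.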

Figure 1: Elephants are detected at the edge of the network and signaled to the fabric through DSCP. Based on these code points, the fabric can treat elephant traffic differently from mice.

Detecting and Marking Elephants with Open vSwitch

Open vSwitch’s location at the edge of the network gives it visibility into every packet in and out of each guest.  As such, the vSwitch is in the ideal location to make per-flow decisions such as elephant flow detection. Because environments are different, our approach provides multiple detection mechanisms and actions so that they can be used and evolve independently.

An obvious approach to detection is to just keep track of how many bytes each flow has generated.  By this definition, if a flow has sent a large amount of data, it is an elephant. In Open vSwitch, the number of bytes and an optional duration can be configured. By using a duration, we can ensure that we don’t classify very short-lived flows as elephants. We can also avoid identifying low-bandwidth but long-lived flows as elephants.
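As a toy model of this byte-and-duration policy (the thresholds below are arbitrary, and the real logic lives in the modified Open vSwitch datapath, not in Python), the bookkeeping looks roughly like this:

import time

BYTE_THRESHOLD = 10 * 1024 * 1024   # example: 10 MB of payload before a flow can be an elephant
MIN_DURATION = 2.0                  # example: the flow must also have existed for at least 2 seconds

flows = {}                          # flow key -> [bytes_seen, first_seen]

def is_elephant(flow_key, packet_len):
    """Update the per-flow byte counter and report whether the flow now qualifies as an elephant."""
    entry = flows.setdefault(flow_key, [0, time.monotonic()])
    entry[0] += packet_len
    bytes_seen, first_seen = entry
    # A flow is tagged only after it has both sent enough bytes and existed long enough,
    # so very short-lived flows are never classified as elephants.
    return bytes_seen >= BYTE_THRESHOLD and (time.monotonic() - first_seen) >= MIN_DURATION

if is_elephant(("10.0.0.1", "10.0.0.2", 49152, 5001, "tcp"), 1500):
    print("mark this flow with the elephant DSCP")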

An alternate approach looks at the size of the packet that is being given to the NIC.  Most NICs today support TCP Segmentation Offload (TSO), which allows the transmitter (e.g., the guest) to give the NIC TCP segments up to 64KB, which the NIC chops into MSS-sized packets to be placed on the wire.

Because of TCP’s slow start, the transmitter does not immediately begin sending maximum-sized packets to the NIC.  Due to our unique location, we can see the TCP window as it opens, and tag elephants earlier and more definitively. This is not possible at the top-of-rack (TOR) or anywhere else in the fabric, since they only see the segmented version of the traffic.

Open vSwitch may be configured to track all flows with packets of a specified size. For example, by looking for only packets larger than 32KB (which is much larger than jumbo frames), we know the transmitter is out of slow-start and making use of TSO. There is also an optional count, which will trigger when the configured number of packets with the specified size is seen.

Some new networking hardware provides some elephant flow mitigation by giving higher priority to small flows. This is achieved by tracking all flows and placing new flows in a special high-priority queue. When the number of packets in the flow has crossed a threshold, the flow’s packets from then on are placed into the standard priority queue.

This same effect can be achieved using the modified Open vSwitch and a standard fabric.  For example, by choosing a packet size of zero and threshold of ten packets, each flow will be tracked in a hash table in the kernel and tagged with the configured DSCP value when that flow has generated ten packets.  Whether mice are given a high priority or elephants are given a low priority, the same effect is achieved without the need to replace the entire fabric.
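A similarly simplified sketch of the size-and-count variant, again purely illustrative with made-up thresholds rather than the in-kernel hash table OVS actually uses, would be:

PKT_SIZE_THRESHOLD = 32 * 1024   # example: count only TSO segments larger than 32 KB
PKT_COUNT_THRESHOLD = 10         # example: tag the flow once ten such segments have been seen

counters = {}                    # flow key -> number of qualifying packets seen so far

def should_tag(flow_key, packet_len):
    """Return True once a flow has handed enough large (TSO-sized) segments to the NIC."""
    if packet_len < PKT_SIZE_THRESHOLD:
        return False
    counters[flow_key] = counters.get(flow_key, 0) + 1
    return counters[flow_key] >= PKT_COUNT_THRESHOLD

With PKT_SIZE_THRESHOLD set to zero, this reduces to the "count every flow's first ten packets" policy described above.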

Handling Elephants with Juniper Devices

Juniper TOR devices (such as QFX5100) and aggregation devices (such as MX, EX9200) provide a rich diffserv CoS model to achieve these goals in the underlay. These capabilities include:

  • Elaborate controls for packet admittance with dedicated and shared limits. Dedicated limits provide a minimum service guarantee, and shared limits allow statistical sharing of buffers across different ports and priorities.
  • A large number of flexibly assigned queues; up to 2960 unicast queues at the TOR and 512K at the aggregation device.
  • Enhanced and varied scheduling methods to drain these queues: strict and round-robin scheduling with up to 4-levels of hierarchical schedulers.
  • Shaping and metering to control the rate of injection of traffic from different queues of a TOR in the underlay network. By doing this, bursty traffic at the edge of the physical network can be leveled out before it reaches the more centrally shared aggregation devices.
  • Sophisticated controls to detect and notify congestion, and set drop thresholds. These mechanisms detect possible congestion in the network sooner and notify the source to slow down (e.g. using ECN).

With this level of flexibility, it is possible to configure these devices to:

  • Enforce minimum bandwidth allocation for mice flows and/or maximum bandwidth allocation for elephant flows on a shared link.
  • When experiencing congestion, drop (or ECN-mark) packets of elephant flows more aggressively than those of mice flows. This causes the TCP connections of elephant flows to back off sooner, which alleviates congestion in the network.
  • Take a different forwarding path for elephant flows than for mice flows. For example, a TOR can forward elephant flows towards aggregation switches with big buffers and spread mice flows across multiple aggregation switches that support low-latency forwarding.

Conclusion

By inserting some intelligence at the edge and using diffserv, network operators can use their existing fabric to differentiate between elephant flows and mice. Most networking gear provides some capabilities, and Juniper, in particular, provides a rich set of operations that can be used based on the DSCP.  Thus, it is possible to reduce the impact of heavy hitters without the need to replace hardware. Decoupling detection from mitigation allows each to evolve independently without requiring wholesale hardware upgrades.

 

VMworld 2014 Networking and Security Session Guide

At last year’s show, we introduced you to VMware NSX, and presented a vision for how network virtualization will fundamentally change data center networking. We focused a lot on what NSX is, what it does, and why you should start planning to virtualize your network.

This year, we’re still focused on the basics. We have a lot of content that will help those of you who are new to network virtualization and NSX start to establish a base. But of course, we have a whole year of selling NSX under our belt. And we want to share that experience with you in a VMworld program that will take you, and NSX, to the next level.

Security and network micro-segmentation?  We’ve got it covered.  Customer deployment stories? You bet. Partners with real GA solutions, solving real-world problems? They are on the agenda.

Take a pass through the list below, and then check out the schedule builder on VMworld.com to organize your week.

We think the #NSXninjas will be out in full force at VMworld. Are you one?  We hope so!

Monday August 25, 2014

Networking Sessions

  • NET1846: Introduction to NSX (11:00 – 12:00 PM)
  • NET1214: NSX Certification – the Next Step in Your Networking Career (11:30 – 12:30 PM)
  • NET3441-GD: vSphere Distributed Switch (12:30 – 1:30 PM)
  • NET2747: VMware NSX: Software Defined Networking in the Real World (1:00 – 2:00 PM)
  • NET1743: VMware NSX – A Technical Deep Dive (2:00 – 3:00 PM)
  • NET1786: The Business Case for Network Virtualization (2:30 – 3:30 PM)
  • NET1949: VMware NSX for Docker, Containers & Mesos (3:30 – 4:30 PM)
  • NET3305-S: Virtualize your Network with VMware NSX (3:30 – 4:30 PM)
  • NET3442-GD: vCAC and NSX (4:00 – 5:00 PM)
  • NET1745: The Case for Network Virtualization: Customer Case Study (5:30 – 6:30 PM)
  • NET1957: NFV for Telco Infrastructure (5:30 – 6:30 PM)

Security Sessions

  • SEC1196: Who Can You Trust? Strategies and Designs for Implementing a Zero Trust Model Leveraging NSX (12:30 – 1:30 PM)
  • SEC2238: Security and Microsegmentation for the Software Defined Data Center (5:00 – 6:00 PM)

Tuesday August 26, 2014

Networking Sessions

  • NET1589: Reference Design for SDDC with NSX & vSphere (11:00 – 12:00 PM)
  • NET1468: A Tale of Two Perspectives: IT Operations with VMware NSX (11:30 – 12:30 PM)
  • NET1583: NSX for vSphere Logical Routing Deep Dive (12:30 – 1:30 PM)
  • NET3445-GD: NSX Multi-site Deployments (12:30 – 1:30 PM)
  • NET1586: Advanced Network Services with NSX (1:00 – 2:00 PM)
  • NET1560: The NSX Guide to Horizon View (2:00 – 3:00 PM)
  • NET1974: Multi-Site Data Center Solutions with VMware NSX (2:00 – 3:00 PM)
  • NET3444-GD: NSX Network Services (2:00 – 3:00 PM)
  • NET1743: VMware NSX – A Technical Deep Dive (2:30 – 3:30 PM)
  • NET1674: Advanced Topics & Future Directions in Network Virtualization with NSX (3:30 – 4:30 PM)
  • NET1883: NSX Performance Overview (5:00 – 6:00 PM)
  • NET1966: Operational Best Practices for VMware NSX (5:00 – 6:00 PM)
  • NET3443-GD: NSX Routing Design Best Practices (5:00 – 6:00 PM)
  • NET1588: Load Balancer as a Service using NSX or Partner Solutions (5:30 – 6:30 PM)

Security Sessions

  • SEC1959-S: The Goldilocks Zone for Security (11:00 – 12:00 PM)
  • SEC1746: NSX Distributed Firewall Deep Dive (11:30 – 12:30 PM)
  • SEC1958: Automating Security Policy Enforcement With VMware NSX (12:30 – 1:30 PM)
  • SEC1977: A Customer Perspective: VMware NSX and Next-Generation Security (2:30 – 3:30 PM)
  • SEC1698: Optimize Security with Context and Isolation Using NSX Guest Introspection (4:00 – 5:00 PM)

Other Sessions

  • MGT2385: McKesson One Cloud – The One Cloud to Rule Them All (4:00 – 5:00 PM)
  • MGT1878: Deep Dive into How vCenter Operations Simplifies NSX Operations (2:30 – 3:30 PM)
  • TEX2211: VMware NSX and Riverbed Cascade Solution – Monitoring Network and Application Performance in NSX environment (3:30 – 4:30 PM)

Wednesday August 27, 2014

Networking Sessions

  • NET1401: vSphere Distributed Switch Best Practices for NSX (8:00 – 9:00 AM)
  • NET1861: Automating Networking and Security Services with NSX for vSphere and vCenter Orchestrator (vCO) (8:00 – 9:00 AM)
  • NET1581: Reference Design for SDDC with NSX for Multi-Hypervisors (NSX-MH) (9:30 – 10:30 AM)
  • NET2318: Scale-Out NSX Deployments: With VMware-powered SDDC Converged Infrastructure Solution (9:30 – 10:30 AM)
  • NET2379: Dynamically Configuring Application Specific Network Services with vCAC and NSX (9:30 – 10:30 AM)
  • NET3448-GD: NSX Platform Extensibility (9:30 – 10:30 AM)
  • NET2745: vSphere Distributed Switch: Technical Deep Dive (11:00 – 12:00 PM)
  • NET2225: NSX Platform: Enabling Third Party Network and Security Solutions (11:30 – 12:30 PM)
  • NET1846: Introduction to NSX (1:00 – 2:00 PM)
  • NET1592: Under the Hood: Network Virtualization with OpenStack Neutron and VMware NSX (2:00 – 3:00 PM)

Security Sessions

  • SEC3446-GD: Security and Micro-segmentation (11:30 – 12:30 PM)
  • SEC2567: Unleashing Collaborative Security with VMware NSX – Advanced Defense for Advanced Threats (12:30 – 1:30 PM)
  • SEC2421: VMware NSX Security Operations Best Practices (2:30 – 3:30 PM)
  • SEC1958: Automating Security Policy Enforcement with VMware NSX (3:30 – 4:30 PM)

Thursday August 28, 2014

Networking Sessions

  • NET2118: NSX PCI Reference Architecture – Policy, Audit and Remediation (10:30 – 11:30 AM)
  • NET1589: Reference Design for SDDC with NSX & vSphere (1:30 – 2:30 PM)

Security Sessions

  • SEC2238: Security and Microsegmentation for the Software Defined Data Center (10:30 – 11:30 AM)
  • SEC1196: Who Can You Trust? Strategies and Designs for Implementing a Zero Trust Model Leveraging NSX (12:00 – 1:00 PM)
  • SEC3449-GD: Security Policy using NSX Service Composer (12:00 – 1:00 PM)

 

VMware NSX Customer Story: Colt Decreases Data Center Networking Complexity

Adoption of network virtualization and SDN technologies from VMware and Arista Networks simplifies cloud infrastructure and enables automation to reduce timescales of cloud and network service provisioning


Offering the largest enterprise-class cloud footprint in Europe, Colt, an established leader in delivering integrated network, data center, voice and IT services, has implemented software-defined networking [SDN] and network virtualization to simplify how its managed IT and cloud-based networking environment is deployed, managed and scaled throughout its data centers.

Following an extensive review, Colt selected Arista to provide high speed 10 and 40 gigabit Ethernet cloud-centric switches as an underlay network fabric and VMware NSX™ network virtualization to deliver a fully decoupled software network overlay.

SDN paves the way for automated cloud service delivery

The shift to SDN will provide a flexible, scalable, efficient and cost-effective way to support the delivery of Colt's managed IT services, including cloud-based services. This makes Colt one of the first service providers in Europe to adopt SDN in a production environment to automate cloud service delivery.

As a result of deploying a new network architecture based on Arista and VMware networking technologies, adding, changing or modifying services will now take Colt minutes rather than days, enabling it to onboard customers faster and expand its service portfolio more quickly.

The big transformation in IT in recent years has been the development of cloud services with IT capacity purchased on demand. In contrast, networking has remained relatively static. The adoption of server virtualization over the past decade as the foundation for cloud computing and IT-as-a-service has resulted in a completely new operational model for provisioning and managing application workloads. However, the operating model of the network to which dynamic virtualized services are connected has not evolved to help businesses achieve the full benefits of mobile-cloud.

Mirko Voltolini, VP Technology and Architecture at Colt says, “The excitement around SDN and network virtualization is that – for the first time – networking is becoming more software orientated so we’re able to dynamically orchestrate service modification and activation in real-time.  In other words, network connectivity can now keep up when virtual machines and associated compute and storage change or are moved within distributed data centers.  Ultimately this means that servers, storage and now the network are in synch so that we can meet the specific needs of our customers in a timescale they demand.”

With more than 25,000 customers worldwide, Colt offers an information delivery platform comprising network, voice, data center and IT services sold directly to its enterprise customers or indirectly via channel partners and operators. In Europe, it has invested significantly to create a pan European network spanning 22 countries, 195 connected cities, around 19,800 buildings, as well as operating 42 metropolitan area networks.

Turning to the specialist technology firms has really delivered

Colt first considered adopting network overlay technology three years ago. It went out to tender, approaching only large, mainstream technology suppliers, and was disappointed by the responses received: the cost was too great and the solutions were not mature enough to warrant changing. Eighteen months ago, it revisited the process given how the technology had evolved, expanding the shortlist of suppliers asked to provide proposals to include specialist firms like Arista and VMware.

VMware NSX enables Colt to decouple the data center network from the underlying physical hardware to gain massive scale while simplifying network design and operations. With NSX, Colt is able to consolidate operations for four disparate physical networks running in the data center and manage these networks as a single logical network. Colt has developed a new data center architecture that leverages the scalability of a Layer 3 data center fabric and NSX’s overlay network virtualization platform.

Chris King, vice president, product marketing, networking and security business unit at VMware, said,  “Colt is an all too common story of an organization that simply hit the limits of what the physical network could provide in a virtualized world. VLAN limitations prevented Colt’s ability to scale. They needed to simplify the physical infrastructure in order to gain flexibility which in turn would allow them to adapt quickly to the business environment. VMware NSX helped Colt successfully execute a data center re-architecture which can now operate at cloud scale with better performance, easier management and lower overall costs.”

In addition to wanting to capitalize on all the benefits offered by network overlays, the requirement for a new switch supplier was driven by Colt's need to replace its existing legacy switches, which had reached end of life and were no longer supported. Furthermore, the business wanted to reduce the total cost of ownership [TCO] of its networking equipment.

Voltolini explains, “Our target was to reduce the unit cost of our switches which includes the cost per port, along with maintenance, power, space and so on.  We wanted a step change in TCO which we will now achieve working with Arista.”

VXLAN addresses the limitations of Spanning Tree

From a technical perspective, Colt also wanted to move away from legacy protocols like the Spanning Tree Protocol, which requires ports to be available – but not used – to deliver service availability. This underutilizes switch assets and adds unnecessary cost to its operation. Moreover, Colt required new switches that could scale to support increased connectivity, both in terms of the number of ports [so that more customers can be connected] and in terms of logical scale.

Voltolini says, “The new VXLAN protocol removes traditional Ethernet limitations which is crucial for a service provider so that we can handle multiple tenants per port across numerous physical locations.”

Ultimately Arista switches will be installed in all Colt data center locations, the roll out of which will be driven by service and capacity demands.  The expectation is that this will happen over the next 18 to 24 months.  Deployment is made straightforward as all Arista switches – irrespective of port count or speed – feature the same network operating system, the Arista EOS.

Mark Foss, VP Global Operations and Marketing, concludes, “It is important to stress that this project is one of collaboration.  Being an innovative nimble company, we’re accommodating Colt’s requirements and helping shape their service design, while they’re guiding us in terms of our future product roadmap so we develop features pertinent to all cloud service providers.”

VMware NSX Use Case – Simplifying Disaster Recovery (Part 2)

Nicolas Vermandé (VCDX#055) is practice lead for Private Cloud & Infrastructure at Kelway, a VMware partner. Nicolas covers the Software-Defined Data Center on his blog, www.my-sddc.om.

This is Part 2 in a series of posts that describes a specific use case for VMware NSX in the context of Disaster Recovery. Here's Part 1.

++++++++++++++++++++++++++++++++++

Deploying the environment

Now let's have a closer look at how to create this environment. The following picture represents the vSphere logical architecture and the associated IP scheme…

ipNSX

 

… and the networks mapping:

logicalnetNSX

First of all, you have to create three vSphere clusters: one Management Cluster and two Compute Clusters, as well as two distinct VDS, within the same vCenter. Each Compute Cluster will be connected to the same VDS; one cluster will represent DC1, and the other will represent DC2. The second VDS will connect to the Management and vMotion networks. You also have to create a few VLANs: one VLAN for VTEPs, used as the outer dot1q tag to transport VXLAN frames; two external transit VLANs to allow the ESGs to peer with your IP core; and VLANs for traditional vSphere functions such as Management, vMotion and IP storage, if required.

Note: As this lab has been created for educational purposes, it is clearly not aligned with NSX design considerations for a production environment. I'll probably dedicate another blog post to that.

Now let's get our hands dirty. I assume that you already have the NSX Manager deployed, as well as three controllers. All these virtual appliances should be placed in the Management Cluster and connected to the Management VDS. For the sake of simplicity you can use the same Management VLAN for both ESXi and NSX component management.

The first step after deploying the controllers is to install the ESXi VIBs: go to the NSX vCenter plugin, then, under Installation, select the Host Preparation tab. Select your Compute Clusters and click Install.

01

Once done, click Configure under the VXLAN section to configure the VXLAN networking:

02

The VLAN field is the outer VLAN ID for your VXLAN overlay. Create a new IP pool named VTEP and use it as the reference pool for your VTEP configuration. Note that if you select "Load Balance – SRCID" or "Load Balance – SRCMAC" as the teaming policy, two VTEPs will be created within the same IP pool, which means that if you want your VTEPs to reside in two different subnets, you have to use a DHCP server. Another thing I noticed: be sure to create the appropriate number of VDS uplinks BEFORE preparing the hosts, or the NSX Manager may not see the right number of uplinks when you want to deploy multiple VTEPs.

03
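If you prefer scripting to clicking, the VTEP pool can also be created through the NSX-v REST API. The sketch below assumes the documented IP pool endpoint (/api/2.0/services/ipam/pools/scope/globalroot-0); the manager address, credentials and the addresses in the pool are placeholders, so substitute your own VTEP subnet and verify the XML against the API guide for your NSX version:

import requests

NSX_MANAGER = "https://nsx-manager.example.com"   # placeholder
AUTH = ("admin", "password")                       # placeholder credentials

# Assumed NSX-v IP pool schema, trimmed to the fields used here; addresses are placeholders
pool_xml = """
<ipamAddressPool>
  <name>VTEP</name>
  <prefixLength>24</prefixLength>
  <gateway>192.168.50.1</gateway>
  <ipRanges>
    <ipRangeDto>
      <startAddress>192.168.50.10</startAddress>
      <endAddress>192.168.50.50</endAddress>
    </ipRangeDto>
  </ipRanges>
</ipamAddressPool>
"""

resp = requests.post(f"{NSX_MANAGER}/api/2.0/services/ipam/pools/scope/globalroot-0",
                     data=pool_xml,
                     headers={"Content-Type": "application/xml"},
                     auth=AUTH, verify=False)       # lab only: certificate check disabled
resp.raise_for_status()
print("Created IP pool:", resp.text)                # NSX returns the new pool's object id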

The next step is to configure the Segment ID range, which represents your pool of available VNIs. As we will be using unicast transport mode, we don't need to configure any multicast group.

04

Then you can go under Logical Network Preparation > Transport Zones. Add two Transport Zones, as we’ll be simulating two distinct datacenters. Select Unicast as the Control Plane Mode.

05

Each simulated datacenter should end up with its own transport zone, as shown below:

27

Now it’s time to create the Logical Switches. In the Network & Security pane, go to Logical Switches. In the right pane click the “+” icon. Give it a name, and select the first Transport Zone.

07

Create a second Logical Switch, linked to the second Transport Zone. As both Logical Switches are in two different Transport Zones, they will be completely isolated, without any possibility to connect them to the same Logical Router.

08

For the sake of completeness and to match the initial design, you can create a second Logical Switch in each datacenter. The additional Logical Switches to create are those connecting the Logical Routers to the upstream Edge Gateway. Name those Logical Switches dc1_transit and dc2_transit.

09
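For reference, the same Logical Switch creation can be scripted against the NSX-v API. This is only a sketch: the manager address and credentials are placeholders, and vdnscope-1 stands in for whichever scope ID your DC1 Transport Zone received (you can list scopes with a GET on /api/2.0/vdn/scopes first):

import requests

NSX_MANAGER = "https://nsx-manager.example.com"   # placeholder
AUTH = ("admin", "password")                       # placeholder credentials
SCOPE_ID = "vdnscope-1"                            # placeholder: the DC1 Transport Zone's scope id

# The new switch inherits the transport zone's (unicast) control plane mode
switch_xml = """
<virtualWireCreateSpec>
  <name>dc1_transit</name>
  <description>Transit between the DC1 DLR and ESG</description>
  <tenantId>lab</tenantId>
</virtualWireCreateSpec>
"""

resp = requests.post(f"{NSX_MANAGER}/api/2.0/vdn/scopes/{SCOPE_ID}/virtualwires",
                     data=switch_xml,
                     headers={"Content-Type": "application/xml"},
                     auth=AUTH, verify=False)       # lab only: certificate check disabled
resp.raise_for_status()
print("Created logical switch:", resp.text)         # response body is the new virtualwire id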

The next components to be deployed are the Logical Routers. In the Networking & Security pane, go to NSX Edges. In the right pane, click the "+" icon. Select Logical Router; you can enable HA if you wish (I know it's kind of weird to configure the DLR under the NSX Edge menu…).

10

Configure the credentials and, at the "Configure deployment" step, click the "+" icon under NSX Edge Appliances. Select your first datacenter cluster and the appropriate Datastore, Host and Folder.

11

Then configure the management interface by clicking Select, next to "Connected To". You should select a Distributed Portgroup, not a Logical Switch.

Next, go under Configure interfaces of this NSX Edge and click the "+" icon. Give the interface a name and select Internal as the interface type. Connect the interface to the first Logical Switch in DC1 and configure its IP address and subnet mask. Repeat the steps to connect a second internal interface to dc1_ls02.

14

As you can imagine, the Uplink interface type will be used to connect the Logical Router interface to the dc1_transit Logical Switch. Add this interface and configure its IP address and subnet mask. It is worth noting that, in the case of an internal LIF, the IP address given must be the default gateway for the VMs belonging to that particular Logical Switch.

Here is a screenshot of what you should have as the final configuration:

15

You can then click Next, Next, Finish. Repeat the same operations to create a second Logical Router, but this time in the second datacenter. The Cluster/Resource Pool parameter must be set to dc2 so you have access to the NSX components available in that specific Transport Zone. Here is a screenshot of what you should have in the end:

16

The last components to be deployed are the NSX Edge Gateways, which connect to the Logical Routers Uplink LIF through the transit Logical Switch. The Edge Gateways must have both a VXLAN interface (the internal interface connecting to the Logical Router) and a VLAN interface, connecting to the external, physical network.

To deploy an Edge Gateway, go to NSX Edges and click on the “+” icon under NSX Edge Appliances. Select Edge Services Gateway as the install type, enable HA and give a name to the gateway.

17

Click Next and configure the credentials. Then click Next and select the appliance size. Compact is fine for a lab; bigger appliances support a higher number of adjacencies, firewall rules, and so on.

18

Then click the “+” icon under NSX Edge Appliances and select dc1 as the Cluster/Resource Pool and the appropriate Datastore, Host and Folder.

Click Next, then the "+" icon to add an interface. Give the interface a name and connect it to the dc1_transit network as an internal interface. Configure the IP address and the subnet mask, click OK, and repeat the procedure to create an Uplink interface connected to a VDS Portgroup that represents the external network (it can be a VDS or VSS Portgroup).

21

The end result should look like this:

22

Click Next and configure a default gateway if you wish; it's not strictly necessary in our scenario. You can then click Next, Next and Finish to deploy the Edge Gateway in the first datacenter. Repeat the deployment procedure for the second datacenter, selecting dc2 as the Cluster/Resource Pool so you can connect the appliance to the NSX components available in the second Transport Zone.

Before activating dynamic routing protocols within the NSX environment, we must configure an external device to establish routing adjacency with the Edge Gateways in both simulated datacenters. You can use a physical device, but if you want to deploy this in your home lab or don't have access to a physical device, I recommend using a Vyatta virtual appliance. It has decent routing capabilities and its OSPF configuration is pretty straightforward. I'm using VC6.6R1 in my lab.

Your external routing device should have two interfaces: one connecting to the DC1 Edge Gateway external interface network and one connecting to the DC2 Edge Gateway external interface network. Refer to the global topology diagram for IP addresses and subnets. Here is a screenshot of my Vyatta hardware configuration (I’ve added a third VNIC to connect to the management network so I can SSH into the appliance):

28

Now let’s see how to activate OSPF on the Logical Router and the Edge Gateway:

Under Network & Security, go to NSX Edges and select the Logical Router for DC1. Double-click on it. Go to Manage > Global Configuration and click Edit next to Dynamic Routing Configuration. Set a custom Router ID and click Save (don't tick the OSPF box).

24

Then go to OSPF and click Edit next to OSPF Configuration. You have to set two IP addresses: the Protocol Address is used to establish adjacency with neighbors, while the Forwarding Address is the actual address used by the ESXi kernel module in the data plane. They must be part of the same subnet, and the Forwarding Address must be the IP address of the Logical Router Uplink interface you configured previously.

23

Click on the “+” icon under Area Definitions and add Area 0.

25

Then go to Area to Interface Mapping and add the transit vNIC to Area 0. Don't add the Logical Switch internal LIFs to Area 0, as they do not participate in the OSPF process; instead, the Logical Switch routing information is redistributed into OSPF (redistribute connected routes). Don't forget to Publish Changes.

26

Repeat the same procedure for the second Logical Router in DC2.

To activate OSPF within the Edge Gateways, configure the Router ID and tick the OSPF box. There is no need to split the control plane from the data plane, because the Edge Gateway is a virtual appliance and, as such, doesn't have a kernel module installed on the ESXi host. Another difference is that you have to add both the transit and the external network interfaces to Area 0.
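If you want to double-check the OSPF settings without the UI, the NSX-v edges API exposes the routing configuration. The sketch below assumes the /api/4.0/edges/{edgeId}/routing/config/ospf endpoint; the manager address, credentials and edge ID are placeholders, and the exact path should be confirmed against the NSX API guide for your release:

import requests

NSX_MANAGER = "https://nsx-manager.example.com"   # placeholder
AUTH = ("admin", "password")                       # placeholder credentials
EDGE_ID = "edge-1"                                 # placeholder: the ESG (or DLR) object id

# Assumed NSX-v routing endpoint; the response is the OSPF section of the edge routing config
resp = requests.get(f"{NSX_MANAGER}/api/4.0/edges/{EDGE_ID}/routing/config/ospf",
                    auth=AUTH, verify=False)        # lab only: certificate check disabled
resp.raise_for_status()
print(resp.text)   # look for <enabled>true</enabled>, your areas and interface mappings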

Note that if you want to ping the DLR and the ESG from your external network, you'll have to modify the appropriate firewall rules, as both components may have a default deny rule on their local firewall.

If you have configured everything correctly, you should see OSPF information about all routes on the external routing device:

vyatta@vyatta:~$ sh ip route ospf
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

O>* 192.168.0.0/24 [110/1] via 192.168.6.2, eth0, 00:06:08
O>* 192.168.1.0/24 [110/1] via 192.168.6.2, eth0, 00:06:08
O   192.168.6.0/30 [110/10] is directly connected, eth0, 00:22:31
O>* 192.168.7.0/29 [110/11] via 192.168.6.2, eth0, 00:11:38
O>* 192.168.10.0/24 [110/1] via 192.168.14.2, eth1, 00:00:21
O>* 192.168.11.0/24 [110/1] via 192.168.14.2, eth1, 00:00:21
O   192.168.14.0/30 [110/10] is directly connected, eth1, 00:22:31
O>* 192.168.15.0/29 [110/11] via 192.168.14.2, eth1, 00:00:38

By default, the Vyatta appliance assigns a cost of 10 to its OSPF interfaces and the ESG assigns a cost of 1. This is customizable, and so are the OSPF Hello and Dead intervals.

Hopefully you’ve got everything working now! :-P

The next post will focus on the very cool part: how to use Python and pyVmomi to perform both NSX and vSphere tasks to move things around.