Welcome to this new blog post series about Container Networking with Antrea. In this post, we’ll take a look at the Egress feature and show how to implement it on vSphere with Tanzu.

According to the official Antrea documentation, Egress is a Kubernetes Custom Resource Definition (CRD) that allows you to specify which Egress (SNAT) IP the traffic from selected Pods to the external network should use. When a selected Pod accesses the external network, the Egress traffic will be tunneled to the Node that hosts the Egress IP (if it’s different from the Node that the Pod runs on) and will be SNATed to the Egress IP when leaving that Node. You can see the traffic flow in the following picture.

Antrea Egress

When the Egress IP is allocated from an externalIPPool, Antrea even provides automatic high availability; i.e., if the Node hosting the Egress IP fails, another Node will be elected from the remaining Nodes selected by the nodeSelector of the externalIPPool.

Note: The standby node will not only take over the IP but also send a layer 2 advertisement (e.g. Gratuitous ARP for IPv4) to notify the other hosts and routers on the network that the MAC address associated with the IP has changed.

You may be interested in using the Egress feature if any of the following apply:

  • A consistent IP address is desired when specific Pods connect to services outside of the cluster, e.g. for source tracing in audit logs or for filtering by source IP in an external firewall.
  • You want to force outgoing external connections to leave the cluster via certain Nodes, for security controls, or due to network topology restrictions.

You could also use routed Pods to achieve the same effect. But with Egress, you can control exactly which Pods have this capability (it’s not all or nothing!).

Feature Gates and Configuration Variables for TKG

Open source Antrea version 1.6 (and higher) enables the Egress feature gate by default. Since the VMware Antrea Enterprise edition has a different release schedule, we need to check whether Egress is enabled for vSphere with Tanzu or TKG and, if not, how to enable it.

TKG 1.6 introduced Antrea configuration variables that can be used in the cluster definition. It is no longer necessary to patch TKG or to pause the Kubernetes controllers that would otherwise override manual changes.

As you can see, Egress is enabled by default with TKG 1.6. All the other available Antrea features (e.g. NODEPORTLOCAL, FLOWEXPORTER, MULTICAST, etc.) can be configured using the corresponding variables.
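As a minimal sketch, such a cluster configuration file could contain the following feature-gate variables (the ANTREA_* names follow the TKG 1.6 documentation; verify them against your release before use):

    # Excerpt from a TKG 1.6 cluster configuration file (sketch; variable
    # names should be verified against your TKG release documentation)
    ANTREA_EGRESS: "true"           # enabled by default in TKG 1.6
    ANTREA_NODEPORTLOCAL: "true"
    ANTREA_FLOWEXPORTER: "false"
    ANTREA_MULTICAST: "false"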

Custom Resource Definitions for vSphere 8 with Tanzu

What about vSphere 8 with Tanzu? There is a Kubernetes CRD called AntreaConfig available that gives you all the flexibility needed to configure Antrea. Before we dive into this topic, let’s create a supervisor namespace tkg that overrides the default settings and disables NAT mode. This means the worker node IPs are routed (otherwise Egress would not make much sense) while the Pod IPs are still NATed. A small routable namespace network segment and a /28 subnet prefix have been chosen for the Egress demo in a vSphere 8 with NSX 4.0.1.1-based setup.

Namespace Settings

After successfully creating the namespace, let’s check the NSX configuration:

If you take a look at the vSphere 8 documentation about routed Pods, you can see a reference to the CNI named antrea-nsx-routed.

For our demo, we will use the new Cluster-API-based ClusterClass approach to create TKG clusters. To enable Egress (and some other features for future demos), we need to create our own Antrea configuration using the AntreaConfig CRD. The method to attach this configuration to a specific cluster is different, though, and uses a special naming convention.

The following command shows that there are already two pre-defined AntreaConfig objects available in our tkg namespace. Both of them have Egress set to false:
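A sketch of what this looks like (object names and output vary by setup; the jsonpath follows the AntreaConfig structure of the cni.tanzu.vmware.com/v1alpha1 API):

    # List the pre-defined AntreaConfig objects in the supervisor namespace
    kubectl get antreaconfig -n tkg

    # Check the Egress feature gate of one of them (object name is setup-specific)
    kubectl get antreaconfig <name> -n tkg \
      -o jsonpath='{.spec.antrea.config.featureGates.Egress}'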

Our cluster is called cluster-3, so we need to use the name cluster-3-antrea-package in our AntreaConfig definition (i.e. we have to append -antrea-package to the cluster name). This AntreaConfig is then automatically used by the cluster.
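A minimal sketch of such an AntreaConfig, assuming the cni.tanzu.vmware.com/v1alpha1 API version used by the Antrea package on vSphere 8 with Tanzu (only the Egress feature gate is shown):

    apiVersion: cni.tanzu.vmware.com/v1alpha1
    kind: AntreaConfig
    metadata:
      name: cluster-3-antrea-package   # cluster name + "-antrea-package"
      namespace: tkg                   # supervisor namespace of the cluster
    spec:
      antrea:
        config:
          featureGates:
            Egress: true               # enable the Egress feature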

Note: Use a separate YAML file for the Antrea configuration! Deleting the cluster will automatically delete the corresponding AntreaConfig resource. If you put both definitions in the same file, the cluster deletion will hang!

Clusters, ClusterClass, and Egress

Now it’s time to create our cluster (with ClusterClass) using the resource type Cluster (and not TanzuKubernetesCluster anymore) and configure Egress.

The cluster uses Ubuntu as the base operating system and the default vSAN storage policy. Since there is no NetworkClass available in ClusterClass yet, we had to use the AntreaConfig naming convention; a sketch of the cluster definition follows below.
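A sketch of such a Cluster definition (the TKR version string and VM class are illustrative; the Ubuntu base OS is selected via the resolve-os-image annotation, and the worker class name follows the built-in tanzukubernetescluster ClusterClass):

    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      name: cluster-3
      namespace: tkg
      annotations:
        # select Ubuntu instead of the default Photon OS image
        run.tanzu.vmware.com/resolve-os-image: os-name=ubuntu
    spec:
      topology:
        class: tanzukubernetescluster
        version: v1.23.8+vmware.2          # illustrative TKR version
        controlPlane:
          replicas: 1
        workers:
          machineDeployments:
            - class: node-pool
              name: node-pool-01
              replicas: 3
        variables:
          - name: vmClass
            value: best-effort-small       # illustrative VM class
          - name: storageClass
            value: vsan-default-storage-policy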

Note: Although the cluster nodes are routed (and not NATed), you cannot SSH into them directly because a distributed firewall (DFW) rule has been created to deny access.

Disable Ingress to worker nodes

If you need this direct access, you have to create a specific firewall rule. Otherwise, you can use one of the supervisor VMs or a jumpbox container to log into the worker nodes, following the official vSphere 8 documentation.

The rest of the Egress configuration is pretty standard now. After switching to the cluster-3 context, we define an ExternalIPPool first. Since the nodeSelector is empty, Antrea will choose one active Node randomly and designate a standby Node. You can also configure dedicated Egress Nodes if needed.

What about the CIDR and IP range used in the configuration? Since the automatically created NSX segment for the cluster VMs uses the range 10.221.193.49/28 in our setup, the ExternalIPPool is configured to use the last two (unused) IP addresses in that range.
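A minimal sketch of that ExternalIPPool (the pool name is our choice; the apiVersion matches the Antrea 1.x CRDs):

    apiVersion: crd.antrea.io/v1alpha2
    kind: ExternalIPPool
    metadata:
      name: egress-external-ip-pool     # name is our choice
    spec:
      ipRanges:
        - start: 10.221.193.61          # last two unused IPs of the segment
          end: 10.221.193.62
      nodeSelector: {}                  # empty: all Nodes are candidates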

Important: The standard approach is to use Egress IP addresses in the same range as the cluster VMs. Since these Egress IPs are not managed by vSphere 8 with Tanzu you have to be careful not to scale up the cluster to this range to avoid duplicate IP addresses. Technically you could use other ExternalIPPool ranges and let Antrea announce the settings and changes by BGP. But this is a story for some other time.

Let’s continue with our configuration. We create two Egress resources, one for Pods labeled web in the namespace prod and another one for Pods labeled web in the namespace staging.
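A sketch of the two Egress resources (the app=web label key/value is an assumption matching the deployments created below; we let Antrea allocate the Egress IPs from the pool):

    apiVersion: crd.antrea.io/v1alpha2
    kind: Egress
    metadata:
      name: egress-prod-web
    spec:
      appliedTo:
        podSelector:
          matchLabels:
            app: web                    # assumed Pod label
        namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: prod
      externalIPPool: egress-external-ip-pool
    ---
    apiVersion: crd.antrea.io/v1alpha2
    kind: Egress
    metadata:
      name: egress-staging-web
    spec:
      appliedTo:
        podSelector:
          matchLabels:
            app: web
        namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: staging
      externalIPPool: egress-external-ip-pool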

This looks good so far. Let’s create a jumpbox following the vSphere 8 documentation to access the cluster node cluster-3-node-pool-01-hgxk2-768974f9bc-l4fdq with IP address 10.221.193.52, where we can see the new interface antrea-egress0 created by Antrea with both IP addresses (10.221.193.61/32 and 10.221.193.62/32) attached to it.
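From the node, the interface can be inspected like this (output shortened and illustrative):

    # On the worker node hosting the Egress IPs
    ip addr show antrea-egress0
    # ...
    #   inet 10.221.193.61/32 scope global antrea-egress0
    #   inet 10.221.193.62/32 scope global antrea-egress0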

Next, we create the two namespaces prod and staging and deploy a simple web application into each of them.
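A minimal way to do that, using nginx as a stand-in web application (kubectl create deployment labels the Pods app=web automatically):

    kubectl create namespace prod
    kubectl create namespace staging

    # "web" deployments; the Pods get the label app=web
    kubectl create deployment web --image=nginx --namespace prod
    kubectl create deployment web --image=nginx --namespace staging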

To test that the web Pods in each namespace use a different Egress address, let’s create a simple web application on the VM showip.tanzu.lab outside of NSX returning the IP address of the client accessing it.
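As a stand-in for that application (the original setup is not shown here), a one-liner with socat on showip.tanzu.lab would do; it answers every HTTP request with the client’s source address:

    # Run on showip.tanzu.lab; answers every request with the client IP
    socat TCP-LISTEN:80,fork,reuseaddr \
      SYSTEM:'printf "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n%s\n" "$SOCAT_PEERADDR"'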

And now the final test, showing that the Pods in the different namespaces use different Egress addresses to access our web application.
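A sketch of the test, running a throwaway curl Pod with the matching app=web label in each namespace (which pool IP each namespace receives is illustrative):

    # Pod in prod, labeled app=web so the Egress applies to it
    kubectl run showip-test -n prod --labels=app=web --rm -it --restart=Never \
      --image=curlimages/curl --command -- curl -s http://showip.tanzu.lab
    # 10.221.193.61   (illustrative)

    # Same test from staging
    kubectl run showip-test -n staging --labels=app=web --rm -it --restart=Never \
      --image=curlimages/curl --command -- curl -s http://showip.tanzu.lab
    # 10.221.193.62   (illustrative)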

And here we are: Antrea Egress successfully implemented on vSphere 8 with Tanzu using the AntreaConfig CRD.

See you next time!

Got any questions? Don’t hesitate to leave them in the comments or reach out to us @VMwareNSX on Twitter or LinkedIn.