Enterprises are growing increasingly dependent on modern distributed applications to innovate and respond quickly to new market challenges.  As applications grow in significance, the end-user experience has become a key differentiator for most businesses.  Understanding what application performance end users actually experience, optimizing the infrastructure, and quickly identifying the source of any issues have all become critical.

The Modern Network framework puts the end-user experience at the forefront.  It helps our customers provide the public cloud experience on-premises with an on-demand network that enforces secure connectivity and service objectives across on-premises and cloud environments.  As applications become more distributed, the increased application resiliency and efficiency often come at the cost of increased contention for shared resources.  The dynamic nature of the network, device density, and the volume of data and transactions generated make this even more challenging.  Managing network complexity and simplifying network operations in such environments requires a well-architected network with support for modern cloud concepts such as availability zones that provide fault tolerance.  Similarly, effective network-level fault isolation requires the ability to create self-contained fault domains that facilitate network resiliency, disaster recovery and avoidance, and end-to-end root cause analysis throughout the application stack.

A Multi-Domain Approach for App Resiliency

A fundamental principle of the Modern Network framework is enhanced availability and resiliency in an always-on application environment through built-in fault isolation.  Hardware maintenance and hardware failures are realities that all IT teams have to deal with, and the physical network infrastructure is heterogeneous and multi-vendor.  A more nuanced approach to infrastructure availability is required: one that enables the IT team to move at the pace of the application while still providing a robust, fault-tolerant infrastructure.  A multi-domain architecture provides this foundation.

At the core of this multi-domain approach is the separation between global and local management and control planes.

As applications span edge, core, and cloud environments, a single pane of glass for network management, monitoring, and policy consistency becomes essential.  Centralized global management provides a cloud-like operating model by simplifying the consumption of networking and security constructs.  A centralized console manages the network as a single entity while keeping configuration and operational state synchronized across multiple locations.  This also provides increased flexibility and scalability across your network, allowing you to manage globally while complying with local policies.

Local management and control planes within separate fault domains complete this picture.  Separate fault domains let you distribute your workloads and network services so that they do not all sit on the same physical hardware within a single domain.  Application and infrastructure resources are isolated from each other in separate domains for fault tolerance, because independent domains are very unlikely to fail simultaneously.  These domains do not share infrastructure, so a failure in one domain is unlikely to impact the availability of the others.  This level of fault isolation and infrastructure resiliency is critical for improving application availability and application experience for the end user.
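As a rough illustration of why spreading workloads across fault domains helps, here is a minimal Python sketch.  It is not a VMware API; the function names, the round-robin placement strategy, and the domain and replica names are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import cycle


def spread_replicas(replicas, fault_domains):
    """Assign replicas to fault domains round-robin (hypothetical placement strategy)."""
    if not fault_domains:
        raise ValueError("at least one fault domain is required")
    placement = defaultdict(list)
    # cycle() repeats the domain list; zip() stops when the replicas run out.
    for replica, domain in zip(replicas, cycle(fault_domains)):
        placement[domain].append(replica)
    return dict(placement)


def surviving_replicas(placement, failed_domain):
    """Return the replicas that keep running when one whole domain fails."""
    return [
        replica
        for domain, domain_replicas in placement.items()
        if domain != failed_domain
        for replica in domain_replicas
    ]


# Illustrative names only.
placement = spread_replicas(
    ["web-1", "web-2", "web-3"],
    ["domain-a", "domain-b", "domain-c"],
)
still_up = surviving_replicas(placement, "domain-a")
```

With three replicas spread over three domains, losing any single domain leaves two replicas serving traffic, whereas placing all three in one domain would take the application down with it.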

While this multi-domain approach brings much-needed resiliency and fault isolation to modern environments, IT teams also need the ability to manage such a multi-domain network at varying levels of granularity.  From a policy perspective, some domains might represent unique deployments that need their own policy.  In other cases, you may want to apply a common policy to a subset of the domains or to all domains.  This level of granularity is crucial for ensuring long-term policy consistency across the network.  The ability to apply policies at varying levels of granularity, from regions and availability zones all the way down to subnets and workloads, is as important in private on-premises clouds as it is in public cloud deployments.
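One way to picture policy at multiple levels of granularity is a layered lookup in which broader scopes set defaults and narrower scopes override them.  The Python sketch below is only an illustration under that assumption; the scope names, policy keys, and "most specific wins" rule are hypothetical and do not describe how any particular product resolves policy.

```python
# Hypothetical scope levels, ordered from broadest to most specific.
SCOPE_ORDER = ["global", "region", "zone", "subnet", "workload"]


def effective_policy(policies, workload_scopes):
    """Merge policies from broadest to most specific scope; later levels override."""
    merged = {}
    for level in SCOPE_ORDER:
        scope_name = workload_scopes.get(level)
        if scope_name is None:
            continue  # this workload has no membership at this level
        merged.update(policies.get((level, scope_name), {}))
    return merged


# Illustrative policy data keyed by (level, scope name).
policies = {
    ("global", "all"): {"allow_ssh": False, "log_flows": True},
    ("region", "emea"): {"log_flows": False},
    ("workload", "jump-host"): {"allow_ssh": True},
}

jump_host = {"global": "all", "region": "emea", "zone": "zone-1", "workload": "jump-host"}
web_server = {"global": "all", "region": "amer"}

jump_host_policy = effective_policy(policies, jump_host)
web_server_policy = effective_policy(policies, web_server)
```

Here the web server simply inherits the global defaults, while the jump host picks up a regional logging override and its own workload-level SSH exception.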

Delivering Fault Tolerance Across the Application Stack

Users expect always-on access from anywhere to applications deployed on any cloud.  Any downtime can lead to productivity loss and business risk.  As IT teams strive to minimize the risk of failure, they need solutions that can proactively isolate failure segments while also identifying performance and scaling optimization opportunities.  They also need a way to achieve this level of fault isolation across both on-premises and cloud locations.

The VMware Virtual Cloud Network (VCN) enables the core set of capabilities needed for fault isolation and avoidance, which provide the foundation for horizontally scalable networking.  Let’s look at some examples.

VMware NSX-T® Federation provides federated networking and security policies across multiple network deployments.  It uses the NSX-T Global Manager to achieve operational simplicity and consistent policy configuration and enforcement across all NSX locations.  The network no longer needs to be built or managed with a location-by-location or domain-by-domain approach.  The entire environment is seen as an end-to-end system.

In this Introduction to NSX-T Federation, Dimitri Desmidt demonstrates NSX Federation multi-site support for large scale NSX deployments along with simplified disaster avoidance and recovery, while ensuring high availability and improved application response.

Another example is VMware Tanzu Service Mesh built on VMware NSX.  It is an enterprise-class service mesh solution that provides reliable control and security for microservices, end users, and data across all your clusters and clouds in the most demanding multi-cluster and multi-cloud environments.  Tanzu Service Mesh runs on multiple application platforms, public clouds, and runtime environments, including Kubernetes clusters.

Tanzu Service Mesh supports cross-cluster and cross-cloud use cases with global namespaces, which let you securely deploy applications across clusters and clouds with consistent traffic management policies, application continuity, and security policies.  Application boundaries provide strongly isolated environments for application teams and business units.

Furthermore, the control plane has separate global and local controllers for added resiliency.  The global controller allows platform teams to connect and protect microservices across all of their clusters and clouds in the most complex enterprise architectures.  Local control plane components run in each customer cluster, on-premises or in the public cloud, and deliver fault tolerance in the event that a cluster becomes disconnected from the global controller.  Learn more about advanced concepts used in Tanzu Service Mesh.
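The general pattern of a local controller that keeps working when it loses contact with a global one can be sketched as follows.  This is a generic illustration in Python, not Tanzu Service Mesh code; the class, the `fetch_global_policy` callable, and the caching behavior are assumptions made for the example.

```python
class LocalController:
    """Hypothetical local controller that caches the last policy it received."""

    def __init__(self, fetch_global_policy):
        self._fetch = fetch_global_policy  # callable that may raise ConnectionError
        self._cached = None
        self.connected = False

    def current_policy(self):
        """Prefer a fresh policy; fall back to the cached copy when disconnected."""
        try:
            self._cached = self._fetch()
            self.connected = True
        except ConnectionError:
            self.connected = False
            if self._cached is None:
                raise  # nothing was ever synced, so there is nothing to fall back on
        return self._cached


def make_scripted_fetch(responses):
    """Build a fake global controller that replays responses or raises errors."""
    remaining = iter(responses)

    def fetch():
        item = next(remaining)
        if isinstance(item, Exception):
            raise item
        return item

    return fetch


controller = LocalController(
    make_scripted_fetch([{"mtls": True}, ConnectionError("link down")])
)
first = controller.current_policy()   # global controller reachable
second = controller.current_policy()  # link down, cached policy is served
```

The design choice illustrated here is that the data path depends only on locally held state, so losing the global controller degrades manageability rather than availability.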

Similarly, VMware NSX Advanced Load Balancer implements a distributed cloud architecture that repairs itself automatically, so individual components can fail without affecting the availability of the services delivered.  It allows you to build a truly fault-tolerant network that is cost effective, scales up and down based on demand, and fixes itself if something breaks.  Designing a fault-tolerant network with multiple availability zones and high-availability infrastructure is expensive, driving up operational and capital expenditure on infrastructure that still doesn’t guarantee that services are not disrupted.  The NSX Advanced Load Balancer uses modern failure detection techniques across the control plane and the data plane to automatically manage available capacity and redistribute resources for seamless failure recovery.  This removes the need to over-provision load balancing capacity as Active/Standby pairs of hardware appliances to account for sporadic traffic peaks.

If you haven’t already, register for the Virtual Cloud Network event – The Modern Network for a Future Ready Business – where VMware executives, customers and industry thought leaders discuss the need for a modern network.

Additional Resources