Blog Post Co-Authored with Bhanu Vemula
The shift to multi-cloud, microservices-based architectures is well underway across enterprises. VMware NSX has long provided secure connectivity between private and public clouds while offering consistent policy management across hybrid cloud environments with our Service-defined Firewall. More than a year ago, VMware NSX-T expanded beyond just supporting ESX-based VMs to cover workloads running on bare metal servers, multiple hypervisors, and containers.
However, as the adage goes, the only constant is change, and so it goes with application architectures. As enterprises embrace cloud-native architectures, applications are becoming even more distributed and heterogeneous. We see this particularly among some of our forward-leaning customers – payment providers, financial institutions, retailers, technology vendors, and others – who are driving us to further evolve our security thinking.
Customers are containerizing their new applications with Kubernetes, and exploring solutions such as VMware Tanzu, Project Pacific, Pivotal Cloud Foundry, and other platforms and managed services. They leverage a mix of open source and multiple SaaS services for various functions such as observability, analytics, and cost optimization. Yet, they also need to communicate with their existing VM-based applications. These customers want a common framework for identity, policy, and compliance, one that can deal with assets that are more ephemeral in nature and not directly under their control. Some of them have used open source client libraries to bring those capabilities into their existing applications but are struggling to operationalize and maintain them.
New Application Paradigms, New Security Challenges
As we evaluated these evolving customer needs, we recognized four themes that we had to address in the evolution of our NSX security model:
- A new set of asset classes has emerged that is not easy to protect with traditional policy models. For example, as APIs, S3 buckets, and serverless functions become common, the policy model needs to evolve to support them. You certainly can’t put a traditional virtualized firewall in front of a serverless/Lambda function. The policy model needs to support users, microservices, APIs, serverless apps, data objects, and so on.
- These assets may be spread across multiple public or private clouds – IaaS, PaaS, SaaS – with underlying infrastructure under separate administrative control. As a result, policy needs to be expressed not only at an application/business level, but end-to-end as well. This ensures that the policy remains in force regardless of the services along the path of the request, which may come and go.
- Identity is now more complex. With assets in multiple clouds, and stolen credentials becoming a leading source of breaches, it’s becoming critical for identity to account for asset context, posture, and configurations in an automated manner.
- The regulatory compliance environment will be increasingly challenging to navigate. On the one hand, multiple data privacy protection laws are emerging – GDPR in Europe, CCPA in California, etc. On the other, cloud-native applications are often geographically distributed and accessed by users globally. Customers need a unified compliance framework to deal with this challenge.
Evolving the Security Model
To address these security challenges in modern applications, security needs to be specified at a higher level of abstraction (namely, at the application level, in terms of users, services/APIs, and data) instead of at the level of infrastructure constructs.
Context-Aware Policy Engine
At the core of this new security model is the policy engine that we are building within VMware Tanzu Service Mesh (TSM). Rich contextual data, such as namespaces, request flows, and geolocation, is now available from cloud-native applications. We see enormous value in cultivating this context and using it to create a notion of strong identity, one that goes beyond tags and infrastructure constructs, is automatically extracted, and is harder to spoof. A strong identity model should understand a richer set of static and dynamic attributes spanning configuration, state, provenance, telemetry, behavior, etc.
This context-awareness needs to be accompanied by a flexible query language that allows resource groups to be defined based on these attributes: for example, all workloads exhibiting a certain type of vulnerability, or user devices showing risky behavior based on their device posture scores, which can be provided by VMware Workspace ONE Intelligence.
Access control policies also need to evolve, moving from defining control on a hop-by-hop basis to specifying end-to-end transactions that need to be protected. Other potential actions include enabling encryption, traffic redirection, and changing privilege levels. For instance, we may define a policy stating that only services with a security score of 90 or higher and using TLS 1.3 (subject) can access (action) S3 buckets containing sensitive PII data (object).
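To make the subject/action/object idea concrete, here is a minimal Python sketch of how the example policy above might be evaluated. This is purely illustrative and is not the Tanzu Service Mesh policy language or API: the `Asset` and `Policy` classes, the attribute names (`security_score`, `tls_version`, `data_labels`), and the default-deny behavior are all assumptions made for this example. Resource groups are modeled as simple predicates over asset attributes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Asset:
    """Hypothetical asset: a service, data object, user, etc."""
    name: str
    kind: str
    attributes: Dict[str, Any] = field(default_factory=dict)


# A resource group is modeled as a query (predicate) over asset attributes.
ResourceGroup = Callable[[Asset], bool]


@dataclass
class Policy:
    """Hypothetical policy: subject group may perform action on object group."""
    name: str
    subject: ResourceGroup
    action: str
    obj: ResourceGroup

    def permits(self, subject: Asset, action: str, obj: Asset) -> bool:
        return action == self.action and self.subject(subject) and self.obj(obj)


def trusted_service(asset: Asset) -> bool:
    # Subject group: services scoring 90 or higher that use TLS 1.3.
    return (
        asset.kind == "service"
        and asset.attributes.get("security_score", 0) >= 90
        and asset.attributes.get("tls_version") == "1.3"
    )


def sensitive_bucket(asset: Asset) -> bool:
    # Object group: S3 buckets labeled as containing PII.
    return asset.kind == "s3_bucket" and "pii" in asset.attributes.get("data_labels", ())


def is_allowed(policies: List[Policy], subject: Asset, action: str, obj: Asset) -> bool:
    # Assumed default-deny: a request passes only if some policy permits it.
    return any(p.permits(subject, action, obj) for p in policies)


policies = [Policy("pii-access", trusted_service, "access", sensitive_bucket)]

payments = Asset("payments", "service", {"security_score": 95, "tls_version": "1.3"})
legacy = Asset("legacy-batch", "service", {"security_score": 72, "tls_version": "1.2"})
bucket = Asset("customer-records", "s3_bucket", {"data_labels": ["pii"]})

print(is_allowed(policies, payments, "access", bucket))  # True
print(is_allowed(policies, legacy, "access", bucket))    # False
```

Because the groups are predicates rather than static lists, membership follows the attributes: if the score of `payments` later dropped below 90, the same policy would stop matching it without any rule being rewritten.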
Open and Extensible Data Integration Framework
Security and infrastructure markets are rapidly evolving, and NSX cannot assume that it has native access to all data for all attributes, or expect to understand all of the relevant semantics. It’s incumbent on us to build an open and extensible data integration framework for which third-party solutions can write plugins. NSX Service Mesh will be able to access and aggregate this context and enable policy decisions against it. The context will span data such as an inventory of users, services, and data from various infrastructure platforms, as well as attributes for these assets. It will also cover extrinsic data such as user and application lifecycle behavior. Some companies are already using this open plugin framework to interoperate with NSX Service Mesh and provide valuable context:
- Rezilion – Adds the capability to analyze artifacts deployed in production, turning an organization’s CI/CD pipeline into an allowlist of known good relationships and dependencies. See the Rezilion blog for details.
- Sysdig – Provides enhanced security intelligence to extend runtime threat prevention, detection, and response in NSX Service Mesh. See the Sysdig blog for details.
- Octarine – Monitors traffic patterns at Layer 7 to detect anomalies and sends compliance violation events to NSX Service Mesh. See the Octarine blog for details.
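As a rough illustration of what a plugin-style integration framework involves, the Python sketch below shows context providers registering with an aggregator that merges their attributes per asset. It is not the actual NSX Service Mesh plugin interface: `ContextProvider`, `StaticProvider`, `ContextAggregator`, the provider names, and the attribute keys are all hypothetical, chosen only to show the shape of the idea.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ContextProvider(ABC):
    """Hypothetical plugin contract: return attributes for a given asset."""
    name: str

    @abstractmethod
    def attributes_for(self, asset_id: str) -> Dict[str, Any]:
        ...


class StaticProvider(ContextProvider):
    """Toy provider backed by an in-memory table; a real one would call out
    to a third-party system."""

    def __init__(self, name: str, data: Dict[str, Dict[str, Any]]) -> None:
        self.name = name
        self._data = data

    def attributes_for(self, asset_id: str) -> Dict[str, Any]:
        return dict(self._data.get(asset_id, {}))


class ContextAggregator:
    """Collects attributes from every registered provider for one asset."""

    def __init__(self) -> None:
        self._providers: List[ContextProvider] = []

    def register(self, provider: ContextProvider) -> None:
        self._providers.append(provider)

    def context_for(self, asset_id: str) -> Dict[str, Any]:
        merged: Dict[str, Any] = {}
        for provider in self._providers:
            for key, value in provider.attributes_for(asset_id).items():
                # Namespace each key by provider so sources cannot collide.
                merged[f"{provider.name}.{key}"] = value
        return merged


aggregator = ContextAggregator()
aggregator.register(StaticProvider("vuln_scanner", {"checkout": {"critical_cves": 2}}))
aggregator.register(StaticProvider("runtime_monitor", {"checkout": {"anomaly_detected": False}}))

print(aggregator.context_for("checkout"))
# {'vuln_scanner.critical_cves': 2, 'runtime_monitor.anomaly_detected': False}
```

Prefixing each attribute with its provider name is one simple way to handle the semantics problem: the aggregator does not need to understand what an attribute means, only who supplied it, and policies can then refer to it by its qualified name.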
Continuous, Risk-based Security
This kind of rich, real-time data can enable access control to shift from the existing model of making static, one-time binary access decisions to a dynamic risk assessment of asset posture with a set of graduated actions. With these extensions to the policy model, an extensible data integration framework, and a multi-cluster/multi-cloud namespace approach, VMware Tanzu Service Mesh (TSM) is evolving security towards a continuous, risk-based model. This is the first in a series of blog posts on this topic. In subsequent posts, we’ll cover the key topics introduced here in more detail. Stay tuned for more.
Awesome blog, nice job Manish and Bhanu. For a change, I have seen the real background reasons for Service Mesh use cases other than the regular service-to-service use case, and also the real value created by NSX-SM for the context-aware, risk-based, end-to-end security/control use case. It’s more interesting than the regular Service Mesh use cases. Thank you for the effort; I look forward to seeing more blogs of this kind.