Teams adopting VMware vSphere Kubernetes Service (VKS) may already be using GitOps patterns. They could already have Argo CD instances humming along in their environments, managing application lifecycles. But as they transition to VMware Cloud Foundation (VCF), a common question arises: “Does the platform change my tooling, or just the surface area I can manage?”
The answer is the latter. When Argo CD runs in VCF, it can declaratively manage infrastructure resources directly through the vSphere Supervisor. As detailed in the technical documentation for VKS Cluster and Add-on Management with the Argo CD Supervisor Service, this includes vSphere Namespaces, VKS cluster definitions, and VM Service instances.
In this post, we’ll dive into a design approach for continuous delivery on VCF that treats your entire platform, from the infrastructure boundary to the microservice, as code.
The Core Principles of GitOps on VKS
An effective design is built on a single, uncompromising principle: everything that can be expressed as a Kubernetes resource lives in Git. On VCF, this definition expands to include the platform boundary itself. Our architectural framework is guided by four pillars:
- Git as the Single Source of Truth: Every resource, including VKS clusters, add-on packages, and applications, is declared in Git. Any out-of-band change is treated as configuration drift and corrected automatically.
- Separation of Concerns: Platform-level resources (namespaces, clusters, add-ons) and application workloads are maintained in separate directories or repositories with distinct access and approval policies.
- Declarative Control Plane: The Argo CD instance itself, along with its projects and application definitions, is managed as code. The Argo CD Supervisor Service treats these instances as first-class Kubernetes custom resources.
- Explicit Dependency Ordering: We encode strict dependency chains through Argo CD sync waves, ensuring a single commit can safely bootstrap an entire environment from scratch.
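As a minimal sketch of the first two pillars, the manifests below show an Argo CD AppProject that scopes platform resources to one repository and one namespace, plus an Application with automated sync, pruning, and self-healing enabled. The repository URL, project name, paths, and the argocd and workload-ns namespaces are hypothetical placeholders chosen for illustration; adjust them to wherever your Argo CD instance actually runs.

```yaml
# Illustrative only: names, repo URL, and paths are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: platform
  namespace: argocd            # assumption: namespace of the Argo CD instance
spec:
  description: Platform-level resources (namespaces, clusters, add-ons)
  sourceRepos:
    - https://git.example.com/platform/gitops.git   # placeholder repository
  destinations:
    - server: https://kubernetes.default.svc
      namespace: workload-ns
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-clusters
  namespace: argocd
spec:
  project: platform
  source:
    repoURL: https://git.example.com/platform/gitops.git
    targetRevision: main
    path: clusters             # hypothetical directory holding cluster manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: workload-ns
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert out-of-band changes to match Git
```

Keeping application workloads in a second AppProject with its own sourceRepos and destinations is one way to give each team distinct access and approval boundaries.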

Why Do Platform Teams Adopt This Pattern?
Moving to a declarative, GitOps-driven model isn’t just an architectural upgrade; it directly impacts business velocity, operational risk, and the bottom line. For a VKS customer, adopting this framework unlocks critical operational advantages:
- Zero-Touch Cluster Provisioning: Instead of navigating dashboards or running fragile ad-hoc scripts, new infrastructure is spun up entirely via code. Merging a single pull request automatically bootstraps a fully configured VKS cluster, reducing time-to-market from days to minutes.
- Fleet-Wide Add-on Standardization: Managing fundamental tooling across dozens of disparate clusters can quickly lead to configuration sprawl. GitOps ensures that every cluster automatically inherits the exact same core packages, such as Velero for backups, security agents, or ingress controllers, guaranteeing consistency across your entire fleet.
- Drift Eradication & Compliance: Manual hotfixes and unauthorized configuration changes introduce security and operational risk. Because Argo CD continuously reconciles the live cluster state against Git, out-of-band modifications are detected and reverted on the next reconciliation, keeping your infrastructure aligned with your compliance baselines.
- Risk-Free Upgrade Cycles: Infrastructure upgrades no longer require nail-biting, weekend-long maintenance windows. With automated rolling upgrades via Cluster API and a seamless one-commit rollback strategy, upgrading a cluster becomes as routine and low-risk as any standard software deployment.
A Layered Design for the Platform Stack
To manage this expanded surface area, we organize the design into a four-layer model. Each layer depends on the one below it, typically following a hub-and-spoke topology where a central Argo CD instance manages multiple spoke clusters.
| Layer | Managed Resources | Architectural Purpose |
|---|---|---|
| 1. Platform | vSphere Namespaces, Storage Classes, VM Classes | Establishes the multi-tenant boundary and resource constraints. |
| 2. Cluster | VKS Cluster resources, ClusterClass, VKr references | Declares cluster topology and versioning; manages creation, upgrades, and scaling. |
| 3. Add-ons | AddonRepository, AddonInstall, AddonConfig | Configures standard packages (e.g., Contour, Velero) for a consistent runtime. |
| 4. Workloads | Helm charts, Kustomize overlays, Application manifests | Deploys microservices and business applications into the provisioned clusters. |
Mastering the “App of Apps” Pattern
The App of Apps pattern is our primary organizing principle. A single “root” application points to a bootstrap directory in Git. That directory contains child Application manifests, which in turn point to other directories representing our layers. Argo CD recursively discovers and reconciles this tree.
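A root application of this kind might look like the sketch below. The repository URL, the bootstrap path, and the argocd namespace are assumptions for illustration; only the Application fields themselves are standard Argo CD.

```yaml
# Illustrative root Application: repo URL, path, and namespace are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd            # assumption: namespace of the Argo CD instance
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/gitops.git   # placeholder repository
    targetRevision: main
    path: bootstrap            # directory containing the child Application manifests
    directory:
      recurse: true            # also pick up manifests in subdirectories
  destination:
    # Child Application resources are created where Argo CD itself runs.
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true              # removing a child manifest from Git removes the child app
      selfHeal: true
```

Because the children are ordinary manifests in the bootstrap directory, adding a new layer or environment is a matter of committing one more Application file.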
Automating Cluster Onboarding (Disclaimer: Experimental Community Service)
Before Argo CD can deploy workloads to a spoke cluster, that cluster must first be registered. At scale, manual registration using argocd cluster add becomes a significant bottleneck. To streamline this, we utilize an integration from the community-driven argocd-attach-service repository. This capability deploys a lightweight controller that watches for ArgoCluster and ArgoNamespace custom resources, effectively transforming manual cluster onboarding into a simple, automated Git commit.
Note: This configuration is currently experimental and provided as a community service.
Declarative Add-on Lifecycle
VKS 3.5 introduced a declarative state for add-ons. By treating add-ons like any other manifest, platform teams ensure every cluster converges to the same baseline, typically Contour for ingress and Velero for backup, with full drift detection and auditability.
Sync Strategy: The Art of the Wave

Bootstrapping an environment requires careful sequencing. We use the argocd.argoproj.io/sync-wave annotation to ensure infrastructure is ready before workloads land.
| Wave | Resource / Action | Technical Rationale |
|---|---|---|
| 0 | ArgoNamespace CR | Registers workload-ns with the Argo CD instance via the auto-attach service. |
| 10 | Application cl01-app | Triggers Cluster API (CAPI) to provision the cl01 control plane and worker node VMs, registers the cluster with Argo CD as a deployment target, and installs cert-manager and Contour. |
| 12 | Application cl02-app | Provisions the cl02 VKS cluster with the cluster autoscaler enabled and registers it with Argo CD. |
| 20 | Application app-1 | Deploys nginx into the demo-app namespace on cl01. |
| 25 | Application app-2 | Deploys Google Online Boutique (11 microservices) into the online-boutique namespace on cl01. |
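To make the wave ordering concrete, here is a sketch of how two of the child Application manifests from the table could carry the sync-wave annotation. The application names, waves, and target namespaces come from the table; the repository URL, paths, the argocd namespace, and the use of cl01 as the registered cluster name are assumptions.

```yaml
# Illustrative child Applications: repo URL, paths, and namespaces are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cl01-app
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "10"   # value must be a quoted string
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/gitops.git   # placeholder repository
    targetRevision: main
    path: clusters/cl01                  # hypothetical path to the cl01 manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: workload-ns               # cluster manifests land in the registered namespace
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-1
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "20"   # applied after lower-numbered waves
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/gitops.git
    targetRevision: main
    path: workloads/nginx                # hypothetical path to the nginx manifests
  destination:
    name: cl01                           # assumption: cluster registered under this name
    namespace: demo-app
  syncPolicy:
    syncOptions:
      - CreateNamespace=true             # create demo-app if it does not exist yet
```

Argo CD applies lower waves first and waits for their resources to report healthy before moving on. Depending on the Argo CD version and configuration, a health check for the Application kind may need to be enabled for the parent to wait on child applications, so it is worth verifying this behavior in your environment.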
Proving the Steady State: Day-2 Operations
The true test of a GitOps design is how it handles Day-2 events. In this model, manual operations like kubectl edit are replaced by pull requests:
- Cluster Upgrades: Updating a VKS cluster version is a one-line change to the VKr reference in Git. Argo CD applies the change, and Cluster API performs a rolling upgrade with zero manual intervention.
- Scaling and Packages: Scaling worker nodes or adding a backup package like Velero follows the same shape: commit the change, push to Git, and let the controllers reconcile it.
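The sketch below shows where those one-line changes would sit in a Cluster API Cluster manifest that uses a managed topology. The field layout follows upstream Cluster API; the apiVersion served in your environment, the ClusterClass name, the worker class name, the node pool name, and the version string are all placeholders or assumptions, so take the real values from your VKS release.

```yaml
# Illustrative Cluster manifest: placeholder values are marked with angle brackets.
apiVersion: cluster.x-k8s.io/v1beta1     # assumption: API version served by your Supervisor
kind: Cluster
metadata:
  name: cl01
  namespace: workload-ns
spec:
  topology:
    class: "<cluster-class-name>"        # placeholder: ClusterClass provided by VKS
    version: "<target-VKr-version>"      # upgrade: change only this line and commit
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: node-pool               # assumption: worker class defined by the ClusterClass
          name: np-1                     # hypothetical node pool name
          replicas: 3                    # scale: change this count and commit
```

Rolling back follows the same path in reverse: reverting the commit restores the previous values in Git, and the controllers reconcile toward them.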
Conclusion
GitOps on VKS isn’t just about managing YAML; it’s about leveraging the Supervisor Service to create a “business-aware” platform. By collapsing infrastructure procedures into Git commits, you eliminate the audit gap and configuration drift that plague traditional CD at scale. The VKS Design Library provides the validated patterns to turn this vision into a production reality. The path is documented, and the repos are ready, so let’s start building.