TL;DR: VKS 3.6 introduces TuneDProfile, a declarative API for Kubernetes OS tuning. It replaces imperative scripts and unsafe sysctls with a supported, Kubernetes-native approach. Key features include per-pool granularity, automated drain-and-reboot handling for kernel changes, and a built-in “Production Ready” profile that fixes common Elasticsearch and database crashes out of the box.
Kubernetes is often described as a “leaky abstraction.” While it does an incredible job of abstracting away underlying infrastructure, the reality is that containers are just processes sharing a kernel. When you deploy stateful, high-performance workloads—like Elasticsearch, Kafka, MongoDB, or Telco applications—that abstraction starts to leak.
Suddenly, you aren’t just managing Pods; you’re debugging vm.max_map_count errors, tracing packet drops due to ring buffer exhaustion, or fighting latency jitter.
In the past, solving these issues meant breaking the Kubernetes model. You might have baked custom OS images (creating a maintenance nightmare), run privileged DaemonSets to fundamentally alter the host (a security risk), or manually SSH-ed into nodes to tweak /proc/sys (creating “snowflake” servers).
With vSphere Kubernetes Service (VKS) 3.6, we are introducing a better way: TuneDProfile.
The Shift Down: Platform Engineering 2.0
As we move toward a “Platform as a Product” mindset, the goal is no longer to “Shift Left” (burdening developers with infrastructure complexity) but to “Shift Down”—embedding that complexity into the platform layer.
TuneDProfile enables this shift. Instead of requiring every developer to know the intricacies of Linux memory subsystems, platform engineers can define “Golden Paths”—pre-validated Node Pools optimized for specific workload types (e.g., “High Performance Database” or “Low Latency Streaming”).
Architecture Deep Dive
VKS leverages a native integration with the industry-standard TuneD Project. The implementation relies on three key components:
- The TuneDProfile CRD: A Custom Resource Definition where you define the actual profile content (in standard INI format).
- Cluster Integration: You can bind these profiles to either all nodes or specific Node Pools via the osConfiguration in your Cluster definition.
- The Machine Agent: A local agent running on every VKS node that applies configuration and manages safety.
Zero-Touch Reboot Orchestration
A major challenge with kernel tuning is that critical parameters (like hugepages or boot arguments) often require a reboot. The VKS Machine Agent and CAPI handle this lifecycle automatically: the agent detects when a change requires a reboot, and the node is drained and rebooted without manual intervention.
Why not just use allowedUnsafeSysctls?
In the past, customers would have to use Kubelet’s allowedUnsafeSysctls. While valid for some use cases, it creates a governance gap.
If you allow a sysctl like net.core.somaxconn via the Kubelet, any Pod on that node can modify it. Since many of these parameters are not fully namespaced (or affect shared resources), one Pod can negatively impact the network performance of its neighbors.
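For contrast, here is a sketch of the Kubelet-era pattern this replaces: a Pod requesting an “unsafe” sysctl via its securityContext, which only works if the node’s Kubelet was started with a matching allowed-unsafe-sysctls allowlist. The Pod and image names are illustrative.

```yaml
# Sketch of the legacy pattern: once a node's Kubelet allowlists the
# sysctl, any Pod scheduled there can set it for itself -- the platform
# team has no per-pool control over who tunes what.
apiVersion: v1
kind: Pod
metadata:
  name: somaxconn-example
spec:
  securityContext:
    sysctls:
    - name: net.core.somaxconn   # must appear in the Kubelet allowlist
      value: "4096"
  containers:
  - name: app
    image: nginx:1.27
```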
TuneDProfile shifts control back to the Platform Admin. It treats kernel tuning as an Immutable Infrastructure property of the Node Pool, ensuring that the node is correctly provisioned before workloads land on it.
Walkthrough: Deploying a Database-Optimized Node Pool
Let’s look at a real-world scenario. You need to deploy an Elasticsearch cluster. To do this safely, you need to increase memory map limits and prevent file descriptor exhaustion.
Step 1: Define the Profile
First, we create the TuneDProfile resource.
```yaml
apiVersion: os.kubernetes.vmware.com/v1alpha1
kind: TunedProfile
metadata:
  name: database-performance
  namespace: my-namespace
spec:
  content: |
    [main]
    summary=Optimized profile for stateful database workloads on VKS
    # Inherit base performance settings from the standard throughput profile
    include=throughput-performance

    [sysctl]
    # STABILITY: Critical for ElasticSearch/OpenSearch.
    # Increases the limit on memory map areas. Without this,
    # database bootstrap checks will fail immediately at startup.
    vm.max_map_count=262144

    # LATENCY: Tells the kernel to prefer keeping data in RAM (page cache)
    # rather than swapping to disk, preventing massive latency spikes.
    vm.swappiness=1

    # SAFETY: System-wide file descriptor limit.
    # WARNING: Do NOT set this to 65536. This is the OS-wide ceiling.
    # We set it to 2M+ (or even Max Int64) to prevent neighbor starvation.
    # Per-process limits (ulimit) are handled by the container runtime.
    fs.file-max=2097152

    # SCALE: Increases the backlog for pending connections.
    # Prevents packet drops during bursts of high traffic (e.g. Ingress spikes).
    net.core.somaxconn=4096

    # RELIABILITY: Sends keepalive probes more frequently (every 10 minutes)
    # to detect dead connections faster than the default 2 hours.
    net.ipv4.tcp_keepalive_time=600
```
Critical Technical Note: In the example above, a common mistake is setting fs.file-max to match the per-process limit (65536). However, fs.file-max is the system-wide ceiling; setting it too low can cause a denial of service for the entire node. We set it to 2 million+ (or even Max Int64) to ensure ample headroom, while relying on the container runtime to handle per-process ulimits.
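To see the distinction on a live node, compare the system-wide ceiling with the per-process limit. These are standard Linux commands, nothing VKS-specific:

```shell
# System-wide ceiling on open file handles (what fs.file-max controls)
cat /proc/sys/fs/file-max

# Per-process limit for the current shell (what ulimit and the
# container runtime control -- this is the one typically set to 65536)
ulimit -n

# Current usage: allocated handles, free handles, system maximum
cat /proc/sys/fs/file-nr
```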
Step 2: Apply it to a Node Pool
Next, we reference this profile in our Cluster topology.
```yaml
# Snippet from Cluster object
topology:
  workers:
    machineDeployments:
    - class: node-pool
      name: database-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              # Activate the profile we defined above
              active:
              - database-performance
              # Map the active profile name to the specific Custom Resource
              profiles:
                database-performance:
                  profileRef:
                    name: database-performance # The name of the TuneDProfile CR
```
Advanced Pattern: The Heterogeneous Cluster
One of the most powerful features of VKS is the ability to decouple infrastructure configuration from the cluster lifecycle. You aren’t limited to a single OS tuning for your entire cluster. By combining Node Pools (MachineDeployments) with TuneDProfile, you can build heterogeneous clusters where the underlying kernel behavior matches the specific workload requirements of that pool.
Consider a cluster hosting both a memory-sensitive data store (like Redis) and a high-density Function-as-a-Service (FaaS) layer. These workloads have opposing requirements:
- The Database (Redis): Requires strict memory guarantees to prevent background save failures. You want vm.overcommit_memory=1 to ensure the kernel always allows memory allocation (essential for Redis snapshots), vm.swappiness=1 to prevent latency-killing swaps, and vm.transparent_hugepages=never.
- The FaaS Layer (Knative / OpenFaaS): Prioritizes density and cost-efficiency. You might want Kernel Samepage Merging enabled to deduplicate memory across thousands of tiny containers, and vm.swappiness=20 to allow cold functions to page out gracefully.
With VKS, you simply define two profiles and map them to distinct pools in the same cluster definition:
```ini
# Snippet from Redis Profile
[main]
summary=Optimized for Redis Data Store
include=latency-performance

[vm]
# Disable Transparent Huge Pages to prevent latency spikes
transparent_hugepages=never

[sysctl]
# Mandatory for Redis background saves
vm.overcommit_memory=1
# Prefer in-memory: use 1 to avoid swap but prevent premature OOM kills
# (0 can be used for latency but requires careful memory planning)
vm.swappiness=1
```
```ini
# Snippet from FaaS Profile
[main]
summary=Optimized for Serverless Density
include=throughput-performance

[sysfs]
# Enable Kernel Samepage Merging for memory deduplication
/sys/kernel/mm/ksm/run=1
# Pages ksmd scans per batch (adjust based on CPU overhead tolerance)
/sys/kernel/mm/ksm/pages_to_scan=100

[sysctl]
# Aggressive swappiness to page out idle functions
vm.swappiness=20
```
```yaml
# Snippet from Cluster topology
topology:
  workers:
    machineDeployments:
    # Pool 1: Optimized for Data Intensity
    - class: node-pool
      name: redis-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active: ["redis-performance"]
              profiles:
                redis-performance:
                  profileRef: {name: "redis-profile"}
    # Pool 2: Optimized for Serverless Density
    - class: node-pool
      name: faas-pool
      replicas: 10
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active: ["faas-density"]
              profiles:
                faas-density:
                  profileRef: {name: "faas-profile"}
```
Connecting the dots: By applying standard Kubernetes Taints and Tolerations to these node pools, you ensure that your Database pods land exclusively on the tuned infrastructure, while your functions pack densely onto the generic nodes. This grants you bare-metal performance characteristics without the operational overhead of managing separate clusters for every workload type.
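As a sketch of that pairing (the taint key workload-type and the label pool are illustrative names, not VKS conventions), you would taint the redis-pool nodes and give only the Redis pods the matching toleration:

```yaml
# Illustrative only. First, taint the pool's nodes so generic workloads
# are repelled, e.g.:
#   kubectl taint nodes -l pool=redis-pool workload-type=database:NoSchedule
# Then give the database pods the matching toleration and selector:
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  nodeSelector:
    pool: redis-pool          # schedule onto the tuned pool
  tolerations:
  - key: workload-type        # tolerate the taint that repels others
    operator: Equal
    value: database
    effect: NoSchedule
  containers:
  - name: redis
    image: redis:7
```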
Production Ready Defaults: The builtin-vks Profile
Not every team wants to maintain custom kernel profiles. For this reason, VKS 3.6 ships with a curated, “batteries-included” profile named builtin-vks-v3.6.0. It is optimized for VKS and solves 90% of common friction points out of the box:
- vm.max_map_count = 262144: Elasticsearch and SonarQube run immediately without custom tuning.
- kernel.pid_max >= 4194304: Prevents PID exhaustion on dense nodes.
- net.ipv4.ip_local_port_range: Expanded for high-connection workloads.
- net.ipv4.neigh.default.gc_thresh*: Adjusted ARP cache limits to support large clusters.
Recommendation: Use builtin-vks as your default baseline. It allows you to deliver a “Production Ready” platform experience from Day 1.
Frequently Asked Questions
Is the builtin-vks profile applied by default?
No. While the builtin-vks profile is available in the cluster, it is not active by default. You must explicitly reference it in your Cluster topology (under osConfiguration).
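Following the osConfiguration pattern from the walkthrough above, opting a pool in might look roughly like this. This is a sketch only: whether you reference the alias builtin-vks or the versioned name builtin-vks-v3.6.0, and whether built-in profiles need a profiles mapping, should be confirmed against the VKS 3.6 documentation.

```yaml
# Sketch: activating the built-in profile on a node pool, assuming the
# same osConfiguration shape as the custom-profile examples above.
topology:
  workers:
    machineDeployments:
    - class: node-pool
      name: general-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active:
              - builtin-vks
```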
How do I verify the profile is actively applied to my nodes?
The Kubernetes API does not surface OS-level configuration. To verify, SSH to a node (or use a privileged Pod) and run tuned-adm active to see the loaded profile, tuned-adm verify to confirm settings match, or check /var/log/tuned/tuned.log for errors.
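A typical verification session on a node might look like this (tuned-adm is the standard TuneD CLI; exact output varies by version):

```shell
# Which profile is TuneD currently running?
tuned-adm active

# Does the running system still match the profile's settings?
tuned-adm verify

# Spot-check an individual parameter from the profile
sysctl vm.max_map_count

# Dig into errors if verification fails
tail -n 50 /var/log/tuned/tuned.log
```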
Does this require a reboot?
It depends on the parameter. Most sysctl changes (like net.core.somaxconn) are applied immediately without a reboot. However, boot-time parameters (like hugepages or iommu settings) require a reboot. The VKS Machine Agent automatically detects this requirement and handles the drain-and-reboot process for you.
Can I break my node?
Yes. Incorrect settings can render nodes unhealthy. Always test custom profiles on a dedicated dev/staging Node Pool first. Best practices:

- Test in isolation: Use dedicated dev/staging Node Pools before production.
- Treat profiles as immutable: Create versioned profiles (e.g., redis-profile-v2) rather than editing active ones.
- Layer your changes: Start with builtin-vks and add incremental tweaks.
- Observe the impact: Watch node health and app metrics after applying new profiles.
- Plan for rollback: Keep builtin-vks as your known-good fallback.
What if I have multiple profiles?
You can list multiple profiles in the active list (e.g., active: ["builtin-vks", "my-db-overrides"]). They are applied in order, meaning settings in the later profiles override those in the earlier ones. This allows you to “layer” your tuning (e.g., Base Profile + Specific App Tweak).
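For instance, the layering described above could be expressed in the tuned block like this (my-db-overrides is an illustrative profile name, not one that ships with VKS):

```yaml
# Sketch: a base profile plus an app-specific overlay.
# Later entries in "active" override earlier ones on conflicting settings.
tuned:
  active:
  - builtin-vks       # base: production-ready defaults
  - my-db-overrides   # overlay: only the app-specific deltas
  profiles:
    my-db-overrides:
      profileRef:
        name: my-db-overrides   # name of the custom TuneDProfile CR
```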
Can I update a profile after it’s applied?
No. TuneDProfile Custom Resources are immutable by design to prevent configuration drift. To modify settings:

- Create a new TuneDProfile CR (e.g., database-performance-v2)
- Update your Cluster's osConfiguration to reference the new profile
- VKS orchestrates a safe rolling update across your Node Pool
Is this Linux only?
Yes. TuneDProfile leverages the Linux-specific tuned daemon. Windows nodes in VKS use a different mechanism for configuration.
Conclusion
With TuneDProfile in VKS 3.6, we are moving OS tuning from imperative hacks to declarative guardrails.
- Supportability: You rely on a VKS-supported mechanism, not a fragile startup script.
- Granularity: Tune specific pools for specific workloads without affecting the whole cluster.
- GitOps Ready: Your kernel tuning is version-controlled YAML, living right alongside your deployment manifests.
Ready to standardize your node performance? Upgrade to VKS 3.6, deploy the builtin-vks profile today, and eliminate configuration drift.

TuneDProfile is one of several operational enhancements in VKS 3.6. To learn about other features like upgrade readiness checks, AppArmor profile management, and RHEL 9 support, read the full VKS 3.6 announcement.