TL;DR: VKS 3.6 introduces TuneDProfile, a declarative API for Kubernetes OS tuning. It replaces imperative scripts and unsafe sysctls with a supported, Kubernetes-native approach. Key features include per-pool granularity, automated drain-and-reboot handling for kernel changes, and a built-in “Production Ready” profile that fixes common Elasticsearch and database crashes out of the box.
Kubernetes is often described as a “leaky abstraction.” While it does an incredible job of abstracting away underlying infrastructure, the reality is that containers are just processes sharing a kernel. When you deploy stateful, high-performance workloads—like Elasticsearch, Kafka, MongoDB, or Telco applications—that abstraction starts to leak.
Suddenly, you aren’t just managing Pods; you’re debugging vm.max_map_count errors, tracing packet drops due to ring buffer exhaustion, or fighting latency jitter.
In the past, solving these issues meant breaking the Kubernetes model. You might have baked custom OS images (creating a maintenance nightmare), run privileged DaemonSets to fundamentally alter the host (a security risk), or manually SSH-ed into nodes to tweak /proc/sys (creating “snowflake” servers).
With vSphere Kubernetes Service (VKS) 3.6, we are introducing a better way: TuneDProfile.
The Shift Down: Platform Engineering 2.0
As we move toward a “Platform as a Product” mindset, the goal is no longer to “Shift Left” (burdening developers with infrastructure complexity) but to “Shift Down”—embedding that complexity into the platform layer.
TuneDProfile enables this shift. Instead of requiring every developer to know the intricacies of Linux memory subsystems, platform engineers can define “Golden Paths”—pre-validated Node Pools optimized for specific workload types (e.g., “High Performance Database” or “Low Latency Streaming”).
Architecture Deep Dive
VKS leverages a native integration with the industry-standard TuneD Project. The implementation relies on three key components:
- The TuneDProfile CRD: A Custom Resource Definition where you define the actual profile content (in standard INI format).
- Cluster Integration: You can bind these profiles to either all nodes or specific Node Pools via the osConfiguration in your Cluster definition.
- The Machine Agent: A local agent running on every VKS node that applies configuration and manages safety.
Zero-Touch Reboot Orchestration
A major challenge with kernel tuning is that critical parameters (like hugepages or boot arguments) often require a reboot. The VKS Machine Agent and CAPI handle this lifecycle automatically: the agent detects when a change requires a reboot, and the node is drained and rebooted without manual intervention.
Why not just use allowedUnsafeSysctls?
In the past, customers would have to use Kubelet’s allowedUnsafeSysctls. While valid for some use cases, it creates a governance gap.
If you allow a sysctl like net.core.somaxconn via the Kubelet, any Pod on that node can modify it. Since many of these parameters are not fully namespaced (or affect shared resources), one Pod can negatively impact the network performance of its neighbors.
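For contrast, here is a sketch of the Kubelet-era pattern this replaces: a Pod requesting an “unsafe” sysctl via its securityContext, which only works if the node’s Kubelet was started with a matching allowed-unsafe-sysctls allowlist. The Pod and image names are illustrative.

```yaml
# Sketch of the legacy pattern: once a node's Kubelet allowlists the
# sysctl, any Pod scheduled there can set it for itself -- the platform
# team has no per-pool control over who tunes what.
apiVersion: v1
kind: Pod
metadata:
  name: somaxconn-example
spec:
  securityContext:
    sysctls:
    - name: net.core.somaxconn   # must appear in the Kubelet allowlist
      value: "4096"
  containers:
  - name: app
    image: nginx:1.27
```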
TuneDProfile shifts control back to the Platform Admin. It treats kernel tuning as an Immutable Infrastructure property of the Node Pool, ensuring that the node is correctly provisioned before workloads land on it.
Walkthrough: Deploying a Database-Optimized Node Pool
Let’s look at a real-world scenario. You need to deploy an Elasticsearch cluster. To do this safely, you need to increase memory map limits and prevent file descriptor exhaustion.
Step 1: Define the Profile
First, we create the TuneDProfile resource.
```yaml
apiVersion: os.kubernetes.vmware.com/v1alpha1
kind: TunedProfile
metadata:
  name: database-performance
  namespace: my-namespace
spec:
  content: |
    [main]
    summary=Optimized profile for stateful database workloads on VKS
    # Inherit base performance settings from the standard throughput profile
    include=throughput-performance

    [sysctl]
    # STABILITY: Critical for ElasticSearch/OpenSearch.
    # Increases the limit on memory map areas. Without this,
    # database bootstrap checks will fail immediately at startup.
    vm.max_map_count=262144

    # LATENCY: Tells the kernel to prefer keeping data in RAM (page cache)
    # rather than swapping to disk, preventing massive latency spikes.
    vm.swappiness=1

    # SAFETY: System-wide file descriptor limit.
    # WARNING: Do NOT set this to 65536. This is the OS-wide ceiling.
    # We set it to 2M+ (or even Max Int64) to prevent neighbor starvation.
    # Per-process limits (ulimit) are handled by the container runtime.
    fs.file-max=2097152

    # SCALE: Increases the backlog for pending connections.
    # Prevents packet drops during bursts of high traffic (e.g. Ingress spikes).
    net.core.somaxconn=4096

    # RELIABILITY: Sends keepalive probes more frequently (every 10 minutes)
    # to detect dead connections faster than the default 2 hours.
    net.ipv4.tcp_keepalive_time=600
```
Critical Technical Note: In the example above, a common mistake is setting fs.file-max to match the per-process limit (65536). However, fs.file-max is the system-wide ceiling; setting it too low can cause a denial of service for the entire node. We set it to 2 million+ (or even Max Int64) to ensure ample headroom, while relying on the container runtime to handle per-process ulimits.
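To see the distinction on a live node, compare the system-wide ceiling with the per-process limit. These are standard Linux commands, nothing VKS-specific:

```shell
# System-wide ceiling on open file handles (what fs.file-max controls)
cat /proc/sys/fs/file-max

# Per-process limit for the current shell (what ulimit and the
# container runtime control -- this is the one typically set to 65536)
ulimit -n

# Current usage: allocated handles, free handles, system maximum
cat /proc/sys/fs/file-nr
```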
Step 2: Apply it to a Node Pool
Next, we reference this profile in our Cluster topology.
```yaml
# Snippet from Cluster object
topology:
  workers:
    machineDeployments:
    - class: node-pool
      name: database-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              # Activate the profile we defined above
              active:
              - database-performance
              # Map the active profile name to the specific Custom Resource
              profiles:
                database-performance:
                  profileRef:
                    name: database-performance # The name of the TuneDProfile CR
```
Advanced Pattern: The Heterogeneous Cluster
One of the most powerful features of VKS is the ability to decouple infrastructure configuration from the cluster lifecycle. You aren’t limited to a single OS tuning for your entire cluster. By combining Node Pools (MachineDeployments) with TuneDProfile, you can build heterogeneous clusters where the underlying kernel behavior matches the specific workload requirements of that pool.
Consider a cluster hosting both a memory-sensitive data store (like Redis) and a high-density Function-as-a-Service (FaaS) layer. These workloads have opposing requirements:
- The Database (Redis): Requires strict memory guarantees to prevent background save failures. You want vm.overcommit_memory=1 to ensure the kernel always allows memory allocation (essential for Redis snapshots), vm.swappiness=1 to prevent latency-killing swaps, and vm.transparent_hugepages=never.
- The FaaS Layer (Knative / OpenFaaS): Prioritizes density and cost-efficiency. You might want Kernel Samepage Merging enabled to deduplicate memory across thousands of tiny containers, and vm.swappiness=20 to allow cold functions to page out gracefully.
With VKS, you simply define two profiles and map them to distinct pools in the same cluster definition:
```ini
# Snippet from Redis Profile
[main]
summary=Optimized for Redis Data Store
include=latency-performance

[vm]
# Disable Transparent Huge Pages to prevent latency spikes
transparent_hugepages=never

[sysctl]
# Mandatory for Redis background saves
vm.overcommit_memory=1
# Prefer in-memory: use 1 to avoid swap but prevent premature OOM kills
# (0 can be used for latency but requires careful memory planning)
vm.swappiness=1
```
```ini
# Snippet from FaaS Profile
[main]
summary=Optimized for Serverless Density
include=throughput-performance

[sysfs]
# Enable Kernel Samepage Merging for memory deduplication
/sys/kernel/mm/ksm/run=1
# Pages ksmd scans per batch (adjust based on CPU overhead tolerance)
/sys/kernel/mm/ksm/pages_to_scan=100

[sysctl]
# Aggressive swappiness to page out idle functions
vm.swappiness=20
```
```yaml
# Snippet from Cluster topology
topology:
  workers:
    machineDeployments:
    # Pool 1: Optimized for Data Intensity
    - class: node-pool
      name: redis-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active: ["redis-performance"]
              profiles:
                redis-performance:
                  profileRef: {name: "redis-profile"}
    # Pool 2: Optimized for Serverless Density
    - class: node-pool
      name: faas-pool
      replicas: 10
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active: ["faas-density"]
              profiles:
                faas-density:
                  profileRef: {name: "faas-profile"}
```
Connecting the dots: By applying standard Kubernetes Taints and Tolerations to these node pools, you ensure that your Database pods land exclusively on the tuned infrastructure, while your functions pack densely onto the generic nodes. This grants you bare-metal performance characteristics without the operational overhead of managing separate clusters for every workload type.
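As a sketch of that pairing (the taint key workload-type and the label pool are illustrative names, not VKS conventions), you would taint the redis-pool nodes and give only the Redis pods the matching toleration:

```yaml
# Illustrative only. First, taint the pool's nodes so generic workloads
# are repelled, e.g.:
#   kubectl taint nodes -l pool=redis-pool workload-type=database:NoSchedule
# Then give the database pods the matching toleration and selector:
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  nodeSelector:
    pool: redis-pool          # schedule onto the tuned pool
  tolerations:
  - key: workload-type        # tolerate the taint that repels others
    operator: Equal
    value: database
    effect: NoSchedule
  containers:
  - name: redis
    image: redis:7
```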
Production Ready Defaults: The builtin-vks Profile
Not every team wants to maintain custom kernel profiles. For this reason, VKS 3.6 ships with a curated, “batteries-included” profile named builtin-vks-v3.6.0. It is optimized for VKS and solves 90% of common friction points out of the box:
- vm.max_map_count = 262144: Elasticsearch and SonarQube run immediately without custom tuning.
- kernel.pid_max >= 4194304: Prevents PID exhaustion on dense nodes.
- net.ipv4.ip_local_port_range: Expanded for high-connection workloads.
- net.ipv4.neigh.default.gc_thresh*: Adjusted ARP cache limits to support large clusters.
Recommendation: Use builtin-vks as your default baseline. It allows you to deliver a “Production Ready” platform experience from Day 1.
Frequently Asked Questions
Is the builtin-vks profile applied by default?
No. While the builtin-vks profile is available in the cluster, it is not active by default. You must explicitly reference it in your Cluster topology (under osConfiguration).
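Following the osConfiguration pattern from the walkthrough above, opting a pool in might look roughly like this. This is a sketch only: whether you reference the alias builtin-vks or the versioned name builtin-vks-v3.6.0, and whether built-in profiles need a profiles mapping, should be confirmed against the VKS 3.6 documentation.

```yaml
# Sketch: activating the built-in profile on a node pool, assuming the
# same osConfiguration shape as the custom-profile examples above.
topology:
  workers:
    machineDeployments:
    - class: node-pool
      name: general-pool
      replicas: 3
      variables:
        overrides:
        - name: osConfiguration
          value:
            tuned:
              active:
              - builtin-vks
```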
How do I verify the profile is actively applied to my nodes?
The Kubernetes API does not surface OS-level configuration. To verify, SSH to a node (or use a privileged Pod) and run tuned-adm active to see the loaded profile, tuned-adm verify to confirm settings match, or check /var/log/tuned/tuned.log for errors.
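A typical verification session on a node might look like this (tuned-adm is the standard TuneD CLI; exact output varies by version):

```shell
# Which profile is TuneD currently running?
tuned-adm active

# Does the running system still match the profile's settings?
tuned-adm verify

# Spot-check an individual parameter from the profile
sysctl vm.max_map_count

# Dig into errors if verification fails
tail -n 50 /var/log/tuned/tuned.log
```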
Does this require a reboot?
It depends on the parameter. Most sysctl changes (like net.core.somaxconn) are applied immediately without a reboot. However, boot-time parameters (like hugepages or iommu settings) require a reboot. The VKS Machine Agent automatically detects this requirement and handles the drain-and-reboot process for you.
Can I break my node?
Yes. Incorrect settings can render nodes unhealthy. Always test custom profiles on a dedicated dev/staging Node Pool first. Best practices:

- Test in isolation: Use dedicated dev/staging Node Pools before production.
- Treat profiles as immutable: Create versioned profiles (e.g., redis-profile-v2) rather than editing active ones.
- Layer your changes: Start with builtin-vks and add incremental tweaks.
- Observe the impact: Watch node health and app metrics after applying new profiles.
- Plan for rollback: Keep builtin-vks as your known-good fallback.
What if I have multiple profiles?
You can list multiple profiles in the active list (e.g., active: ["builtin-vks", "my-db-overrides"]). They are applied in order, meaning settings in the later profiles override those in the earlier ones. This allows you to “layer” your tuning (e.g., Base Profile + Specific App Tweak).
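For instance, the layering described above could be expressed in the tuned block like this (my-db-overrides is an illustrative profile name, not one that ships with VKS):

```yaml
# Sketch: a base profile plus an app-specific overlay.
# Later entries in "active" override earlier ones on conflicting settings.
tuned:
  active:
  - builtin-vks       # base: production-ready defaults
  - my-db-overrides   # overlay: only the app-specific deltas
  profiles:
    my-db-overrides:
      profileRef:
        name: my-db-overrides   # name of the custom TuneDProfile CR
```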
Can I update a profile after it’s applied?
No. TuneDProfile Custom Resources are immutable by design to prevent configuration drift. To modify settings:

- Create a new TuneDProfile CR (e.g., database-performance-v2)
- Update your Cluster's osConfiguration to reference the new profile
- VKS orchestrates a safe rolling update across your Node Pool
Is this Linux only?
Yes. TuneDProfile leverages the Linux-specific tuned daemon. Windows nodes in VKS use a different mechanism for configuration.
Conclusion
With TuneDProfile in VKS 3.6, we are moving OS tuning from imperative hacks to declarative guardrails.
- Supportability: You rely on a VKS-supported mechanism, not a fragile startup script.
- Granularity: Tune specific pools for specific workloads without affecting the whole cluster.
- GitOps Ready: Your kernel tuning is version-controlled YAML, living right alongside your deployment manifests.
Ready to standardize your node performance? Upgrade to VKS 3.6, deploy the builtin-vks profile today, and eliminate configuration drift.

TuneDProfile is one of several operational enhancements in VKS 3.6. To learn about other features like upgrade readiness checks, AppArmor profile management, and RHEL 9 support, read the full VKS 3.6 announcement.