Broadcom’s goal is to deliver the world’s best private cloud platform, and we are on a mission to help customers unlock as much value as possible with VMware Cloud Foundation (VCF). The focus for the vSphere platform has been to reduce TCO, provide end-to-end security, and reduce operational complexity to deliver cloud-like agility.
We continuously strive to understand how our customers deploy and allocate their hardware infrastructure to run workloads. Broadcom is introducing a key infrastructure innovation, memory tiering, which enables customers to get more memory capacity at a much lower cost than DRAM and helps densify their hosts to improve CPU utilization. We will show how memory tiering creates value by optimizing customer infrastructure. We will delve into customer pain points, provide background on the history and prior work on memory, introduce memory tiering, discuss customer value propositions, and conclude with a real customer study.
Customer pain-points
Most of our customers today face challenges in consolidating their infrastructure and making it more efficient. This is largely because of how their environments evolve: they end up with low CPU utilization but high memory demand. We cover the details in this section.
Expensive DRAM
DRAM is typically the single most expensive component in a server BOM, accounting for anywhere from 30% to 70% of the overall server cost. For an example, see the DRAM list prices for a popular OEM server: PowerEdge R750 Rack Server | Dell United States
Customers want to increase their memory capacity without a significant additional investment. Applications tend to grow in size over time, consuming more host resources and supporting more users, yet customers often cannot scale their infrastructure because additional memory is unavailable.
Memory Bottleneck – High consumed memory but low active memory usage
Once deployed, customer VMs tend to consume all the memory allocated to them due to OS and application behavior, leaving too little memory when workloads need to scale. Memory is thus a bottleneck for most customers, especially when spare CPU capacity is available.
Complexity of adding DRAM
Customers cannot simply add more DRAM (DIMMs) to relieve the memory bottleneck, because DIMM slots are usually already fully populated (to achieve bandwidth across all memory channels). Upgrading instead means replacing DIMMs with higher-density modules, which cost disproportionately more per gigabyte than equivalent lower-density parts (e.g., one 128GB module costs a lot more than two 64GB modules). Customer memory costs therefore rise steeply rather than linearly.
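The nonlinear cost of DIMM density can be sketched with a quick calculation. The prices below are hypothetical placeholders for illustration only, not actual OEM list prices; the point is simply that the cost per GB rises with module density:

```python
# Hypothetical DIMM list prices for illustration only; real pricing varies.
PRICE_64GB = 350.0    # one 64GB DIMM
PRICE_128GB = 1100.0  # one 128GB DIMM (typically far more than 2x a 64GB part)

per_gb_64 = PRICE_64GB / 64    # cost per GB at lower density
per_gb_128 = PRICE_128GB / 128  # cost per GB at higher density

# Doubling capacity in the same (already full) slots means paying the
# higher per-GB rate for every gigabyte, not just the added ones.
print(f"64GB DIMM: ${per_gb_64:.2f}/GB, 128GB DIMM: ${per_gb_128:.2f}/GB")
```

With these assumed prices, doubling capacity in place costs roughly 3x the original memory spend, not 2x, which is why "just add DRAM" is rarely an economical option.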
Low CPU usage along with increasing cores and software licensing costs
Many customers report having unused CPU resources: CPU utilization stays low because memory is the bottleneck, yet they still pay for enterprise software licenses based on total CPU cores, regardless of actual usage.
As data center CPUs evolve to support more cores (e.g., Intel’s Sapphire Rapids with 60 cores and Granite Rapids with 128; AMD’s Genoa with 96 and Turin with 192), the CPU-memory imbalance and the ensuing memory bottlenecks become more acute. At the same time, customers seek higher core counts to consolidate environments, reduce host count, and save on software licenses and power, space, and cooling (P-S-C). This creates a challenge: customers must satisfy the contradictory goals of deploying more cores on fewer hosts while paying licensing costs per core, even if those cores are not utilized efficiently.
Memory as a key infrastructure component
Memory, alongside compute, storage, and networking, is a critical resource that must be carefully managed, with strong implications for cost. Memory efficiency has long been a priority at VMware by Broadcom, helping customers run more applications, consolidate data centers, and reduce costs without compromising performance or reliability. The hypervisor optimizes resource usage to achieve these efficiencies while balancing resources and preserving the application experience.
With deep expertise in memory management, VMware by Broadcom introduced innovations like DRS and vMotion decades ago, enabling careful workload mobility by monitoring and transferring memory pages from host to host without affecting application performance (see The vMotion Process Under the Hood – VMware vSphere Blog), and without the end-user realizing that underlying infrastructure had changed!
We have built considerable expertise in monitoring and managing application memory pages because, for customers, memory is precious: system DRAM is costly and limited in capacity, and injudicious use degrades application performance and therefore user experience. VMware delivers turn-key features for customer benefit, purpose-built for running VMs and containers.
VMware explored Intel’s Optane persistent memory technology, supporting multiple use cases over many years, before the technology met an untimely demise (see Intel Optane, Memory Optimization, and vSphere). VMware found a strong alternative in NVMe and built on the Optane experience.
Memory tiering is a technology that enables fine-grained classification of memory pages, allowing the hypervisor to intelligently monitor and move data between faster DRAM and slower NVMe tiers. By proactively offloading inactive (cold) memory pages from DRAM, more memory becomes available for active workloads—boosting scalability, supporting more VMs, and lowering costs compared to DRAM-only setups. It is fully integrated into the ESXi kernel as part of the memory management subsystem and helps present a uniform single address-space of host memory.
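The hot/cold page movement described above can be illustrated with a toy model. This is a minimal conceptual sketch only, not ESXi's actual memory management algorithm: it uses simple LRU ordering to decide which pages are cold enough to demote from the fast (DRAM) tier to the slow (NVMe) tier, and promotes a slow-tier page back on access:

```python
from collections import OrderedDict

class TwoTierMemory:
    """Toy two-tier memory model (illustrative only, not ESXi's algorithm).
    Pages not touched recently are demoted from the fast tier (DRAM) to the
    slow tier (NVMe); accessing a slow-tier page promotes it back."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # page -> data, LRU order (coldest first)
        self.slow = {}

    def access(self, page):
        if page in self.fast:
            self.fast.move_to_end(page)      # touched again: mark as hot
        else:
            self.slow.pop(page, None)        # promote on access (if demoted)
            self.fast[page] = "data"
            while len(self.fast) > self.fast_capacity:
                cold, data = self.fast.popitem(last=False)  # demote coldest
                self.slow[cold] = data

tiers = TwoTierMemory(fast_capacity=2)
for page in ["a", "b", "c", "a"]:
    tiers.access(page)
# "b" is the least recently used page, so it now sits in the slow tier,
# while the hot pages "c" and "a" occupy the fast tier.
```

The production system differs in many ways (page-granularity scanning, access-bit sampling, a single uniform address space presented to VMs), but the demote-cold/promote-hot cycle is the core idea.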
NVMe was chosen as the tiering device because it offers very low latency and is ubiquitous and standardized, with low operational overhead. Memory tiering on ESXi runs equally well on older server generations, with both Intel and AMD CPUs.
Memory Tiering is now GA with VMware Cloud Foundation 9.0 (VCF 9.0), and VMware vSphere Foundation 9.0 (VVF 9.0). (See the announcement https://blogs.vmware.com/cloud-foundation/2025/06/19/advanced-memory-tiering-now-available/ )
Who can benefit from Memory Tiering and how?
In this section, we look at customer types and the value proposition for each. Customers generally fall into two scenarios, labeled A and B below.
Customer A scenario: They have a fresh, greenfield install. Instead of purchasing expensive DRAM, they can choose to deploy NVMe to save on hardware costs.
Customer B scenario: They want to further increase the efficiency of their existing hosts to support more workloads. They plug in an NVMe device to improve CPU utilization and densify their existing hosts, consolidating to save on additional software licenses, additional hardware, and P-S-C.
Hardware savings – Customer A
In the comparison below (using a popular OEM server: https://www.dell.com/en-us/shop/cty/pdp/spd/poweredge-r760/), the left-hand side shows a DRAM-only configuration, whereas the right-hand side shows half of the DRAM substituted with NVMe. The customer spend drops from $55,878 to $33,792 per server, a saving of almost 40% on the overall server cost.
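The savings percentage follows directly from the two per-server figures quoted above:

```python
dram_only = 55878   # per-server cost, DRAM-only configuration ($)
tiered = 33792      # per-server cost with half the DRAM replaced by NVMe ($)

savings = dram_only - tiered
pct = 100 * savings / dram_only
print(f"Savings: ${savings:,} per server ({pct:.1f}%)")  # ~39.5%, i.e. almost 40%
```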
Licensing Savings and ROI – Customer B
The table below shows costs and a customer setup using a 64-core, 2-socket Dell server. By increasing CPU utilization, the customer saves the equivalent of 12 cores — leading to significant software license savings, including VCF and SQL per-core licenses. For some customers, license savings even exceed hardware savings, with an annual ROI of $28,200 (10%) per server, based solely on Microsoft SQL Server subscription pricing. (SQL Server 2022—Pricing | Microsoft)
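The per-core license value implied by the figures above is easy to back out. This sketch only rearranges the numbers quoted in the text (12 cores saved, $28,200 annual ROI); it is not an independent statement of Microsoft's pricing:

```python
cores_saved = 12        # cores freed up by higher CPU utilization
annual_roi = 28_200     # annual ROI per server from the SQL Server example ($)

# Implied annual license value per core, derived from the article's figures.
per_core_per_year = annual_roi / cores_saved
print(f"Implied license value: ${per_core_per_year:,.0f} per core per year")
```

This works out to $2,350 per core per year, which is why per-core licensed software (SQL Server, VCF) dominates the ROI for some customers even before hardware savings are counted.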
Minimizing Performance Impact
Memory tiering involves continuously monitoring and moving memory pages of application workloads. A key goal is to ensure zero to minimal performance impact. Performance testing shows that memory tiering works well not only with VDI and IT workloads but also with performance-sensitive workloads such as mission-critical and OLTP databases. Compared to DRAM-only hosts, hosts with memory tiering (using NVMe) demonstrate a linear boost in performance and resource utilization, enabling customers to run more workloads efficiently on the same host without compromising performance.
The first chart shows VMmark3 results (VMmark, representing an average VMware user, runs a mix of workload VMs: OLTP databases, web servers, email, etc.) demonstrating linear scalability. Each VMmark tile represents 19 VMs. With memory tiering, we were able to scale up by 3 additional tiles without failing QoS metrics, with an almost 2x increase in throughput (orange line).
The second chart shows VDI (using the Login Enterprise benchmark) scaling linearly from 160 to 320 VDI instances (VMs). At 320 VMs, the additional memory capacity allowed us to increase CPU utilization to 90% while still achieving a good End-User Experience (EUX) score of 7.8.
The third chart below shows Microsoft SQL Server under the HammerDB v5.0 benchmark (TPC-C profile), demonstrating a VM density increase. Without additional memory capacity, running more than 6 VMs caused SQL Server transactions to time out.
With memory tiering, however, we were able to run 6 more VMs (12 total), achieving aggregate throughput of about 9 million transactions per minute (TPM), compared to about 5.8 million TPM with 6 VMs.
The key takeaway is that both density and throughput increase. These numbers also show that memory tiering can be applied to mission-critical databases.
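The SQL Server figures quoted above can be summarized with a little arithmetic, which makes the density-versus-throughput trade-off explicit:

```python
baseline_vms, baseline_tpm = 6, 5.8e6   # DRAM-only: 6 VMs, ~5.8M TPM aggregate
tiered_vms, tiered_tpm = 12, 9.0e6      # with memory tiering: 12 VMs, ~9M TPM

density_gain = tiered_vms / baseline_vms       # 2.0x VMs per host
throughput_gain = tiered_tpm / baseline_tpm    # ~1.55x aggregate throughput
per_vm_tpm = tiered_tpm / tiered_vms           # ~750k TPM per VM

print(f"{density_gain:.1f}x density, {throughput_gain:.2f}x aggregate TPM, "
      f"{per_vm_tpm:,.0f} TPM per VM")
```

Per-VM throughput dips somewhat at the higher density (as expected when cold pages live on NVMe), but aggregate throughput still rises by roughly 55% while density doubles.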
Look out for an upcoming white paper, which will discuss these and further results in much more detail. For prior similar runs, please refer to the performance blog: Extreme Performance Series 2024: Improving Server Consolidation with Memory Tiering.
Customer case study
In this section, let’s look at how a customer, SS&C, was able to save considerable costs when they deployed memory tiering.
SS&C – Powering the private cloud environment
SS&C provides software solutions for financial services and healthcare. Its SS&C Cloud offering is a private cloud for diverse workloads (databases, VDI, containers, etc.) where end-customers are charged based on custom CPU and memory use. As the figure below shows, SS&C previously used Intel Optane persistent memory (PMem) at a 4:1 PMem-to-DRAM ratio to reduce DRAM costs. SS&C’s active memory usage was just 5–6% of their overall memory capacity, while their consumed memory was much larger. After Optane’s discontinuation, SS&C adopted NVMe-based memory tiering as a viable alternative.
The picture below (left) shows that SS&C previously invested 50-86% of their server cost in memory; after moving to NVMe-based memory tiering with a 1:1 DRAM-to-NVMe capacity ratio, they saved 50%-65% on memory costs alone (green box).
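A rough model shows why a 1:1 DRAM-to-NVMe capacity ratio cuts memory spend so sharply. The per-GB prices here are hypothetical placeholders chosen only to illustrate the shape of the calculation; SS&C's actual savings (50%-65%) depend on their real component pricing:

```python
# Hypothetical per-GB prices for illustration only; actual pricing varies.
DRAM_PER_GB = 6.00   # assumed DRAM cost per GB ($)
NVME_PER_GB = 0.30   # assumed NVMe cost per GB ($), a small fraction of DRAM

total_capacity_gb = 2048  # target host memory capacity

dram_only_cost = total_capacity_gb * DRAM_PER_GB

# 1:1 DRAM-to-NVMe capacity ratio: half the capacity from each medium.
tiered_cost = (total_capacity_gb / 2) * DRAM_PER_GB \
            + (total_capacity_gb / 2) * NVME_PER_GB

savings_pct = 100 * (dram_only_cost - tiered_cost) / dram_only_cost
print(f"Memory cost savings: {savings_pct:.1f}%")  # ~47.5% with these prices
```

Because NVMe costs only a small fraction of DRAM per GB, the savings approach 50% of the memory bill at a 1:1 ratio, and grow further if a larger share of capacity is tiered.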
Brandon Frost, Director of Virtualization Architecture and Engineering at SS&C, has said:
“Most VMs just consume a lot of memory but they are not really being used and CPU is wasted on hosts with high memory consumption. Being able to have an NVMe tier would allow more memory on the hosts, without the cost, enabling better CPU utilization. SS&C has been able to reduce costs on DRAM.”
Conclusion
VMware will continue to innovate to provide more value to customers. Memory remains a key infrastructure component, but it also presents challenges for customers who are otherwise unable to fully utilize their infrastructure.
Memory tiering takes memory virtualization to the next level and helps reduce costs by consolidating customer infrastructure resources and improving efficiency without sacrificing performance. Memory tiering is easy to deploy and manage, with proven performance across a variety of workloads, and is now ready for production deployments. Memory tiering is suited for many customer use-cases and can increase the value of customer VMware Cloud Foundation (VCF) and VMware vSphere Foundation (VVF) environments.
***
Ready to get hands-on with VMware Cloud Foundation 9.0? Dive into the newest features in a live environment with Hands-on Labs that cover platform fundamentals, automation workflows, operational best practices, and the latest vSphere functionality for VCF 9.0.