I recently deployed some new hardware in my lab and, to my surprise, discovered that after I moved several VMs to the new hosts their memory utilization went up. I also noticed that Transparent Page Sharing (TPS) wasn’t working on these hosts. This didn’t make sense, so naturally I did some digging. I want to share what I learned, because I know I’m not the only one upgrading hardware and many of you will likely stumble onto this same issue.
First, I ran several tests to verify that my initial observations were correct: my VMs were in fact consuming more memory, and TPS was not being used on the new servers. What I didn’t expect to find was that this is normal behavior, and that the reason has to do with the CPUs in my new servers. Let me explain.
Memory is allocated to VMs in either small pages (4KB) or large pages (2MB). While I don’t want to digress into a discussion about large vs. small pages, suffice it to say that prior to ESX 3.5 large pages weren’t really used. I’m not sure why, but it appears that VMware’s focus was on using small pages and leveraging TPS as much as possible.
However, coinciding with the release of ESX 3.5, Intel and AMD introduced a new CPU feature called hardware-assisted Memory Management Unit (MMU) virtualization (http://kb.vmware.com/kb/1020524). You can read the KB article for more detail, but in a nutshell hardware-assisted MMU can provide a 10 to 20% performance improvement when using large pages. So starting in ESX 3.5, VMware changed the VMkernel to detect whether hardware-assisted MMU is enabled and, if so, to use large pages. If hardware-assisted MMU is not available, or it is disabled, ESX falls back to using small pages.
The reason I’m seeing more VM memory consumption and less TPS on the new servers is that they have the newer Nehalem CPUs with hardware-assisted MMU. The VMkernel has detected that it is enabled and is using large pages. Because large pages are in use, more memory is being allocated to my VMs. And while TPS is enabled for both large and small pages, it doesn’t really do much with large pages, because it’s unlikely that many 2MB memory regions will be identical and therefore shareable. That is why it appears that TPS is not working.
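To make the 2MB vs. 4KB point concrete, here is a rough Python sketch. It is not how the VMkernel implements TPS; it is only a toy model under my own assumptions: the "memory" is synthetic, built from a hypothetical pool of 16 distinct 4KB page contents placed in random order, and the helper names (shareable_fraction, synthetic_memory) are made up for illustration. It hashes the same memory at both granularities and counts how many pages duplicate one seen earlier.

```python
import hashlib
import random

SMALL_PAGE = 4 * 1024          # 4KB small page
LARGE_PAGE = 2 * 1024 * 1024   # 2MB large page


def shareable_fraction(memory: bytes, page_size: int) -> float:
    """Return the fraction of pages whose content duplicates an earlier page.

    Simplification: pages are compared by SHA-1 digest only.
    """
    seen = set()
    duplicates = 0
    total = len(memory) // page_size
    for i in range(total):
        page = memory[i * page_size:(i + 1) * page_size]
        digest = hashlib.sha1(page).digest()
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates / total


def synthetic_memory(num_large_pages: int, seed: int = 0) -> bytes:
    """Build fake guest memory from a small pool of repeated 4KB pages.

    Assumption for illustration: only 16 distinct page contents exist,
    scattered in random order, so small pages repeat constantly.
    """
    rng = random.Random(seed)
    pool = [bytes([n]) * SMALL_PAGE for n in range(16)]
    small_per_large = LARGE_PAGE // SMALL_PAGE  # 512 small pages per large page
    count = num_large_pages * small_per_large
    return b"".join(rng.choice(pool) for _ in range(count))


if __name__ == "__main__":
    memory = synthetic_memory(8)  # 16MB of synthetic memory
    print("4KB pages shareable:", shareable_fraction(memory, SMALL_PAGE))
    print("2MB pages shareable:", shareable_fraction(memory, LARGE_PAGE))
```

In this synthetic setup nearly every 4KB page is a duplicate of another, yet no 2MB region matches any other, because a large page is only shareable if all 512 of its small pages line up identically and in the same order.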
Does this mean TPS is broken when running on CPUs with hardware-assisted MMU? No. The good news is that the VMkernel will only continue to use large pages as long as there is no memory contention on the host. If memory contention develops, the VMkernel will automatically switch to small pages and apply TPS in an effort to free up memory (http://kb.vmware.com/kb/1021095). So you really get the best of both worlds: when memory is plentiful, ESX uses large pages for a modest performance benefit, but when memory contention develops it switches gears back to small pages in order to leverage TPS and reduce overall memory consumption.
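The behavior described above boils down to a simple decision, which can be sketched in a few lines of Python. This is only my summary of the policy as I understand it from the KB articles, not actual VMkernel logic; HostState and backing_page_size_kb are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical snapshot of the two conditions discussed in this post."""
    hardware_mmu_enabled: bool   # CPU offers hardware-assisted MMU and it is on
    memory_contention: bool      # host is under memory pressure


def backing_page_size_kb(host: HostState) -> int:
    """Return the page size (in KB) used to back VM memory.

    Large pages (2MB) are used only when hardware-assisted MMU is enabled
    and memory is plentiful; otherwise small pages (4KB) are used, which
    is where TPS can actually find identical pages to share.
    """
    if host.hardware_mmu_enabled and not host.memory_contention:
        return 2048  # 2MB large pages
    return 4         # 4KB small pages


if __name__ == "__main__":
    for mmu in (True, False):
        for contention in (False, True):
            state = HostState(mmu, contention)
            print(state, "->", backing_page_size_kb(state), "KB")
```

The only case that yields large pages is a host with hardware-assisted MMU and no contention, which matches what I saw on my new, lightly loaded servers.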
For more information on TPS, please check out Duncan’s recent blog post on Yellow-Bricks: http://www.yellow-bricks.com/2011/01/10/how-cool-is-tps/.