
Assess the Performance Impact of the Security Change in Transparent Page Sharing Behaviour

As VMware continues to use a “secure by default” policy, there are some upcoming security changes to the Transparent Page Sharing (TPS) memory mechanism that you need to be aware of and should assess for potential performance impact.

Essentially, the ability to share memory pages “between” virtual machines (inter-VM sharing) is being disabled by default, and only sharing of memory pages “within” a virtual machine (intra-VM sharing) will be allowed.

In preparation, a new salting mechanism has already been introduced with existing patches, to help TPS determine which memory pages can be shared.
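
To make the effect of salting concrete, here is a minimal Python sketch. It assumes that two pages are only eligible for sharing when both their salt and their contents match; the salt strings, the hash function, and the `pages_saved` helper are hypothetical and only illustrate the idea, not how the vmkernel implements it.

```python
import hashlib
from collections import Counter

PAGE_SIZE = 4096  # small (4KB) page


def pages_saved(pages):
    """pages: iterable of (salt, content) tuples.

    Returns the number of physical pages reclaimed when every group of
    pages with the same salt and the same content collapses to one copy.
    """
    groups = Counter(
        (salt, hashlib.sha256(content).digest()) for salt, content in pages
    )
    return sum(count - 1 for count in groups.values())


# Stand-in for a page of identical guest OS code present in both VMs.
os_page = b"\x01" * PAGE_SIZE
vm_a = [os_page, os_page]
vm_b = [os_page]

# Same salt everywhere: behaves like inter-VM sharing.
shared_salt = [("common", p) for p in vm_a + vm_b]
# Unique salt per VM: only intra-VM sharing is possible.
per_vm_salt = [("vm-a", p) for p in vm_a] + [("vm-b", p) for p in vm_b]

print(pages_saved(shared_salt))  # 2
print(pages_saved(per_vm_salt))  # 1
```

With a common salt, three identical pages collapse to one copy. With a salt per virtual machine, only the duplicate inside the first VM is reclaimed.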

VMware has now begun releasing updates that will enforce these new behaviors by default.

More info here: KB2080735

The impact here is that current levels of shared memory generated by TPS may change, since savings across virtual machines will no longer be leveraged. This may increase the amount of host memory required to support the same number of workloads and will further vary based on the type of workload.

Therefore, it makes sense to quickly assess the worst-case scenario before applying these updates, to ensure you don’t drive ESXi host memory usage up to levels at which memory page swapping occurs. Swapping memory pages to disk will negatively impact performance.

Scripting guru Brian Graf graciously helped me create a PowerShell script that looks at shared memory per host and reports it in tabular form, so you can easily review the current shared memory savings and the worst-case impact in contrast with the free memory on the host.

We call it the “Host Memory Assessment Tool”

What the script does:

  • Connects to vCenter and enumerates all ESXi hosts
  • Allows you to enable SSH on selected hosts
  • Generates an assessment report
  • Allows you to export the assessment report to .csv
  • Allows you to easily turn off SSH again if necessary

Assumptions:

All ESXi hosts have the same root account and password. If not, you’ll need to do something more creative here.

Requirements (and tested with):

  • PowerShell 3
  • PowerCLI 5.5 or higher
  • Internet connection (or have plink.exe in c:\temp)
  • Open PowerShell as Administrator and execute: Set-ExecutionPolicy Unrestricted
  • Read/write access to c:\temp for working files

The script uses plink.exe to remotely SSH into each ESXi host and record memory counters using vsish. There is very low risk and impact to the ESXi hosts, as it is a read-only process.

I should note that vsish is a relatively undocumented tool and its functions are subject to change, so should the script stop working, let me know.

Before we look at the script output, we should quickly review the notion of zero pages, as they often account for a large percentage of shared memory savings. Zero pages are memory pages that are filled with zeros. TPS shares them immediately, regardless of whether they are large (2MB) or small (4KB). If you think about how an OS initializes, it will immediately touch and zero out all memory before allocating it. Even when zero pages can no longer be shared between virtual machines, significant memory savings still exist within each virtual machine, because its zero pages can still be shared with one another. The additional cost is one extra zero page per virtual machine.
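
As a rough illustration of that last point, the sketch below counts how many physical pages are needed to back all zero pages with and without inter-VM sharing. The helper names and the three-VM example are hypothetical, and the model is deliberately simplified to small pages only.

```python
PAGE_SIZE = 4096  # small (4KB) page


def is_zero_page(page: bytes) -> bool:
    """A zero page contains nothing but zero bytes."""
    return not any(page)


def zero_backing_pages(vms, inter_vm_sharing: bool) -> int:
    """vms: list of per-VM page lists.

    Returns how many physical pages back all zero pages on the host.
    """
    vms_with_zero = sum(
        1 for pages in vms if any(is_zero_page(p) for p in pages)
    )
    if inter_vm_sharing:
        # One zero page can back every VM on the host.
        return 1 if vms_with_zero else 0
    # Each VM keeps its own single shared zero page.
    return vms_with_zero


zero = bytes(PAGE_SIZE)
# Three hypothetical VMs, each holding 1000 zero pages.
vms = [[zero] * 1000 for _ in range(3)]

print(zero_backing_pages(vms, inter_vm_sharing=True))   # 1
print(zero_backing_pages(vms, inter_vm_sharing=False))  # 3
```

In this example, 3000 zero pages are backed by one physical page with inter-VM sharing and by three without it, so the extra cost is two 4KB pages.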

The real impact one needs to assess is how much of the memory being shared amongst virtual machines is not zero pages. Here, we take the total number of memory pages saved using TPS and subtract the zero pages. This is the amount of memory that, as a worst case, can no longer be saved. VDI is one example where this matters. It makes sense to quickly contrast this potential loss of savings against the host free memory, knowing your savings will likely end up somewhere in between. This is currently the best instrumentation available to us.
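
The arithmetic is simple enough to sketch in a few lines of Python. The figures are made up, and the sketch assumes the counters are counts of small (4KB) pages, which may not match the units the actual script reads from the host.

```python
PAGE_SIZE_KB = 4  # assumption: counters are counts of small (4KB) pages


def pages_to_mb(pages: int) -> float:
    """Convert a count of 4KB pages to megabytes."""
    return pages * PAGE_SIZE_KB / 1024


def worst_case_lost_mb(tps_saved_pages: int, zero_pages: int) -> float:
    """Memory TPS saves today that is not zero pages, in MB."""
    return pages_to_mb(tps_saved_pages - zero_pages)


# Hypothetical host: TPS saves 10 GB in total, 8 GB of which is zero pages.
tps_saved_pages = 2_621_440
zero_pages = 2_097_152

print(pages_to_mb(tps_saved_pages))                     # 10240.0
print(worst_case_lost_mb(tps_saved_pages, zero_pages))  # 2048.0
```

For this hypothetical host, up to 2 GB of savings could disappear, so you would want at least that much free memory before applying the update.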

The Assessment Report:

Host Memory Assessment Tool

The assessment report collects and displays the following information.

  • Total Host Memory – amount of physical memory the vmkernel can access.
  • Host Mem Saved via TPS – amount of memory TPS has currently saved.
  • Host Mem Saved via TPS Zero Pages – amount of TPS-saved memory that consists of zero pages.
  • Potential Host Mem Savings Lost – the total memory TPS has saved minus the zero pages. This constitutes the memory savings that “may” be lost when you can no longer share across all virtual machines.
  • Host Free Mem – amount of memory currently being reported as free by the vmkernel. If the potential lost memory savings is greater than this value, you might be at risk of swapping.
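
For readers who want to post-process the numbers themselves, here is a small Python sketch of how a report like this could be assembled and exported to .csv. It is not the PowerShell script; the host names and values are invented, all values are assumed to be in MB, and the "At Risk" column is a hypothetical addition that flags hosts where the potential lost savings exceed free memory.

```python
import csv
import io

FIELDS = [
    "Host",
    "Total Host Memory",
    "Host Mem Saved via TPS",
    "Host Mem Saved via TPS Zero Pages",
    "Potential Host Mem Savings Lost",
    "Host Free Mem",
    "At Risk",  # hypothetical extra column, not in the original report
]


def build_row(host, total_mb, tps_mb, zero_mb, free_mb):
    """Build one report row; all memory values are in MB."""
    lost_mb = tps_mb - zero_mb
    return {
        "Host": host,
        "Total Host Memory": total_mb,
        "Host Mem Saved via TPS": tps_mb,
        "Host Mem Saved via TPS Zero Pages": zero_mb,
        "Potential Host Mem Savings Lost": lost_mb,
        "Host Free Mem": free_mb,
        "At Risk": lost_mb > free_mb,
    }


def to_csv(rows) -> str:
    """Render the report rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Invented example hosts.
rows = [
    build_row("esx01", 262144, 10240, 8192, 4096),
    build_row("esx02", 262144, 20480, 4096, 8192),
]
print(to_csv(rows))
```

Here esx01 could lose 2048 MB against 4096 MB free and is not flagged, while esx02 could lose 16384 MB against 8192 MB free and is.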

The PowerShell script is available here (updated v2 March 243, 2015): Host Memory Assessment Tool v2.ps1

(and for those new to git – like I was – press the RAW button and save the resulting page as the .PS1 script)