I’ve recently been involved in a couple of situations in which the ‘preferHT’ advanced setting was implemented, but for the wrong reasons. I want to re-clarify how and when it should be used. As with many advanced settings, it can help or hurt.
“PreferHT exposes Hyper-Threading to the guest operating system” – False!
vSphere presents vCPUs to the guest operating system and never exposes when those vCPUs are scheduled onto a Hyper-Threaded SMT context. The guest operating system sees only what look like physical cores. vSphere chooses when to schedule on the Hyper-Threading SMT context.
[Side note: So if you had a physical server with 2 sockets and 8 cores per socket, and therefore 32 logical processors with Hyper-Threading enabled, you would need to configure a virtual machine with 32 vCPUs to make it potentially equivalent.]
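The arithmetic behind that side note can be sketched in a few lines of Python. The function name and parameters here are made up for illustration and are not part of any vSphere tooling.

```python
def logical_processors(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Count the logical processors a host offers to the scheduler.

    threads_per_core is 2 with Hyper-Threading enabled and 1 with it disabled.
    """
    return sockets * cores_per_socket * threads_per_core


# The host from the side note: 2 sockets, 8 cores per socket, Hyper-Threading on.
host_lcpus = logical_processors(sockets=2, cores_per_socket=8)
print(host_lcpus)  # 32

# The same host with Hyper-Threading disabled offers only its physical cores.
print(logical_processors(sockets=2, cores_per_socket=8, threads_per_core=1))  # 16
```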
There have been some great articles written in the past with respect to this advanced setting here and here.
The gist is that vSphere will spread out a VM’s virtual CPUs across as many physical cores as possible in order to provide the best performance to an SMP virtual machine. In other words, it will try to give every vCPU its own physical core with nothing else scheduled on the secondary hyperthread. It will first use all the physical cores before using the secondary SMT contexts of Hyper-Threaded processors as its default behavior. Another way to say this is that by default vSphere prefers to use physical cores over hyperthreads.
We all know from this article that Hyper-Threaded processors provide two execution contexts per physical core, but since they share execution resources, they don’t double compute capability. However, using two physical cores would provide twice the computing power of a single core, which is why the default behavior is to use as many unique cores as possible.
What does it do?
Now there are some cases in which you might want to use the SMT context instead of another unique core. One case is when the application has a cache-intensive footprint and can really benefit from the local processor cache. In that case you would want to consolidate the number of physical cores you’re using so that they will all share the same processor caches. Another is when you want to limit a VM to a single NUMA node for memory locality purposes, but also want to allow it to have the ability to use the hyperthreads.
For cases like these, you can change the NUMA scheduler behavior, on either a per-host basis (which has been available since the vSphere 4 era) or on a per-virtual-machine basis (available since vSphere 5).
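To make the difference concrete, here is a toy placement model in Python. It is only a sketch of the idea and is not how the ESXi scheduler is actually implemented: the real scheduler makes these decisions dynamically and balances load in ways this ignores. The function names and the 2-node, 8-cores-per-node host are assumptions chosen for the example.

```python
from typing import List, Tuple

# One slot per vCPU: (numa_node, core_within_node, smt_thread)
Placement = List[Tuple[int, int, int]]


def place_vcpus(num_vcpus: int, nodes: int, cores_per_node: int, prefer_ht: bool) -> Placement:
    """Toy model of where vCPUs land with and without a preferHT-style policy."""
    slots: Placement = []
    if prefer_ht:
        # Consolidate: use both SMT threads of every core in a node
        # before spilling over to the next NUMA node.
        for node in range(nodes):
            for core in range(cores_per_node):
                for thread in range(2):
                    slots.append((node, core, thread))
    else:
        # Default: give each vCPU its own core first, across all nodes,
        # and only then fall back to the secondary SMT threads.
        for thread in range(2):
            for node in range(nodes):
                for core in range(cores_per_node):
                    slots.append((node, core, thread))
    if num_vcpus > len(slots):
        raise ValueError("more vCPUs than logical processors in this toy host")
    return slots[:num_vcpus]


def nodes_used(placement: Placement) -> int:
    return len({node for node, _, _ in placement})


def cores_used(placement: Placement) -> int:
    return len({(node, core) for node, core, _ in placement})


# A 12-vCPU VM on a hypothetical host with 2 NUMA nodes of 8 cores each.
default = place_vcpus(12, nodes=2, cores_per_node=8, prefer_ht=False)
consolidated = place_vcpus(12, nodes=2, cores_per_node=8, prefer_ht=True)

print(nodes_used(default), cores_used(default))            # 2 12
print(nodes_used(consolidated), cores_used(consolidated))  # 1 6
```

In this model the default policy gives the VM 12 full cores but splits it across both NUMA nodes, while the consolidated policy keeps it on one node and its caches at the cost of sharing 6 cores between 12 vCPUs.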
Enabling PreferHT KB Article: http://kb.vmware.com/kb/2003582
It is important to recognize though that by using this setting, you are telling vSphere you’d rather have access to processor cache and NUMA memory locality as a priority, over the additional compute cycles. So there is a trade-off.
Sometimes it may not be easy to know when to use this setting, and therefore it is valid to enable it and test whether it helps or hinders your application.
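For illustration, the per-VM change amounts to one extra line in the VM’s configuration. As I recall the KB, the per-VM option is named numa.vcpu.preferHT and the host-wide advanced setting is Numa.PreferHT; treat both names as assumptions and confirm them against the KB article before using them. The Python below is a hypothetical helper that only manipulates .vmx-style text and does not talk to vSphere. In practice you would normally add the parameter through the VM’s advanced configuration options while the VM is powered off.

```python
def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with `key = "value"` set, replacing any existing entry for key."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for i, line in enumerate(lines):
        # Entries look like: name = "value"
        name = line.split("=", 1)[0].strip()
        if name == key:
            lines[i] = new_line
            break
    else:
        # Key not present yet, so append it.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical fragment of a VM configuration file.
sample_vmx = 'numvcpus = "12"\nmemsize = "65536"\n'

# Option name assumed from the KB article; verify before use.
updated_vmx = set_vmx_option(sample_vmx, "numa.vcpu.preferHT", "TRUE")
print(updated_vmx)
```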