
Which vSphere CPU Scheduler to Choose

The release of vSphere 6.7 Update 2 brought with it a new vSphere CPU scheduler option, the Side-Channel Aware Scheduler version 2 (SCAv2) or “Sibling Scheduler.” This new scheduler can help restore some performance lost to CPU vulnerability mitigations, but also carries some risk with it. Further complicating matters is the newest Intel CPU vulnerability, Microarchitectural Data Sampling (MDS), identified by CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, and CVE-2019-11091, and covered in VMware Security Advisory VMSA-2019-0008. Our goal is that this post, coupled with the vSphere 6.7 CPU Scheduler Advisor tool, helps customers begin conversations within their organizations about the risks and possible solutions from a business perspective.

This post is long because this is a complicated topic. If you wish, you can visit the vSphere 6.7 CPU Scheduler Advisor directly, or scroll down to the common questions at the bottom.

What is a CPU Scheduler and why do we care?

Put simply, the vSphere CPU scheduler in ESXi determines what virtual machines get to use the processors. It tries to maximize utilization of the processors while also trying to be fair to all the workloads. In determining how to place workloads on CPUs the scheduler considers many factors such as CPU cache, memory locality, performance needs, priorities, I/O to storage and networks, and more.

CPU vulnerabilities are security problems that exist in processors delivered by major CPU vendors. They’re particularly serious because these problems may allow attackers to bypass all of the other security controls we’ve built up in our environments, and they aren’t detectable through traditional methods. As you might expect, unlike software, a computer’s processor is difficult or impossible to update once it’s built and shipped to customers inside a server or a desktop. ESXi is software, though, and while the problem isn’t caused by VMware, we can teach the vSphere CPU scheduler how to help us avoid the issues present in vulnerabilities like Spectre, Meltdown, L1TF, and MDS. We leverage new instructions that the CPU vendors provide through microcode, which either ships as part of ESXi or arrives in vendor firmware updates.

However, this comes with tradeoffs in both security risk and performance. For example, the L1TF vulnerability exposed a problem where processors can leak information between two threads on a core in an Intel CPU. The original ESXi Side-Channel Aware Scheduler, version 1 (SCAv1), helped protect us by not allowing the use of threads (Intel Hyper-Threading) and thereby mitigated the problem, but incurred a cost in CPU capacity. The MDS vulnerability can also leak information between processes, so SCAv1 is a good choice to mitigate that. However, in some cases your security needs might be such that you could use Hyper-Threading within a virtual machine. SCAv2 allows you to do that, and you regain some of the lost performance and capacity.

In all cases the risk needs to be considered carefully, because these attacks are not theoretical. There are working demonstrations of each of these attacks on YouTube. One of the demonstrations shows an easy-to-use, commercially available tool recovering usernames and passwords from another virtual machine that is an Active Directory domain controller. That is a very serious problem.

How do I choose a CPU Scheduler?

There are three schedulers to choose from in ESXi 6.7 Update 2: the default, SCAv1, and SCAv2.

Default Scheduler

The default scheduler has no inherent security awareness but is the most performant. It should be assumed that all workloads running on a host with the default scheduler can see each other’s data.

[Figure: Default Scheduler in ESXi]

Side-Channel Aware Scheduler v1 (SCAv1)

SCAv1 implements per-process protections that help with L1TF and MDS. It is the most secure option but can be the slowest.

[Figure: Side-Channel Aware Scheduler v1]

Side-Channel Aware Scheduler v2 (SCAv2)

SCAv2 implements per-VM protections, but the risk involved in using it needs careful consideration.

[Figure: Side-Channel Aware Scheduler v2]

Use the vSphere 6.7 CPU Scheduler Advisor

At vSphere Central we have compiled an easy-to-use decision tree, the vSphere 6.7 CPU Scheduler Advisor, to help you decide which scheduler to use. When you visit it you’ll be asked a few questions:

“Does any VM on this cluster contain data, secrets, or credentials that you would not want leaked to an adversary?”

Put frankly, the answer to this for most organizations is going to be yes. What do we mean by an adversary? Anybody you don’t trust. This could be your competition in business, a hacking group, or even state-sponsored APT groups that are interested in stealing your business’ intellectual property, financial records, employee information, or customer lists. What do we mean by data, secrets, or credentials? Any data that could open you up for a further attack or compromise. Login passwords, encryption keys, digital certificates, perhaps a password to the Excel spreadsheet on your CFO’s desktop that could cause serious financial or legal trouble for your company if released prematurely.

“Does any VM have multiple users or processes where you do not want to leak secrets between those users or processes? This includes VBS, containers, terminal servers, etc.”

One key consideration when deciding which scheduler to use is whether all the processes inside a virtual machine have the same security needs and risk. Because L1TF and MDS allow secrets to leak between processes, if those processes belong to different users, different customers, or different applications then they all become part of the same security scope. This level of risk may be a problem, and it is one that only you can make a risk management decision about.

Virtualization-Based Security (VBS) is a Microsoft Windows 10/2016/2019 feature, also known as Device Guard and Credential Guard, that uses part of the Hyper-V subsystem to create an isolated, secure space for credentials and other secrets. Since vSphere 6.7, ESXi supports running Windows VMs that require VBS. VBS helps prevent many credential theft attacks and thwarts many types of malware. However, it is a type of process whose information you may not want leaked, and such leaks could be possible with both the default and SCAv2 schedulers. There are still serious benefits to running VBS even if you cannot choose SCAv1, but your decision on how to proceed should account for that. Similarly, terminal servers with multiple users, containers, and other forms of nested virtualization should all be considered. It is often the case that a single large VM hosts several different applications running inside containers.

“Does any VM run untrusted code, including Java or Javascript, or from untrusted sources?”

While a knee-jerk reaction here might be “absolutely not!” you may want to consider the use of desktop OSes, web browsers, terminal servers, and software available from the open Internet. Are users able to browse web sites and potentially run JavaScript from uncontrolled sources? Are you using container images pulled from public registries? Do you use software modules from repositories like NPM, CPAN, PowerShell Gallery, or other places that allow public contributions? Are all software update repositories (WSUS, SCCM, Yum repositories, etc.) cryptographically signed, verified, and backed by a vendor? These are not judgments, but simply thoughts about how untrusted code might sneak into an environment and affect the answer to this question.
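To make the three security questions above easier to discuss, here is a minimal Python sketch of how the answers might map to a scheduler. The mapping is an assumption based on the scheduler descriptions in this post (no protections, per-VM protections, per-process protections); it is not the logic of the actual Advisor tool, and the names ClusterAnswers and suggest_scheduler are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterAnswers:
    """Answers to the three security questions, one set per cluster."""
    holds_secrets: bool          # data, secrets, or credentials worth protecting
    mixed_trust_inside_vm: bool  # multiple users/processes, VBS, containers, terminal servers
    runs_untrusted_code: bool    # Java/JavaScript, or code from untrusted sources


def suggest_scheduler(answers: ClusterAnswers) -> str:
    """Return a starting point for discussion, not a verdict.

    Assumed mapping (NOT the Advisor's actual logic):
      - nothing sensitive on the cluster     -> default scheduler
      - secrets must not leak between
        processes inside a VM, or a VM runs
        untrusted code                       -> SCAv1 (per-process protections)
      - otherwise                            -> SCAv2 (per-VM protections)
    """
    if not answers.holds_secrets:
        return "default"
    if answers.mixed_trust_inside_vm or answers.runs_untrusted_code:
        return "SCAv1"
    return "SCAv2"


if __name__ == "__main__":
    # Hypothetical cluster: holds credentials and hosts a multi-user terminal server.
    print(suggest_scheduler(ClusterAnswers(True, True, False)))
```

Treat the result as a conversation starter for the risk owners in your organization; the CPU capacity question still has to be weighed separately.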

“Do you have CPU capacity headroom in the cluster?”

The performance impacts of these remediations affect every workload differently. That said, the worst-case performance loss is generally regarded as 30% for SCAv1 and 10% for SCAv2. If your cluster doesn’t have enough spare CPU capacity to absorb that loss, workloads may slow down or contend for CPU. Assuming that’s the case, and you cannot tolerate slower workloads, there are options to consider.

First, while you can enable different schedulers on different hosts, it isn’t a good idea to do so. Always enable the same scheduler on all hosts in a cluster. Having differing schedulers is a recipe for confusion, errors, and security & performance problems. That said, a valid approach might be to sequester workloads that need the per-process protections of SCAv1 on a separate cluster and run SCAv2 on other clusters that are compatible with the workloads’ risk.

Second, enabling SCAv2 gets you some protections, though not the full protections of SCAv1. If you need SCAv1 you should aim for SCAv1, but SCAv2 may be a stopgap.

Third, if you have enabled Enhanced vMotion Compatibility (EVC) in your clusters it is easy to add additional capacity. Simply acquire additional capacity and add it to the cluster. EVC is a tremendous “future-proofing” tool that is often overlooked when building clusters, as it masks out the differences between CPU generations so that virtual machines can vMotion seamlessly between old and new hardware. Every 18 months or so new CPUs become available, and those new CPUs always have new instructions and features that make vMotion unable to move VMs between the servers. EVC “masks” the new instructions out which allows vMotion to continue working. Later, when the whole cluster is upgraded and old equipment retired, the EVC level can be changed to support the newer CPUs, and VMs can take advantage of any newer CPU features. In practice, most enterprise workloads do not take advantage of new CPU features, and the benefit EVC brings in upgrade and expansion flexibility far outweighs any slight performance differences. If you aren’t using EVC now I urge you to review an older (but still very relevant) whitepaper on EVC and Performance.

Fourth, if your hardware is aging there might be other benefits to considering a refresh, including decreasing server counts due to improvements in performance and system sizing options (which, in turn, saves data center space, power, cooling, port count, and other infrastructure costs), the opportunity to use vSAN and its data-at-rest & stretched cluster capabilities, NSX, AppDefense, VMware Cloud on AWS & HCX for disaster recovery, etc.
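The headroom question can be roughed out with simple arithmetic before any of these options are pursued. The Python sketch below uses the worst-case figures quoted in this post (30% for SCAv1, 10% for SCAv2) and assumes, as a simplification, that a performance loss can be treated as an equal reduction in usable cluster capacity. The function names and the example cluster numbers are hypothetical.

```python
# Worst-case loss per scheduler, using the figures quoted in this post.
WORST_CASE_LOSS = {"default": 0.0, "SCAv1": 0.30, "SCAv2": 0.10}


def usable_capacity_ghz(cluster_capacity_ghz: float, scheduler: str) -> float:
    """Capacity left after assuming the worst-case loss for the scheduler."""
    return cluster_capacity_ghz * (1.0 - WORST_CASE_LOSS[scheduler])


def has_headroom(cluster_capacity_ghz: float, peak_demand_ghz: float, scheduler: str) -> bool:
    """True if peak demand still fits within the reduced capacity."""
    return peak_demand_ghz <= usable_capacity_ghz(cluster_capacity_ghz, scheduler)


if __name__ == "__main__":
    # Hypothetical cluster: 400 GHz total CPU capacity, 300 GHz peak demand.
    capacity, peak = 400.0, 300.0
    for name in ("default", "SCAv2", "SCAv1"):
        fits = has_headroom(capacity, peak, name)
        print(f"{name}: usable {usable_capacity_ghz(capacity, name):.0f} GHz, fits peak: {fits}")
```

Real workloads rarely hit the worst case, so measure your own environment before committing to a hardware purchase on the strength of an estimate like this.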

vSphere is a very mature, stable, and incredibly secure platform on which to build a software-defined data center, and the software-defined data center has tremendous benefits in terms of cost of ownership and positive staff time improvements.

Common Questions

Q1: Is this a problem on VMware software and not elsewhere?

A1: This isn’t a VMware problem at all; it is a problem in the processor hardware inside your server or desktop. All major operating systems (ESXi, Microsoft Windows, various Linux distributions) use techniques like these to help their users cope with the issues because, without the software remedies, there would be no workable remedies at all. CPU vulnerabilities affect dedicated physical hardware as well, not just virtual environments. Please refer to your operating system vendor’s documentation and support for information on how they choose to remediate these issues, what performance impacts there are, and whether tradeoffs have been made with the remediations that may affect the design or operation of your systems. For example, for performance reasons some operating systems do not apply vulnerability mitigations by default to processes running with elevated permissions.

Q2: Will a firewall catch these particular security problems?

A2: Most likely no, but please verify that with your vendor.

Q3: Will EDR/antivirus software catch these particular security problems?

A3: Most likely no, but please verify that with your vendor.

Q4: You talk about vSphere 6.7 Update 2, but we are on vSphere 6.0. What are our options?

A4: vSphere 6.0 and 6.5 only have two schedulers: the default and SCAv1. Because of architectural changes to 6.7 the SCAv2 scheduler will not be backported. The upgrade process to 6.7 Update 2 is extremely easy and we recommend that customers update to gain access to the new schedulers, as well as the multitude of other improvements. This is especially true for customers running vSphere 6.0 or earlier, as vSphere 6.0 becomes unsupported on March 12, 2020. That isn’t far away. If you would like more information or access to resources VMware has to help with upgrades, please contact your VMware account team or technical account manager.

Q5: Since this is a hardware problem, can I buy hardware that fixes these issues?

A5: The answer is a partial yes. The newest Intel CPUs offer remediations for these vulnerabilities. Additionally, hardware from other CPU vendors, such as AMD, is affected differently. Please consult your CPU & hardware vendor’s documentation.

Q6: Beyond the CPU scheduler, what else do I need to do?

A6: Once you’ve followed the procedures in the relevant VMware KB to implement a new vSphere CPU scheduler you will need to ensure that any remediations in the guest OSes (Microsoft Windows, Linux distributions, etc.) are enabled and functioning. This will likely include power-cycling the VMs as well. We also always recommend keeping hardware firmware up to date, as this helps eliminate other possible vulnerabilities (in management controllers, etc.) as well as fix bugs with NICs and other components that could affect availability and performance.

Q7: Is the cloud affected?

A7: Absolutely. All cloud vendors are affected by these issues, and most have remediated their environments for them. VMware Cloud on AWS has been fully & proactively remediated. Because each VMware Cloud on AWS customer is running on dedicated hardware, issues of cross-customer contamination and threats are not applicable.

Q8: My compliance framework & guidelines don’t mention these issues at all. Do I need to do anything?

A8: Compliance and security are two different things, but they are related. It’s likely that your compliance guidelines indicate that you need to be running vendor-supported software and hardware and be current on vendor-supplied updates and patches. These issues would fall into those categories. If in doubt, reach out to folks who administer your compliance auditing, your VMware account team, or your technical account manager. We always recommend opting for true security controls and vulnerability remediation versus simply meeting a compliance goal. Hackers don’t care if you’re compliant!

Q9: Are the different schedulers supported by VMware Global Support Services?

A9: Absolutely, as are all the changes outlined in the relevant KB articles.

Q10: So you’re saying we should switch to SCAv2 and everything will be fixed?

A10: NO! Each environment is different, and each organization has differing security needs and responsibilities. This is a business decision, risk versus cost versus potential for legal & publicity & other issues, so it needs to be a conversation between many groups inside an organization (CISO, CFO, CIO, CEO). There is NO “Easy Button” for these issues.

Q11: Why do I need to do a hard reboot of my virtual machines after I remediate my guest OS?

A11: Within the ESXi hypervisor a virtual machine runs as a process. That process, what we call the virtual machine monitor, is responsible for handling the relationship between the guest OS and the server hardware. For that process to inherit and present the new CPU vulnerability mitigations to the guest OS the virtual machine monitor has to be restarted. The way to do that is to shut the guest OS down and ensure the VM has powered off. When the VM powers back on a new virtual machine monitor process will start that will have the new CPU information, and it’ll pass that along to the guest OS. While this is inconvenient, it can be easily scripted with PowerCLI. It is also a great time to do a virtual machine hardware compatibility upgrade!

Update (8/20/2019): vSphere 6.7 Update 3 adds the VM advanced parameter vmx.reboot.powerCycle, which when set to “true” will cause ESXi to power-cycle (hard off, then hard on) a VM the next time it reboots. This means that CPU vulnerability mitigations can proceed as part of your normal guest OS patching cycle. To enable this you can manually edit the VM’s advanced parameters or, better yet, use PowerCLI: Get-VM | New-AdvancedSetting -Name 'vmx.reboot.powerCycle' -Value $true

Q12: What is the recommended way of remediating my vSphere environment?

A12: The generic answer is to verify the order in the documentation. Start with KB 67577. Put simply: always patch vCenter first, patch ESXi hosts second. vCenter must be patched first so that it can properly coordinate the ESXi updates, in terms of EVC, DRS, and so on. Out-of-order updates cause EVC errors and support issues. vCenter first!

Q13: If I already remediated my environment for L1TF using the SCAv1 scheduler do I have to do anything more for MDS?

A13: To gain the remediations for MDS (which depend on an additional CPU capability, MD_CLEAR, delivered through microcode) you will need to update to the relevant vSphere patch levels and then follow the guidance for remediating guest OSes, including power-cycling the VMs.

Q14: What performance impacts do these schedulers have?

A14: All workloads are affected differently, but most people use 30% as a worst-case estimate for SCAv1 performance losses and 10% for SCAv2 performance losses.

Q15: If I remediate for MDS will I be covered for all other vulnerabilities as well?

A15: Yes. vSphere patches are cumulative so applying the most recent updates and then following the procedure to remediate the guest OSes will cover all issues, though there may be differing procedures for each vulnerability in each guest OS. We always recommend you verify your work, such as with tools like those available at mdsattacks.com, and we always recommend testing updates in a dedicated test environment if possible (nested ESXi & vCenter is a great test environment).

Q16: I understand vSphere updates ship with CPU microcode in them. Do I need to update my BIOS?

A16: We recommend it. VMware ships only what it needs to gain the extra CPU instructions on a CPU. However, hardware vendors like Dell, HP, Lenovo, etc. ship additional updates for management engines, out-of-band controllers like iLO and iDRAC, and HBA & NIC firmware updates that improve compatibility and availability.

Q17: Can I upgrade to vSphere 6.7 if I apply these new patches?

A17: Maybe, maybe not. Please check the upgrade compatibility matrix for more information.

Q18: How can I find out about security advisories from VMware?

A18: Visit the Advisories page, where you can also sign up for email alerts!

(Many thanks to Mike Foley & Edward Hawkins for editing, David Dunn for the questions in the Advisory Tool, and Will Pien for the scheduler graphics in this post)