
On Meltdown and Spectre: The new security risks, and what you need to know

Two serious security vulnerabilities, Meltdown and Spectre, have just been revealed, and major media outlets are covering the news. The New York Times says the flaws pose “a major threat to the way cloud-computing systems operate,” and the Guardian calls them the “worst ever” CPU bugs.

What’s different about these latest vulnerabilities? For one, they are embedded in the core of computing systems’ hardware. Technical details are available at https://spectreattack.com/. To make a long story short: the vulnerabilities are in the silicon of Intel, AMD and ARM chips themselves. They affect billions of devices, from personal computers to servers, including the servers used to run cloud infrastructure at AWS, Azure, Google, and other cloud providers.

As with any discovery of this magnitude, there’s a lot of discussion about the implications, both near- and long-term. Meanwhile, cloud providers are furiously patching, and AWS users are reporting changes in instance performance. To help cut through the chatter, here’s a look at the potential side effects of Spectre and Meltdown and the ways in which you can address the issue.

What are the potential side effects?

Security

The cloud relies heavily on virtualization and on sharing hardware between tenants, so Meltdown and Spectre raise some scary possibilities.

For instance, it may be possible for a malicious piece of software running on an EC2 instance or Azure VM to read sensitive information from another tenant running on the same physical server. That means encryption keys, passwords, locations and credentials for databases and other servers, sensitive customer information: you name it. If it is in kernel memory, it can be found.

Starting to panic? Don’t. The way this incident has been handled shows that competitors can cooperate effectively. AWS, Azure and Google coordinated their maintenance events. The kernel patch to address Meltdown had been in the works for over two months before the public disclosure, and Linus Torvalds (the father of Linux) has helped expedite the Linux kernel fix.

Performance

The necessary fixes in the Linux kernel and Windows operating systems will have a performance impact on any affected device that receives a mitigating patch. Some workloads may see a performance hit of as much as 30%.
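As a rough illustration of what a figure like 30% could mean for capacity planning, the sketch below projects CPU utilization after patching. It assumes, purely for illustration, that an impact of s% means each CPU does s% less work, so the same workload needs 1/(1 - s/100) times the CPU. The function name `projected_cpu` is hypothetical, and real workloads will vary widely.

```shell
# projected_cpu CURRENT_UTIL_PCT SLOWDOWN_PCT
# Prints the CPU utilization (in percent) needed to do the same work if
# throughput per CPU drops by SLOWDOWN_PCT. Illustrative model only.
projected_cpu() {
  awk -v util="$1" -v slow="$2" 'BEGIN {
    # Reject values that would make the model meaningless or divide by zero.
    if (slow < 0 || slow >= 100) { print "slowdown must be in [0, 100)"; exit 1 }
    printf "%.1f\n", util / (1 - slow / 100)
  }'
}

# Example: a fleet averaging 60% CPU, with the worst-case 30% slowdown.
projected_cpu 60 30
```

Under this model, a fleet averaging 60% CPU would land near 85.7%, which is worth knowing before the patch rolls out rather than after.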

So what can I do?

From the AWS security bulletin (https://aws.amazon.com/security/security-bulletins/AWS-2018-013/):

“An updated kernel for Amazon Linux is available within the Amazon Linux repositories. Instances launched with the default Amazon Linux configuration on or after 10:45 PM (GMT) January 3rd, 2018 will automatically include the updated package. Customers with existing Amazon Linux AMI instances should run the following command to ensure they receive the updated package”

Here’s how you can remediate the risks posed by Meltdown and Spectre:

  • Amazon Linux AMIs: Amazon has issued an updated kernel. You can run the following command to ensure you’ve received the updated package: yum update kernel
  • Azure: Azure reports that the impact is negligible for most customers, though a small set may experience some networking performance impact. This can be addressed by turning on Azure Accelerated Networking (available for Windows and Linux), a free capability available to all Azure customers.
  • Red Hat: Red Hat customers running affected versions of Red Hat products are strongly advised to update them as soon as errata are available.
  • Google: Google has already patched GCE infrastructure against known attacks, but you must patch/update guest environments. GKE customers who have turned off auto-updates must manually upgrade their clusters.
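For existing Amazon Linux instances, the update step in the list above can be sketched roughly as follows. A new kernel only takes effect after a reboot, so the sketch also compares the running kernel with the most recently installed kernel package. The function names are hypothetical, `sudo` is an assumption about how you obtain root, and the comparison assumes rpm's default package naming (name-version-release.arch). The functions are only defined here, not run.

```shell
# Install the updated kernel package (the command quoted from the AWS bulletin).
update_kernel() {
  sudo yum update kernel
}

# Print the most recently installed kernel package, e.g. "kernel-<version>".
# `rpm -q --last` lists matching packages newest-first by install time.
newest_installed_kernel() {
  rpm -q --last kernel | head -n 1 | awk '{ print $1 }'
}

# Succeed (exit status 0) if the running kernel is not the newest installed
# one, i.e. the instance still needs a reboot to pick up the update.
#   $1: running kernel release (output of `uname -r`)
#   $2: newest installed kernel package name
needs_reboot() {
  [ "kernel-$1" != "$2" ]
}

# Typical use on an instance (left commented out so nothing runs by accident):
#   update_kernel
#   if needs_reboot "$(uname -r)" "$(newest_installed_kernel)"; then
#     echo "Reboot required to load the updated kernel"
#   fi
```

Keeping the comparison in its own function makes it easy to run the same check across a fleet from whatever configuration or remote-execution tooling you already use.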

Update

January 5, 2018

Using a cloud management platform such as CloudHealth, you can isolate paravirtualized EC2 instances and view their aggregate average CPU over the past week. Some CloudHealth customers have seen almost no change, while others have seen a large spike. As many posts have stated, how much your baseline CPU usage changes will depend on your workload.

[Figure: paravirtualized EC2 instances]

If you’d like to see how exposed you are to the impact on your paravirtualized instances, you can filter to see how many you have. Additionally, you can refine the search to just your production assets using an environment perspective, if configured. This will allow your teams to decide if a plan is needed to transition over to HVM. The same analysis can be performed on your AMIs.

[Figure: Spectre and Meltdown, paravirtualized instances]
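Outside a management platform, a similar filter can be approximated with the AWS CLI. The sketch below lists the IDs of instances whose virtualization type is paravirtual, optionally narrowed by a tag. The function name and the `Environment production` tag in the usage lines are hypothetical, so substitute whatever tagging scheme you use. It assumes the AWS CLI is installed and configured with credentials and a default region.

```shell
# list_pv_instances [TAG_KEY TAG_VALUE]
# Prints the IDs of paravirtualized EC2 instances in the current region,
# optionally restricted to instances carrying the given tag.
list_pv_instances() {
  if [ "$#" -ge 2 ]; then
    aws ec2 describe-instances \
      --filters "Name=virtualization-type,Values=paravirtual" "Name=tag:$1,Values=$2" \
      --query 'Reservations[].Instances[].InstanceId' \
      --output text
  else
    aws ec2 describe-instances \
      --filters "Name=virtualization-type,Values=paravirtual" \
      --query 'Reservations[].Instances[].InstanceId' \
      --output text
  fi
}

# Usage (commented out; requires AWS credentials):
#   list_pv_instances
#   list_pv_instances Environment production
```

For AMIs, `aws ec2 describe-images --owners self` accepts the same `virtualization-type` filter, so the same approach should carry over with minor changes to the query.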