Configuring TPM 2.0 on a 6.7 ESXi host

In a previous blog post I went over the details on how ESXi uses a TPM 2.0 chip to provide assurance that Secure Boot did its job and how that “attestation” rolls up to vCenter to be reported on.

In this blog article I’m going to go over some of the steps necessary to configure an ESXi host to use a TPM 2.0 chip. I have only a limited number of hardware systems in my lab on which to do this, but the steps should be familiar regardless of the server model.

Stop! Important Note!

Please see my other blog on “Prepping an ESXi 6.7 host for Secure Boot”. TPM 2.0’s function on an ESXi host is to attest that Secure Boot has done its job. If you cannot successfully boot with Secure Boot FIRST, then don’t bother trying to configure the host for TPM 2.0. You need Secure Boot working FIRST. First rule of good troubleshooting: limit the number of changes!

Prerequisites

As called out in the documentation, there are a few prerequisites you need to meet before starting this process.

To use a TPM 2.0 chip, your vCenter Server environment must meet these requirements:

  • vCenter Server 6.7

  • ESXi 6.7 host with TPM 2.0 chip installed and correctly configured in the UEFI BIOS

  • UEFI Secure Boot enabled

Server BIOS settings

Correctly configuring the TPM 2.0 devices in the BIOS involves ensuring a number of settings are correct.

  • The TPM is set to use SHA-256 hashing
  • If available, it must also be set to use the TIS/FIFO (First-In, First-Out) interface and not CRB (Command Response Buffer)
  • TXT must be disabled
    • Yes, we use TXT when using TPM 1.2 but it is not yet implemented in TPM 2.0 on ESXi (and yes, I ran into this specifically!)
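If you like keeping a checklist in code, the three settings above can be sketched as a small Python check. This is only an illustration: the dictionary keys (hash_algorithm, interface, txt_enabled) and their values are names I made up, not something a BIOS or ESXi exports in this form, so you would have to fill them in by hand from what you see in your own BIOS.

```python
# Sketch only: the dictionary keys and values below are made up for
# illustration and do not match any vendor's BIOS export format.

def check_tpm_bios_settings(settings):
    """Return a list of problems found in a dict of TPM-related BIOS settings."""
    problems = []

    # The TPM must be set to use SHA-256 hashing.
    if settings.get("hash_algorithm") != "SHA256":
        problems.append("Set the TPM hash algorithm to SHA-256")

    # The interface option is not available on every server, so it is only
    # flagged when it is present and set to CRB.
    if settings.get("interface") == "CRB":
        problems.append("Set the TPM interface to FIFO instead of CRB")

    # TXT must be disabled for TPM 2.0 on ESXi 6.7.
    if settings.get("txt_enabled", False):
        problems.append("Disable TXT")

    return problems


if __name__ == "__main__":
    # Hypothetical host with two of the three settings wrong.
    my_host = {"hash_algorithm": "SHA1", "interface": "FIFO", "txt_enabled": True}
    for problem in check_tpm_bios_settings(my_host):
        print(problem)
```

An empty list means all three settings match the requirements above.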

My Servers

The servers I have in my lab are Dell PowerEdge R630s. They originally came with TPM 1.2 devices, but I had them upgraded to TPM 2.0, and they are running BIOS version 2.6.0.

TPM Settings

Here are the settings in the System Security part of my servers’ BIOS. Your systems may look different but the options should be similar.

RTFM?

When I first started this process I did what most do. I didn’t read the docs. I like to break things and see if I can fix them. And then ask questions of the engineers. Why do I do this? Well, for one, I believe I learn faster by breaking and fixing and besides, it’s a lot more fun for me. Also, I’m trying to replicate what customers may encounter. Oh, sure, 99% of you actually read the docs before jumping on to Twitter to ask a question, right? RIGHT? Well, I’m there for that 1% who don’t!

When I started, I got the TPM 2.0 devices installed and then installed ESXi 6.7 (after updating my VCSA first, of course!). What resulted next was an error on the summary page of the ESXi host.

Note: I do not have 117 ESXi hosts at my disposal. Yes, I have been asked that.

I went into the BIOS and started playing around with settings. I cleared the “TPM Hierarchy” (the contents of the TPM) but that didn’t do it. I was still getting an alarm that things weren’t configured correctly.

Hashing

One of our engineers, Sam, was awesome. I have to give her credit for maintaining her patience with me. She had me look at the logs and sure enough, we found something interesting:

Oh look! TPM wasn’t set to use SHA256 hashing! So I set the TPM to use SHA256 hashing.

I was able to select the hashing algorithm on the TPM Advanced settings page. See below:

TXT Disable

Note that when I took this screenshot I had TXT enabled. This caused another set of errors in the log files. Here’s the text from that.

While going through this process I was sharing my experiences on the vExpert Slack channel and others had come across the “Tpm2ResMgrProcessResponse:846: Error: TPM command error code 0x18b” error as well.

It was at this time that I was told by Engineering to disable TXT. TXT has not been implemented in our current TPM 2.0 code.
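If you would rather not eyeball the logs for this, here is a minimal Python sketch that pulls messages shaped like the “Tpm2ResMgrProcessResponse:846: Error: TPM command error code 0x18b” one out of a set of log lines. The regular expression is based only on that one message, so treat the pattern as an assumption; other TPM errors may be worded differently, and the function name find_tpm_errors is mine.

```python
import re

# Pattern derived from the single error message quoted in this post.
# Assumption: other TPM command errors follow the same wording.
TPM_ERROR = re.compile(
    r"(?P<function>\w+):(?P<number>\d+): "
    r"Error: TPM command error code (?P<code>0x[0-9a-fA-F]+)"
)


def find_tpm_errors(log_lines):
    """Return (function, number, error_code) tuples for matching lines."""
    errors = []
    for text in log_lines:
        match = TPM_ERROR.search(text)
        if match:
            errors.append(
                (
                    match.group("function"),
                    int(match.group("number")),
                    int(match.group("code"), 16),  # hex string to integer
                )
            )
    return errors


if __name__ == "__main__":
    sample = [
        "Tpm2ResMgrProcessResponse:846: Error: TPM command error code 0x18b",
        "some unrelated log line",
    ]
    for function, number, code in find_tpm_errors(sample):
        print(function, number, hex(code))
```

You could feed it the lines of a kernel log copied off the host. It only finds the errors; it does not tell you what a given code means.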

Time to file a bug report

Reboot number, oh, I don’t know, 3? 4? I still encountered a failure. So I filed a bug. This time the host was reporting a “Failed” attestation and there was nothing in the kernel log stating why. Another one of our engineers looked at the bug and the vCenter and ESXi support bundles and found the latest culprit.

A-Ha! “No identity key in DB, try to reconnect host” explains it! What this means is that the host was added to vCenter before the TPM 2.0 chip was enabled in the BIOS, and the chip was only enabled afterward. In my case, my hosts were added a couple of years ago and I installed the TPM 2.0 devices after the fact. As a result, there is no TPM Endorsement Key stored in the VCDB. This trust is set up when vCenter first adds the host to a cluster.

Disconnect…Reconnect

The solution was simple. Disconnect and reconnect the host. Put the host into Maintenance Mode, right click and select Connection…Disconnect and then right click again and select Connection…Connect. No need to remove the host from inventory.
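If you have more than a couple of hosts to fix, the same three steps can be scripted. Below is a rough Python sketch, assuming the host argument is a pyVmomi vim.HostSystem object and wait_for_task is something that blocks until a task finishes (pyVim.task.WaitForTask is one candidate). The function name reconnect_host is mine, and I have not run this against the hosts in this post, so treat it as a starting point rather than a tested procedure.

```python
def reconnect_host(host, wait_for_task):
    """Maintenance mode, disconnect, then reconnect a host.

    Assumptions: `host` behaves like a pyVmomi vim.HostSystem and
    `wait_for_task` blocks until the task it is given has completed.
    """
    # Step 1: put the host into Maintenance Mode (timeout=0 means no timeout).
    wait_for_task(host.EnterMaintenanceMode_Task(timeout=0))

    # Step 2: Connection...Disconnect. The host stays in the inventory.
    wait_for_task(host.DisconnectHost_Task())

    # Step 3: Connection...Connect.
    wait_for_task(host.ReconnectHost_Task())
```

The sketch leaves the host in Maintenance Mode afterward, just like the manual steps, so you can check the attestation status before putting it back into service.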

Documentation fixes

Unfortunately, when I looked in the documentation (after the fact, naturally) to see if the error and solution were documented, the response was “Call support”. We quickly got that fixed and now the documentation says the following:

Note: If you add a TPM 2.0 chip to an ESXi host that is already managed by a vCenter Server, you must first disconnect the host, then reconnect it. See vCenter Server and Host Management documentation for information about disconnecting and reconnecting hosts.

In fact, we even added a section on troubleshooting based directly on my experiences that led to this blog!

Passed Attestation

At this point the host showed up as having passed attestation! Woo-Hoo! Secure Boot has done its job and I can provide a report that says so, based on TPM 2.0 trust.

Wrap Up

I hope this has been helpful for you in setting up your ESXi host to use TPM 2.0. I think this whole process of NOT looking at the documentation and fumbling my way through the setup and configuration helped us end up with much better documentation and a better understanding of where things could go wrong. That’s #winning in my book.

I want to thank all the engineers that helped out on this. It really helped me understand what’s going on under the covers and enabled me to write these blogs.

If you have questions that haven’t been answered, you can reply here, send them to mfoley at vmware.com, or reach me via Twitter at @vspheresecurity or my personal Twitter account, @mikefoley.

@vspheresecurity is a curated list of vSphere Security specific tweets.

Thanks for reading!

mike