Understanding vSAN Encryption – KMS Profile Addressing

When vSAN Encryption is in use, one of the vSAN Health Check tests reports the health of the connection between the vSAN hosts and the KMS Cluster, as well as between vCenter and the KMS Cluster.

One scenario came up a few weeks ago where the vSAN Health Check indicated that the vSAN Hosts could properly communicate with the KMS Cluster, but the vCenter server had intermittent connectivity to the KMS Cluster.

Troubleshooting showed no blocked ports between the vCenter Server and the KMS Cluster, and the two could ping each other. The vSAN hosts could also ping the KMS Cluster, and no ports were blocked on that path either.
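A quick way to repeat the port check from any machine with Python is a plain TCP connect. The sketch below is not what the vSAN Health Check does internally; it only confirms that a TCP session can be opened to the KMS. Port 5696 is the standard KMIP port (it also shows up in the comments below), but your KMS may listen elsewhere, so treat it as an assumption.

```python
import socket

KMIP_PORT = 5696  # standard KMIP port; an assumption, check your KMS configuration


def tcp_port_open(host, port=KMIP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `tcp_port_open("kms01.demo.local")` (a hypothetical KMS node name) returns `True` or `False`. Note that a `False` result can also mean the name did not resolve, which turns out to matter in this case.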

Here is the vSAN Health Check’s reported error for the vCenter KMS Status.

Notice that the certificate status is valid, but the connection and trust statuses are not.

The Host KMS Status shows that the hosts are properly communicating with the KMS Server.

The process of enabling vSAN Encryption includes the following steps:

  1. A KMS Connection Profile is created in vCenter and the trust is established.
  2. vSAN Encryption is enabled in the Configuration>Data Services menu in the vSAN UI.
  3. The KMS Connection Profile is pushed to each of the ESXi hosts, which use the kekId and hostkeyId in this profile to retrieve the KEK and HostKey for the vSAN Cluster.

The connection has to be correct in vCenter Server before the profile can be pushed to the vSAN hosts and work there. Since the hosts were healthy, something must have changed in the environment to cause this issue.

Further investigation indicated that the connectivity to the KMS Cluster was intermittent. Sometimes the vCenter KMS Status reported green and other times reported red. So maybe nothing changed.

Careful review of the vCenter KMS Status and Host KMS Status health checks showed that the KMS Alias is a “short name”.

Maybe there is an issue where the short name is intermittently resolved from DNS… But the vSAN Hosts were not showing any intermittent connectivity, only the VCSA.

The Key Management Servers configuration Profile in the vCenter’s settings shows that the trust cannot be established. The KMS Address is the same value as the KMS Alias in the vSAN Health Check.

When using a short name, the default TCP/IP stack of a vSAN host uses designated search domains in the name resolution process. In the case of this cluster, demo.local and demo.central can be used in short name resolution.

The VCSA, on the other hand, does not have any search domains configured.

Without search domains to expand the short name, vCenter has to rely on the DNS server resolving the short name as-is.
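To make the difference concrete, here is a simplified model of how a typical stub resolver builds the list of names it will try, in the style of the `search` and `ndots` options of resolv.conf. This is a sketch of the general behavior, not the exact logic of ESXi or the VCSA. The search domains demo.local and demo.central come from the cluster above; the short name kms01 is hypothetical.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the names a resolver would try, in order, for a given input."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute; no search list is applied.
        return [name]
    qualified = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # "Dotted enough": try the name as given first, then the search list.
        return [name] + qualified
    # Short name: try the search list first, then the bare name.
    return qualified + [name]


# A vSAN host with search domains (hypothetical short name "kms01"):
print(candidate_names("kms01", ["demo.local", "demo.central"]))
# The VCSA with no search domains:
print(candidate_names("kms01", []))
```

With search domains, the short name is first tried as kms01.demo.local and kms01.demo.central. With none, the only candidate is the bare name kms01, so resolution depends entirely on whether the DNS server answers for it.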

The suggestion was made to change the KMS Address value for each KMS Cluster node to either an IP address or the Fully Qualified Domain Name (FQDN). Changing one of the two KMS entries showed some success.

Adjusting the KMS Address for the alternate KMS Cluster node cleared the issue up entirely.

In the environment where this came up, an alternate vCenter had no issues connecting to the KMS Cluster, but it had been configured with an IP address instead of a short name. Without digging into the DNS configuration of the environment, setting the Fully Qualified Domain Name (FQDN) resolved the issue.


In short, when configuring the Key Management Server connection profile for a KMS Cluster, ensure that the KMS Address is one that vCenter and vSAN hosts can correctly resolve. Using a Fully Qualified Domain Name or IP address can prevent “short name” related issues.
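If you want to sanity-check a list of KMS Address values before entering them, a small helper can flag short names. This is only an illustrative sketch: the function name and the sample addresses are hypothetical, and the "contains a dot" test is a rough heuristic for an FQDN, not a full validation.

```python
import ipaddress


def classify_kms_address(address):
    """Classify an address as 'ip', 'fqdn', or 'short' (heuristic)."""
    try:
        ipaddress.ip_address(address)  # accepts IPv4 and IPv6 literals
        return "ip"
    except ValueError:
        pass
    # Heuristic: a name with an interior dot is treated as fully qualified.
    return "fqdn" if "." in address.strip(".") else "short"


# Hypothetical KMS Address values:
for addr in ["kms01", "kms01.demo.local", "192.168.1.10"]:
    print(addr, "->", classify_kms_address(addr))
```

Anything reported as `short` depends on search domains or on the DNS server answering for the bare name, which is exactly the dependency that caused the intermittent status here.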


9 comments have been added so far

  1. Thank you Jase,

    Is there any recommendation regarding the KMS Alias?
    Can I use the aliases to configure the KMS servers in a desired sequence regardless of the FQDN (e.g., KMS1, KMS2)?

  2. I’m guessing you could. The KMS Name and KMS Address do not specifically have to align.
    In my example above, the KMS Names are short and the KMS Addresses are not (once resolved).
    To my knowledge the KMS Name is not used for connectivity, but rather the KMS Address.

  3. Thanks! A very timely post for me. I’m in the middle of deploying stretched cluster with 12 data hosts (disks already encrypted) and ran into similar problem.

    The data hosts lost connection to the KMS when I configured static routes to the Witness on each ESXi host. The Witness and KMS were on the same VLAN/IP range, and I had added a static route using a /24 network. I changed this to a static /32 host route for the Witness and an additional static /32 host route to the KMS. I’m able to connect to at least one of the KMS servers. A second KMS, on a different data center VLAN/IP range, is failing to connect. Still troubleshooting the L3 changes made for the stretched cluster that broke this connection.

    I was surprised to see that the vmk interface tagged for vSAN is communicating to the KMS. I’m looking at traffic logs and the source IP connecting to port 5696 on KMS host is the IP of data host for vSAN vmk. Does this sound correct?

  4. kmcd,

    The key request operation is going to connect to the KMS server(s) over the vSAN Hosts’ Management interface (typically vmk0).

    The situation you appear to be experiencing is that, because the vSAN Witness and KMS are on the same segment, the host is choosing the vSAN VMkernel interface, which it uses to reach the vSAN Witness Host, for the KMS key request as well.

    I’ll ask if you are using vSAN 6.7 with Witness Traffic Separation. If you are, then you can easily tag the Management VMkernel interface (vmk0) with “Witness Traffic”. This would allow both “Witness Traffic” and the “key request” operation to use the same VMkernel interface. If you haven’t moved to vSAN 6.7, but are using Stretched Clusters with vSAN 6.1-6.6, then I would recommend using an IP address on an alternate segment to guarantee the traffic to the vSAN Witness and the key requests traverse different VMkernel interfaces.

    I hope this helps.

  5. Currently running vSAN 6.6. We had some challenges implementing L3 to our Witness. We have a switch stack for data node ESXi management and VM traffic, and a separate stack for vSAN. This second stack is connected with a dedicated 10 GB link between data centers.

    To get the Witness to work, I configured vmk0 for both management and witness traffic. I opened a ticket with GSS to confirm whether this was a supported configuration and was told it was. But now I’m thinking this is not a correct or supported configuration.


  6. Witness Traffic Separation is NOT supported with vSAN Stretched Clusters until vSAN 6.7.

    Witness Traffic Separation was introduced in vSAN 6.5 (and retroactively supported on vSAN 6.2/vSphere 6.0 Update 3) for 2 Node vSAN Clusters only.

    It was not supported on “normal” vSAN Stretched Clusters until vSAN 6.7. There was some incorrect documentation that indicated otherwise, but I understand that has been corrected.

  7. The HostKeyID is not displayed in the vSphere Web Client or vSphere Client.
    It can be found in /etc/vmware/esx.conf – Not certain if this is easy to find in PowerCLI or not.

    Looking in the KMS server’s management interface, you may possibly be able to see the HostKeyID as an object.
    Only having a single vendor’s KMS running at the moment, I can confirm that the HostKeyID matches the UUID in the HyTrust interface.

    I’ll have to spin up a few of the other KMS appliances I have and see if the information matches there as well.

  8. Jase, thanks for your answer (I admit, I’m a bit late). I would be interested to know how you compared the HostKeyID to the HyTrust UUID. I’m using Gemalto, and we have to look up keys in VMware and reference them in Gemalto. I’m curious whether HyTrust has any attributes that lend themselves easily to being looked up.

  9. The Key IDs listed in /etc/vmware/esx.conf are identical to the UUIDs in the HyTrust KeyControl KMIP interface.
