VCF Operations

Diagnostics for VMware Cloud Foundation Operations – Newest Findings

Diagnostics for VMware Cloud Foundation is a centralized platform that monitors the overall operational status of the VMware Cloud Foundation (VCF) software stack. It is a self-service platform that helps you analyze and troubleshoot the components of VMware Cloud Foundation, including vCenter, ESXi, and vSAN; capabilities such as vSphere vMotion, snapshots, and VM provisioning; and other areas including security advisories and certificates. As an infrastructure admin, you can monitor the operational state of your environment using Diagnostics Findings.

Diagnostics Findings, which were previously delivered through Skyline Advisor and Skyline Health Diagnostics, are available to VCF and vSphere Foundation (VVF) customers in VCF Operations. Findings are prioritized based on trending issues in Broadcom Technical Support, issues raised through post escalation review, security vulnerabilities, issues raised by Broadcom engineering, and nominations from customers.

For the most recent release of VCF Operations, we released 114 new Findings. Of these, 83 are based on trending issues, 15 on post escalation reviews, 14 on VMware Security Advisories (VMSA), and 2 on customer nominations. There are 62 Health findings, which are equivalent to Skyline Advisor findings. Health findings are automatically checked against your environment every 4 hours. There are 52 log based findings, which are equivalent to Skyline Health Diagnostics findings. Log based findings are manually initiated against your environment by choosing refresh against the configuration operations instances.
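As a quick sanity check on the numbers above, the two breakdowns (by source and by type) should each add up to the 114 new Findings. The short sketch below only restates the counts from this post; the variable and function names are our own illustration and do not correspond to anything in the product.

```python
# Counts copied from the release summary above.
TOTAL_NEW_FINDINGS = 114

findings_by_source = {
    "trending issues": 83,
    "post escalation reviews": 15,
    "VMSA": 14,
    "nominations": 2,
}

findings_by_type = {
    "health": 62,      # checked automatically every 4 hours
    "log based": 52,   # initiated manually via refresh
}


def breakdown_matches_total(breakdown: dict, total: int) -> bool:
    """Return True when the per-category counts add up to the stated total."""
    return sum(breakdown.values()) == total


print(breakdown_matches_total(findings_by_source, TOTAL_NEW_FINDINGS))
print(breakdown_matches_total(findings_by_type, TOTAL_NEW_FINDINGS))
```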

Security Vulnerabilities

VMSA-2025-0010 covers a VMware vCenter Server authenticated command-execution vulnerability (CVE-2025-41225) and a VMware ESXi and vCenter Server reflected cross-site scripting (XSS) vulnerability (CVE-2025-41228). For CVE-2025-41225, a malicious actor with privileges to create or modify alarms and run script actions may exploit the issue to run arbitrary commands on the vCenter Server. For CVE-2025-41228, a malicious actor with network access to the login page of certain ESXi host or vCenter Server URL paths may exploit the issue to steal cookies or redirect users to malicious websites. For vCenter Server, the fix is in vCenter Server 8.0 Update 3e.

Post Escalation Review

Broadcom Technical Support has developed a Post Escalation Review process. We review critical escalations that come into our Escalation Management team and determine steps to prevent similar escalations for other customers in the future. One of the outcomes of this process is the creation of Diagnostics Findings.

In KB#370670, ESX hosts disconnect from vCenter due to excessive logging rates, which cause dropped syslog messages and leave services unable to log. This issue is commonly observed after additional NSX logging is enabled, causing the dfwpktlogs.log file to exceed the rate at which the syslog service can sustainably write and send log messages. However, this is not necessarily the only cause, as any service that begins to exceed the sustainable syslog rate could cause this issue. This finding is triggered by log messages reported in the vmkernel.log file on an ESX host.

VMware Technical Support Trending Issues

VMware Technical Support trending issues are KBs that have resolved many service requests (SRs), are viewed many times, or both.

In KB 383273, "Health: Miss counters detected" alerts are raised for Mellanox drivers on ESXi 8.0.2 and 8.0.3. The following error is reported: "nmlx5_QueryNicVportContext:188 command failed: IO was aborted". This is a known bug in the nmlx5 health mechanism logic, where the driver incorrectly detects that the NIC is in a faulty state. This is resolved in ESX 8.0 Update 3e, which contains nmlx5 version 4.23.6.5.
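If you want to check a host yourself, the comparison against the fixed driver version can be sketched as below. This is a minimal illustration, not how the finding is implemented: the function names are hypothetical, only the 4.23.6.5 threshold comes from the text above, and we assume the installed version string (for example, as shown by `esxcli software vib list`) starts with a dotted numeric version optionally followed by a `-` suffix.

```python
# Fixed nmlx5 driver version, taken from the finding description above.
FIXED_NMLX5_VERSION = "4.23.6.5"


def parse_driver_version(version: str) -> tuple:
    """Turn '4.23.6.5' or '4.23.6.5-<suffix>' into (4, 23, 6, 5).

    Assumption: anything after the first '-' is a build suffix and can be
    ignored for the purpose of this comparison.
    """
    numeric_part = version.strip().split("-", 1)[0]
    return tuple(int(piece) for piece in numeric_part.split("."))


def has_nmlx5_fix(installed_version: str) -> bool:
    """Return True when the installed driver is at or above the fixed version."""
    return parse_driver_version(installed_version) >= parse_driver_version(
        FIXED_NMLX5_VERSION
    )


# Example with made-up version strings:
for candidate in ("4.23.0.66-2vmw", "4.23.6.5", "4.24.0.1-1vmw"):
    print(candidate, has_nmlx5_fix(candidate))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "4.9" as newer than "4.23".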

In KB 383273, an ESX host has reported PERM LOSS during a VCF Operations for Logs query. The symptoms differ between the two storage conditions involved.

Permanent Device Loss (PDL): A datastore is shown as unavailable in the Storage view. A storage adapter indicates the Operational State of the device as Lost Communication. All paths to the device are marked as Dead.

All-Paths-Down (APD): A datastore is shown as unavailable in the Storage view. A storage adapter indicates the Operational State of the device as Dead or Error. All paths to the device are marked as Dead. You are unable to connect directly to the ESX host using the vSphere Client. The ESX host shows as Disconnected in vCenter Server.

This finding is triggered by log messages reported in the vmkernel.log file on an ESX host.
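The PDL and APD symptoms described above differ mainly in the Operational State the storage adapter reports for the device. The sketch below encodes just that distinction. It is an illustration of the symptom lists in this post, not the logic the finding uses; the function name and its inputs are hypothetical.

```python
from typing import Optional


def classify_storage_condition(
    operational_state: str, all_paths_dead: bool
) -> Optional[str]:
    """Map the symptoms listed above to 'PDL', 'APD', or None.

    operational_state: the device's Operational State as shown by the
        storage adapter (hypothetical input, compared case-insensitively).
    all_paths_dead: True when every path to the device is marked Dead.
    """
    if not all_paths_dead:
        # Both conditions described above have all paths marked Dead.
        return None

    state = operational_state.strip().lower()
    if state == "lost communication":
        return "PDL"
    if state in ("dead", "error"):
        return "APD"
    return None


print(classify_storage_condition("Lost Communication", True))
print(classify_storage_condition("Error", True))
print(classify_storage_condition("Dead", False))
```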

Findings Nominated

The Diagnostics team’s primary focus is customer satisfaction. We want to keep customers out of harm’s way, and we do this by providing you with Findings we discover from the day-to-day business of Broadcom Technical Support. We also want to hear ideas of what you would like to see in Diagnostics for VMware Cloud Foundation. The following Finding came from one of our customers: 

In KB#385443, a vSAN node PSODs with stuck IO after a disk failure. A command was detected as stuck IO but later completed. By the time the command completed, the objects related to that command had already been freed (by the time the RED event was notified), which caused the PSOD. This is resolved in ESX 8.0 Update 3e.

To review all the Findings in Diagnostics for VMware Cloud Foundation, please see the Findings Catalog in the Diagnostics Findings section of VCF Operations. To get the latest updates on the Findings in Diagnostics for VMware Cloud Foundation, please subscribe to the Diagnostics for VMware Cloud Foundation Findings KB. This KB is updated whenever an in-product update or management pack update of the Diagnostics Findings is released.