Today we have a post about virtual networking from Ramprasad K.S., who is a senior tech support engineer in our Bangalore office.
Have you ever had a case where a virtual machine loses its configured NIC?
In vSphere we introduced “Hot Add/Remove” for network adapters and SCSI controllers, along with CPU and memory. This means you can now add or remove these devices while a VM is powered on and the guest is running. This capability is not limited to the management interface: these devices also show up as hot-removable inside the guest (in Windows, through the “Safely Remove Hardware” icon in the system tray).
- One reason is Hot Removal from the Guest. With the new Hot Add/Remove feature, NICs show up under the “Safely Remove Hardware” list. Any user with administrative privileges can accidentally remove the NIC using this feature. This is a common reason why the NIC has gone missing. This misstep results in numerous calls into support.
- Another reason a NIC can go missing is that someone manually removed it from the Virtual Machine configuration (probably using the UI or an SDK API).
In both cases we can turn to the Virtual Machine logs for clues as to which of these methods was used.
If the NIC is removed using the UI (“Edit Settings” for the Virtual Machine), one would see API calls logged in the vmware.log of the Virtual Machine. The log text would be similar to the following:
Mar 15 03:13:37.392: vmx| Vix: [466627 vmxCommands.c:1929]: VMAutomation_HotPlugBeginBatch. Requested by connection (1).
Mar 15 03:13:37.420: vmx| Vix: [466627 vmxCommands.c:1861]: VMAutomation_HotRemoveDevice
Mar 15 03:13:37.420: vmx| VMAutomation: Hot remove device. asyncCommand=3E10BA28, type=54, idx=1
Mar 15 03:13:37.420: vmx| Requesting hot-remove of ethernet1
The lines above indicate that the NIC removal was initiated through either an SDK API call or the UI. The following log segment indicates that the hot removal completed.
Mar 15 03:13:37.463: vmx| Powering off Ethernet1
Mar 15 03:13:37.463: vmx| Hot removal done.
You may also observe the VM pause for a brief time to complete the removal.
Mar 15 03:13:37.447: vmx| Checkpoint_Unstun: vm stopped for 17696 us
If the NIC is instead hot-removed from inside the guest, we will see slightly different log entries. There will be no indication of VMAutomation being involved. The start of the removal is identified by the following lines:
May 27 16:38:52.903: vcpu-0| CPT current = 0, requesting 1
May 27 16:38:52.903: vcpu-0| CONFIGDB: Logging Ethernet0.pciSlotNumber=-1
Completion of the hot removal can be identified by the same log messages as in the earlier case.
May 27 16:38:53.417: vmx| Powering off Ethernet0
May 27 16:38:53.418: vmx| Hot removal done.
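The two log signatures above can be told apart mechanically. The following is a minimal sketch, assuming the message texts match the excerpts in this post (other ESX builds may log differently); the function name `classify_nic_removals` and the labels `ui-or-api` and `guest` are made up for illustration and are not part of any VMware tool.

```python
import re

# Patterns copied from the vmware.log excerpts in this post (an assumption:
# other builds may word these messages differently).
REQUEST_RE = re.compile(r"Requesting hot-remove of (ethernet\d+)", re.IGNORECASE)
POWER_OFF_RE = re.compile(r"Powering off (ethernet\d+)", re.IGNORECASE)
DONE_MARKER = "Hot removal done"


def classify_nic_removals(log_text):
    """Return a list of (device, origin) tuples for completed hot removals.

    origin is "ui-or-api" when a "Requesting hot-remove" line preceded the
    removal, and "guest" when no such request was logged.
    """
    requested = set()        # devices with a logged hot-remove request
    last_powered_off = None  # device named in the latest "Powering off" line
    results = []
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            requested.add(match.group(1).lower())
            continue
        match = POWER_OFF_RE.search(line)
        if match:
            last_powered_off = match.group(1).lower()
            continue
        if DONE_MARKER in line and last_powered_off is not None:
            origin = "ui-or-api" if last_powered_off in requested else "guest"
            results.append((last_powered_off, origin))
            requested.discard(last_powered_off)
            last_powered_off = None
    return results


sample = """\
Mar 15 03:13:37.420: vmx| Requesting hot-remove of ethernet1
Mar 15 03:13:37.463: vmx| Powering off Ethernet1
Mar 15 03:13:37.463: vmx| Hot removal done.
"""
print(classify_nic_removals(sample))  # [('ethernet1', 'ui-or-api')]
```

The sketch only reports a removal once it sees “Hot removal done.”, so a “Powering off” line logged during a normal VM shutdown is not counted.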
Note: NIC removal is always a user-initiated process, either outside the guest (using the UI) or inside the guest. There are no other reasons why a NIC should go missing from the Virtual Machine configuration.
- Hot Add/Remove has to be disabled at the level of each Virtual Machine. At this time there is no global setting at the ESX/vCenter level that applies to all Virtual Machines. The parameter that controls the hotplug behavior of the devices is devices.hotplug. Please follow Knowledge Base article 1012225: Disabling the HotPlug capability in ESX 4.0 virtual machine
Note: Remember that disabling hotplug means you can neither add nor remove a device while the Virtual Machine is powered on.
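As an illustration of the per-VM setting, here is a small sketch that adds or updates an option in the text of a .vmx file. It assumes the entry takes the form devices.hotplug = "false", which is my reading of the KB article, so confirm the exact name and value there. The helper `set_vmx_option` is hypothetical, and the sketch assumes the Virtual Machine is powered off while its configuration is edited.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing any existing entry.

    Hypothetical helper for illustration; it works on the file contents as a
    string and does not touch the datastore itself.
    """
    new_line = f'{key} = "{value}"'
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # drop any duplicate entries for the same key
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Assumed form of the setting; check KB 1012225 before relying on it.
original = 'displayName = "vm1"\nmemsize = "1024"\n'
print(set_vmx_option(original, "devices.hotplug", "false"))
```

Because the setting is per Virtual Machine, a helper like this would have to be run once for every VM you want to protect.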
- For guests running Windows operating systems, we can use a registry hack to hide the hot-removable capability of the NIC. Be careful with this method, as it involves potentially dangerous registry editing. Please back up your registry before proceeding with any edits.
- Run regedit as Local System account. One way to do this is to run “at <current time + 1 min in 24 hr format> /interactive regedit.exe”, without the quotes. Something like “at 00:33 /interactive regedit.exe”
- Now go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum, search for E1000
- Set the Capabilities value in the key(s) found above to its current value minus 4.
For example, we have the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\VEN_8086&DEV_100F&SUBSYS_075015AD&REV_01\4&47b7341&0&1088 with the Service value E1000. Capabilities is set to 6. On changing the value to 2, the E1000 NIC immediately disappears from the “Safely Remove Hardware” list.
If the guests are part of a domain, you might be able to push these registry changes out to the guests.
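The “current value minus 4” arithmetic can be written as clearing a single bit, which has the advantage of being safe to apply twice. The sketch below assumes that the 4 in Capabilities is a stand-alone “removable” bit flag (it appears to correspond to CM_DEVCAP_REMOVABLE in the Windows headers, but treat that as an assumption). The function names are made up for illustration, and the code only computes the new value; it does not edit the registry.

```python
REMOVABLE = 0x4  # assumed to be the "removable" bit of the Capabilities value


def is_listed_as_removable(capabilities):
    """True if the assumed removable bit is set in a Capabilities value."""
    return bool(capabilities & REMOVABLE)


def hide_from_safely_remove(capabilities):
    """Clear the removable bit and leave every other bit unchanged."""
    return capabilities & ~REMOVABLE


current = 6  # value from the example key in this post
print(hide_from_safely_remove(current))                              # 2
print(hide_from_safely_remove(hide_from_safely_remove(current)))     # 2
```

Subtracting 4 from a value that has already been changed to 2 would produce a nonsensical result, whereas clearing the bit leaves 2 as it is.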