Monthly Archives: June 2010

Some random, but useful thoughts

Another guest post from Tech Support Engineer Mike Bean speaking casually about a few observations and a couple of tips.

Good morning lords and ladies! I know I missed last week, but what can I say, you can’t keep a good technical support engineer (or me!) down! First and foremost: I absolutely MUST call attention to something that crossed my Twitter feed a few days ago.

Great day in the morning, why is this not on the front page of the VMware support page in bold lettering? I’m sure you can probably imagine, this is a frequently requested thing. I don’t know if it was only recently automated, or if we just didn’t know it was there, but rest assured I’ll be talking to the building manager about having a 4 story banner sign hung on the side of the building overlooking US36 (in Colorado). (I’m kidding, we don’t actually need a 4 story sign, two stories will do just fine!)

On a side note, one of our employees fairly high in the knowledge base chain of command came to my humble cubicle this afternoon! VMware management has been nothing but supportive of my efforts, but it became clear fairly quickly that neither she nor I had a clear picture of what impact, if any, these blogs are having. I’ve said it before and I’ll say it again: I read my fair share of technology blogs and listen to quite a few podcasts, and generally, the best give the public what they want.

ESX or diet ESX

Since my arrival at VMware, I can safely say that a fair amount of time and energy has been put into the distinction between ESX and ESXi, and I’ve had at least a few conversations with various customers to that effect. The unwelcome truth is that it’s not an overly simple subject. Usually by the time a customer arrives in my queue, they have a few preconceptions, usually because their managers told them “X”, or our sales force told them “Y”; but it’s a subject worth exploring in greater detail. When you have a clear picture of how ESX is supposed to look, it’s generally a lot easier to fix when something’s wrong.

I alluded briefly to the service console in a previous blog. What IS the service console? We’ll let our product design engineers hash out a specific definition. For our purposes, it’s enough to think of the service console as a Linux virtual machine that exists to help you manage your host. Think of it like a maintenance hatch. Don’t confuse the service console with the VMkernel, or vice versa. If you need to call Global Support Services, it’s almost always a good idea to have some form of root access to the service console ready first! Learn how to enable root SSH login here.

Which more or less brings me to my first point: the primary difference between ESX and ESXi, from an operations point of view, is that ESXi has no service console. It doesn’t take a specialist to see that this offers certain security and footprint advantages. The service console is a potential point of weakness, and without it, ESXi can be considered inherently more secure. At least, that’s the typical argument for using ESXi over ESX. You may well have heard rumors by now that VMware’s focus is on ESXi. To the best of my knowledge, those rumors are completely true. Why, then, would anyone choose ESX?

Simple: the service console is a Linux derivative. Generally, anyone who’s familiar with Red Hat will probably feel right at home. That means that ESX (classic) is SUBSTANTIALLY easier to maintain. Commands and techniques that work on ESX often don’t work on ESXi, and vice versa. Ultimately, when a customer asks me on the phone, “What should I use?”, I generally advise them to plan according to our commitments, not our conversations. For the time being, we are committed to, and supporting, both ESX and ESXi. So most of the time, a customer should choose between ESXi’s enhanced security and smaller memory footprint, and ESX’s ease of use.

Please be aware, some men and women far smarter than myself are working very hard to bring ESX’s ease of use to ESXi, and I have every confidence that they WILL succeed. In the meantime however, when customers choose ESXi, I usually try to suggest to them that they familiarize themselves with certain tools.

ESXi lives almost COMPLETELY in RAMdisk!

This is relevant because EVERYTHING inevitably enters an error state sooner or later. It’s the nature of software. If ESXi does enter an unrecoverable state that forces you to reboot, it loses the contents of RAM, including its logs, and becomes VERY difficult to diagnose. For this reason, I ALWAYS recommend the use of an external syslog server! (Syslog is a standard logging protocol, long used on Unix and Linux systems, that ESXi can be configured to use.)

See our Knowledgebase article: Enabling syslog on ESXi

This isn’t the easiest thing in the world to configure, but when you do enter an error state that forces a reboot, you’ll have your system logs to fall back on, and that, is a substantial advantage! It only takes one outage incident for a syslog server to pay for itself in spades!
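If you just want to confirm that log messages can reach a collector before you commit to a proper syslog server, a throwaway listener is enough. The following is a minimal sketch in Python, not a production syslog server. It assumes the classic BSD syslog format (RFC 3164), where each message starts with a `<PRI>` value encoding facility * 8 + severity, sent over UDP port 514. The names `parse_pri`, `SyslogHandler` and `serve` are made up for this illustration.

```python
import socketserver

# Severity names in the order defined by the BSD syslog format (RFC 3164).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]


def parse_pri(message):
    """Split a '<PRI>rest' syslog line into (facility, severity_name, rest).

    PRI encodes facility * 8 + severity. Lines without a valid PRI prefix
    come back with facility and severity set to None.
    """
    if message.startswith("<") and ">" in message:
        pri_text, rest = message[1:].split(">", 1)
        if pri_text.isascii() and pri_text.isdigit() and int(pri_text) <= 191:
            pri = int(pri_text)
            return pri // 8, SEVERITIES[pri % 8], rest
    return None, None, message


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received datagram to a per-sender log file."""

    def handle(self):
        data = self.request[0]  # for UDP servers, request is (data, socket)
        text = data.decode("utf-8", errors="replace").rstrip()
        facility, severity, rest = parse_pri(text)
        sender = self.client_address[0]
        with open(f"syslog-{sender}.log", "a", encoding="utf-8") as log:
            log.write(f"{severity or 'unknown'} facility={facility} {rest}\n")


def serve(host="0.0.0.0", port=514):
    # Binding to port 514 normally requires elevated privileges.
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` on a spare machine and then pointing a test host at it should produce one file per sending address. The point of the exercise is the same as above: the logs end up somewhere that survives a host reboot.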

vMA

Familiarize yourself with the vMA (vSphere Management Assistant). Think of it as a portable service console for your hosts. The vMA, and the vSphere CLI it contains, understands many of the same commands as the ESX service console. Use it to connect to your ESXi host, and you’ll regain a great many of the commands ESXi lost.

Check out: vSphere Management Assistant (vMA) [appliance for vSphere CLI, vSphere SDK for Perl, and SMI-S] 
-AND-
vSphere Command-Line Interface Installation and Reference Guide

Feel free to hate me while you’re learning it ;). I’ve been working in ESX administration & troubleshooting for years now, and even I still find the vMA/CLI a little challenging. That said, trust me, you might not like using it today or tomorrow, but there will come a day when you won’t know how you lived without it. Syslog and the vMA are essential tools, practically the Batman and Robin of ESXi administration, and your environment will almost certainly be better off for it!

That’s all for this week. To reiterate: don’t be shy about letting the knowledge base team know what you like! If we’re doing things right, let us know so we can keep on trucking! More importantly, there’s an awful lot of possible subjects out there, so if there are topics you feel are weak, or that you just flat want to know more about, send us a shout-out/ping.

Live well!

NIC is missing in my Virtual Machine

Today we have a post about virtual networking from Ramprasad K.S., who is a senior tech support engineer in our Bangalore office.

Have you ever had a case where a virtual machine loses its configured NIC?

Background

In vSphere we introduced “Hot Add/Remove” for network adapters and SCSI controllers, along with CPU and memory. This means you can now add or remove these devices while a VM is powered on and the guest is running. This action is not limited to the management interface. These devices also show up as hot removable in the guest (in Windows you use the “Safely Remove Hardware” icon in the system tray).

There are two reasons why a NIC can go missing from a virtual machine:
  • One reason is hot removal from the guest. With the new Hot Add/Remove feature, NICs show up in the “Safely Remove Hardware” list. Any user with administrative privileges can accidentally remove the NIC using this feature. This is a common reason why a NIC goes missing, and this misstep results in numerous calls into support.
  • The other reason is that someone manually removed it from the virtual machine configuration (probably using the UI or an SDK API).

How can we find out which one of the methods was used?

In both cases we can turn to the virtual machine logs for clues as to which one of these methods was used.

NIC removed from VM using UI (Edit Settings)

If the NIC is removed using the UI (“Edit Settings” for the virtual machine), you will see API calls being logged in the vmware.log of the virtual machine. The log text would be similar to the following:

Mar 15 03:13:37.392: vmx| Vix: [466627 vmxCommands.c:1929]: VMAutomation_HotPlugBeginBatch. Requested by connection (1).
Mar 15 03:13:37.420: vmx| Vix: [466627 vmxCommands.c:1861]: VMAutomation_HotRemoveDevice
Mar 15 03:13:37.420: vmx| VMAutomation: Hot remove device. asyncCommand=3E10BA28, type=54, idx=1
Mar 15 03:13:37.420: vmx| Requesting hot-remove of ethernet1

The line immediately above indicates that the NIC removal was initiated by either an SDK API call or the UI, and the following log segment indicates that the hot removal completed.

Mar 15 03:13:37.463: vmx| Powering off Ethernet1
Mar 15 03:13:37.463: vmx| Hot removal done.

You may also observe the VM pause for a brief time to complete the removal:

Mar 15 03:13:37.447: vmx| Checkpoint_Unstun: vm stopped for 17696 us

NIC removed inside the Guest

In this case we will see slightly different log entries. There will be no indication of VMAutomation being involved here. The start of the removal is identified by the following lines:

May 27 16:38:52.903: vcpu-0| CPT current = 0, requesting 1
May 27 16:38:52.903: vcpu-0| CONFIGDB: Logging Ethernet0.pciSlotNumber=-1

Completion of the hot removal can be identified by the same log messages as in the earlier case:

May 27 16:38:53.417: vmx| Powering off Ethernet0
May 27 16:38:53.418: vmx| Hot removal done.

Note: NIC removal is always a user-initiated process, either outside of the guest (using the UI or an API) or inside the guest. There are no other reasons why a NIC should go missing from the virtual machine configuration.
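If you have many vmware.log files to go through, the distinction described above can be checked mechanically. The sketch below is only an illustration built on the sample log lines in this post. It assumes that a `VMAutomation_HotRemoveDevice` entry before “Hot removal done.” means the UI or an API was used, and that its absence means the removal came from inside the guest. The function name `classify_nic_removal` is hypothetical, and log wording may differ between ESX versions.

```python
def classify_nic_removal(log_text):
    """Classify NIC hot removals found in the text of a vmware.log file.

    Returns a list of (device, origin) tuples. origin is "ui_or_api" when a
    VMAutomation hot-remove request preceded the removal, "guest" otherwise.
    """
    results = []
    automation_seen = False
    device = None
    for line in log_text.splitlines():
        if "VMAutomation_HotRemoveDevice" in line:
            # The removal was requested through the management layer.
            automation_seen = True
        elif "Powering off Ethernet" in line:
            # Remember which virtual NIC is being removed.
            device = line.rsplit("Powering off ", 1)[1].strip()
        elif "Hot removal done" in line:
            if device is not None:
                origin = "ui_or_api" if automation_seen else "guest"
                results.append((device, origin))
            # Reset state so the next removal is judged on its own lines.
            automation_seen = False
            device = None
    return results
```

Feeding it the contents of a vmware.log (for example `classify_nic_removal(open("vmware.log").read())`) gives one entry per NIC removal it recognizes. Treat the result as a hint and confirm it against the raw log lines.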

How to Stop the NIC from being removed
  • Hot Add/Remove has to be disabled at the level of each virtual machine. At this time we don’t have a global setting at the ESX or vCenter level that would apply to all virtual machines. The parameter that controls the hotplug nature of the devices is devices.hotplug. Please follow Knowledge Base article 1012225: Disabling the HotPlug capability in ESX 4.0 virtual machine

Note: Remember that disabling hotplug means you can neither add nor remove a device from a virtual machine while it is powered on.

  • For Guests running Windows operating systems, we can use a registry hack to hide the hot removable capabilities of the NIC. Be careful following this method as it uses potentially dangerous registry editing. Please backup your registry before proceeding with any edits.
    • Run regedit under the Local System account. One way to do this is to run “at <current time + 1 min in 24 hr format> /interactive regedit.exe”, without the quotes. For example: “at 00:33 /interactive regedit.exe”
    • Now go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum and search for E1000.
    • In the key(s) found above, set the Capabilities value to its current value minus 4.

For example, we have the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\VEN_8086&DEV_100F&SUBSYS_075015AD&REV_01\4&47b7341&0&1088 with the Service value E1000. Capabilities is set to 6. After changing the value to 2, the E1000 NIC immediately disappears from the Safely Remove Hardware list.
If the guests are part of a domain, you might be able to push these registry changes out to the guests centrally.
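The “current value minus 4” step is really clearing a single bit. To my understanding, the value 4 corresponds to the “removable” flag in the Windows device capability bits (treat that as an assumption and check it against Microsoft’s documentation). Subtracting 4 only gives the right answer if that bit is currently set. A small Python sketch of the arithmetic, with hypothetical function names, shows a safer way to compute the new value:

```python
# Assumption: bit value 4 in Capabilities marks the device as removable.
REMOVABLE = 0x4


def is_listed_as_removable(capabilities):
    """True if the removable bit is set in a Capabilities value."""
    return bool(capabilities & REMOVABLE)


def hide_from_safely_remove(capabilities):
    """Return the Capabilities value with the removable bit cleared.

    Unlike plain subtraction, masking is safe to apply twice: a value that
    already has the bit cleared is returned unchanged.
    """
    return capabilities & ~REMOVABLE
```

With the example above, `hide_from_safely_remove(6)` gives 2, and applying it to 2 again still gives 2, whereas subtracting 4 a second time would produce a bogus value. This only computes the number; writing it to the registry is still the manual step described in the list.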

New articles published for week ending 6/26/2010

Virtual Disk Development Kit
Increase in memory consumption while working with VDDK (1022540)
Date Published: 6/21/2010

VMware Data Recovery
VMware Data Recovery Integrity Check fails with the error: Trouble reading from destination volume, error -2241 (1018060)
Date Published: 6/23/2010
VMware Data Recovery incorrectly reports that backing up a virtual machine with an independent disk is successful (1018764)
Date Published: 6/25/2010
Using Telnet to test VMware Data Recovery Appliance connectivity (1018799)
Date Published: 6/24/2010
Upgrading to VMware Data Recovery 1.1 fails with the error: Could not restore the Data Recovery appliance configuration (1020077)
Date Published: 6/23/2010

VMware ESX
Cannot change an ESX host’s vpxuser account back to its default settings (1005759)
Date Published: 6/24/2010
Committing snapshots or cloning a virtual machine with snapshots fails with the vmkernel log error: Conflict between read-only and non-read-only modes (1006401)
Date Published: 6/23/2010
Installing VMware Tools on a Windows 2000 Terminal Server virtual machine does not complete (1014798)
Date Published: 6/22/2010
vSphere role recommendations do not appear after upgrading to vCenter Server 4.0 Update 1 (1018261)
Date Published: 6/23/2010
Update Manager fails to remediate an ESX 3.5 host with the error: Pre-install failed for [‘VMware-esx-scripts.i386’] (1018456)
Date Published: 6/22/2010
SCSI reservation conflict errors reported on all hosts at the same time, with one host reporting the error: remote port time out (1018675)
Date Published: 6/25/2010
Extracting the ks-first-safe.cfg and ks-first.cfg files for a scripted installation of ESX 4.0 (1018990)
Date Published: 6/24/2010
Error when powering on a virtual machine or removing a USB device: Device USB is not supported (1019203)
Date Published: 6/22/2010
Upgrade paths for ESX/ESXi hosts (1019239)
Date Published: 6/24/2010
The ethtool -k command incorrectly reports tx_csum and TcpSegmentationOffload settings (1019626)
Date Published: 6/23/2010
Upgrading to ESX 4.0 Update 1 fails with glibc errors (1020036)
Date Published: 6/25/2010
Network connection drops and never recovers on systems under heavy networking and processor load (1020060)
Date Published: 6/22/2010
VMware ESX 3.5, Patch ESX350-201006405-SG: Updates GNU GCC package (1020169)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006406-SG: Updates GNU Gzip package (1020170)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006407-SG: Updates NTP package (1020171)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006408-SG: Updates Kerberos package (1020172)
Date Published: 6/25/2010
Error when accessing VMFS volume: WARNING: J3: 1644: Error freeing journal block (1020198)
Date Published: 6/22/2010
Remediating an ESX host fails with the error: A general system error occurred: Invalid argument (1020432)
Date Published: 6/21/2010
Changing the location of ESX core dumps (1020668)
Date Published: 6/23/2010
Migrating to a new vCenter Server with the Cisco Nexus 1000v DVS (1020893)
Date Published: 6/22/2010
Virtual Port ID based Load Balancing: Multiple vmnics teamed together however only 1 vmnic is used for data transfer (1021021)
Date Published: 6/24/2010
VMware ESX 3.5, Patch ESX350-201006409-BG: Updates CIM and Pegasus (1021179)
Date Published: 6/25/2010
Converting a template to a virtual machine fails with the error: A component of the virtual machine is not accessible on the host (1021563)
Date Published: 6/21/2010
vmnic creation fails for one or multiple ports during initialization of ESX when using Qlogic 10gb FCOE CNA adapters (QLE8142-dual port card) (1021785)
Date Published: 6/21/2010
Windows Server 2008 does not recognize more than one vCPU in Windows Task Manager (1022715)
Date Published: 6/22/2010
VMware ESX 3.5, Patch ESX350-201006401-SG: Updates VMkernel, VMware Tools, hostd, VMX, VMnix (1022899)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006402-BG: Updates Emulex Fibre Channel driver (1022900)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006403-BG: Updates QLogic driver (1022902)
Date Published: 6/25/2010
VMware ESX 3.5, Patch ESX350-201006404-BG: Updates e1000 driver (1022903)
Date Published: 6/25/2010
VMotion fails on a NFS datastore (1023230)
Date Published: 6/25/2010
Upgrading VMware Tools may fail when using Automatic Tools Upgrade in the vSphere Client (1023459)
Date Published: 6/25/2010
Unable to Register vSphere Essentials License keys (1023485)
Date Published: 6/25/2010
Unable to mount local VMFS partition in ESX 3.0.x when using SATA/SAS disks (1007658)
Date Published: 6/24/2010
Unable to remove or create an NFS datastore (1006790)
Date Published: 6/24/2010
SQL cluster node fails with the error: System failed to flush data to the transaction log (1015595)
Date Published: 6/24/2010
Cannot create a quiesced snapshot because the snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine (1018194)
Date Published: 6/21/2010
Rebooting ESX/ESXi when Storage vMotion is in progress might create a duplicate virtual machine with the same name (1023182)
Date Published: 6/24/2010
VMware Tools installation fails to start the guest operating system daemon on Red Hat Enterprise Linux 4 64-bit guests with the 32-bit glibc-common package installed (1023185)
Date Published: 6/24/2010
VMware ESXi 3.5, Patch ESXe350-201006401-I-SG: Updates firmware (1020052)
Date Published: 6/25/2010
VMware ESXi 3.5, Patch ESXe350-201006402-T-BG: Updates VMware Tools (1020173)
Date Published: 6/25/2010

VMware Fusion
Fusion virtual machine data missing after installing guest operating system and restarting (1019423)
Date Published: 6/22/2010
Common Mac OS operations that may be needed with VMware Fusion (1022113)
Date Published: 6/21/2010
Investigating virtual machine resources in VMware Fusion (1022213)
Date Published: 6/23/2010
Opening Mac documents in a virtual machine (1023218)
Date Published: 6/24/2010
Virtual Machine Window Becomes Black if You Disable 3D Acceleration (1015395)
Date Published: 6/22/2010

VMware vCenter CapacityIQ
Collecting diagnostic information for VMware vCenter CapacityIQ (1022927)
Date Published: 6/25/2010

VMware vCenter Lab Manager
After Upgrading From VMware vCenter Lab Manager 4.0.0.1140, Unable to Upgrade Agent if VMs are Using Host Spanning. (1023159)
Date Published: 6/23/2010

VMware vCenter Orchestrator
Add support for RAC and TNS configuration for Oracle 11g Database instances to vCenter Orchestrator (1022828)
Date Published: 6/25/2010

VMware vCenter Server
Modifying the default password settings for the vpxuser account (1016736)
Date Published: 6/25/2010
Disk provisioning through vCenter Server’s Scheduled Task feature fails (1016835)
Date Published: 6/25/2010
Creating View desktops causes the vCenter Server error: The operation is not supported on the object (1017380)
Date Published: 6/24/2010
Installing vCenter Server 4.0 fails with the error: Error 25205. Error loading license key to LDAP Server (1018752)
Date Published: 6/22/2010
Accessing the vCenter Database Entity-Relationship diagram in SQL (1018790)
Date Published: 6/22/2010
Unable to view the Performance Chart for disks on virtual machines that use NFS storage (1019105)
Date Published: 6/23/2010
Windows Event Viewer reports the event: The configuration of the admin connection/TCP protocol in the SQL instance SQLEXP_VIM is not valid (1021188)
Date Published: 6/21/2010
Upgrading to vCenter Server 4.0 fails with the error: setup failed with an unknown error during DSN validation (1022249)
Date Published: 6/25/2010
The vCenter Server vpxd log reports this message: Resource module ‘xxxx’ not found. Using from default locale (1022708)
Date Published: 6/21/2010
Virtual machine becomes orphaned and reports a Not protected status when trying to enable Fault tolerance (1019675)
Date Published: 6/21/2010

VMware vCenter Site Recovery Manager
Upgrading to Site Recovery Manager 4.0.1 fails with the error: Failed to update Perl installation directories (1023191)
Date Published: 6/23/2010

VMware View Manager
Linked clone desktops do not power on after a successful recompose (1016286)
Date Published: 6/25/2010
Logging into the View Admin Console is slow (1016727)
Date Published: 6/25/2010
View Client log in window shows the username of the last user and connection server (1018772)
Date Published: 6/23/2010
Performing a scripted install or upgrade of the VMware View Agent or Client without rebooting the desktop (1018787)
Date Published: 6/24/2010
VMware View Composer My Documents is not redirected to the data disk (1019342)
Date Published: 6/25/2010
Moving automated persistent desktops to another cluster (1020113)
Date Published: 6/22/2010
Using the View Manager vdmadmin.exe command line utility (1021011)
Date Published: 6/22/2010
Text loses formatting when pasted from a View desktop using PCoIP (1021771)
Date Published: 6/21/2010
The View Composer service fails to start after the Composer DSN password is changed (1022526)
Date Published: 6/25/2010
Microsoft NetMeeting not available for RDP connections in VMware View (1023315)
Date Published: 6/25/2010
Force replication between ADAM databases (1021805)
Date Published: 6/25/2010
Redirecting a USB flash drive might take several minutes (1022836)
Date Published: 6/23/2010
You are unable to use Smart card from a View desktop using PCoIP (1022390)
Date Published: 6/21/2010

VMware Virtual Disk Development
VDDK fails to open VMDK when vCenter Server uses a non-default SSL port (1022884)
Date Published: 6/21/2010

VMware VirtualCenter
Opening a virtual machine console with Web Access fails with the error: Connection actively refused (1005592)
Date Published: 6/24/2010
Upgrading VirtualCenter 2.5.x Update 3 or earlier fails with the error: The DSN points to an older version of database repository (1022925)
Date Published: 6/23/2010
vSphere Client console screen is blank during an RDP session (1023109)
Date Published: 6/22/2010
Cloning an SLES 10 virtual machine and applying a customization fails (1007404)
Date Published: 6/24/2010

VMware vSphere Web Access
Unable to log in to vCenter Server using Web Access (1023245)
Date Published: 6/24/2010

VMware Support Twitter tweaks

We like to think we’re a pretty progressive support organization at VMware, so it should come as no surprise that we have been engaged in various social networking activities for some time now. This very blog came out of our strategy to communicate more openly with our customers, and both our customers and VMware are better for it.

We have also been using Twitter, in two different capacities. @vmwarekb was created to provide real-time updates from support about new Knowledge Base articles, as well as weekly digests, highlights and alerts.

@vmwarecares has been around for a while with a mandate to be on the constant lookout for customer care opportunities, and point people to the right resources for self-help when they were looking for technical information.

Over the next months you’ll see a shift in how these services are divided between the two. We’re going to move the techie stuff away from @vmwarecares and it will be a customer service only contact point. @vmwarekb on the other hand is going to become more interactive than it has been in the past. We hope this division of duties will be simpler for customers to understand and at the same time will allow us to expand service in both.

If you have any comments about our twittering or blogging activities, we’d love to hear them! Let us know in the comment area below what YOU think!

VMware Snapshots

Today we have another guest post from Tech Support Engineer Mike Bean speaking casually about snapshots, a commonly misunderstood piece of the VMware solution.

If you ever want to make your VMware support representative cringe, just tell him or her you’re calling about a snapshot problem. Snapshots are very high on the list of misunderstood features, and to complicate things, snapshot problems often result in data loss, and let’s be honest, data loss is never funny.

ESX anatomy 101

To understand how snapshots operate, it’s important to understand the composition of your average virtual machine. To be sure, various virtualization architectures exist, but VMware’s is fairly straightforward. Every virtual machine consists of two parts, a *.vmx, and a *.vmdk. You’ll fairly frequently see other components, but in the end, if you do not have a *.vmx, and a *.vmdk, you don’t have a virtual machine. As we dive a little deeper, the *.vmdk consists of two parts:

1) <File>.vmdk – This, in the jargon, is called the descriptor. It is what it sounds like: the file that contains the characteristics of the disk. If it’s lost, it can be re-created.

2) <File>-flat.vmdk – This is the actual disk. This is the money file. It is the deal breaker. The buck very definitely stops here. If the data is damaged, do not pass go, do not collect $200, just restore from a backup.

So, ultimately, our metaphorical VM will look something like this:

[Image: VMware Snapshots]

Next, let’s add some secret sauce, and start taking some snapshots. ESX creates another descriptor, and starts creating a “changes” or delta file. The “changes” file is a continuous record of the block-level changes to the disk. This is an important concept. VMware snapshots, unlike SAN-based snapshots, ARE NOT COPIES. Most everywhere else you look, a snapshot is a copy or an image. The typical assumption is that if something goes wrong with your disk or backup, you can revert to the image. In ESX, that dog won’t hunt.

As you continue to work, your changes are recorded in the delta file. If the original disk is hypothetically damaged, you CANNOT revert to the snapshot, because the snapshot is not an autonomous disk; and removing the changes will not repair the damage. (We can’t always know what caused the damage in the first place).

Let’s add some additional snapshots to the mix. Take an additional snapshot, and what you’re really doing is tracking the block-level changes between the first snapshot and the VM’s current state. It doesn’t take a VMware Technical Support Engineer to see how this can get out of control very quickly. We call these structures “snapshot chains”.

[Image: VMware Snapshots]

Take a look at our snapshot chain. Let’s, for argument’s sake, poke a hole in it, and damage one of the delta files. Unix/Linux administrators out there will probably remember their old textbooks that discuss the difference between absolute and relative paths. The “changes” files behave like relative paths: each one only makes sense relative to the one before it. Because one of the “mile markers” is now, for want of a better term, damaged, ALL of the changes data below the damage is now suspect.

Generally speaking, have a problem with snapshot 3, and you’re fine, just revert to snapshot 2. If you have a problem with snapshot 2, snapshot 3 is now entirely unreliable, because the changes it records, no longer apply. Have a problem with snapshot 1, and snapshots 2 AND 3 are now suspect!
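The dependency rule above is simple enough to write down. Here is a toy model in Python, purely illustrative: it treats a chain as an ordered list of names with the base disk first, and the function names are made up for this sketch. It has nothing to do with how ESX actually stores or reads delta files.

```python
def suspect_snapshots(chain, damaged):
    """Return the links that can no longer be trusted once `damaged` is hit.

    `chain` lists the disks in dependency order, base disk first. Each delta
    only records changes relative to the link before it, so damage affects
    the damaged link and everything after it.
    """
    index = chain.index(damaged)
    return chain[index:]


def last_usable(chain, damaged):
    """Return the newest link that does not depend on the damaged one."""
    index = chain.index(damaged)
    return chain[index - 1] if index > 0 else None
```

For a chain of `["base", "snapshot 1", "snapshot 2", "snapshot 3"]`, damage to “snapshot 2” makes snapshots 2 and 3 suspect and leaves “snapshot 1” as the last usable point. Damage to “base” leaves nothing usable at all, which is exactly why a snapshot is not a backup.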

I’m sure you can see how this could lead to some unhappy people having unpleasant conversations! To illustrate, an office co-worker of mine got a call once from a company trying to recover a corrupted/damaged base disk that had YEARS worth of snapshots. It didn’t end well.

By now it should be readily apparent why snapshots do not make good backups. More to the point, it’s just not good digital asset management. A good backup infrastructure has to be able to stand on its own two feet; a spare tire in the trunk won’t help you if the check-engine light in your car comes on. In that sense, I’d like to propose an alternative way of thinking about the subject.

Engineers/software nerds fairly commonly use a concept that, for want of a better term, we’ll call version control. The code exists in a main branch or trunk. Write a new feature, or code a new bug-fix, and check the new code into the “build”. If the new bug-fix doesn’t work out, back it out and use the build prior to the fix. Ultimately, however, if the new bug-fix DOES work out, that, in essence, BECOMES the new build.

Humbly, I suggest we emulate this kind of thinking. Use snapshots not to create backups for your VMs, but as a form of version control. Snapshots are intended for short-term use only. Got an OS patch coming for a critical VM? Take a snapshot and wait a couple of days, perhaps a week. Once you’re certain the patch is viable and won’t cause excessive disruption, remove the snapshot!

I spoke to a customer once who had set up something he called his “nag script”. It routinely checked for the presence of snapshots older than a given interval, and began emailing the VM’s custodians on a regular basis to remind them to remove them. SMART. If I’d had an ESX infrastructure of my own, I would’ve asked if he’d be willing to share the code for his “nag script”. I can’t even begin to describe how many calls to Support could be avoided completely with simple, faithful adherence to this principle.

Don’t misunderstand me: when used as version control, snapshots can be a powerful tool. My primary goal in writing this article is not to discourage snapshot use, but to encourage responsible snapshot use, and try to impart some sense of WHY it’s important. I’ve said it before in previous articles and I’ll say it again: ultimately, the only safe policy is one of shared information (informed consent).

I’ve spoken with numerous customers over the years who’ve viewed their support calls as an opportunity to learn and ask questions, and I’ve always tried to encourage that attitude. Sometimes they want to understand what I’m doing, sometimes they want to record the WebEx session, sometimes they just want to take notes, and we do our best to respond in kind! Ultimately, we’re all on the same side!
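I never did see that customer’s “nag script”, so what follows is NOT his code. It is a minimal sketch of the core logic in Python, under a few loudly stated assumptions: the snapshot inventory is handed in as a plain list of dictionaries with hypothetical keys (`vm`, `name`, `created`, `owner`), and collecting that inventory from your environment and actually sending the email are left out entirely.

```python
from datetime import datetime, timedelta


def stale_snapshots(snapshots, max_age, now=None):
    """Return the snapshots older than max_age.

    `snapshots` is an iterable of dicts with the hypothetical keys
    'vm', 'name', 'created' (a datetime) and 'owner'.
    """
    now = now or datetime.now()
    return [snap for snap in snapshots if now - snap["created"] > max_age]


def build_reminders(snapshots, max_age=timedelta(days=7), now=None):
    """Build (owner, message) pairs for every snapshot past its welcome."""
    now = now or datetime.now()
    reminders = []
    for snap in stale_snapshots(snapshots, max_age, now):
        age_days = (now - snap["created"]).days
        message = (
            f"Snapshot '{snap['name']}' on VM '{snap['vm']}' is "
            f"{age_days} days old. Please remove it if it is no longer needed."
        )
        reminders.append((snap["owner"], message))
    return reminders
```

Run on a schedule, something along these lines would keep reminding the VM’s custodians until the snapshot is gone, which is the whole point. The seven-day default is only a placeholder taken from the “a couple of days, perhaps a week” guideline above; pick an interval that matches your own change process.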

Until next time!

Addendum: Special thanks to Lisa Bernhardt (GSS, Storage Team) for helping translate!

New articles published for week ending 6/19/2010

Virtual Disk Development Kit
VDDK library returns the error: Failed to open NBD extent, NBD_ERR_GENERIC (1022543)
Date Published: 6/18/2010

VMware Data Recovery
Upgrading to VMware Data Recovery (vDR) 1.2 that use virtual disks (VMDK) or raw device mappings (RDM) for a dedupe datastore (1022346)
Date Published: 6/14/2010
Upgrading to VMware Data Recovery (vDR) 1.2 where Windows Network Shares (CIFS) are used as a dedupe datastore (1022700)
Date Published: 6/14/2010

VMware ESX
Unable to commit virtual machine snapshots because another task is in progress (1010310)
Date Published: 6/16/2010
Exporting a virtual machines from VMware Workstation to ESX (1012258)
Date Published: 6/18/2010
Lost pings and delays in the physical switch during NIC teaming failback (1014325)
Date Published: 6/15/2010
Installing Cisco Nexus VEM on an ESX or ESXi host fails with error: Encountered error VibFormatError (1020094)
Date Published: 6/18/2010
Connecting to vCenter Server or an ESX host using VI Client fails with error: Configuration system failed to initialize (1020406)
Date Published: 6/17/2010
Security warning appears when you run the automatically downloaded vSphere Client installer (1021404)
Date Published: 6/15/2010
Configuring or restoring networking from the ESX service console using console-setup (1022078)
Date Published: 6/14/2010
The ESX 3.x dmesg log reports the error: Failed to exec /sbin/modprobe -s -k scsi_hostadpter (1022534)
Date Published: 6/18/2010
Host Update Utility is unable to scan ESX/ESXi hosts (1021192)
Date Published: 6/18/2010
Virtual machines drop packets intermittently (1017755)
Date Published: 6/14/2010
Troubleshooting Virtual Machines that Lose Disk Access (1022119)
Date Published: 6/16/2010
Retrieving and setting limits for the guest resources of a virtual machine (1022391)
Date Published: 6/18/2010
ESX 4.0 Update 2 Compatibility with Cisco Nexus 1000V Virtual Ethernet Module (1022721)
Date Published: 6/16/2010
Windows Device Manager shows Yellow Bang for VMCI device (1023129)
Date Published: 6/19/2010

VMware ESXi
NIC teaming using EtherChannel leads to intermittent network connectivity in ESXi (1022751)
Date Published: 6/18/2010

VMware Fusion
Resolving the Fusion error: The file specified is not a virtual disk (1020887)
Date Published: 6/14/2010

VMware vCenter Converter
When using vCenter Converter the Standalone agent fails to install with an error (1021465)
Date Published: 6/14/2010
Error "Not more than 12 logical volumes can stay in the disk" when converting a powered on Linux physical machine. (1022739)
Date Published: 6/18/2010

VMware vCenter Lab Manager
VMware VirtualCenter and VMware ESX Support for VMware Lab Manager 3.0.2 (1023114)
Date Published: 6/18/2010

VMware vCenter Server
Enabling Update Manager plug-in when remotely accessing vCenter Server fails (1020291)
Date Published: 6/14/2010

VMware vCenter Site Recovery Manager
Enabling authentication failure auditing for Windows-based applications (1022607)
Date Published: 6/14/2010

VMware vCenter Update Manager
Update Manager fails to scan a host with error: DB Scan Error: Descriptor of release <patchname> not found in vmware-vci-log4cpp.log (1013832)
Date Published: 6/18/2010
ESX 4.0 upgrade fails with the error: Unknown failure (1017590)
Date Published: 6/18/2010
Installing Update Manager fails with the error: 1406 (1015647)
Date Published: 6/18/2010

VMware View Manager
Upgrading VMware Tools in a virtual desktop causes PCoIP connections to fail (1022830)
Date Published: 6/15/2010
Mouse pointer does not track the movement of a redirected composite USB device (1022076)
Date Published: 6/14/2010

Scheduled Maintenance June 18

VMware will be performing a system upgrade to several VMware Web applications on June 18, 2010. Maintenance will begin at 6:00 p.m. Pacific Time and is expected to end at approximately 11:59 p.m. Pacific Time the same day.

While this upgrade is in progress, you will be unable to:

  • Access or manage your VMware account
  • Submit support requests online
  • Download, purchase, or register VMware products
  • Manage VMware product licenses
  • Access VMware Communities

If you need to file a support request while the upgrade is in progress, call VMware Technical Support for assistance.

We appreciate your patience during this maintenance period. These system upgrades are part of our commitment to continued service improvements and will help VMware better serve your needs.

VMware ALERT: View customers using PCoIP are advised to NOT apply Update 2 to ESX 4.0 (yet)

Earlier today VMware became aware of an issue affecting users of VMware View after applying Update 2 to their ESX 4.0 hosts. The problem only affects PCoIP; RDP works normally. There is a discussion of the problem in the VMware Communities here.

While our IT Teams work to resolve the issue, the Knowledge Base Team has responded by creating an up-to-the-minute live document at: http://kb.vmware.com/kb/1022830 and using @vmwarecares and @vmwarekb Twitter accounts to alert customers.

This Knowledge Base article will be updated as new information becomes available. If you have been affected by this, please read the KB.

We apologize for any inconvenience this may have caused you. If you know how to spread the word to your friends and colleagues, please do so.

To Patch, Perchance to Upgrade

Today we have another guest post from Tech Support Engineer Mike Bean.

’Morning everyone! I must take a moment to say thank you to the people, both internally and externally, who’ve expressed support for the first column I wrote. To quote a web comic I enjoy, “we must learn, lest we stagnate.” If readers have enjoyed it as well, then I take that as a compliment!

At the risk of digressing from the “most wanted” theme, I wish to approach a new subject today. By the time you read this, ESX vSphere Update 2 will be publicly available.

http://downloads.vmware.com/d/info/datacenter_downloads/vmware_vsphere_4/4

I gladly extend congratulations to all our development and QA teams the world over. I can say with complete sincerity they’re my heroes, for it is on their backs that our product is built, one feature spec and bug report at a time. Congratulations lords and ladies, hug your significant others, have a beer, and enjoy the moment. You’re our warriors!

I began my morning with a soda and a copy of the release notes, and I can safely say, it’s not light reading. Speaking as an army-ant in Global Support Services, I don’t see any issues we’ve been breathless with anticipation for, but there are also far too many things being addressed to engage in any sweeping generalizations.

http://www.vmware.com/support/vsphere4/doc/vsp_esx40_u2_rel_notes.html

In the course of my time here, I’ve often been asked the eternal question “should I get the patch?” I’m never quite sure how to answer this question, but it’s honestly one worth asking. So it’s worth taking a moment to examine.

Asking a software company if you should apply the patch is a little like asking a lawyer if you should sue; let’s face it, we have a slight bias. On one hand, if we didn’t think our customers would benefit, we wouldn’t have released the patch in the first place. On the other hand, many of my customers are system admins, and I’ve walked a mile in their shoes. In that sense, I’m well aware that they don’t have the liberty of applying patches and firmware flashes every time one of the numerous vendors they do business with releases its latest update. My college economics classes would’ve called it “opportunity cost”. Downtimes must be scheduled, approvals must be obtained, benefits assessed. It is precisely because I have experienced both the software development point of view and the system administration point of view that I’m well aware that many of our customers may not have had problems. Risk is relative; my co-workers and I routinely speak to customers who’ve literally run for years without issues. Why, then, upgrade?

When a customer asks me that question on the phone, I often end up trying to explain that I can’t really answer it for them. It’s the natural course of software development that a changing code base means a changing landscape. Old problems are solved, and new ones arise, and I won’t imply otherwise. However, that should not be interpreted as carte blanche to never patch. Risk may be relative, but as available security and stability fixes accumulate, so does risk, and so does benefit.

It’s a matter of risk assessment. As a GSS TSE (technical support engineer), it’s my responsibility to try and help present the facts and the options, but ultimately, the final decision always belongs to the customer. Only they know their networks, and only they can realistically decide when the benefits of patching will exceed the costs. Inevitably, the only safe policy is one of shared information (informed consent). In that spirit, I encourage most everyone I speak with to familiarize themselves with the available resources, both in documents and communities. Examine the facts for yourself, and let the update speak for itself.

In closing, I would briefly highlight at least one real case from memory. I spoke with a customer some weeks ago who’d been having substantial issues with hangs on his cluster of Dell 2900s. I sincerely hope he’s watching Update 2’s contents very carefully!

http://www.vmware.com/support/vsphere4/doc/vsp_esx40_u2_rel_notes.html#resolvedissues

VMware ESX might fail to boot on Dell 2900 servers
If your Dell 2900 server has a version of BIOS earlier than 2.1.1, ESX VMkernel might stop responding while booting. This is due to a bug in the Dell BIOS, which is fixed in BIOS version 2.1.1.

Workaround: Upgrade BIOS to the version 2.1.1 or later.
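
For anyone who’d rather script that check across a rack of servers than eyeball each one, here is a minimal sketch of the version comparison. It assumes you have already collected each host’s BIOS version as a plain dotted string such as "2.0.3"; how you gather that string is up to your own tooling and is not shown. The function names are hypothetical, made up purely for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_bios_upgrade(installed, minimum="2.1.1"):
    """Return True when the installed BIOS version is older than the minimum.

    The 2.1.1 default comes from the release note quoted above; the
    dotted-numeric format of the version string is an assumption.
    """
    return parse_version(installed) < parse_version(minimum)


if __name__ == "__main__":
    # Sample values for illustration only, not real inventory data.
    for bios in ("2.0.3", "2.1.1", "2.10.0"):
        print(bios, "upgrade needed" if needs_bios_upgrade(bios) else "ok")
```

Comparing tuples of integers rather than raw strings matters here: as plain text, "2.10.0" would sort before "2.1.1", which is exactly the wrong answer.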

You spoke, and we were listening! Till next time!

New articles published for week ending 6/12/2010

VMware Consolidated Backup
When using SSPI, VMware Consolidated Backup fails with the error: Invalid user name or password (1007305)
Date Published: 6/11/2010

VMware Data Recovery
Logging in to a VDR appliance for the first time fails with the error: Could not log onto the server. The connect attempt timed out. (1012551)
Date Published: 6/8/2010

VMware ESX
A virtual machine does not power on after removing an RDM on a NPIV system (1007085)
Date Published: 6/10/2010
Querying SNMP with snmpwalk on ESX 4.0 returns the output: No more variables left in this MIB View (It is past the end of the MIB tree) (1020649)
Date Published: 6/8/2010
Adding an ESX host into a Distributed Virtual Switch fails with the error: Unable to Create Proxy DVS (1020736)
Date Published: 6/8/2010
Update Manager 4.0 Update 2 might display an incorrect conflict resolution message when you scan ESX/ESXi 4.0 hosts managed by Cisco Nexus 1000V (1020907)
Date Published: 6/11/2010
VMware ESX 4.0, Patch ESX400-201006225-UG: Updates the ESX 4.0 Web Access components (1021203)
Date Published: 6/11/2010
Distorted GNOME Login Screen in FreeBSD 8.0 (1021745)
Date Published: 6/10/2010
Root partition on an ESX host becomes read only (1022155)
Date Published: 6/9/2010
Using an API to create a virtual disk when connecting to vCenter Server (1022387)
Date Published: 6/8/2010
Placing a host in maintenance mode when networking operations are triggered (1022389)
Date Published: 6/8/2010
Missing KMOD Package for SLED/SLES 11 SP1 (1022632)
Date Published: 6/10/2010
Invalid pin number in the BIOS descriptions of interrupt routing entries cause undefined ESX/ESXi behavior (1021934)
Date Published: 6/11/2010
ESX/ESXi host becomes unresponsive at vsd-mount when booting after attaching SAN Storage (1014101)
Date Published: 6/9/2010
Provisioning linked clone pools on a vDS (1021193)
Date Published: 6/7/2010
VMware ESX 4.0, Patch ESX400-201006226-UG: Updates the VMware ESX 4.0 EHCI HCD device driver (1021343)
Date Published: 6/11/2010
VMware ESX 4.0 and ESXi 4.0, Patch VEM 4.0: Host upgrade to ESX 4.0 Update 2 Cisco Nexus 1000V release 4.0 (1021571)
Date Published: 6/11/2010
Transmit performance is poor on NetXen’s P3 based 10GB NICs (1021940)
Date Published: 6/11/2010
Receive calibrate_APIC_clock warning on Red Hat Enterprise Linux 5.5 (64-bit) guest operating system (1022250)
Date Published: 6/10/2010
ESX fails to boot after installation on a server with more than 190GB of physical memory (1022487)
Date Published: 6/11/2010

VMware ESXi
Update Manager host tasks might fail in slow networks (1021050)
Date Published: 6/11/2010
ESX hosts stop responding with vCenter Server IP address blacklisted errors in the hostd log (1022449)
Date Published: 6/8/2010

VMware Fusion
Resizing a virtual disk in VMware Fusion (1020778)
Date Published: 6/7/2010
Connecting an external hard drive to a Fusion virtual machine (1021853)
Date Published: 6/8/2010
Overview of VMware Tools for VMware Fusion (1022048)
Date Published: 6/10/2010
Installing VMware Tools in an Ubuntu virtual machine (1022525)
Date Published: 6/10/2010
Fusion auto-updater fails to download update (1022555)
Date Published: 6/9/2010

VMware vCenter Lab Manager
Deploying a single virtual machine inside of a Lab Manager configuration fails with the error: An error in contacting the virtual router occurred (1020570)
Date Published: 6/8/2010

VMware vCenter Server
Console tab error: Cannot issue a new session ticket because the maximum number of tickets have been issued (1020496)
Date Published: 6/8/2010
The VirtualCenter Server service stops when vCenter Server and the vCenter Server database are separated by a firewall (1020581)
Date Published: 6/8/2010
Migrating a powered off virtual machine across hosts in different datacenters fails with the error: A specified parameter was not correct. hostname (1020827)
Date Published: 6/7/2010
Migrating a virtual machine from one host to another within a cluster fails with the error: Unable to create scheduler group for migration worlds (1021780)
Date Published: 6/7/2010
Location of vCenter Server log files (1021804)
Date Published: 6/10/2010
DB2 database issues after installing vCenter Server on DE locale (1021971)
Date Published: 6/11/2010
Performance chart issues when a DE locale vCenter Server installed on Microsoft Windows Server 2003 32-bit or 64-bit accesses DB2 database (1022019)
Date Published: 6/11/2010

VMware vCenter Site Recovery Manager
VMware vCenter Site Recovery Manager installation fails with the error: Failed to create database tables (1015436)
Date Published: 6/11/2010
Site Recovery Manager 4.0 installation error: unexpected error code: -1 (1020810)
Date Published: 6/7/2010

VMware View Manager
Connecting to View Desktop using View Client fails with the error: The view connection server connection failed. An error occurred in the secure channel support (1020277)
Date Published: 6/11/2010

VMware vSphere Web Services SDK
Specifying the MAC address when creating a virtual machine through vSphere APIs (1022388)
Date Published: 6/8/2010