Category Archives: From the Trenches

New book: Getting Started with VMware Fusion

One of our own, Michael Roy, has just published his first book, Getting Started with VMware Fusion, written to help readers get started running Windows on their Mac the right way.

Michael talks about how to import your physical PC into the virtual world, and provides practical examples of how to keep your new Virtual Machine secure, backed up, and running smoothly.

Going a bit deeper, he explains snapshots and what they are good for, and shows how to use Linked Clones in VMware Fusion Professional.

Michael Roy started at VMware working on VMware Fusion version 2 in 2009, where he co-led a world-class global support team, giving customers the help they needed to get the most out of VMware Fusion. He currently specializes in Technical Marketing for Hybrid Cloud Services.

iSCSI Storage and vMotion VLAN Best Practices

We got a question this morning on Twitter from a customer asking for our best practices for setting up iSCSI storage and vMotion traffic on a VLAN.

The question caused a bit of a discussion here amongst our Tech Support staff, and the answer, it seems, is too long to fit into a Tweet! Instead, here’s what you need to know if you are working on the best design for your VLANs.

iSCSI and vMotion on the same pipe (VLAN) is a big no-no unless you are using multiple teamed 1GbE uplinks or 10GbE uplinks with NIOC to avoid the two stomping on one another.

While vMotion traffic can be turned off, turned on, or reconfigured on the fly, iSCSI traffic does not handle on-the-fly changes to the underlying network (though great improvements have been made in 5.1/5.5). You will need to take a maintenance window to reconfigure how you want your VLANs to function, especially for the iSCSI network, and then (more than likely) perform a rolling reboot of all hosts. If iSCSI traffic is already VLAN’d off, you should leave the iSCSI traffic where it is, to avoid taking down the whole environment, and just move the vMotion network to a separate VLAN.

That said, here is our most recent iSCSI Best Practice Setup guide from Cormac Hogan. Also see: vMotion Best Practice Setup guide.

Here are the pertinent pages in our documentation on the subject:

pubs.vmware.com…rking-guide.pdf – Page 187

and

pubs.vmware.com…orage-guide.pdf – Page 75

Installing async drivers in ESXi 5.x

One thing that trips up a few customers is the process of installing async drivers on their ESXi host. We have a KB article on the topic here, but there is more than one method to choose from, and there are preparation steps involved. Since these steps might seem a little tricky, we decided a quick, live video explaining the topic might help many of you.

We called upon Kiwi Ssennyonjo to walk us through the salient points.

Again, the full KB article can be found here: Installing async drivers on ESXi 5.0/5.1/5.5 (2005205)

Using VisualEsxtop to troubleshoot performance issues in vSphere

What is VisualEsxtop?

VisualEsxtop is a new performance monitoring tool that was recently posted on the VMware Labs Flings project. On Flings, apps and tools built by VMware engineers for fun are available for download. The intent is to make VMware Administrators’ lives easier in their daily work.

Note: VisualEsxtop is not an official VMware tool. For support and feedback, please contact VMware Labs: http://labs.vmware.com/contact-us

VisualEsxtop is an enhanced version of resxtop and esxtop. VisualEsxtop can connect to VMware vCenter Server or ESX hosts, and display ESX server stats with a better user interface and more advanced features.

How to install VisualEsxtop?

  1. Download visualEsxtop.zip from http://labs.vmware.com/flings/visualesxtop
  2. Unzip visualEsxtop.zip to a folder.
  3. Make sure Java 1.6 or later is in the PATH.
    1. On Windows, to verify that Java is in the path,
      click Start > Run, type cmd and press Enter, then type java and press Enter.
      If Java is not in the path, you will see an error instead of the usage options.
    2. If JDK 1.6 or later is already installed on your machine but not in the path, here is how to add it (instructions are for Windows 7; other versions might be slightly different):
      - Go to Control Panel > System and Security > System
      - Click Advanced system settings
      - Click Environment Variables
      - Click New
      - Under Edit User Variable, type the following and click OK
      Variable name: path
      Variable value: The path to the JDK 1.6 binary folder (C:\Program Files (x86)\Java\jdk1.6.0_14\bin\ for example)
    3. Then open cmd again (Start > Run, type cmd) and type “java”. This should now return the usage options for the java command.
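The PATH check in step 3 can also be scripted. The following is only a minimal sketch in Python (not something VisualEsxtop ships or requires); the helper name find_on_path is made up for illustration. It looks the command up the same way the command prompt would.

```python
import shutil

def find_on_path(command):
    """Return the full path of `command` if it is found on PATH, else None.

    Hypothetical helper for illustration; it wraps the standard library's
    shutil.which, which searches PATH (and PATHEXT on Windows).
    """
    return shutil.which(command)

java = find_on_path("java")
if java is None:
    # Same situation as the error you get when typing "java" in cmd
    print("java is not on PATH; add the JDK bin folder to the path variable")
else:
    print("java found at", java)
```

If the script reports that java is missing even though a JDK is installed, the path variable from step 3.2 is the first thing to check.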

What can you do with VisualEsxtop?

  • Real-time performance monitoring of individual ESX(i) hosts or vCenter Server. The default interval (5 seconds) is modifiable: press Ctrl+N and enter the new value.
  • Multiple sessions to different hosts, or to the same host, at the same time. This comes in very handy when you are comparing stats between hosts or between multiple views/fields.
  • Flexible counter selection and filtering. This is, in my opinion, the best feature of this tool. You can filter results to get specific outputs. The examples in the next section show you how.
  • Save data to a batch file. You can pick and choose relevant tabs and fields, and also choose the interval and the number of snapshots for the output. Press Ctrl+S to get the save option.
  • Load batch output and replay it. Press Ctrl+B to load a saved csv file.
  • Line chart for selected performance counters
  • Embedded tooltip for counter description
  • Color coding for important counters

Running the tool:

  1. Run visualEsxtop.sh (Linux) or visualEsxtop.bat (Windows) from the extracted files.  (Note: Type export JAVA_OPTS=-Xmx2048m  if loading large amounts of data)
  2. On the VMWARE VisualEsxtop window, select File > Connect to Live server.
  3. Choose the IP address of the host or vCenter and the credentials to connect.

Examples of using the tool:

Example 1:

The example below lists only the devices that have a DAVG value above 20ms. The filter used is DAVG/cmd under the Disk World tab. Typically we do not want DAVG (device driver level latency) to be over 20ms for a lengthy period. Note that by using the filter, it is very easy to list only the devices that currently have high latency values.
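The same 20ms rule can be applied outside the tool to data you have already collected. Below is a minimal sketch in Python. The row layout (a device name plus a davg_ms value) and the device names are assumptions made for illustration; they are not the column names or format of an actual VisualEsxtop or esxtop export.

```python
DAVG_THRESHOLD_MS = 20.0  # the 20ms guideline mentioned above

def devices_over_threshold(rows, threshold_ms=DAVG_THRESHOLD_MS):
    """Return (device, davg) pairs whose DAVG exceeds the threshold, worst first."""
    slow = [(r["device"], r["davg_ms"]) for r in rows if r["davg_ms"] > threshold_ms]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample data, not taken from a real host
sample_rows = [
    {"device": "naa.example0001", "davg_ms": 3.2},
    {"device": "naa.example0002", "davg_ms": 41.7},
    {"device": "naa.example0003", "davg_ms": 20.0},
    {"device": "naa.example0004", "davg_ms": 25.1},
]

for device, davg in devices_over_threshold(sample_rows):
    print(f"{device}: {davg:.1f} ms")
```

A single sample above the threshold is not necessarily a problem; as noted above, it is sustained high latency that matters, so a check like this is more useful when run over several intervals.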

Example 2:

The example below lists the vmkernels and virtual machines that are currently running on vmnic0. The filter used is TEAM-PNIC under the Network tab. You can also sort by %DRPTX or %DRPRX to find any devices reporting packet drops. Note that the vmnic number will only show up if the uplink ports are not in an etherchannel binding.

Tips on working with Charts:

  • To build a new chart, under Chart tab click twice on Object Types to start to select fields
  • To add a field to the chart, expand the related object and click twice on the field
  • To remove a field from the current chart, click twice on the field from the bottom left window pane
  • The chart allows you to add any fields from any objects at the same time. You have to be the judge regarding what fields are relevant. For example, listing DAVG and Physical CPU Core Util% in the same chart may not provide much value.

Chart view (tab) screenshot:

Where do I get tips on troubleshooting performance on ESXi?

Details about the different fields in the esxtop tool can be found in this communities blog post: Interpreting esxtop Statistics.

There are many great articles and tips on performance troubleshooting in the VMware Knowledgebase. Here are a couple that I recommend to give you a start –

Announcing: Support Insider Live

We are delighted to announce a new video initiative from the Knowledge Management team at VMware that will bring you tips straight from the mouths of our front-line Technical Support Engineers.

Support Insider Live videos are short, to-the-point nuggets of wisdom from those who work on customer issues daily. Ranging from basic to advanced, each video will address one idea, probably answering a question you’ve asked yourself.

Videos are hosted on our VMware KBTV channel on YouTube in a new playlist called Support Insider Live. Each video will be blogged about here if you wish to consume them that way. Our first video answers the question, “What is a PSOD?”

We’d love to hear your feedback! Have any requests?

Troubleshooting when virtual machine operations are greyed out

Hello, I am a Technical Support Engineer at VMware. I would like to talk about an issue I worked on recently with a customer that may come up for other users. The Support Request (SR) came to me with this description: “The Virtual Machine options are not working”. I had to ask myself, “What do you mean by ‘not working’?” I quickly called the customer and we got a remote session going so that I could see for myself what the problem was.

The problem was with only one Virtual Machine in that vCenter instance. I found that the virtual machine options were disabled (greyed-out) when I right-clicked the problematic virtual machine.

The customer at this point mentioned that he had tried to take a backup using third-party software. He saw this error message in the “Recent Tasks” pane:

Error: Another task in progress

My initial hunch was that there was a job running for the virtual machine in the background, but I needed to verify it wasn’t a permissions problem.

Permissions are not set correctly

First I checked the permissions given to that virtual machine. The customer had a very small environment, so I did not spend too much time checking the complete permissions given for the entire vSphere environment. Instead, I went through the following steps:

Note: If your environment is large and you have multiple vSphere administrators with different permissions, permissions on a particular virtual machine might be incorrect or missing. In this case, the vSphere root administrator needs to ensure that you have been given sufficient permissions, at all levels, to administer the virtual machine.

  1. Check the Permissions tab for the virtual machine to make sure that your name is listed there with the proper permission assigned. If your name is not there, but an AD/Local group is listed, then make sure that your name is added to the AD/Local group. Here, the user Testuser has the Virtual Machine Administrator role:
  2. If the permissions are defined at the host, cluster, Datacenter, or vCenter level:
      1. Apply the required permissions to the user/group.
      2. Make sure that the permission is propagated to all the objects.
      3. Make sure that your name is listed in the local/AD group. Here, the group Virtual Machine Administrator has the Virtual Machine Administrator role:

  3. Roles can also be assigned at the folder level. Go to the VMs and Templates view and make sure the folder where the virtual machine is located has all required privileges assigned. In the example below, VM1, VM2, and VM3 are located in the folder Test. The Virtual Machine Administrator group is assigned the “Virtual Machine Administrator” role.
    In the examples above, the role “Virtual Machine Administrator” is not a default role; it was created with the privileges needed to administer virtual machines. To find out which privileges are required, I always refer to the Required Privileges for Common Tasks section in the Virtual Machine Administration Guide.

With that set of checks done, we can go on to the next step, which comes back to my hunch about the error message I noticed in the “Recent Tasks” pane.

Tasks Running in the Background

We should always be aware that certain jobs for any given virtual machine may be running in the background, and these will not allow any changes or operations on the virtual machine. These background jobs cannot be seen in the “Tasks & Events” tab in vCenter Server or in the “Recent Tasks” pane.

To look for these background jobs, log in via SSH to the host where the problem virtual machine is running. Here are the steps to find any running jobs that cannot be seen in the vSphere Client.

1. Log in to the ESX/ESXi host using an SSH client.
2. Run the following command to list all the virtual machines registered on the host. We are running this specifically to find the vmid for the problem virtual machine.

vim-cmd vmsvc/getallvms

The output appears as:

3. Check the tasks running on the Virtual Machine by running the command:

vim-cmd vmsvc/get.tasklist vmid

In our example, we run the command against the Virtual Machine VM3. It indicates there are 2 jobs/tasks running on that Virtual Machine.

If there are no jobs/tasks running, the output shows as below.

4. Now you have a choice: either wait for the jobs to complete, or restart the Management agent to terminate them.
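Steps 2 and 3 amount to looking up a vmid by name and then passing it to the task list command. Here is a minimal sketch of that lookup in Python, run against a saved copy of the getallvms output. The sample text is illustrative only (the datastore name, guest OS values, and version are made up), the helper names are hypothetical, and the simple whitespace split assumes the virtual machine name contains no spaces.

```python
# Illustrative output in the general shape of `vim-cmd vmsvc/getallvms`;
# the values below are invented for this example.
SAMPLE_GETALLVMS = """\
Vmid   Name   File                       Guest OS                Version   Annotation
1      VM1    [datastore1] VM1/VM1.vmx   windows7Server64Guest   vmx-08
2      VM2    [datastore1] VM2/VM2.vmx   rhel6_64Guest           vmx-08
3      VM3    [datastore1] VM3/VM3.vmx   windows7Server64Guest   vmx-08
"""

def find_vmid(getallvms_output, vm_name):
    """Return the vmid (int) for vm_name, or None if it is not listed.

    Assumes the first column is the numeric vmid and the second column is
    the VM name without spaces; the header line is skipped because its
    first field is not a number.
    """
    for line in getallvms_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == vm_name:
            return int(fields[0])
    return None

def tasklist_command(vmid):
    """Build the command from step 3 for the given vmid."""
    return f"vim-cmd vmsvc/get.tasklist {vmid}"

vmid = find_vmid(SAMPLE_GETALLVMS, "VM3")
print(tasklist_command(vmid))
```

The sketch only builds the command string; running it still happens in the SSH session on the host, as described in the steps above.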

In this particular support case, we found a snapshot job that had been triggered by the backup agent running in the virtual machine. I stopped the job by restarting the management agent on that host. For this customer, this solved the problem, but there is one more reason this might happen. I had a word with one of my colleagues, and he pointed out that you could also encounter this if you have an invalid entry in your VMX file. Let’s go one step further and show you how to check this.

Invalid entries in VMX file

This type of issue might happen if a vmx file has invalid parameters or blank lines in it. You can resolve this issue by manually removing the invalid arguments or deleting the blank lines.

Caution: If incorrectly done, your virtual machine may fail to start or operate incorrectly. Always take a backup of your .vmx file before modifying it.

  1. Open the .vmx file using any text editor.
  2. Search for any blank lines and delete them.
    Note: To delete a single line in the vi editor, press d twice.
  3. Compare the .vmx file with a working virtual machine .vmx file and see if there are any invalid arguments.
  4. To apply the changes, reload the .vmx file by running the command:
vim-cmd vmsvc/reload vmid

Note: The default location of the vmx file is:

/vmfs/volumes/name_of_the_datastore/vm_name/vm_name.vmx
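Steps 2 and 3 can be tried out on a copy of the file before touching the real one. The following is a minimal sketch in Python that works on the file contents as text. The function names and the sample entries (including bogus.option) are hypothetical, and a key that appears only in the problem file is just a candidate for review, not proof that it is invalid.

```python
def remove_blank_lines(vmx_text):
    """Return the vmx text with empty and whitespace-only lines removed."""
    kept = [line for line in vmx_text.splitlines() if line.strip()]
    return "\n".join(kept) + "\n"

def keys_missing_from_reference(vmx_text, reference_text):
    """List keys found in vmx_text but not in a known-good reference vmx."""
    def keys(text):
        # Each entry is expected to look like: key = "value"
        return {line.split("=", 1)[0].strip()
                for line in text.splitlines() if "=" in line}
    return sorted(keys(vmx_text) - keys(reference_text))

# Hypothetical file contents for illustration
problem_vmx = 'displayName = "VM3"\n\nmemsize = "4096"\n   \nbogus.option = "1"\n'
working_vmx = 'displayName = "VM1"\nmemsize = "2048"\n'

print(remove_blank_lines(problem_vmx), end="")
print("Review these keys:", keys_missing_from_reference(problem_vmx, working_vmx))
```

As the caution above says, keep a backup of the original .vmx file, and reload the file afterwards so that the host picks up the change.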

Summary

Symptom:
Virtual machine operations are grayed out

Possible Causes:

  • Permissions are not set correctly
  • Tasks Running in the Background
  • vmx file is corrupted

For further information on the steps used in this article, refer to: Troubleshooting when virtual machine options are grayed out in vSphere Client (2048748)

Hope this blog post helps you out a little bit. Have a great day!

Attempting to Sysprep a Virtual Machine with IE10 fails

We just received word from one of our support engineers working on the front-lines that some customers are reporting problems wherein Sysprep fails on Virtual Machines that have IE10 on them.

The issue is not a VMware bug; it’s a registry location problem, but customers might still reach for the phone to call us first. We thought we’d get the word out today to save you some steps (and time).

The problem manifests itself when you attempt to Sysprep a Virtual Machine that has IE10 on it. In the Sysprep setupact.log log file (located at C:\Windows\System32\Sysprep\Panther), you see:

Error[0x0f0085]SYSPRP LaunchDll:Could not load DLL C:\Windows\SysWOW64\iesysprep.dll[gle=0x000000c1]

The registry entry for the location of certain files is incorrect. For further details and resolution, we refer you to KB article: Sysprep fails on Virtual Machine installed with IE10 (2051620).
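If you have several templates to check, the log line above is easy to search for. This is a minimal sketch in Python that scans the text of a setupact.log for that error. The function name is hypothetical, and the second line of the sample log is invented filler to show a non-matching line.

```python
import re

# Matches the "Could not load DLL ...\iesysprep.dll" message quoted above
IESYSPREP_ERROR = re.compile(r"Could not load DLL .*\\iesysprep\.dll", re.IGNORECASE)

def find_iesysprep_errors(log_text):
    """Return the log lines that report a failure to load iesysprep.dll."""
    return [line for line in log_text.splitlines() if IESYSPREP_ERROR.search(line)]

# First line is the error from this post; the second is a made-up unrelated line
sample_log = "\n".join([
    r"Error[0x0f0085]SYSPRP LaunchDll:Could not load DLL C:\Windows\SysWOW64\iesysprep.dll[gle=0x000000c1]",
    r"Info SYSPRP unrelated example line (hypothetical)",
])

for line in find_iesysprep_errors(sample_log):
    print(line)
```

A match only tells you that the virtual machine is hitting this symptom; the fix itself is in the KB article referenced above.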

VMware vFabric Postgres Cheat-sheet

Here’s a cool pdf download for you. It’s a cube note (or cheat sheet) that you can use when troubleshooting Postgres issues with VMware vFabric, straight from our awesome team of storage support engineers.

VMware vFabric Postgres Chart

Introducing … RSS updates from the VMware Compatibility Guide / HCL

In the past, administrators wishing to determine whether their specific hosts, guest operating systems, storage arrays, or other hardware were on the official VMware Compatibility Guide / HCL had to continually return to our page to see if VMware had added support or added any updates.

The new and improved VMware Compatibility Guide makes it easier to get updates about specific hardware by providing RSS feed subscriptions. To subscribe to updates about an item:

  1. Search for the item you’re interested in.
  2. From the search results, click the model name.
  3. In the Model Detail section of the page, at the bottom-right corner, click the RSS feed link.

You’ll be prompted for a way to subscribe to the feed, based on your browser and configuration. Updates will be sent via RSS when the page for this model is updated.

For more information about the Compatibility Guide / HCL, see our KB article Using the VMware HCL and Product Interoperability Matrixes (2006028).

Setting disk.enableUUID=true in vSphere Data Protection

A customer recently asked:

“Why should the option disk.enableUUID=true be set to false when a reboot is just pending, as stated in KB article: Backing up a Windows Server 2008 R2 virtual machine using vSphere Data Protection 5.1 fails with the error: Execution Error: E10055:Failed to attach disk (2035736).”

Answer:

When this specific parameter (disk.enableUUID) is absent from the vmx, the effective setting is false. If the parameter is introduced into the vmx and set to true while the VM is still powered on, it will result in the following behavior:

  1. The guest OS (Windows) will never write the disk UUID/Serial to the VMX because the vmx has not been reloaded.
  2. The snapshotting process will see that the parameter is set to true and attempt to quiesce, but it will fail because no parameter exists (due to 1 above).

The correct resolution is to reboot the VM, so that the vmx is reloaded and the guest OS writes the disk UUID/serial. The backup/snapshotting process will then read this value and be able to quiesce successfully.
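The "absent means false" rule can be expressed as a small check against the text of a vmx file. This is a minimal sketch in Python; the function name is hypothetical, and matching the key without regard to case is an assumption made here for convenience. It reports what the file says, which, as explained above, is not necessarily what a powered-on VM has loaded.

```python
def effective_enable_uuid(vmx_text):
    """Return the disk.enableUUID setting from vmx text; absent counts as False."""
    for line in vmx_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "disk.enableuuid":
            # Values are normally quoted, e.g. "TRUE" or "false"
            return value.strip().strip('"').lower() == "true"
    return False  # parameter not present in the file

# Hypothetical file contents for illustration
print(effective_enable_uuid('displayName = "VM1"\n'))
print(effective_enable_uuid('disk.EnableUUID = "TRUE"\n'))
```

A check like this could help pick out the VMs that need a reboot or a vmx reload, but it cannot tell you whether the guest OS has already written the disk UUID/serial.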

There is a non-disruptive workaround: edit the vmx and set the disk.enableUUID value to false, then vMotion the VM (to any other host) to reload the vmx in host memory. This effectively disables application-level quiescing (file system quiescing is still available, though). This can be done in bulk via PowerCLI and non-disruptively (while the VMs are powered on).

Another method to reload the vmx file while the VM remains powered on is detailed in KB article: Reloading a vmx file without removing the virtual machine from inventory (1026043).

Also, see the “cause” and “resolution” sections of KB article: Taking a snapshot of a virtual machine after a Storage vMotion on an ESX/ESXi 4.1 or 5.0 host fails (2009065).