
Writing Performant Tagging Code: Tips and Tricks for PowerCLI

vSphere 5.1 introduced an inventory tagging feature that has been available in all later versions of vSphere, including vSphere 6.7. Tags let datacenter administrators organize different vSphere objects like datastores, virtual machines, hosts, and so on. This makes it easier to sort and search for objects that share a tag, among other things. For example, you might use tags to track a group of VMs that all have the same operating system.

Writing code to use tags can be challenging in large-scale environments: a straightforward use of VMware PowerCLI cmdlets may result in poor performance, and while direct Tagging Service APIs are faster, the documentation can be difficult to understand. In this blog, we show some practical examples of using PowerCLI and Tagging Service APIs to perform tag-related operations. We include some simple measurements to show the performance improvements when using the Tagging Service vs. cmdlets. The sample performance numbers are for illustrative purposes only. We describe the test setup in the Appendix.

1. Connecting to PowerCLI and the Tagging Service

In this document, when we write "PowerCLI cmdlets," we mean calls like Get-Tag or Get-TagCategory. To access this API, simply open a PowerShell terminal and log in:

Connect-VIServer <vCenter server IP or FQDN> -User <username> -Pass <password>

When we write “Tagging Service APIs,” we are referring to calls that are satisfied by the Tagging Service. To access such APIs from a PowerShell terminal, you must log in both to the vCenter Server and to the Tagging Service:

# Login to vCenter
Connect-VIServer <VC server IP or FQDN> -User <username> -Pass <password>

# Login to the tagging server (known as the CIS server), which contains the Tagging Service
Connect-CISserver <VC server IP or FQDN> -User <username> -Pass <password>

# Get a handle to the Tagging Service APIs
$taggingAPI = Get-CisService com.vmware.cis.tagging.tag

# Get a handle to the tag assignment APIs
$tagAssignmentAPI = Get-CisService com.vmware.cis.tagging.tag_association

To access built-in help for the tagging APIs, append .help to the API handle. We give an example below with the actual documentation in italics.

# List available tag assignment method calls, using $tagAssignmentAPI from above
PS H:\> $tagAssignmentAPI.help

Documentation: The {@name TagAssociation} {@term service} provides {@term operations} to attach, detach, and query tags.

Operations: List<com.vmware.cis.tagging.tag_association.object_to_tags> list_attached_tags_on_objects(List<com.vmware.vapi.std.dynamic_ID> object_ids):

Fetches the {@term list} of {@link ObjectToTags} describing the input object identifiers and the tags attached to each object. To invoke this {@term operation}, you need the read privilege on each input object. The {@link ObjectToTags#tagIds} will only contain those tags for which you have the read privilege.

List<id> list_attachable_tags(com.vmware.vapi.std.dynamic_ID object_id): 

Fetches the {@term list} of attachable tags for the given object, omitting the tags that have already been attached. Criteria for attachability is calculated based on tagging cardinality ({@link CategoryModel#cardinality}) and associability ({@link CategoryModel#associableTypes}) constructs. To invoke this {@term operation}, you need the read privilege on the input object. The {@term list} will only contain those tags for which you have read privileges.

… <output truncated> …

2. High level differences between cmdlets and Tagging Service calls

2.1 Names vs. IDs

When using cmdlets, it is customary to use the name of a tag or category. For example, you might write the following to get the tag whose name is "john":

PS H:\> $tag = Get-Tag -Name "john"

To get all tags with "john" in the name, you would use a "*" wildcard:

PS H:\> $tagList = Get-Tag -Name "john*"

The Tagging Service, in contrast, typically requires IDs instead of names. These IDs persist throughout the lifetime of the tag or category, so they can be cached on creation and used throughout the lifetime of your scripts.

Example: Here is some sample code to retrieve the ID of a tag category given its name:

# We will find the ID of the tag category for the category named "john"
$tagCatName = 'john'

# Get a handle to all tag category API calls
$allCategoryMethodSvc = Get-CisService com.vmware.cis.tagging.category

# List all categories
$allCats = $allCategoryMethodSvc.list()

# Iterate over categories to find the desired category (name = $tagCatName)
foreach ($cat in $allCats) {
    if (($allCategoryMethodSvc.get($cat.value)).name -eq $tagCatName) {
        # Set $catID to the ID of this category
        $catID = $cat.value
        break
    }
}

Example: Here is some code to find the tag ID of the tag named "john".

$tagName = 'john'

# Get a handle to all tag API calls
$allTagMethodSvc = Get-CisService com.vmware.cis.tagging.tag

# Return a list of all tag IDs in the system
$allTags = $allTagMethodSvc.list()

# Iterate over tag IDs to find the tag whose name is $tagName
foreach ($tag in $allTags) {
    if ($allTagMethodSvc.get($tag.value).name -eq $tagName) {
        $tagID = $tag.value
        break
    }
}

If you know the name of the category, you can improve on the previous code by searching tags within this category.

# Assume that $catID was retrieved as specified above
$tagsForCatID = $allTagMethodSvc.list_tags_for_category($catID)
foreach ($tag in $tagsForCatID) {
    if ($allTagMethodSvc.get($tag.value).name -eq $tagName) {
        $tagID = $tag.value
        break
    }
}

For performance reasons, it is a good idea to store a mapping of names to IDs. With such a map, you avoid the need to iterate over every tag ID or category ID whenever you need to look up the ID for a given name. We give an example of making such a map below under example 4 (see createTagNameIdMap).

2.2 Tag and category specifications for Tagging Service calls

When creating a tag or a category using cmdlets, there are many default parameters, and others can be specified on the command line. With Tagging Service calls, a “spec” must be created and used.

Here is an example of creating a tag with “multiple” cardinality. (Multiple cardinality means that multiple tags from this category can be applied to a specific object at a time. For example, a category named “Owners” may have tags named “Alice” and “Bob”, and a given VM may have both “Alice” and “Bob” assigned to it. In contrast, single cardinality means that only one tag from a given category can be used on a specific object. For example, a category named “OS” may have tags named “Linux” and “Windows.” A given VM would have only one of these tags assigned.)

Creating a tag category using cmdlets

# Create tag category named 'testCat'
$catName = 'testCat'
New-TagCategory -Name $catName -Cardinality Multiple

Creating a tag category using the Tagging Service directly

$catName = 'testCat'

# Get a handle to the category methods:
$allCategoryMethodSVC = Get-CisService com.vmware.cis.tagging.category

# Create a spec for the category
$catSpec = $allCategoryMethodSVC.Help.create.create_spec
$catSpec.cardinality = 'MULTIPLE'
$catSpec.associable_types = ''

# NOTE: In vSphere 6.5, the category_id field should not be used.
# In vSphere 6.7, please set the category_id to ''.
$catSpec.category_id = ''
$catSpec.description = ''
$catSpec.name = $catName

# Now create the tag category
$allCategoryMethodSVC.create($catSpec)

3. Performance considerations

In general, Tagging Service calls are faster than cmdlets, and the difference becomes larger as the inventory size or number of tags grows. There are two main reasons for this. First, a cmdlet, while presenting a simpler interface to the user, often must make multiple backend calls to retrieve the same information that one might need to retrieve with a direct Tagging Service call. Second, cmdlets often use names rather than IDs in their invocations (e.g., Get-Tag <tagName> rather than Get-Tag -Id <tag-id>). Most of the tagging information is indexed by ID in the backend, so calls that use an ID are faster than calls that use a name.
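
To give a feel for how such differences can be measured, here is a minimal sketch using the standard Measure-Command cmdlet. It assumes you are already logged in as shown in section 1 and that a tag named 'testTag' exists; it only illustrates the approach, and the absolute numbers will differ in your environment.

# Handle to the Tagging Service tag methods
$allTagMethodSVC = Get-CisService com.vmware.cis.tagging.tag

# Time the cmdlet, which looks the tag up by name
$cmdletTime = Measure-Command { $tag = Get-Tag -Name 'testTag' }

# Cache the tag ID once, then time the Tagging Service call, which looks the tag up by ID
$tagId = ($allTagMethodSVC.list() | Where-Object { $allTagMethodSVC.get($_.value).name -eq 'testTag' } | Select-Object -First 1).value
$svcTime = Measure-Command { $tagModel = $allTagMethodSVC.get($tagId) }

"Cmdlet: {0} ms; Tagging Service: {1} ms" -f $cmdletTime.TotalMilliseconds, $svcTime.TotalMilliseconds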

4. Examples using cmdlets and Tagging Service calls

In the following examples, we show how to retrieve information using both cmdlets and Tagging Service calls. We also show sample performance numbers for a sample inventory with 3200 VMs (see the Appendix for details on the experimental setup). As noted above, the API for the Tagging Service is more efficient, though it requires using IDs rather than names. As a result, where possible, we suggest storing the ID/name mapping for tags and tag categories in a local file or data structure for fast lookup.

4.1 Global variables

In the examples that follow, we use a number of variables repeatedly. We define those variables here.

Cmdlet variables

$vsphereUnderTest = FQDN of the vCenter Server 
$user = username for the vCenter under test 
$pass = password for the vCenter under test 
$catName = 'testCat'
$tagName = 'testTag'
$allVMS = Get-VM

Tagging Service variables

$vsphereUnderTest = FQDN of the vCenter Server 
$user = username for the vCenter under test 
$pass = password for the vCenter under test 
$catName = 'testCat'
$tagName = 'testTag'
$vmNames = 'testVM_1', 'testVM_2', 'testVM_3', 'testVM_4', 'testVM_5'
$allVMs = Get-VM

4.2 Code Samples 

Example 1: Connecting to vCenter and the Tagging Service

In this example, we show how to connect to vCenter and the Tagging Service. For the Tagging Service, as described above, you must log in to both the vCenter and the “CIS” server.

1A: Cmdlets
Connect-VIServer -Server $vsphereUnderTest -User $user -Password $pass
1B: Tagging Service (requires login to both vCenter and the CIS server)
# also connect to CIS service
Connect-VIServer -Server $vsphereUnderTest -User $user -Password $pass 
Connect-CISServer -Server $vsphereUnderTest -User $user -Password $pass

Discussion: Tagging Service requires logging in to both vCenter and the Tagging Service server (called the CIS server). When using the Tagging Service, you must also retrieve a handle to the appropriate Tagging Service methods. Here are some examples:

Getting method handles for various Tagging Services
# Category methods 
$allCategoryMethodSVC = Get-CisService com.vmware.cis.tagging.category 

# Tagging methods 
$allTagMethodSVC = Get-CisService com.vmware.cis.tagging.tag 

# Tag association methods 
$allTagAssociationMethodSVC = Get-CisService com.vmware.cis.tagging.tag_association

Example 2: Creating a tag category

In this example, we create a tag category. Recall that the category name $catName is defined above.

2A: Cmdlets
New-TagCategory -Name $catName -Cardinality Multiple
Average time to create tag category in our setup: 200 ms
2B: Tagging Service
# Create a spec for the category: 
$catSpec = $allCategoryMethodSVC.Help.create.create_spec 
$catSpec.cardinality = 'MULTIPLE' 
$catSpec.associable_types = ''
$catSpec.category_id = '' 
$catSpec.description = '' 
$catSpec.name = $catName 

# Now create the tag category 
$catObject = $allCategoryMethodSVC.create($catSpec)
Average time to create tag category in our setup: 53 ms

Programming note: You don’t need to set the description, but if you don’t, the default value will be set to: @{Documentation=The description of the category.}

WARNING: If you are connected to a 6.7 vCenter, you need to specify the category_id; if you don't, you will see an invalid_argument error.

Discussion: In the Tagging Service example, $catObject = $allCategoryMethodSVC.create($catSpec) creates the tag category, and the object returned is saved as $catObject. $catObject.Value is the ID of the category that was created.
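
For instance, a minimal follow-on sketch that reuses $catObject and the $allCategoryMethodSVC handle from above reads the new category back by its ID:

# Fetch the category model we just created, using the ID returned by create()
$newCat = $allCategoryMethodSVC.get($catObject.Value)
$newCat.name          # 'testCat'
$newCat.cardinality   # MULTIPLE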

Example 3: Creating a tag under a tag category

In this example, we create a tag under a tag category. As a convenience, we use the following function to get a category ID given the category’s name. We need the ID because the Tagging Service uses IDs, not names.

function Get-CategoryIdFromName {
    Param ($inputCatName)
    $allCategoryMethodSvc = Get-CisService com.vmware.cis.tagging.category
    $allCats = @()
    $allCats = $allCategoryMethodSvc.list()

    foreach ($cat in $allCats) {
        # Compare name of input category to current category. Return if match.
        if (($allCategoryMethodSvc.get($cat.value)).name -eq $inputCatName) {
            return $cat.value
        }
    }
    # no match
    return $null
}

Note: if you need to get category IDs from names multiple times, it is better to create a mapping of names to IDs. The following function creates a map of category names to IDs.

function createCategoryNameIdMap {
    $allCategoryMethodSvc = Get-CisService com.vmware.cis.tagging.category
    $allCats = $allCategoryMethodSvc.list()
    $catNameIdMap = @{}
    foreach ($cat in $allCats) {
        $catName = $allCategoryMethodSvc.get($cat.value).name
        $catNameIdMap.Add($catName,$cat)
    }
    return $catNameIdMap
}
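
To use the function above, you could do the following (a small sketch; we assume the category 'testCat' from the earlier examples exists). Note that the map stores the category ID object, so use its value property wherever a raw ID string is needed:

$cmap = createCategoryNameIdMap
$testCatId = $cmap['testCat']

# Raw ID string, for calls such as list_tags_for_category
$testCatId.value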

 

Creating a tag under a tag category

3A: Cmdlets
New-Tag -Name $tagName -Category $catName
Average time to create a tag in our setup: 112 ms
3B: Tagging Service
# Use function defined above to get category ID from category name
$catID = Get-CategoryIdFromName($catName)

# First create a tag spec.
$spec = $allTagMethodSVC.Help.create.create_spec

$spec.name = $tagName
$spec.description = ''
$spec.tag_id = ''
$spec.category_id = $catID

# Now create the tag
$tagObject = $allTagMethodSVC.Create($spec)
Average time to create a tag in our setup: 64 ms

Programming note: You don't need to set the description, but if you don't, the default value will be set to: @{Documentation=The description of the tag.}

WARNING: If you are connected to a 6.7 vCenter you need to specify the tag_id; if you don’t, you will see an invalid_argument error.

Discussion: The Tagging Service example uses the function Get-CategoryIdFromName to get the category ID. This line of code can be eliminated if you save the category object $catObject from example 2B. If you do this, you will also need to change the line $spec.category_id = $catID to $spec.category_id = $catObject.value.

When we create the tag, we save the object that is returned by $allTagMethodSVC.Create($spec) in $tagObject.

Example 4: Associating a tag with a VM

In this example, we associate a tag with a VM (also known as “attaching a tag to a VM”). The Tagging Service APIs, as mentioned above, use tag IDs instead of tag names. As a convenience, we use the following function to get a tag ID given the tag’s name.

function Get-TagIdFromName {
    Param ($inputTagName)
    $allTagMethodSvc = Get-CisService com.vmware.cis.tagging.tag
    $allTags = @()
    $allTags = $allTagMethodSvc.list()
    foreach ($tag in $allTags) {
        # Compare name of input tag to current tag. Return if match.
        if (($allTagMethodSvc.get($tag.value)).name -eq $inputTagName) {
            return $tag.value
        }
    }
    # no match
    return $null
}

In addition to using the tag ID instead of the tag name, the tag attachment API requires a specially-created VM object, as shown in the example below.

Attaching a tag to one VM

4A: Cmdlets
$tagArray = Get-Tag -Category $catName
New-TagAssignment -tag $tagArray[0] -entity $allVMS[0]
Average time to attach a tag in our setup: 2717 ms
4B: Tagging Service
# Pick a VM to attach the tag to
$testVM = Get-VM -Name $vmNames[0]

# The Tagging Service needs a VM object as an argument.
# We recommend doing this once and storing the result rather than recreating it each time you need it.
$VMInfo = $testVM.ExtensionData.MoRef
$vmid = New-Object PSObject -Property @{
    id = $VMInfo.value
    type = $VMInfo.Type
}

# The Tagging Service uses tag IDs, not names, so get the ID from the name using the method above.
$tagId = Get-TagIdFromName($tagName)

# Now attach the tag to the VM.
$allTagAssociationMethodSVC.attach($tagId, $vmid)
Average time to execute the attach() call in our setup: 35 ms

Programming note: $tagId uses our Get-TagIdFromName convenience function.

Note: If you need to get tag IDs from names multiple times, it is probably better to create a mapping of names to IDs. The following function creates a map of tag names to IDs:

function createTagNameIdMap {
    $allTagMethodSvc = Get-CisService com.vmware.cis.tagging.tag
    $allTags = $allTagMethodSvc.list()
    $tagNameIdMap = @{}
    foreach ($tag in $allTags) {
        $tagName = $allTagMethodSvc.get($tag.value).name
        $tagNameIdMap.Add($tagName,$tag)
    }
    return $tagNameIdMap
}

To use the function above, you could do the following (we assume that there is a tag named “test tag”):

$tmap = createTagNameIdMap
$testTagId = $tmap["test tag"]

Example 5: Get the tags assigned to a VM (1 tag associated with the VM)

In this example, we get the tags assigned to a VM. The cmdlet assumes that we have an array of VMs $allVMs (defined above in "Global Variables"), and it gets the tag assignment for the first one. The Tagging Service needs a VM ID object rather than a VM name. We use the $vmid object created in the previous example. We assume there is one tag associated with the VM.

5A: Cmdlets
# Pick the first VM in our list of VMs ($allVMs)
Get-TagAssignment -Entity $allVMS[0]
Average time to get tag assignment in our setup: 2610 ms
5B: Tagging Service
# Assume we have the VM object $vmid from the previous example, and assume we have only 1 tag association.
$tagID = $allTagAssociationMethodSVC.list_attached_tags($vmid)
Average time to get tag assignment in our setup: 49 ms

Programming note: The above code will return the tag ID. If you need to get more information about the tag, use the following line:

$allTagMethodSVC.get($tagID.value)

If you want the name, then use this:

$allTagMethodSVC.get($tagID.value).name

 

Example 6: Assign one tag to 1000 VMs

In this example, we assign a single tag to 1000 VMs. In both the cmdlet case and the Tagging Service case, we assume an array of VMs ($allVMs), as defined above. For the Tagging Service, we create an array of VM ID objects from this $allVMs array. We also use the $tagId retrieved in example 4 (via Get-TagIdFromName).

6A: Cmdlets
# Pick 1000 VMs from $allVMs and assign the first tag in $tagArray to each VM.
$useTheseVMs = $allVMS[0..999] 
$useTheseVMs | New-TagAssignment -Tag $tagArray[0]
Average time to attach 1k VMs to one tag in our setup: 2727842 ms
6B: Tagging Service
# Use $allVMs from Global Variables above.
$useTheseVMIDS = @()

# Create VM objects for all VMs.
# This should be done once for all VMs, not every time
# you want to do an association.
for ($i = 0; $i -lt 1000; $i++) {
    $VMInfo = $allVMS[$i].extensiondata.moref
    $useTheseVMIDS += New-Object PSObject -Property @{
        id = $VMInfo.value
        type = $VMInfo.type
    }
}

# Assume we have $tagId from example 4.
$allTagAssociationMethodSVC.attach_tag_to_multiple_objects($tagID, $useTheseVMIDS)

Once VM objects have been created, the average time to attach 1k VMs to one tag: 18438 ms

Example 7: List the VMs assigned to one tag

In this example, we retrieve the VMs assigned to one tag. In the cmdlet case, we assume that there is an array of 1000 tags $tagArray, and we find the VMs assigned to the first tag. In the Tagging Service case, we need to use a tag ID instead of a name, so we use $tagID from the previous example.

7A: Cmdlets
# Get VMs attached to the first tag in $tagArray.
$vmsAssignedToTag = Get-VM -Tag $tagArray[0]
Average time to search for the VMs: 25844 ms
7B: Tagging Service
# Assume we have $tagID from earlier tests. 
$vmsAssignedToTag = $allTagAssociationMethodSVC.list_attached_objects($tagId)
Average time to search for the VMs: 411 ms

Programming note: The return value $vmsAssignedToTag is a list of PSCustomObjects. To find the type of each object, use $vmsAssignedToTag.type, and to find the ID, use $vmsAssignedToTag.id.

Note: The return value of the list_attached_objects method is a list of VM objects. If you want to get the ID of a given VM for subsequent Tagging Service calls, you must do the following:

# Get the first VM in the list.
$vm_0 = $vmsAssignedToTag[0]

# Get the type. In this case, it is VirtualMachine.
$v_type = $vm_0.type

# Get the id. In this case, it will be something like ‘vm-222’.
$v_id = $vm_0.id

# Construct an ID from the type and ID.
# The result is of the form ‘VirtualMachine-vm-222’.
$VirtualMachineId = $v_type + "-" + $v_id

# This ID can now be used to retrieve additional information about VMs.
$vm_0_get = Get-VM -Id $VirtualMachineId
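
If you want to resolve every attached object at once, the following short sketch builds on the code above; it assumes all the attached objects are virtual machines:

# Build 'Type-id' strings for every attached object and fetch the VMs in one call
$vmIds = foreach ($obj in $vmsAssignedToTag) { $obj.type + "-" + $obj.id }
$attachedVMs = Get-VM -Id $vmIds
$attachedVMs | Select-Object Name, Id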

Example 8: Attach 1000 tags to one VM

In this example, we attach 1000 tags to a single VM. We assume that you have already created 1000 tags, for example, by running the code in example 3 in a loop.

8A: Cmdlets
$useTheseTags = $tagArray[0..999] 
foreach ($tag in $useTheseTags) { 
  New-TagAssignment -Entity $allVMS[0] -Tag $tag
}
Average time to attach 1000 tags to 1 VM in our setup: 3236949 ms
8B: Tagging Service
# Get list of tags you want to attach
$tagArray = $allTagMethodSVC.list()
$useTheseTags = $tagArray[0..999]

# Create a VM object to attach the tags to. We re-use a VM from our list above.
# For best performance, create these VM objects once and store them: do not 
# create them every time you wish to do an association.
$VMInfo = ($allVMS[0]).extensiondata.moref
$vmid = New-Object PSObject -Property @{
    id = $VMInfo.value
    type = $VMInfo.type
}

# Now do the actual association.
$allTagAssociationMethodSVC.attach_multiple_tags_to_object($vmid, $useTheseTags)
Once the VM object has been created, the average time to attach 1000 tags to 1 VM in our setup: 1511 ms

Example 9: List all the tags associated with a VM (1000 associations)

In this example, we list all tags associated with a VM. In our test, we use a VM that has 1000 tags associated with it. In both the cmdlet case and the Tagging Service case, we use the $allVMs array above, and pick the first VM.

9A: Cmdlets
Get-TagAssignment -Entity $allVMs[0]
Average time to get tag assignments in our setup: 3715 ms
9B: Tagging Service
# Create VM object. Do this once and store it, rather than
# creating it every time you need to retrieve tag associations.
$VMInfo = $allVMs[0].extensiondata.moref 
$vmid = New-Object PSObject -Property @{ 
   Id = $VMInfo.value 
   Type = $VMInfo.type 
} 
$tagIDArray = $allTagAssociationMethodSVC.list_attached_tags($vmid)
Average time to get tag assignments after the object has been created: 63 ms

Programming note: The above code will return an array of tag IDs. If you need to get more information about the tags, use $allTagMethodSVC.get for each tag ID.
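
For example, a short sketch using the $allTagMethodSVC handle defined earlier resolves the returned IDs to tag names:

# Resolve each attached tag ID to its tag name
# (each get() is a separate call; for very large numbers of tags consider caching an ID-to-name mapping)
$attachedTagNames = foreach ($id in $tagIDArray) {
    $allTagMethodSVC.get($id.value).name
}
# In this example, $attachedTagNames.Count should be 1000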

5. Takeaways

As you can see, the cmdlets are typically shorter and more intuitive to write than the Tagging Service scripts. However, the performance of Tagging Service calls can be substantially better than cmdlets. For small inventories or small numbers of tags, cmdlet performance is likely to be adequate. For larger inventories, and for better performance, we recommend using Tagging Service calls.

Appendix: Experimental environment

For these examples, we used a testbed that had 32 hosts and 3200 VMs. These measurements were done after creating 10 tag categories with 1000 tags per category (for a total of 10,000 tags in the system). Actual performance for your environments will vary with inventory configuration and the configuration of the vCenter appliance.

Versions:
PowerShell version 5.1.14409.1005
VMware PowerCLI 10.1.0 build 8346946
vSphere 6.7 GA

About the Authors

Joseph Zuk is a staff-level automation test engineer working at VMware within the Performance Engineering Automation team. He focuses on at-scale performance testing of vCenter. He has been using PowerCLI for setup of his automation testbed since 2011.

Ravi Soundararajan is a principal engineer in the Performance Engineering group at VMware. He works on vCenter performance and scalability–from the UI, to the server, to the database, to the hypervisor management agents. He has been at VMware since 2003, and he has presented on the topic of vCenter Performance at VMworld from 2013-2018. His Twitter handle is @vCenterPerfGuy.

New Scheduler Option for vSphere 6.7 U2

Along with the recent release of VMware vSphere 6.7 U2, we published a new whitepaper that shows the performance of a new scheduler option that was included in the 6.7 U2 update.  We referred to this new scheduler option internally as the “sibling” scheduler, but the official name is the side-channel aware scheduler version 2, or SCAv2.  The whitepaper includes full details about SCAv1 and SCAv2, the L1TF security vulnerability that made them necessary, and the performance implications with several different workload types.  This blog is a brief overview of the key points, but we recommend that you check out the full document.

In August of 2018, a security vulnerability known as L1TF, affecting systems using Intel processors, was revealed, and patches and remediations were also made available. Intel provided micro-code updates for its processors, operating system patches were made available, and VMware provided an update for vSphere. The full details of the vCenter and ESXi patches are in a VMware security advisory that links to individual KB articles.

The ESXi-provided patches included a side-channel aware scheduler (SCAv1) that mitigated the concurrent-context attack vector for L1TF. Once that mode was enabled, the scheduler would only schedule processes on one thread for each core. This mode impacted performance mostly from a capacity standpoint because the system was no longer able to use both hyper-threads on a core. A server that was already fully utilized and running at maximum capacity would see a decrease in capacity of up to approximately 30%. A server that was running at 75% of capacity would see a much smaller impact to performance, but CPU utilization would rise.

In vSphere 6.7 U2, the side-channel aware scheduler has been enhanced (SCAv2) with a new policy to allow hyper-threads to be used concurrently if both threads are running vCPU contexts from the same VM. In this way, L1TF side channels are constrained to not expose information across VM/VM or VM/hypervisor boundaries.
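
For reference, the scheduler mode is selected through two ESXi kernel boot options described in VMware KB 55806, hyperthreadingMitigation and hyperthreadingMitigationIntraVM. The PowerCLI sketch below shows one way to inspect and change them on a host; treat the exact option names and values as assumptions to verify against the KB and the whitepaper for your build, and note that the host name is a placeholder.

# Sketch: enable SCAv2 on one host (hyperthreadingMitigation = TRUE, hyperthreadingMitigationIntraVM = FALSE)
$esx = Get-VMHost -Name 'esx01.example.com'   # hypothetical host name

# Inspect the current values
Get-AdvancedSetting -Entity $esx -Name 'VMkernel.Boot.hyperthreadingMitigation*' | Select-Object Name, Value

# Set the SCAv2 combination (a host reboot is required for kernel boot options to take effect)
Get-AdvancedSetting -Entity $esx -Name 'VMkernel.Boot.hyperthreadingMitigation' | Set-AdvancedSetting -Value $true -Confirm:$false
Get-AdvancedSetting -Entity $esx -Name 'VMkernel.Boot.hyperthreadingMitigationIntraVM' | Set-AdvancedSetting -Value $false -Confirm:$false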

Performance testing with several different workloads found a range of performance impact for both SCAv1 and SCAv2, using the default scheduler as the performance baseline. A score of 1.0 means SCAv1 or SCAv2 achieved the same performance as the default scheduler, and a score of .75 means it achieved 75% of that performance. The graphs here show the performance impact at maximum server utilization and the impact at the reduced load of approximately 75% utilization.

The charts show that the SCAv2 scheduler, represented by the third bar in each group, recovers a significant percentage of performance in all cases except the monster VM test case. The monster VM test case was a single large Oracle database VM that consumed an entire 4-socket host with 192 vCPUs. In configurations with a single large monster VM that uses all the logical threads of the host, SCAv1 had a slight performance advantage over SCAv2 in our testing.

The reduced load numbers show that at server usage levels of approximately 75%, the overall impact to performance is much lower. With SCAv2 and the overall load below 75%, tests show that the largest performance impact measured in these tests was 11%. The SCAv2 scheduler option, available in vSphere 6.7 U2, provides better performance than SCAv1 in almost all cases.

For full details about the individual benchmark tests as well as more details about L1TF and VMware’s response to it, please see the full whitepaper and VMware KB 55806.

vMotion across hybrid cloud: performance and best practices

VMware Cloud on AWS is a hybrid cloud service that runs the VMware software-defined data center (SDDC) stack in the Amazon Web Services (AWS) public cloud. The service automatically provisions and deploys a vSphere environment on a bare-metal AWS infrastructure, and lets you run your applications in a hybrid IT environment across your on-premises data centers and AWS global infrastructure. A key benefit of VMware Cloud on AWS is the ability to vMotion workloads back and forth from your on-premises data center to the AWS public cloud as capacity and data privacy require.

In this blog post, we share the results of our vMotion performance tests across our hybrid cloud environment that consisted of a vSphere on-premises data center located in Wenatchee, Washington and an SDDC hosted in an AWS cloud, in various scenarios including hybrid migration of a database server. We also describe the best practices to follow when migrating virtual machines by vMotion across hybrid cloud.

Test configuration

We set up the hybrid cloud environment with the following specifications:

VMware Cloud on AWS

  • 1-host SDDC instance with Amazon EC2 i3.metal (Intel Xeon E5-2686 @ 2.3 GHz, 36 cores, 512 GB)
  • SDDC version: vmc1.6 (M6 – Cycle 17)
  • Auto-provisioned with NSX networking and VSAN storage

On-premises host

  • Dell PowerEdge R730 (Intel Xeon E5-2699 v4 @ 2.2GHz, 22 cores, 1 TB memory)
  • ESXi and vCenter version: 6.7
  • Storage: Dell NVMe, VMFS 5 volume
  • Networking: Intel 1GbE NIC (shared 2*10GbE DX links between on-prem and AWS)

Figure 1: Logical layout of the hybrid cloud setup

Figure 1 illustrates the logical layout of our hybrid cloud environment. We deployed a single-host SDDC instance on AWS cloud. The SDDC was the latest M6 version and auto-configured with vSAN storage and NSX networking. Our on-premises data center, located in Washington state, featured hosts running ESXi 6.7.

AWS Direct Connect

We used high-speed AWS Direct Connect links for connectivity between the VMware on-premises data center and the AWS Oregon region. AWS Direct Connect provides a leased line from the AWS environment to the on-premises data center. VMware recommends you use this type of link because it guarantees sustained bandwidth during vMotion, which isn't possible with VPN internet connections. In our environment, there was about 40 milliseconds of round-trip latency on the network.

L2 VPN tunnel

We set up a secure L2 VPN tunnel for the compute traffic that spanned the two vCenters. This connected the VMs on cloud and on-premises to the same address space (IP subnet). So, the VMs remained on the same subnet and kept their IP addresses even as we migrated them from on-premises to cloud and vice versa.

Figure 2: Extending VXLAN across on-premises and cloud using L2 VPN

As shown in figure 2, two NSX Edge VMs provided VPN capabilities and the bridge between the overlay world (VXLAN logical networks) and the physical infrastructure (IP networks). Each NSX Edge VM was equipped with two virtual interfaces (vNICs): one vNIC was used as an uplink to the physical network, and the second vNIC was used as the VXLAN trunk interface.

Hybrid linked mode

Figure 3: A single console to manage resources across on-premises and cloud environments

We created a hybrid linked mode between the cloud vCenter and the on-premises vCenter. This allowed us to use a single console to manage all our inventory across the hybrid cloud. As shown in Figure 3, the cloud inventory included a single Client-VM provisioned in the compute workload resource pool, and the on-premises inventory included three VMs: an NSX-Edge VM, a Client-VM, and a Server VM.
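
With hybrid linked mode in place, a migration can be started from PowerCLI as well as from the vSphere Client. The following is a minimal sketch of a cross-vCenter vMotion using Move-VM; it assumes Connect-VIServer sessions to both vCenters, and the VM, host, datastore, and port group names are hypothetical placeholders for the corresponding SDDC resources.

# Sketch: migrate a VM from the on-premises vCenter to the cloud SDDC
$vm        = Get-VM -Name 'SQL-Server-VM'                                                   # hypothetical name
$destHost  = Get-VMHost -Server 'vcenter.sddc.example.com' | Select-Object -First 1
$destStore = Get-Datastore -Server 'vcenter.sddc.example.com' -Name 'WorkloadDatastore'     # hypothetical name
$destPG    = Get-VDPortgroup -Server 'vcenter.sddc.example.com' -Name 'L2VPN-segment'       # hypothetical name

Move-VM -VM $vm -Destination $destHost -Datastore $destStore -NetworkAdapter (Get-NetworkAdapter -VM $vm) -PortGroup $destPG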

Measuring vMotion performance

The following metrics were used to understand the performance implications of vMotion:

  • Migration time: Total time taken for migration to complete
  • Switch-over time: Time during which the VM is quiesced to enable switchover from on-premises to cloud, and vice versa
  • Guest penalty: Performance impact on the applications running inside the VM during and after the migration

Benchmark methodology

We investigated the impact of hybrid vMotion on a Microsoft SQL Server database performance using the open-source DVD Store 3 (DS3) benchmark, which simulates many customers performing typical actions in an online DVD Store (logging in, browsing, buying, reviewing, and so on).

The test scenario used a Windows Server 2012 VM configured with 8 VCPUs, 8 GB memory, 40 GB disk, and a SQL Server database size of 5 GB. As shown in figures 2 and 3, we used two concurrent DS3 clients, one client running on-premises, and a second client running on the cloud. Each client used a load of five DS3 users with 0.02 seconds of think time. We started the migration during the steady-state period of the benchmark when the CPU utilization (esxtop %USED counter) of the SQL Server VM was close to 275%, and the average write IOPS was 80.

Test results

Figure 4: SQL Server throughput at given time: before, during, and after hybrid vMotions

Figure 4 plots the performance of the SQL Server VM, in total orders processed per second, during vMotion from on-premises to cloud and vice versa. In our tests, both DS3 benchmark drivers were configured to report performance data at a fine granularity of 1 second (the default is 10 seconds). As shown in figure 4, the impact on SQL Server throughput was minimal during vMotion in both directions. The total throughput remained steady at around 75 orders per second throughout the test period. The vMotion durations from on-premises to cloud and vice versa were 415 seconds and 382 seconds, respectively, with the network throughput ranging between 500 and 900 megabits per second (Mbps). The switch-over time was about 0.6 seconds in both vMotions. The few minor dips in throughput shown in the figure were due to variance in the available network bandwidth on the shared AWS Direct Connect link.

Figure 5: Breakdown of SQL Server throughput reported by the on-premises and cloud clients

Figure 5 illustrates the impact of network latency on the throughput. While the total SQL Server throughput remained steady during the entire test period, the throughput reported by the on-premises and cloud clients varied based on their proximity to the SQL Server VM. For instance, the throughput reported by the on-premises client dropped from 65 orders per second to 10 when the SQL Server VM was migrated to the cloud, and jumped back to 65 after the SQL Server VM was migrated back to the on-premises environment.

The throughput variation seen by the two DS3 clients is not unique to our hybrid cloud environment and can be explained by Little’s Law.

Little’s Law

In queueing theory, Little’s Law theorem states that the average number (L) of customers in a stable system is equal to the average arrival rate (λ) multiplied by the average time (W) that a customer spends in the system. Expressed algebraically, the law is: L = λ × W

Figure 6: Little’s Law applicability in hybrid cloud performance testing

Figure 6 shows how Little’s Law can be applied to our hybrid cloud environment to relate the DS3 users, SQL server throughput, SQL Server processing time, and the network latency. The formula derived in figure 6 explains the impact of the network latency on the throughput (orders per second) when the benchmark load (DS3 users) is fixed. It should be noted, however, that although the throughput reported by both the clients varied due to the network latency, the aggregate throughput remained a constant. This is because the throughput decrease seen by one client is offset by the throughput increase seen by the other client.
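
As a purely illustrative calculation (the numbers are rounded assumptions, not measured values), rearranging Little's Law as λ = L / W shows why a fixed number of DS3 users yields lower per-client throughput as W grows. With L = 5 users and W ≈ 0.077 seconds per order when the client is close to the SQL Server VM, λ ≈ 5 / 0.077 ≈ 65 orders per second. When the same client reaches the VM across the Direct Connect link, each order spends additional time on the network (and a single order may involve several round trips), so W grows and λ drops accordingly, while the client nearer the VM sees the opposite effect and the aggregate stays roughly constant.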

This illustrates how important it is for you to monitor your application dependencies when you migrate workloads to and from the cloud. For example, if your database VM depends on a Java application server VM, you should consider migrating both VMs together; otherwise, the overall application throughput will suffer due to slow responses and timeouts.

One way to monitor your application dependencies is to use VMware vRealize Network Insight, which can mitigate business risk by mapping application dependencies in both private and hybrid cloud environments.

vMotion Stun During Page Send (SDPS)

We also tested vMotion performance by doubling the intensity of the DS3 workload on both the on-premises and cloud clients. Although vMotion succeeded, vmkernel logs indicated that vMotion SDPS kicked in during the test scenarios with a higher benchmark load. SDPS is an advanced feature of vMotion that ensures migration will not fail due to memory copy convergence issues. Whenever vMotion detects that the guest memory dirty rate is higher than the available network bandwidth, it injects microsecond latencies into guest execution to throttle the page dirty rate, so the network transfer can catch up with the dirty rate. So, we recommend you delay the vMotion of heavily loaded VMs in hybrid cloud environments with shared bandwidth links, which will prevent a slowdown in guest execution.

To learn more about SDPS, see “VMware vSphere vMotion Architecture, Performance, and Best Practices.”

vMotion across multiple availability zones in the SDDC

Every AWS region has multiple availability zones (AZ). Amazon does not provide service level agreements beyond an availability zone. For reasons such as failover support, VMware Cloud on AWS customers can choose an SDDC deployment that spans multiple availability zones in a single AWS region.

There are certain vMotion performance implications with respect to the SDDC deployment configuration.

Figure 7.  vMotion peak network throughput in a single availability zone vs. multiple availability zones

As shown in figure 7, vMotion peak network throughput depends on the host placement in the SDDC.

This is because vMotion uses a single TCP stream in the VMware Cloud environment. If the vMotion source and destination hosts are within the same availability zone, vMotion peak throughput can reach as high as 10 gigabits per second (Gbps), limited only by the CPU core speed. However, if the source and destination hosts are across availability zones, vMotion peak throughput is governed by the AWS rate limiter. The throughput of any single TCP or UDP stream across availability zones is limited to 5 Gbps by the AWS rate limiter.

Conclusion

In summary, our performance test results show the following:

  • vMotion lets you migrate workloads seamlessly across traditional, on-premises data centers and software-defined data centers on AWS Cloud.
  • vMotion offers the same standard performance guarantees across hybrid cloud environments, which include less than 1 second of vMotion execution switch-over time and minimal impact on guest performance.


vSAN Performance Diagnostics Now Shows “Specific Issues and Recommendations” for HCIBench

By Amitabha Banerjee and Abhishek Srivastava

The vSAN Performance Diagnostics feature, which helps customers to optimize their benchmarks or their vSAN configurations to achieve the best possible performance, was first introduced in vSphere 6.5 U1. vSAN Performance Diagnostics is a “cloud connected” feature and requires participation in the VMware Customer Experience Improvement Program (CEIP). Performance metrics and data are collected from the vSAN cluster and are sent to the VMware Cloud. The data is analyzed and the results are sent back for display in the vCenter Client. These results are shown as performance issues, where each issue includes a problem with its description and a link to a KB article.

In this blog, we describe how vSAN Performance Diagnostics can be used with HCIBench and show the new feature in vSphere 6.7 U1 that provides HCIBench specific issues and recommendations.

What is HCIBench?

HCIBench (Hyper-converged Infrastructure Benchmark) is a standard benchmark that vSAN customers can use to evaluate the performance of their vSAN systems. HCIBench is an automation wrapper around the popular and proven VDbench open source benchmark tool that makes it easier to automate testing across an HCI cluster. HCIBench, available as a fling, simplifies and accelerates customer performance testing in a consistent and controlled way.

Example: Achieving Maximum IOPS

As an example, consider the following HCIBench workload that is run on a vSAN system:

  • Number of VMs: 1
  • Number of disks (VMDKs) to test: 10
  • Number of threads per disk: 1
  • Working set percentage: 100
  • Block size: 4 KB
  • Read/write percentage: 0/100
  • Randomness percentage: 100

If the goal is to achieve maximum IOs per second (IOPS), vSAN Performance Diagnostics for this workload yields the result shown in figure 1.

Figure 1

In this example, vSAN Performance Diagnostics reports, "The Outstanding IOs for the benchmark is too low to achieve the desired goal". Here we can see that the feedback from vSAN Performance Diagnostics tells us about the problem and recommends a possible solution: increase the number of outstanding IOs to a value of 2 per host. The linked "Ask VMware" article explains this issue and what we must do with the benchmark in more detail.

The new HCIBench Exceptions and Recommendations feature removes the need to read through a KB article by precisely mapping recommendations to one (or more) configurable parameters of HCIBench.

Now let us check how vSAN Performance Diagnostics works with vSphere 6.7 U1, which has access to this new feature (figure 2).

Figure 2

Now we can clearly see an exact issue in our HCIBench workload configuration that provides the precise recommendation to resolve this issue with the message: “Increase number of threads per disk from 1 to 2”.

Let us monitor the current write IOPS generated by the benchmark for our reference. We go to Data Center -> Cluster -> Monitor -> vSAN -> Performance (figure 3).

Figure 3

Now, as part of the evaluation, we apply the recommendation generated by vSAN Performance Diagnostics and check its impact. We now run the new HCIBench workload configuration with 2 threads per disk and the following parameters:

  • Number of VMs: 1
  • Number of disks (VMDKs) to test: 10
  • Number of threads per disk: 2
  • Working set percentage: 100
  • Block size: 4 KB
  • Read/write percentage: 0/100
  • Randomness percentage: 100

After the benchmark completes, we use vSAN Performance Diagnostics again to see if we can now achieve the required goal of maximum IOPS. The result from Performance Diagnostics now shows “No Issues were found”, which means that we are achieving good IOPS from the vSAN system (figure 4).

Figure 4

Now, as part of actual verification, we can see the change in IOPS after applying the recommendation. From figure 5 below (screenshot from Data Center -> Cluster -> Monitor -> vSAN -> Performance), we can clearly see that there is a 25-30% increase in IOPS after applying the recommendation, which verifies that it helped us achieve our goal.

Figure 5

We believe that this feature will be very useful for customers who want to tune their HCIBench workload for a desired goal.

Prerequisites

  • Please note that this feature is currently integrated with HCIBench 1.6.7 or later.
  • This feature is available for vSphere 6.7 U1 and newer releases. It will not be available to patch releases of vSphere 6.7.

Storage DRS Performance Improvements in vSphere 6.7

Virtual machine (VM) provisioning operations such as create, clone, and relocate involve the placement of storage resources. Storage DRS (sometimes seen as “SDRS”) is the resource management component in vSphere responsible for optimal storage placement and load balancing recommendations in the datastore cluster.

A key contributor to VM provisioning times in Storage DRS-enabled environments is the time it takes (latency) to receive placement recommendations for the VM disks (VMDKs). This latency particularly comes into play when multiple VM provisioning requests are issued concurrently.

Several changes were made in vSphere 6.7 to improve the time to generate placement recommendations for provisioning operations. Specifically, the level of parallelism was improved for the case where there are no storage reservations for VMDKs. This resulted in significant improvements in recommendation times when there are concurrent provisioning requests.

vRealize Automation suite users who use blueprints to deploy large numbers of VMs quickly will notice the improvement in provisioning times for the case when no reservations are used.

Several performance optimizations were further made inside key steps of processing the Storage DRS recommendations. This improved the time to generate recommendations, even for standalone provisioning requests with or without reservations.

Test Setup and Results

We ran several performance tests to measure the improvement in recommendation times between vSphere 6.5 and vSphere 6.7. We ran these tests in our internal lab setup consisting of hundreds of VMs and a few thousand VMDKs. The VM operations are listed below; a minimal PowerCLI sketch of issuing such operations concurrently follows the list.

  1. CreateVM – A single VM per thread is created.
  2. CloneVM – A single clone per thread is created.
  3. ReconfigureVM – A single VM per thread is reconfigured to add an additional VMDK.
  4. RelocateVM – A single VM per thread is relocated to a different datastore.
  5. DatastoreEnterMaintenance – Put a single datastore into maintenance mode. This is a non-concurrent operation.
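
As a minimal sketch of how such operations can be issued concurrently from PowerCLI (the datastore cluster, cluster, and template names are hypothetical; Storage DRS generates the placement recommendation for each request because the target is a datastore cluster):

# Sketch: issue several concurrent provisioning requests against a Storage DRS-enabled datastore cluster
$dsc      = Get-DatastoreCluster -Name 'DatastoreCluster01'   # hypothetical name
$cluster  = Get-Cluster -Name 'Cluster01'                     # hypothetical name
$template = Get-Template -Name 'Template01'                   # hypothetical name

# Start 10 clone operations in parallel; each one asks Storage DRS for a placement recommendation
$tasks = 1..10 | ForEach-Object {
    New-VM -Name ("sdrs-test-vm-{0}" -f $_) -Template $template -ResourcePool $cluster -Datastore $dsc -RunAsync
}

# Wait for all provisioning tasks to finish
Wait-Task -Task $tasks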

Shown below are the relative improvements in recommendation times for VM operations at varying concurrencies. The y-axis has a numerical limit of 10, to allow better visualization of the relative values of the average recommendation time. 

The concurrent VM operations show an improvement of between 20x and 30x in vSphere 6.7 compared to vSphere 6.5.

Below we see the relative average time taken among all runs for serial operations.

The Datastore Enter Maintenance operation shows an improvement of nearly 14x in vSphere 6.7 compared to vSphere 6.5.

With much faster storage DRS recommendation times, we expect customers to be able to provision multiple VMs much faster to service their in-house demands. Specifically, we expect VMware vRealize Automation suite users to hugely benefit from these improvements.

SPBM compliance check just got faster in vSphere 6.7 U1!

vSphere 6.7 U1 includes several enhancements in Storage Policy-Based Management (SPBM) to significantly reduce CPU use and generate a much faster response time for compliance checking operations.

SPBM is a framework that allows vSphere users to translate their workload’s storage requirements into rules called storage policies. Users can apply storage policies to virtual machines (VMs) and virtual machine disks (VMDKs) using the vSphere Client or through the VMware Storage Policy API’s rich set of managed objects and methods. One such managed object is PbmComplianceManager. One of its methods, PbmCheckCompliance, helps users determine whether or not the storage policy attached to their VM is being honored.

PbmCheckCompliance is automatically invoked soon after provisioning operations such as creating, cloning, and relocating a VM. It is also automatically triggered in the background once every 8 hours to help keep the compliance records up-to-date.

In addition, users can invoke the method when checking compliance for a VM storage policy in the vSphere Client, or through the VMware Storage Policy API method PbmCheckCompliance.
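
For reference, the same check can be driven from a script. The following is a minimal PowerCLI sketch using the Get-SpbmEntityConfiguration cmdlet from the storage module, which reports the compliance status of the policies attached to VMs; the VM name is a placeholder, and the exact output properties may vary with your PowerCLI version.

# Sketch: report storage policy compliance for one VM and summarize it for the whole inventory
Get-SpbmEntityConfiguration -VM (Get-VM -Name 'testVM_1')          # 'testVM_1' is a hypothetical name

# Compliance status of every VM, grouped by status
Get-VM | Get-SpbmEntityConfiguration | Group-Object -Property ComplianceStatus | Select-Object Name, Count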

We did a study in our lab to compare the performance of PbmCheckCompliance between vSphere 6.5 U2 and vSphere 6.7 U1. We present this comparison in the form of charts showing the latency (normalized on a 100-point scale) of PbmCheckCompliance for varying numbers of VMs.

The following chart compares the performance of PbmCheckCompliance on VMFS and vSAN environments.

As we see from the above chart, PbmCheckCompliance returns results much faster in vSphere 6.7 U1 compared to 6.5 U2. The improvement is seen across all inventory sizes and all datastore types, and it becomes more prominent for larger inventories and higher numbers of VMs.

The enhancements also positively impact a similar method, PbmCheckRollupCompliance. This method also returns the compliance status of VMs and adds compliance results for all disks associated with these VMs. The following chart represents the performance comparison of PbmCheckRollupCompliance on VMFS and vSAN environments.

Our experiments show that compliance check operations are significantly faster and more lightweight in vSphere 6.7 U1.

DRS Enhancements in vSphere 6.7

A new paper describes the DRS enhancements in vSphere 6.7, which include new initial placement, host maintenance mode enhancements, DRS support for non-volatile memory (NVM), and enhanced resource pool reservations.

Resource pool and VM entitlements—old and new models

A summary of the improvements follows:

  • DRS in vSphere 6.7 can now take advantage of the much faster placement and more accurate recommendations for all DRS configurations. vSphere 6.5 did not include support for some configurations like VMs that had fault tolerance (FT) enabled, among others.
  • Starting with vSphere 6.7, DRS uses the new initial placement algorithm to come up with the recommended list of hosts to be placed in maintenance mode. Further, when evacuating the hosts, DRS uses the new initial placement algorithm to find new destination hosts for outgoing VMs.
  • DRS in vSphere 6.7 can handle VMs running on next generation persistent memory devices, also known as Non-Volatile Memory (NVM) devices.
  • There is a new two-pass algorithm that allocates a resource pool’s resource reservation
    to its children (also known as divvying).

For more information about all of these updates, see DRS Enhancements in vSphere 6.7.

VMware’s AI-based Performance Tool Can Improve Itself Automatically

PerfPsychic, our AI-based performance analysis tool, enhances its accuracy rate from 21% to 91% with more data and training when debugging vSAN performance issues. Better still, PerfPsychic can continuously improve itself, and the tuning procedure is automated. Let's examine how we achieve this in the following sections.

How to Improve AI Model Accuracy

Three elements have huge impacts on the training results for deep learning models: amount of high-quality training data, reasonably configured hyperparameters that are used to control the training process, and sufficient but acceptable training time. In the following examples, we use the same training and testing dataset as we presented in our previous blog.

Amount of Training Data

The key aim of PerfPsychic is to prove the effectiveness of our deep learning pipeline, so we start by gradually adding more labeled data to the training dataset. This demonstrates how our models learn from more labeled data and improve their accuracy over time. Figure 1 shows the results, where we start from only 20% of the training dataset and gradually label 20% more each time. It shows a clear trend: as more properly labeled data is added, our model learns and improves its accuracy, without any further human intervention. The accuracy improves from around 50% when we have only about 1,000 data points to 91% when we have the full set of 5,275 data points. Such accuracy is as good as a programmatic analytic rule that took us three months to tune manually.

Figure 1. Accuracy improvement over larger training datasets

Training Hyperparameters

We next vary several other CNN hyperparameters to demonstrate how they were selected for our models. We change only one hyperparameter at a time and train 1,000 CNNs using that configuration. We first vary the number of training iterations, namely how many times we go through the training dataset. If the number of iterations is too small, the CNNs cannot be trained adequately; if it is too large, training takes much longer and might end up overfitting to the training data. As shown in Figure 2, between 50 and 75 iterations is the best range, where 75 iterations achieve the best accuracy of 91%.

Figure 2. Number of training iterations vs. accuracy

We next vary the step size, which is our granularity to search for the best model. In practice, with a small step size, the optimization is so slow that it cannot reach the optimal point in a limited time. With a large step size, we risk passing optimal points easily. Figure 3 shows that, between 5e-3 and 7.5e-3, the model produces good accuracy, where 5e-3 predicts 91% of the labels correctly.

Figure 3. Step size vs. accuracy

We last evaluate the impact of the issue rate of the training data on accuracy. Issue rate is the percentage of training data that represents performance issues among the total. In an ideal set of training data, all the labels should be equally represented to avoid overfitting. A biased dataset generally results in overfitting models that can barely achieve high accuracy. Figure 4 below shows that when the training data have an issue rate under 20% (that is, under 20% of the components are faulty), the model basically overfits to "noissue" data points and predicts that all components are issue-free. Because our testing data have 21.9% of components without issues, it stays at 21.9% accuracy. In contrast, when we have an issue rate over 80%, the model simply treats all components as faulty and thus achieves 78.1% accuracy. This explains why it is important to ensure every label is equally represented, and why we mix our issue/noissue data in a ratio between 40% and 60%.

Figure 4. Impact of issue rate

Training Duration

Training time is also an important factor in a practical deep learning pipeline design. As we train thousands of CNN models, spending one second longer to train a model means a whole training phase will take 1,000 seconds longer. Figure 5 below shows the training time vs. data size and the number of iterations. As we can see, both factors form a linear trend; that is, with more data and more iterations, training takes linearly longer. Fortunately, we know from the study above that more than 75 iterations will not help accuracy. By limiting the number of iterations, we can complete a whole phase of training in less than 9 hours. Again, once the off-line training is done, the model can perform real-time prediction in just a few milliseconds. The training time simply affects how often and how fast the models can pick up new feedback from product experts.

Figure 5. Effect of data size and iteration on training time

Automation

The model selection procedure is fully automated. Thousands of models with different hyperparameter settings are trained in parallel on our GPU-enabled servers. The trained models compete with each other by analyzing our prepared testing data, and their results are compared. We then pick the model with the highest accuracy, put it into PerfPsychic, and use it for online analysis. Moreover, we also keep a record of the parameters of the winning models and use them as initial setups in future trainings. Therefore, our models can keep evolving.

PerfPsychic in Application

PerfPsychic is not only a research product, but also an internal performance analysis tool which is widely used. Now it is used to automatically analyze vSAN performance bugs on Bugzilla.

PerfPsychic automatically detects new vSAN performance bugs submitted in Bugzilla and extracts the usable data logs from the bug attachments. Then it analyzes the logs with the trained models. Finally, the analysis results, which include performance enhancement suggestions, are emailed to the bug submitter and the vSAN developer group.

Below is part of an email received yesterday that gives performance tuning advice on a vSAN bug. Internal information is hidden.

Figure 6. Part of email generated by PerfPsychic to offer performance improvement suggestions

 

VMware Speedily Resolves Customer Issues in vSAN Performance Using AI

We in VMware’s Performance team create and maintain various tools to help troubleshoot customer issues—of these, there is a new one that allows us to more quickly determine storage problems from vast log data using artificial intelligence. What used to take us days, now takes seconds. PerfPsychic analyzes storage system performance and finds performance bottlenecks using deep learning algorithms.

Let's examine the benefit the artificial intelligence (AI) models in PerfPsychic bring when we troubleshoot vSAN performance issues. It takes our trained AI module less than 1 second to analyze a vSAN log and pinpoint performance bottlenecks at an accuracy rate of more than 91%. In contrast, when analyzed manually, an SR ticket on vSAN takes a seasoned performance engineer about one week to de-escalate, with durations ranging from 3 days to 14 days. Moreover, AI also wins over traditional analysis algorithms by enhancing the accuracy rate from around 80% to more than 90%.

Architecture

There are two operation modes in the AI module: off-line training mode and real-time prediction mode. In the training mode, sets of training data, which are labeled with their performance issues, are automatically fed to all potential convolutional neural network (CNN) [1] structures, which we train repeatedly on GPU-enabled servers. We train thousands of models at a time and promote the one that achieves the best accuracy to the real-time system. In the real-time prediction mode, unlabeled user data are sent to the model chosen in the training stage, and it provides a prediction of the root cause (faulty component).

As shown in Figure 1, data in both training and prediction modes are first sent to a data preparation module (Queried Data Preparation), where they are formatted for the later stages. The data path then diverges. Let’s first follow the dashed line, the path of the labeled training data. These data are sent to the deep learning training module (DL Model Training) to train an ensemble of thousands of CNNs generated from our carefully designed structures. After going through the training data thousands of times, once the training accuracy converges to a stable value, the trained CNNs compete with each other in the deep learning model selection module (DL Model Selection), where they must predict the root causes of testing data they have never seen before. Their predictions are compared to the real root causes, labeled by human engineers, to compute the testing accuracy. Finally, we provide the ensemble of models that achieves the best testing accuracy (Trained DL Model) to the real-time prediction system.

Figure 1: Deep Learning Module Workflow

As you might expect, this training process is both time consuming and resource hungry, so it is carried out off-line on servers equipped with powerful GPUs. In contrast, prediction mode is relatively lightweight and well suited to real-time applications.

Following the solid line in Figure 1 for prediction mode, the unlabeled, normalized user data are sent to the selected models, which predict the root cause (Performance Exception) with a small number of computations. The prediction is returned to the upper layer, such as our interactive analytic web UI, automatic analysis, or proactive analysis applications. The web UI also provides a way to manually validate the prediction, which automatically triggers the next round of model training. This closes the feedback loop and ensures our models continue to learn from human feedback.
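To make the prediction path concrete, here is a minimal sketch of a prediction service with a feedback hook. It assumes the ensemble is any collection of trained models exposing a predict method, and that their outputs are combined by a simple majority vote; the post does not specify how PerfPsychic actually combines its models, so the voting scheme is an illustrative assumption.

# Sketch of the real-time prediction path plus the feedback hook that triggers retraining.
from collections import Counter

class PredictionService:
    def __init__(self, ensemble):
        self.ensemble = ensemble  # trained models chosen during off-line selection
        self.feedback = []        # human-validated labels, used for the next training round

    def predict(self, normalized_features):
        # Each model votes on the faulty component; return the majority label.
        votes = Counter(model.predict(normalized_features) for model in self.ensemble)
        return votes.most_common(1)[0][0]

    def record_feedback(self, normalized_features, true_label):
        # Called when a user validates or corrects a prediction in the web UI.
        # Accumulated feedback becomes training data for the next off-line round.
        self.feedback.append((normalized_features, true_label))
        return len(self.feedback)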

AI Wins Over Manual Debugging

Diagnosing performance problems in a software-defined datacenter (SDDC) is difficult because of both the scale of the systems and the scale of the data. The scale of the software and hardware systems results in complicated behaviors that are not only workload-dependent but also interfere with each other, so pinpointing a root cause requires a thorough examination of the entire datacenter. At the same time, because of the scale of the data collected across a datacenter, this analysis requires significant human effort, takes an extremely long time, and is prone to errors. Take vSAN for example: dealing with performance-related escalations typically requires cross-departmental efforts examining the vSAN stack, the ESXi stack, and the physical/virtual network stacks. In some cases, it took many engineers months to pinpoint problems outside of the VMware stack, such as physical network misconfigurations. On average, it takes one week to de-escalate a client's service request ticket, even with many experienced engineers working together.

PerfPsychic is designed to address the challenges we have faced and to make performance diagnostics more scalable. It builds upon a data infrastructure that is at least 10 times faster and 100 times more scalable than the existing one, and it provides an end-to-end interactive analytic UI that lets users perform the majority of the analysis in one place. The analysis results are immediately fed back to the deep learning pipeline in the backend, which produces diagnostic models that detect faulty components more accurately as more feedback is collected. These models mostly take only a few hours to train and can detect faulty components in a given dataset in a few milliseconds, with accuracy comparable to rules that took us months to tune manually.

AI Wins Over Traditional Algorithms

To prove the effectiveness of our AI approach, we tested it against traditional machine learning algorithms.

First, we created two datasets: training data and testing data, as summarized in Table 1.

Table 1: Training and Testing Data Properties

Training data are generated from our simulated environment: a simple 4-node hybrid vSAN setup. We manually insert performance errors into this environment to collect training data with accurate labels. To take a network issue as an example, we simulate packet drops by having vmkernel drop one receive packet at the VMK TCP/IP layer for every N packets, which mimics packet drops in the physical network, and we vary N to produce enough data points for training (see the sketch below). Although this does not perfectly reproduce what happens in a customer environment, it is still a sound practice because it is the only cost-effective way to get a large volume of clean, accurately labeled data.
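The drop injection itself happens inside vmkernel at the VMK TCP/IP layer, but the sweep over N and the labeling can be pictured with a short sketch like the one below; set_drop_interval and run_workload_and_collect are hypothetical placeholders for the lab automation, not real tooling.

# Sketch of collecting labeled "network issue" samples by sweeping the drop interval N.
def set_drop_interval(n):
    # Placeholder: configure the test host to drop one receive packet for every n packets
    # (n = 0 means no injected drops).
    pass

def run_workload_and_collect():
    # Placeholder: run the benchmark workload and return the collected performance stats.
    return {}

def collect_network_issue_samples(drop_intervals=(10, 50, 100, 500, 1000)):
    samples = []
    for n in drop_intervals:  # vary N to produce enough data points for training
        set_drop_interval(n)
        stats = run_workload_and_collect()
        samples.append({"features": stats, "label": "network_issue", "drop_interval": n})
    set_drop_interval(0)  # restore normal behavior
    return samples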

The testing data, in contrast to the training data, are all from customer escalations, which have very different system configurations in many respects (number of hosts, types and numbers of disks, workloads, and so on). In our testing data, 78.1% of the data are labeled with performance issues. Note that a “performance issue” refers to a specific component in the system that is causing the performance problem in the dataset. We define “accuracy” as the percentage of components from the testing datasets to which a model assigns the correct label (“issue” or “no issue”).

With the same training data, we trained one CNN and four popular machine learning models: Support Vector Machine (SVM) [2], Logistic Classification (LOG) [3], Multi-layer Perceptron Neural Network (MLP) [4], and Multinomial Naïve Bayes (MNB) [5]. Then we tested the five models against the testing dataset. To quantify model performance, we calculate accuracy as follows.
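The accuracy calculation follows directly from the definition given above, restated here in formula form:

\[
\text{accuracy} = \frac{\text{number of components labeled correctly (``issue'' or ``no issue'')}}{\text{total number of components in the testing datasets}} \times 100\%
\]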

Finally, we compared the accuracy achieved by each model, as shown in Figure 2. The results reveal that AI is the clear winner, with 91% accuracy.

Figure 2: Analytic Algorithm Accuracy Comparison

Acknowledgments

We appreciate the assistance and feedback from Chien-Chia Chen, Amitabha Banerjee, and Xiaobo Huang. We are also grateful for the support of our manager, Rajesh Somasundaran. Lastly, we thank Julie Brodeur for her review of and recommendations for this blog post.

References

  1. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” CoRR, abs/1409.4842, 2014.
  2. A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199-222, August 2004.
  3. C. Bishop, “Pattern Recognition and Machine Learning,” Chapter 4.3.4.
  4. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” http://www.iro.umontreal.ca/~pift6266/A06/refs/backprop_old.pdf.
  5. H. Zhang, “The optimality of Naive Bayes,” Proc. FLAIRS, 2004, http://www.cs.unb.ca/~hzhang/publications/FLAIRS04ZhangH.pdf.

vSphere with iSER – How to release the full potential of your iSCSI storage!

By Mark Ma

With the release of vSphere 6.7, VMware added iSER (iSCSI Extensions for RDMA) as a natively supported storage protocol in ESXi. With iSER, iSCSI runs over RDMA, so users can boost their vSphere storage performance just by replacing regular NICs with RDMA-capable NICs. RDMA (Remote Direct Memory Access) allows one computer to access another computer's memory directly, minimizing CPU and kernel involvement. By bypassing the kernel, we get extremely high I/O bandwidth and low latency. (To use RDMA, you must have an HCA/Host Channel Adapter device on both the source and destination.) In this blog, we compare standard iSCSI performance vs. iSER performance to see how iSER can release the full potential of your iSCSI storage.

Testbed Configuration

The iSCSI/iSER target system is an open source Ubuntu 18.04 LTS server with 2 x Intel Xeon E5-2403 v2 CPUs, 96 GB RAM, a 120 GB SSD for the OS, 8 x 450 GB (15K RPM) SAS drives, and a Mellanox ConnectX-3 Pro EN 40 GbE NIC (RDMA capable). The file system is ZFS, configured as 4 mirrored pairs across the 8 x 450 GB SAS drives (equivalent to RAID 10). ZFS's advanced in-memory caching can produce very good random IOPS and read throughput. We did not add any SSD for caching because the test compares protocols, not disk drives. The iSCSI/iSER target software is the Linux SCSI target framework (TGT).

The iSCSI/iSER initiator is an ESXi 6.7.0 host (build 9214924) with 2 x Intel Xeon E5-2403 v2 CPUs @ 1.8 GHz, 96 GB RAM, a USB boot drive, and a Mellanox ConnectX-3 Pro EN 40 GbE NIC (RDMA capable). This host is used to benchmark the performance boost that iSER enables over iSCSI.

Both the target and the initiator connect to a 40 GbE switch with QSFP cables for optimal network performance.

Both NICs run the latest firmware, version 2.42.5000.

To measure performance, we used VMware I/O Analyzer, which uses the industry-standard benchmark Iometer.

iSCSI Test

We set the target with the iSCSI driver to ensure the first test measures the standard iSCSI protocol.

Figure 1

For the iSCSI initiator, we simply enable the iSCSI software adapter.

Figure 2

iSCSI test one: Max Read IOPS—this test shows the max read IOPS (4K random read I/Os per second) from the iSCSI storage.

Result: 34,255.18 IOPS

Figure 3

iSCSI test two: Max Write IOPS. This test shows the max write IOPS from the iSCSI storage.

Result: 36,428.26 IOPS

Figure 4

iSCSI test three: Max Read Throughput—this test shows the max read throughput from the iSCSI storage.

Result: 2,740.80 MBPS

Figure 5

iSCSI test four: Max Write Throughput—this test shows the potential max write throughput from the iSCSI storage. (The performance is rather low due to the ZFS RAID configuration and the limited number of disk spindles.)

Result: 112.04 MBPS

Figure 6

iSER Test

We set the target to the iSER driver to ensure the second test measures only iSER connections.

Figure 7

For the iSER initiator, we need to verify that an RDMA-capable NIC is installed. For this, we use the command:
esxcli rdma device list

Figure 8

Then we run the following command from the ESXi host to enable the iSER adapter.

esxcli rdma iser add

 

Figure 9

 

iSER test one: Max Read IOPS—this test shows the max read IOPS from the iSER storage.

Result: 71,108.85 IOPS, which is 207.59% of the iSCSI result.

Figure 10

iSER test two: Max Write IOPS—this test shows the max write IOPS from the iSER storage.

Result: 69,495.70 IOPS, which is 190.77% of the iSCSI result.

Figure 11

iSER test three: Max Read Throughput—this test shows the max read throughput, measured in megabytes per second (MBPS), from the iSER storage.

Result: 4,126.53 MBPS, which is 150.56% of the iSCSI result.

Figure 12

iSER test four: Max Write Throughput—this test shows the max write throughput from the iSER storage. (The performance is rather low due to the ZFS RAID configuration and the limited number of disk spindles.)

Result: 106.48 MBPS, which is about 5% less than iSCSI.

Figure 13

Results

Figure 14

Figure 15

Figure 16

For random I/O, iSER delivers roughly 200% of the iSCSI performance for both read and write IOPS, and about 150% of the iSCSI read throughput. Write throughput is about the same. The only difference between the two test runs is the storage protocol. We also performed these tests on older hardware, so just imagine what vSphere with iSER could do with state-of-the-art NVMe-based storage and the latest 200 GbE network equipment. The ratios behind these figures are worked out below.
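For reference, the ratios behind those percentages follow directly from the results reported above:

\[
\frac{71{,}108.85}{34{,}255.18} \approx 2.08,\qquad
\frac{69{,}495.70}{36{,}428.26} \approx 1.91,\qquad
\frac{4{,}126.53}{2{,}740.80} \approx 1.51,\qquad
\frac{106.48}{112.04} \approx 0.95
\]

That is, iSER delivers roughly twice the iSCSI IOPS for both reads and writes, about 1.5 times the read throughput, and roughly the same (about 5% lower) write throughput.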

Conclusion

The results seemed too good to be true, so I ran the benchmarks several times to confirm their consistency. It's great to see VMware's innovation in action: who would have thought that "not so exciting" traditional iSCSI storage could roughly double its random I/O performance through the efforts of VMware and Mellanox? VMware continues to push the boundaries of the Software-Defined Datacenter to better serve our customers in their digital transformation journey!

About the Author

Mark Ma is a senior consultant at VMware Professional Services. He is heavily involved with POCs, architecture design, assessments, implementation, and user training. Mark specializes in end-to-end virtualization solutions based on Citrix, Microsoft, and VMware applications.