PerfPsychic, our AI-based performance analysis tool, improved its accuracy rate from 21% to 91% with more data and training when debugging vSAN performance issues. Better still, PerfPsychic can continuously improve itself, and the tuning procedure is automated. Let’s examine how we achieve this in the following sections.
How to Improve AI Model Accuracy
Three elements strongly affect the training results of deep learning models: the amount of high-quality training data, reasonably configured hyperparameters that control the training process, and sufficient but acceptable training time. In the following examples, we use the same training and testing datasets that we presented in our previous blog.
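To make the three elements concrete, here is a minimal sketch of automated tuning under a time budget. It is not PerfPsychic's code: the synthetic dataset, the tiny logistic-regression model, the function names, and the grid values are all assumptions chosen so the example runs on its own. It only illustrates how data size, hyperparameters (learning rate and epoch count), and training time interact in a tuning loop.

```python
import itertools
import math
import random
import time


def make_data(n, seed):
    """Synthetic stand-in dataset (assumption): label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        rows.append((x, 1 if x[0] + x[1] > 0 else 0))
    return rows


def train(data, learning_rate, epochs):
    """Plain SGD logistic regression; returns the weights (w0, w1, bias)."""
    w0 = w1 = b = 0.0
    for _ in range(epochs):
        for (x0, x1), y in data:
            p = 1.0 / (1.0 + math.exp(-(w0 * x0 + w1 * x1 + b)))
            g = p - y  # gradient of the log loss with respect to the logit
            w0 -= learning_rate * g * x0
            w1 -= learning_rate * g * x1
            b -= learning_rate * g
    return w0, w1, b


def accuracy(weights, data):
    """Fraction of rows whose predicted class matches the label."""
    w0, w1, b = weights
    hits = sum(1 for (x0, x1), y in data
               if (w0 * x0 + w1 * x1 + b > 0) == (y == 1))
    return hits / len(data)


def tune(train_set, eval_set, grid, time_budget_s):
    """Try (learning_rate, epochs) pairs until the time budget is spent.

    Returns (best_accuracy, learning_rate, epochs). At least one
    candidate is always evaluated.
    """
    start = time.monotonic()
    best = None
    for lr, epochs in itertools.product(grid["learning_rate"], grid["epochs"]):
        if best is not None and time.monotonic() - start > time_budget_s:
            break  # out of time: keep the best result found so far
        acc = accuracy(train(train_set, lr, epochs), eval_set)
        if best is None or acc > best[0]:
            best = (acc, lr, epochs)
    return best


if __name__ == "__main__":
    grid = {"learning_rate": [0.01, 0.1], "epochs": [1, 5]}
    print(tune(make_data(200, seed=0), make_data(200, seed=1), grid, 5.0))
```

Growing the training set, widening the grid, or raising the time budget corresponds to one of the three elements each, which is why an automated loop like this has to trade them off against one another.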
We in VMware’s Performance team create and maintain various tools to help troubleshoot customer issues. Among these is a new one that uses artificial intelligence to identify storage problems in vast amounts of log data more quickly. What used to take us days now takes seconds. PerfPsychic analyzes storage system performance and finds performance bottlenecks using deep learning algorithms.
Let’s examine the benefit that the artificial intelligence (AI) models in PerfPsychic bring when we troubleshoot vSAN performance issues. It takes our trained AI module less than 1 second to analyze a vSAN log and pinpoint performance bottlenecks at an accuracy rate of more than 91%. In contrast, when analyzed manually, an SR ticket on vSAN takes a seasoned performance engineer about one week to de-escalate, with durations ranging from 3 to 14 days. Moreover, AI also outperforms traditional analysis algorithms, raising the accuracy rate from around 80% to more than 90%.
With the release of vSphere 6.7, VMware added iSER (iSCSI Extensions for RDMA) as a natively supported storage protocol to ESXi. With iSER run over iSCSI, users can boost their vSphere performance just by replacing the regular NICs with RDMA-capable NICs. RDMA (Remote Direct Memory Access) allows memory to be transferred from one computer to another. This is a direct transfer that minimizes CPU/kernel involvement. By bypassing the kernel, we get extremely high I/O bandwidth and low latency. (To use RDMA, you must have an HCA/Host Channel Adapter device on both the source and destination.) In this blog, we compare standard iSCSI performance vs. iSER performance to see how iSER can release the full potential of your iSCSI storage.
A new white paper is available comparing Spark machine learning performance on an 8-server on-premises cluster vs. a similarly configured VMware Cloud on AWS cluster.
Here is what the VMware Cloud on AWS cluster looked like:
VMware Cloud on AWS configuration for performance tests
Three standard analytic programs from the Spark machine learning library (MLlib), K-means clustering, Logistic Regression classification, and Random Forest decision trees, were driven using spark-perf. In addition, a new, VMware-developed benchmark, IoT Analytics Benchmark, which models real-time machine learning on Internet-of-Things data streams, was used in the comparison. The benchmark is available from GitHub.
We published a paper that shows how VMware is helping advance PMEM technology by driving the virtualization enhancements in vSphere 6.7. The paper gives a detailed performance analysis of using PMEM technology on vSphere using various workloads and scenarios.
These are the key points that we cover in this white paper:
We explain how PMEM can be configured and used in a vSphere environment.
We show how applications with different characteristics can take advantage of PMEM in vSphere. Below are some of the use-cases:
How PMEM device limits can be achieved under vSphere with little to no overhead of virtualization. We show virtual-to-native ratio along with raw bandwidth and latency numbers from fio, an I/O microbenchmark.
How traditional relational databases like Oracle can benefit from using PMEM in vSphere.
How scaling-out VMs in vSphere can benefit from PMEM. We used Sysbench with MySQL to show such benefits.
How modifying applications (PMEM-aware) can get the best performance out of PMEM. We show performance data from such applications, e.g., an OLTP database like SQL Server and an in-memory database like Redis.
Using vMotion to migrate VMs with PMEM, which is a host-local device just like NVMe SSDs. We also characterize in detail the vMotion performance of VMs with PMEM.
We outline some best practices on how to get the most out of PMEM in vSphere.
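As a small illustration of the virtual-to-native ratio mentioned in the use-cases above, the sketch below shows one way such a ratio could be computed from a pair of benchmark results. The function name and the convention for latency (inverting the ratio so that 1.0 always means "no virtualization overhead") are assumptions for illustration, and the numbers in the usage lines are made up, not results from the white paper.

```python
def virtual_to_native_ratio(virtual, native, higher_is_better=True):
    """Return how close a VM result is to bare metal (1.0 = no overhead).

    For throughput-style metrics (bandwidth, IOPS) higher is better, so the
    ratio is virtual / native. For latency lower is better, so the ratio is
    inverted (an assumed convention) to keep 1.0 as the ideal in both cases.
    """
    if virtual <= 0 or native <= 0:
        raise ValueError("measurements must be positive")
    return virtual / native if higher_is_better else native / virtual


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(virtual_to_native_ratio(9.5, 10.0))                            # bandwidth
    print(virtual_to_native_ratio(125.0, 100.0, higher_is_better=False)) # latency
```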
I’m excited to announce that the “Extreme Performance Series” is back for its 6th year with 14 sessions created and presented by VMware’s best and most distinguished performance engineers, principals, architects and gurus. You do not want to miss this year’s program, as it’s chock-full of advanced content, practical advice and exciting technical details!
Spread across 5 different VMworld tracks, you’ll find these sessions full of performance details that you won’t get anywhere else at VMworld. They’ll also be recorded, so if the sessions you want to see aren’t being hosted in your region, you’ll still get access to them.
Underlying each release of VMware vSphere are many performance and scalability improvements. The vSphere 6.7 platform continues to provide industry-leading performance and features to ensure the successful virtualization and management of your entire software-defined datacenter.
You’ve probably already heard about VMware Cloud on Amazon Web Services (VMC on AWS). It’s the same vSphere platform that has been running business critical applications for years, but now it’s available on Amazon’s cloud infrastructure. Following up on the many tests that we have done with Oracle databases on vSphere, I was able to get some time on a VMC on AWS setup to see how Oracle databases perform in this new environment.
It is important to note that VMC on AWS is vSphere running on bare metal servers in Amazon’s infrastructure. The expectation is that performance will be very similar to “regular” onsite vSphere, with the added advantage that the hardware provisioning, software installation, and configuration are already done and the environment is ready to go when you log in. The vCenter interface is the same, except that it references the Amazon instance type for the server.
Ever wondered how DRS distributes resources to VMs? How many resources your VMs are entitled to? How reservations, limits, and shares (RLS) affect your VMs’ resource availability? Our new fling, DRS Entitlement Viewer, is the answer.
DRS Entitlement Viewer is installed as a plugin to the vSphere Client. It is currently only supported for the HTML5-based vSphere Client. Once installed, it gives a hierarchical view of the vCenter DRS cluster inventory with the entitled CPU and memory resources for each resource pool and VM in the cluster.
Entitled resources can change with VMs’ resource demand and with the VM’s and resource pool’s RLS settings. So, users can get the current entitlements based on the VMs’ current demands and RLS settings of the VMs and resource pools.
DRS Entitlement Viewer also provides three different what-if scenarios:
1. Changing RLS settings of a VM and/or resource pool
2. What if all the VMs’ resource demand is at 100%
3. Both 1 and 2 happen together
Users can pick one of the three scenarios and can get new entitlements without actually changing RLS settings on the cluster.
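To give a feel for how reservations, limits, shares, and demand can interact, here is a simplified sketch. It is not the algorithm DRS or the fling uses: it assumes a flat list of VMs (no resource pool hierarchy), a single resource measured in MHz, and a simple rule in which every VM first receives its reservation and the remaining capacity is then divided in proportion to shares, capped by each VM's limit and demand. The `VM` class, the `entitlements` function, and the `full_demand` flag (standing in for the "demand at 100%" what-if) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    reservation: float  # MHz guaranteed to the VM
    limit: float        # MHz cap; use float("inf") for unlimited
    shares: int         # relative priority, must be positive
    demand: float       # MHz the VM currently wants


def entitlements(vms, capacity, full_demand=False):
    """Simplified share-proportional divvy (an assumed model, not DRS itself)."""
    ceiling = {}
    ent = {}
    for vm in vms:
        # With full_demand, every VM is treated as wanting up to its limit.
        want = vm.limit if full_demand else min(vm.demand, vm.limit)
        ceiling[vm.name] = max(vm.reservation, want)
        ent[vm.name] = vm.reservation  # reservations are granted first
    remaining = capacity - sum(ent.values())
    if remaining < 0:
        raise ValueError("reservations exceed capacity")

    active = [vm for vm in vms if ent[vm.name] < ceiling[vm.name]]
    while remaining > 1e-9 and active:
        total_shares = sum(vm.shares for vm in active)
        granted = 0.0
        for vm in active:
            offer = remaining * vm.shares / total_shares
            take = min(offer, ceiling[vm.name] - ent[vm.name])
            ent[vm.name] += take
            granted += take
        remaining -= granted
        # VMs that reached their ceiling drop out; leftovers are re-divided.
        active = [vm for vm in active if ceiling[vm.name] - ent[vm.name] > 1e-9]
    return ent


if __name__ == "__main__":
    # Hypothetical cluster, for illustration only.
    cluster = [VM("a", 0, 2000, 1000, 1500), VM("b", 500, 2000, 2000, 300)]
    print(entitlements(cluster, capacity=3000))                    # current demand
    print(entitlements(cluster, capacity=3000, full_demand=True))  # what-if
```

Because the function only reads its inputs, trying different RLS values is a matter of passing a modified copy of the VM list, which mirrors the idea of computing new entitlements without changing settings on the cluster.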
Finally, DRS Entitlement Viewer also provides an option to export the new RLS values from a what-if scenario as a vSphere PowerCLI command that customers can execute against their vCenter to apply the new settings.