VMware Cloud on AWS m7i: Performance of Microsoft SQL Server VMs

We recently announced the new m7i instance type of VMware Cloud on AWS, which uses networked NFS storage instead of vSAN. This gives you flexible and scalable storage based on VMware Cloud Flex Storage and Amazon FSx for NetApp ONTAP.

For other VMware Cloud on AWS instance types, vSAN storage is included, which is based on the local disks on each host. vSAN datastores grow with the cluster as you add more nodes. With m7i, the amount of storage does not depend on the number of hosts, and you can size it as needed based on storage requirements.

In this blog post, we look at the performance of SQL Server VMs running on a 3-node m7i cluster with NFS storage vs a 3-node i4i cluster with vSAN storage. The m7i instances are based on Intel processors that are one generation newer than i4i. Refer to table 1 for an overview of the two instance types.

Table 1. Configuration of the instance types used

Methodology

For our tests, we used a set of 16 vCPU and 24 vCPU VMs. Because each m7i instance had 48 cores, both 16 and 24 vCPUs divide the core count evenly, which makes performance comparisons clean and easy to understand. To support the maximum number of 16 vCPU VMs on the m7i cluster, we assigned each VM 60 GB of RAM.
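To make the sizing arithmetic concrete, here is a minimal Python sketch. The core count, host count, and RAM per VM come from this post, and the maximum of 18 VMs is the figure reported later for the 16 vCPU tests. The function names are illustrative only, and the sketch assumes the VMs are spread evenly across hosts, which DRS approximates but does not guarantee.

```python
# Values taken from the post: 48 cores per m7i host, 3 hosts,
# 60 GB of RAM per VM, and a maximum of 18 VMs with 16 vCPUs each.
CORES_PER_HOST = 48
HOSTS = 3
RAM_PER_VM_GB = 60
MAX_16_VCPU_VMS = 18


def vms_per_host_on_cores(vcpus_per_vm, cores_per_host=CORES_PER_HOST):
    """Return how many VMs exactly fill one host's physical cores.

    Raises ValueError if the VM size does not divide the core count evenly.
    """
    if cores_per_host % vcpus_per_vm != 0:
        raise ValueError(
            f"{vcpus_per_vm} vCPUs do not divide {cores_per_host} cores evenly"
        )
    return cores_per_host // vcpus_per_vm


def ram_per_host_gb(total_vms, hosts=HOSTS, ram_per_vm_gb=RAM_PER_VM_GB):
    """Return the VM memory one host must provide with an even VM spread."""
    vms_per_host = -(-total_vms // hosts)  # ceiling division
    return vms_per_host * ram_per_vm_gb


for size in (16, 24):
    print(f"{size} vCPU VMs per host on physical cores: {vms_per_host_on_cores(size)}")
print(f"VM RAM per host at {MAX_16_VCPU_VMS} VMs: {ram_per_host_gb(MAX_16_VCPU_VMS)} GB")
```

Under these assumptions, each host carries six 16 vCPU VMs at the maximum, so the per-host VM memory to check against the instance's installed RAM is 360 GB.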

We used a Windows Server 2022 VM running Microsoft SQL Server 2022 for the tests, and we used the open-source DVD Store 3.5 benchmark tool to run OLTP application loads against SQL Server and measure the results. DVD Store simulates an online store where customers browse, review, rate reviews, join an exclusive membership, and purchase products. Results are reported in orders per minute (OPM), which is a measurement of throughput. Higher OPM scores are better.

Performance results for 24 vCPU vs 16 vCPU SQL Server VMs

The results in figures 1 and 2 show the performance of 24 vCPU VMs and 16 vCPU VMs. The line in each chart is the total throughput in OPM across all VMs in each test. The number of VMs increases in each test, starting with 1 and ending with the maximum, at which the total number of vCPUs equals the number of logical threads available on the cluster.

OPM increases more slowly once the number of vCPUs exceeds the number of physical cores and the second logical thread (hyperthread) on each core must be used. This starts at 8 and 12 VMs for the 24 and 16 vCPU tests, respectively.
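The thresholds behind this behavior can be worked out with a short Python sketch. It uses the 48 cores per host and 3 hosts stated in this post and treats each core as exposing two logical threads, as the hyperthread description implies. The function name cluster_limits is hypothetical.

```python
CORES_PER_HOST = 48    # from the post
HOSTS = 3              # from the post
THREADS_PER_CORE = 2   # one hyperthread in addition to the first thread on each core


def cluster_limits(vcpus_per_vm):
    """Return VM-count thresholds for a given VM size on the 3-host m7i cluster."""
    physical_cores = CORES_PER_HOST * HOSTS
    logical_threads = physical_cores * THREADS_PER_CORE
    return {
        # Largest VM count whose vCPUs still fit on physical cores alone.
        "vms_within_physical_cores": physical_cores // vcpus_per_vm,
        # Largest VM count whose vCPUs fit on all logical threads.
        "max_vms": logical_threads // vcpus_per_vm,
    }


for size in (24, 16):
    print(size, cluster_limits(size))
```

With these inputs, 24 vCPU VMs fit on physical cores up to 6 VMs and top out at 12, while 16 vCPU VMs fit up to 9 VMs and top out at 18. The 8 and 12 VM data points mentioned above are past the physical-core limit for their respective VM sizes.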

The bars in each chart represent the CPU utilization of the three hosts for each test. As we added more VMs, VMware Distributed Resource Scheduler (DRS) automatically placed the VMs and may have dynamically moved them between hosts to manage load. As the results show, this kept the CPU utilization across all hosts fairly consistent even as the load increased.

The last data point on the 16 vCPU test (figure 2) shows a slight decline in OPM because the cluster is slightly overloaded at this point in the test. We used only a 3-instance cluster in these tests, but we could have continued scaling performance by adding more m7i instances to the cluster to increase its capacity.

Figure 1. Virtualized SQL Server performance and host CPU utilization for a VMware Cloud on AWS 3-host cluster with 24 vCPUs per VM

Figure 2. Virtualized SQL Server performance and host CPU utilization for a VMware Cloud on AWS 3-host cluster with 16 vCPUs per VM

Figure 3 compares the performance of the 16 vCPU and 24 vCPU VMs so that the same number of total vCPUs was assigned in each test case; for example, for 48 vCPUs, we compared 2×24 vCPUs to 3×16 vCPUs (both equaling 48).
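The equal-vCPU pairing generalizes beyond the 48 vCPU example. The Python sketch below enumerates every total that both VM sizes can reach exactly, up to the 288 vCPUs implied by 18 VMs of 16 vCPUs each. It illustrates the arithmetic only; the post does not state which of these totals appear in figure 3, and the function name is hypothetical.

```python
from math import gcd


def equal_vcpu_pairings(small=16, large=24, max_total_vcpus=288):
    """Return (total_vcpus, large_vm_count, small_vm_count) tuples.

    Only totals that are whole multiples of both VM sizes qualify.
    """
    step = small * large // gcd(small, large)  # least common multiple
    return [
        (total, total // large, total // small)
        for total in range(step, max_total_vcpus + 1, step)
    ]


for total, n24, n16 in equal_vcpu_pairings():
    print(f"{total} vCPUs: {n24} x 24 vCPU VMs vs {n16} x 16 vCPU VMs")
```

The first pairing is the 2×24 versus 3×16 case described above, and the last is 12×24 versus 18×16 at 288 vCPUs.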

Figure 3. SQL Server VM performance: 16 vCPU compared to 24 vCPU

Performance comparison with i4i

We migrated the same VMs to the i4i cluster and repeated the tests. To keep the VMs as similar as possible, we didn’t increase the amount of RAM assigned to them, even though there was about 2.5x more RAM on the i4i cluster.

As figure 4 shows, the results for i4i were similar in terms of OPM and host CPU utilization. A key difference: we were able to run up to 24 VMs on i4i vs the maximum of 18 VMs on m7i. This is due to the higher number of cores and greater memory on the i4i instances.

Figure 4. Virtualized SQL Server performance and host CPU utilization for a VMware Cloud on AWS 3-host i4i cluster with vSAN storage and 16 vCPUs per VM

Figure 5 shows the difference between m7i and i4i. Because the individual cores of the m7i instances are higher performing than those of the i4i cluster, figure 5 shows a performance advantage for m7i on the left side of the graph. Once the m7i instances need to rely on the second logical thread (hyperthread) from each physical core to support workloads, the i4i performance becomes similar. Finally, when the i4i instances take full advantage of their higher core count, their performance exceeds that of m7i.

Figure 5. Difference between m7i and i4i

Conclusion

Benchmark testing shows good performance of SQL Server on m7i, within the resource capacity the m7i instances provide. Because the m7i instances have fewer cores and less memory than i4i, it is important to use the VMC sizer to ensure that the platform meets the needs of your databases.

We saw slightly higher latencies with the NFS storage attached to m7i, but overall, this did not have a big effect on the results of our testing. It is also important to size the NFS storage attached to the m7i cluster and to set its IOPS and throughput to levels that match workload requirements.