This week we will publish three blogs on the topic of sizing virtual machines for Java workloads. In these blogs we will discuss various sizing considerations, best practices, sizing limits, and the most common configurations used by our customers.
A sizing exercise for Java workloads in a virtualized environment is similar to one in a physical environment. The main difference is that a virtualized environment provides more flexibility, such as the ability to easily change the compute resource configuration. For more detailed information, we encourage you to review the Enterprise Java Applications on VMware – Best Practices Guide (http://www.vmware.com/resources/techresources/1087).
Sizing Virtual Machines for JVM workloads – Part 1
Before delving into various sizing considerations we’ll provide some background information about the practical sizing limits of JVMs.
Background: JVM Practical Sizing Limits
Figure 1 illustrates the theoretical and practical sizing limits of Java workloads. These are critical limits that you need to be aware of when sizing JVM workloads.
Figure 1. Theoretical and Practical Sizing Limits of JVMs
- The first limit is theoretical: the JVM can address 16 exabytes, but no practical system can provide this amount of memory.
- The second limit is the amount of memory a guest OS can support. In most practical cases, this is several terabytes and depends on the operating system used.
- The third limit is the ESXi 5 1TB RAM per virtual machine limit, which is ample for any workload that we have encountered.
- The fourth limit (really, the first practical limit) is the amount of RAM that is cost-effective on typical ESXi hosts.
- The fifth limit is the total amount of RAM across the server and how it is divided into NUMA nodes, where each processor socket has one NUMA node's worth of NUMA-local memory. The NUMA-local memory can be calculated as the total amount of RAM within the server divided by the number of processor sockets. For optimal performance, you should always size a virtual machine within the NUMA node memory boundaries. ESXi has many NUMA optimizations that come into play, but even so, it is best to stay NUMA local.
For example, if the ESXi host has 256GB of RAM across two processor sockets, it has 2 NUMA nodes with 128GB (256GB/2) of RAM in each NUMA node. This implies that when you size a virtual machine, it should not exceed the 128GB limit if it is to remain NUMA local.
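The NUMA-local budget calculation above can be sketched in a few lines. This is a minimal illustration using the example figures from the text (a 256GB host with two sockets); the class and variable names are ours, and you would substitute your own host configuration.

```java
// Sketch: deriving the NUMA-local memory budget for VM sizing.
// Host values are the example from the text (256GB across 2 sockets).
public class NumaBudget {
    public static void main(String[] args) {
        long totalHostRamGb = 256;   // total RAM in the ESXi host
        int processorSockets = 2;    // one NUMA node per processor socket

        // NUMA-local memory = total host RAM / number of sockets
        long numaLocalGb = totalHostRamGb / processorSockets;
        System.out.println("NUMA-local budget: " + numaLocalGb + "GB");

        // A virtual machine sized within this budget stays NUMA local
        long proposedVmRamGb = 96;   // hypothetical VM size for illustration
        boolean staysLocal = proposedVmRamGb <= numaLocalGb;
        System.out.println(proposedVmRamGb + "GB VM stays NUMA local: " + staysLocal);
    }
}
```

A 96GB virtual machine fits within the 128GB budget, while a 160GB one would spill across NUMA nodes and pay a remote-memory penalty.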
The limits outlined above can help drive your design and sizing decisions as to how practical and feasible it is to size large JVMs. However, other considerations come with sizing very large JVMs, such as GC tuning complexity and the knowledge needed to maintain large JVMs. In fact, the most commonly sized JVMs within the VMware customer base are around 4GB of RAM for a typical enterprise Web application. On the other hand, larger JVMs exist, and we have customers that run large-scale monitoring systems and large distributed data platforms on JVMs ranging from 4GB to 128GB.
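As a concrete illustration of the common 4GB case, a heap of that size is typically set with the standard HotSpot `-Xms`/`-Xmx` flags, and the effective ceiling can be verified from inside the process. The class name below is ours; note that `Runtime.maxMemory()` may report slightly less than the `-Xmx` value because of internally reserved space.

```java
// Sketch: verifying the configured maximum heap from inside the JVM.
// Launch with explicit heap flags, for example:
//   java -Xms4g -Xmx4g HeapCheck
public class HeapCheck {
    public static void main(String[] args) {
        // maxMemory() reports the heap ceiling the JVM will attempt to use
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        double maxHeapGb = maxHeapBytes / (1024.0 * 1024 * 1024);
        System.out.printf("Max heap: %.1f GB%n", maxHeapGb);
    }
}
```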
With large JVMs comes the need to better understand GC tuning. Although GC tuning on physical hardware is no different from tuning on virtual, VMware has helped many customers with their GC tuning activities: VMware has uniquely integrated vFabric Java and vSphere expertise into one spectrum, which has helped our customers optimally run many Java workloads on vSphere. When faced with the question of whether to vertically scale the size of the JVM and virtual machine, first consider a horizontal scale-out approach. VMware has consistently found that our customers get better scalability with horizontal scale-out.
Furthermore, when sizing, it is helpful to categorize the size of the JVMs and virtual machines based on the Java workload types, as shown in Figure 2.
Figure 2. Common JVM Sizes and Workload Categories
We usually find that customers vertically scale a JVM because of the perceived simplicity of deployment and the desire to leave existing JVM processes intact. Be aware, however, of the workload-related tradeoffs this choice entails.
- For example, a customer initially deploys one JVM process and as demand increases for more applications to be deployed, instead of horizontally scaling out by creating a second JVM and virtual machine, a vertical scale up approach is taken. As a consequence, the existing JVM is forced to vertically scale and carry many different types of workloads with varied requirements.
- Keep in mind that some workloads, such as a job scheduler, need high throughput, while a public-facing Web application demands fast response time. Stacking these types of applications on top of each other within one JVM complicates GC tuning: tuning GC for higher throughput usually comes at the cost of slower response time, and vice versa.
- You can achieve both higher throughput and better response time with GC tuning, but doing so unnecessarily extends the GC tuning activity. When faced with this deployment choice, it is best to split the different types of Java workloads into their own JVMs. One approach is to run the job scheduler workload in its own JVM and virtual machine, and the Web-based Java application in its own JVM and virtual machine.
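To make the throughput-versus-latency split concrete: the scheduler JVM might be launched with the throughput-oriented parallel collector (`-XX:+UseParallelGC`) and the Web-facing JVM with a low-pause collector such as G1 (`-XX:+UseG1GC`). These are standard HotSpot flags, not something prescribed above; the choice of collector per workload is our illustration. Which collectors a given JVM actually runs under can be confirmed at runtime via the standard management API:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch: listing the garbage collectors active in this JVM, so each
// split-out JVM (scheduler vs. Web) can confirm its intended collector.
public class GcReport {
    public static void main(String[] args) {
        List<GarbageCollectorMXBean> gcs =
                ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : gcs) {
            System.out.println("Collector: " + gc.getName()
                    + ", collections so far: " + gc.getCollectionCount());
        }
    }
}
```

Running this under each JVM after the split provides a quick sanity check that the scheduler and Web tiers are on the collectors you tuned them for.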
In Figure 3, JVM-1 is deployed on a virtual machine that carries mixed application workload types, which complicates GC tuning and limits scalability when this application mix is scaled up in JVM-2. A better approach is to split the Web application into JVM-3 and the job scheduler application into JVM-4 (that is, scaled out horizontally with the flexibility to scale vertically if needed). If you compare the vertical scalability of JVM-3 and JVM-4 with that of JVM-2, you will find that JVM-3 and JVM-4 always scale better and are easier to tune.
Figure 3. Splitting Workload Types to Improve Scalability
In Part 2 we will look at an actual sizing example with some practical numbers that you can directly apply.