Resource and over-subscription management are always the most challenging tasks facing a Cloud Admin. To deliver a guaranteed SLA, one method OpenStack Cloud Admins have used is to create separate compute aggregates with different allocation / over-subscription ratios. Production workloads that require guaranteed CPU, memory, or storage would be placed into a non-oversubscribed aggregate with 1:1 over-subscription, dev workloads may be placed into a best effort aggregate with N:1 over-subscription. While this simplistic model accomplishes its purpose of an SLA guarantee on paper, it comes with a huge CapEx and/or high overhead for capacity management / augmentation. Worst yet, because host aggregate level over-subscription in OpenStack is simply static metadata consumed by the nova scheduler during VM placement, not real time VM state or consumption, huge resource imbalances within the compute aggregate and noisy neighbor issues within a nova compute host are common occurrences.
New workloads can be placed on a host running close to capacity (real time consumption), while remaining hosts are running idle due to differences in application characteristics and usage pattern. Lack of automated day 2 resource re-balance(management) further exacerbates the issue. To provide white glove treatment to critical tenants and workloads, Cloud Admins must deploy additional tooling to discover basic VM to Hypervisor mapping based on OpenStack project IDs. This is both expensive and ineffective in meeting SLAs.
Over-subscription works if resource consumption can be tracked and balanced across a compute cluster. Noisy neighbor issues can be solved only if the underlying infrastructure supports quality of service (QoS). By leveraging OpenStack Nova flavor extra-spec extensions along with vSphere industry proven per VM resource reservation allocation (expressed using shares, limits and reservations), OpenStack Cloud Admins can deliver enhanced QoS while maintaining uniform consumption across a compute cluster. It is possible to leverage Image metadata to deliver QoS as well, this blog will focus on Nova flavor extra-spec.
The VMware Nova flavor extension to OpenStack was first introduced upstream in Kilo and is officially supported in VIO release 2.0 and above. Additional requirements are outlined below:
- Requires VMware Integrated OpenStack version 2.0.x or greater
- Requires vSphere version 6.0 or greater
- Network Bandwidth Reservation requires NIOC version 3
- VMware Integrated OpenStack access as a cloud administrator
Resource reservations can be set for following resource categories:
- CPU (MHz)
- Memory (MB)
- Disk IO (IOPS)
- Network Bandwidth (Mbps)
Within each resource category, Cloud Admin has the option to set:
- Limit – Upper bound, not to exceed limit resource utilization
- Reservation – Guaranteed minimum reservation
- Share Level – The allocation level. This can be ‘custom’, ‘high’ ‘normal’ or ‘low’.
- Shares Share – In the event that ‘custom’ is used, this is the number of shares.
Complete Nova flavor extra-spec details and deployment options can be found here. vSphere Resource Management capabilities and configuration guidelines is a great reference as well and can be found here.
Let’s look at an example using Hadoop to demonstrate VM resource management with flavor extra-specs. Data flows from Kafka into HDFS, every 30 minutes there’s a batch job to consume the newly ingested data. Exact details of the Hadoop workflow are outside the scope of this blog. If you are not familiar with Hadoop, some details can be found here. Resources required for this small scale deployment are outlined below:
(reserved – Max)
(reserved – Max)
Master / Name Node
Based on above requirements, Cloud Admin needs to create Nova flavors to match maximum CPU / Memory / Disk requirements for each Hadoop component. Most of OpenStack Admins should be very familiar with this process.
Based on the reservation amount, attach corresponding nova extra specs to each flavor:
Once extra specs are mapped, confirm setting using the standard nova flavor-show command:
In just three simple steps, resource reservation settings are complete. Any new VM consumed using new flavors from OpenStack (API, command line or Horizon GUI) will have resource requirements passed to vSphere (VMs can be migrated using the nova rebuild VM feature).
Instead of best effort, vSphere will guarantee resources based on nova flavor extra-spec definition. Specific to our example, 4 vCPU / 16G / Max 1G network throughput will be reserved for each DataNode, NameNode with 4 vCPU / 16G / Max 500M throughput and Kafka nodes will have 20% vCPU / 50% Memory reserved. Instances boot into “Error” state if requested resources are not available, ensuring existing workload application SLAs are not violated. You can see that the resource reservation created by the vSphere Nova driver are reflected in the vCenter interface:
Name Node CPU / Memory:
Name Node Network Bandwidth:
Data Node CPU / Memory:
Data Node Network Bandwidth:
Kafka Node CPU / Memory:
Kafka Network Bandwidth:
vSphere will enforce strict admission control based on real time resource allocation and load. New workloads will be admitted only if SLA can be honored for new and existing applications. Once a workload is deployed, in conjunction with vSphere DRS, workload rebalance can happen automatically between hypervisors to ensure optimal host utilization (future blog) and avoid any noisy neighbor issues. Both features are available out of box, no customization is required.
By taking advantage of vSphere VM resource reservation capabilities, OpenStack Cloud Admins can finally enjoy the superior capacity and over-subscription capabilities a cloud environment offers. Instead of deploying excess hardware, Cloud Admins can control when and where additional hardware are needed based on real time application consumption and growth. Ability to consolidate, simplify, and control your infrastructure will help to reduced power, space, and eliminate the need for any out of box customization in tooling or operational monitoring. I invite you to test out nova-extra spec in your VIO environment today or encourage your IT team to try our VMware Integrated OpenStack Hands-On-Lab, no installation is required.