Rightsizing can be one of the most effective ways to reduce your monthly Azure bill. However, it can be difficult and overwhelming for organizations to identify which resources are running, at what capacity, and under which owners in order to rightsize their environment. The challenge worsens as they increase cloud usage and services or adopt a multi-cloud strategy.
In this article, we’ll discuss some of the challenges with managing resource efficiency, how to approach rightsizing for Azure cloud cost optimization, and tools to help manage rightsizing your cloud environment at scale.
Reasons for resource inefficiency in the cloud
It’s common for developers to spin up new Virtual Machines (VMs) that are substantially larger than necessary. This may be intentional to give themselves extra headroom, or accidental since they don’t yet know the performance requirements of the new VM.
In addition to overprovisioned resources, other common causes of wasted cloud spend include unattached disks, aged snapshots, disassociated IP addresses, zombie assets, and systems left running during non-production hours. Regardless of the reason, failing to rightsize resources can lead to significantly higher costs on your cloud bill.
Rightsizing your Azure cloud infrastructure
Rightsizing is the process of analyzing the utilization and performance metrics of your infrastructure, determining whether or not they’re running efficiently for what you’re paying, and then taking action to improve by modifying the infrastructure as needed—upgrading, downgrading, or terminating. This can be done for compute, storage, database, containers, and many other infrastructure types.
- Upgrading is recommended for workloads with consistently high utilization. The performance metrics must be analyzed over a period of time to come to this conclusion.
- Downgrading, or downsizing, is recommended for underutilized resources whose workloads can achieve the same core performance on a smaller size.
- Terminating is recommended for “zombie” resources, which are assets that are running within your account but are not in use. Terminating these resources results in immediate cost savings.
Rightsizing Azure Virtual Machines
When rightsizing VMs, it’s important to consider CPU, memory, disk, and network in/out utilization, and to review trended metrics over time. This way, you can make decisions around reducing the size of the VM without hurting the performance of the applications on the VM. For example, if memory utilization, network utilization, and/or disk use is above 50% of the provisioned capacity, downsizing a VM to half its current capacity will likely affect workload performance.
In these circumstances, it may be better to change the VM family—from, for example, General Purpose to Compute Intensive or Memory Intensive—or deploy the workload in a Virtual Machine Scale Set, which not only has the advantage of helping reduce spend in Azure, but which also increases application resiliency if a problem occurs with the VM. It may also be worth migrating the workload from a VM to containers to make more efficient use of resources.
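The 50% rule of thumb above can be sketched as a quick check. This is a minimal illustration, not a definitive sizing method: the `PeakUtilization` class, the `blockers_for_halving` function, and the sample values are all hypothetical, and CPU is included alongside memory, disk, and network as an assumption.

```python
from dataclasses import dataclass


@dataclass
class PeakUtilization:
    """Peak utilization per dimension, as a fraction of provisioned capacity (0.0 to 1.0)."""
    cpu: float
    memory: float
    disk: float
    network: float


def blockers_for_halving(peak: PeakUtilization, limit: float = 0.5) -> list[str]:
    """Return the dimensions that would exceed capacity if the VM were halved."""
    return [name for name, value in vars(peak).items() if value > limit]


# Hypothetical VM: memory peaks at 62% of what is provisioned.
vm = PeakUtilization(cpu=0.18, memory=0.62, disk=0.30, network=0.10)
print(blockers_for_halving(vm))  # ['memory']
```

An empty list suggests that halving the VM is worth testing; a non-empty list points to the dimension that argues for a different VM family instead of a smaller size.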
A good starting place for rightsizing is to look for VMs that have an average CPU less than 5% and a max CPU less than 20% for 30 days. VMs that fit these criteria are viable candidates for rightsizing or termination.
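As a rough sketch, this screen can be expressed as a filter over CPU samples collected during the review window. It assumes the intent is a low average and a low peak (average below 5%, peak below 20%). The function name, the VM names, and the sample values are made up for illustration; in practice the samples would come from your monitoring data.

```python
from statistics import mean


def is_rightsizing_candidate(cpu_samples, avg_limit=5.0, max_limit=20.0):
    """True if average CPU is below avg_limit and peak CPU is below max_limit.

    cpu_samples: CPU utilization percentages covering the review window
    (for example, hourly values over 30 days).
    """
    if not cpu_samples:
        return False  # no data: do not recommend anything
    return mean(cpu_samples) < avg_limit and max(cpu_samples) < max_limit


# Hypothetical fleet with a handful of samples per VM.
fleet = {
    "vm-batch-01": [2.0, 3.5, 1.0, 4.0],     # idle most of the time
    "vm-web-02": [1.0] * 9 + [25.0],         # low average, but one burst above 20%
    "vm-db-03": [35.0, 40.0, 55.0, 38.0],    # busy
}
candidates = [name for name, samples in fleet.items() if is_rightsizing_candidate(samples)]
print(candidates)  # ['vm-batch-01']
```

Checking the peak as well as the average keeps bursty workloads such as the second VM out of the candidate list.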
Rightsizing Azure disk storage
Disk storage can also be rightsized. The critical factors to consider with disk storage are capacity, IOPS, and throughput. With this information, you can select the disk size from the options available for Standard SSD, Standard HDD, Premium SSD, or Ultra disks.
- Standard SSD: Provides cost-effective storage optimized for workloads that need consistent performance at lower IOPS levels. Suitable for web servers, lightly used enterprise applications, and Dev/Test workloads.
- Standard HDD: Delivers reliable, low-cost disk support for VMs running latency-insensitive workloads. Suitable for backup, non-critical, and infrequently accessed workloads.
- Premium SSD: Delivers high-performance, low-latency disk support for VMs with IO-intensive workloads. Suitable for production and performance-sensitive workloads.
- Ultra Disks: Deliver high throughput, high IOPS, and consistent low-latency disk storage. Suitable for data-intensive workloads such as SAP HANA, top-tier databases, and transaction-heavy workloads.
One thing to note is that Premium storage is billed based on the total disk size, regardless of consumption. For example, if you attach an empty Premium P20 512GB disk to a VM, you will be charged for the full 512GB per month, regardless of use. Keep a close eye on the utilization of Premium storage to minimize wasted cost.
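To make the billing point concrete, here is a small sketch that measures how much of a provisioned Premium disk is unused and finds the smallest tier that would still hold the data. The tier table is a subset of provisioned sizes and should be checked against current Azure documentation, the helper names are hypothetical, and the sketch looks at capacity only; IOPS and throughput requirements still need to be checked separately.

```python
# Premium SSD provisioned sizes in GiB (subset; verify against current Azure docs).
PREMIUM_TIERS_GIB = {"P10": 128, "P15": 256, "P20": 512, "P30": 1024, "P40": 2048}


def smallest_fitting_tier(required_gib, tiers=PREMIUM_TIERS_GIB):
    """Return (name, size) of the smallest tier that holds required_gib, or None."""
    fitting = [(size, name) for name, size in tiers.items() if size >= required_gib]
    if not fitting:
        return None
    size, name = min(fitting)
    return name, size


def unused_fraction(provisioned_gib, used_gib):
    """Share of the billed capacity that holds no data."""
    return 1 - used_gib / provisioned_gib


# A P20 (512 GiB) holding 100 GiB of data: about 80% is billed but unused.
print(unused_fraction(512, 100))

# Smallest tier for 100 GiB plus 30% growth headroom (an assumed margin).
print(smallest_fitting_tier(100 * 1.3))  # ('P15', 256)
```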
Rightsizing Azure database services
Just as you rightsize IaaS resources, you also need to rightsize PaaS services such as Azure SQL databases. There are two primary types of relational Azure database services: Azure SQL Database and SQL Managed Instance. Azure SQL Database offers a virtual core (vCore)-based and a database transaction unit (DTU)-based purchasing model, with a serverless compute tier available within the vCore model, while SQL Managed Instance uses the vCore model. Within the DTU and vCore purchasing models, there are three service tiers, and Azure SQL Database can be deployed as single databases or as multiple databases in an elastic pool.
In most cases, organizations will only be using one type of database service, and other than DTU-based Azure SQL databases, it’s only necessary to provision compute and storage. Memory utilization will nearly always be close to 100%, IOPS limits are determined by the service tier, and workloads with unpredictable usage should be deployed in an elastic pool where they can benefit from the cost advantages of shared resources.
With DTU-based databases, there’s only one metric to provision: how many DTUs you need to make your database operate efficiently. But because of the way DTU-based databases increase in size (often doubling in capacity when you go up one size), these are the easiest to overprovision. Therefore, it’s important to continuously monitor the utilization of DTU-based databases in order to prevent unnecessary costs.
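The effect of doubling steps can be illustrated with a short sketch. The size ladder, the function names, and the peak value are hypothetical and chosen only to show the arithmetic; actual DTU sizes depend on the service tier.

```python
# Hypothetical size ladder in which each step doubles the previous one.
DTU_SIZES = [100, 200, 400, 800, 1600]


def pick_dtu_size(peak_dtu, sizes=DTU_SIZES):
    """Smallest size that covers the observed peak, or None if none does."""
    for size in sorted(sizes):
        if size >= peak_dtu:
            return size
    return None


def utilization_at(size, peak_dtu):
    """Peak utilization as a fraction of the provisioned DTUs."""
    return peak_dtu / size


peak = 210  # observed peak DTU consumption (made-up number)
size = pick_dtu_size(peak)
print(size, utilization_at(size, peak))  # 400 0.525
```

A peak just above one step forces the next size up, so even a correctly sized database can sit near 50% utilization at peak. That is why regular monitoring matters: a small drop in demand can make the next size down viable.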
Azure SQL Database is one of the most popular cloud-based, fully managed, relational database services. For in-depth recommendations on how to optimize your Azure SQL Database, we recommend you see our dedicated article on the subject, which explains how you can optimize and manage Azure SQL Database across the three areas of excellence for cloud management: cloud operations, cloud financial management, and cloud security and compliance.
Learn more: How to Optimize Azure SQL Database
Optimize your cloud environment at scale
As you can imagine, manually checking every instance individually to ensure it’s rightsized for optimal cost and performance can be an overwhelming and time-consuming process. This is especially true for organizations with a large, disparate cloud footprint, or for organizations using more than one cloud provider.
A cloud management platform can provide the metrics and recommendations you need to make accurate rightsizing decisions at scale and across cloud environments. CloudHealth customers leverage the CloudHealth platform’s rightsizing functionality to quickly identify underutilized infrastructure and get recommendations for downgrading or terminating assets.
Recommendations are provided based on utilization and performance metrics (e.g. CPU, memory, etc.) that can be ingested into the platform via APIs, integration partners (e.g. Datadog, New Relic, Wavefront), or the CloudHealth Agent. Once the metrics are available, you have the power to set performance thresholds specific to your business and you can take advantage of advanced filtering capabilities by dynamic business groupings, regions, and more.
To learn more about how CloudHealth’s rightsizing functionality makes it easy to quickly identify underutilized infrastructure for Azure cloud cost optimization, see our solution brief.
For more recommendations on how to lower your Azure bill, see our complete guide: 8 Best Practices for Reducing Spend in Azure