This is the final post in our Rightsizing Recipe for Success series. In the previous two installments we discussed consolidating data and creating functional business groups (e.g., function, environment, and owner), and then analyzing target data sets to optimize your infrastructure. At this point we’ve essentially baked the cake in the rightsizing recipe. Now it’s time to put on the icing and get ready for primetime!
We should be feeling pretty good about the current state and optimization of our resources. But there is one final question. How can we maintain an optimized environment?
To accomplish this we need a system to eliminate noise so we can focus resources on the parts of our infrastructure that need attention. This brings us to policies.
Policies allow us to set rules to flag instances that don’t meet the performance thresholds we’ve established. Since we do not have any app usage patterns to predict our infrastructure performance, we should implement three basic policies to get us started until we see how our app is used.
Underutilized Instances Policy
This policy helps keep costs in check if a seasonal period affects how much our app is used. We can create a policy that generates a notification when servers should be downgraded for cost efficiency. For example, if average CPU is less than 25% and max CPU is less than 40% for one week, then email product management.
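As a rough illustration of the rule above, here is a minimal Python sketch. It assumes we already have one week of CPU readings (in percent) per instance; the function name, the instance names, and the sample values are all hypothetical, and the thresholds are simply the ones from the example.

```python
from statistics import mean

# Thresholds taken from the example in the text; tune them for your own workload.
AVG_CPU_LIMIT = 25.0  # percent
MAX_CPU_LIMIT = 40.0  # percent


def is_underutilized(cpu_samples):
    """Return True when a week of CPU readings falls below both limits."""
    if not cpu_samples:
        return False  # no data: do not flag the instance
    return mean(cpu_samples) < AVG_CPU_LIMIT and max(cpu_samples) < MAX_CPU_LIMIT


# Hypothetical CPU readings (percent) collected over one week.
week_of_cpu = {
    "web-1": [12.0, 18.5, 22.0, 15.0, 30.0, 10.0, 14.5],
    "web-2": [20.0, 35.0, 62.0, 28.0, 19.0, 24.0, 21.0],
}

# Instances that product management should be emailed about.
to_review = [name for name, samples in week_of_cpu.items() if is_underutilized(samples)]
print(to_review)
```

Here only web-1 is flagged: web-2 has a spike to 62%, so it fails the max CPU condition even though most of its readings are low.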
Overutilized Instances Policy
If our hot new app goes viral, we want to be ready. Consider creating a policy to flag overutilized instances so we can upgrade them and keep everything running smoothly. The last thing we want is performance issues when we’re in the spotlight! To receive notifications for overutilized instances, use a template similar to the one above. For example, we would send a critical notification to product management when the average CPU, memory, or disk utilization is more than 80% and the max is more than 95% for one week.
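The same idea extends to several metrics at once. The sketch below is one possible reading of the rule, under the assumption that an instance is flagged as soon as any single metric exceeds both thresholds; the function name and sample data are hypothetical.

```python
from statistics import mean

# Thresholds from the example in the text.
AVG_LIMIT = 80.0  # percent
MAX_LIMIT = 95.0  # percent


def overutilized_metrics(samples_by_metric):
    """Return the metrics whose weekly average and max both exceed the limits."""
    return [
        metric
        for metric, samples in samples_by_metric.items()
        if samples and mean(samples) > AVG_LIMIT and max(samples) > MAX_LIMIT
    ]


# Hypothetical utilization readings (percent) for one instance over one week.
instance = {
    "cpu": [85.0, 90.0, 97.0, 88.0],
    "memory": [60.0, 65.0, 70.0, 62.0],
    "disk": [40.0, 42.0, 41.0, 43.0],
}

hot = overutilized_metrics(instance)
if hot:
    # Assumption: one hot metric is enough to raise a critical notification.
    print("CRITICAL: overutilized on " + ", ".join(hot))
```

Returning the list of offending metrics, rather than a plain yes or no, lets the notification say which resource needs the upgrade.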
Terminating Zombie Instances Policy
Finally, to round out the initial set of policies, create a policy that identifies zombie instances. This can save the tech team from overprovisioning resources and burning through the budget! Design a simple policy that notifies product management when the average of the metric you are measuring is, say, less than 5% and the max is less than 20% for one week. If we’re confident in the thresholds, we can also automate instance termination with a workflow.
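A zombie check can follow the same pattern, with an explicit switch for the optional automation. This is only a sketch: the function names and fleet data are hypothetical, and it just plans an action per instance rather than calling any cloud provider API.

```python
from statistics import mean

# Thresholds from the example in the text.
ZOMBIE_AVG_LIMIT = 5.0   # percent
ZOMBIE_MAX_LIMIT = 20.0  # percent


def is_zombie(samples):
    """Return True when a week of readings stays below both zombie limits."""
    return bool(samples) and mean(samples) < ZOMBIE_AVG_LIMIT and max(samples) < ZOMBIE_MAX_LIMIT


def plan_actions(metrics_by_instance, auto_terminate=False):
    """Map each zombie instance to 'terminate' or 'notify'; healthy instances are omitted."""
    action = "terminate" if auto_terminate else "notify"
    return {
        name: action
        for name, samples in metrics_by_instance.items()
        if is_zombie(samples)
    }


# Hypothetical readings (percent) of the measured metric over one week.
fleet = {
    "batch-old": [1.0, 2.0, 0.5, 3.0],
    "api-1": [35.0, 50.0, 42.0, 61.0],
}

print(plan_actions(fleet))                       # notify only (the default)
print(plan_actions(fleet, auto_terminate=True))  # opt in once thresholds are trusted
```

Keeping termination behind a flag that defaults to off mirrors the advice above: start with notifications, and automate only once the thresholds have proven reliable.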
This five-step rightsizing recipe can help ensure our hot new app is running efficiently and that a process for ongoing optimization has been established. Now we’re ready for primetime!
Final Thoughts
If managing your growing infrastructure in Excel feels overwhelming, a solution such as CloudHealth can help. CloudHealth pulls your cost, usage, and performance metrics, consolidates them into a single-pane-of-glass view, and tracks assets at the resource level so you can easily identify functional business groups for your infrastructure. Additionally, the custom scoring algorithm reduces your evaluation time by providing concise recommendations based on min, max, and average performance scores. Custom rightsizing reports can be generated and sent to individual users and managers to optimize usage.
To accelerate optimization, you can take action to terminate, downsize or upgrade instances directly from the platform. Lastly, the cloud governance capabilities provide a seamless way to generate policies and notifications for any of the above scenarios.
Following our rightsizing steps for success makes it easier to consolidate data, clean up the environment, analyze performance, optimize usage, and save money. Once historical data exists for a particular environment, you can create policies with CloudHealth to be notified of changes in your cloud environment as it evolves.