One of the questions I am asked most often in my duties as a VMware Technical Account Manager is: how do I guarantee performance for a physical system that is being virtualized? On the face of it there seem to be numerous answers and approaches that can be used to achieve this goal. Before we can choose one, however, we need to understand exactly what the question is.
This seems to be the fundamental issue: the question is often not well understood and is either made too complex or trivialised. When it is made too complex, we can never find an approach that will satisfy it; when it is trivialised, we run the risk of missing the requirements that the task actually needs to meet.
So where do we start? There are many excellent web resources on the technical aspects of this, but I am going to focus on the logical steps that I would take to achieve the desired outcome.
Step 1: Understand the current environment that the system resides in. This should include everything from the physical location and connectivity to any political issues, such as an unwillingness to virtualize in the past. Leave no stone unturned in understanding what you are dealing with, no matter how simple it seems.
Step 2: Interview the system owner, the business owner and anyone else who has a stake in the system, including users if possible, and understand all of their requirements in terms of performance expectations. At this stage I do not concern myself with the performance assumptions made by the people I have spoken to, as many of these are based on their pre-existing notion of the system and its operation, be it good or bad.
Step 3: Baseline existing performance: This is the most important step, and without it there is really no way to understand what we are required to deliver. I disregard any preconceived ideas of performance from my previous interviews and focus only on the existing service and how it is actually performing. This is a fairly lengthy stage with a number of sub-components, as follows:
- Identify the physical system's performance characteristics in terms of internal system measures. These include CPU, memory, disk, network and any other relevant items that you are able to collect. I recommend collecting this information for at least 4 weeks to get a good picture of these performance variables across many different load periods.
- Identify the performance of the system as users perceive it. This is the harder part, as it requires you to find the best way to measure the performance that a user will experience. There are several approaches here, but whatever you use, make sure it is repeatable and consistent so that the results are not skewed. You may want to use a synthetic transaction test system or a script-based timed test. These tests of course need to be run over a good period of time, during both peaks and troughs in the environment, to again ensure you do not miss any peak load that could cause issues if not catered for.
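As a rough illustration of a script-based timed test, here is a minimal Python sketch. It is not a prescribed tool, just one way to make a user-facing measurement repeatable. The names `time_transaction`, `summarize` and `sample_transaction` are my own placeholders, and `sample_transaction` simply sleeps where a real test would perform an actual user action such as a login or a report request.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def time_transaction(transaction: Callable[[], None], runs: int = 20) -> List[float]:
    """Run the same transaction repeatedly and return each elapsed time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        transaction()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce raw timings to a few figures that are easy to record and compare."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile: the smallest sample that at least
    # 95% of all samples are less than or equal to.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def sample_transaction() -> None:
    # Hypothetical stand-in for a real user action (assumption for illustration).
    time.sleep(0.01)


if __name__ == "__main__":
    print(summarize(time_transaction(sample_transaction, runs=10)))
```

Scheduling the same script at fixed intervals across the whole collection period, and keeping every summary with its timestamp, is what gives you figures for both peaks and troughs rather than a single snapshot.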
Step 4: Document all of the baseline results and ensure that they are agreed upon with the business owner of the system, as these will be used later on to judge how we are doing.
Step 5: Identify the requirements for the system in the new virtual environment, based on the documented performance criteria. This might be a bit tricky, as you are still going to perform baseline tests in the new environment, so this will be just an initial take on what is required. Whether you set the requirements too high or too low here should not be of concern, as the next step should take care of getting this right.
Step 6: Monitor the performance of the new virtual environment using the same methods as in step 3. Again, this needs to be done using both methods to ensure that all performance criteria are being fulfilled. At this stage, however, we might find that system owners and users have higher or lower expectations of the new system. This can be a cause for concern, but I always suggest that we set these subjective views on performance aside and focus on the technical work of baselining performance. The goal here should be to at least reach parity with the existing system, unless we already know that a very different performance characteristic is being demanded. We will take care of that in the next step.
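To make "parity with the existing system" something you can check rather than argue about, a comparison along the following lines can help. This is only a sketch under my own assumptions: the function name `parity_report`, the metric names and the 10% tolerance are all illustrative, and it assumes that a lower value is better for every metric (as with response times).

```python
from typing import Dict, List


def parity_report(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.10,
) -> List[str]:
    """List every metric where the new environment is worse than baseline.

    Assumes lower is better for each metric (for example response times).
    A metric counts as a regression if it exceeds baseline by more than
    the given tolerance, or if it was not measured at all.
    """
    regressions = []
    for name, base_value in baseline.items():
        if name not in candidate:
            regressions.append(f"{name}: not measured in new environment")
            continue
        new_value = candidate[name]
        limit = base_value * (1 + tolerance)
        if new_value > limit:
            regressions.append(
                f"{name}: {new_value:.3f} exceeds baseline "
                f"{base_value:.3f} by more than {tolerance:.0%}"
            )
    return regressions


if __name__ == "__main__":
    # Illustrative figures only, not measurements from a real system.
    physical = {"median_seconds": 0.200, "p95_seconds": 0.500}
    virtual = {"median_seconds": 0.210, "p95_seconds": 0.620}
    for line in parity_report(physical, virtual):
        print(line)
```

The tolerance is a business decision, not a technical one, so it belongs in the agreed baseline document alongside the figures themselves.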
Step 7: Identify new requirements: Based on the system we have created in step 6, we now have the opportunity to either increase or decrease the system's performance. This is a very subjective part of the new system, and I think it should require documented evidence of existing performance and the rationale for the new performance characteristic. Often this is due to an aging system, or to a system that was underutilized and is now expected to serve more users.
Step 8: Make changes to the system to fit the new requirements. This will be quite iterative and will require lots of testing until the system satisfies the existing or new requirements. This is the stage where we need sign-off, once all owners are comfortable with what we have achieved. As part of that sign-off we also need to ensure that performance has been baselined with a tool that can be used again in the future, should owners complain of performance issues.
We very rarely hear complaints about good performance, but of course we regularly hear people saying that the new environment is much slower than the old system for whatever reason, be it factual or perception-based. By following these steps we should be in a good position to check and, hopefully, refute this should it occur at any time in the future life of the system. Taking the time to do this will save you endless issues when the inevitable complaint comes in about the performance of the system.
There is of course much more to this. I have only touched on what we should do if there will be an increase in system usage. This can be a major issue, and one that requires a blog post all of its own, as it changes our approach from relying on a baseline of the existing system to trying to ascertain the growth pattern and the requirements that follow from it. It is a very tricky situation and one that I hope to cover in the future.
In the meantime, don't treat any system migration project as simply business as usual. Take the time to baseline and manage the performance of the new virtualized system to ensure that you have delighted customers.