by: Mike Szafranski
With task automation, the calculation is simple: it is worth taking 2 hours to automate a 10-minute task if you perform that task more than 12 times. Even after factoring in the fixed and variable costs of the automation solution, the math is straightforward.
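The break-even arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a costing model: the function name `break_even_runs` is hypothetical, and it deliberately ignores the fixed and variable costs of the automation solution as well as the time an automated run itself takes.

```python
import math


def break_even_runs(automation_minutes: float, task_minutes: float) -> int:
    """Return the smallest number of runs at which the manual effort
    strictly exceeds the one-time effort spent on automation.

    Simplifying assumption: an automated run costs nothing.
    """
    if automation_minutes < 0 or task_minutes <= 0:
        raise ValueError("automation_minutes must be >= 0 and task_minutes > 0")
    # Manual cost after n runs is n * task_minutes; we need it to exceed
    # automation_minutes, so n must be greater than the ratio.
    return math.floor(automation_minutes / task_minutes) + 1


# The example from the text: 2 hours of automation work, a 10-minute task.
runs = break_even_runs(automation_minutes=120, task_minutes=10)
print(f"Automation pays off from run {runs} onward")
```

With these inputs the ratio is exactly 12, so automation pays off once the task is performed more than 12 times, matching the figure in the text.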
But the justification for automating more complex processes composed of dozens of ‘10 minute tasks’ completed by different actors – including the inevitable scheduling and wait time between each task – is a bit more complex. Nonetheless, an approach exists.
You can find it laid out in Kim, Behr, and Spafford’s modern classic of business fiction, The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win [IT Revolution Press, 2013], in which the authors show how the principles of lean manufacturing are directly applicable to IT process automation.
So what lessons do we learn when building a case for process automation by applying lean manufacturing principles to IT Ops? Let’s take a look.
Simple Steps Build the Business Case
First, you need to break the process you’re interested in into its constituent parts.
Step 1 – Document Stages in the Process and Elapsed Time. Through interviews, identify major process stages and then document the clock time elapsed for each. Note: use hard data for elapsed time if possible. People involved in the process rarely have an accurate perception of how long things really take. Look at process artifacts such as emails, time stamps on saved documents, configuration files, provisioning, or testing tool log files to measure real elapsed time.
Step 2 – Document Tasks and Actors. Summarize what gets accomplished at each stage and, most importantly, detail all the tasks and record which teams perform them. If a task involves multiple actors working independently with a handoff, that task should be broken down into sub-tasks.
Step 3 – Document FTE Time. Record the work effort required for each task. We’ll call that the Full Time Equivalent (FTE). This is the time it takes to do the actual task work, assuming no interruptions, irregularities, or rework.
Step 4 – Document Wait Time. Understanding wait time is critical to building a case for process automation. If actors are busy, or if there are handoffs between actors, then elapsed time is often multiple times longer than FTE time. This is because at each handoff, the task must sit in queue until a resource is ready to process the task.
After taking these steps, you can summarize the findings in a chart similar to this.
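As a rough sketch, the data gathered in Steps 1 through 4 could be captured in a small structure like the one below. The `Task` class, the `summarize` helper, and every stage name and number are hypothetical placeholders, not data from a real process. The sketch also assumes tasks run one after another, so elapsed times can simply be added.

```python
from dataclasses import dataclass


@dataclass
class Task:
    stage: str            # Step 1: process stage the task belongs to
    actor: str            # Step 2: team that performs the task
    fte_hours: float      # Step 3: hands-on work effort
    elapsed_hours: float  # Step 1: measured clock time

    @property
    def wait_hours(self) -> float:
        # Step 4: clock time not spent on hands-on work
        return self.elapsed_hours - self.fte_hours


def summarize(tasks: list[Task]) -> dict[str, float]:
    """Total the work effort, clock time, and wait time across tasks."""
    total_fte = sum(t.fte_hours for t in tasks)
    total_elapsed = sum(t.elapsed_hours for t in tasks)
    share = total_fte / total_elapsed if total_elapsed else 0.0
    return {
        "fte_hours": total_fte,
        "elapsed_hours": total_elapsed,
        "wait_hours": total_elapsed - total_fte,
        "fte_share": share,
    }


# Hypothetical sample data for illustration only.
tasks = [
    Task("Request environment", "dev", fte_hours=0.5, elapsed_hours=4.0),
    Task("Provision VM", "VM", fte_hours=1.0, elapsed_hours=16.0),
    Task("Validate firewall", "network", fte_hours=1.5, elapsed_hours=20.0),
]

summary = summarize(tasks)
for t in tasks:
    print(f"{t.stage:<22}{t.actor:<10}{t.fte_hours:>5.1f}h work {t.wait_hours:>6.1f}h wait")
print(f"Work effort is {summary['fte_share']:.1%} of clock time")
```

Keeping work effort and elapsed time as separate fields makes the wait time fall out as a simple difference, which is the number the business case depends on.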
In lean manufacturing, the concept of wait time or queue time has a mathematical formula [see chapter 23 of The Phoenix Project]. The definition is: wait time = (% busy) / (% idle).
The formula, of course, offers hard proof of what you already knew – that the busier you are, the longer it takes to get new work done. With multiple actors on a task, each can contribute to wait time, with the amount they contribute depending on how busy they are.
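To see how quickly wait time grows, here is a minimal sketch. It assumes the formula in question is the one given in The Phoenix Project, where wait time is the percentage of time a resource is busy divided by the percentage of time it is idle. The function name `wait_time_factor` is hypothetical, and the result is a relative multiplier rather than a number of hours.

```python
def wait_time_factor(utilization: float) -> float:
    """Relative wait time for a resource, as busy fraction / idle fraction.

    `utilization` is the busy fraction, between 0.0 and 1.0 (exclusive of 1.0,
    because a fully busy resource never gets to new work).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0.0, 1.0)")
    return utilization / (1.0 - utilization)


# How the factor grows as a team gets busier.
for percent_busy in (50, 80, 90, 95):
    factor = wait_time_factor(percent_busy / 100)
    print(f"{percent_busy}% busy -> wait factor of about {factor:.0f}")
```

Under this assumption, a team that is 90% busy makes new work wait roughly nine times longer than a team that is 50% busy. When several teams hand work to each other, each team's factor adds to the total.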
In the example below, there are five separate teams (security, network, dev, QA and VM) involved in the Validate Firewall step in the flow. Each team is also busy with other tasks.
Figure 2. In a manually constructed environment, the network settings, firewall rules, and application ports need to be validated. More often than not, they need to be adjusted due to port conflicts or firewall rules. Wait times correlate strongly with % utilization.
As you can see, the time spent by FTEs is 5.5 hours, which is only around 15% of the clock time. Clearly, with complex tasks, FTE is only a part of the story.
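The two numbers quoted here also imply the approximate clock time for the step. The short calculation below works backward from them. The function name is hypothetical, and because the 15% figure is itself approximate, the result is only a rough estimate.

```python
def implied_clock_hours(fte_hours: float, fte_share: float) -> float:
    """Clock time implied by hands-on hours and their share of clock time."""
    if not 0.0 < fte_share <= 1.0:
        raise ValueError("fte_share must be in the range (0.0, 1.0]")
    return fte_hours / fte_share


fte_hours = 5.5   # hands-on time quoted in the text
fte_share = 0.15  # "around 15%" of clock time, so treat the result as approximate

clock_hours = implied_clock_hours(fte_hours, fte_share)
wait_hours = clock_hours - fte_hours
print(f"Implied clock time: about {clock_hours:.0f} hours")
print(f"Implied wait time:  about {wait_hours:.0f} hours")
```

On these figures, roughly 37 hours of clock time go to a step that needs only 5.5 hours of actual work. The rest is time spent waiting in queues.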
Step 5 – Account for Unplanned Work. Unplanned work occurs when errors are found, requiring a task from an earlier step in the process to be reworked or fixed.
In complex automation, unplanned work is another reality that complicates the process and increases FTE time. It also dramatically impacts clock time – in two ways. First, there’s the direct impact of additional time spent waiting for the handoff back upstream in the process. Second, and even more dramatic, is the opportunity cost. Planned work tasks need to stop while the process actor sets things aside and addresses the unplanned work. Unplanned work can thus have a multiplier effect, causing cascading delays up and down the process flow.
One aim of automation, of course, is to reduce unplanned work – and that reduction can also be calculated, further adding to the business case for process automation. Indeed, studies have shown that, currently, unplanned work consumes 17% of a typical IT budget.
Process Automation Can Offer More Than Cost Reduction
But there’s potentially even more to the story than a complete picture of IT work and detailed accounting of reduced work effort and timesavings. The full impact of process automation can include:
- Improved throughput
- Enabling rapid prototyping
- Higher quality
- Improved ability to respond to business needs
The cumulative impact of these can be substantial. Indeed, it can easily exceed the total impact of direct cost reductions.
Step 6 – Estimate total benefit to business functions. If calculating the value of reducing FTE, wait times, and unplanned work is relatively straightforward, figuring the full business impact of reducing overall calendar time for a critical process (from 4 weeks to 36 hours, say) requires more than a direct cost reduction calculation. It’s worth doing, though, because the value derived from better quality, shorter development times, etc., can substantially exceed the value of FTE hours saved through automation (see figure 3).
Figure 3. The secondary impacts of automating processes and increasing agility and consistency can be much larger than the value of the FTE hours saved.
You do it by asking IT customers to detail the benefits they see when processes are improved. There are many IT KPIs that can help here, such as the number of help desk tickets received in a specific period, or the number and length of Severity 1 IT issues.
We used this method at VMware when we automated dev/test provisioning and improved the efficiency of 600 developers by 20%. We achieved a direct cost reduction related to time and effort saved. But we found an even bigger impact, even if it was harder to quantify, in improved throughput, in always being able to say, “Yes” to business requests, and in enabling rapid prototyping.
Lessons Learned
With these steps, you can capture major process stages, tasks, actors, calendar time, work effort, and points of unplanned work, quantifying the business value of automating a process end-to-end – and making your case for end-to-end process automation all the stronger.
Key takeaways:
- It’s possible to make a business case for automating end-to-end IT processes;
- You can do this by applying concepts from lean manufacturing;
- The concepts of wait time and unplanned work are central;
- Efficiency-driven cost reduction is only part of the equation, however;
- To quantify the full value of agility, work with IT customers to gauge improvements in KPIs that reflect improved business outcomes.