
Tag Archives: task automation

Understanding Process Automation: Lean Manufacturing Lessons Applied to IT

by: Mike Szafranski

With task automation, it is pretty simple to calculate that it is worth taking 2 hours to automate a 10-minute task if you perform that task more than 12 times. Even considering the fixed and variable costs of the automation solution, the math is pretty straightforward.
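
The break-even arithmetic above can be sketched in a few lines of Python. This is a minimal illustration that assumes, as the opening example implicitly does, that an automated run costs no human time; the function name `break_even_runs` is made up for this sketch.

```python
def break_even_runs(automation_minutes: float, manual_minutes_per_run: float) -> float:
    """Number of runs at which time spent automating equals manual time saved.

    Assumption: an automated run costs no human time (a simplification).
    """
    if manual_minutes_per_run <= 0:
        raise ValueError("manual_minutes_per_run must be positive")
    return automation_minutes / manual_minutes_per_run


# 2 hours of automation effort versus a 10-minute manual task
runs = break_even_runs(automation_minutes=120, manual_minutes_per_run=10)
print(runs)  # 12.0
```

Any run beyond the twelfth is time saved, which is why the simple case is easy to justify.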

But the justification for automating more complex processes composed of dozens of ‘10 minute tasks’ completed by different actors – including the inevitable scheduling and wait time between each task – is a bit more complex. Nonetheless, an approach exists.

You can find it laid out in Kim, Behr, and Spafford’s modern classic of business fiction, The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win [IT Revolution Press, 2013], in which the authors show how the principles of lean manufacturing are directly applicable to IT process automation.

So what lessons do we learn when building a case for process automation by applying lean manufacturing principles to IT Ops? Let’s take a look.

Simple Steps Build the Business Case

First, you need to break the process you’re interested in into its constituent parts.

Step 1 – Document Stages in the Process and Elapsed Time. Through interviews, identify major process stages and then document the clock time elapsed for each. Note, use hard data for elapsed time if possible. People involved in the process rarely have an accurate perception of how long things really take. Look at process artifacts such as emails, time stamps on saved documents, configuration files, provisioning, or testing tool log files to measure real elapsed time.

Step 2 – Document Tasks and Actors. Summarize what gets accomplished at each stage and, most importantly, detail all the tasks and record which teams perform them. If a task involves multiple actors working independently with a handoff, that task should be broken down into sub-tasks.

Step 3 – Document FTE Time. Record the work effort required for each task. We’ll call that the Full Time Equivalent (FTE). This is the time it takes to do the actual task work, assuming no interruptions, irregularities, or rework.

Step 4 – Document Wait Time. Understanding wait time is critical to building a case for process automation. If actors are busy, or if there are handoffs between actors, then elapsed time is often multiple times longer than FTE time. This is because at each handoff, the task must sit in queue until a resource is ready to process the task.

After taking these steps, you can summarize in a chart similar to this.
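
As a rough illustration of what Steps 1 through 4 capture, the sketch below records each task with its stage, actor, FTE time, and wait time, then totals them. The class name `Task`, the helper `summarize`, and every number are hypothetical; the sketch also assumes tasks run one after another, so elapsed time is simply work plus waiting.

```python
from dataclasses import dataclass


@dataclass
class Task:
    stage: str         # Step 1: process stage
    name: str          # Step 2: what gets done
    actor: str         # Step 2: team that performs it
    fte_hours: float   # Step 3: hands-on work effort
    wait_hours: float  # Step 4: time in queue before the work starts

    @property
    def elapsed_hours(self) -> float:
        # Assumption: tasks are sequential, so elapsed = work + waiting
        return self.fte_hours + self.wait_hours


def summarize(tasks):
    """Return total FTE hours, total elapsed hours, and FTE share of elapsed."""
    fte = sum(t.fte_hours for t in tasks)
    elapsed = sum(t.elapsed_hours for t in tasks)
    share = fte / elapsed if elapsed else 0.0
    return fte, elapsed, share


# Hypothetical numbers, for illustration only
tasks = [
    Task("Provision environment", "Review firewall rules", "security", 1.0, 8.0),
    Task("Provision environment", "Check port assignments", "network", 1.5, 6.0),
    Task("Provision environment", "Confirm application ports", "dev", 0.5, 3.0),
]
fte, elapsed, share = summarize(tasks)
print(f"FTE {fte:.1f} h of {elapsed:.1f} h elapsed ({share:.0%})")
```

Keeping wait time as its own field makes it obvious how little of the elapsed time is actual work.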

In Lean Manufacturing, the concept of wait time or queue time has a mathematical formula [see chapter 23 of The Phoenix Project]. The definition is: wait time = (% of time busy) / (% of time idle).

The formula, of course, offers hard proof of what you already knew – that the busier you are, the longer it takes to get new work done. With multiple actors on a task, each can contribute to wait time, with the amount they contribute depending on how busy they are.
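
To see how sharply wait time grows with workload, here is a small sketch. It assumes the formula in question is the one usually quoted from The Phoenix Project, wait time = % busy / % idle; the function name `wait_time_factor` is invented for this example, and the result is a relative factor, not hours.

```python
def wait_time_factor(utilization: float) -> float:
    """Relative wait time for a resource: busy fraction / idle fraction."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


# Compare a half-busy resource with increasingly loaded ones
for pct in (50, 80, 90, 95):
    factor = wait_time_factor(pct / 100)
    print(f"{pct}% busy -> {factor:.1f} units of wait")
```

Going from 50% to 90% busy multiplies the wait by roughly nine, which is why a chain of busy teams stretches clock time so far beyond work effort.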

In the example below, there are five separate teams (security, network, dev, QA and VM) involved in the Validate Firewall step in the flow. Each team is also busy with other tasks. 

Figure 2. In a manually constructed environment, the network settings, firewall rules, and application ports need to be validated. More often than not, they need to be adjusted due to port conflicts or firewall rules. Wait times correlate strongly with % utilization.

As you can see, the time spent by FTEs is 5.5 hours, which is only around 15% of the clock time. Clearly, with complex tasks, FTE is only a part of the story.
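
The two figures quoted above imply the rest of the picture. A quick back-of-the-envelope check, treating "around 15%" as exactly 0.15 (an assumption, since the underlying chart is not reproduced here):

```python
fte_hours = 5.5    # hands-on time quoted in the example
fte_share = 0.15   # "around 15%" of clock time, taken as exact for this sketch

clock_hours = fte_hours / fte_share        # implied total elapsed time
waiting_hours = clock_hours - fte_hours    # implied time spent in queues

print(f"clock ~{clock_hours:.1f} h, waiting ~{waiting_hours:.1f} h")
```

Under that assumption, roughly 31 of about 37 elapsed hours are spent waiting rather than working.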

Step 5 – Account for Unplanned Work. Unplanned work occurs when errors are found, requiring a task from an earlier step in the process to be reworked or fixed.

In complex automation, unplanned work is another reality that complicates the process and increases FTE time. It also dramatically impacts clock time – in two ways. First, there’s the direct impact of additional time spent waiting for the handoff back upstream in the process. Second, and even more dramatic, is the opportunity cost. Planned work tasks need to stop while the process actor sets things aside and addresses the unplanned work. Unplanned work can thus have a multiplier effect, causing cascading delays up and down the process flow.

One aim of automation, of course, is to reduce unplanned work – and that reduction can also be calculated, further adding to the business case for process automation. Indeed, studies have shown that, currently, unplanned work consumes 17% of a typical IT budget.

Process Automation Can Offer More Than Cost Reduction

But there’s potentially even more to the story than a complete picture of IT work and detailed accounting of reduced work effort and timesavings. The full impact of process automation can include:

  • Improved throughput
  • Enabling rapid prototyping
  • Higher quality
  • Improved ability to respond to business needs

The cumulative impact of these can be substantial. Indeed, it can easily exceed the total impact of direct cost reductions.

Step 6 – Estimate total benefit to business functions. If calculating the value of reducing FTE, wait times, and unplanned work is relatively straightforward, figuring the full business impact of reducing overall calendar time for a critical process (from 4 weeks to 36 hours, say) requires more than a direct cost reduction calculation. It’s worth doing, though, because the value derived from better quality, shorter development times, etc., can substantially exceed the value of FTE hours saved through automation (see figure 3).

Figure 3. The secondary impacts of automating processes and increasing agility and consistency can be much larger than the value of the FTE hours saved.

You do it by asking IT customers to detail the benefits they see when processes are improved. There are many IT KPIs that can help here, such as the number of help desk tickets received in a specific period, or the number and length of Severity 1 IT issues.

We used this method at VMware when we automated dev/test provisioning and improved the efficiency of 600 developers by 20%. We achieved a direct cost reduction related to time and effort saved. But we found an even bigger impact, even if it was harder to quantify, in improved throughput, in always being able to say, “Yes” to business requests, and in enabling rapid prototyping.

Lessons Learned

With these steps, you can capture major process stages, tasks, actors, calendar time, work effort, and points of unplanned work, quantifying the business value of automating a process end-to-end – and making your case for end-to-end process automation all the stronger.

Key takeaways:

  • It’s possible to make a business case for automating end-to-end IT processes;
  • You can do this by applying concepts from lean manufacturing;
  • The concepts of wait time and unplanned work are central;
  • Efficiency-driven cost reduction is only part of the equation, however;
  • To quantify the full value of agility, work with IT customers to gauge improvements in KPIs that reflect improved business outcomes.

Follow @VMwareCloudOps on Twitter for future updates, and join the conversation by using the #CloudOps and #SDDC hashtags on Twitter.

Refresher Course in Automation Economics

It’s a key question in developing a private or hybrid cloud strategy: “What processes should we automate?”

There are plenty of candidates: provisioning; resource scaling; workload movement. And what about automating responses to event storms? Incidents? Performance issues? Disaster recovery?

To answer the question, though, you need to first establish what you’re looking to gain through automation. There are two basic strategic approaches to automation, each with specific value propositions:

  • task automation – where the proposition is more, better, faster
  • service automation – where you’re looking to standardize and scale

In my last post, I looked at how the automation strategy determines your HR needs.

In this post, I’ll highlight a simple economic model that can be used to cost justify task automation decisions. Next time, I’ll refine the math to help analyze decisions about what to automate when pursuing a service automation strategy.

The Cost Justification for Task Automation – the Tipping Point

From a cost perspective, it makes sense to automate IT tasks if:

  • the execution of the automated task has a lower cost than the execution of a manual version of the task.
  • the automated process can be run a large number of times to spread the cost of development, testing, and ongoing maintenance of the automation capability.

Brown and Hellerstein at the IBM Thomas J. Watson Research Center expressed the idea in a simple model.[1] It compares the fixed and variable costs of a manual process with those of an automated version of the same process. The cost calculation is based on the variable N, which represents the number of times the automated process will execute.

IT organizations typically automate existing manual processes. So we consider the fixed cost of developing the manual process as part of the automated process costs.

With these two equations, we can solve for an automation tipping point Nt. Nt, then, is the number of times a process is executed at which it becomes cost effective to automate the process.
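
The two equations are not reproduced in this text, so the sketch below reconstructs them from the surrounding description and should be read as an assumption: manual cost = manual fixed + N × manual variable, and automated cost = manual fixed + automation fixed + N × automation variable. Setting the two equal gives Nt = automation fixed / (manual variable − automation variable). Function names and cost figures are hypothetical.

```python
import math


def manual_cost(n, manual_fixed, manual_variable):
    """Total cost of running the manual process n times."""
    return manual_fixed + n * manual_variable


def automated_cost(n, manual_fixed, automation_fixed, automation_variable):
    """Total cost of running the automated process n times.

    The manual process had to be developed first, so its fixed cost is included.
    """
    return manual_fixed + automation_fixed + n * automation_variable


def tipping_point(automation_fixed, manual_variable, automation_variable):
    """Executions N_t at which automated and manual total costs are equal.

    Returns math.inf when an automated run is no cheaper than a manual one,
    because automation then never pays back.
    """
    saving_per_run = manual_variable - automation_variable
    if saving_per_run <= 0:
        return math.inf
    return automation_fixed / saving_per_run


# Hypothetical cost figures in arbitrary currency units
n_t = tipping_point(automation_fixed=6000, manual_variable=50, automation_variable=10)
print(n_t)  # 150.0
```

Note that the manual fixed cost appears on both sides and cancels out, so it has no effect on the tipping point.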

Changing the task automation tipping point

Now, what actions could we take that would shift the tipping point? We might:

1. Reduce automation fixed costs. If we can drive down automation fixed costs, automation becomes economically attractive at a lower number of process executions.

Automation fixed costs include purchasing and maintaining the automation platform, as well as standardizing process inputs, ensuring the process is repeatable, developing policies, coding automation workflows based on those policies, testing each automation workflow, documenting errors, and establishing exception-handling procedures. We also need to add in ongoing maintenance and management of automation routines that may change as IT processes evolve. If any of this work can become highly standardized, Nt will be lower, which will in turn increase the scope of what can be further automated.

2. Minimize automation variable costs. Reducing automation variable costs also makes automation attractive at a lower number of executions.

Variable costs include both the cost of each automation execution and the cost of managing exceptions that typically are triaged via manual resolution processes. With a very large number of process executions, the variable cost of each incremental automated process execution would essentially be zero except for costs related to handling exceptions such as errors and process failures. Standardizing infrastructure and component configurations, and thus management processes, reduces exceptions and lowers the tipping point.

3. Pick the right tasks.  Automating manual processes with high cost of execution is an obvious win. The slower and harder the manual task, the higher the cost of each execution, and the lower the tipping point for automating the process.
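
The effect of each of the three levers can be compared side by side. The sketch assumes the tipping point takes the form Nt = automation fixed / (manual variable − automation variable), which is inferred from the text rather than quoted from it, and that a manual run always costs more than an automated one. All figures are hypothetical.

```python
def tipping_point(automation_fixed, manual_variable, automation_variable):
    # Assumes manual_variable > automation_variable, so the result is positive
    return automation_fixed / (manual_variable - automation_variable)


baseline = dict(automation_fixed=6000, manual_variable=50, automation_variable=10)

# One scenario per lever, each changing a single input of the baseline
scenarios = {
    "baseline": baseline,
    "1. lower automation fixed cost": {**baseline, "automation_fixed": 3000},
    "2. lower automation variable cost": {**baseline, "automation_variable": 2},
    "3. costlier manual task": {**baseline, "manual_variable": 90},
}

results = {name: tipping_point(**params) for name, params in scenarios.items()}
for name, n_t in results.items():
    print(f"{name}: N_t = {n_t:.0f}")
```

With these made-up numbers, halving the fixed cost and picking a costlier manual task both halve the tipping point, while trimming an already small automation variable cost moves it far less.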

Benefits other than cost reduction

Automation offers benefits beyond cost reduction, of course. In the cloud era, demand for agility and service quality are also driving changes in the delivery and consumption of IT services.

Automation for agility 

Agility is key when it comes to quickly provisioning a development or a test environment, rolling it into production, avoiding the need for spec hardware, accelerating time to market and reducing non-development work. Typically, 10-15% of total development team effort is spent just configuring the development environment and its attendant resources. Automation can make big inroads here. Note, too, that agility and speed-to-market factors, which generally have a revenue-related value driver, typically aren’t included in task automation tipping point calculations.

Automation for service quality

Automation promises greater consistency of execution and reduced human error, quality-related benefits that also aren’t factored in the calculations above. Downtime has a cost, after all. Deploying people with different skills and variable (and often ad hoc) work procedures at different datacenter facilities, for example, directly impacts service quality. Automated work procedures reduce both human error and downtime.

Back to the math

Really, we should add the quality-related costs of error and inconsistency to our manual variable process costs, since they mirror how automation error recovery costs are calculated.

To account for the manual process quality costs, the tipping point calculation could replace “Manual variable costs” with “(Manual variable costs + Manual quality costs)” in the denominator.

Doing that would further lower the tipping point number that justifies automation.
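
A minimal sketch of that adjustment, assuming the denominator of the tipping point is (manual variable costs − automation variable costs) before the change. The function name and all numbers are hypothetical.

```python
def tipping_point_with_quality(automation_fixed, manual_variable,
                               manual_quality, automation_variable):
    """Tipping point with manual quality costs added to manual variable costs.

    Returns infinity when an automated run is no cheaper than a manual one.
    """
    saving_per_run = (manual_variable + manual_quality) - automation_variable
    if saving_per_run <= 0:
        return float("inf")
    return automation_fixed / saving_per_run


# Hypothetical figures: the same task, without and with quality costs counted
without_quality = tipping_point_with_quality(6000, 50, 0, 10)
with_quality = tipping_point_with_quality(6000, 50, 20, 10)
print(without_quality, with_quality)  # 150.0 100.0
```

Counting a quality cost of 20 per manual run brings the tipping point down from 150 executions to 100 in this example.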

Here’s how I sum up these concepts applied to a task automation environment:

  • If a manual task is easy, it is difficult to justify automating it because the tipping point number will be very high or never reached
  • If a manual process is hard and error prone, it is easy to justify automation, i.e., Nt is a low number
  • If many process executions hit exceptions that require manual intervention, it is harder to justify automation
  • If automation routines are hard to program, or take a lot of time and effort to tweak and maintain because run book procedures are ad hoc, it is harder to justify automation

In the next post, I’ll explore the economic justifications for automation under a service automation strategy.



[1] A. B. Brown and J. L. Hellerstein, “Reducing the Cost of IT Operations – Is Automation Always the Answer?”, IBM Thomas J. Watson Research Center. Proceedings of the 10th Conference on Hot Topics in Operating Systems (HotOS X), June 12-15, 2005, Santa Fe, NM.