Tag Archives: IT operations

Adopt Before You Adapt Your IT Processes

By John Worthington

Many people familiar with ITSM have heard the expression ‘adopt & adapt’ as a good practice, but it’s worth noting the order in which these words are placed. You must adopt before you can adapt. This leads to the question, when has a process been ‘adopted’?

Incomplete Process [i]

If a process doesn’t have a purpose, or if the process purpose is not understood by the organization, it is hard to consider it implemented. If a process is performed so inconsistently or irregularly, over time or across business units, that it does not systematically achieve its purpose, it has not been adopted.

At this level, more effort is needed to adopt the process. This may require transitional change efforts that include strategy, structures, and/or systems.

Performed Process

If the process achieves its purpose, it’s normally considered ‘adopted’, even if the relative maturity is low. The organization understands the purpose of the process and there is evidence that the outcomes of the process are achieved, such as the production of a document, a change of state or the meeting of a goal.

Reviewing the base practices associated with the process can help determine whether all the desired outcomes of the process are achieved, even if some specific outputs (i.e., work products) are not in evidence.

At this level of maturity, the process can be adapted and improved. This requires developmental change efforts, with project plans that communicate the changes and provide knowledge transfer to key stakeholders.

Why is this important?

When we are adapting multiple processes as part of an ITaaS or SDDC transformation, even a single incomplete process can significantly increase the scope of the effort. You cannot adapt what has not been adopted!

ITaaS Transformation and Established Processes

Incident Management is typically a process that has been adopted. For example, all these objectives[ii] may be met:

  • Ensure that standardized methods and procedures are used for efficient and prompt response, analysis, documentation, ongoing management and reporting of incidents
  • Increase visibility and communication of incidents to business and IT support staff
  • Enhance business perception of IT through use of a professional approach in quickly resolving and communicating incidents when they occur
  • Align incident management activities and priorities with those of the business
  • Maintain user satisfaction with the quality of IT services

Even if the process documentation is not elaborate, the process may be achieving its purpose and providing its expected outcomes. It’s not uncommon for organizations to have this process formally described, staffed by trained practitioners, well supported by tools and standardized across the organization. These attributes would characterize the process as Level 3 (Established) maturity.

In this case, vRealize Operations can integrate easily with the process (i.e., by automatically creating Incident records in the tool) as appropriate.

ITaaS Transformation and Incomplete Processes

In an ITaaS transformation, capacity management can be an example of an incomplete process. For example, the following objectives[iii] of capacity management may be difficult to achieve:

  • Produce and maintain an appropriate and up-to-date capacity plan, which reflects the current and future needs of the business
  • Ensure that service performance achievements meet all of their agreed targets by managing the performance and capacity of both services and resources
  • Assist with the diagnosis and resolution of performance and capacity related incidents and problems

If IT services are not well defined, if the problem management process is not established, or if the capacity management process is not well supported (by people and technology), then capacity management commonly fails to meet Level 1 (Performed) requirements.

The use of vRealize Operations can address the technology process support requirements, but you will still need to define services to manage service performance and you will still need to establish problem management as a process in order to assist with capacity related problems. You may also need to establish roles associated with capacity and performance management that are not currently well defined in the organization.

[Note: it is not required that an ITIL-based process be in existence, but the process will still need to be considered performed (adopted) in order to adapt.]
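
The adopt-or-adapt decision for the two examples above can be sketched as a small lookup. This is only an illustration: the function and dictionary names are hypothetical, the two assessment results simply restate the incident and capacity management examples, and the numbering of the Incomplete level as 0 follows ISO/IEC 15504 rather than anything stated in this post.

```python
from enum import IntEnum


class Capability(IntEnum):
    """Capability levels mentioned in the post (numbering per ISO/IEC 15504)."""
    INCOMPLETE = 0
    PERFORMED = 1
    ESTABLISHED = 3


def required_change(level: Capability) -> str:
    """Return the kind of change effort the post associates with a level."""
    if level < Capability.PERFORMED:
        # Not yet adopted: purpose not understood or not systematically achieved.
        return "adopt (transitional change)"
    # Adopted: the process achieves its purpose and can now be improved.
    return "adapt (developmental change)"


# Illustrative assessment results restating the two examples in the post.
assessment = {
    "incident management": Capability.ESTABLISHED,
    "capacity management": Capability.INCOMPLETE,
}

plan = {name: required_change(level) for name, level in assessment.items()}
for name, action in plan.items():
    print(f"{name}: {action}")
```

In practice the levels would come out of the assessment and discovery activities, not from a hard-coded table.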

Adopt or Adapt is not a matter of choice

You do not choose developmental, transitional or transformational change; you discover what change is required based on organizational demands[iv]. This is why assessment and discovery activities are so important. These activities make sure your implementation plans have the inputs needed to ensure a complete plan, and the appropriate developmental, transitional and/or transformational strategies.

In the examples provided, we can easily adapt the existing incident management process to ITaaS. However, there may be more work needed to establish capacity management and related processes. The level of effort needed to achieve this can vary significantly based on organizational requirements, objectives and your starting point.

Understanding this is key to establishing a transformation path that minimizes effort and maximizes the value out of the people, processes and technology — in other words, developing ITaaS organizational capabilities.

[i] Process Assessment and ISO/IEC 15504, Van Loon
[ii] ITIL© Incident Management
[iii] ITIL© Capacity Management
[iv] Beyond Change Management: How to Achieve Breakthrough Results Through Conscious Change Leadership, by Dean Anderson and Linda Ackerman Anderson

=====================

John Worthington is a VMware transformation consultant and is based in New Jersey. Follow @jMarcusWorthy and @VMwareCloudOps on Twitter.

Staffing Your Cloud Organization – A Heuristic Model

Approximating staffing ratios in a cloud organization as a logarithmic function of infrastructure metrics.

By Pierre Moncassin

Customers who want to establish true cloud services based on VMware’s SDDC solution (or any other provider for that matter) realize that in order to fully leverage the technology, they need to adapt their IT organization.

More specifically, they need to set up a dedicated team – a cloud Center of Excellence (COE) – to manage and operate their cloud services.

The structure and roles of that team are described in detail in ‘Organizing for the Cloud’.

During practically all Operations Transformation projects, a question frequently asked is: what is the optimum staffing level, in FTE (a.k.a. Full Time Equivalents), to set up this cloud organization?

The standard consultant answer is of course ‘it depends’. But in this blog, I will explain in more detail what ‘it depends’ means in this context.

In an earlier blog, I described “10 key factors to estimate staffing ratios to operate platforms with vRealize Automation and vRealize Operations Manager”.

  • Number of lines of business
  • Number of data centers
  • Level of staff skill/experience
  • Number of cloud services
  • Workflow complexity
  • Internal process complexity (includes support requirements, e.g. 5-day or 24/7 coverage)
  • Number of third party integrations
  • Rate of change
  • Number of VMs
  • Number of user dashboards/reports

Now these 10 factors, and probably hundreds of others, will determine the complexity of the tasks that the cloud organization needs to perform and, therefore, the staffing level. Clearly there are thousands of possible combinations of these factors. But if we want to see how the FTE count evolves with a single, easy-to-quantify parameter (such as the number of virtual machines or any other ‘simple’ infrastructure metric), we need to make strict assumptions to ‘tie down’ the other factors.

So let’s assume that we are looking at a single organization evolving over time; as time passes the number of virtual machines gradually increases, but so does the number and complexity of the services, as well as the demand for support coverage:

  1. Between 1 and 100 VMs, the COE is running as a pilot; there are no support requirements and only a small number of services to run.
  2. Between 100 and 1,000 VMs, the COE is running cloud services regionally with some basic service levels.
  3. Over, say, 30,000 VMs, the COE is running a global operation with 24/7 support requirements and a broad range of services.

Practical observation of a number of real-life examples suggests an evolution broadly similar to the logarithmic curve in figure 1. Now this is still a model that deliberately simplifies and ‘smooths out’ the FTE curve, but there are two practical implications:

  • The staffing levels may rise most steeply at the beginning of the curve. When the organization transitions from a pilot to a fully operating COE, staffing needs rise significantly.
  • The FTE curve flattens out when the organization matures and can handle high volumes. Once the COE is operating with a high level of automation and experienced staff, adding workload only requires a marginal increase in the FTE count.

In reality, of course, the complexity – i.e. the demand on FTE – never grows quite so smoothly.

We would see threshold effects. For example, when we reach 300 workloads, a new 24×7 service may be added to the portfolio, which requires a rapid increase in FTE.
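
To make the shape of this model concrete, here is a minimal sketch of a logarithmic FTE estimate with step increases for threshold effects. The function name and all numbers (`base`, `per_decade`, the 4 extra FTE at 300 workloads) are made-up illustrative assumptions, not figures from the real-life examples mentioned above; only the logarithmic shape and the idea of a step at a threshold come from the text.

```python
import math


def estimate_fte(vm_count, base=2.0, per_decade=3.0, thresholds=()):
    """Heuristic FTE estimate: logarithmic in VM count plus step increases.

    base, per_decade and thresholds are illustrative assumptions;
    calibrate them against your own organization's data.
    """
    if vm_count < 1:
        raise ValueError("vm_count must be at least 1")
    # Smooth part: each tenfold increase in VMs adds `per_decade` FTE.
    fte = base + per_decade * math.log10(vm_count)
    # Threshold effects: a new service level adds staff in one step.
    for trigger_vms, extra_fte in thresholds:
        if vm_count >= trigger_vms:
            fte += extra_fte
    return fte


# Hypothetical step: a 24x7 service added at 300 workloads needs 4 more FTE.
steps = [(300, 4.0)]
for vms in (10, 100, 299, 300, 1000, 30000):
    print(vms, round(estimate_fte(vms, thresholds=steps), 1))
```

With these assumptions, adding 100 VMs to a 100-VM pilot costs far more staff than adding 100 VMs to a 10,000-VM operation, while the step at 300 workloads shows up as a jump that no volume increase explains.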

Take-aways:

  • The fastest rise in FTE will occur in the early stages of the build-up of cloud services; this is ‘normal’ given that both the number of services and the service levels increase, significantly raising the demands on the cloud organization.
  • Once well established and automated, the FTE level should only increase marginally with rising infrastructure volumes – your organization will have learned to cope with increasing quantities.
  • We need to caveat that although the FTE curve may look broadly logarithmic, threshold effects are inevitable: new demands on service level (e.g. new compliance requirements, 24×7 etc.) can create an ‘uptick’ in FTE without necessarily a prior ‘uptick’ in volumes.

What we have presented here is an intuitive model to understand how increasing volumes impact FTE. You are welcome to share your experience and perhaps refine this heuristic model.

—-
Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

VMworld 2015 – Day 4 Recap

Wednesday Sept 2

By Andy Troup

Kevin Lees, our principal architect in our Operations Transformation Services practice, spoke today about Best Practice Approaches to Transformation with the Software Defined Data Center. Kevin speaks from experience, spending most of his time with customers on-site with transformation projects. Kevin has seen firsthand what works and what certainly doesn’t. Recommendations he shared this morning included:

  • Start with a formal service definition process—include all stakeholders (LoB, Ops, Infrastructure, Dev, Finance)
  • Include Security and Compliance right off the bat
  • A 360 degree service definition exercise drives technology decisions, not the other way around
  • Look at the new roles that will be needed: e.g. Business Relationship Manager, Service Owner
  • Create a Service Marketing Plan for key stakeholders in the organization
  • Assume change will be constant: adopt agile planning methods (e.g. 2-week sprints); release features on a regular basis rather than waiting for final project completion
  • Take an iterative approach rather than a sequential approach. Start simple, gradually expand (this applies to the process side as well as the service offering side.)
  • Merge workstreams: technical workstream, operations transformation workstream, cloud service management
  • Break down silos (Kevin has some really good advice here: arm and reward champions or change agents in the functional groups to help this happen. Exec sponsorship is also critical.)

You can find the session recording on the VMworld mobile app or vmworld.com to get the benefit of all of Kevin’s insights.

Last day of the conference is tomorrow! Here’s what to attend:

  • 10:30 AM
    OPT 5029 How to Use Service Definitions to VMware vRealize Business to Build Highly effective, Service-Based Cost Models
  • 10:30 AM
    OPT 4707 Integrating vRealize Automation with ITSM and Service Catalog
  • 12:00 PM
    OPT5709 Building a SDDC with CIT (customer presentation)
  • 1:30 PM
    OPT 5369 Proactive Monitoring of a Service: People, Process and Technology

Don’t forget to use the VMworld mobile app to easily locate these final day sessions.

And thanks for sharing the week with us! Please do reply to this post with any observations of your own about transformation, either from your own experiences or as a result of any of the Operations Transformation sessions you attended this week. Looking forward to hearing from you.


Andy Troup is a Cloud Operations Architect with over 25 years of IT experience. He specializes in Cloud Operations and Technology Consulting Service Development. Andy is also a vCAP DCA and VCP. Andy possesses a proven background in design, deployment and management of enterprise IT projects. Previously, Andy co-delivered the world’s first and subsequent vCloud Operational Assessments (Colt Telecomm & Norwegian Government Agency) to enable the early adoption of VMware’s vCloud implementation.

 

 

 

 

Establish Your IT Business Management Office (ITBMO) To Run IT Like a Business

By Khalid Hakim

We hear a lot about (and maybe have interacted with) Project Management Offices (PMOs), and possibly about Service Management Offices (SMOs), but IT Business Management Office (ITBMO) sounds like a new buzzword in today’s modern IT business taxonomy. PMOs typically focus on the management and governance of IT projects, while SMOs are responsible for the governance and management of IT services and the processes to ensure effective service delivery. ITBMOs, however, go beyond this to the next IT business maturity level: they address the business and finance partnership with IT to help IT organizations transform into services-based, business-oriented, and value-focused organizations.

Click here to read my blog on this topic, covering the business outcomes and value of the ITBMO, the functions contained within, and six steps for standing up the ITBMO.

And if you’re heading to VMworld, don’t miss this session on Tuesday 9/1 at 5:30pm!

OPT 5075 6 Steps to Establish Your IT Business Management Office (ITBMO) with VMware vRealize Business

VMworld 2015

———-

Khalid Hakim is an operations architect with the VMware Operations Transformation global practice. You can follow him on Twitter @KhalidHakim47.

VMware’s Internal Cloud – A Use Case for Operating a Cloud at Scale

By Ahmed Al-Buheissi

OneCloud is VMware’s internal cloud, used by various internal departments to test, experiment, demonstrate and train on the different products at VMware. It is an impressively large cloud, hosting about 80,000 virtual machines (VMs), with about 40,000 VMs running at any one time. It is used by R&D, Internal IT, Education, and Field sales, to name but a few.

The VMware Operations Transformation team engaged the OneCloud management team to discuss and understand the operation of such a large cloud. The following are some of the areas that were explored:

Evolution

The cloud was first established in 2012, when it started in a single datacenter with one core. The following year it expanded rapidly into a second datacenter and three more cores. Soon the word spread and new tenants were invited. By the end of 2013, demand was so high that it almost ran out of capacity. As a result, 2014 was primarily focused on increasing capacity, both vertically and horizontally, and at scale. The team almost doubled in size, and spread across five datacenters in different locations around the world.

Organization

The OneCloud team consists of about 80 full-time staff, divided into an Infrastructure Operations team and a Service Operations team. The structure closely aligns with VMware’s cloud operations model, as documented in the VMware whitepaper Organizing for the Cloud.  

Service Operation

The OneCloud team runs like a separate business, with its own service desk and support structure. The team utilizes ServiceNow to manage the IT Service Management processes, such as Incident, Problem and Change management. They also have their own service delivery liaisons who manage the relationship with tenants, and help in understanding their use of the system, upcoming projects and any future demand.

Infrastructure Operations

This use-case is a live example of utilizing VMware-on-VMware, where VMware products are utilized at the hypervisor, networking and storage levels, then for provisioning and automation, and also for monitoring. Most of the VMware vCloud Suite of products are used for running the OneCloud infrastructure.

Hear more at VMworld!

If you want to hear more about operating OneCloud, please join us at VMworld 2015 on Wednesday Sept 2nd at 2:30 PT for the session: 80,000 VM’s and growing! VMware’s Internal Cloud Journey told by the People on the Frontlines. Hear the people who helped create and manage this large cloud talk about their experiences.

To add this session via the VMworld Schedule Builder, search for OPT5972.

====
Ahmed Al-Buheissi is an operations technical architect with the VMware Operations Transformation global practice and is based in Melbourne, Australia.

Transforming Your IT Operations

The way customers now consume technology has changed significantly. They are becoming more comfortable using platforms such as e-commerce marketplaces. In the US market alone, online sales are predicted to grow by almost 60% to over $400B by 2018.

IT organizations are now aggressively trying to transform their technology platforms from static legacy infrastructure to the dynamic, agile infrastructures provided by virtualization, cloud and the software-defined data center.

In this short video, operations architect David Crane shares why your focus needs to include more than just technology when moving to a new technical architecture and infrastructure for IT operations. The analysis, planning and design of your future-state operating model are just as important if you want to realize the full benefits of transformation.

Change Management’s Balancing Act

By John Worthington

I had some interesting discussions with a client about Change Management the other day. There was considerable focus around the risk matrix; after all, the risk of a change dictates approval flow. Change is about managing risk, right?

The change management challenge is about balancing positive risk (value) with negative risk (service interruption), and as always we want to maximize positive risk and minimize negative risk.

From a change management perspective, maximizing positive risk is focused on acceleration and velocity. In order to do this, we need proven repeatable procedures. Building a library of proven change models will not only provide higher quality (via repeatability), but a rich source of information for automation that can further increase velocity (and time to value). Standard changes are an easy and logical place to start building this library.

The focus on minimizing negative risk, and a desire to perfect the risk matrix, can divert attention to the process. Don’t get me wrong, we do want a consistent way to assess risk across all stakeholders, but tuning the risk matrix will not build repeatability into the process and will not by itself reduce risk.

It’s the actions taken by change management practitioners – change managers, CAB members and change analysts – that can increase the velocity of change while reducing risk. This is accomplished by fine-tuning a library of predefined procedures for known types of changes.

So if the powers that be seek to slow the process down by formally reviewing more and more changes, don’t focus on the risk matrix — at the end of the day the powers that be will decide what needs to be reviewed (think change freeze, major incidents) and it won’t matter what the risk matrix looks like.

The risk matrix is a guide that should direct all staff to look at how changes can be more effectively modeled. As the library of repeatable changes grows and automation is enabled, risk can be managed at higher velocities. Start with standard changes and move up from there; the risk matrix is likely to take care of itself.
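
As a rough illustration of how a library of change models and a risk matrix can work together, the sketch below routes a change to an approval flow. Everything named here is hypothetical: the library entries, the runbook labels, the 3×3 scoring and the approval routes are assumptions for illustration, not a description of any client's process. The one borrowed convention is that standard changes are pre-authorized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeModel:
    """A proven, repeatable procedure for one known type of change."""
    name: str
    procedure: str
    automated: bool = False


# Hypothetical library entries; start with standard changes and grow from there.
LIBRARY = {
    "restart-app-service": ChangeModel("restart-app-service", "runbook RB-1", True),
    "add-vm-memory": ChangeModel("add-vm-memory", "runbook RB-2"),
}


def risk_level(impact: int, likelihood: int) -> str:
    """Toy 3x3 risk matrix; scores and cut-offs are illustrative assumptions."""
    if not (1 <= impact <= 3 and 1 <= likelihood <= 3):
        raise ValueError("impact and likelihood must be between 1 and 3")
    score = impact * likelihood
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


def route_change(change_type: str, impact: int, likelihood: int) -> str:
    """Pick an approval flow: known models skip review, the rest are assessed."""
    model = LIBRARY.get(change_type)
    if model is not None:
        how = "automated" if model.automated else "manual"
        return f"standard change, pre-approved ({how}, {model.procedure})"
    level = risk_level(impact, likelihood)
    if level == "high":
        return "CAB review"
    if level == "medium":
        return "change manager approval"
    return "peer approval"


print(route_change("add-vm-memory", 2, 2))
print(route_change("replace-core-switch", 3, 2))
```

Note that the risk matrix is only consulted for changes that are not yet modeled, so every entry added to the library takes work away from the review path.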


John Worthington is a VMware transformation consultant and is based in New Jersey. Follow @jMarcusWorthy and @VMwareCloudOps on Twitter.

 

Top 5 Tips for Organizing the Cloud

You’re ready to reap the rewards. Is your organization ready to deliver?

The technical and business advantages of the software-defined cloud era are well understood. But all too often a critical aspect of adopting the cloud model is overlooked: the organizational impact. The fact is, the transition to the cloud changes roles, skills, processes and organizational structures. Yet many IT leaders become so focused on the vision of the cloud, or its technological requirements, that they lose sight of whether their IT staff is properly prepared for the new world.

The resource, Top 5 Tips for Organizing the Cloud, will help you prepare for—and execute—a successful move to the cloud.

Green vs. Grey — Rethinking Your IT Operations

By Neil Mitchell

Can you really create a new greenfield IT organization with no legacy constraints?

In this short video, operations architect Neil Mitchell explains that while anything is theoretically possible, most IT execs need to face the reality of impact on legacy IT operations.

====
Neil Mitchell is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

Key People, Process and Policy Considerations for vRealize Automation Success

By Choong Keng Leong

Organizations implement VMware vRealize Automation (vRA) with the aim of shortening the provisioning of infrastructure services and the release of applications through self-service and automation. To achieve this, there is a need for balance between governance and business agility. Projects are more likely to fail or face significant obstacles if they do not plan adequately and ensure the necessary policies, processes and workflows are in place.

In this blog, we’ll explore some of these key planning and design activities that are often overlooked on the journey to cloud automation.

Key Players

The very first thing we need to do is identify key players. The roles are mapped to actual team members in the organization. Minimally, we need to identify:

  • Service consumers – Authorized users of the self-service portal who can request and manage their cloud services, and which business groups they belong to
  • Approvers – Approves all possible requests
  • Cloud administrators – Administers and manages the cloud infrastructure, cloud resources, and the configuration and maintenance of vRA
  • System administrators – Administers, configures and maintains the guest operating systems in the virtual machine
  • Application administrators – Installs, administers, configures and maintains the application software hosted on the virtual machine
  • Cloud security and compliance analyst – Monitors, analyzes and tests the security and compliance of application, guest OS and infrastructure

A common mistake is failing to identify all the necessary key players and involve them early in planning and design, which can have a drastic impact on the vRA workflow designs.

Service Models

The next step is to determine what cloud services will be offered through vRA. Many organizations start by offering Infrastructure-as-a-Service (IaaS), provisioning virtual machines leveraging existing vSphere virtual machine templates. For organizations that are heavily virtualized, this is not transformational and has very little incremental impact visible to the business.

To realize the full value of vRA, organizations should look beyond provisioning up to the OS level. The steps that follow after the server with OS is ready usually involve manual or scripted steps and multiple parties (app, middleware, db, security, etc.). Being able to automate these steps, package them and offer the package as a cloud service will result in significant efficiency gains. For example, instead of offering Windows 2012 as a catalog item, why not offer a SQL Server 2012 or a Tier 2 Application consisting of a pair of load-balanced Apache Tomcat Servers and a SQL Server?

Developing service models requires engaging the business to understand their requirements. For example, what is the point in offering a Windows Server 2003 R2 catalog item when no new business applications will be running on it? We also need to understand the service levels and performance requirements so that we can provision the machines in the correct pool of resources that provide these capabilities. We also need to identify which business groups will be entitled to these services.

Request Models

Once the service models are defined, we can identify all the use cases for vRA and the types of requests within the scope of vRA. Request models (i.e. workflows) for the services are mapped out and documented. These may include:

  • Request for a virtual machine
  • Request for a database server
  • Request to increase the resources of a virtual machine (e.g., add CPU, Memory)
  • Request to extend the lease of a virtual machine
  • Request to reboot a virtual machine
  • Request to decommission a virtual machine
  • Request to snapshot a virtual machine
  • Request to back up a virtual machine

It is common to start by mapping out the current workflows and automating some of the steps using vRA and/or vRealize Orchestrator. While this approach may be quick, it has proven inadequate in many customer use cases I have encountered. Requirements to interface with a business system, process or function appear in the late stages of the vRA implementation project, jeopardizing the project’s schedule and budget. In order for an organization to automate as much of the process as possible and make a significant impact on service provisioning and delivery times, the whole service fulfillment cycle needs to be studied, optimized and transformed. It’s imperative to understand the whole business process through initiation of an IT/business project, budgeting, approval, procurement, installation, building, integration, testing, release, operation, management, support and retirement. Then, you must identify how vRA will fit and interface with the various stakeholders, functions, processes and systems. Sometimes, it is necessary to have vRA interface with external workflows already existing in other systems such as an IT service management (ITSM) system.

In addition, each request model needs to be correctly categorized and aligned with the organization’s governance policy and processes. For example, a request for a virtual machine in production vs. a machine for development will require different change management processes, approval levels and approvers. These considerations should be incorporated into the design of the workflows and vRA approval policies. The request models can also be re-categorized to reduce governance overhead, since process automation and standardized blueprints reduce risk.
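
The categorization described above can be captured as a simple policy table before it is built into workflows and approval policies. The sketch below is a planning aid only and does not use any vRA API; the request types, environments, change processes and approver roles are all hypothetical examples.

```python
from typing import NamedTuple


class ApprovalRule(NamedTuple):
    change_process: str   # which change management process applies
    approvers: tuple      # roles that must approve, in order


# Hypothetical policy table keyed by (request type, environment).
POLICY = {
    ("new-vm", "development"): ApprovalRule("standard", ()),
    ("new-vm", "production"): ApprovalRule("normal", ("service owner", "change manager")),
    ("extend-lease", "development"): ApprovalRule("standard", ()),
    ("extend-lease", "production"): ApprovalRule("standard", ("service owner",)),
}

# Anything not categorized yet falls back to the strictest route.
DEFAULT = ApprovalRule("normal", ("service owner", "change manager"))


def approval_for(request_type: str, environment: str) -> ApprovalRule:
    """Look up the approval rule for one request model."""
    return POLICY.get((request_type, environment), DEFAULT)


print(approval_for("new-vm", "development"))
print(approval_for("new-vm", "production"))
```

Re-categorizing a request model after automation has reduced its risk then becomes a one-line change to the table.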

Access and Entitlement Management

After the key players, service models and request models are finalized, the different security access roles for vRA can be defined and mapped to the key players, so that they have adequate permissions and privileges to perform their tasks defined in the request models. Entitlements to the services are also configured and granted to the respective business groups and/or users.
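
A simple way to check such a design on paper is to model roles and entitlements as two lookup tables and test them against the request models. The sketch below is a minimal illustration: the role names follow the key players listed earlier, while the actions, business groups and catalog items are invented, and nothing here reflects vRA's actual role or entitlement configuration.

```python
# Hypothetical tables; role names follow the key players listed earlier,
# while actions, groups and catalog items are invented for illustration.
ROLE_ACTIONS = {
    "service consumer": {"request", "manage"},
    "approver": {"approve"},
    "cloud administrator": {"configure", "manage"},
}

ENTITLEMENTS = {
    "finance": {"windows-server"},
    "engineering": {"windows-server", "sql-server"},
}


def can_request(role: str, business_group: str, catalog_item: str) -> bool:
    """True when the role may raise requests and the group is entitled to the item."""
    allowed_actions = ROLE_ACTIONS.get(role, set())
    entitled_items = ENTITLEMENTS.get(business_group, set())
    return "request" in allowed_actions and catalog_item in entitled_items


print(can_request("service consumer", "finance", "sql-server"))
print(can_request("service consumer", "engineering", "sql-server"))
```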

Communication and Awareness

Before the launch of vRA, don’t forget to brief all key players on the processes and how to use vRA based on their roles. Print and distribute reference cards and stickers to remind them of the process steps and how to get support when needed. It is important to cater for more hand-holding and support during the initial transition phase. The project will fail if users start to revert to old ways and stop using vRA.

========
Choong Keng Leong is an operations architect with VMware Professional Services and is based in Singapore. You can connect with him on LinkedIn.