
Tag Archives: Cloud Ops

Turning IT Operations Management Dreams Into Reality

No one would blame you for dreaming about better IT operations management. You might dream up ways to make it smarter—say by anticipating problems and troubleshooting them in virtual and physical environments before they even occur.

Learn why this is no pipe dream in this infographic about VMware vRealize Operations Insight, a unified management solution with predictive analytics for performance management, capacity optimization, and real-time log analytics.

Turning IT Ops Mgmt Dreams Into Reality

VMware vCenter Operations Manager Users: Raise Your Hands!

By Choong Keng Leong

I innocently asked attendees in a workshop I was delivering at one of my clients, “Who uses VMware vCenter Operations Management Suite in your company?” I got two simple answers: “the cloud administrator” or “the VM administrator.” That exchange prompted me to write this blog, which I hope will change your thinking if you would give the same answer.

The vCenter Operations Management Suite consists of four components:

  • vCenter Operations Manager: Allows you to monitor and manage the performance, capacity, and health of your SDDC infrastructure, operating systems, and applications
  • vCenter Configuration Manager: Enables you to automate configuration management across virtual and physical servers, and continuously assess them for compliance with IT policies and regulatory and security requirements
  • vCenter Hyperic: Helps you monitor operating systems, databases, and applications
  • vCenter Infrastructure Navigator: Automatically discovers and visualizes application components and infrastructure dependencies

If I were to map the vCenter Operations Management Suite to the IT processes it can support, it would look like the matrix shown in Table 1:

Table 1: A Possible vCenter Operations Management Suite to Process Mapping

What Table 1 also implies is that multiple roles will be using and accessing vCenter Operations Manager, or will be recipients of its outputs (e.g., reports). For example, the IT Director can access the vCenter Operations Manager Dashboard to view the overall health of the infrastructure. The Application Support team accesses it via a Custom Dashboard to understand application status and performance. The IT Compliance Manager reviews the compliance status of IT systems on the vCenter Operations Manager Dashboard and gets more detail from vCenter Configuration Manager to initiate remediation of the systems.

Table 2 below shows a possible list of roles accessing the vCenter Operations Management Suite.

Table 2: Possible List of Roles Using vCenter Operations Management Suite

Tables 1 and 2 illustrate clearly that vCenter Operations Management Suite is not just another lightweight app for the cloud or VM administrator — it supports multiple IT operational processes and roles.

Taking this a step further, you need to embed vCenter Operations Management Suite into operational procedures to take maximum advantage of the tools’ full potential and integrated approach to performance, capacity, and configuration management. To draw an analogy: if you deploy a new SAP system without defining the triggers or use cases for a user to access it, establishing the procedural steps for which modules to access, how to navigate the system, what to input, and how to query and report, it is unlikely the rollout will succeed.

Although vCenter Operations Management Suite is not as complex, the concept is the same. You need to define procedures with tight linkage to the tools to ensure they are used consistently and in the way they were designed and configured for.

I hope that my blog motivates you to start thinking about transforming your IT operations to make full use of the capabilities of your VMware technology investment.

Choong Keng Leong is an operations architect with VMware Professional Services and is based in Singapore. You can connect with him on LinkedIn.

How to Create a More Accurate, Useful, and Equitable Service Costing Process

By Khalid Hakim

In my last post, I described the pressing need for a more effective IT service costing process (as a solution to the pressing need for tighter business/IT alignment!). The question now… is how.

How can companies create a service costing process that is fast, accurate, transparent, granular, fair, and consistent—without introducing yet another time-consuming and expensive project to the IT docket?

To answer this question, I’ll use a real-world example—a company that has recently been through the process and achieved excellent results.  I won’t name the company but I can assure you it’s a real enterprise, and the results were also quite real. I’ll refer to it as “The Company.”

In the past, The Company charged for IT services by simply allocating the total IT costs among service consumers based on the number of desktops and laptops they used. This “lump sum” cost allocation of course led to the perception that “IT is always expensive.” The Company knew it needed to move to a more service-oriented, customer-centric model.

Our team provided guidance in setting up a better service costing process. The implementation began with three key steps:

1. Create a service-based cost model diagram.
A cost diagram depicts the flow of IT costs from the general ledger or cost sources all the way to the services being provisioned—in such a way that IT can present consumers with a statement showing all service costs.

2. Develop a service-based cost allocation strategy.
The Company already had a cost allocation process in place, but it did not deal with service-to-service cost allocation. So The Company’s lump-sum allocation demonstrated only the cost of running IT, not the value IT delivered. That’s why a service-to-service costing method was needed, one that would account for everything:

  • Servers and their related hardware
  • Network and security allocation to servers
  • Data center, storage and data center facilities costs
  • Software and enterprise license agreements
  • Support and operations contracts
  • IT project costs
  • Labor costs
  • IT overhead

This way The Company was able to fully load IT costs into services being delivered and transfer these costs to the business.
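The fully loaded roll-up described above can be sketched in a few lines of Python. All figures, service names, and allocation percentages below are hypothetical, and a real model would allocate each cost source to services individually rather than as one pool:

```python
# Hypothetical cost sources, mirroring the categories listed above.
cost_sources = {
    "servers": 120_000,
    "network_security": 30_000,
    "datacenter_storage": 45_000,
    "software_licenses": 60_000,
    "support_contracts": 25_000,
    "it_projects": 40_000,
    "labor": 200_000,
    "it_overhead": 35_000,
}

# Per-service allocation percentages (would come from the cost model diagram).
allocation = {"email": 0.15, "crm": 0.35, "web_portal": 0.50}

def fully_loaded_cost(service: str) -> float:
    """Roll every cost source into the fully loaded cost of one service."""
    total = sum(cost_sources.values())
    return total * allocation[service]

for svc in allocation:
    print(f"{svc}: ${fully_loaded_cost(svc):,.2f}")
```

Because every source is included, the per-service costs sum back to the total IT spend, which is exactly what lets IT transfer a complete, defensible cost to the business.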

3. Develop a service-based cost classification strategy.
The Company classified all IT service costs into the following categories:

  • In/Out Service Costs: All direct service-related costs should be part of a service cost, while non-service-related costs should be treated as accounting-period costs.
  • Fixed vs. Variable Costs: Variable costs vary with usage or time (e.g., data center utility bills, support tickets, or service consumption). Fixed costs remain constant regardless of service or resource usage (e.g., software license costs, hardware purchases, and support contracts).
  • Direct vs. Indirect Costs: A direct cost is directly related to a service and can easily be traced to it. Indirect costs are indirectly related to a service and are typically spread over a number of services.
  • CapEx vs. OpEx: Capital expenditures are major expenses whose costs must be depreciated (split) over the useful life of an IT asset, while operational expenditures are incurred periodically.
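The four classifications can be captured as simple flags on each cost item. This is an illustrative sketch only; the item names and amounts are made up:

```python
from dataclasses import dataclass

@dataclass
class CostItem:
    name: str
    amount: float
    in_service: bool  # True: part of a service cost; False: accounting-period cost
    fixed: bool       # True: fixed; False: varies with usage or time
    direct: bool      # True: traceable to one service; False: spread across services
    capex: bool       # True: depreciated over an asset's life; False: periodic OpEx

items = [
    CostItem("SAN array purchase", 250_000, True, True, False, True),
    CostItem("Support tickets", 8_000, True, False, False, False),
    CostItem("CRM software license", 40_000, True, True, True, False),
]

# Example query: which items are variable operational expenses?
variable_opex = [i.name for i in items if not i.fixed and not i.capex]
```

Once items carry these flags, questions like "how much of this service's cost is fixed CapEx?" become one-line queries instead of spreadsheet archaeology.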

After the strategic IT/business processes were carried out, service-specific tasks began. These included defining and charting individual services, developing service-specific cost packages, and tracking and managing service costs over time.

Next it was time to consider the “people” aspects of service costing. Technology cannot provide a solution on its own; it must be developed and deployed in conjunction with stakeholders.

Along the same lines, handling IT financial management (ITFM) activities is not a one-man show. At The Company, the following roles were defined, along with specific responsibilities:

  • Financial Controller
  • IT Financial Manager
  • Service Manager
  • IT Manager
  • CIO
  • Customer Relationship Manager
  • VP of Infrastructure Services

Developing a roles/responsibilities chart (known as a RACI) then provided a concise and easy way to track who does what along with the level of contribution and accountability.
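A RACI chart can be kept as simple structured data. The activities and letter assignments below are hypothetical, using roles from the list above:

```python
# Hypothetical RACI chart: R=Responsible, A=Accountable, C=Consulted, I=Informed.
RACI = {
    "Define cost model": {
        "IT Financial Manager": "A",
        "Service Manager": "R",
        "Financial Controller": "C",
        "CIO": "I",
    },
    "Approve service rates": {
        "CIO": "A",
        "IT Financial Manager": "R",
        "Customer Relationship Manager": "I",
    },
}

def accountable(activity: str) -> list[str]:
    """Exactly one role should hold the 'A' for each activity."""
    return [role for role, letter in RACI[activity].items() if letter == "A"]
```

Keeping the chart as data also makes it easy to lint, for instance by asserting that every activity has exactly one Accountable role.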

Next came identifying the right technologies to use in the service costing process. In this case The Company made a strategic investment with VMware and deployed the VMware IT Business Management Suite. This is the technology that will help them gain cost transparency, align with the business, enable the CIO’s transformation agenda, and control and optimize the IT budget and costs.

In addition, The Company has implemented basic CIO and Service Manager dashboards to provide insight into the financial performance of all managed services. The dashboards define the visual layout of the user experience. Each dashboard is composed of frames that display customized information designed for the intended user. The dashboards enable the CIO, IT Financial Manager, and Service Managers to gain access to cost information and make data-driven decisions.

Today The Company’s IT department is well on its way to being more business-oriented and service-oriented through better service costing. The Company can now trace IT costs from the general ledger all the way to the business units consuming IT services. The new costing model also lays out the key roles and responsibilities, and VMware technology helps provide cost automation, transparency, and service-based cost modeling.

I’d encourage you to get full details about the service costing process outlined in this blog post by reading my white paper, “Real IT Transformation Requires a Real IT Service Costing Process.”

Until next time—may all of your IT service costs be allocated with fine granularity and full transparency!

Khalid Hakim is an operations architect with the VMware Operations Transformation global practice. You can follow him on Twitter @KhalidHakim47.

And if you’re heading to VMworld, don’t miss Khalid’s session #OPT1572!

Accelerate Your IT Transformation — How to Build Service-based Cost Models with VMware IT Business Management (ITBM)
A recent VMware survey showed that 75% of IT decision makers cite lack of understanding of the true cost of IT services as the number one challenge in IT financial management. ITBM experts and VMware Operations Transformation Architects Khalid Hakim and Gary Roos shed light on this alarming figure and give practical advice for obtaining in-depth knowledge of the cost of IT services so you can provide cost transparency back to the business.

When you visit the VMworld 2014 Schedule Builder, be sure to check out the SDDC > Operations Transformation track for these and other sessions to help you focus on all the aspects of IT transformation.


Guidance for Major Incident Management Decisions

By Brian Florence

If you’re an IT director or CIO of a corporation with large, business-critical environments, you’re very aware that if those environments are unavailable for any length of time, your company will lose a lot of money, even millions of dollars, for every minute of downtime.

Most of my IT clients manage multiple environments, many of which fall into the business-critical category. One proactive step is to define “key” or “critical” environments, which can be assigned to a specific individual accountable for the restoration of service for that environment.

The Information Technology Infrastructure Library (ITIL) defines a typical incident management process as one designed to restore services as quickly as possible; a “major incident” management process focuses specifically on business-critical service restoration. When incidents cause major business impact beyond what typical major incident management handles, it’s important to pinpoint accountability, with special attention beyond the regular major incident process, for those business-critical environments where your company would experience a significant loss of capital or critical functionality.

The First Responder Role

Under multiple business-critical environment scenarios, each major environment is assigned a first responder who assumes the major incident lead role for accountability and leadership. The first responder has accountabilities that are typically over and above the normal incident management processes for which an incident manager and/or major incident manager may be responsible. The first responder’s accountabilities are to:

  • Restore service for those incidents that fall into the agreed-upon top priority assignment (P0/P1, or S0/S1, depending upon whether priority or severity is the chosen terminology), as well as all technical support team escalations and communications to management regarding incident status and follow-up, once resolved.
  • Create documentation to guide the service restoration process (often referred to as a playbook or other unique name recognized for each major environment), which specifies contacts for technical teams, major incident management procedures for that specific environment, identification of the critical infrastructure components that make up the environment, or other environment-specific details that would be needed for prompt service restoration and understanding of the environment.
  • Develop the post-incident review process and communications, including the follow-up problem management process (in coordination with any existing problem management team) to ensure its successful completion and documentation.

I also recommend that this primary process management role of accountability be assigned to someone familiar with all of the components and processes of the specific environment they are responsible for, so the management process can run as smoothly as possible for business-critical incidents.

Reducing the Business-Impact of Major Incidents

With a first responder in place, the procedure for resolving major incidents is more prescribed. With each major incident, your company learns what is causing incidents and, most importantly, has a documented process in place for resolution. Ultimately, incidents are resolved faster and more efficiently, your company avoids costly losses of capital or critical functionality due to downtime, and it is better able to prevent similar incidents in the future.

The business increasingly looks to IT to drive innovation. By keeping business-critical environments available, you can deliver on business goals that contribute to the bottom line.

Brian Florence is a transformation consultant with VMware Accelerate Advisory Services and is based in Michigan.

Using vCloud Suite to Streamline DevOps

By: Jennifer Galvin

A few weeks ago I was discussing mobile app development and deployment with a friend. This particular friend works for a company that develops mobile applications for all platforms on a contract-by-contract basis. It’s a good business. But one of the key challenges they have is the time and effort required to install a client’s development and test environment so that they can start development. Multiple platforms need to be provisioned, and development and testing tools that may be unique to each platform must be installed. This often means maintaining large teams with specialized skills and a broad range of dev/test environments.

I have always been aware that VMware’s vCloud Suite can speed up deployment of applications (even complex application stacks), but I didn’t know whether long setup times were common in the mobile application business. So I started to ask around:

“What was the shortest time possible it would take for your development teams to make a minor change to a mobile application, on ALL mobile platforms – Android, iPhone, Windows, Blackberry, etc?”

The answers ranged between “months” and “never”.

Sometime later, after presenting VMware’s Software Defined Datacenter vision to a tech meetup in Washington, D.C., a gentleman approached me to discuss the question I had posed. While he liked the SDDC vision, he wondered if I knew of a way to use vCloud Suite, with its software-controlled approach to everything, to speed up mobile development. So I decided to sketch out how the blueprints and automated provisioning capabilities of the vCloud Suite could help speed up application development on multiple platforms.

First, let’s figure out why this is so hard in the first place – after all, mobile development SDK’s are frameworks, and while it takes a developer to write an app, the SDK is still doing a lot of the heavy lifting. So why is this still taking so long? As it turns out, there are some major obstacles to deal with:

  • Mobile applications always need a server-side application to test against: mobile applications interact with server-side applications, and unless your server-side application is already a stable, multi-tenant application that can deal with the extra performance drain of 32 developers running amok (and you don’t mind upsetting your existing customers), you’re going to need to point them at a completely separate environment.
  • The server-side application is complex and lengthy to deploy: A 3-tier web application with infrastructure (including networking and storage), scrubbed production database data to provide some working data, and front-end load balancing is the same kind of deployment you did when the application initially went into production. You’re not going to start development on your application any time soon unless this process speeds up (and gets more automated).

Let’s solve these problems by getting a copy of the application (and a copy of production-scrubbed data) out into a new Testing area so the developers can get access to it, fast. vCloud Suite provides a framework for the server-side application developers to express its deployment as a blueprint, capable of deploying not just the code, but all the properties to automate the deployment, and consumes capacity from on-premises resources as well as those from the public cloud. That means that when it comes time to deploy a new copy (with the database refreshed and available), it’s as easy as a single click of a button.
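Conceptually, a blueprint captures the whole stack as data so that deployment becomes a repeatable, one-click operation. The sketch below is illustrative only; it does not reflect vCloud Suite's actual blueprint format, and all names are made up:

```python
# Hypothetical blueprint for the server-side test environment described above.
blueprint = {
    "name": "webapp-test-env",
    "tiers": [
        {"name": "web", "count": 2, "template": "centos-web", "lb": True},
        {"name": "app", "count": 2, "template": "centos-app"},
        {"name": "db", "count": 1, "template": "oracle-db",
         "post_provision": ["restore-scrubbed-dump"]},  # refreshed test data
    ],
    "network": {"isolated": True},  # keep dev traffic away from production
}

def deploy(bp: dict) -> list[str]:
    """'One click': walk the blueprint and provision every tier from its template."""
    return [f"provision {t['count']}x {t['template']}" for t in bp["tiers"]]
```

Because the environment is fully described as data, deploying a second (or tenth) copy for another development team is the same single operation.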

Since the underlying infrastructure is virtualized, compute resources are used or migrated to make room for the new server-side application. Other testing environments can even be briefly powered down so that this testing (which is our top priority) can occur.

Anyone can deploy the application, and what used to take hours and teams of engineers can now be done by one person. However, we are still aiming to deploy this on all mobile platforms. In order to put all of our developers on this challenge, we first need to ensure they have the right tools and configurations. In the mobile world, that means more than just installing a few software packages and adjusting some settings. In some cases, that could mean you need new desktops, with entirely different operating systems.

Not every mobile vendor offers an SDK on all operating systems, and in fact, there isn’t one operating system that’s common to the top selling mobile phones today.

For example, you may only develop iOS applications using Xcode, which runs only on Mac OS X. Both Windows and Android rely on SDKs compatible with Windows, and each has dependencies on external libraries (especially Android). Many developers favor MacBooks running VMware Fusion to accommodate all of these different environments, but what if you decide that, to rewrite the application quickly, you need some temporary contractors? Those contractors will need development environments with the right SDKs and testing tools.

This is also where vCloud Suite shines. It provides Desktop as a Service to those new contractors. The same platform that allowed us to provision the entire server-side application allows us to provision any client-side resources they might need.

By provisioning all of the infrastructure at once, we are now ready to redevelop our mobile app. We can spend developer time on development and testing, making it the best app possible, instead of wasting resources on deploying work environments.


Now, let’s think back to that challenge I laid out earlier. Once you start deploying your applications using VMware’s vCloud Suite, how long will it take to improve your mobile applications across all platforms? I bet we’re not measuring that time in months any longer. Instead, mobile applications are improved in just a week or two.

Your call to action is clear:

  • Implement vCloud Suite on top of your existing infrastructure and public cloud deployments.
  • Streamline your application development process by using vCloud suite to deploy both server and client-side applications, dev and test environments, dev and test tools, and sample databases – for all platforms – at the click of a button.

Follow @VMwareCloudOps on Twitter for future updates, and join the conversation by using the #CloudOps and #SDDC hashtags on Twitter.

A Critical Balance of Roles Must Be in Place in the Cloud Center of Excellence

By: Pierre Moncassin

There is a rather subtle balance required to make a cloud organization effective – and, as I was reminded recently, it is easy to overlook it.

One key requirement for running a private cloud infrastructure is to establish a dedicated team, i.e., a Cloud Center of Excellence. As a whole, this group acts as an internal service provider in charge of all technical and functional aspects of the cloud, but it also deals with the user-facing aspects of the service.

But there is an important dividing line within that group: the Center of Excellence itself is separated between Tenant Operations and Infrastructure Operations. Striking a balance between these teams is critical to a well-functioning cloud. If that balance is missing, you may encounter significant inefficiencies. Let me show you how that happened to two IT organizations I talked with recently.

First, where is that balance exactly?

If we look back at that Cloud Operating Model (described in detail in ‘Organizing for the Cloud‘), we have not one, but two teams working together: Tenant Operations and Infrastructure Operations.

In a nutshell, Tenant Operations own the ‘customer-facing’ role. They want to work closely with end-users. They want to innovate and add value. They are the ‘public face’ of the Cloud Center of Excellence.

On the other side, Infrastructure Ops have to deal with only one customer: Tenant Operations. In addition, they handle hardware, vendor relationships, and generally the ‘nuts and bolts’ of the private cloud infrastructure.

Cloud Operating Model
But why do we need a balance between two separate teams? Let’s see what can happen when that balance is missing with two real-life IT organizations I met a little while back. For simplicity I will call them A and B – both large corporate entities.

When I met Organization A, it had only a ‘shell’ Tenant Operations function. In other words, its cloud team was almost exclusively focused on infrastructure. The result? Unsurprisingly, it scored fairly high on standardization and technical service levels. But end users either accepted a highly standardized offering or had to jump through hoops to negotiate exceptions; neither option was quite satisfactory. Overall, Organization A struggled to add recognizable value for its end users (“we are seen as a commodity”) because it lacked a well-developed Tenant Operations function.

Organization B had the opposite challenge. It belonged to a global technology group specializing in large-scale software development. Application development leaders could practically set the rules about what infrastructure could be provisioned. Because each consumer group wielded so much influence, there was practically a separate Tenant Operations team for each software unit.

In contrast, there was no distinguishable Infrastructure Ops function. Each Tenant Operations team could dictate separate requirements, so the overall infrastructure architecture lacked standardization – which risked defeating the purpose of a cloud approach in the first place. With the balance tilted towards Tenant Operations, Organization B probably scored highest on customer satisfaction – but only as long as customers did not have to bear the full cost of non-standard infrastructure.


In sum, having two functionally distinct teams (Tenant and Infrastructure) is not just a convenient arrangement but a necessity for operating a private cloud effectively. There should be ongoing discussions and even negotiation between the two teams and their leaders.

In order to foster this dual structure, I would recommend:

  1. Define a charter for both teams that clearly outlines their respective roles and ‘rules of engagement.’
  2. Make clear that the teams’ overall objectives are aligned, although their roles differ. That alignment can be reflected in management objectives for the leaders of each team. However, governance also needs to be in place to give them the means to resolve their disagreements.
  3. To help customers realize the benefits of standardization, consider introducing service costing (if it is not already in place) so that consumers see the cost of customization.

Follow @VMwareCloudOps and @Moncassin on Twitter for future updates, and join the conversation by using the #CloudOps and #SDDC hashtags on Twitter.

Industry Veterans Share Key Lessons on Delivering IT as a Service

Ian Clayton, an ITSM industry veteran, and Paul Chapman, VMware’s Vice President of Global Infrastructure & Cloud Operations, know a lot about IT service delivery. Join Ian and Paul next Tuesday, July 23rd at 9am PT, as they share real lessons learned about delivering IT as a Service.

You will get more out of this brief webcast than most of the sessions presented at expensive conferences!

The webinar will cover:

  • VMware’s own IT service delivery transformation based on cloud
  • Business justification of an ITaaS delivery model
  • Key success factors for driving technology and operational transformation

Outside-in thinking is needed to give IT a winning strategy. But inside-out leadership is required to make the changes that enable a successful execution. Don’t miss this opportunity to hear from IT experts as they share real advice in successfully delivering IT as a Service in the cloud era – register now!

We’ll also be live-tweeting during the event via @VMwareCloudOps – follow us for updates! Also join the conversation by using the #CloudOps and #SDDC hashtags.

Tips for Using KPIs to Filter Noise with vCenter Operations Manager

By: Michael Steinberg and Pierre Moncassin

Deploying monitoring tools effectively is both a science and an art. Monitoring provides vast amounts of data, but we also want to filter the truly useful information out of these data streams – and that can be a challenge. We know how important it is to set trigger points to get the most out of metrics. But deciding where exactly to set those points is a balancing act.

We all know this from daily experience. Think car alarms: If limits are set too tight, you can trigger an alarm without a serious cause. People get used to them. They become noise. On the other hand, if limits are too loose, the important events (like an actual break-in) are missed, which reduces the value of the service that the alarm’s supposed to deliver.

Based on my conversations with customers, vCOps’ out-of-the-box default settings tend to be on the tight side, sometimes resulting in more alerts than are useful.

So how do you make sure that you get the useful alerts but not the noise? I’ve found that assigning Key Performance Indicators (KPIs) to each VM is the best way to filter the noise out. So this post offers some tips on how to optimally use KPIs.

First, Though, a Quick Refresher on KPIs

By default, vCOps collects data for all metrics every five minutes. As part of its normal operations, vCOps applies statistical algorithms to that data to detect anomalies in performance – KPIs are outputs from those algorithmic measurements.

Within vCOps, a metric is identified as a KPI when its level has a clear impact on infrastructure or application health. When a KPI metric is breached, the object it is assigned to will see its health score impacted.

A KPI breach can be triggered in the following ways:

  • The underlying metric exceeds a given value (Classic Threshold).
  • The underlying metric is less than a given value (Classic Threshold).
  • The underlying metric goes anomalous. This is a unique capability of vCOps, where a ‘normal’ range is automatically calculated so that abnormal values can be detected.

Typically, you would use one of these three options when setting a threshold, but combinations are also allowed. For example, you may want to set a classic threshold for disk utilization that exceeds a certain percentage. This can be combined with a dynamic threshold – where an alert is triggered if CPU utilization goes above its monthly average by more than x%.
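The three breach conditions, and their combination, can be sketched as follows. This is a minimal illustration in which a simple z-score stands in for vCOps’ proprietary anomaly detection; the function and parameter names are mine, not the product’s:

```python
from statistics import mean, stdev

def breached(value, history, upper=None, lower=None, sigma=3.0):
    """Check a sample against classic upper/lower thresholds and a
    simple anomaly test (value far outside its historical range)."""
    if upper is not None and value > upper:   # classic threshold: exceeds value
        return True
    if lower is not None and value < lower:   # classic threshold: below value
        return True
    if history and len(history) > 1:          # stand-in for dynamic thresholds
        mu, sd = mean(history), stdev(history)
        if sd and abs(value - mu) > sigma * sd:
            return True
    return False
```

Combining a classic threshold with history-based detection, as in the disk/CPU example, is just passing both `upper` and `history` in the same call.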

Tips for Optimizing KPIs

KPIs provide the granular information that make up the overall health score of a component in the infrastructure, such as an application. The overall health score is a combination of statistics for Workload, Anomalies, and Faults.

Overly-sensitive KPI metrics, however, can cause health scores to decrease when there isn’t an underlying issue. In such instances, we need to optimize the configuration of vCOps so that the impact of anomalous metrics on health scores is mitigated.

Here are some ideas for how to do that:

Tip 1 – Focus on Metrics that Truly Impact Infrastructure Health

First, it’s good to limit the number of metrics you put in place.

With too many metrics, you’re likely to have too many alerts – and then you’re still in a situation analogous to having car alarms going off too often to be noticed.

Remember, overall health scores are impacted by any metric that moves outside its ‘normal’ range. vCOps calculates the ‘normal’ range based on historical data and its own algorithms.

Tip 2 – Define KPI Metrics that will Trigger Important Alerts

Next, you want the alerts that you do define to be significant. These are the alerts that impact objects important to business users.

For example, you could have a business application with a key dependency on a database tier. An issue with the database or its performance would immediately impact the user community. You’d therefore want your KPIs to focus on the set of metrics that most closely monitor that database tier’s infrastructure.

Tip 3 – Use KPIs Across All Infrastructure Levels

In order to see the maximum benefit of KPI metrics, each metric should be assigned to the individual virtual infrastructure object (i.e. Virtual Machine), as well as any Tiers or Applications that the Virtual Machine relates to.

This is an important step as it makes the connection between the VM metrics and the application it relates to. For example, it may not be significant in itself that a VM is over-utilized (CPU usage over threshold), but it becomes important if the application it supports is impacted.


Let’s assume a customer has a series of database VM servers that are used for various applications. The VM, Tier and Application assignments are illustrated below in the table.

VM        Tier    Application
orasrv1   DB      WebApp1
orasrv2   DB      CRMApp1
orasrv3   DB      SvcDesk1

The application team has specified that the CPU Utilization for these VMs should not exceed 90% over three collection intervals (15 minutes). Therefore, our KPI metric is CPU Utilization %.

The KPI metric is assigned to all of the resources identified in the table above: each VM has the KPI assigned to it, as does the DB Tier within each Application, and each Application itself. For example, both the DB tier within the WebApp1 application and the WebApp1 application as a whole are assigned a KPI based on the orasrv1 VM’s metric.

With these assignments, health scores for the VMs, Tiers and Applications will all be impacted when the CPU Utilization for the respective VM is over 90% for 15 minutes. Virtualization administrators can then accurately tell application administrators when their Application health is being impacted by a KPI metric.
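The breach-and-propagation logic described above can be sketched in a few lines. This is an illustrative model only, not the vCOps API; the data structures and function names are assumptions built from the table and the 90%/three-interval rule:

```python
# Illustrative sketch (not the vCOps API): flag a KPI breach when CPU
# utilization exceeds 90% for three consecutive collection intervals,
# then report every object whose health score the breach would impact.

THRESHOLD = 90.0  # % CPU utilization
INTERVALS = 3     # consecutive 5-minute collections (15 minutes)

# VM -> (tier, application) assignments from the table above
assignments = {
    "orasrv1": ("DB", "WebApp1"),
    "orasrv2": ("DB", "CRMApp1"),
    "orasrv3": ("DB", "SvcDesk1"),
}

def kpi_breached(samples):
    """True if the most recent INTERVALS samples all exceed THRESHOLD."""
    recent = samples[-INTERVALS:]
    return len(recent) == INTERVALS and all(s > THRESHOLD for s in recent)

def impacted_objects(vm, samples):
    """List the VM, its tier, and its application when the KPI breaches."""
    if not kpi_breached(samples):
        return []
    tier, app = assignments[vm]
    return [vm, f"{app}/{tier}", app]

print(impacted_objects("orasrv1", [85, 92, 95, 97]))
# -> ['orasrv1', 'WebApp1/DB', 'WebApp1']
print(impacted_objects("orasrv2", [92, 95, 80]))
# -> []  (breach not sustained for three intervals)
```

Because the KPI is assigned at every level, a single sustained breach on orasrv1 degrades the health of the VM, the DB tier, and the WebApp1 application together, which is exactly what lets administrators trace an application-level symptom back to the responsible VM.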


When it comes to KPI alerts, there are three steps you can take to help “filter the noise” in vCOps.

1)   Focus on a small number of metrics that truly impact infrastructure health.

2)   Define KPI metrics that will trigger the important alerts.

3)   Set up these KPI metrics consistently across infrastructure levels (e.g. VM, Tier, Application), so that issues are not missed at any particular level.

For future updates, follow @VMwareCloudOps on Twitter, and join the conversation by using the #CloudOps and #SDDC hashtags.

Reaching Common Ground When Defining Services – Highlights from #CloudOpsChat

On May 30th, we hosted our monthly #CloudOpsChat on “Reaching Common Ground When Defining Services.” Thanks to all who participated for making it an informative and engaging conversation. We would also like to thank John Dixon (@GreenPagesIT) from GreenPages and Khalid Hakim from VMware (@KhalidHakim47) for co-hosting the chat with us.

To kick off the chat we asked, “What exactly is an IT service?”

Our co-host @KhalidHakim47 suggested they are intangible by nature, unlike products. Our other co-host, @GreenPagesIT, gave the textbook answer: IT services are an asset worthy of investment. He added that an application alone is not an IT service. @kurtmilne defined an IT service as something designed to deliver something to someone in a form or function that meets their need. @AngeloLuciani said that an IT service delivers a business outcome. @KongYang saw it as a bounded deliverable that states which things are being provided by whom and the support that’s to be rendered when things fail.

Next we asked, “Why should you define services in the first place?” Followed by, “What are the benefits of doing so for your users?”

@KhalidHakim47 started off by saying that you cannot claim you manage services until they are defined in the first place. @kurtmilne said service definitions set expectations, which are a key dependency for creating satisfied users. @jfrappier added to Khalid’s point, saying that you also can’t control your public cloud vendors, so as a consumer you need clear definitions. Khalid went on to say that without a service definition, the boundaries may be loose between IT deliverables – setting expectations becomes much clearer when you address a well-defined service. @harrowandy chipped in saying the definition of services helps to make sure that the customer and IT are expecting the same outcome, with which @alamo_jose agreed. Co-host @GreenPagesIT said IT services help to organize people around a delivery objective instead of a technology objective.

We then noted that multiple roles contribute to specifying a service definition and asked, “What roles are involved in defining each service?”

@KhalidHakim47 argued that the driving and accountable role for defining a service is the service owner/manager, but it is not a one-man show. According to Khalid, @CloudOpsVoice and @alamo_jose, some of the key roles involved include the Business Unit Liaison, IT Service Manager, Consumer Relationship Manager, Portfolio/Catalog Manager and Architect, the Service Liaison Manager and Service Catalog Manager. Co-host @GreenPagesIT explained that at first pass, it’s a small group that defines the service, but eventually more parties become involved as you roll into continual service improvement (CSI). @harrowandy said the service must have an owner who takes the service from cradle to grave, from initiation to retirement.

We then asked our audience, “Are there recommended approaches to getting multiple groups of users to reach consensus in their service definition?”

@AngeloLuciani explained that groups need to be driven by the business strategy and outcomes. @harrowandy agreed, adding that if groups don’t know the business strategy, how can IT provide them what they want? Co-host @KhalidHakim47 suggested that during the service definition planning phase, all expected roles should be looped into the exercise with clear goals and outcomes. @KongYang made a great analogy, saying too many chefs in the kitchen will kill the service – instead, we should look to have one chef for one service, a point with which many of our participants agreed.

Next, co-host @GreenPagesIT wondered: “Are there recommended approaches to balancing the needs of both IT and service consumers?”

@kurtmilne said that IT can deliver fast and cheap if standardized, but slow and expensive if customized. Agreeing, @KhalidHakim47 said there’s a balancing act between packaging/standardizing and customizing. @harrowandy suggested using the “80/20” rule: you can get 80% of what you want now, or wait a certain number of weeks for the remaining 20%. Kurt also brought up the fact that IT service standardization gives users more flexibility at the business process level, with which @alamo_jose agreed, adding that IT must help the business understand that reality. Co-host @KhalidHakim47 noted that standardization drives efficiency, but allowing more service levels gives more freedom as well. Co-host @GreenPagesIT added that requirements, not specs, should be negotiated during service definition.

Switching gears, we then asked “What service components do you think should be included in a service definition?”

@kurtmilne stated that pricing services is key – pricing requires accurate costing, and costing requires clear service definition, thus making the whole process come full circle. @alamo_jose added that ownership, SLA/OLA, a clear definition, features, cost and related services should all be included. Co-host @GreenPagesIT said that knowledge of how to access the service is a necessary service component, as well as hours of operation.

To round off the chat we closed with the question, “What do you do after you define services? What are the next steps?”

For @jfrappier, the answer was, “IT needs to define, then document and automate.” @alamo_jose chipped in, saying that once the service is defined, it should be published in the Service Catalog, with @AngeloLuciani adding that IT also needs to educate and communicate on how to leverage the services. @ckulchar, however, had a very different answer – once services are defined and delivered, he suggested, users should drink beer and celebrate!

Thanks again to everybody who participated in our #CloudOpsChat, and stay tuned for details about our next #CloudOpsChat!

Feel free to tweet us at @VMwareCloudOps with any questions or feedback, and join the conversation by using the #CloudOps and #SDDC hashtags.