Tag Archives: vCenter Operations Manager

VMware vCenter Operations Manager Users: Raise Your Hands!

By Choong Keng Leong

I innocently asked attendees in a workshop I was delivering at one of my clients, “Who uses VMware vCenter Operations Management Suite in your company?” I got two simple answers: “Cloud administrator” or “VM administrator.” That prompted me to write this blog post, which I hope will change your thinking if you would have given the same answer.

The vCenter Operations Management Suite consists of four components:

  • vCenter Operations Manager : Allows you to monitor and manage the performance, capacity and health of your SDDC infrastructure, operating systems and applications
  • vCenter Configuration Manager : Enables you to automate configuration management across virtual and physical servers, and continuously assess them for compliance with IT policies and with regulatory and security requirements
  • vCenter Hyperic : Helps to monitor operating systems, databases, and applications
  • vCenter Infrastructure Navigator : Automatically discovers and visualizes application components and infrastructure dependencies

If I were to map the vCenter Operations Management Suite to the IT processes it can support, it would look like the matrix shown in Table 1:

Table 1: A Possible vCenter Operations Management Suite to Process Mapping

What Table 1 also implies is that multiple roles will be using and accessing vCenter Operations Manager, or be a recipient of its outputs, such as reports. For example, the IT Director can access the vCenter Operations Manager Dashboard to view the overall health of the infrastructure. The Application Support team accesses it via a Custom Dashboard to understand application status and performance. The IT Compliance Manager reviews the compliance status of IT systems on the vCenter Operations Manager Dashboard and gets more details from the vCenter Configuration Manager to initiate remediation of the systems.

Table 2 below shows a possible list of roles accessing the vCenter Operations Management Suite.

Table 2: Possible List of Roles Using vCenter Operations Management Suite

Tables 1 and 2 illustrate clearly that vCenter Operations Management Suite is not just another lightweight app for the cloud or VM administrator — it supports multiple IT operational processes and roles.

Taking a step further, you need to embed vCenter Operations Management Suite into operational procedures to take maximum advantage of the tools’ full potential and integrated approach to performance, capacity, and configuration management. To draw an analogy: suppose you deploy a new SAP system without defining the triggers or use cases for a user to access it, and without establishing the procedural steps covering which modules to access, how to navigate the system, what to input, and how to query and report. It is unlikely the system will be rolled out successfully.

Although vCenter Operations Management Suite is not as complex, the concept is the same. You need to define procedures with tight linkage to the tools to ensure the tools are used consistently and in the way they were designed and configured.

I hope that my blog motivates you to start thinking about transforming your IT operations to make full use of the capabilities of your VMware technology investment.

========
Choong Keng Leong is an operations architect with VMware Professional Services and is based in Singapore. You can connect with him on LinkedIn.

VMware’s Call to IT Professionals: Be Brave

At VMworld last week, VMware CEO Pat Gelsinger laid it out: IT is no place for the timid. In a world where business has become not only lightning fast, but also increasingly fluid, he said, “…the biggest risk is perpetuating the status quo.” Success depends on willingness to move fast, be decisive, and take calculated risks.

No matter where you are on your IT transformation journey, learning from others who have taken those risks and moved their organizations forward can help you take the next step. VMworld attendees had plenty of opportunities to do just that, with companies like Medtronic and Boeing sharing their stories of operational transformation. In addition, VMware consultants and experts dug into practical application around SDDC and more.

Medtronic: Taking Risks to Improve Health
As the world’s largest healthcare technology company, Medtronic provides some of the most critical types of technology solutions: those that alleviate pain, restore health, and extend life. Steve Arsenault, Medtronic Vice President of Global Application and Infrastructure Services, subscribes fully to risk taking as a requirement for IT. He told the keynote audience that even more important than taking risks, “…you have to be able to articulate those risks to your business stakeholders in a way that they understand.”

Later in the day, Medtronic IT’s Adrian Woodward and John Bistodeau shared how Medtronic is leveraging the VMware vCloud Automation Center and the vCloud Suite to jumpstart IT transformation and ultimately lead them to delivering both infrastructure and platforms as a service.

There was also an air of realism in their IT journey, as they had to be bold within the confines of the all-too-familiar flat IT budgets. Remarkably, they were able to reduce a 27-step process to four steps, and from 10 days to a single day turnaround time. The results speak for themselves, and it’s safe to say that VMware is helping Medtronic IT live up to its internal marketing slogan “Medtronic runs on IT.”

Boeing: Virtual Technology, Real Results
Operating within a company that has innovation in its DNA, Senior IT Director Enes Yildirim introduced ways that Boeing IT has shown tangible business results through its move to IT as a service. The aerospace giant’s cloud strategy includes program goals around performance, cost, operational maturity, differentiation, and strategy. Within that framework, the organization follows guiding principles that lay a foundation for a culture of innovation in IT:

  • Bank on Success
  • Leverage the Good
  • Self-Fund Innovation
  • Partner for Success
  • Don’t Accept the Status Quo!

By fostering innovation and adopting the cloud strategy, Boeing IT has significantly reduced cost per virtual machine, while increasing capacity to serve the business.

vCenter Operations Manager: People and Process Considerations
One of the most popular sessions in the Operations Transformation track at VMworld was Rich Benoit, VMware operations architect, presenting Maximizing the Out-of-the-Box Functionality of vCenter Operations Manager: People and Process Considerations. Rich cautioned the group against the “technology-only” approach. Instead, he encouraged getting in front of change by identifying roles and processes that will be impacted even before initiating a project. This way, as the silos break down and collaboration begins, people can focus on moving forward. In an implementation as broad as vCenter Operations Manager, Rich detailed the new organizational requirements and how to ensure the top 20 dashboards in the tool are used and the data shared to show business value. Stay tuned for a future blog and deeper dive by Rich.

Medtronic and Boeing have embraced the calculated risk taking required to make operational changes. With the right technology tools and an open approach to people and process changes, it’s a little easier to take Pat’s advice and be brave in the face of IT challenges.

Check out videos from VMworld 2014.

 

 

Implementing a Cloud Infrastructure Is About Changing Mindsets: Three Ways Cloud Operations Can Help

By: Pierre Moncassin

A few weeks ago, I had the privilege of attending the first in a series of cloud operations customer roundtables in Frankfurt, Germany. The workshop was expertly run by my colleague Kevin Lees, principal consultant at VMware and author of “Organizing for the Cloud” as well as numerous VMware CloudOps blog posts.

Customer participation in the round table exceeded our expectations – and was highly revealing. It quickly became obvious that process and organization challenges ranked at the top of everyone’s priorities. They needed no convincing that a successful cloud deployment needs operations transformation in addition to leading-edge tools.

Even so, I was amazed how rapidly the conversation turned from technical strategy to organizational culture and, most importantly, changing mindsets.

I remember one customer team in particular outlining for us the challenge they face in operating their globally-distributed virtual infrastructure. They were acutely aware of the need to transform mindsets to truly leverage their VMware technology – and of how difficult that was proving to be.

For them, changing mindsets meant looking beyond traditional models, such as the monolithic CMDB (an idea deeply entrenched in physical IT). It also meant handling the cultural differences that come with teams based in multiple locations around the world and, more than ever, aligning teams with different functional objectives to common goals and gaining commitments across boundaries.

To state the obvious, changing organizational mindsets is a vast topic, and many books have been written about it (with many more to come, no doubt). But here I want to explore one specific question: How can cloud operations help IT leaders, like our customer above, in their journeys to mindset change?

For them, I see three main areas where cloud operations can bring quick wins:

1) Create Opportunities to Think Beyond ‘Classic’ IT Service Management

Part of the journey to cloud operations is to look beyond traditional frames of reference. For some of our customer teams, the CMDB remains an all-powerful idea because it is so entrenched in the traditional ITSM world. In the world of cloud infrastructure, the link between configuration items and physical locations becomes far less rigid.

It is more important to create a frame of reference around the service definition and everything needed to deliver the service. But adopting a service view does require change, and that’s not something that we always embrace.

So how do you encourage teams to “cross the chasm?” One simple step would be to encourage individuals to get progressively more familiar with VMware’s Cloud Operations framework (by reading ‘Organizing for the Cloud,’ for example).

After that, they could take on a concrete example via a walk-through of some key tools. For example, a VMware vCenter Operations Manager demo can illustrate how a cloud infrastructure can be managed in a dynamic way. It would show how dashboards automatically aggregate multiple alerts and status updates. Team members would see how built-in analytics can automatically identify abnormal patterns (signaling possible faults) in virtual components wherever they are physically located. A demo of vCloud Automation Center’s use of blueprints to automate provisioning of full application stacks would show how new tools that leverage abstraction can help break through process-bound procedures that were developed for more physical environments.

All of this would build familiarity with, and likely excitement at, the possibilities inherent in cloud-based systems.

2) Break Down Silos with the Organizational Model

A key principle of VMware’s cloud operations approach is to break down silos by setting up a Center of Excellence dedicated to managing cloud operations. You can read more about how to do that in this post by Kevin Lees.

The main point, though, is that instead of breaking processes up by technology domain (e.g., Windows or UNIX) or by geography, Cloud Operations emphasizes a consistency of purpose and focus on the service delivered that is almost impossible to achieve in a siloed organizational structure.

Simply by creating a Cloud Infrastructure Operation Center of Excellence, you are creating a tool with which you can build the unity that you need.

3) Boost Team Motivation

Lastly, although a well-run cloud infrastructure should in itself add considerable value to any set of corporate results, don’t forget the influence held by individual team members facing a change in their work practice.

In particular, consider their likely answer to the question “What’s in it for me?”

Factors that might positively motivate team members include:

  • Acquiring new skills in leading-edge technologies and practices (including VMware certifications, potentially)
  • Contributing to a transformation of the IT industry
  • Being part of a well-defined, well-respected team e.g. a Center of Excellence.

So, remember to make that case where you can.

Here, then, are three key ways in which you can leverage cloud operations to help change mindsets:

  1. Understand that moving to cloud is a journey. Every person has their own pace. Build gradual familiarity both with new tools and concepts. Check out more of our CloudOps blog posts and resources!
  2. Build a bridge across cultural differences with the Center of Excellence model recommended by VMware CloudOps.
  3. Explain the benefits to the individual of making the jump to cloud e.g. being part of a new team, gaining new skills – and a chance to make history!

Follow @VMwareCloudOps on Twitter for future updates, and join the conversation by using the #CloudOps and #SDDC hashtags on Twitter.

Problem Management with vCenter Operations: Dealing with Events and Incidents Before They Impact Users

By: Pierre Moncassin

In some more traditional IT environments, if you have “problem manager” anywhere near your job title, you are probably faced with formidable challenges.

Let me guess… your mission is to steer the IT infrastructure clear of forthcoming issues – sometimes referred to as root causes – that will lead to incidents. Most of the time, though, you can only see what occurred in the past. To take a page from the famous TV series, an incident has occurred and Detective Columbo is called to the scene. What has occurred, he asks. Is there a pattern? Did anyone notice other incidents occurring around the same time?

That kind of thing you can probably do in your sleep. But however talented a detective you may be, this fact remains: You likely have little visibility into future incidents. You see some clues scattered around (also known as alerts), but these alerts cannot be readily interpreted without hours of manual work.

Fortunately, a tool like vCenter Operations Manager allows you to accelerate the scenario for Problem Management. Think of it as an assistant that can connect all the clues together and link them to potential suspects (root causes). The groundwork is done for you so that you can focus on the truly proactive work.

But vCenter Operations Manager pushes the envelope even further. Proactive analytics can detect impending outages before users are impacted. In detective terms, not only can you identify the suspects faster, you get an advance notice on their next move.

Now enough theory – let’s see how that works in practice.


Fig 1.
First off, let us look at the Health Badge (Fig 1.) which is built in as standard with vCenter Operations Manager. It is a dashboard that can provide you with instant visibility into the current state of the infrastructure. You can not only identify immediate issues but also use proactive capabilities like the risk badge to detect which areas of the infrastructure might fail in the future. In a nutshell: You don’t need to wait for an outage before responding.


Fig 2.
Another way to identify potential issues is by setting up Early Warning Smart Alerts in vCenter Operations Manager. These are alerts designed to tell you that some infrastructure components underpinning your cloud services are not operating “normally.” Unlike a traditional incident/response scenario, your overall service may well be operating perfectly fine when the alert arrives – but the alert tells you that an issue will soon need attention and gives you a chance to be proactive about it.

vCenter Operations Manager deploys advanced analytics to determine whether a component is operating within a “norm.” For now, it’s enough to say that once vCOps detects “abnormal” components beyond a certain threshold, an Early Warning Smart Alert is issued. It is the signal for the detective (a.k.a. the Problem Manager) to start investigating.
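
To make the idea concrete, here is a minimal Python sketch of that kind of rule. It is not vCOps code and does not reflect how vCOps actually computes Smart Alerts: the function name, the component names, and the 25% limit are all assumptions made up for illustration. It simply raises a warning once the share of components flagged as abnormal goes past a chosen limit.

```python
def early_warning(component_states, abnormal_fraction_limit=0.25):
    """Decide whether to raise an early warning.

    component_states maps a component name to True when that component is
    currently outside its 'normal' range. The 0.25 limit is an invented
    example value, not a vCOps default.
    """
    if not component_states:
        return False, []
    abnormal = sorted(name for name, flagged in component_states.items() if flagged)
    fraction = len(abnormal) / len(component_states)
    return fraction > abnormal_fraction_limit, abnormal


# Hypothetical snapshot: two of four components are behaving abnormally.
states = {"host-01": False, "host-02": True, "datastore-a": True, "vm-web": False}
alert, suspects = early_warning(states)
print(alert, suspects)
```

The list of abnormal components returned alongside the flag is the starting point for the detective work described next.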

As soon as a potential issue is identified, you can drill into potential root causes (as shown in Fig. 2, right hand side). It is only a short step then from detection to active prevention and remediation. If the vCenter Configuration Manager (VCM) toolset is also deployed, you can directly access the virtual infrastructure configuration and review what recent change events have occurred. If the issue is related to a known change event within VCM, you may be able to roll back the change with a single command.

In summary, the toolsets not only accelerate detection, they also allow you to take appropriate preventative actions.

Right, but is it always that easy? Not always, of course. There are situations where there are so many alerts triggered (e.g. “Alert Storms”) that the root cause becomes harder to identify. But again, the good news is that there are known ways to cut down the noise – see our earlier blog, “Tips for Using KPIs to Filter Noise with vCenter Operations Manager” for more details.

The bottom line is that if you are a Problem Manager using vCenter Operations Manager, you will see your work increasingly shifting from reactive to proactive tasks. This is because you can let automation do the groundwork. (I digress a little here, but you will find that the same happens across many traditional IT roles when moving to a vCloud Infrastructure. Less time spent on physical-world “nuts and bolts” frees more time for proactive planning. By the way, if you are curious to see how the roles evolve, check out our “Organizing for the Cloud” white paper.)

In conclusion, here are three technical reasons why VMware vCenter Operations Manager will be a game-changer for you:

  • You will accelerate root cause analysis with instant drill-down access into infrastructure issues that may impact your overall services.
  • You get a comprehensive view of the infrastructure situation via visual summaries, like the Health dashboards.
  • Last but not least, you leverage proactive analytics to get an early notice of impending incidents. Now that is something that even detective Columbo did not have.


Tips for Using KPIs to Filter Noise with vCenter Operations Manager

By: Michael Steinberg and Pierre Moncassin

Deploying monitoring tools effectively is both a science and an art. Monitoring provides vast amounts of data, but we also want to filter the truly useful information out of these data streams – and that can be a challenge. We know how important it is to set trigger points to get the most out of metrics. But deciding where exactly to set those points is a balancing act.

We all know this from daily experience. Think car alarms: If limits are set too tight, you can trigger an alarm without a serious cause. People get used to the alarms, and they become noise. On the other hand, if limits are too loose, the important events (like an actual break-in) are missed, which reduces the value of the service that the alarm is supposed to deliver.

Based on my conversations with customers, vCOps’ out-of-the-box default settings tend to be on the tight side, sometimes resulting in more alerts than are useful.

So how do you make sure that you get the useful alerts but not the noise? I’ve found that assigning Key Performance Indicators (KPIs) to each VM is the best way to filter the noise out. So this post offers some tips on how to optimally use KPIs.

First, Though, a Quick Refresher on KPIs

By default, vCOps collects data for all metrics every five minutes. As part of its normal operations, vCOps applies statistical algorithms to that data to detect anomalies in performance – KPIs are outputs from those algorithmic measurements.

Within vCOps, a metric is identified as a KPI when its level has a clear impact on infrastructure or application health. When a KPI metric is breached, the object it is assigned to will see its health score impacted.

A KPI breach can be triggered in the following ways:

  • The underlying metric exceeds a given value (Classic Threshold).
  • The underlying metric is less than a given value (Classic Threshold).
  • The underlying metric goes anomalous. This is a unique capability of vCOps where a ‘normal’ range is automatically calculated so that abnormal values can be detected.

Typically, you would use one of these three options when setting a threshold, but combinations are also allowed. For example, you may want to set a classic threshold for disk utilization that exceeds a certain percentage. This can be combined with a dynamic threshold – where an alert is triggered if CPU utilization goes above its monthly average by more than x%.
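
The three trigger types, and the way they can be combined, can be sketched in a few lines of Python. This is an illustration only, not vCOps code or configuration: the `KpiRule` class, its field names, and the sample numbers are assumptions made for this example, and the ‘normal’ range is passed in by hand rather than calculated by vCOps’ analytics.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class KpiRule:
    """Hypothetical KPI rule; field names are illustrative, not vCOps settings."""
    upper: Optional[float] = None   # classic threshold: breach if value > upper
    lower: Optional[float] = None   # classic threshold: breach if value < lower
    normal_range: Optional[Tuple[float, float]] = None  # dynamic (low, high) range


def kpi_breaches(value: float, rule: KpiRule) -> List[str]:
    """Return the trigger types that this value breaches (empty list if none)."""
    reasons = []
    if rule.upper is not None and value > rule.upper:
        reasons.append("above classic threshold")
    if rule.lower is not None and value < rule.lower:
        reasons.append("below classic threshold")
    if rule.normal_range is not None:
        low, high = rule.normal_range
        if value < low or value > high:
            reasons.append("anomalous")
    return reasons


# A combined rule: a classic upper limit plus a dynamic 'normal' range.
disk_rule = KpiRule(upper=85.0, normal_range=(40.0, 70.0))
for sample in (90.0, 75.0, 55.0):
    print(sample, kpi_breaches(sample, disk_rule))
```

With these invented numbers, a reading of 75 is still under the classic limit but already outside the normal range, which is the kind of early signal a dynamic threshold is meant to give.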

Tips for Optimizing KPIs

KPIs provide the granular information that makes up the overall health score of a component in the infrastructure, such as an application. The overall health score is a combination of statistics for Workload, Anomalies, and Faults.

Overly-sensitive KPI metrics, however, can cause health scores to decrease when there isn’t an underlying issue. In such instances, we need to optimize the configuration of vCOps so that the impact of anomalous metrics on health scores is mitigated.

Here are some ideas for how to do that:

Tip 1 – Focus on Metrics that Truly Impact Infrastructure Health

First, it’s good to limit the number of metrics you put in place.

With too many metrics, you’re likely to have too many alerts – and then you’re still in a situation analogous to having car alarms going off too often to be noticed.

Remember, overall health scores are impacted by any metric that moves outside its ‘normal’ range. vCOps calculates the ‘normal’ range based on historical data and its own algorithms.
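
As a rough mental model of a ‘normal’ range derived from history, consider the Python sketch below. It is a stand-in technique, not vCOps’ algorithm (which this post does not describe): it assumes the range is the mean of past samples plus or minus two standard deviations, and the function names and sample data are hypothetical.

```python
from statistics import mean, stdev


def normal_range(history, k=2.0):
    """Return (low, high) as the mean +/- k sample standard deviations.

    A simple stand-in for a learned 'normal' band; needs at least two samples.
    """
    centre = mean(history)
    spread = stdev(history)
    return centre - k * spread, centre + k * spread


def is_anomalous(value, history, k=2.0):
    """True when the value falls outside the band computed from history."""
    low, high = normal_range(history, k)
    return value < low or value > high


# Hypothetical CPU utilization history (%), one value per collection interval.
cpu_history = [42.0, 45.0, 40.0, 44.0, 43.0, 41.0, 46.0, 44.0]
print(normal_range(cpu_history))
print(is_anomalous(60.0, cpu_history), is_anomalous(44.0, cpu_history))
```

Every metric you track gets a band like this, so each extra metric is one more chance for a value to drift outside it and pull the health score down. That is the practical argument for keeping the list short.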

Tip 2 – Define KPI Metrics that will Trigger Important Alerts

Next, you want the alerts that you do define to be significant. These are the alerts that impact objects important to business users.

For example, you could have a business application with a key dependency on a database tier. An issue with a database or its performance would thus impact the user community immediately. In that case, you’d want to define as KPIs the small set of metrics that most closely track the health of that database’s infrastructure.

Tip 3 – Use KPIs Across All Infrastructure Levels

In order to see the maximum benefit of KPI metrics, each metric should be assigned to the individual virtual infrastructure object (i.e. Virtual Machine), as well as any Tiers or Applications that the Virtual Machine relates to.

This is an important step as it makes the connection between the VM metrics and the application it relates to. For example, it may not be significant in itself that a VM is over-utilized (CPU usage over threshold), but it becomes important if the application it supports is impacted.

Example

Let’s assume a customer has a series of database VM servers that are used for various applications. The VM, Tier and Application assignments are illustrated below in the table.

VM        Tier   Application
orasrv1   DB     WebApp1
orasrv2   DB     CRMApp1
orasrv3   DB     SvcDesk1

The application team has specified that the CPU Utilization for these VMs should not exceed 90% over three collection intervals (15 minutes). Therefore, our KPI metric is CPU Utilization %.

The KPI metric is assigned to all of the resources identified in the table above. Each VM has the KPI assigned to it. The DB Tier within each Application also has the KPI assigned to it. For example, the DB tier within the WebApp1 application is assigned a KPI for the orasrv1 VM. Finally, each Application also has the KPI assigned to it. For example, the WebApp1 application is assigned a KPI for the orasrv1 VM.

With these assignments, health scores for the VMs, Tiers and Applications will all be impacted when the CPU Utilization for the respective VM is over 90% for 15 minutes. Virtualization administrators can then accurately tell application administrators when their Application health is being impacted by a KPI metric.
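
The example can be sketched in Python as follows. This is a simplified illustration, not how vCOps is configured or implemented: the function names and the sample CPU readings are invented, and “exceeds 90% over three collection intervals” is interpreted here as “the three most recent five-minute samples are all above 90%.”

```python
THRESHOLD_PCT = 90.0   # from the example: CPU Utilization should not exceed 90%
INTERVALS = 3          # three 5-minute collection intervals = 15 minutes

# VM -> (Tier, Application), copied from the table in the example.
ASSIGNMENTS = {
    "orasrv1": ("DB", "WebApp1"),
    "orasrv2": ("DB", "CRMApp1"),
    "orasrv3": ("DB", "SvcDesk1"),
}


def sustained_breach(samples, threshold=THRESHOLD_PCT, intervals=INTERVALS):
    """True if the most recent `intervals` samples are all above the threshold."""
    recent = list(samples)[-intervals:]
    return len(recent) == intervals and all(s > threshold for s in recent)


def impacted_objects(cpu_by_vm):
    """List the VM, tier, and application impacted by each sustained breach."""
    impacted = []
    for vm, samples in cpu_by_vm.items():
        if sustained_breach(samples):
            tier, app = ASSIGNMENTS[vm]
            impacted.append({"vm": vm, "tier": f"{app}/{tier}", "application": app})
    return impacted


# Hypothetical CPU readings (%), oldest first.
cpu_by_vm = {
    "orasrv1": [88.0, 93.0, 95.0, 97.0],  # last three samples above 90%
    "orasrv2": [91.0, 92.0, 85.0],        # dropped below 90% in the last interval
    "orasrv3": [70.0, 72.0, 71.0],
}
print(impacted_objects(cpu_by_vm))
```

Because the KPI is assigned at all three levels, a single sustained breach on orasrv1 shows up against the VM, the DB tier of WebApp1, and WebApp1 itself, while the brief spike on orasrv2 impacts nothing.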

Take-Away

When it comes to KPI alerts, there are three steps you can take to help “filter the noise” in vCOps.

1)   Focus on a small number of metrics that truly impact infrastructure health.

2)   Define KPI metrics that will trigger the important alerts.

3)   Set up these KPI metrics consistently across infrastructure levels (e.g., VM, Tier, Application), so that issues are not missed at any particular level.
