
Tag Archives: Pierre Moncassin

A Day in the Life of a Cloud Service Owner

By Pierre Moncassin

Customers often tell me, “I totally get the principles behind the role of Cloud Service Owner – but can you describe what they actually do throughout their work day?”

Let me start with a caveat: we are all aware that in a knowledge economy few roles have the luxury (or burden, depending on the point of view) of a completely predictable, set routine. We are in an era where many traditional, factory assembly-line jobs have been re-designed to become more agile and less repetitive, at least in some leading-edge companies. The rationale is that job rotation and innovation, if not creativity, lead to higher productivity. Less routine, more output.

What is a Cloud Service Owner?

When I say Cloud Service Owner (CSO), I am referring specifically to the Service Owner role described in the VMware white paper: Organizing for the Cloud.

A CSO is a relatively senior role that includes responsibility for the end-to-end life-cycle of cloud services, with numerous levels of interaction with the business and cloud team members. So I will endeavor to highlight some typical aspects of a ‘day in the life’ of that role – bearing in mind the caveat above.

The Cloud Service Owner ensures a consistent, end-to-end Service Lifecycle

Cloud Service Owner Stakeholders and Interactions

The CSO interacts with a number of key stakeholders, first of all within the business lines. When a new service is requested, the CSO reviews the requirements with the business stakeholders; this may include not only the definition and costing of the service but also a discussion of how quickly it can be placed into production.

In a DevOps environment, that interaction with the business lines will still take place, with the difference that business and application development may be a single team (the business stakeholder may be an application product owner). When the (DevOps) business proposes to launch a new product (i.e. cloud based application or application features), they may need to discuss with the Service Owner how to develop the continuous delivery automation to support that new product.

The CSO will also interact with the members of the cloud operations team, for example with the Cloud Service Architect who will assist in all aspects of the design of the automation workflows and blueprints underlying the services.

Interactions will also include working with other support groups such as the Service Desk – for example, to review incidents relating to the services and work out patterns or root causes; possibly involve the Service Analyst to improve the monitoring and risk management underpinning these services.

Of course the list does not end there. The CSO may also interact with third party providers (especially in a hybrid cloud or multi-cloud model), as well as contractors. If the cloud platform is part of an outsourcing arrangement, the CSO will likely have a counterpart within the outsourcer.

Key Processes for a Cloud Service Owner

From a process perspective, our CSO will be involved in all aspects of the service life-cycle – this follows from the broad remit of the role. But I can highlight some key ‘moments’ in that life-cycle.

  • Initial definition of the service in cooperation with the business stakeholders.
    This is a critical stage where not only the scope of the service is defined, but also the expected costs and service levels.
  • Monitoring the performance of the service.
    In traditional IT, the review of SLA performance with the business takes a key role once a service is up and running. In a cloud environment, much of the SLA monitoring may be automated (a minimal sketch of what such a check might look like follows this list); however, the SLA review itself remains an important duty for the CSO.
  • Continuous improvement and, ultimately, de-commissioning of the service.
    It is expected that the service will evolve and that ultimately it will be de-commissioned (possibly freeing some capacity). This is also an activity that needs close cooperation with the business lines.
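To make the automation point slightly more concrete, here is a minimal Python sketch of the kind of automated SLA attainment check a CSO might rely on. The names (Sample, check_sla) and the thresholds are purely illustrative assumptions, not part of any VMware product or API.

```python
# Hypothetical sketch: automated SLA attainment check for a cloud service.
# All names and thresholds are illustrative only.
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    timestamp: str
    response_ms: float   # measured response time for one request

def sla_attainment(samples: List[Sample], target_ms: float) -> float:
    """Return the fraction of samples that met the response-time target."""
    if not samples:
        return 1.0
    met = sum(1 for s in samples if s.response_ms <= target_ms)
    return met / len(samples)

def check_sla(samples: List[Sample], target_ms: float, required: float = 0.95) -> bool:
    """True if the service met its SLA (e.g. 95% of requests under target)."""
    return sla_attainment(samples, target_ms) >= required

# Example: flag a breach that the Cloud Service Owner would review with the business.
samples = [Sample("2016-01-01T10:00", 180.0), Sample("2016-01-01T10:01", 650.0)]
if not check_sla(samples, target_ms=500.0):
    print("SLA breach detected - schedule review with business stakeholders")
```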

Toolsets for a Cloud Service Owner

I mentioned at the outset that the CSO is not necessarily expected to be a toolset expert. However, in order to fully leverage the capabilities of a cloud platform, the CSO needs a working knowledge of the key cloud management tools, specifically:

  • vRealize Automation – the CSO will have a solid understanding of how blueprints are designed and where/when existing blueprints can be re-used to create new (or amend existing) services.
  • vRealize Business – understand the costing models and how they can be built into the tool to automate charging/billing.
  • vRealize Operations – leverage the tool to track service performance, and generally manage service ‘health’ and capacity.
  • NSX – the CSO is less likely to interact with this tool on a daily basis, but will benefit from a solid understanding of its capabilities in order to help design the automation workflows and to plan the security controls that may be required when deploying the services (e.g. leveraging micro-segmentation).

The list is not exhaustive. Many organizations are considering, or already working towards, DevOps adoption, and in those cases the CSO will benefit from a broad grasp of industry tools, whether related to continuous delivery (think Jenkins, Code Stream, Puppet, Ansible) or specifically to cloud-native applications (think Docker, VMware’s Photon OS).

Take-aways:

  • For roles such as Cloud Service Owner, design activities should take precedence over fire-fighting. These roles will prefer an engineering approach to problems (fix a problem once).
  • Do not expect a rigid work-day pattern – the Cloud Service Owner is a senior role and will need to liaise with a range of stakeholders, and interact directly and indirectly with several tools and processes.
  • The Cloud Service Owner will maintain a healthy understanding of toolsets; and leverage that knowledge daily to design, manage and occasionally de-commission services. This is an aspect of the role that may be significantly different from more traditional Service Manager roles.
  • However, the Cloud Service Owner is not meant to be a toolset specialist. If they spend their entire day immersed in toolsets, it is a sign that the role is not fully developed!
  • The Cloud Service Owner will work in increasingly close cooperation with the business lines – not from silo to silo but through an increasingly permeable boundary.

=======

Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

3 Common Mistakes when Breaking Organizational Silos for Cloud and DevOps

By Pierre Moncassin

Every customer’s journey to Cloud, DevOps or other transformative initiatives is unique to an extent.  Yet all those journeys will come across a similar set of challenges.  With the exception of truly green-field projects, each transformation to Cloud/DevOps involves dealing with the weight of legacy – organizational and technical silos that hamper their momentum towards change.

This is why I often hear from customer teams: “We know that we need to break down those silos – but exactly how do you break them?”

Whilst I do not advocate a one-size-fits-all answer, I want to share some recommendations on how to go about breaking those silos – and some mistakes to avoid along the way.

Where do silos come from?

As discussed in earlier blogs, silos usually come into existence for valid reasons, at least at the origin. For example, when infrastructure administration relies on manual and highly specialized skills, it appears to make sense to group those skills together into clusters of deep expertise. Unix gurus, for example, might cluster together, as might Microsoft Windows experts, SQL database specialists and so on. These are examples of silos built around infrastructure skills – experts from all those areas need to align their mode of operation to support cloud infrastructure services.

Other examples of commonly found silos include:

  • Application Development to Operations: DevOps emerged precisely as a way to break down one of the ‘last great silos’ of IT – the persistent gap between the Development teams and Operations teams.
  • Business to IT: When IT becomes heavily reliant on a specialist set of skills (think mainframe programming), significant inefficiencies arise in cross-training IT staff to the business, or vice versa. In transitioning to Cloud/DevOps, this is another of the ‘great silo risks’ that the transformation will mitigate and ultimately break down completely, as Business, Application Development and Operations function as an integrated team.

Common mistakes when attempting to break down silos.

a) Toolset-only approach.

A frequent temptation for project teams is to install software-based tools and assume (or rather, hope) that the silos will just vanish by themselves. In Cloud transitions, teams might install automated provisioning but forget to work across the business/IT silos; the result is that adoption by the business generally ends up minimal. In DevOps transition attempts, the technology approach might consist of deploying, for example, Jenkins, Code Stream and other tools meant for continuous delivery, while failing to bridge the gap fully with day-two operations management – for example, without governance around incident handling or idempotent configuration management. Without a clear path to resolution that cuts across the silos, it is easy to see how issues end up not being resolved satisfactorily. The impact on customer satisfaction is predictably less than optimal.

b) Overlook the value of ‘traditional’ skills

During the transition to Cloud/DevOps, especially when considering a toolset-only approach, it can appear at first sight that many of the legacy skills have become irrelevant.   But this is often a mistaken perception. Legacy skills are likely still relevant, they simply need to be applied differently.

For example, traditional operating systems skills are almost always relevant for Cloud; however, they will be applied differently. Instead of manually configuring servers, the administrators will develop blueprints to provision servers automatically. They will use their knowledge to define standardized operating system builds.
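As a purely illustrative sketch of this shift, the fragment below captures an administrator’s operating system expertise as a standardized, versioned build definition that blueprints can reuse. All names and fields are hypothetical and do not represent a vRealize Automation blueprint format.

```python
# Hypothetical sketch: a standardized OS build captured as data rather than
# as manual configuration steps. Field names are illustrative only.
STANDARD_BUILDS = {
    "linux-web-v3": {
        "os": "Ubuntu 14.04 LTS",
        "packages": ["openssh-server", "ntp", "auditd"],
        "hardening": ["disable_root_ssh", "enable_firewall"],
        "patch_baseline": "2016-02",
    },
}

def render_blueprint(build_id: str, cpu: int, memory_gb: int) -> dict:
    """Combine a standard build with sizing to produce a provisioning request."""
    build = STANDARD_BUILDS[build_id]   # reuse of the admin's OS expertise
    return {"build": build, "cpu": cpu, "memory_gb": memory_gb}

print(render_blueprint("linux-web-v3", cpu=2, memory_gb=8))
```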

Traditional skills become all the more critical when we look into soft skills. The ability to manage stakeholder relationships, communicate across teams, organizational and business specific knowledge – are all essential to running an effective Cloud/DevOps organization.

c) Focus on the problem, not the solution

This is a well-known principle of change management – focusing on the problem will not solve it. Rather than present the teams with a problem, for example the existence of a silo, it is often far more effective to work on the solution – cross-silo organization and processes.

Does it work? I can certainly relate the experience of seeing ‘light bulb’ moments with highly specialized teams. Once they see the value of a cross-silo solution, the response is far more often “we can do this” as opposed to defending the status quo of individual silos.

In sum, focus on the vision, the end-state and the value of the end-to-end solutions.

Five recommendations to help break down silos.

  1. Shift from a silo mindset to Systems Thinking. Conceptually, all the ‘common mistakes’ mentioned above can be traced back to the persistence of a silo mindset – whether focusing on traditional (versus leading-edge) skills, new toolsets (versus legacy ones), or isolated ‘problem’ areas. The better approach is Systems Thinking, which implies an understanding that the overall organization is more than the sum of its parts. It means looking for ways not just to improve the efficiency of individual elements (skillsets, tools, process steps) but to optimize the way these elements interact.
  2. Create vision. As mentioned earlier, creating the vision is a vital step to get the team’s buy-in and to overcome silos. This can entail an initial catalog of services and outline workflows to fulfill these services. Potentially, it may be worth setting up a pilot platform to showcase some examples.
  3. Build momentum. Building the vision is important but not enough. Once initial acceptance is reached, the transformation team will need to build momentum, for example by recruiting ‘champions’ in each of the former silos.
  4. Proceed in incremental steps, building up a track record of ‘small wins’ and gradually increasing the pace of change.
  5. Establish the permanent structure. Once the change is in motion, it will be necessary to define the long-term roles that operate the Cloud/DevOps organization. These roles are detailed in ‘Organizing for the Cloud’: https://www.vmware.com/files/pdf/services/VMware-Organizing-for-the-Cloud-Whitepaper.pdf.

Take-aways

  • Breaking silos is an outcome rather than a starting point. Begin by building the vision to engage teams and motivate them to break the silos themselves.
  • Do not rely on technology alone. Toolsets (e.g. vRealize Code Stream, vRealize Automation and other VMware cloud automation tooling) augment processes, but they do not by themselves overcome silos unless they are leveraged to sustain the vision and constantly build momentum.
  • Leverage existing skills. Many of the legacy, previously silo’ed skills can be adapted to the future cloud/DevOps organization.

=======

Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

3 Capabilities Needed for DevOps that You Should Already Have in Your Cloud Organization

By Pierre Moncassin

A number of enterprise customers have established dedicated organizations to leverage VMware’s cloud technology. As these organizations reach increasing levels of cloud maturity, we are more and more often asked by our customers: “how is our organization going to be impacted by DevOps?“

Whilst there are many facets – and interpretations – to DevOps, I will highlight in this blog that many of the skills needed for DevOps are already inherent to a fully-functioning cloud organization. Broadly speaking, my view is that we are looking at evolution, not revolution.

First, let’s outline briefly what we understand by DevOps from a people/process/technology point of view:

  • People: DevOps originated as an approach, even a philosophy, that aims to break down organizational silos, specifically the traditional gap between application developers and operations teams. This is why it is often said that DevOps is first of all about people and culture. Application Developers are sometimes depicted as “agents of change” whilst the Operations team are seen as “guardians of stability” – teams with opposite objectives, which can lead to well-documented inefficiencies.
  • Process: From a methodology point of view, DevOps integrates principles such as “agile development”. Agile provides the methodological underpinning for Continuous Delivery, an approach that relies on the frequent release of production-ready code. Whilst Agile development was originally about applications, DevOps extends the principle to infrastructure (leading to the idea of “agile infrastructure”).
  • Technology: DevOps processes necessarily incorporate the use of development and automation technologies such as: source code control and management (e.g., Git); code review systems (e.g., Gerrit); configuration management (e.g., Puppet, Chef, Ansible, SaltStack); task execution and management (e.g., Jenkins); artifact and application release tooling (e.g., VMware vRealize Code Stream); and others. In order to manage those tools as well as the applications generated by them, DevOps also incorporates operations tooling such as provisioning and monitoring of the underlying infrastructure (e.g., vRealize Automation and vRealize Operations).

The features of a cloud organization adapted for VMware’s cloud technology are described in detail in the white paper “Organizing for the Cloud” (link below):

https://www.vmware.com/files/pdf/services/VMware-Organizing-for-the-Cloud-Whitepaper.pdf

DevOps Organizational Model

Here are, in my view, some key capabilities in the cloud organization as recommended by VMware:

1) The rise of developers’ reach.

As development departments mature beyond  writing strictly  application code, their reach spans broader knowledge bases. This includes writing code that performs end-to-end automation of application development, deployment and management: applications and infrastructure as code. Developers utilize the same skills traditionally relied on in application teams and apply them towards  cloud services:

  • Provisioning infrastructure, for example with VMware vRealize Automation.
  • Automating network configuration with VMware NSX.
  • Automating monitoring and performance management with VMware vRealize Operations.

This shift in reach from Ops to Dev forms the basis of ‘infrastructure as code’ – by now a relatively standard cornerstone of DevOps.
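To illustrate the idea in the most general terms, here is a conceptual Python sketch of infrastructure as code: the whole service – compute, network and monitoring – is declared as data and applied by a single pipeline. The provisioning functions are stubs standing in for calls to tools such as vRealize Automation, NSX or vRealize Operations; none of the names reflect a real API.

```python
# Conceptual sketch of 'infrastructure as code': the whole service is
# described declaratively and applied by code. The provisioning functions
# below are stubs; a real environment would call the APIs of the relevant
# automation tools.
service_spec = {
    "name": "order-api",
    "compute": {"instances": 3, "blueprint": "linux-web-v3"},
    "network": {"segment": "app-tier", "allow_ports": [443]},
    "monitoring": {"alert_on_cpu_pct": 85},
}

def provision_compute(spec): print(f"provisioning {spec['instances']} instances")
def configure_network(spec): print(f"creating segment {spec['segment']}")
def enable_monitoring(spec): print(f"alerting above {spec['alert_on_cpu_pct']}% CPU")

def apply(spec):
    """Apply the declared state end to end - one pipeline, no hand-offs between silos."""
    provision_compute(spec["compute"])
    configure_network(spec["network"])
    enable_monitoring(spec["monitoring"])

apply(service_spec)
```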

2) Ability to work across silos

One of the defining capabilities of a cloud team – and a key skill required of all team members – is the ability to break the boundaries between silos:

  • Technical silos: for example the customer-facing team (Tenant Operations, also known as IT Service Center) will define end-to-end cloud services across technical silos such as compute (servers), networks and storage. Service Owners and Service Architects will define the scope and remit of such services; Service Developers will put together the workflows and scripts to allow end users to provision those services automatically.
  • Functional silos – merging “Design” and “Run”. Whilst traditional IT organizations tend to separate teams of architects/designers from operations teams, cloud development teams bring those skills together. Service Developers, for example, will build workflows that not only deploy infrastructure but also automate its monitoring and configuration management at runtime. Service Owners are involved in the definition of services but also act as points of contact in resolving incidents impacting those services. DevOps takes this trend to the next level by merging the “dev” and “ops” teams.

3) Increased alignment with the business

Whilst all IT organizations aim to align with the business, a model organization (as described in “Organizing for the Cloud”) aligns business lines with practical structures and roles. For example, this model defines dedicated roles such as:

  • Service Architects who translate business requirements into functional and technical architectures.

DevOps continues this trend towards business alignment: in a context where business is increasingly driven by revenue-generating applications, application development becomes integral to the lines of business.

DevOps Organization

In sum, a well-functioning cloud team will have established many of the positive traits needed for DevOps – a preference for rapid development over fire-fighting, for bridging silos across technologies and processes, and for close cooperation across business lines.

Going one step further, DevOps pushes these traits to the extreme – continually improving the development and automation of applications and infrastructure. For example, a DevOps team might leverage VMware’s Cloud-Native Apps capabilities to build applications optimized to run on cloud from “day one” (for more details see https://www.vmware.com/cloudnative/technologies).

Take-away – practical ways to prepare your cloud team for DevOps:

  • Encourage job rotation of key team members across technical skills and functions.
  • Continuously expand your team’s knowledge and practice of cloud automation tools. This can include advanced training on tools such as vRealize Automation and vRealize Operations, as well as generic skills in analysis and design.
  • Ensure that key tenant operations roles (i.e. customer facing roles) are in place and give them increasing exposure to application development and business lines.
  • Develop an awareness of the Agile approach, for example through formal training and/or by nominating ‘Champions’ in your team.
  • Build up a skill base in Continuous delivery, for example leveraging training or a pilot with vRealize Codestream.

—-
Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

5 Steps to Build a Security Strategy for the Digital Enterprise

From team bonding to micro-segmentation: a 5-step journey to develop a proactive security mindset in your cloud organization.

By Pierre Moncassin

Only a few years ago security was at best an afterthought for some cloud teams. Everyone in the team thought that security was someone else’s problem. For some less-fortunate organizations, this mindset did not change until a major security breach occurred, with resulting financial losses and reputational damage, not to mention job cuts. By that point security did become everyone’s problem – but that realization happened far too late.

To avoid this sort of less-than-optimal scenario, let me share here what I see as some key steps to develop a security mindset right at the core of your cloud organization.

First, why is security in the cloud specifically challenging?

IT security risks have of course existed well before the cloud era. However, cloud technologies have brought a new dimension to the risk. Reasons include:

  • Thanks to the unprecedented ease and speed of provisioning infrastructure, a new population of business users has become able to provision their own cloud infrastructure. They may not all be fully aware of the corporate IT security guidelines (or may not feel bound to follow them strictly).
  • Fast provisioning in the cloud has often led to a proliferation of “temporary” workloads – many of which are not rigorously controlled.
  • Data in the cloud can be stored anywhere. Users are usually not aware of where their data is located physically, or have no control over that location. Protecting confidential data therefore becomes an additional challenge. Some countries’ legislation, for example, mandates that confidential data about their nationals must remain within designated geographies.

Step 1: Build a broad awareness and knowledge base.

All cloud team members need to understand the basics of security for their cloud platform. That includes not only the enterprise security policy, but also a broad awareness of relevant laws (e.g. data protection) and compliance requirements (e.g. PCI, Sarbanes Oxley).  It also helps to build some basic awareness of common security breaches. In order to incentivise this learning, consider including security training in personal objectives (also known as ’MBO’); include security awareness in new hire onboarding and individual training plans.

Step 2: Break down technical silos

As I explained in one of my recent blogs, technical silos occur quite naturally as specialists organize themselves along groups of expertise (networks and servers, operating systems and hardware). However, entrenched silos can easily cause gaps in security coverage. This is because hackers are experts at finding the fault lines between silos – those tiny gaps from which they can launch an intrusion. They will look for the ‘weakest link’ wherever it might be found (e.g. a password that is too simple, an un-patched operating system, lax email security, a defective firewall – the list of risks is long).

Instead of relying on a silo mentality, the team needs to consider security end-to-end, and assume that breaches can occur in any layer of the infrastructure. In the same way as cloud services need to be designed end-to-end across silos, teams need to work together to manage security risks.

Step 3: Involve the business stakeholders

Setting up a cloud organization along VMware’s model involves building close working relationships with business stakeholders. Specific roles within VMware’s cloud organization model will be in place to liaise with the business (e.g. Service Owner, Customer Relationship Manager). And security is a key part of this cooperation. Some key aspects are:

  • Establish responsibilities clearly (e.g., who patches the workloads? who checks compliance?)
  • Document the responsibilities and expectations, e.g. within the service level agreements;
  • Ensure regular communications about security between business users and the cloud team (e.g. are there security-critical applications? Confidential data? What level of confidentiality?)

Step 4: Automate day-to-day security & compliance checks.

As part of operating a VMware cloud, the team will most likely be using tools such as VMware’s vRealize Automation and vRealize Operations Manager. These tools can be configured and leveraged to enhance some of your security and compliance procedures – adding much-needed automation to routine, day-to-day activities that otherwise consume effort and attention. Here are some examples of steps your teams can take to leverage these tools for security & compliance.

  • Ensure that provisioning blueprints are up-to-date with the latest security policy (e.g. patch levels).
  • Configure vRealize Operations Manager’s dashboards to display an aggregate view of compliance risk across your virtual infrastructure. For example, vRealize Operations Manager can be configured with extensions and third-party integrations that extend its analytical capabilities across a broad variety of sources, including VMware vCloud Air, VMware NSX, Amazon AWS and NetApp storage (for further details check out: http://www.vmware.com/files/pdf/vrealize/vmware-vrealize-operations-management-packs-wp-en.pdf).
  • Leverage vRealize Operations Manager’s ability to automate and report on compliance checks (the technical capabilities are described in more detail in this VMware blog: https://blogs.vmware.com/management/2015/03/compliance-in-vrealize-operations-6.html).
  • Leverage the potential of automated integration with your support desk. Once detected, compliance or risk issues must be acted upon. These events can be automatically associated with the creation of an incident ticket. I have outlined the potential of such integrations in an earlier blog: https://blogs.vmware.com/cloudops/2015/09/cloud-itsm-integration.html
  • From an organizational point of view, the goal is to automate as far as possible the bulk of routine compliance checks and security monitoring (a minimal sketch of such a routine check follows this list), so that the teams can focus on the ‘big picture’ and work pro-actively to identify emerging security threats.
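As a minimal illustration of what such routine automation could look like, here is a hypothetical Python sketch that flags workloads below a required patch baseline and raises a ticket. The workload data and the create_ticket() stub are invented for illustration; a real implementation would query the cloud management tools and the service desk’s own API instead.

```python
# Hypothetical sketch: routine compliance checks automated so the team can
# focus on the bigger picture. All data and function names are illustrative.
REQUIRED_PATCH_BASELINE = "2016-02"

workloads = [
    {"name": "web-01", "patch_baseline": "2016-02", "owner": "tenant-a"},
    {"name": "db-02",  "patch_baseline": "2015-11", "owner": "tenant-b"},
]

def create_ticket(summary: str, owner: str) -> None:
    # Stand-in for an integration with the service desk (e.g. a REST call).
    print(f"ticket raised for {owner}: {summary}")

def run_compliance_check(workloads) -> None:
    # YYYY-MM strings compare correctly in lexical order.
    for wl in workloads:
        if wl["patch_baseline"] < REQUIRED_PATCH_BASELINE:
            create_ticket(f"{wl['name']} below patch baseline "
                          f"{REQUIRED_PATCH_BASELINE}", wl["owner"])

run_compliance_check(workloads)
```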

Step 5: Shift the paradigm on network security with micro-segmentation.

Whilst the expression “paradigm shift” has been much over-used, it still fits perfectly to describe the evolution from traditional network security to micro-segmentation.

The traditional approach to securing a private cloud’s network is to set up strong security (firewalls) at the perimeter. This is the fortress model of security – highly protected boundaries (the perimeter) and a gate to control traffic at the entrance.

The downside is that all “fortresses” share a weakness by construction. To understand why, let’s consider the typical stages of a data breach:

  • Intrusion: attacker finds a breach in the perimeter
  • Lateral Movement: the intrusion is expanded for example, by compromising neighboring workloads or applications.
  • Exfiltration: potentially sensitive data is extracted from the compromised systems.
  • Cleanup/deletion: the intruder attempts to remove traces of the intrusion (deleting log files etc.).

In the event where an intruder manages to pass through the security gate, moving from room to room within the fortress becomes relatively easy. In IT terms, once a network’s perimeter is breached and a first workload is compromised, the intruder can often move “laterally” to compromise other workloads with little or no challenge, then locate potentially sensitive data to retrieve (‘exfiltrate’). There may be other lines of defense within the fortress (traditional network) – but these tend to be static, and once they are broken the same problem of “lateral mobility” occurs again.

Micro-segmentation allows fine-grained network security that can prevent not only the initial intrusion, but also challenge attempts at the other stages, i.e. lateral movement, exfiltration and cleanup. The reason is that each ‘room’ (or workload) can be isolated from the others. We could compare this new model to the layout of a submarine, where each section of the ship is partitioned by watertight doors. Each compartment (micro-segment) can contain an intrusion. The would-be intruder is just as challenged to move from one compartment to the next as they were to get past the entrance door in the first place.

However, micro-segmentation means more than fine-grained network isolation. It offers the possibility to tailor security policies down to the workload level, taking control over cloud security to a new level.

For example, network security rules can be associated with logical objects such as a workload. When the workload is moved from one network location to another, the security rules are maintained – they ‘follow’ the workload rather than being attached to a fixed network address.
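The sketch below is a conceptual illustration of that principle (not NSX syntax): security rules are derived from logical tags attached to the workload, so they remain identical when the workload’s network address changes. This is broadly analogous to how NSX security groups and tags behave, but every name here is invented for illustration.

```python
# Conceptual sketch of policy 'following' the workload: rules attach to
# logical tags on the workload, not to its current network address.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Workload:
    name: str
    ip: str                                          # changes when the workload moves
    tags: List[str] = field(default_factory=list)    # stable logical identity

POLICIES: Dict[str, List[str]] = {
    "pci-app": ["allow tcp/443 from web-tier", "deny all other inbound"],
}

def effective_rules(wl: Workload) -> List[str]:
    """Rules derive from tags, so they are unchanged if the IP changes."""
    rules: List[str] = []
    for tag in wl.tags:
        rules.extend(POLICIES.get(tag, []))
    return rules

wl = Workload("payments-01", ip="10.0.1.15", tags=["pci-app"])
print(effective_rules(wl))
wl.ip = "10.0.9.40"              # workload moved to another network segment
print(effective_rules(wl))       # same rules still apply
```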

Security Rules

Leveraging that potential requires a new mindset – shifting from a static security model to dynamic, fine-grained security. It also requires the cloud team to develop new skills: for example, to replace routine configuration work with automation, traditional network skills need to be complemented with design and programming skills.

Key take-aways:

  • Think of security as, in essence, teamwork. Encourage your team to coordinate security across silos – users, cloud engineers, security teams.
  • Leverage your automation tools such as VMware vRealize Automation and vRealize Operations Manager – they will help automate some of your security and compliance procedures.
  • Transform your team’s perspective on network security by leveraging micro-segmentation, moving from the traditional ‘fortress’ security model to a dynamic, fine-grained approach.

—-
Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

Staffing Your Cloud Organization – A Heuristic Model

Approximating staffing ratios in a cloud organization as a logarithmic function of infrastructure metrics.

By Pierre Moncassin

Customers who want to establish true cloud services based on VMware’s SDDC solution (or any other provider for that matter), realize that in order to fully leverage the technology, they need to adapt their IT organization.

More specifically, they need to setup a dedicated team – a cloud Center of Excellence (COE) to manage and operate their cloud services.

The structure and roles of that team are described in detail in ‘Organizing for the Cloud’.

During practically all Operations Transformation projects, a question frequently asked is: what is the optimum staffing level, in Full Time Equivalents (FTEs), to set up this cloud organization?

The standard consultant answer is of course  ‘it depends’. But in this blog, I will explain in more detail what “it depends” means in this context.

In an earlier blog, I described “10 key factors to estimate staffing ratios to operate platforms with vRealize Automation and vRealize Operations Manager”.

  • Number of lines of business
  • Number of data centers
  • Level of staff skill/experience
  • Number of cloud services
  • Workflow complexity
  • Internal process complexity (including support requirements, e.g. 5 days a week versus 24x7)
  • Number of third-party integrations
  • Rate of change
  • Number of VMs
  • Number of user dashboards/reports

Now these 10 factors, and probably hundreds of other factors, will determine the complexity of the tasks that the cloud organization needs to perform and therefore the staffing level. Clearly there are thousands of possible combinations of these factors. But if I want to see how the FTE count evolves with a single, easy-to-quantify parameter (such as the number of virtual machines or any other ‘simple’ infrastructure metric), we need to make strict assumptions to ‘tie down’ the other factors.

So let’s assume that we are looking at a single organization evolving over time; as time passes the number of virtual machines gradually increases, but so does the number and complexity of the services, as well as the demand for support coverage:

  1. Between 1 and 100 VMs, the COE is running as a pilot; there are no support requirements and only a small number of services to run.
  2. Between 100 and 1,000 VMs, the COE is running cloud services regionally with some basic service levels.
  3. Over, say, 30,000 VMs, the COE is now running a global operation with a 24/7 support requirement and a broad range of services.

Practical observation of a number of real-life examples suggests an evolution broadly similar to the logarithmic curve in figure 1. Now this is still a model that deliberately simplifies and ‘smooths out’ the FTE curve, but there are two practical implications:

  • The staffing levels may rise most steeply at the beginning of the curve. When the organization transitions from a pilot to a fully operating COE, staffing needs rise significantly.
  • The FTE curve flattens out as the organization matures and can handle high volumes. Once the COE is operating with a high level of automation and experienced staff, adding workload only requires a marginal increase in the FTE count.

In reality, of course, the complexity – i.e. the demand on FTEs – never grows quite so smoothly.

We would see threshold effects. For example, when we reach 300 workloads, a new 24x7 service may be added to the portfolio, which requires a rapid increase in FTEs.
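For readers who like to play with the numbers, here is an illustrative Python sketch of the heuristic: FTEs grow roughly with the logarithm of the VM count, with a step change when a new service level such as 24x7 support kicks in. The coefficients are invented for illustration only and are not benchmarks.

```python
# Illustrative sketch of the heuristic staffing model: logarithmic growth
# plus a threshold effect. Coefficients are invented, not benchmarks.
import math

def estimate_fte(num_vms: int, has_24x7: bool = False) -> float:
    base = 3.0                   # minimum viable COE (pilot stage)
    growth = 2.5                 # marginal staff per order of magnitude of VMs
    fte = base + growth * math.log10(max(num_vms, 1))
    if has_24x7:
        fte += 4.0               # threshold effect: shift cover for 24x7 support
    return round(fte, 1)

for vms in (50, 500, 5000, 30000):
    print(vms, estimate_fte(vms, has_24x7=(vms >= 5000)))
```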

Take-aways:

  • The fastest rise in FTEs will occur in the early stages of the build-up of cloud services; this is ‘normal’ given that the number of services and the service levels increase together, significantly raising the demands on the cloud organization.
  • Once well established and automated, the FTE level should only increase marginally with rising infrastructure volumes – your organization will have learned to cope with increasing quantities.
  • We need to caveat that although the FTE curve may look broadly logarithmic, threshold effects are inevitable: new demands on service level (e.g. new compliance requirements, 24x7 support, etc.) can create an ‘uptick’ in FTEs without necessarily a prior ‘uptick’ in volumes.

What we have presented here is an intuitive model to understand how increasing volumes impact FTEs. You are welcome to share your experience and perhaps refine this heuristic model.

—-
Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in the UK.

5 Steps to Building Your High-Performance Cloud Organization

Make sure learning happens by design; not just trial and error.

By Pierre Moncassin

One of the often-overlooked aspects of building an effective cloud organization lies in the training and development of team members. My customers often ask, “How do I accelerate my IT organization’s transition to cloud?” Well, my answer involves much more than deploying toolsets.

What the IT organization needs is accelerated learning—learning at organizational level as well as individual. All too often, that learning happens in part by accident.  An enthusiastic project team installs the technology first, sometimes as a pilot. The technology works wonders and produces great initial results, e.g., IT services can be provisioned and managed with levels of speed and efficiency that were simply not possible before. Then sometimes, the overall project just stalls. Not because of a technical shortfall. The reason is that the organization has not completely figured out how to fully leverage that technology, and more importantly, how to fit it in with the rest of the IT organization. This is a shortfall of learning.

Faced with the challenge of learning to leverage the technology, many organizations fall back on the tried and tested approach known as “learning on the job.”  After all, this is an approach that has worked for centuries! But in the fast-paced cloud era, you want to accelerate the learning process. Really, you want learning by design not just by trial and error. So, where do you start?

Here are some practical lessons that I have collected by supporting successful projects with customers and within VMware:

1. Design a plan for the organization.
It stands to reason that the future organization will be different from the current, “pre-cloud” organization. However, the optimal structure will not be reached without planning. In practice we want to gradually flesh out your tenant operations and infrastructure operations teams, as described in more detail in the white paper: Organizing for the Cloud.

In turn, this means orchestrating the transition from the current roles into the target organization. Each transitioned role will require a skills development plan adapted to the individual.

2.    Plan for formal skills development.
The first step in planning skills development is to carry out a gap analysis of each selected team member against their future role (e.g., service owner, service architect, and so forth). Each role carries specific requirements in terms of technical skills—without delving into all the details, a blueprint manager will need deeper knowledge of VMware vRealize Automation than a customer relationship manager; however, the customer relationship manager will need some awareness of the blueprints and how they can be leveraged to meet customer requirements effectively.

3. Reinforce learning with mentoring and coaching.
Mentoring and coaching are effective ways to reinforce the individual’s own learning. Typically mentoring will focus on knowledge transfer based on personal experience. For example, encourage sharing of experience by pairing up the new service architect with an experienced service architect (either in another part of the organization—if existing—or from another organization).

Coaching will focus on individual skill development—either by learning directly from the coach, or from the coach supporting an individual’s own learning journey.

Although coaching/mentoring is by definition highly personalized  (learner centric), it is a good idea to establish a formal structure around it. For example, assign coaches/mentors to all future cloud team members, with a mechanism to track activity and results.

4.    Develop leaders with both business and technical skills.
As when building any team, it is important to identify and nurture a cadre of leaders for the cloud organization. These leaders will hold not only the formal leadership roles (tenant operations leader, infrastructure operations leader), but also critical roles such as service owner and service architect.

Such leaders will hold a key role in representing the cloud organization within the broader business.  Part of their development will include broadening their understanding of the business. For example, by assigning them mentors within the lines of business—this is another example where mentoring comes in handy.

However business acumen, whilst important, is not enough. These roles also need to develop broad technical skills to be able to articulate solutions across technical silos and understand the new capabilities introduced by cloud automation.

5.    Reach out to the broader organization with a champions community.
Champions, a.k.a. change agents, are advocates within the rest of the organization (especially within the lines of business) who will spread awareness of, and support for, the cloud. These champions help bridge the silos with business users and win “hearts and minds.” Refer to my earlier blog where I explained how we leverage a change agent program within VMware and the lessons that can be drawn from it. Your change agents will make sure that the broader organization/business learns about the cloud project and ultimately adopts it.

Takeaways:

  • Plan the transition and learning curve both for your organization and the individuals.
  • Combine formal learning with individual-centric learning (coaching and mentoring).
  • Invest effort in developing, at an early stage, the future leaders and champions for cloud adoption. Make sure that their planned learning spans both technical and business knowledge.

==========
Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is currently on long-term assignment in Asia-Pacific. Follow @VMwareCloudOps on Twitter for future updates.

7 Tips on Leveraging a Change Agent Program to Boost Cloud Adoption

By Pierre Moncassin

In nearly every other discussion I have with customers about cloud adoption, I hear mention of their challenge with “mindset change.” That challenge is often faced on both sides of the consumer/provider equation, as users (IT consumers) and operators (internal providers) need to change their approaches in order to define, operate, and consume cloud services efficiently.

These same organizations are fully aware that tackling mindset change is essential. The question is: How to go about it with often (if not always) restricted resources and funds?

Changing mindsets for cloud adoption takes more than technical training
One option is to invest in a formal change management consultancy project. Whilst such programs certainly deliver value (and many first-tier consultancies offer such services), they also require a substantial investment both in terms of expenditure and bandwidth of your internal resources.

The next option (and often the default option) boils down to education—typically a mix of functional training for users, and technical training for operators. Without a doubt, training brings valuable knowledge; however it does not always lead to changes in behavior.

Here’s where I can provide you with some useful guidance with lessons from a “change agent” program I was involved with at VMware. This program was designed to build internal awareness and disseminate expertise within a fast-developing global practice. Each of these principles below can be generalized to your broader cloud adoption initiative:

  1. Recruit early enthusiasts—preferably volunteers who want to be ahead of the curve.
  2. Make it personal—recognize the individuals whose contributions are making an impact. Encourage participants to share information, and to network with their counterparts in other locations.
  3. Mix structured, semi-structured, and informal communication—formal (meetings, webinars), semi-structured (brainstorming), and informal (social events, ad hoc discussions).
  4. Make the most of social media—great for facilitating free-flowing communication across dispersed team members.
  5. Work with existing structures and processes—no need to re-invent the wheel—our “change agents” are encouraged to use existing internal training programs—often they will be early adopters and provide valuable feedback on how to improve the training so others will benefit.
  6. Train the trainer—or more accurately, train the evangelist. Each individual is encouraged in turn to evangelize within their own team.
  7. Recruit across a diverse range of experience, seniority, and skills—the more diverse the participants, the broader the adoption and reach of the program across the user base. Also the varied experience brings valuable knowledge and feedback into the program.

Results
Within eight months, this unique program has helped VMware’s practice develop a community of more than 100 change agents in over 20 countries! Change agents have contributed to shaping and refining the structured training programs in place, and continue to be actively involved in curriculum development.

Whether you are currently struggling with cloud adoption issues or anticipating them with a future cloud initiative, I encourage you to try a program such as the one I’ve described above, and begin to apply these principles. I’d be interested in hearing about your experiences.

===
Pierre Moncassin is an Operations Architect with the VMware Operations Transformation global practice and is currently on long-term assignment in Asia-Pacific. Follow @VMwareCloudOps on Twitter for future updates.

How to Take Charge of Incident Ticket Ping Pong

By Pierre Moncassin

When incident tickets are repeatedly passed from one support team to another, I like to describe it as a “ping pong” situation. Most often this is not due to a lack of accountability or skills within individual teams. Each team genuinely fails to see the incident as relevant to their technical silo. They each feel perfectly legitimate in either assigning the ticket to another team, or even assigning it back to the team they took it from.

And the ping pong game continues.

Unfortunately for the end user, the incident is not resolved whilst the reassignments continue. The situation can easily escalate into SLA breaches, financial penalties, and certainly disgruntled end users.

How can you prevent such situations? IT service management (ITSM) has been around for a long while, and there are known mitigations to handle these situations. Good ITSM practice would dictate some type of built-in mechanisms to prevent incidents being passed back and forth. For example:

  • Define end-to-end SLAs for incident resolution (not just KPIs for each resolution team), and make each team aware of these SLAs.
  • Configure the service desk tool to escalate automatically (and issue alerts) after a number of reassignments, so that management becomes quickly aware of the situation (a minimal sketch of this escalation logic follows the list).
  • Include cross-functional resolution teams as part of the resolution process (as is often done for major incident situations).
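As a minimal sketch of the second mitigation above, the hypothetical Python fragment below counts reassignments on a ticket and escalates once a threshold is crossed. The class and field names are illustrative and not taken from any particular service desk product.

```python
# Hypothetical sketch: escalate automatically once a ticket has been
# reassigned too many times. All names are illustrative.
from dataclasses import dataclass, field
from typing import List

MAX_REASSIGNMENTS = 3

@dataclass
class Ticket:
    ticket_id: str
    assigned_team: str
    history: List[str] = field(default_factory=list)
    escalated: bool = False

def reassign(ticket: Ticket, new_team: str) -> None:
    ticket.history.append(ticket.assigned_team)
    ticket.assigned_team = new_team
    if len(ticket.history) >= MAX_REASSIGNMENTS and not ticket.escalated:
        ticket.escalated = True
        print(f"{ticket.ticket_id}: escalated to cross-functional team "
              f"after {len(ticket.history)} reassignments")

t = Ticket("INC-1042", assigned_team="network")
for team in ("storage", "network", "middleware"):
    reassign(t, team)
```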

In my opinion there is a drawback to these approaches—they take time and effort to put in place; incidents may still fall through the cracks. But with a cloud management platform like VMware vRealize Suite, you can take prevention to another level.

A core reason for ping pong situations often lies in the team’s inability to pinpoint the root cause of the incident. VMware vRealize Operations Manager (formerly known as vCenter Operations Manager) provides increased visibility into the root cause, through root cause analysis capabilities. Going one step further, vRealize Operations Manager gives advance warning on impending incidents—thanks to its analytical capabilities. In the most efficient scenario, support teams are warned of the impending incident and its cause, well ahead of the incident being raised. Most of the time, the incident ping pong game should never start.

Takeaways:

  • Build a solid foundation with the classic ITSM approaches based on SLAs and assignment rules.
  • Leverage proactive resolution, and take advantage of enhanced root cause analysis that vRealize Operations Manager offers via automation to reduce time wasted on incident resolution.


Pierre Moncassin is an operations architect with the VMware Operations Transformation global practice and is based in Taipei. Follow @VMwareCloudOps on Twitter for future updates.


A New Angle on the Classic Challenge of Retained IT

By Pierre Moncassin

When discussing the organization models for managing cloud infrastructure with customers, I have come across situations where some, if not all, infrastructure services are outsourced to a third party. In these situations my customers often ask – does your (VMware) operating model still apply? Should I retain cloud-related skills in-house? If so, which ones?

The short answer is: Yes. The advice I give my customers is that their IT organization should establish a core organization modeled on the “tenant operations” team as defined in Organizing for the Cloud, a VMware white paper by my colleague Kevin Lees.

Let’s assume a relatively simple scenario where a single outsourcer is providing “standard” infrastructure services — such as computing, storage and backups. In this scenario, the outsourcer has agreed to transform at least some of its services towards the software-defined data center (SDDC), which is by no means an easy step (I will return to that point later).

For now let’s also assume a cooperative situation where customer and outsourcer are collaboratively working towards a cloud model. The question is — what skills and functions should the customer retain in-house? Which skills can be handed over to the outsourcer?

The question is a classic one. In traditional infrastructure outsourcing, we would talk about a “retained IT” organization.  For the SDDC environment, here are some skill groups that I believe have to be preserved within the core, in-house team:

  • Service Design and Self-service Provisioning is clearly a skillset to keep in-house. The in-house team must be able to work with the business to define services end-to-end, but the team should also be able to grasp accurately the possibilities that automation offers with software such as VMware vCloud Automation Center.  Though I am not suggesting that the core team needs to be expert in all aspects of workflows, APIs or scripting, they do need a solid grasp of the possibilities of automation.
  • Process Automation and Optimization.  A solid working knowledge of automation software is useful but not enough.  The in-house teams are required to decide which processes to automate and how. They need to make business-level decisions. Which processes are worth automating? What is the benefit of automation versus its cost?
  • Security and Compliance is often a top priority for cloud adopters. The cloud-based services need to align with enterprise policies and standards.  The retained IT function must be able to demonstrate compliance and where needed, enforce those standards in the cloud infrastructure.
  • Service Level Management and Trend Analysis. Whilst the retained IT organization does not need to be involved in the day-to-day monitoring and troubleshooting, they need to be able to monitor key service levels. Specifically, the business users will be highly sensitive to the performance of some business-critical applications. The retained IT organization will need to keep enough knowledge of these applications and of performance monitoring tools to ensure that application performance is measured adequately.
  • Application Life Cycle (DevOps). We have assumed in our scenario an infrastructure-only outsourcing — the skills for application development remaining in-house.  In the SDDC environment, the tenant operations team will work closely with the application development teams. Amongst other skills, the retained IT will need detailed knowledge not only of application provisioning, but also the architectures, configuration dependencies, and patching policies required to maintain those applications.

I have reviewed the skill groups that need to be retained. As more automation is used, there will be less reliance on skills that relate to routine tasks and trouble-shooting. Skills that can typically be outsourced include:

  • Routine scripting and monitoring
  • System (middleware) configuration
  • Routine network administration

The diagram below is a (very simplified) summary of the evolution from traditional retained IT to tenant operations for SDDC environments.

It is also worth noting that the transformation from traditional infrastructure outsourcing to SDDC is a far from obvious step from the point of view of an outsourcer. Why should the outsourcer invest time and cost to streamline services, if the end customer has already contracted to pay for the full cost of service? Gaining buy-in from the outsourcer to transform its model can be a significant challenge. Therefore it is prudent to gain acceptance either:
–  early in the contract negotiations, so that the provider can build in a cloud delivery model in its service offering,
– or towards the end of a contract when the outsourcer is often highly motivated to obtain a renewal.

Finally outsourcers may initiate their own technology refresh programs, which can create a win-win situation when both sides are prepared to invest in modernization towards SDDC.

3 Key Take-Aways

  1. Organizations that undertake their journey to SDDC with an outsourcer are advised to establish a core SDDC  organization including most tenant operations skills; a key focus is to leverage automation (whilst routine, repetitive tasks can be outsourced).
  2. The exact profile of the tenant operations (retained IT) will depend on the scope of the outsourcing contract.
  3. Early contract negotiations, renewals, or technology refresh can create opportunities to encourage an outsourcer to move towards the SDDC model.

———
Pierre Moncassin is an operations architect with VMware’s Global Operations Transformation Practice and is based in the UK. Follow @VMwareCloudOps on Twitter for future updates.

New Technical Roles Emerge for the Cloud Era: The Rise of the Cross-Domain Expert

By Pierre Moncassin

Several times over the last year, I have heard this observation: “It is all well and good to introduce new cloud management tools — but we need to change the IT roles to take advantage of these tools. This is our challenge.” As more and more of the clients I work with prepare their transition to a private cloud model, they increasingly acknowledge that traditional IT specialist roles need to evolve.

We do not want to lose the traditional skills — from networking to storage to operating systems — but we need to use them in a different way. Let me explain why this evolution is necessary and how it can be facilitated.

Emergence of Multi-Disciplinary Roles
In the traditional, pre-cloud IT world, specialists tended to carve a niche in their specific silos: they were operating systems specialists, network administrators, monitoring analysts, and so on. There was often little incentive to be concerned about competencies too far beyond one’s silo. After all, it was in-depth, vertical expertise that led to professional recognition — even more so when fast troubleshooting was involved (popularly known as “firefighting”). With a brilliant display of troubleshooting, the expert could become the hero of the day.

In the same silo model, business-level issues tended to be handled far away from the technologists. The technology specialists were rarely involved in such questions as billing for IT usage or defining service levels — an operations manager or service manager would worry about those things.

Whilst this silo model had its drawbacks, it still worked well enough in traditional, pre-cloud IT organizations — where IT services tended to be stable and changes were infrequent. But it does not work in a cloud environment, because the cloud approach requires end-to-end services — defined and delivered to the business.

Cloud consumers do not simply request network or storage services; they expect an end-to-end service across all the traditional silos. If an application does not respond, end users do not care whether the cause lies within networks or middleware: they expect a resolution of their service issue within target service levels.

Staffing the Cloud Center of Excellence
To design and manage such cloud-based services, the cloud center of excellence (COE) requires broader roles than the traditional silos. We need architects and analysts who can comprehend all aspects of a service end-to-end. They will have expertise in each traditional silo, but just as importantly, the ability to architect and manage services that span across each of those silos. I call these roles “cross-domain experts,” because they possess both the vertical (traditional silo) and horizontal (cross-silo) expertise, including a solid understanding of the business aspects of services.

Cross-domain competencies are essential to bring a cross-disciplinary perspective to cloud services. These experts bring a broad spectrum of skills and understand the ins and outs of cloud services across network, server, and storage — as well as having a solid grasp of multiple automation tools. Beyond the technical aspects, they are also able to focus on the business impact of the services.

Cross-domain experts also need to cross the bridge between the traditionally separate silos of  “design” and “build.” Whilst in the traditional IT model the design/development activities could be largely separated from the build requirements, a service-for-the-cloud model needs to be designed with build considerations up front.

Every team member in the COE needs to possess an interdisciplinary quality. If we look more specifically at the organization model defined in the white paper Organizing for the Cloud, after the leaders, these hybrid roles are foremost to be found in the following categories:

  • In the tenant operations team, the key hybrid roles are service architect and service analyst.
  • In the infrastructure operations team, the architect is a key hybrid role.

Takeaways

  • To build a successful cloud COE, develop multi-disciplinary roles with broad skills across traditional silos (such as networks, servers, and middleware). Break down the traditional barriers between design and build.
  • Foster both formal training and practical experience across domains.
  • Organize training in both automation and management tools.

——-
Pierre Moncassin is an operations architect with VMware Operations Transformation Services and is based in France. Follow @VMwareCloudOps on Twitter for future updates, and join the conversation by using the #CloudOps and #SDDC hashtags on Twitter.