Conway’s Law teaches us that how an organization, group or team is structured has a great impact on the outcomes the organization produces. If you split a team responsible for a product into four sub-teams, the final product will be made up of four distinct components or modules. It is important, therefore, to organize IT the right way to get the full benefits that DevOps promises: increased flow, fast feedback cycles, and a culture of learning and experimentation.
How IT Is Often Structured Today
While there are a few common ways to structure IT today, I will focus on a typical split between architecture, infrastructure and software. Another common structure is “Plan-Build-Run”, and the arguments I make here apply to it in much the same way.
See the structure depicted on the right – there are a few Shared Functions, such as Project Management or Process Management. There is a “Plan” function, called “Solutions and Architecture”, and a split between Applications, Infrastructure and Monitoring and Tooling. In such organizations, we often find separation of duties, e.g. application developers are typically not allowed to deploy the applications into the production environment. This is why “Application Provisioning” is part of the “Monitoring & Tooling” group. Security is often not even formally part of IT, but instead is organized in a separate function, reporting to a Chief Information Security Officer (CISO).
So, what are the problems with such organizations? Here are a few:
- Responsibility for applications is spread across four different functions: Applications Architecture, Applications, Monitoring and Tooling, and Release Management. Each of these four reports to a different manager.
- Applications are dependent on additional stakeholders, like Enterprise Architecture and Infrastructure, again not part of the same group or reporting chain.
- Security, as noted above, sits outside IT in a separate function under the CISO, which adds yet another reporting chain.
Organizations like these encapsulate different portions of the service lifecycle in functions, creating silos and handovers that impede fast, high-quality application delivery.
How can we improve? The basic idea is to implement small teams that are each responsible for a (micro-)service across its entire life cycle. Think of a (micro-)service as a component, an application, or even a module of a larger application. A service should do one thing, and do that one thing well. If your larger service is hosting a website, multiple micro-services are involved. At the very bottom is a service that provides the bare metal to run compute, network and storage on. The web server would be its own micro-service, running on a VM or in a container. It probably depends on a micro-service running a content management system for the website, on a service providing database functionality, and so on.
The services will define interfaces, and as long as these interfaces remain unchanged, innovation can happen “inside the service” without disturbing the larger architecture. This principle is called “loose coupling”, and it is an important principle in architecting and developing applications and services today. Because of Conway’s Law, organizational structures should be built to amplify the idea of loose coupling.
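To make the idea of innovating “inside the service” concrete, here is a minimal Python sketch. All names in it (`ContentStore`, `FileBackedStore`, `CachedStore`, `render`) are hypothetical and chosen only for illustration. It assumes a content service that exposes a single `get_page` operation to its consumers.

```python
from typing import Protocol


class ContentStore(Protocol):
    """Hypothetical interface the content service exposes to consumers."""

    def get_page(self, slug: str) -> str: ...


class FileBackedStore:
    """First implementation: pages held in a simple in-memory mapping."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def get_page(self, slug: str) -> str:
        return self._pages[slug]


class CachedStore:
    """Later implementation: same interface, with a cache added internally."""

    def __init__(self, backend: ContentStore) -> None:
        self._backend = backend
        self._cache: dict[str, str] = {}
        self.backend_calls = 0  # counts how often the backend is consulted

    def get_page(self, slug: str) -> str:
        if slug not in self._cache:
            self.backend_calls += 1
            self._cache[slug] = self._backend.get_page(slug)
        return self._cache[slug]


def render(store: ContentStore, slug: str) -> str:
    """Consumer code (say, the web server team) sees only the interface."""
    return f"<html><body>{store.get_page(slug)}</body></html>"
```

The team owning the content service can move from `FileBackedStore` to `CachedStore` without the consuming team changing `render`, because the `get_page` interface stays the same. That is the property the organizational structure is meant to mirror.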
The structure on the right reflects such a loose coupling. There still are functions that are shared across all of IT, such as Enterprise Architecture, Process Management or Project Management. However, the (micro-)services are now encapsulated in separate teams, each team being responsible for its own micro-service. Note that this diagram still does not include a separate security function. I will come back to that later. Let’s take a look at a few select functions in more detail.
The purpose of the Enterprise Architecture (EA) function is to develop IT strategies that ensure business and IT alignment. A lot of my customers claim they have an EA function when, in fact, they have solution architects who worry about the specific technical architecture of a service, instead of the overall architecture and strategy of IT. EAs are responsible for building reference architectures, leading the development of the overall IT roadmap, and coordinating the architecture review process. They are also expected to identify upcoming business and technology trends, analyze the potential impact of those trends on the business, and propose required changes to the strategies they maintain.
Cloud Infrastructure Services
Cloud Infrastructure Services (CIS) is responsible for providing multi-cloud, multi-vendor infinite infrastructure in an API-driven, easily consumable manner. In a sense, CIS is the one-stop shop for Infrastructure and Platform as a Service (IaaS / PaaS). This function has two sub-groups: Cloud Infrastructure Engineering and Cloud Infrastructure Operations.
Cloud Infrastructure Engineering acts as the service team responsible for the IaaS and PaaS services. As such, it develops architectures based on the EAs’ global strategy, maintains the cloud management platform, and builds and maintains CI/CD pipelines for the IaaS and PaaS components.
Cloud Infrastructure Operations performs the daily activities needed to manage the physical infrastructure, such as racking and stacking hardware and maintaining data center systems like power and cooling. It is also responsible for the physical security of the data center.
Cloud Services consists of all the micro-service teams that are responsible for the services IT delivers. Each team owns its service end to end, mostly at the application level, including architecture, design, development, testing, deployment and ongoing operation. The cloud services are built on the platform(s) provided by Cloud Infrastructure Services. Cloud Service teams are expected to maintain a service backlog together with product owners representing the customers of the service. The backlog must include functional and non-functional requirements, including requirements related to security, reliability, availability, continuity, performance and capacity.
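One way a service team might keep itself honest about the backlog is to check which non-functional categories have no items yet. The following Python sketch is only an illustration under stated assumptions: the class names, the category identifiers and the sample backlog items are hypothetical, and a real team would more likely do this in its backlog tool.

```python
from dataclasses import dataclass, field

# Non-functional categories named in the text; the identifiers are illustrative.
REQUIRED_CATEGORIES = {
    "security", "reliability", "availability",
    "continuity", "performance", "capacity",
}


@dataclass
class BacklogItem:
    title: str
    category: str  # "functional" or one of REQUIRED_CATEGORIES


@dataclass
class ServiceBacklog:
    service: str
    items: list[BacklogItem] = field(default_factory=list)

    def missing_categories(self) -> set[str]:
        """Return the non-functional categories with no backlog item yet."""
        covered = {item.category for item in self.items}
        return REQUIRED_CATEGORIES - covered


# Hypothetical backlog for a web front-end service.
backlog = ServiceBacklog("web-frontend", [
    BacklogItem("Render product pages", "functional"),
    BacklogItem("Enforce TLS on all endpoints", "security"),
    BacklogItem("Define failover runbook", "continuity"),
])
print(sorted(backlog.missing_categories()))
# prints ['availability', 'capacity', 'performance', 'reliability']
```

The point of the sketch is that non-functional requirements sit in the same backlog as features, so gaps are visible to the team and the product owner before the service is built.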
The main purpose of the Service Consulting function is to provide guidance and resources on application re-platforming, refactoring, and operations to Cloud Infrastructure Services and Cloud Service teams. The guidance must also include assistance with changing the team culture, if needed, to make it more collaborative.
Service Consulting can assist other service teams with reliability and availability issues, identify the largest sources of stress within services, and help identify potential emergencies waiting to happen, such as single points of failure. They provide knowledge transfer related to cultural and technical issues, such as teaching teams to conduct blameless post-mortems, establish service level objectives, or define automation that can or should be implemented.
What About Security?
As mentioned above, security requirements are expected to be covered by each of the micro-service teams. This does not mean that there will not be a security organization, or that the CISO position will no longer exist. The CISO remains a very important role, as the CISO is expected to set security strategy and policy for the entire organization. But each service team is expected to comply with all of these security policies and, in essence, to build security in from the ground up when defining, designing, building, deploying and operating its service.
The benefits of structuring IT as described in this blog post come in three main areas. First, the organizational structure enforces responsibility for end-to-end services. The emphasis on services focuses the organization and the teams on delivering the outcomes customers need, instead of just application functionality. Because service teams are responsible for their services across the entire life cycle, there are no handovers or “throw over the wall” points, which reduces the possibility for things to go wrong.
Second, because service teams cover all requirements for a service, things like security, reliability or availability are no longer bolted on after the service has been built and is already being tested. These requirements are represented on the service backlog, and the service teams make sure the necessary functionality is integrated into the service.
Last, but not least, this structure fosters a DevOps / Agile approach to service development and management. The teams represent loosely coupled services, with the ability to innovate “inside the service” as much as possible, while not disturbing the interface. This will lead to a “learn fast, learn often” mentality, enabling ever quicker innovation cycles.
As a Transformation Architect, Kai supports the VMware Advisory Transformation Services (ATS) team. Kai helps his clients strategize and transform their IT organization into a services-focused organization. Through client assessment of people, process, and technology, Kai and the ATS team develop roadmaps and enhance the processes and procedures required to transform a client’s environment into a Software Defined Datacenter (SDDC).