By Bharath Siravara, director of R&D, Cloud-Native Apps Business Unit, VMware

The VMware Photon Platform is a new infrastructure stack optimized for cloud-native applications. It consists of Photon Machine and the Photon Controller, a distributed, API-driven, multi-tenant control plane that is designed for extremely high scale and churn.

When we unveiled the platform at VMworld in August, we also announced that we would be open sourcing the Photon Controller so we could engage directly with developers, customers and partners. Today we are making good on that promise. If you are a developer interested in forking and building the code, go to the GitHub page. If you would rather use the software to see what it’s all about, follow our Getting Started Guide to bring up a full system, even on your laptop.

Photon Controller was itself built as a distributed, highly scalable fabric and embodies the key characteristics of a cloud-native application. Its technical architecture is shown in the figure below.

[Figure: PC Architecture]

Photon Controller implements a novel distributed scheduler. It is a hierarchy of scheduler service nodes with each node in the tree only having awareness of its direct children. Schedulers bubble essential stats on their load/utilization up to their parents, and parents route requests down through the scheduler tree to resolve placement requirements. This way the schedulers avoid heavy load on a single metrics/configurations database.
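To make the routing idea concrete, here is a minimal sketch in Python. It is not Photon Controller code: the class names, the single "free memory" metric and the "most headroom first" policy are all assumptions chosen for illustration, and the stat is computed on demand here rather than pushed upward periodically.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Host:
    """Leaf of the tree: a physical host. free_mb is a hypothetical metric."""
    name: str
    free_mb: int

    def capacity(self) -> int:
        return self.free_mb

    def place(self, request_mb: int) -> Optional[str]:
        # Accept the request only if it fits, and account for it.
        if request_mb > self.free_mb:
            return None
        self.free_mb -= request_mb
        return self.name


@dataclass
class Scheduler:
    """Interior node: it only ever talks to its direct children."""
    name: str
    children: List = field(default_factory=list)

    def capacity(self) -> int:
        # The summary a parent sees: the largest request any child could take.
        return max((c.capacity() for c in self.children), default=0)

    def place(self, request_mb: int) -> Optional[str]:
        # Route the request down, trying the child with the most headroom first.
        ranked = sorted(self.children, key=lambda c: c.capacity(), reverse=True)
        for child in ranked:
            placed = child.place(request_mb)
            if placed is not None:
                return placed
        return None


def build_example_tree() -> Scheduler:
    """A two-level tree with made-up host names and sizes."""
    return Scheduler("root", [
        Scheduler("rack-a", [Host("host-1", 4096), Host("host-2", 1024)]),
        Scheduler("rack-b", [Host("host-3", 8192)]),
    ])
```

Calling `build_example_tree().place(6000)` routes through "rack-b" to "host-3" without the root ever inspecting an individual host, which is the property that keeps load off any central metrics store.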

Photon Controller has a number of loosely coupled components or services. They are managed via ‘distributed coordination’, whereby the service endpoints are registered in Apache ZooKeeper. Using ZooKeeper, a service can be fully scale-out (active/active), have standby servers (active/passive), or have clearly partitioned work (e.g. the scheduler).
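The three modes can be sketched against a toy registry. The class below is an in-memory stand-in, not the ZooKeeper API, and the method names, the "oldest registration leads" rule and the hash-based partitioning are assumptions made for illustration only.

```python
import zlib
from collections import defaultdict


class Registry:
    """In-memory stand-in for a coordination service (hypothetical API)."""

    def __init__(self):
        # service name -> endpoints, kept in registration order
        self._endpoints = defaultdict(list)

    def register(self, service, endpoint):
        self._endpoints[service].append(endpoint)

    def unregister(self, service, endpoint):
        self._endpoints[service].remove(endpoint)

    def all_active(self, service):
        """Active/active: every registered endpoint serves requests."""
        return list(self._endpoints[service])

    def leader(self, service):
        """Active/passive: the oldest registration serves, the rest stand by."""
        endpoints = self._endpoints[service]
        return endpoints[0] if endpoints else None

    def owner(self, service, key):
        """Partitioned work: each key maps to exactly one endpoint."""
        endpoints = sorted(self._endpoints[service])
        if not endpoints:
            return None
        # crc32 is used because it is stable across processes.
        return endpoints[zlib.crc32(key.encode("utf-8")) % len(endpoints)]
```

If the current leader is unregistered, `leader()` simply returns the next endpoint in line, which is the failover behavior a standby arrangement relies on.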

External APIs are provided as a REST/JSON interface built on the Dropwizard framework (plus Swagger for self-documentation). The API is exposed by a set of horizontally scaled-out API servers that share a persistent database (the CloudStore), which holds the state. A load balancer (such as HAProxy) fronts the API servers.

The CloudStore, like most other services, is implemented using a brand-new framework that we are also open sourcing today, called Project Xenon. CloudStore acts as the single source of truth for all objects (containers, clusters, VMs, disks, networks, physical hosts, etc.) that are managed by Photon Controller. To meet the scalability and availability requirements, all services, and the CloudStore in particular, need to be scalable and durable. Xenon gives us a multi-version replicated document store built around Lucene, which lets us build highly scalable components as collections of microservices. All services are written in Java, most of them using Xenon.
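For readers unfamiliar with the term, "multi-version" means that an update adds a new version of a document instead of overwriting the old one. The sketch below illustrates only that idea in Python. It is not Xenon's API, it leaves out replication and indexing entirely, and the `expected_version` check is an assumed convention for detecting conflicting writers.

```python
class VersionedStore:
    """Toy multi-version document store (hypothetical, single process)."""

    def __init__(self):
        # document id -> list of bodies; the list index is the version number
        self._history = {}

    def put(self, doc_id, body, expected_version=None):
        """Append a new version and return its number.

        If expected_version is given, the write is rejected unless it matches
        the latest stored version (-1 means "document does not exist yet").
        """
        history = self._history.setdefault(doc_id, [])
        latest = len(history) - 1
        if expected_version is not None and expected_version != latest:
            raise ValueError(
                f"version conflict on {doc_id}: expected {expected_version}, "
                f"latest is {latest}"
            )
        history.append(dict(body))  # copy so later caller mutations do not leak in
        return len(history) - 1

    def get(self, doc_id, version=None):
        """Return a copy of the requested version, or of the latest one."""
        history = self._history[doc_id]
        index = len(history) - 1 if version is None else version
        return dict(history[index])
```

Because old versions are retained, a reader can still ask for version 0 of a VM document after its state has moved on, and two writers racing on the same document can detect the conflict instead of silently overwriting each other.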

Each physical host managed by Photon Controller runs an agent that provides an RPC interface, implemented using Thrift, which all other components use to communicate with the host. The agent is designed to be hypervisor agnostic, but the only implementation is for ESX; it is written in Python and talks to ESX via public APIs.

There are a number of special components that oversee the health of other components. For example, the ‘Chairman’ is responsible for the schedulers’ health and tree topology.

The system is intended to be self-healing, and to this end, the ‘housekeeper’ component is responsible for longer-running cleanup operations. These aren’t (typically) initiated by an external API request, but instead by internal components observing an unexpected situation that needs auto-resolution (if possible) rather than just logging an error.
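One way to picture this kind of cleanup is as a reconciliation pass that compares what the source of truth records with what is actually observed, and turns each discrepancy into a repair action. The Python sketch below is an illustration under that assumption, not the housekeeper's actual logic; the function name, the action names and the idea of comparing VM ids are all hypothetical.

```python
def plan_cleanup(recorded_ids, observed_ids):
    """Return a sorted list of (action, id) pairs for every discrepancy.

    recorded_ids: ids the source of truth believes exist
    observed_ids: ids actually found on the hosts
    """
    recorded = set(recorded_ids)
    observed = set(observed_ids)
    actions = []
    # Present on a host but unknown to the store: a leftover to remove.
    for orphan in observed - recorded:
        actions.append(("delete_orphan", orphan))
    # Known to the store but missing on the hosts: a record to flag.
    for missing in recorded - observed:
        actions.append(("mark_missing", missing))
    return sorted(actions)
```

Returning a plan instead of acting immediately keeps the sketch easy to test and leaves room for the "if possible" caveat: a caller can still decide that a given action cannot be resolved automatically and log it instead.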

Once you have had a chance to use Photon Controller, my team would love to get your feedback in our Google Group!