
Diego Phase 1: Staging in a multiplatform world

The first phase of Diego’s development has focused on offloading the staging workflow (the task of converting uploaded app bits into compiled droplets ready to run in Cloud Foundry) from the existing DEAs to Diego. From the outset, one of Diego’s mandates has been to make it relatively easy to support multiple platforms (e.g. Linux + Heroku buildpacks, Linux + Docker, Windows) on one Cloud Foundry installation.

This blog post outlines what has emerged out of this first phase of development, and describes Diego’s architecture with an emphasis on how multiplatform support is envisioned to work.

The Pieces

To wrap your head around Diego you need to wrap your head around the components that make Cloud Foundry tick. Here’s a bullet-point list broken out into existing runtime components and new components introduced by Diego.

Runtime Components

  • Cloud Controller receives user inputs and sends NATS staging messages to Diego. The Diego code lives in diego_staging_task.rb and differs from the DEA code path. In particular, we’ve moved the responsibility for computing staging environment variables completely into CC.
  • NATS is the message bus used by the existing Cloud Foundry runtime.
  • Loggregator streams application logs back to the user.
  • DEAs run staged droplets in warden containers.

Diego Components

  • ETCD is the high-availability key-value store used to coordinate data across Diego’s components.
  • Stager listens for staging messages over NATS and constructs and places a staging-specific RunOnce (more on these below) into ETCD.
  • Executor picks up RunOnces from ETCD and executes them in a garden container (garden is a Go rewrite of warden).
  • Linux-Smelter transforms a user’s application bits into a compiled droplet. It does this by running Heroku-style buildpacks against the app bits.
  • FileServer provides blobs for downloading (the smelters live here) and proxies droplet uploads to the CC (this allows us to have a simpler upload interface downstream of the CC).

Other Diego Pieces

In addition to these components (which, basically, map onto separate processes running on separate VMs), there are a few additional puzzle pieces and concepts that must be understood when discussing Diego:

  • RunOnces: The executor is built to be a generic platform-agnostic component that runs arbitrary commands within garden containers. The RunOnce is the bag of data that tells the executor what commands to run. When the executor receives a RunOnce it:
    • Creates a garden container and applies the memory and disk limits conveyed by the RunOnce
    • Runs the actions described in the RunOnce. There are four actions:
      • Download: downloads a blob from a URL and places it in the container
      • Upload: grabs a file from the container and uploads it to a URL
      • Run: runs a provided script in the context of provided environment variables
      • FetchResult: fetches a file from the container and sets its content as the result of the RunOnce
    • Marks the RunOnce as succeeded/failed and saves it to ETCD
  • RuntimeSchema is a central repository of models (including the RunOnces) and a persistence layer that abstracts away ETCD. Diego components use the runtime-schema to communicate with each other.
  • Inigo is an integration test suite that exercises Diego’s intercomponent behavior.
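
To make the RunOnce concept concrete, here is a minimal Go sketch of how such a bag of data might be modeled. It is not the runtime-schema definition: every type name, field name, and path below is hypothetical, and only the four action kinds and the memory/disk limits come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// ActionKind names one of the four action types described above.
// The string values are illustrative, not taken from Diego.
type ActionKind string

const (
	Download    ActionKind = "download"
	Upload      ActionKind = "upload"
	Run         ActionKind = "run"
	FetchResult ActionKind = "fetch_result"
)

// Action is a hypothetical flat representation of a single step.
type Action struct {
	Kind   ActionKind
	URL    string            // Download: source URL; Upload: destination URL
	Path   string            // file path inside the container
	Script string            // Run only
	Env    map[string]string // Run only
}

// RunOnce is a hypothetical model of the data handed to the executor.
type RunOnce struct {
	Guid     string
	Stack    string
	MemoryMB int
	DiskMB   int
	Actions  []Action
	Failed   bool   // set by the executor when it finishes
	Result   string // filled in by a FetchResult action
}

// Validate checks that an action carries the fields its kind needs.
func (a Action) Validate() error {
	switch a.Kind {
	case Download, Upload:
		if a.URL == "" || a.Path == "" {
			return fmt.Errorf("%s action needs both a URL and a container path", a.Kind)
		}
	case Run:
		if a.Script == "" {
			return errors.New("run action needs a script")
		}
	case FetchResult:
		if a.Path == "" {
			return errors.New("fetch_result action needs a container path")
		}
	default:
		return fmt.Errorf("unknown action kind %q", a.Kind)
	}
	return nil
}

func main() {
	r := RunOnce{
		Guid: "example-guid", Stack: "linux", MemoryMB: 1024, DiskMB: 2048,
		Actions: []Action{
			{Kind: Download, URL: "http://example.invalid/app.zip", Path: "/app"},
			{Kind: Run, Script: "/smelter/run", Env: map[string]string{"VCAP_APPLICATION": "{}"}},
			{Kind: FetchResult, Path: "/tmp/result.json"},
		},
	}
	for _, a := range r.Actions {
		fmt.Println(a.Kind, a.Validate())
	}
}
```

The point of the sketch is that the executor never needs to know what "staging" means: it only walks a list of generic actions.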

The Staging Flow

Here’s a detailed outline of how information flows during the Diego staging process:

  • The user pushes an app to Cloud Controller via the CF CLI: cf push my_app
  • CC sends a staging NATS message. This message includes:
    • The App Guid
    • A target stack (used to support multi-platform CF deployments)
    • An ordered list of buildpacks (names + download URLs) to run when compiling the application
    • The location of the user’s app bits (a url to a blobstore)
    • The environment variables to be applied during the staging process (e.g. VCAP_APPLICATION, VCAP_SERVICES)
  • An available Diego stager receives the staging NATS message and constructs a RunOnce with the following actions:
    • A Download action to download the user’s app bits
    • Download actions for each of the requested buildpacks
    • A Download action to download the correct smelter (this is selected by stack)
    • A Run action to run the smelter (along with the environment variables received from CC)
    • An Upload action to upload the droplet
    • A FetchResult action to fetch staging metadata from the smelter
  • The Diego stager then puts the RunOnce in ETCD
  • A compatible Diego executor (one that matches the desired stack) picks up the RunOnce, spawns a container, and executes its actions. When it does this the executor also streams logs generated by commands run in the container back to the user via Loggregator.
  • On success, the executor marks the RunOnce as complete and puts it back in ETCD
  • A Diego stager then pulls out the completed RunOnce and notifies CC that staging is complete.

It’s important to understand that the Diego stager’s role is quite small: it simply interprets the incoming NATS message, produces a valid staging RunOnce, and then conveys the result of executing said RunOnce back to the CC. The pool of Diego executors is doing all the heavy lifting (namely: actually downloading the app bits and producing a droplet).
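
The stager’s translation step can be sketched in a few lines of Go. This is an illustration under stated assumptions rather than the stager’s actual code: the type names, the smelter lookup map, the container paths, the /smelter/run script name, and the reuse of the app guid as the RunOnce guid are all hypothetical. Only the message fields and the order of actions follow the flow described above.

```go
package main

import "fmt"

// Buildpack and StagingMessage mirror the fields listed in the staging flow.
type Buildpack struct{ Name, URL string }

type StagingMessage struct {
	AppGuid     string
	Stack       string
	Buildpacks  []Buildpack // ordered
	AppBitsURL  string
	Environment [][]string // [name, value] pairs
}

// Action and RunOnce are hypothetical, simplified models.
type Action struct {
	Kind   string // "download", "upload", "run" or "fetch_result"
	URL    string
	Path   string
	Script string
	Env    [][]string
}

type RunOnce struct {
	Guid    string
	Stack   string
	Actions []Action
}

// BuildStagingRunOnce turns a staging message into a RunOnce.
// smelters maps a stack name to a smelter download URL (an assumption).
func BuildStagingRunOnce(msg StagingMessage, smelters map[string]string, dropletURL string) (RunOnce, error) {
	smelterURL, ok := smelters[msg.Stack]
	if !ok {
		return RunOnce{}, fmt.Errorf("no smelter registered for stack %q", msg.Stack)
	}

	// 1. App bits, 2. buildpacks in the requested order, 3. the smelter.
	actions := []Action{{Kind: "download", URL: msg.AppBitsURL, Path: "/app"}}
	for _, bp := range msg.Buildpacks {
		actions = append(actions, Action{Kind: "download", URL: bp.URL, Path: "/buildpacks/" + bp.Name})
	}
	actions = append(actions,
		Action{Kind: "download", URL: smelterURL, Path: "/smelter"},
		// 4. Run the smelter with the environment computed by CC.
		Action{Kind: "run", Script: "/smelter/run", Env: msg.Environment},
		// 5. Upload the droplet, 6. fetch the staging metadata.
		Action{Kind: "upload", URL: dropletURL, Path: "/tmp/droplet.tgz"},
		Action{Kind: "fetch_result", Path: "/tmp/result.json"},
	)
	return RunOnce{Guid: msg.AppGuid, Stack: msg.Stack, Actions: actions}, nil
}

func main() {
	msg := StagingMessage{
		AppGuid:    "example-app-guid",
		Stack:      "linux",
		Buildpacks: []Buildpack{{Name: "ruby", URL: "http://example.invalid/ruby.zip"}},
		AppBitsURL: "http://example.invalid/app.zip",
	}
	smelters := map[string]string{"linux": "http://example.invalid/linux-smelter.tgz"}
	r, err := BuildStagingRunOnce(msg, smelters, "http://example.invalid/droplet")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, a := range r.Actions {
		fmt.Println(a.Kind, a.URL, a.Path)
	}
}
```

Note that nothing here downloads or compiles anything; the function only describes the work, which matches the stager’s small role.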

Multi-platform Entry Points

Diego is built to make supporting multiple platforms relatively straightforward. The parameter that determines which platform an app is targeted for is the stack:

  • The user selects a target platform by specifying a stack when pushing an app.
  • The Diego stager selects a smelter based on the stack and creates a RunOnce with the associated stack.
  • The Diego executors are configured with a stack parameter. An executor will only pick up a RunOnce if the stack denoted by the RunOnce matches the executor’s stack.
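
The matching rule in the last bullet amounts to a simple filter. A minimal Go sketch, with hypothetical type and method names:

```go
package main

import "fmt"

// RunOnce and Executor are hypothetical, stripped down to the stack field.
type RunOnce struct{ Guid, Stack string }

type Executor struct{ Stack string }

// Eligible returns the pending RunOnces whose stack matches the executor's.
func (e Executor) Eligible(pending []RunOnce) []RunOnce {
	var out []RunOnce
	for _, r := range pending {
		if r.Stack == e.Stack {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	pending := []RunOnce{
		{Guid: "a", Stack: "linux"},
		{Guid: "b", Stack: "windows"},
		{Guid: "c", Stack: "linux"},
	}
	linux := Executor{Stack: "linux"}
	for _, r := range linux.Eligible(pending) {
		fmt.Println(r.Guid)
	}
}
```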

Given this, to support a new platform (e.g. Windows) one needs the following pieces:

  1. A smelter designed for the target platform. For Linux, Diego has a linux-smelter that runs through Heroku-style buildpacks. For Windows, for example, one could construct a smelter that simply validates and repackages a user’s app bits in preparation for running against a .NET stack (in such a case the notion of buildpacks is unnecessary and can be ignored). The smelter’s API is simple: it should be an executable that accepts a certain set of command line arguments and produces a droplet.tgz file.
  2. A platform-specific plugin for the executor. This is compiled into the executor when it is built and solves certain platform-specific issues (e.g. converting environment variables from an array-of-arrays input format to a platform-specific format).
  3. A platform-specific Garden backend plugin. Garden performs all containerization via a backend plugin with a well-defined interface. Garden ships with a linux-backend that constitutes a reference implementation. To target other platforms one simply needs to write an API-compatible backend plugin.
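
As an illustration of the environment variable conversion mentioned in item 2, the following Go sketch turns an array-of-arrays input into the NAME=value string form commonly used on Linux (the same form Go’s os/exec package expects in Cmd.Env). The function name and the choice of output format are assumptions; the real executor plugin interface is not shown here.

```go
package main

import "fmt"

// ToKeyValue converts [name, value] pairs into "NAME=value" strings.
// It is a hypothetical example of a platform-specific conversion.
func ToKeyValue(env [][]string) ([]string, error) {
	out := make([]string, 0, len(env))
	for i, pair := range env {
		if len(pair) != 2 || pair[0] == "" {
			return nil, fmt.Errorf("entry %d is not a [name, value] pair", i)
		}
		out = append(out, pair[0]+"="+pair[1])
	}
	return out, nil
}

func main() {
	env := [][]string{
		{"VCAP_APPLICATION", "{}"},
		{"VCAP_SERVICES", "{}"},
	}
	converted, err := ToKeyValue(env)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, kv := range converted {
		fmt.Println(kv)
	}
}
```

A plugin for another platform would keep the same input and swap in whatever representation its containers expect.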

In terms of deployment, we envision that most components (the stager, fileserver, etcd, cloud controller, nats) will be deployed to Linux VMs. Only the Diego Executor and Garden need to live on the non-Linux target platform. Since these components are written in Go, recompiling them for a supported non-Linux platform should be relatively straightforward.