By Vladimir Vivien, Staff Engineer at VMware
One of the most important value propositions of Kubernetes is its ability to seamlessly provide storage on the nodes where workloads are scheduled. Kubernetes provides a powerful plugin framework, along with a standard API, that allows different storage systems to be exposed as volume plugins. This framework standardizes the way persistent, ephemeral, or local storage is consumed by pods as file or block volumes. Primitives such as StorageClasses, PersistentVolumes, and PersistentVolumeClaims provide a declarative mechanism that separates storage implementation from consumption. This separation allows storage operations such as provisioning, attaching, and mounting to be abstracted and made portable across clusters.
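As a sketch of that separation, an administrator might define a StorageClass naming a provisioner, while a user requests storage through a PersistentVolumeClaim that references the class by name and never mentions the underlying storage system. The provisioner and object names below are hypothetical placeholders:

```yaml
# Defined by the administrator: names the provisioner and its parameters.
# "example.vendor.com" is a hypothetical provisioner name.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: example.vendor.com
parameters:
  type: ssd
---
# Created by the user: requests storage without naming any implementation.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-storage
  resources:
    requests:
      storage: 10Gi
```

Because the claim only names a class, the same claim can be satisfied by entirely different storage backends on different clusters.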
Originally, volume plugins were required to be implemented as part of the core Kubernetes codebase (in-tree). While this made the deployment of workloads with persistent volumes simpler, it presented some challenges:
- As more storage providers contributed their plugins, the codebase grew, resulting in larger Kubernetes binaries.
- Newcomers found contributing code to Kubernetes intimidating.
- Bugs introduced by plugins can take down core components, such as the kubelet.
- Being part of the Kubernetes codebase meant a slower release cycle. It also meant that vendors were forced to release their code as open source.
The Container Storage Interface
As a solution to these issues, Kubernetes adopted the Container Storage Interface, or CSI, which is a community-driven effort to standardize how file and block storage are exposed to and accessed by container orchestrators, such as Cloud Foundry, Kubernetes, and Mesos.
CSI uses gRPC, a protocol for remote procedure calls, to define a programmatic interface composed of standard volume operations. These operations are implemented outside the container orchestrator’s codebase, in plugins that run out-of-process and expose volume services to the orchestrator.
This approach promotes storage driver interoperability between container orchestrators. A CSI driver can be implemented to work with different orchestrators with minimal or no change to its codebase.
CSI is a control-plane-only specification; it is not involved in the actual consumption of volume data. It expects CSI drivers to be implemented as a set of long-running services that handle volume operation requests from the orchestrator. Depending on the level of functionality a driver offers, a vendor implements up to three services:
- Identity – The orchestrator queries this service to determine the identity and capability of the driver.
- Controller – This service exposes volume operations, such as provisioning and attachment, whose execution is not coupled with a node.
- Node – The node service provides operations, such as mount and unmount, that must be executed on the node where the volume is needed.
CSI dictates that, at a minimum, the Identity and Node services must be implemented. For instance, a driver that does not support controller operations, such as volume creation and attachment, can omit the Controller service methods.
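The specification defines these services in protobuf. The heavily trimmed excerpt below shows a few representative RPCs from each service; see the full CSI spec for the complete set of RPCs and their request/response messages:

```protobuf
service Identity {
  rpc GetPluginInfo(GetPluginInfoRequest) returns (GetPluginInfoResponse) {}
  rpc Probe(ProbeRequest) returns (ProbeResponse) {}
}

service Controller {
  rpc CreateVolume(CreateVolumeRequest) returns (CreateVolumeResponse) {}
  rpc ControllerPublishVolume(ControllerPublishVolumeRequest)
      returns (ControllerPublishVolumeResponse) {}
}

service Node {
  rpc NodeStageVolume(NodeStageVolumeRequest) returns (NodeStageVolumeResponse) {}
  rpc NodePublishVolume(NodePublishVolumeRequest) returns (NodePublishVolumeResponse) {}
}
```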
CSI and Kubernetes
CSI was introduced in Kubernetes with the release of version 1.9 and graduated to general availability in version 1.13. One of the goals of Kubernetes CSI was to make implementations transparent to users by leveraging the existing internal volume controller mechanism. This allows Kubernetes to drive CSI-backed operations such as dynamic provisioning, attachment, and mounting using existing primitives such as StorageClasses, PersistentVolumeClaims, and PersistentVolumes.
Although CSI does not dictate how orchestrators should deploy CSI drivers, Kubernetes CSI prescribes a set of components, and their deployment model, as depicted in the following figure.
In Kubernetes, the deployment of a CSI driver includes the driver itself and several other containerized components that support storage functionalities as discussed in the following section.
CSI Driver Container
In Kubernetes, the driver can be deployed as a containerized process that hosts the long-running volume services and exposes them as gRPC endpoints over Unix domain sockets to other components. The driver can be deployed as a single binary that implements all supported volume operations or as separate binaries that implement their respective services (identity, node, controller). To minimize the coupling of the driver with the Kubernetes API, the Kubernetes CSI implementation includes several accompanying sidecars that externalize volume functionalities such as volume provisioning and attachment.
External Provisioner Sidecar
Since provisioning is not usually coupled to a specific node, this sidecar container can be deployed as a StatefulSet along with the CSI driver to handle volume creation and deletion requests. It watches the API server for new PersistentVolumeClaim (PVC) objects that are annotated for CSI. Upon discovery, the provisioner uses information from the PVC to initiate the creation of a new volume, delegating the volume operations to its co-located CSI driver over gRPC. Conversely, when the PVC is removed, this component automatically calls the driver to delete the volume.
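The visible outcome of a successful provisioning call is a PersistentVolume object whose source points at the CSI driver. Schematically, it might look like the following (the driver name, volume handle, and class name are hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-storage
  csi:
    driver: example.vendor.com     # name reported by the driver's Identity service
    volumeHandle: vol-0123456789   # opaque ID returned by the driver's CreateVolume call
    fsType: ext4
```

The volumeHandle is the value the driver returned from CreateVolume; Kubernetes passes it back to the driver on every subsequent operation against that volume.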
External Attacher Sidecar
Similar to the provisioner, this component can be deployed on any node as a StatefulSet along with the CSI driver container. It watches the API server for VolumeAttachment objects, which signal it to make a gRPC call to the driver to attach a volume to a specified node. When the object is removed, it initiates a call to the CSI driver to detach the volume.
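A VolumeAttachment object records the intent to attach a specific volume to a specific node. A rough sketch (all names hypothetical):

```yaml
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-attachment-example
spec:
  attacher: example.vendor.com       # CSI driver expected to perform the attach
  nodeName: worker-node-1            # node the volume should be attached to
  source:
    persistentVolumeName: pv-example # volume to attach
```

In practice these objects are created by Kubernetes' attach/detach controller rather than by users; the external attacher only watches them and translates them into CSI ControllerPublishVolume calls.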
Node Driver Registrar Sidecar
This component is prescribed to be deployed as a sidecar container on all worker nodes along with the driver. It is responsible for registering and exposing the Identity service endpoint of the CSI driver to the kubelet using an internal plugin registration mechanism. Once registered, the kubelet can call the driver using gRPC over Unix domain sockets to handle node-specific volume operations such as mount.
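A common pattern runs the registrar next to the driver in a DaemonSet so the pair lands on every worker node. The trimmed sketch below illustrates the shape of such a deployment; the image names are illustrative, while /var/lib/kubelet reflects the kubelet's default root directory:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-example-node
spec:
  selector:
    matchLabels:
      app: csi-example-node
  template:
    metadata:
      labels:
        app: csi-example-node
    spec:
      containers:
        - name: node-driver-registrar
          image: registry.example.com/csi-node-driver-registrar:latest  # illustrative
          args:
            - --csi-address=/csi/csi.sock   # driver socket, seen via the shared mount
            - --kubelet-registration-path=/var/lib/kubelet/plugins/example.vendor.com/csi.sock
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
        - name: csi-driver
          image: registry.example.com/csi-example-driver:latest         # illustrative
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi               # driver serves its gRPC endpoints here
      volumes:
        - name: plugin-dir                  # Unix socket shared between the two containers
          hostPath:
            path: /var/lib/kubelet/plugins/example.vendor.com
            type: DirectoryOrCreate
        - name: registration-dir            # directory the kubelet watches for registrations
          hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
```

The registrar and the driver share the socket directory through a hostPath volume, which is how the kubelet ultimately reaches the driver's Node service over a Unix domain socket.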
Internal Kubelet Plugin
This component runs internally as part of the kubelet and is opaque to driver authors and CSI users. It is responsible for coordinating volume operations that require access to the node’s filesystem. For instance, the CSI kubelet plugin receives requests for mounting volumes to be made available for workload pods. After preparing the mount point, it delegates the operation to the CSI driver container to complete the operation.
Other Kubernetes CSI Features
As CSI continues to evolve and mature, the community is working feverishly to bring new features to the Kubernetes implementation. The following alpha and beta features are expected to graduate to general availability in future releases:
- Block volume – allows CSI drivers to expose raw block volumes to Kubernetes workloads.
- Topology – uses topology rules to constrain volume placement to a subset of cluster nodes.
- Snapshot – uses Kubernetes to orchestrate a volume snapshot and its restoration using CSI drivers.
- Volume resize – provides ability to resize CSI-backed volumes after they have been provisioned.
- Ephemeral volume – ability to embed, in pod specs, CSI volumes that are tied to the lifecycle of the pods.
- In-tree to CSI migration – a framework to allow older volume plugins to migrate to their CSI counterparts.
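As an illustration of the ephemeral volume feature, a pod can declare a CSI volume inline in its spec, so the volume is created with the pod and removed when the pod goes away. The driver name and attributes below are hypothetical, and the driver must support inline ephemeral volumes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inline-volume-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch
      csi:
        driver: example.vendor.com   # hypothetical inline-capable CSI driver
        volumeAttributes:
          size: 1Gi                  # driver-specific attribute
```

Unlike the PVC flow, no PersistentVolumeClaim or PersistentVolume object is involved; the volume's lifecycle is tied directly to the pod.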
Learn More About CSI
There is a lot more to CSI than is covered here. Find out more at the following links:
- Kubernetes CSI documentation – https://kubernetes-csi.github.io/docs/
- Kubernetes CSI GitHub repository – https://github.com/kubernetes-csi
- The Container Storage Interface spec – https://github.com/container-storage-interface/spec
- Sample CSI driver for Kubernetes – https://github.com/kubernetes-csi/csi-driver-host-path
- Known CSI driver implementations – https://kubernetes-csi.github.io/docs/drivers.html