HPC’s shift from bare metal to hybrid cloud and Kubernetes:
While traditional computing has virtualized the majority of its workloads over the past decade, HPC has mostly remained on bare metal, with more than 80% of workloads still unvirtualized. In its eternal quest to extract maximum performance from the hardware, the HPC community has hesitated to adopt any form of virtualization because of the perceived performance overhead. The major drawbacks of running everything on bare metal, however, remain: it can take many months to get a bare metal HPC environment up and running, and these environments are not fully utilized at all times, leading to inefficient use of resources.
The broader world of IT, meanwhile, is shifting to hybrid clouds and Kubernetes and is almost 100% virtualized. The HPC community has realized that there can be major benefits in leveraging these technologies and their on-demand capabilities to innovate faster while reducing cost at the same time.
Singularity, the container runtime for HPC:
Container technologies have enabled applications to be pre-packaged concisely for easier deployment. Modern application ecosystems have leveraged containers to bring agility to deployment and lifecycle management. Docker has been the most popular container runtime and has helped cloud native applications and microservices evolve rapidly.
Even though Docker is the most prominent container platform and registry, it was designed primarily for microservices, not for High Performance Computing (HPC). Singularity is a container solution designed from the ground up for HPC and scientific computing. A Singularity container is encapsulated in a single file, making it highly portable and secure. Singularity is an open source container engine that is preferred for HPC workloads, with more than a million container runs per day and a large, specialized user base.
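As a brief illustration (our example, not from the original article), a Docker Hub image can be pulled and converted into a single Singularity Image File (SIF) and then executed. The image name and resulting filename below are illustrative and may vary with the Singularity version in use.

# Pull an image from Docker Hub and convert it into a single SIF file
$ singularity pull docker://ubuntu:18.04
# Run a command inside the resulting container (filename follows Singularity's default naming convention)
$ singularity exec ubuntu_18.04.sif cat /etc/os-release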
Container-based applications can bring a lot of challenges to orchestration, day 2 operations, and lifecycle management. With the advent of Kubernetes, container management and orchestration are being standardized. Kubernetes initially supported only Docker as its container runtime, but it has since evolved: the Container Runtime Interface (CRI), available since Kubernetes 1.5, provides support for multiple container runtimes.
In an earlier blog article, we showed how Singularity containers can be used to run machine learning workloads. In this solution, we combine the capabilities of Kubernetes offered through VMware PKS with the Singularity runtime to run HPC applications. The Singularity CRI is currently in beta and therefore might not be suitable for production environments. The Sylabs documentation site provides details about the CRI.
HW & SW Components of the Solution:
The solution was developed in the VMware Solutions lab leveraging the following components.
Table 1: HW components of the solution
The VMware SDDC and other SW components used in the solution are shown below:
Table 2: SW components of the solution
Deploying the Solution:
CentOS 7.6 Linux was used as the base operating system for the Kubernetes master and worker nodes. VMware Essentials PKS, version v1.14.3+vmware.1, was used as the Kubernetes distribution. The steps required to deploy the Singularity CRI are shown below; several packages from EPEL have to be installed first.
[root@sc2kubm40 ~]# yum install -y epel-release

Singularity and the git-related build components can then be installed:

[root@sc2kubm40 ~]# yum install -y singularity-runtime singularity git socat golang gcc libseccomp-devel

Since the Singularity CRI is in beta, it needs to be downloaded with git and compiled. These steps can be avoided once it becomes generally available.

[root@sc2kubm40 ~]# git clone https://github.com/sylabs/singularity-cri.git
Cloning into 'singularity-cri'...
remote: Enumerating objects: 396, done.
remote: Counting objects: 100% (396/396), done.
remote: Compressing objects: 100% (312/312), done.
remote: Total 7523 (delta 115), reused 337 (delta 66), pack-reused 7127
Receiving objects: 100% (7523/7523), 7.59 MiB | 3.18 MiB/s, done.
Resolving deltas: 100% (3573/3573), done.
[root@sc2kubm40 ~]# cd singularity-cri
[root@sc2kubm40 singularity-cri]# git checkout tags/v1.0.0-beta.6 -b v1.0.0-beta.6
Switched to a new branch 'v1.0.0-beta.6'
[root@sc2kubm40 singularity-cri]# make
 GO bin/sycri
[root@sc2kubm40 singularity-cri]# make install
 INSTALL /usr/local/bin/sycri
 INSTALL /usr/local/etc/sycri/sycri.yaml

A systemd unit file for the Singularity CRI should be created as shown below (or with your favorite editor):

[root@sc2kubm40 singularity-cri]# cat <<EOF > /etc/systemd/system/sycri.service
> [Unit]
> Description=Singularity-CRI
> After=network.target
>
> [Service]
> Type=simple
> Restart=always
> RestartSec=1
> ExecStart=/usr/local/bin/sycri
> Environment="PATH=/usr/local/libexec/singularity/bin:/bin:/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
>
> [Install]
> WantedBy=multi-user.target
> EOF

Once the service has been created, it has to be enabled so that it starts automatically at boot, and then started:

[root@sc2kubm40 singularity-cri]# systemctl enable sycri
Created symlink from /etc/systemd/system/multi-user.target.wants/sycri.service to /etc/systemd/system/sycri.service.
[root@sc2kubm40 singularity-cri]# systemctl start sycri

SELinux and swap should be disabled as shown below.

# Disable SELinux
[root@sc2kubm40 singularity-cri]# setenforce 0
[root@sc2kubm40 singularity-cri]# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

# Disable swap
[root@sc2kubm40 singularity-cri]# swapoff -a
[root@sc2kubm40 singularity-cri]# sed -e '/swap/s/^/#/g' -i /etc/fstab

Verify that the Singularity CRI is running and listening on its Unix socket as shown below.

[root@sc2kubm40 singularity-cri]# ls -l /var/run/singularity.sock
srw------- 1 root root 0 Oct 30 11:12 /var/run/singularity.sock

The kubelet configuration should be edited so that it uses the Singularity runtime instead of the default Docker runtime, as shown below.

[root@sc2kubm40 ~]# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS=--container-runtime=remote \
--container-runtime-endpoint=unix:///var/run/singularity.sock \
--image-service-endpoint=unix:///var/run/singularity.sock

The kubelet service has to be restarted to activate the changes.

[root@sc2kubm40 singularity-cri]# systemctl stop kubelet
[root@sc2kubm40 singularity-cri]# systemctl start kubelet

Initialize kubeadm with the Singularity runtime as shown below.

[root@sc2kubm40 singularity-cri]# kubeadm init --pod-network-cidr=192.168.0.0/16 --cri-socket unix:///var/run/singularity.sock
I1030 11:17:10.162386 27725 version.go:240] remote version is much newer: v1.16.2; falling back to: stable-1.14
[init] Using Kubernetes version: v1.14.3
[preflight] Running pre-flight checks
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
. . .
Your Kubernetes control-plane has initialized successfully!
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.7.40:6443 --token ljjnc1.xxtlzc0fmctyrwgv \
    --discovery-token-ca-cert-hash sha256:a79fc7723df3c3c343356ac614c787c0e8ea3e8440a6d7892cf2820079805a50

The Kubernetes configuration on the master should be updated to point to the new cluster as shown below.

[root@sc2kubm40 ~]# export KUBECONFIG=/etc/kubernetes/admin.conf
[root@sc2kubm40 ~]# cp /etc/kubernetes/admin.conf .kube/config
[root@sc2kubm40 ~]# kubectl get nodes
NAME        STATUS     ROLES    AGE     VERSION
sc2kubm40   NotReady   master   3m45s   v1.14.3+vmware.1
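On each worker node, the same Singularity CRI installation and kubelet configuration must be in place before joining. A minimal sketch of the join command under that assumption is shown below; the worker hostname is hypothetical, and the token and CA hash are the ones printed by kubeadm init above.

# Assumed worker node with the Singularity CRI listening on the same socket path
[root@sc2kubw41 ~]# kubeadm join 172.16.7.40:6443 --token ljjnc1.xxtlzc0fmctyrwgv \
    --discovery-token-ca-cert-hash sha256:a79fc7723df3c3c343356ac614c787c0e8ea3e8440a6d7892cf2820079805a50 \
    --cri-socket unix:///var/run/singularity.sock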
A Calico-based pod network is deployed on the cluster as shown below.
[root@sc2kubm40 ~]# kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
[root@sc2kubm40 ~]# kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
sc2kubm40   Ready    master   4m32s   v1.14.3+vmware.1
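As an additional sanity check (our addition, not part of the original output), the Calico and other system pods can be listed until they reach the Running state; pod names and counts will vary by cluster.

# List the system pods, including the Calico daemonset and controllers
[root@sc2kubm40 ~]# kubectl get pods -n kube-system -o wide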
The master node should be untainted so that it can also act as a worker.
[root@sc2kubm40 ~]# kubectl taint nodes --all node-role.kubernetes.io/master-
node/sc2kubm40 untainted
[root@sc2kubm40 ~]# ps -ef | grep singu
root     27448  9725  0 11:23 pts/0    00:00:00 grep --color=auto singu
root     28439      1  1 11:18 ?        00:00:03 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=unix:///var/run/singularity.sock
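The Singularity CRI service itself can also be checked through systemd. This verification step is our addition and is not part of the original output.

# Confirm that the sycri service is active and enabled to start at boot
[root@sc2kubm40 ~]# systemctl is-active sycri
[root@sc2kubm40 ~]# systemctl is-enabled sycri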
Now that the simple one-node Kubernetes cluster is up, we will deploy a hello world pod to confirm that the cluster is functional and is leveraging the Singularity container runtime.
[root@sc2kubm40 ~]# kubectl run hello --image=gcr.io/google-samples/hello-app:1.0 --port=8080 --generator=run-pod/v1
pod/hello created
[root@sc2kubm40 ~]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
hello   1/1     Running   0          6s
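Equivalently, the pod can be created declaratively. The manifest below is a sketch we added, not part of the original walkthrough; the run=hello label mirrors what kubectl run generates so that the expose step that follows still selects the pod.

[root@sc2kubm40 ~]# cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hello
  labels:
    run: hello
spec:
  containers:
  - name: hello
    image: gcr.io/google-samples/hello-app:1.0
    ports:
    - containerPort: 8080
EOF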
The network port of the deployed pod is exposed to outside users by leveraging a NodePort service, as shown below.
[root@sc2kubm40 ~]# kubectl expose pod hello --target-port=8080 --type=NodePort
The port that should be used to access the application externally can be displayed as shown below.
[root@sc2kubm40 ~]# echo 'http://'$(kubectl get pods -l 'run=hello' -o jsonpath='{.items[0].status.hostIP}')':'$(kubectl get services -l 'run=hello' -o jsonpath='{.items[0].spec.ports[0].nodePort}')
http://172.16.7.41:32677
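The endpoint can also be checked from the command line with curl (our addition); the exact response text depends on the hello-app image, but it should return a short greeting that includes the pod name.

# Fetch the page served by the hello pod through the NodePort
[root@sc2kubm40 ~]# curl http://172.16.7.41:32677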
The hello world pod can now be accessed using the URL in a browser as shown below.
Figure 1: Demo application running on Singularity CRI
To verify that the container runtime in use is Singularity, run the following command.
[root@sc2kubm40 ~]# kubectl describe nodes | grep Runtime
 Container Runtime Version:  singularity://3.4.1-1.2.el7
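The runtime also appears in the CONTAINER-RUNTIME column of the wide node listing; this extra check is our addition.

# The wide output includes a CONTAINER-RUNTIME column per node
[root@sc2kubm40 ~]# kubectl get nodes -o wide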
Conclusion
Singularity is the most commonly used container runtime for High Performance Computing environments. Leveraging containers can simplify the packaging and deployment of complex HPC applications. In this solution, we showcased an Essentials PKS deployment leveraging the Singularity runtime, and an example application was deployed and demonstrated. Now that we have shown a simple pod running on Kubernetes with the Singularity runtime, the same concept can be applied to other containerized HPC applications. In future work, we hope to show complex HPC applications running on Kubernetes. Through this solution we have shown that Singularity containers, in combination with VMware PKS, provide a great platform to run modern HPC applications.