Anushka Makhija and Austina Lin contributed to this post.
If you run Tanzu Application Service for Windows or Tanzu Kubernetes Grid Integrated Edition with Windows guests, you know the process for creating a Windows stemcell can be complicated. For instances running in public clouds like AWS, Azure, or GCP, Microsoft allows these stemcells to be distributed directly through the public cloud vendor. This means that, as a platform operator, all you need to do is download the stemcell and upload it to Operations Manager.
For private clouds, the process is a bit different. Since Microsoft does not allow for the distribution of licenses in this context, it has fallen to platform operators to manually create their own stemcells. Initial steps for creating those stemcells include procuring a “clean” Windows image, configuring the networking, running the necessary PowerShell commands, and converting the VM into a stemcell. Each month, as new operating system patches are released, most of this work needs to be repeated. The process is complicated, time-consuming, and worst of all, error-prone. It is also worth noting that with the general availability of Stembuild in June 2020, the manual process of Windows stemcell creation will be deprecated from a VMware support standpoint.
That’s why the Windows and .NET teams at VMware created Stembuild, a tool that enables platform operators to automate the creation of Windows stemcells. It removes the complex toil from this process and replaces it with two simple commands: stembuild construct and stembuild package.
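To make the shape of those two commands concrete, here is a minimal Python sketch that assembles the argument list for each step. The flag names are taken from the Stembuild README as we understand it and may differ between releases, so check them against the help output of your Stembuild version. The helper names and every value shown (IP address, vCenter URL, credentials, inventory path) are hypothetical placeholders.

```python
import shlex


def vcenter_flags(url, username, password, inventory_path):
    """Flags shared by both steps (names assumed from the Stembuild README)."""
    return [
        "-vcenter-url", url,
        "-vcenter-username", username,
        "-vcenter-password", password,
        "-vm-inventory-path", inventory_path,
    ]


def construct_command(vm_ip, vm_username, vm_password, **vcenter):
    """Argument list for the hardening step, run against the cloned VM."""
    return [
        "stembuild", "construct",
        "-vm-ip", vm_ip,
        "-vm-username", vm_username,
        "-vm-password", vm_password,
        *vcenter_flags(**vcenter),
    ]


def package_command(**vcenter):
    """Argument list for the step that turns the powered-off VM into a stemcell."""
    return ["stembuild", "package", *vcenter_flags(**vcenter)]


if __name__ == "__main__":
    # Placeholder values only; substitute your own environment details.
    vcenter = dict(
        url="vcenter.example.com",
        username="operator",
        password="CHANGE_ME",
        inventory_path="/datacenter/vm/stemcells/windows-clone",
    )
    print(shlex.join(construct_command("10.0.0.5", "Administrator", "CHANGE_ME", **vcenter)))
    print(shlex.join(package_command(**vcenter)))
```

Building the commands as lists rather than strings keeps passwords with special characters intact if you later hand them to `subprocess.run`.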
If you are new to Stembuild and how it works, check out the videos here.
For those of you already using Stembuild, or thinking about adding it to your Windows stemcell creation process, this post will go over some best practices and show you how to avoid some common pitfalls.
So I just updated my windows server 1803 stemcell on PASW in 20 minutes and deployed to *all* live VMs with no downtime to any apps. <click>
"That was easy" 1person + 20minutes + 1deploy = PatchedWindowsServers #dotnetatpivotal #pivotal #windows #zerodowntime #patchedservers pic.twitter.com/praWz5VuC8
— David Dieruf (@DierufDavid) January 10, 2019
Best practice #1: Use a vanilla Windows image
Many organizations want to customize their OS images, which is understandable. Adding virus scanning, removing unnecessary packages, or making the OS compatible with internal management tools is often necessary. This is especially true with long-lived OS instances, like those running on end-user client devices. But it isn’t the case with Windows stemcells.
When creating stemcells, with or without Stembuild, customized Windows images can lead to issues with both creation and deployment. Moreover, customized stemcells, whether Linux or Windows, aren’t supported on public clouds. For these reasons, using a vanilla base Windows OS can keep your infrastructure stable, supportable, and consistent across a multicloud deployment.
Additionally, because Windows stemcells are going to be run as part of a cloud native application, they are necessarily small in scope and short-lived. In a microservices architecture, an individual instance may only perform a single, small function. And once that function is done, that instance will be destroyed. So the same security practices used on long-lived instances are not necessary for a cloud native context.
Customizations can still be done with Bosh Add-ons. These are a cloud-friendly way of making the customizations necessary to run your business and/or comply with regulations. They are added after the stemcell has been uploaded. Contact your VMware representative if you have questions as to how this can be done.
Best practice #2: Avoid manual shutdown of your VM
Once the base Windows Server image has been uploaded to vCenter, configured, and cloned, it’s time to run stembuild construct. This command hardens the cloned VM into a stemcell. Things like cleaning registry settings, applying security policies, updating root certificates, and installing the Bosh agent are automated with construct. Once the work is complete, construct runs Sysprep, which shuts down the VM. Now the operating system is clear of network settings and its hostname, and ready for Bosh to use the image over and over again.
This last step is important: Stembuild shuts down the target VM automatically once it has finished. Shutting the VM down manually can leave Sysprep incomplete or the stemcell not Bosh-ready.
Shutting down the VM manually in vCenter. Don’t do this!
If the VM hangs and does not shut down after stembuild construct, make sure all users are logged out of the Windows instance before you run stembuild construct. If the issue continues, there may be another, underlying issue that needs to be resolved.
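If your automation needs to confirm that the VM has powered off on its own before moving on, polling is safer than forcing a shutdown. The following is a minimal sketch under stated assumptions: `get_power_state` is a hypothetical callable you supply (for example, a thin wrapper around whichever vCenter client you use) that returns the VM's power state as a string, with `"poweredOff"` being the value vSphere reports for a stopped VM. The timeout and polling interval are arbitrary defaults.

```python
import time


def wait_for_power_off(get_power_state, timeout_s=3600, poll_s=30, sleep=time.sleep):
    """Poll until the VM reports 'poweredOff'.

    Returns True once the VM is off, or False if timeout_s elapses first.
    Never issues a shutdown itself; Sysprep is left to finish on its own.
    get_power_state is a caller-supplied, hypothetical callable.
    """
    waited = 0
    while True:
        if get_power_state() == "poweredOff":
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s


if __name__ == "__main__":
    # Simulated VM that powers off on the third check.
    states = iter(["poweredOn", "poweredOn", "poweredOff"])
    done = wait_for_power_off(lambda: next(states), timeout_s=60, poll_s=1, sleep=lambda s: None)
    print("VM powered off:", done)
```

A `False` result is a signal to investigate (logged-in users, a stuck Sysprep), not to power the VM off by hand.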
Best practice #3: Only use Bosh SSH to connect to the target VM
At this point, stembuild construct has hardened the operating system, stembuild package has created the stemcell for you, and you have uploaded the stemcell to Operations Manager. As Bosh uses the image to deploy managed VMs (i.e., Diego cells), you might need to connect to one of the instances. While you might be tempted to try RDP or WinRM to remote in, both of those options were removed during the Stembuild process. The recommended way of remoting into an instance operating system is through Bosh SSH. It’s built into the Bosh CLI and is a very secure way of managing Cloud Foundry cells.
Bosh SSH is a powerful tool. In addition to advanced security functionality like encryption and the creation of centralized, auditable logs, it will also create a unique temporary user account for each SSH session. Upon exiting, the account (and all of its contents) will be deleted to ensure no lingering data can be used maliciously.
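As an illustration, here is a minimal Python sketch that builds a Bosh SSH invocation. It relies only on the Bosh CLI's `-e` (environment), `-d` (deployment), and `ssh -c` (run a single command) options. The helper name, the deployment name, and the instance name are hypothetical; use `bosh deployments` and `bosh vms` to find the real ones in your foundation.

```python
import shlex


def bosh_ssh_command(deployment, instance, environment=None, remote_command=None):
    """Return the argument list for a Bosh SSH session to one instance.

    instance is 'GROUP/ID' (or just 'GROUP'). If remote_command is given,
    it is run non-interactively through 'ssh -c'.
    """
    argv = ["bosh"]
    if environment:
        argv += ["-e", environment]
    argv += ["-d", deployment, "ssh", instance]
    if remote_command:
        argv += ["-c", remote_command]
    return argv


if __name__ == "__main__":
    # Placeholder deployment and instance names, for illustration only.
    cmd = bosh_ssh_command(
        deployment="my-windows-deployment",
        instance="windows_diego_cell/0",
        remote_command="hostname",
    )
    print(shlex.join(cmd))
```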
Going further (or getting started)
All three best practices help ensure that you can create Windows stemcells easily, securely, and in a repeatable fashion. If you would like to dive further into automating this process, check out this Concourse pipeline. It will completely automate the stemcell creation process with Stembuild, from monitoring Windows for updates to pushing your new stemcell into Operations Manager.
For more information about Stembuild, including demos, check out these videos.