Forgive me. It has been more than six weeks since my last post. To make up for it, this post is pretty long, weighing in at around 3,000 words. On the plus side, you only need to read all of those words in the example code output if you're really interested.
Part of the reason I have been slacking on the blog front is that our team has been working to get our new content and infrastructure built, upgraded and ready for all of you to enjoy at the VMworlds this year. We have some pretty exciting new labs coming, as well as some VMworld sessions about how we run things and how some people are using our labs: INF1311 - Deploy and Destroy 85K VMs in a week: VMware Hands-on Labs Backstage and SDDC2277 - A New Way of Evaluating VMware Products and Solutions - VMware Hands-on Labs
There are some cases when you have plenty of bandwidth between your source and target, and other times when that is just not possible. Still, the data must go through! So, if bandwidth between clouds is on the lower end, or if we are refreshing a version of an existing vPod, our second replication mechanism using rsync may save us quite a bit of time.
It has been said in a variety of different ways that the best I/O is the one you don’t need to do – that is especially true for moving data over great distances and/or via low-bandwidth connections. For the 2013 content year, as we developed this solution we transferred our base vPods using the LFTP method, then used the base pods as “seeds” for an rsync differential replication.
This solution is a combination of the right tools and the right process. Either one used in isolation would not provide us as much benefit as using them together. I know it sounds a bit gestalt and maybe a little cheesy, but I hope you see what I mean when you read through this post.
Tool: What is rsync?
Rsync is a program designed to efficiently update files at one location from the "master" at another location. The basis of rsync's efficiency is that computing and transferring cryptographic hashes of file parts is more efficient than transferring the data itself. This means that rsync uses the hashes (MD5 and a "rolling checksum") to identify portions of files that are identical and does not transfer those parts. This mechanism may not be beneficial for all types of data, but we discovered that it can work well for our vPods. Note that we are not concerned about the cryptographic strength of these hashes -- they are only used as a tool to detect commonality between the files.
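To make the idea concrete, here is a minimal Python sketch of block matching with a rolling checksum plus a strong hash. It is a simplified illustration and not rsync's actual implementation: the 4-byte block size, the 16-bit modulus, and the function names (weak_checksum, roll, delta) are all assumptions chosen to keep the example small. The "old" data stands in for the seed copy already on the target, and "new" for the file being sent.

```python
import hashlib

MOD = 1 << 16  # both halves of the weak checksum are kept to 16 bits


def weak_checksum(block: bytes) -> tuple:
    """Adler-style weak checksum: a plain sum and a position-weighted sum."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, block_len: int) -> tuple:
    """Slide the window one byte to the right without re-reading the block."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - block_len * out_byte + a) % MOD
    return a, b


def delta(old: bytes, new: bytes, block_len: int = 4) -> list:
    """Describe `new` as ('copy', old_block_number) and ('data', bytes) steps."""
    # Index every full block of the old file by its weak checksum and keep
    # the MD5 digest so that candidate matches can be confirmed.
    table = {}
    for start in range(0, len(old) - block_len + 1, block_len):
        block = old[start:start + block_len]
        table.setdefault(weak_checksum(block), []).append(
            (hashlib.md5(block).digest(), start // block_len))

    ops, literal, pos, sums = [], bytearray(), 0, None
    while pos + block_len <= len(new):
        window = new[pos:pos + block_len]
        if sums is None:  # first window, or first window after a match
            sums = weak_checksum(window)
        match = None
        for strong, block_no in table.get(sums, []):
            if hashlib.md5(window).digest() == strong:
                match = block_no
                break
        if match is not None:
            if literal:  # flush bytes that had no match in the old file
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", match))
            pos += block_len
            sums = None
        else:
            literal.append(new[pos])
            if pos + block_len < len(new):
                sums = roll(*sums, new[pos], new[pos + block_len], block_len)
            pos += 1
    literal.extend(new[pos:])  # tail shorter than one block
    if literal:
        ops.append(("data", bytes(literal)))
    return ops
```

Only the "data" steps would need to cross the wire; every "copy" step refers to a block the target already has. That is why a seed that closely resembles the new file pays off so well.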
When I first started looking at this tool, I noticed that most people are using it to mirror directory trees for backup purposes. These people are worried about how rsync detects different files between the file system trees so that they do not transfer the same file twice. My use case is slightly different: I am concerned about moving really large files that exist in a pretty flat directory structure. This means that I want to detect commonality within the files themselves. Fortunately, rsync can handle both tasks very well.
Rsync works at a file level. This means that it compares "fileX" at host A to "fileX" on host B. When we export vPods from vCD, we get a directory full of VMDK files with seemingly random names. This means that we did not necessarily have the same sets of files at each site, so we needed a way to match files that had a good chance of being similar in order to achieve maximum benefit. If I set rsync to "update" the system disk from an ESXi host from the seed copy of a Windows machine, that wouldn't be very efficient. Ideally, we want to update something that is as close as possible to that which we are trying to send.
Rsync calculates the differences between two files in real time during the replication, so using rsync to determine which mappings might be most efficient is not really practical. Due to some specifics around our use case, we are able to use the OVF file to give us a fighting chance. Our lab teams start with a “base vPod” which contains some standard components, and most of the components that are added to the pods have standardized names.
This means that we can begin by keying off the names of the VMs in the vPods to match the base (seed) to each team's initial (draft 1) version. For example, each pod has a "controlcenter" that looks very similar from a disk perspective, and many vPods with an "esx-01a" VM use the same build, so we usually only need to transfer that once.
Process: Lab Development Flow and Differences
Before lab topics have been decided for the year, the Hands-on Labs core team begins work on the Base Templates. These templates serve as the basis for most of the Hands-on Labs vPods and contain the controlcenter, shared storage, a vpodrouter, two or more vESXi hosts, and at least one vCenter Server, typically an appliance. We usually create at least two different versions: Single Site and Dual Site. In addition, we provide other component VMs like additional vESXi hosts, a variety of Windows and Linux VMs, and appliances. When these are finished, they are checked in to our Master-Templates catalog and replicated to our target clouds as the initial "seeds" for future replication tasks.
When our teams begin development of our labs, we ask them to start with one of our Master Templates and build a "draft 1" vPod that is specific to their lab. This initial draft is intended to have all of the base components that each team will be using within their final vPod. When the teams have assembled this version, they check it in so we can export it and perform an initial replication to our target cloud(s). This is done very early in the process so that we have the maximum amount of time to replicate what should be the most differences we will see for each vPod. This is represented by blue blocks in the diagram on the right.
Because we started from a standard base, the core services within each vPod (ControlCenter, vESXi hosts, vCenter, and even storage appliances) tend to have a lot of initial commonality. When we get a "draft 1" pod with the base components in it, that gets us even closer to the final state for each vPod. As we replicate incremental versions of the vPods, they deviate further from the base vPod, but they should be much closer to the previous version of that same vPod. So, when we need to update from "draft1" to "draft2" vPods, we use the "draft1" export as the seed for "draft2" and only transfer the blocks that have changed between those versions -- the green blocks in the diagram.
By the time we get to the "final" version that we present at VMworld, there should be minimal differences (purple block) between that and "draft2", so WAN replication should take the least amount of time. At least, that is how it would work in the best case scenario.
OVF Mapping with PowerShell
When we export one of our vPods from vCloud Director, we get an OVF file and a bunch of VMDKs. The OVF file contains, among other things, a map of the VMDK files to the VMs in the vPod. Our OVF mapping process reads the source and target OVF files and matches the VMDK files in the source and target exports to one another based on the component VM names and the position of the disks on each VM. The output of this process is a “lab map” which contains Cygwin commands to rename the current (old) VMDK files on the target with the corresponding names of VMDKs in the new export. This allows rsync to perform its differential update on the files -- we provide what it needs to work properly.
The OVF mapping process is implemented as a PowerCLI script we call Sync-VPod-Phase1. Dealing with an OVF in PowerShell is a matter of opening it while casting the input variable as XML and navigating the structure. We do this for both the source and target OVFs, enumerate the VMs and VMDKs in each, and compare them to one another. For our use case, the basic structure of a given lab very rarely changes, especially as we transition in lifecycle from development to production and then maintenance.
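The matching step can be sketched in Python as follows. This is not the Sync-VPod-Phase1 script itself, just an illustration of the same idea under some assumptions: the function names (disks_by_vm, match_disks) are hypothetical, the VM name is assumed to be carried in the ovf:id attribute of each VirtualSystem, and disks are identified by their rasd:ElementName label (for example "Hard disk 1").

```python
import xml.etree.ElementTree as ET

OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"
RASD_NS = ("http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/"
           "CIM_ResourceAllocationSettingData")
NS = {"ovf": OVF_NS, "rasd": RASD_NS}
A = "{%s}" % OVF_NS  # prefix ElementTree uses for namespaced ovf: attributes


def disks_by_vm(ovf_text: str) -> dict:
    """Map (vm_name, disk_label) -> VMDK file name for one OVF descriptor."""
    root = ET.fromstring(ovf_text)
    # References/File: file id -> file name on disk
    files = {f.get(A + "id"): f.get(A + "href")
             for f in root.iterfind("ovf:References/ovf:File", NS)}
    # DiskSection/Disk: disk id -> file name (via the file reference)
    disks = {d.get(A + "diskId"): files.get(d.get(A + "fileRef"))
             for d in root.iterfind("ovf:DiskSection/ovf:Disk", NS)}
    result = {}
    for vm in root.iter(A + "VirtualSystem"):
        for item in vm.iterfind("ovf:VirtualHardwareSection/ovf:Item", NS):
            host = item.findtext("rasd:HostResource", "", NS)
            if host.startswith("ovf:/disk/"):
                label = item.findtext("rasd:ElementName", "", NS)
                disk_id = host[len("ovf:/disk/"):]
                result[(vm.get(A + "id"), label)] = disks.get(disk_id)
    return result


def match_disks(old_ovf: str, new_ovf: str) -> tuple:
    """Pair old and new VMDK names that share a VM name and disk label."""
    old, new = disks_by_vm(old_ovf), disks_by_vm(new_ovf)
    pairs = [(old[key], new[key]) for key in new if key in old]
    unmatched = [key for key in new if key not in old]  # new VMs or new disks
    return pairs, unmatched
```

Each pair is an (old name, new name) rename candidate. Anything in the unmatched list has no seed on the target and would be transferred in full, which is the situation worth a warning.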
As a failsafe, and because we had some fun developing naming conventions on the fly, the current script looks for a file called OLD.OVF in the new vPod’s export directory. This is a copy of the OVF file from the existing vPod, and the script uses it to map the old vPod. So, after I copy this OLD.OVF file from the existing vPod, I can issue the following command to map a new vPod called HOL-SDC-1303-v12 in the E:\HOL-Lib\ directory on the local (source) catalog host to the old one called HOL-SDC-1303-v11 that exists in the E:\HOL-Lib directory on the target-catalog host, and write the map file to E:\HOL-Lib\LabMaps:
Sync-VPod-Phase1 -Target target-catalog -OldName HOL-SDC-1303-v11 -NewName HOL-SDC-1303-v12 -SourcePath E:\HOL-Lib\ -TargetPath /cygdrive/e/HOL-Lib/ -OutputPath E:\HOL-Lib\LabMaps\
#### SCRIPT OUTPUT BEGIN ####
=*=*=*=* Sync-VPod-Push Phase 1 Start 02/27/2014 07:30:43 *=*=*=*=
=====>Begin VMDK Mapping
==> HOST: vpodrouter
OLD Hard disk 1->vm-c72c5fd3-c151-4bce-83f2-35053ba01206-disk-0.vmdk
NEW Hard disk 1->vm-c9a1b951-2576-47b1-b90d-3a1f829b2862-disk-0.vmdk
==> HOST: esxcomp-01b
OLD Hard disk 1->vm-7d01340f-0c45-474d-a2db-7838e6adee85-disk-0.vmdk
NEW Hard disk 1->vm-0a1bb398-c78e-4db1-8616-f5d7fd0ab056-disk-0.vmdk
==> HOST: controlcenter
OLD Hard disk 1->vm-8b05be29-2393-484d-b6c8-10e9c1ae1a4c-disk-0.vmdk
NEW Hard disk 1->vm-200dabd3-58c7-4949-991c-137be7c4f5a2-disk-0.vmdk
==> HOST: esxcomp-01a
OLD Hard disk 1->vm-4c1dbf27-6d7f-4b6b-95ef-4c26d624e623-disk-0.vmdk
NEW Hard disk 1->vm-f174414a-cf84-477f-a560-2fd4e595d6d1-disk-0.vmdk
==> HOST: esxcomp-02a
OLD Hard disk 1->vm-f200cf90-014a-43cd-9672-b60c4af0d9bc-disk-0.vmdk
NEW Hard disk 1->vm-9980c5a2-ff71-422a-ba1f-adc2ec81966f-disk-0.vmdk
==> HOST: esx-02a
OLD Hard disk 1->vm-dd3acf98-75b2-4999-a91d-946be4b31687-disk-0.vmdk
NEW Hard disk 1->vm-a2d4d64c-fc33-42fb-8693-b646f36354f0-disk-0.vmdk
==> HOST: esx-01a
OLD Hard disk 1->vm-82894142-1a4e-4343-827b-7519ea571da2-disk-0.vmdk
NEW Hard disk 1->vm-be8b6106-b6f6-48ac-a4ec-6e557ab6284f-disk-0.vmdk
==> HOST: stgb-l-01a
OLD Hard disk 5->vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-3.vmdk
NEW Hard disk 5->vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-3.vmdk
==> HOST: stgb-l-01a
OLD Hard disk 3->vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-1.vmdk
NEW Hard disk 3->vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-1.vmdk
==> HOST: stgb-l-01a
OLD Hard disk 1->vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-4.vmdk
NEW Hard disk 1->vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-4.vmdk
==> HOST: stgb-l-01a
OLD Hard disk 2->vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-0.vmdk
NEW Hard disk 2->vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-0.vmdk
==> HOST: stgb-l-01a
OLD Hard disk 4->vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-2.vmdk
NEW Hard disk 4->vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-2.vmdk
==> HOST: vc-l-01a
OLD Hard disk 1->vm-daa9c980-b5f7-4b98-b9e0-f7ec349bcda3-disk-0.vmdk
NEW Hard disk 1->vm-962e9b35-2aeb-4750-a156-fbf37750e5b2-disk-0.vmdk
==> HOST: vc-l-01a
OLD Hard disk 2->vm-daa9c980-b5f7-4b98-b9e0-f7ec349bcda3-disk-1.vmdk
NEW Hard disk 2->vm-962e9b35-2aeb-4750-a156-fbf37750e5b2-disk-1.vmdk
==> HOST: nsxmgr-l-01a
OLD Hard disk 1->vm-c51755ec-ea55-41f8-8a18-091f9742af28-disk-0.vmdk
NEW Hard disk 1->vm-4d58c86f-b341-499e-9221-6ff5df9d7ac1-disk-0.vmdk
=====>VMDK Mapping Complete
Rename directory HOL-SDC-1303-v11 to HOL-SDC-1303-v12
=====>Perform file sync:
REPLICATION COMMAND: rsync -tvhPrn --stats --delete --max-delete=3 /cygdrive/e/HOL-Lib/s0/HOL-SDC-1303-v12/ doug@target-catalog:/cygdrive/e/HOL-Lib/HOL-SDC-1303-v12
=*=*=*=*=* Sync-VPod-Push-Phase1 End 02/27/2014 07:30:44 *=*=*=*=*=
#### SCRIPT OUTPUT END ####
I wrote this script to send a lot of data to the console. Most of the output can usually be ignored, but it is useful for troubleshooting anomalies in the mapping, and the script will throw warnings if a new VM or new VMDK has been added in the new version. To be honest, most of the “anomalies” had to do with spaces in the vPod names, which translated to spaces in the paths that had to be escaped for Windows AND Cygwin AND ssh. My advice is to avoid the use of spaces in template names unless it is absolutely critical. It took me a while, but I think I have the escape sequences down. The most important output of the script is the mapping file, which is deposited in the location specified in the -OutputPath parameter. In this case, that file is E:\HOL-Lib\LabMaps\HOL-SDC-1303-v12.txt
The Mapping File
The mapping file contains commands necessary to rename the old vPod’s export files with the new export’s names. These are written to execute via ssh from the local (source-catalog) machine to the specified remote machine (target-catalog). These commands can be passed into a Cygwin process, or otherwise executed via SSH, but this was a new process and timing was critical to our success, so I ran much of this process manually during the 2013 development cycle. By “manually,” I do not mean that I typed each of the commands. Rather, I pasted the commands into a terminal window, allowed them to execute, and verified the output. The original contents of the mapping file look something like this:
#### LabMap BEGIN ####
#### STARTING 02/27/2014 07:30:43
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-c72c5fd3-c151-4bce-83f2-35053ba01206-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-c9a1b951-2576-47b1-b90d-3a1f829b2862-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-7d01340f-0c45-474d-a2db-7838e6adee85-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-0a1bb398-c78e-4db1-8616-f5d7fd0ab056-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-8b05be29-2393-484d-b6c8-10e9c1ae1a4c-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-200dabd3-58c7-4949-991c-137be7c4f5a2-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-4c1dbf27-6d7f-4b6b-95ef-4c26d624e623-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-f174414a-cf84-477f-a560-2fd4e595d6d1-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-f200cf90-014a-43cd-9672-b60c4af0d9bc-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-9980c5a2-ff71-422a-ba1f-adc2ec81966f-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dd3acf98-75b2-4999-a91d-946be4b31687-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-a2d4d64c-fc33-42fb-8693-b646f36354f0-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-82894142-1a4e-4343-827b-7519ea571da2-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-be8b6106-b6f6-48ac-a4ec-6e557ab6284f-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-3.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-3.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-1.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-1.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-4.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-4.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-96aa7d32-2282-46bb-b567-1954890a187d-disk-2.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-dafb7f02-650f-4077-a940-e0c0cba36b25-disk-2.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-daa9c980-b5f7-4b98-b9e0-f7ec349bcda3-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-962e9b35-2aeb-4750-a156-fbf37750e5b2-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-daa9c980-b5f7-4b98-b9e0-f7ec349bcda3-disk-1.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-962e9b35-2aeb-4750-a156-fbf37750e5b2-disk-1.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-c51755ec-ea55-41f8-8a18-091f9742af28-disk-0.vmdk /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11/vm-4d58c86f-b341-499e-9221-6ff5df9d7ac1-disk-0.vmdk"
ssh doug@target-catalog "mv /cygdrive/e/HOL-Lib/HOL-SDC-1303-v11 /cygdrive/e/HOL-Lib/HOL-SDC-1303-v12"
rsync -tvhPrn --stats --delete --max-delete=3 /cygdrive/e/HOL-Lib/HOL-SDC-1303-v12/ doug@target-catalog:"/cygdrive/e/HOL-Lib/HOL-SDC-1303-v12"
#### COMPLETED 02/27/2014 07:30:44
#### LabMap END ####
When executing the commands, I looked for any errors in the renaming processes. What I found was that Windows likes to hang on to things, and sometimes the rename would fail because PowerShell still had the file open following the export from vCD. The only way I was able to release the file was to exit the offending PowerShell process.
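For anyone generating similar rename lines, here is a small Python sketch of building them with two layers of shell quoting, one for the local shell and one for the remote shell that ssh hands the command to. It is an illustration only: the function name rename_commands is hypothetical, it emits single quotes where my script emits double quotes, and it does not address the Windows layer of escaping.

```python
import posixpath
import shlex


def rename_commands(target: str, pod_dir: str, pairs: list) -> list:
    """Build one 'ssh target "mv old new"' line per (old, new) file pair."""
    commands = []
    for old_name, new_name in pairs:
        # Inner layer: quoting for the shell on the remote (target) host.
        remote = shlex.join(["mv",
                             posixpath.join(pod_dir, old_name),
                             posixpath.join(pod_dir, new_name)])
        # Outer layer: quoting for the local shell that launches ssh.
        commands.append(shlex.join(["ssh", target, remote]))
    return commands


for line in rename_commands("doug@target-catalog",
                            "/cygdrive/e/HOL-Lib/HOL-SDC-1303-v11",
                            [("old-disk-0.vmdk", "new-disk-0.vmdk")]):
    print(line)
```

Because each path is quoted before the whole remote command is quoted again, a space in a vPod name survives both shells. shlex.join needs Python 3.8 or later.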
Replication time with rsync
Once you have the files on the target side renamed, you can execute the “rsync” line out of the LabMap -- it should be the line right above the #### COMPLETED line. This is the command that performs the actual differential replication. Here, I have implemented it in “dry run” mode because I like it to tell me what it is going to do before it actually does anything. Yes, rsync will delete files in the target that are not in the source, so you need to be very careful: typos can have disastrous effects.
In our example, here is the command generated by the script:
rsync -tvhPrn --stats --delete --max-delete=3 /cygdrive/e/HOL-Lib/HOL-SDC-1303-v12/ doug@target-catalog:/cygdrive/e/HOL-Lib/HOL-SDC-1303-v12
You can paste this into the Cygwin terminal window and see what comes out. It will not make any changes, just report what it would do (“dry run”). The paths in this example are a little different, but here is an actual run from my machine:
If the dry run output shows that any VMDKs will be deleted, it is a good idea to verify why that is occurring. Sometimes a VM has been removed from a new version of a pod, and sometimes a temp drive has been removed from a VM. Either way, it is best to know that before allowing rsync to wipe it out on the remote side. Some tools have sharp edges and rsync is most definitely one of them.
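That check can be made mechanical. With -v and --delete, rsync reports each planned removal on a line beginning with "deleting ", so a short Python sketch can list them from captured dry-run output. The function names (planned_deletions, unexpected_deletions) and the idea of an "expected" list are my own assumptions for illustration.

```python
def planned_deletions(dry_run_output: str) -> list:
    """Return the paths that rsync's dry run says it would delete."""
    prefix = "deleting "
    return [line[len(prefix):]
            for line in dry_run_output.splitlines()
            if line.startswith(prefix)]


def unexpected_deletions(dry_run_output: str, expected=()) -> list:
    """Return planned deletions that nobody has signed off on."""
    allowed = set(expected)
    return [path for path in planned_deletions(dry_run_output)
            if path not in allowed]
```

An empty list from unexpected_deletions means every removal was anticipated, for example a temp drive that was deliberately dropped from a VM. Anything else deserves a look before the real run.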
Making it real
So, once things look good with the dry run, we can make a slight adjustment to the rsync command to make it actually do the work. Simply change “-tvhPrn” to “-tvhPr” and run the command. Note that I specify the (--delete) option, but temper it with (--max-delete=3) to minimize potential damage. You may need to adjust that value, depending on your use case.
rsync -tvhPr --stats --delete --max-delete=3 /cygdrive/e/HOL-Library/s0/HOL-SDC-1303-v12/ dbaer@wdc2-tfr:/cygdrive/e/HOL-Library/HOL-SDC-1303-v12
WARNING: Executing this command will delete all of the files that it said it would delete in the dry-run and then begin replication. If you mistype something, it could delete everything. Do yourself a favor and use the command history: up arrow, then just remove the "n" from the switches. Don't change anything else without executing another dry run.
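If you would rather not edit the switches by hand, the same "change nothing but the n" rule can be sketched in Python. This is a hypothetical helper (make_real is not part of my script), and it assumes that every argument starting with a single dash is a cluster of flags with no attached values, which holds for the commands shown here.

```python
import shlex


def make_real(dry_run_cmd: str) -> str:
    """Return the same rsync command with the dry-run switch removed."""
    kept = []
    for arg in shlex.split(dry_run_cmd):
        if arg == "--dry-run":
            continue
        if arg.startswith("-") and not arg.startswith("--"):
            arg = arg.replace("n", "")  # drop -n from a cluster like -tvhPrn
            if arg == "-":              # the cluster was only -n
                continue
        kept.append(arg)
    return shlex.join(kept)


dry = ("rsync -tvhPrn --stats --delete --max-delete=3 "
       "/cygdrive/e/HOL-Lib/HOL-SDC-1303-v12/ "
       "doug@target-catalog:/cygdrive/e/HOL-Lib/HOL-SDC-1303-v12")
print(make_real(dry))
```

Source, target, and every other option pass through untouched, so the real run operates on exactly what the dry run reported.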
As mentioned previously, it is not possible for rsync to give a time estimate up front since it calculates deltas on the fly. Using the specified rsync switches, you will be provided with realtime estimates for the current file.
You will notice that the time estimates and effective bandwidth numbers fluctuate, sometimes wildly. Unless there have been very significant changes to a pod, we typically average double-digit MB/s transfer rates, even over a link with substantially lower bandwidth actually available. You may have patches of 1-2 MB/s, or even sub-1MB/s where there have been a lot of changes, but those are typically fairly small except during initial replication.
A few cool things about rsync
- it is using SSH here, so all communication is encrypted
- it calculates checksums on the files, so you know that whatever is on the source is the same as is on the target
- once you’re finished, you can kick the same command off again and it will recalculate and compare all of the checksums (and hopefully not transfer anything!)
- it can pick up where it left off — sometimes — depending on the type of failure and the options specified. Read the man page for rsync for more details.
In my experience using 32-bit Cygwin on Windows 2008, the actual bandwidth of a single stream over SSH seems to be capped at around 2 MB/s, depending on the actual bandwidth between catalog hosts. This becomes an issue when you have a lot of new blocks or a new pod -- but you can use the LFTP method described in my previous post to overcome those limitations.
As each file completes, rsync will record the final effective bandwidth and duration information for that file. In the previous diagram, you can see that we effectively transferred 8.46 GB in around 12 minutes. That’s not too bad for a WAN transfer.
You can track progress through the vPod by looking at the “to-check=##/##” section. At the end of the replication, rsync provides a summary report detailing how much data was actually transferred. In this example, there was 2.52 GB of changed blocks in a 64.05 GB vPod, meaning that we did not need to transfer 61.53 GB over the WAN!
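The arithmetic behind those numbers is simple enough to script. In this Python sketch, the function names are hypothetical and the conversion of 1000 MB per GB is an assumption about how the sizes were reported.

```python
def savings(total_gb: float, sent_gb: float) -> tuple:
    """Return (GB not sent, percent of the vPod not sent)."""
    skipped = total_gb - sent_gb
    return skipped, 100.0 * skipped / total_gb


def effective_rate_mb_s(size_gb: float, minutes: float,
                        mb_per_gb: int = 1000) -> float:
    """Effective throughput: data represented divided by wall-clock time."""
    return size_gb * mb_per_gb / (minutes * 60)


skipped, percent = savings(64.05, 2.52)
print(f"{skipped:.2f} GB not sent ({percent:.1f}% of the vPod)")
print(f"{effective_rate_mb_s(8.46, 12):.2f} MB/s effective")
```

With the figures from this post, that works out to 61.53 GB skipped, roughly 96% of the vPod. The 8.46 GB file in about 12 minutes comes to an effective rate near 11.75 MB/s, far above what a single SSH stream actually carried.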
You can see that there is quite a bit of up-front work involved with getting the rsync delta replication to work for our use case, but I assure you that it is usually worth the effort. It seems a little more complicated here than it has become in our production process, and there is room to complete the automation and allow the script to take care of the whole thing. For 2013, this was being developed and many one-off replications were required. I like to completely understand a process before I unleash the Automation -- otherwise, you can have Really Bad Things happen when something unexpected occurs. There are quite a few moving parts here and not a lot of margin for error.
For no-brainer, hands-off, one-time replication, it is tough to beat LFTP. But, if you need to move large files quickly, and you have a fairly current previous version at your target site, rsync is pretty slick.