MySQL VM Backup with vSphere Data Protection

I often get questions regarding backup of a MySQL database server (VM) using vSphere Data Protection (VDP). VDP does not have an agent for MySQL, but you can of course perform an image-level (entire-VM) backup of a VM running MySQL. With VDP and many other backup solutions that use the vSphere APIs for Data Protection (VADP), the backup and recovery of Linux VMs is crash-consistent. In other words, there is no quiescing of the file system and applications running inside the VM when a backup of the VM is performed. When you recover a VM from a crash-consistent backup, it is similar to the state of the server when the power has failed unexpectedly (no graceful shutdown) and then power is restored. MySQL, as with other popular database solutions, has good, built-in protection against data loss and corruption when recovered from a crash-consistent state, but there are no guarantees. The goal is to minimize the chance of corruption and data loss. Here are a few recommendations when using VDP to back up a VM running MySQL:

Probably the best option for backing up MySQL databases with VDP is to perform scheduled (cron job) dumps of the databases into another location. You could then schedule the backup job to run after the database dump completes – the amount of time in between the two jobs can vary considerably depending on the number of databases you have and the size of the databases. Make sure the database dumps are finishing before the backup job starts. The location of the database dumps could be in a separate directory on the same .vmdk file, a separate .vmdk file belonging to the same MySQL VM, or perhaps even a completely separate VM. My preferred configuration is a separate .vmdk belonging to the MySQL VM. This keeps the backup process simple – backing up a single VM – but avoids the risk of these database dumps consuming too much space on the primary .vmdk file(s) that contain the OS and MySQL databases. MySQL has a mysqldump command which dumps the database to a flat file full of SQL statements for recreating schema objects and data. More details on mysqldump can be found here.

I have seen other articles that suggest the use of the pre-freeze-script and post-thaw-script files to quiesce the database at the moment the VM snapshot is taken for the backup job. Typically, running “flush tables with read lock” is suggested for the pre-freeze-script and “unlock tables” for the post-thaw-script. Technically, this does work, but there are some things you should be aware of:

The pre-freeze-script and post-thaw-script mechanisms are only supported by VMware for use with Windows VMs. This KB article has more information. The good news is work is in progress to improve things for Linux VMs.
The “flush tables with read lock” statement does work, but there are risks: If there is a long running SQL statement in progress, the “flush tables…” statement will not finish until the other currently running SQL statement does – that means the VM snapshot process will be severely delayed or time out and you will probably not get the consistent backup you were hoping for. Another item to be aware of with this statement is that the read lock is released as soon as the session that made the call ends. When the pre-freeze-script finishes, the session is closed. There can still be a small amount of time between when the pre-freeze-script finishes and when the VM snapshot process is completed. That means there is a slight chance a change could be made to the database after the pre-freeze-script completes and before the snapshot is in place. Also, since the lock is released when the pre-freeze-script ends, there is no need to put “unlock tables” in the post-thaw-script.

Certainly, there are solutions in the market place that are made specifically for backing up MySQL databases and these should be considered. I simply wanted to discuss how this might look with VDP along with a few items to be aware of.

@jhuntervmware