Mirage and MongoDB, Part 1: Introduction
By Yan Aksenfeld, Member of Technical Staff, VMware
This four-part blog-post series discusses the addition of MongoDB to Mirage.
This blog post discusses MongoDB and how it optimizes Mirage performance.
VMware Mirage requires many components, including servers, management consoles, gateways, file portals, and disk volumes. In VMware Mirage 5.4, VMware introduced a substantial new component to Mirage—the MongoDB database. Why add another component?
Prior to Mirage 5.4, the product required very expensive storage arrays for large implementations. Mirage is an I/O-intensive product, and it is no wonder, given that thousands of endpoints upload and download large amounts of data over unstable WAN connections. Historically, VMware tried to reduce the amount of data written to storage with innovative deduplication and optimization techniques—but the fact remains that in large-scale disaster recovery scenarios, Mirage needs to store large amounts of data.
After a substantial amount of testing, and working with several large customers, we found that around 70 to 80 percent of files backed up and uploaded from endpoints to servers are smaller than 64 KB. These small files also constitute the majority of network traffic and disk operations in a Mirage deployment environment.
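To check how a figure like this applies to a particular set of files, the share of files under 64 KB can be measured directly. The following is a minimal sketch, not part of Mirage: the names `small_file_share` and `file_sizes` are hypothetical, and treating a file of exactly 64 KB as "not small" is an assumption.

```python
import os

SMALL_FILE_LIMIT = 64 * 1024  # 64 KB, the threshold cited in this post


def small_file_share(sizes, limit=SMALL_FILE_LIMIT):
    """Return the fraction of file sizes (in bytes) that fall below limit."""
    sizes = list(sizes)
    if not sizes:
        return 0.0
    return sum(1 for size in sizes if size < limit) / len(sizes)


def file_sizes(root):
    """Yield the size in bytes of every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                yield os.path.getsize(path)
```

For example, `small_file_share(file_sizes(some_directory))` returns a value between 0.0 and 1.0 for a directory of your choosing; a result of 0.75 would match the range reported above.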
To address this, we decided on a new approach: store all of these small files in a high-performance, highly available database. After further consideration and testing, we chose MongoDB as the best fit for this purpose. MongoDB does not replace SQL Server but enhances existing functionality. SQL Server continues to manage Mirage operations.
How Does Mirage Work with MongoDB?
Now that we understand why MongoDB was added to Mirage, let us see how small files are stored in this database. Figure 1 illustrates an implementation of Mirage with the introduction of MongoDB. Note that not all components of Mirage are displayed.
Figure 1: MongoDB Architecture with VMware Mirage
Small files are stored in the MongoDB database on upload or download. For example, when an operation requires a file to be downloaded from the server (such as a restore operation), the following takes place:
- Mirage looks for the file in the MongoDB database. If the file is there, it is downloaded and the process ends.
- If the file is not in the MongoDB database, Mirage looks for it in storage. If the file is there and it is larger than 64 KB, it is downloaded and the process ends.
- If the file is in storage and it is smaller than 64 KB, it is downloaded from storage and also copied to the MongoDB database for the next time it is requested.
This process optimizes download operations over time by ensuring requested small files are located in the MongoDB database for faster retrieval.
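The download steps above can be sketched in a few lines of code. This is an illustrative sketch only, not Mirage's actual implementation: plain Python dictionaries stand in for the MongoDB database and for Mirage storage, the function name `fetch_file` is hypothetical, and because this post does not say how a file of exactly 64 KB is handled, the sketch assumes it is treated as a large file.

```python
SMALL_FILE_LIMIT = 64 * 1024  # 64 KB threshold described in this post


def fetch_file(file_id, small_file_db, storage):
    """Return (data, source) for file_id, following the steps above.

    small_file_db and storage are dict-like stand-ins (assumptions) for
    the MongoDB database and the Mirage storage volumes.
    """
    # Step 1: look in the small-file database first.
    data = small_file_db.get(file_id)
    if data is not None:
        return data, "mongodb"

    # Step 2: fall back to storage.
    data = storage.get(file_id)
    if data is None:
        raise FileNotFoundError(file_id)

    # Step 3: a small file is copied to the database so that the next
    # request is served from there; the current request is still
    # served from storage.
    if len(data) < SMALL_FILE_LIMIT:
        small_file_db[file_id] = data
    return data, "storage"
```

In this sketch, the first request for a small file is served from storage and the second from the database, which is the "optimizes over time" behavior described above. Large files are never copied and are always served from storage.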
In rigorous testing with multiple customers, integrating MongoDB with Mirage decreased the number of read and write operations to Mirage storage by 50 to 80 percent. In some cases, adding MongoDB increased Mirage performance tenfold.
Summary of Part 1
In Part 1 of this blog-post series on Mirage and MongoDB, you learned that:
- Mirage stores a large number of small files as part of its normal operation.
- MongoDB was chosen to enhance how Mirage works with these small files.
- The introduction of MongoDB has significantly improved the performance of Mirage.
Part 2 discusses new components, installations, and upgrades.
Part 3 takes a closer look at the underlying technology behind MongoDB and Mirage.
Part 4 provides information on troubleshooting MongoDB issues.