
Tag Archives: VMware Consulting

Planning a DRP Solution for VMware Mirage Infrastructure

By Eric Monjoin

When I first heard about VMware Mirage in 2012—when Wanova was acquired by VMware and Mirage was integrated into our portfolio—it was seen more as a backup solution for desktops, or as a tool for migrating from Windows XP to Windows 7. And, with an extension, it was possible to easily migrate a physical desktop to a virtual one. So most of the time when we had to design a Mirage solution, the question of DRP or HA was brushed aside with, “Why back up a backup solution?” Mirage was not seen as a strategic tool.

This has changed, and VMware Mirage is now fully integrated as an end-user computing (EUC) tool used to manage user desktops across different use cases. Of course, we still have backup and migration use cases, but we also have more and more customers who use it to ensure desktops conform to IT rules and policies, which makes the reliability of the Mirage infrastructure itself a real concern. In this post we’ll describe how to design a reliable infrastructure, or at least give the key points for different scenarios.

Let’s first have a look at the different components of a Mirage infrastructure:


Figure 1 – Basic VMware Mirage Components

  1. Microsoft SQL Database—The MS SQL database contains all the configurations and settings of the Mirage infrastructure. This component is critical: if the Microsoft SQL database fails, all Mirage transactions stop and both Mirage services—the Mirage Management Server service and the Mirage Server service—stop as well.
  2. SMB Shared Volumes—These could be any combination of NAS or Windows Server file shares; desktop files, apps, base layers, and USMT files are all stored on these volumes (except small files and metadata).
  3. Mirage Management Server—This is used to manage the Mirage infrastructure, but it also acts as a MongoDB server instance on Mirage 5.4 and beyond. If it fails, administration is not possible until a new one is installed, and there is no way to recover desktops since the small files stored in MongoDB are no longer available.
  4. Mirage Server—This is the server that Mirage clients connect to. Often, many Mirage servers are installed and placed behind load-balancers to provide redundancy and scalability.
  5. Web Management—A standard Web server service can be used to manage Mirage using a Web interface instead of the Mirage Management Console. The installation is quite simple and does not require extra configuration, but note that data is not stored on the Web management server.
  6. File Portal—Similar to Web management above, it is a standard Web server service used by end users to retrieve their files using a Web interface, and again, data is not stored on the file portal server.
  7. Mirage Gateway—This is used by end users to connect to Mirage infrastructure from an external network.

Now, let’s take a look at the different components of VMware Mirage and see which components can be easily configured for a reliable and redundant solution:

  • Mirage Management Server—This is straightforward, and actually mandatory: because of MongoDB, we need to install at least one more management server, and MongoDB will synchronize automatically between them. The last point is to use a VIP on a load-balancer to connect to, which routes traffic to any available management server. The maximum number of Mirage management servers is seven due to MongoDB restrictions. Keep in mind that more than two members can reduce performance, as you must wait for acknowledgement from all members for each write operation to the database. The recommended number of management servers is two.
  • Mirage Server—By default we install at least two Mirage servers: one Mirage server per 1,000 centralized virtual desktops (CVDs), or per 1,500 depending on the hardware configuration, plus one for redundancy, and we use load-balancers to route client traffic to any available Mirage server.
  • Web Management and File Portal—Since these are just Web applications installed over Microsoft IIS servers, we can deploy them on two or more Web servers and use load-balancers in order to provide the required redundancy.
  • Mirage Gateway—This is an appliance, and the approach is the same as for the previous components: we just have to deploy additional appliances and configure load-balancers in front of them. Like the Mirage server, there is a limit on the number of connections per Mirage Gateway, so do not exceed one appliance per 3,000 endpoints, and add one for resiliency (see the sizing sketch after this list).
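
To make the scaling rules above concrete, here is a minimal Python sketch that estimates how many of each component a deployment needs. The ratios come from the guidance above (1,000–1,500 CVDs per Mirage server, 3,000 endpoints per gateway, plus one of each for redundancy, and two management servers); the function and parameter names are ours, purely for illustration.

import math

def size_mirage_infrastructure(cvds, endpoints=None, cvds_per_server=1000):
    """Rough sizing sketch based on the ratios discussed above (illustrative only).

    cvds            -- number of centralized virtual desktops (CVDs)
    endpoints       -- endpoints connecting through the Mirage Gateway
                       (defaults to the CVD count)
    cvds_per_server -- 1,000 to 1,500 depending on server hardware
    """
    if endpoints is None:
        endpoints = cvds

    # One Mirage server per 1,000-1,500 CVDs, plus one for redundancy.
    mirage_servers = math.ceil(cvds / cvds_per_server) + 1

    # Two management servers are recommended; MongoDB allows at most seven.
    management_servers = 2

    # One Mirage Gateway appliance per 3,000 endpoints, plus one for resiliency.
    gateways = math.ceil(endpoints / 3000) + 1

    return {
        "mirage_servers": mirage_servers,
        "management_servers": management_servers,
        "mirage_gateways": gateways,
    }

# Example: 5,000 CVDs on mid-range hardware
print(size_mirage_infrastructure(5000))
# -> {'mirage_servers': 6, 'management_servers': 2, 'mirage_gateways': 3}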

Note: Most components can be used with a load-balancer in order to get the best performance and prevent issues like frequent disconnections, so it is recommended to set the load-balancer to support the following (a small capacity check follows the list):

  • Two TCP connections per endpoint, and up to 40,000 TCP connections for each Mirage cluster
  • Change MSS in FastL4 protocol (F5) from 1460 to 900
  • Increase timeout from five minutes to six hours
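
As a rough illustration of why the connection limit matters when sizing clusters, the following Python sketch checks a planned endpoint count against the figures above; everything other than the listed values (two TCP connections per endpoint, 40,000 connections per Mirage cluster, MSS 900, six-hour timeout) is an assumption made for the example.

import math

# Values taken from the list above; anything else is an assumption for the example.
TCP_CONNECTIONS_PER_ENDPOINT = 2
MAX_TCP_CONNECTIONS_PER_CLUSTER = 40_000        # per Mirage cluster
RECOMMENDED_MSS = 900                           # FastL4 profile (F5), reduced from 1460
RECOMMENDED_IDLE_TIMEOUT_SECONDS = 6 * 60 * 60  # six hours instead of five minutes

def clusters_required(endpoints):
    """How many Mirage clusters are needed to stay under the TCP connection cap."""
    connections = endpoints * TCP_CONNECTIONS_PER_ENDPOINT
    return max(1, math.ceil(connections / MAX_TCP_CONNECTIONS_PER_CLUSTER))

for endpoints in (10_000, 20_000, 35_000):
    print(endpoints, "endpoints ->", clusters_required(endpoints), "cluster(s)")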

Basically, all Mirage components can easily be deployed in a redundant way, but they rely on two external components that are both key and that work jointly: the Microsoft SQL database and the SMB shared volumes. This means we have to pay special attention to which scenario is preferred:

  • Simple backup
  • Database continuity
  • Or full disaster recovery

The level of effort is not the same for each and depends on the required RPO/RTO.

So let’s have a look at the different scenarios available:

  1. Backup and Restore—This scenario consists of backing up and restoring both the Microsoft SQL database and the storage volumes in case a major issue occurs on either component. It is relatively simple and inexpensive to implement, and it is suitable when the expected RPO/RTO is not aggressive: you have a few hours to restore the service, and data centralized since the last backup does not need to be restored manually, because re-centralizing the last couple of hours of changes from the endpoints is automatic and quick. Remember, even if you lose your Mirage storage, all data is still available on the end users’ desktops; it will just take time to centralize it again. However, this is not an appropriate scenario for large infrastructures with thousands of CVDs, as it can take months to re-centralize all the desktops. If you use this solution, make sure that the Microsoft SQL database and the SMB volumes are backed up at the same point in time. Basically, this means stopping the Mirage services, backing up the database using SQL Manager, taking a snapshot of the storage volumes, and backing up the MongoDB files. In case of failure, you have to stop Mirage (if it has not already stopped by itself), restore the last database backup, and revert to the latest snapshot on the storage side. Keep in mind you must follow this sequence: first stop all Mirage services, and then the MongoDB services (the sequence is sketched after this list).
  2. Protect the Microsoft SQL Database—Some customers are more focused on keeping the database intact, which implies using Microsoft SQL clustering. However, VMware Mirage does not use ODBC connections, so it is not aware that it must move to a different Microsoft SQL instance if the main one has failed. The solution is to use Microsoft SQL AlwaysOn technology, a combination of Microsoft SQL clustering and Microsoft failover clustering. It provides synchronization of “non-shared” volumes among nodes, as well as a virtual IP and virtual network name that move to the remaining node in case of disaster or during a maintenance period.
  3. Full Disaster Recovery/Multisite Scenario—This last scenario concerns customers who require full disaster recovery between two data centers with stringent RPO/RTO requirements. All components are duplicated at each data center, with load-balancers to route traffic to a Mirage management server, Mirage server, or Web management/file portal IIS server. This implies using the second scenario to provide Microsoft SQL high availability, and also performing synchronous replication between the two storage nodes. Be aware that synchronous replication can significantly affect storage controller performance. While this is the most expensive of the scenarios, since it requires extra licenses, it is the fastest way to recover from a disaster. An intermediate scenario could be to have two Mirage management servers (one per data center), but to shut down Mirage services and replicate the SQL database and storage volumes during the weekend.
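
The stop-and-backup sequence from the first scenario can be summarized as a simple ordered runner. The sketch below is only illustrative: the step functions are placeholders rather than real Mirage service names or storage APIs, and the only thing taken from the scenario is the order of operations (stop the Mirage services first, then MongoDB, then back up the SQL database and snapshot the SMB volumes together).

# Illustrative only: the step functions below are placeholders for whatever tooling
# (service control, SQL Manager, storage snapshots) is used in a real environment.
# Only the ordering of the steps comes from the scenario described above.
def stop_mirage_services():
    print("1. Stop all Mirage services (Management Server and Mirage Server services)")

def stop_mongodb_services():
    print("2. Stop the MongoDB services on the management servers")

def backup_sql_database():
    print("3. Back up the Mirage Microsoft SQL database")

def snapshot_smb_volumes():
    print("4. Snapshot the SMB storage volumes at the same point in time")

def backup_mongodb_files():
    print("5. Back up the MongoDB files holding small files and metadata")

BACKUP_SEQUENCE = [
    stop_mirage_services,   # always before MongoDB, per the sequence above
    stop_mongodb_services,
    backup_sql_database,
    snapshot_smb_volumes,
    backup_mongodb_files,
]

for step in BACKUP_SEQUENCE:
    step()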


Figure 2 – Example of a Multi-Site Mirage Infrastructure

For scenarios two and three, the installation and configuration of Microsoft SQL AlwaysOn in a VMware Mirage infrastructure is explained further in the white paper.


Eric Monjoin joined VMware France in 2009 as a PSO Senior Consultant after spending 15 years at IBM as a Certified IT Specialist. Passionate about new challenges and technology, Eric has been a key leader in the VMware EUC practice in France. Recently, Eric moved to the VMware Professional Services Engineering organization as a Technical Solutions Architect. Eric is certified VCP6-DT, VCAP-DTA and VCAP-DTD, and was awarded vExpert for the fourth consecutive year.

The Complexity of Data Center Blueprinting

By Gabor Karakas

Data centers are wildly complicated in nature and grow in an organic fashion, which fundamentally means that very few people in the organization understand the IT landscape in its entirety. Part of the problem is that these complex ecosystems are built up over long periods of time (5–10 years) with very little documentation or global oversight; therefore, siloed IT teams have the freedom to operate according to different standards – if there are any. Oftentimes new contractors or external providers replace these IT teams, and knowledge transfer rarely happens, so the new workforce might not understand every aspect of the technology they are tasked to operate, and this creates key issues as well.


Migration or consolidation activities can be initiated due to a number of reasons:

  • Reduction of the complexity in infrastructure by consolidating multiple data centers into a single larger one.
  • The organization simply outgrew the IT infrastructure, and moving the current hardware into a larger data center makes more sense from a business or technological perspective.
  • Contract renegotiations fail and significant cost reductions can result from moving to another provider.
  • The business requires higher resiliency; by moving some of the hardware to a new data center and creating fail-proof links in between the workloads, disasters can be avoided and service uptime can be significantly improved.

When the decision is made to move or consolidate the data center for business or technical reasons, a project is kicked off with very little understanding of the moving parts involved in the change. Most organizations realize this a couple of months into the project, and usually find the best way forward is to ask for external help. This help usually comes from the joint efforts of multiple software and consultancy firms to deliver a migration plan that identifies and prioritizes workloads, and creates a blueprint of all their vital internal and external dependencies.

A migration plan is meant to contain at least the following details of identified and prioritized groups of physical or virtual workloads:

  • The applications they contain or serve
  • Core dependencies (such as NTP, DNS, LDAP, Anti-virus, etc.)
  • Capacity and usage trends
  • Contact details for responsible staff members

  • Any special requirements that can be obtained either via discovery or by interviewing the right people (a minimal data sketch of such an entry follows this list)
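
As a minimal illustration, the per-workload details listed above can be captured in a simple record such as the following Python sketch; the field names are invented for the example and do not come from any particular discovery tool.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class WorkloadBlueprintEntry:
    """One physical or virtual workload in the migration plan (illustrative fields)."""
    hostname: str
    applications: List[str] = field(default_factory=list)           # apps it contains or serves
    core_dependencies: List[str] = field(default_factory=list)      # NTP, DNS, LDAP, anti-virus, ...
    capacity_trend: Dict[str, float] = field(default_factory=dict)  # usage and growth figures
    owners: List[str] = field(default_factory=list)                 # responsible staff contacts
    special_requirements: List[str] = field(default_factory=list)   # from discovery or interviews

entry = WorkloadBlueprintEntry(
    hostname="app-db-01",
    applications=["Payroll DB"],
    core_dependencies=["DNS", "NTP", "LDAP"],
    capacity_trend={"cpu_avg_pct": 42.0, "storage_growth_gb_per_month": 15.0},
    owners=["dba-team@example.com"],
    special_requirements=["Must move in the same wave as app-web-01"],
)
print(entry.hostname, entry.core_dependencies)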


In reality, creating such a plan is very challenging, and there can be many pitfalls. The following are common problems that can surface during development of a migration plan:

Technical Problems

It is vital that communication is strong between all involved, technical details are not overlooked, and all information sources are identified correctly. Issues can develop such as:

  • Choosing the right tool (VMware Application Dependency Planner as an example)
  • Finding the right team to implement and monitor the solution
  • Reporting on the right information, which can prove difficult

Technical and human information sources are equally important, as automated discovery methods can only identify certain patterns; people need to put the extra intelligence behind this information. It is also important to note that a discovery process can take months, during which time the discovery infrastructure needs to function at its best, without interruption to data flows or appliances.

Miscommunication

As previously stated, team communication is vital. There is a constant need to:

  • Verify discovery data and tweak technical parameters
  • Involve the application team in frequent validation exercises

It is important to accurately identify and document deliverables before starting a project, as misalignment with these goals can cause delays or failures further down the timeline.

Politics

With major changes in the IT landscape, there are also Human Resource-related matters to handle. Depending on the nature of the project, there are potential issues:

  • The organization’s move to another data center might abandon previous suppliers
  • IT staff might be left without a job

  • The project might be part of an outsourcing effort that moves certain operations or IT support outside the organization


Some of these people will need to help in the execution of the project, so it is crucial to treat them with respect and to make sure sensitive information is closely guarded. The blueprinting team members will probably know what the outcome of the project will bring for suppliers and the customer’s IT team. If some of this information is released prematurely, the project can be compromised, and valuable information and time can be lost.

Blueprint Example

When delivering a migration blueprint, each customer will have different demands, but in most cases, the basic request will be the same: to provide a set of documents that contain all servers and applications, and show how they are dependent on each other. Most of the time, customers will also ask for visual maps of these connections, and it is the consultant’s job to make sure these demands are reasonable. There is only so much that can be visualized in a map that is understandable, so it is best to limit the number of servers and connections to about 10–20 per map. The following complex image is an example of just a single server with multiple running services discovered.

 


Figure 1. A server and its services visualized in VMware’s ADP discovery tool

Beyond putting individual applications and servers on an automated map, there can also be demand for visualizing application-to-application connectivity, and this will likely involve manipulating data correctly. Some dependencies can be visualized, but others might require a text-based presentation.

The following is an example of a fictional setup, where multiple applications talk to each other―just like in the real world. Both visual and text-based representations are possible, and it is easy to see that for overview and presentation purposes, a visual map is more suitable. However, when planning the actual migration, the text-based method might prove more useful.
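
To illustrate the text-based representation, the following Python sketch rolls raw server-to-server connections up into application-to-application dependencies, which is essentially what the pivot-table view in Figure 4 provides; the sample connections and the server-to-application mapping are invented for the example.

from collections import defaultdict

# Invented sample data: raw discovery output as (source server, destination server, port).
raw_connections = [
    ("web-01", "db-01", 1433),
    ("web-02", "db-01", 1433),
    ("web-01", "ldap-01", 389),
    ("batch-01", "db-01", 1433),
]

# Invented mapping from servers to the applications they belong to.
server_to_app = {
    "web-01": "Web Shop",
    "web-02": "Web Shop",
    "db-01": "Web Shop DB",
    "ldap-01": "Directory",
    "batch-01": "Reporting",
}

# Roll server-level connections up to application-to-application dependencies.
app_dependencies = defaultdict(set)
for src, dst, port in raw_connections:
    src_app, dst_app = server_to_app[src], server_to_app[dst]
    if src_app != dst_app:
        app_dependencies[src_app].add((dst_app, port))

for app, deps in sorted(app_dependencies.items()):
    for dst_app, port in sorted(deps):
        print(f"{app} -> {dst_app} (tcp/{port})")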


Figure 2. Application dependency map: visual representation


Figure 3. Application dependency map: raw discovery data


Figure 4. Application dependency map: raw data in pivot table

It is easy to see that a blueprinting project can be a very challenging exercise with multiple caveats and pitfalls, so careful planning and execution are required, along with strong communication between everyone involved.

This is the first in a series of articles that will give a detailed overview of implementation and reporting methods for data center blueprinting.


Gabor Karakas is a Technical Solutions Architect in the Professional Services Engineering team and is based in the San Francisco Bay Area.

Working with VMware Just Gets Better

By Ford Donald, Principal Architect, GTS PSE, VMware

Imagine someone gives you and a group of friends a box of nuts and bolts and a few pieces of metal and tells you to build a model skyscraper. You might start putting the pieces together and end up with a beautiful model, but it probably won’t be the exact result that any of you imagined at the beginning. Now imagine if someone hands you that same box, along with a blueprint and an illustration of the finished product. In this scenario, you all work together to a prescribed end goal, with few questions or disagreements along the way. Think about this in the context of a large technical engagement, for example a software-defined data center (SDDC) implementation. Is it preferable to make it up as you go along, or to start with a vision for success and achieve it through a systematic approach?

Here at VMware, we’re enhancing the way we engage with customers by providing prescriptive guidance, a foundation for success, and a predictable outcome through the SDDC Assess, Design and Deploy Service. As our product line has matured, our consulting approach is maturing along with it. In the past, we have excelled at the “discovery” approach, where we uncover the solution through discussion, and every customized outcome meets a unique customer need. We’ve built thousands of strong skyscrapers that way, and the skill for discovering the right solution remains critical within every customer engagement. Today we bring a common starting point that can be scaled to any size of organization and adapted up the stack or with snap-ins according to customer preference or need. A core implementation brings a number of benefits to the process, and to the end result.

A modular technical solution

Think of the starting point as a blueprint for the well-done data center. With our approach, the core elements of SDDC come standard, including vSphere, vCenter Operations, vCenter Orchestrator, and software-defined networking through vCNS. This is the clockwork by which the SDDC from VMware is best established, and it lays the foundation for further maturity evolutions to Infrastructure Service and Application Service. The core “SDDC Ready” layer is the default, providing everything you need to be successful in the data center, regardless of whether you adopt the other layers. Beyond that, to meet the unique needs of customers, we developed “snap-ins” as enhancements or upgrades to the core model, which include many of our desirable, but not necessarily included-by-default, assets such as VSAN and NSX.

The Infrastructure Service layer builds on the SDDC by establishing cloud-based metaphors via vCloud Automation Center and other requirements for cloud readiness, including a service portal, catalog-based consumption, and reduction of administrative overhead. The Application Service layer includes vCloud Application Director and elevates the Infrastructure layer with application deployment, blueprinting and standardization.

From our experience, customers demand flexibility and customization. In order to meet that need, we built a full menu of Snap-ins. These snap-ins allow customers to choose any number of options from software-defined storage, NSX, compliance, business continuity & disaster recovery (BCDR), hybrid cloud capabilities and financial/cost management. Snap-ins are elemental to the solution, and can be added as needed according to the customer’s desired end result.

Operational Transformation Support

Once you’ve adopted a cloud computing model, you may want to consider organizational enhancements that take advantage of the efficiency gained by an SDDC architecture. As we work with our customers in designing the technical elements, we also consult with our customers on the operational processes. Changing from high administrative overhead to low overhead, introducing new roles, defining what type of consumer model you want to implement – our consultants help you plan and design your optimal organization to support the cloud model.

The beauty of this approach shines in its ability to serve both green field and brown field projects. In the green field approach, where a customer wants the consultants to take the reins and implement top to bottom, the approach serves as a blueprint. In a brown field model, where the customer has input and opinions and desires integration and customization, the approach can be adapted to the customer’s environment, relative to the original blueprint.

So whether you’re building your skyscraper from the ground up, or remodeling an existing tower, the new SDDC Assess, Design and Deploy Service provides an adaptable model, with a great starting point that will help you get the best out of your investment.

Stay tuned for an upcoming post that gives you a look under the hood of the work stream process for implementing the technical solution.


Ford Donald is a Principal Architect and member of Professional Services Engineering (PSE), part of the Global Technical Solutions (GTS) team, and a seven-year veteran of VMware. Prior to PSE, Ford spent three years as a pre-sales cloud computing specialist focusing on very large and complex virtualization deployments, including the VMware sales cloud known as vSEL. Ford also served on the core team for VMworld Labs and as a field SE.

 

BCDR Strategy: Three Critical Questions

By Jeremy Carter, VMware Senior Consultant

Organizations in every industry are increasingly dependent on technology, making increased resiliency and decreased downtime a critical priority. In fact, Forrester cites resiliency as the number three overall infrastructure priority this year.

A business continuity solution that utilizes the virtual infrastructure, like the one VMware offers, can greatly simplify the process, though IT still needs to understand how all the pieces of their business continuity and disaster recovery (BCDR) strategy fit together.

I often run up against the expectation of a one-size-fits-all BCDR solution. Instead it’s helpful to understand the three key facets of IT resilience—data protection, local application availability, and site application availability—and how different tools protect each one, for both planned and unplanned downtime (see the diagram below). If you’d like to learn more on that front, there is a free two-part webcast coming up that I recommend you sign up for here.

As important as it is to find the right tool, you only know a tool is “right” if it meets a set of clearly defined business objectives. That’s why I recommend that organizations start their BCDR planning with a few high-level questions to help them assess their business needs.

1. What is truly critical?

Almost everyone’s initial response is that they want to protect everything, but when you look at the trade-off in complexity, you’ll quickly recognize the need to prioritize.

An important (and sometimes overlooked) step in this decision-making process is to check in with the business users who will be affected. They might surprise you. For instance, I was working with a government organization where IT assumed everything was super critical. When we talked to the business users, it turned out they had all of their information on paper forms that would then be entered into the computer. If the computer went down, they would lose almost no data.

On the other hand, the organization’s 911 center’s data was extremely critical and any downtime or loss of data could have catastrophic consequences. Understanding what could be deprioritized allowed us to spend the time (and money) properly protecting the 911 center.

As we move further into cloud computing, another option is emerging: Let the application owners decide at deployment. With tools like vCloud Automation Center (vCAC), we can define resources with differing service levels. An oil company I recently worked with integrated SRM with vCAC so that any applications deployed into Gold or Silver tiers would be protected by SRM.
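
A minimal sketch of that idea in Python, with invented tier names and policy flags (the real integration was between SRM and vCAC; this is only an illustration of letting the deployment tier drive protection):

# Illustrative only: decide at deployment time which protection a workload gets
# based on its service tier, as in the Gold/Silver example above.
TIER_POLICY = {
    "gold":   {"srm_protected": True,  "ha": True},
    "silver": {"srm_protected": True,  "ha": True},
    "bronze": {"srm_protected": False, "ha": True},
}

def protection_for(tier):
    """Return the protection settings for a deployment tier (defaults to bronze)."""
    return TIER_POLICY.get(tier.lower(), TIER_POLICY["bronze"])

print(protection_for("Gold"))    # {'srm_protected': True, 'ha': True}
print(protection_for("Bronze"))  # {'srm_protected': False, 'ha': True}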

Figure – VMware data protection options for planned and unplanned downtime

2. Which failures are you preventing?

Each level of the data center has its preferred method of protection, although all areas also need to work together. If you’re concerned about preventing failures within the data center, maybe you rely on HA and App HA; however, if you want to protect the entire datacenter, you’ll need SRM and vSphere Replication (again, see chart).

3. RTO, RPO, MTD?

Another helpful step in choosing the best BCDR strategy is to define a recovery time objective (RTO), recovery point objective (RPO) and maximum tolerable downtime (MTD) for both critical and non-critical systems.

These objectives are often dictated by a contract or legal regulations that require a certain percentage of uptime. When established internally, they should take many factors into account, including if data exists elsewhere and the repercussions of downtime, especially financial ones.
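
As a simple illustration of how these objectives can be applied, the sketch below compares what a protection option can achieve against the objectives defined for a system; the numbers and field names are invented for the example.

from dataclasses import dataclass

@dataclass
class Objectives:
    rpo_minutes: float  # maximum acceptable data loss
    rto_minutes: float  # maximum acceptable time to restore the service
    mtd_minutes: float  # maximum tolerable downtime for the business process

def option_is_sufficient(option_rpo, option_rto, required):
    """True if a protection option's achievable RPO/RTO satisfies the objectives."""
    return (option_rpo <= required.rpo_minutes
            and option_rto <= required.rto_minutes
            and option_rto <= required.mtd_minutes)

# Invented example: a critical system against a replication-based option.
critical = Objectives(rpo_minutes=15, rto_minutes=60, mtd_minutes=240)
print(option_is_sufficient(option_rpo=5, option_rto=45, required=critical))    # True
print(option_is_sufficient(option_rpo=240, option_rto=45, required=critical))  # False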

The final step in the implementation of any successful IT strategy is not a question, but rather an ongoing diligence. Remember that your BCDR strategy is a living entity—you can’t just set it and forget it. Every time you make a change to the infrastructure or add a new application, you’ll need to work it into the BCDR plans. But I hope that each update will be a little easier now that you know the right questions to ask.


Want to learn more about building out a holistic business continuity and disaster recovery strategy?
Join these two great (free) webcasts that are right around the corner.

Implementing a Holistic BC/DR Strategy with VMware – Part One
Tuesday, February 18 – 10 a.m. PST

Technical Deep Dive – Implementing a Holistic BC/DR Strategy with VMware – Part Two
Tuesday, February 25 – 10 a.m. PST


Jeremy Carter is a VMware Senior Consultant with special expertise in BCDR and cloud automation. Although he joined VMware just three months ago, he has worked in the IT industry for more than 14 years. 

Inside VMware Professional Services – Connect, Empower, Focus

From Bret Connor

Happy New Year! As vice president of VMware Professional Services for the Americas delivery team, I am fortunate to work with and lead a team of gifted, creative, and downright smart professionals. How do we help our customers? Simple: we deliver faster results while cutting costs and accelerating business breakthroughs that have a material impact on your top line. I could go on, but instead we produced a visual guide that tells the story of the three cornerstone principles that differentiate our work from our competitors’. Enjoy!