Tag Archives: infrastructure

Perform Proactive Load Testing to Build a Successful Environment

By Hans Bader

So, your company has bought a new set of hardware, referenced the latest white papers and reference architectures, and now will get the virtual machine (VM) densities promised, right? Well, maybe not. White papers and reference architectures are great starting points for designing and building your environment, but unless you are running the same workloads, your mileage may vary. The key to knowing what your infrastructure will support is to proactively perform load testing – before going into production.

Successful load testing is a considerable amount of work; it involves creating synthetic workloads, and understanding the metrics and the impact on the end-user experience. Holistic load testing will bring in different teams: storage, networking, compute, application development, software distribution and virtual infrastructure. Each of these teams has a stake in ensuring a good end-user experience.

Manage, Understand, and Set Expectations

Understand that the performance of a virtual desktop is all about the performance the end user (your customer) is seeing and perceiving. Gathering metrics from VMware vCenter™, PCoIP logs and storage IOPS is important, but ultimately it is the end user’s perception and experience that matters most. It is easy for an administrator to say, “The VM has 2 GB out of 4 GB of memory free,” but if the user is experiencing poor performance due to network contention, the end user is still unhappy.

You must set the proper expectations and understand what you can test. Generating load inside the guest is relatively easy with tools such as Iometer. Iometer does a great job of generating I/O load, but does not provide any user experience metrics. With remote desktops the challenge becomes testing PCoIP and client-desktop communication.

Have Your Plan in Place

Have your testing methodology, objectives and metrics documented in advance. It is important to develop your test design before starting the actual load testing process. Think it through completely; map the information flow for the entire load test process, including entry points and process dependencies. If you are going to create a View pool of 1,000 desktops, will the LAN segment where you will be creating the desktops have enough IP addresses available? Anti-virus updates are a known pain point, so include them in your testing scenarios. Also include software updates if applicable.
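The IP address question above is easy to check before the first desktop is provisioned. The following is a minimal sketch using Python's standard `ipaddress` module; the subnet values, the `already_used` count, and the function names are hypothetical and only illustrate the arithmetic.

```python
import ipaddress


def usable_addresses(cidr: str) -> int:
    """Count host addresses in an IPv4 subnet, excluding network and broadcast."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - 2, 0)


def segment_fits_pool(cidr: str, pool_size: int, already_used: int = 0) -> bool:
    """Return True if the segment has room for pool_size new desktops."""
    return usable_addresses(cidr) - already_used >= pool_size


# Hypothetical segments for a 1,000-desktop pool.
print(segment_fits_pool("10.20.0.0/22", 1000))                   # 1,022 usable addresses
print(segment_fits_pool("10.20.0.0/22", 1000, already_used=30))  # only 992 left
print(segment_fits_pool("10.20.0.0/24", 1000))                   # 254 usable addresses
```

In practice the free-address count should come from the DHCP server's scope statistics, since leases, reservations and exclusions all reduce what is actually available.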

Understand what is going to be tested and how the testing will impact end users. The end-user experience with virtual machines is more than just performance graphs of the VM in your vCenter inventory. Are you testing a local install of Microsoft Word, or a larger client-server based application? Many of the applications running in a virtual desktop are dependent on systems (databases, web services, etc.) that exist outside the desktop. Do you have an information flow diagram that shows all the systems an application may interact with? Do you know where the choke points are? Adequate desktop resources are not sufficient if you are load testing 1,000 desktops running a CRM application but the CRM environment can only scale to 750 users.

Your End Users Can Help You

During testing do not rely solely upon metrics: your testing must include “eyes on the glass.” Have actual users run through the test scenarios to understand how—as the load increases—the user experience may be impacted. An end user can establish what a good baseline is, what acceptable performance is, and when the end-user experience starts to degrade. These subjective user perceptions can be roughly mapped to network metrics, storage latency or memory usage.

Documented Test Plans

Leverage existing test plans where possible. Applications that have been developed in-house often already have test plans. These are company-specific and require domain subject matter experts to create and execute. Involving these people can decrease the time and effort required to create and document your own test plans.

Test What is Real

This very important concept is often overlooked. Don’t simply consider CPU and memory consumption of a virtual machine. Running CPU Busy and generating 100 percent CPU usage inside a VM is not realistic. To generate accurate user experience loads you must use appropriate tools, such as:

Proper load testing of your new environment means testing both your architectural and physical designs. It is important to understand how the user load may impact your initial physical design. The number of hosts per cluster, desktops deployed per datastore, and network connectivity all come into play. You may find you have been overly conservative in your resource assumptions, in which case you can change your cluster sizing and obtain greater desktop densities.

During your load testing, use this time to understand the impact on typical administrative tasks while running the hosts. For example, how long does it take to spin up a new pool of 500 desktops when you are running a load test with 1,000 desktops? Or how long does it take to put a host in maintenance mode when it has 80 desktops running? The outcomes of these ancillary tests may change the way you administer your environment.

Expose the Weak Links

What if, during your load testing, you break something? Perhaps you’ll run out of DHCP addresses, the KMS server and your hosts will start swapping, LUNs will run out of space, and VMs will crash. These events should not be considered failures, but rather successful tests. They show you where to focus attention prior to the next load test so real users do not experience these problems during live operations. Yes, load testing can be a lot of work, and it takes considerable effort to do effectively, but the end results are worth it: end users and administrators are happy.

Plan for Remediation

Exposing a weak link during load testing is not a failure, but a positive result. You should ensure your testing plan has time built in to address any weaknesses that are uncovered and to test again. The amount of time that has to be added depends on what broke and at what load. If load testing early on with fewer users exposes a lack of DHCP addresses, expanding the DHCP scope is a relatively easy fix. On the other hand, if testing at full predicted load uncovers a storage performance bottleneck, the time to procure, install and configure additional storage could be much longer.

Testing Scenarios

Your first fully automated test should be a single system test—a single test to ensure your test plan runs through to completion. With no resource contention and no over-commitment on the hosts, this is your baseline. This should also be correlated with an actual user single system test, ensuring the user experience is what is expected.

For the second test, ramp up to 50 percent of what the calculated capacity is. This gives enough wiggle room so you can determine if your design assumptions are accurate. Do you have enough IP addresses? Is storage able to keep up? How are the memory stats?

Run a third test at 100 percent calculated capacity. This is where getting real users into the system is critical. How long does it take to login? Are the test scenarios within the acceptable parameters? Is the user experience acceptable? Have you met all your design criteria and business requirements?

Finally, a fourth test at more than 100 percent expected capacity should be run. Add more desktops, start a full anti-virus scan, perform a software update. No matter how well we design, we always have to plan for the worst-case scenarios. The unexpected removal of a host from a cluster dramatically impacts capacity. Put a host in maintenance mode or reboot it without putting it in maintenance mode. How does your environment perform under these extreme conditions?
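The four test runs above can be laid out as a simple schedule. The following is a rough sketch, not a prescribed method: the 120 percent overload figure, the 13-host cluster, and the function names are assumptions chosen for illustration, since the text only says "more than 100 percent."

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def staged_test_plan(calculated_capacity: int, overload_percent: int = 120):
    """Return (stage name, desktop count) for the four test runs.

    overload_percent is an assumed value; pick one that matches your
    own worst-case planning.
    """
    if calculated_capacity < 1:
        raise ValueError("calculated_capacity must be at least 1")
    if overload_percent <= 100:
        raise ValueError("overload_percent must exceed 100")
    return [
        ("baseline, single system", 1),
        ("50 percent of calculated capacity", ceil_div(calculated_capacity * 50, 100)),
        ("100 percent of calculated capacity", calculated_capacity),
        ("more than 100 percent", ceil_div(calculated_capacity * overload_percent, 100)),
    ]


def desktops_per_host_after_loss(desktops: int, hosts: int, hosts_lost: int = 1) -> int:
    """Desktops each remaining host must carry after hosts leave the cluster."""
    remaining = hosts - hosts_lost
    if remaining < 1:
        raise ValueError("no hosts left in the cluster")
    return ceil_div(desktops, remaining)


for name, count in staged_test_plan(1000):
    print(f"{name}: {count} desktops")

# Hypothetical cluster: 1,000 desktops on 13 hosts, one host removed.
print(desktops_per_host_after_loss(1000, hosts=13))
```

Comparing the per-host figure after a host loss with the density you validated at 100 percent shows quickly whether the cluster has enough headroom for maintenance or an unplanned outage.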

“We must contemplate some extremely unpleasant possibilities, just because we want to avoid them.”

– Albert Wohlstetter, American nuclear strategist, 1960


Hans Bader is a Consulting Architect with VMware EUC. Hans has over 20 years of IT experience and joined VMware in 2009. With a focus on helping organizations become operationally ready, he works with customers to avoid common mistakes. He is a strong advocate for proactive load testing of environments before allowing users access. Hans has won numerous consulting awards within VMware.

End User Computing 101: Network and Security

By TJ Vatsa, Principal Architect, VMware Professional Services

In my first post on the topic of End User Computing (EUC), I provided a few digestible tidbits around infrastructure, desktop and server power, and storage. In this post, we’ll go a bit further into the infrastructure components that affect user experience and how users interact with the VDI infrastructure. We’ll cover network and security, devices, converged appliances, and desktop as a service.

Let’s look a bit more closely at network and security first.

Network and Security

To ensure acceptable VDI user experience, monitor the bandwidth and latency or jitter of the network. This means performing the appropriate network assessment by deploying monitoring tools to first establish a baseline. Once that’s completed, you’ll need to monitor the network resources against those baselines. As with any network, high latency can negatively affect performance, though some components are more sensitive to high latency than others.

When deploying Horizon View desktops using the PC-over-IP (PCoIP) remote display protocol in a WAN environment, consider the Quality of Service (QoS) aspect. Ensure that the round-trip network latency is less than 250 ms. Also keep in mind that PCoIP is a real-time protocol, so it behaves like VoIP, IPTV, and other UDP-based streaming protocols.
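Checking measurements against a baseline and against the 250 ms limit can be sketched in a few lines. This is an illustrative example only: the sample values, the 25 percent tolerance, and the jitter definition (mean absolute difference between consecutive samples) are assumptions, and real measurements would come from your network monitoring tools.

```python
from statistics import mean

PCOIP_MAX_RTT_MS = 250  # round-trip limit stated in the text


def summarize_rtt(samples_ms):
    """Summarize round-trip samples; jitter here is the mean absolute
    difference between consecutive samples (an assumed, simple definition)."""
    if len(samples_ms) > 1:
        jitter = mean(abs(b - a) for a, b in zip(samples_ms, samples_ms[1:]))
    else:
        jitter = 0.0
    return {"mean_ms": mean(samples_ms), "max_ms": max(samples_ms), "jitter_ms": jitter}


def check_against_baseline(samples_ms, baseline_mean_ms, tolerance=0.25):
    """Flag samples that exceed the PCoIP limit or drift from the baseline.

    tolerance is a hypothetical allowance (25 percent) above the baseline mean.
    """
    stats = summarize_rtt(samples_ms)
    stats["within_pcoip_limit"] = stats["max_ms"] < PCOIP_MAX_RTT_MS
    stats["within_baseline"] = stats["mean_ms"] <= baseline_mean_ms * (1 + tolerance)
    return stats


# Hypothetical round-trip samples in milliseconds.
print(check_against_baseline([40, 50, 45, 60], baseline_mean_ms=45))
print(check_against_baseline([40, 300], baseline_mean_ms=45))
```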

To make sure that PCoIP is properly delivered, it needs to be tagged in QoS so that it can compete fairly across the network with other real-time protocols. To achieve this objective, PCoIP must be prioritized above other non-critical and latency tolerant protocols (for example, file transfers and print jobs). Failure to tag PCoIP properly in a congested network environment leads to PCoIP packet loss and a poor user experience, as PCoIP adapts down in response. For instance, tag and classify PCoIP as interactive real-time traffic. (Classify PCoIP just below VoIP, but above all other TCP-based traffic.)

To optimize network bandwidth, ensure that you have a full-duplex network link end to end. Consider segmenting PCoIP traffic via IP QoS Differentiated Services Code Point (DSCP) marking, a layer 2 Class of Service (CoS), or a virtual LAN (VLAN). When using a VPN, ensure that UDP traffic is supported.
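To make the classification guidance concrete, here is a small sketch of a traffic-class table. The DSCP code points are the standard values; mapping PCoIP to AF41 and the other traffic names are assumptions for illustration only, so follow your network team's and vendor's guidance for the actual markings.

```python
# Standard DSCP code points (decimal).
DSCP_VALUES = {"EF": 46, "AF41": 34, "AF21": 18, "BE": 0}

# Highest priority first. The traffic-to-class mapping is a hypothetical
# policy, not a vendor-mandated one.
POLICY = [
    ("voip", "EF"),
    ("pcoip", "AF41"),
    ("business_tcp", "AF21"),
    ("file_transfer_and_print", "BE"),
]


def tos_byte(dscp_class: str) -> int:
    """DSCP occupies the upper six bits of the IP ToS / Traffic Class byte."""
    return DSCP_VALUES[dscp_class] << 2


def pcoip_placement_ok(policy) -> bool:
    """Check that PCoIP sits directly below VoIP, or first if VoIP is absent."""
    names = [name for name, _ in policy]
    if "pcoip" not in names:
        return False
    position = names.index("pcoip")
    if "voip" in names:
        return position == names.index("voip") + 1
    return position == 0


for name, dscp_class in POLICY:
    print(f"{name}: {dscp_class} (DSCP {DSCP_VALUES[dscp_class]}, ToS byte {tos_byte(dscp_class)})")
print(pcoip_placement_ok(POLICY))
```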

Enterprise security for corporate virtual desktops is of paramount importance for the successful rollout of VDI infrastructure. It is highly recommended that an enterprise scale, policy-based management security solution be used to define and enforce security policies within the enterprise.

Based on typical customer requirements, secure access to the VDI infrastructure is provisioned via the following user access modes:

  1. LAN Users: VDI users accessing virtual desktop infrastructure via the corporate LAN network.
  2. VPN Users: VDI users accessing corporate virtual desktop infrastructure via the VPN tunnel.
  3. Public Network Users: VDI users accessing virtual desktop infrastructure via the public network.

Use Case: VDI User Secure Access Modes

Enforcing authentication and authorization policies is a domain by itself, and is influenced by industry verticals. For instance, many hospitals prefer “tap-‘n’-go” solutions to authenticate and authorize their clinical staff to access devices and Electronic Medical Record (EMR) applications. Regulatory compliance should not be ignored either, and it also varies by industry vertical: HIPAA for the healthcare industry and PCI for the financial industry, for example.

Note: The scenario depicted below is that of a typical public network user.

Infrastructure scenario

Horizon View infrastructure can be easily optimized to support any combination of secure VDI user access modes.

Devices

Based on the security policies and regulatory compliance standards that are prevalent within the enterprise, I highly recommend doing a thorough assessment of end-user devices and endpoints. You’ll want to categorize your users based on desktop communities that support one or more types of endpoints. VMware’s Horizon View client supports a variety of endpoints, including desktops, laptops, thin clients, zero clients, mobile devices, and tablets, across iOS, Android, Mac OS X, Linux, Windows, and HTML Access, just to name a few.

Converged Appliances

The converged appliances industry is maturing rapidly as more and more customers prefer converged appliances because they enable faster infrastructure deployment times. From an EUC infrastructure perspective, it’s important to evaluate the converged appliance solutions available for your business scenarios.

Vendors are and will be providing customized and optimized solutions for EUC, business continuity and disaster recovery (BCDR) as x-in-a-box, wherein the required infrastructure components, hardware and software have been validated and optimized to cater to specific business scenarios.

Desktop as a Service (DaaS)

Some customers worry about EUC datacenter planning, infrastructure procurement, and deployment.

DaaS scenario

Look to hosted desktop services, such as Horizon DaaS, to address business requirements and use cases that revolve around development, testing, seasonal bursts, and even BCDR. DaaS can even provide a more economical alternative to traditional datacenter deployment. For instance, DaaS reduces your up-front costs and lowers your desktop TCO with predictable cloud economics that enable you to move from CapEx to OpEx in a predictable way.

Plus, users can access Windows desktops and applications from the cloud on any device, including tablets, smartphones, laptops, PCs, thin clients, and zero clients. DaaS solutions like Horizon DaaS desktops can also be tailored to meet the simplest or most demanding workloads, from call center software to CAD and 3D graphics packages.

In these first two posts, we’ve gotten a good handle on infrastructure, devices, and security. In my next post, I’ll cover mobility and BYOD along with applications and image management, and weave it all together with EUC project methodology.


TJ has worked at VMware for the past four years, with over 20 years of experience in the IT industry. At VMware TJ has focused on enterprise architecture and applied his extensive experience to Cloud Computing, Virtual Desktop Infrastructure, SOA planning and implementation, functional/solution architecture, enterprise data services and technical project management.

TJ holds a Bachelor of Engineering degree in Electronics and Communications from Delhi University and has attained multiple industry and professional certifications in enterprise architecture and technology platforms. TJ is a speaker and a panelist at industry conferences such as VMworld, VMware’s PEX (Partner Exchange) and BEAworld. His passion is the real-life application of technology to drive successful user experiences and business outcomes.

Build It Right The First Time – 3 Steps for Agile Private IaaS

By Jung Hwang, Enterprise Solutions Architect, VMware

Imagine you are a general contractor building a house for a family. Without meeting them, you decide it should have three bedrooms, two bathrooms, and a two-car garage. When the family moves in, you have to begin renovating immediately: they have four teenage daughters and a baby on the way.

We all know that adding another bedroom, another bathroom, and doubling the garage is more costly and time consuming than it would have been to build the house to fit the family in the first place. So why do IT organizations so often make the same mistake when building out the Infrastructure as a Service (IaaS) environment?

I recently saw this with a large financial institution that deployed their infrastructure to support an initial set of low complexity infrastructure use cases. Everything went fine until they started adding other IaaS use cases, including Database as a Service (DBaaS) and Big Data as a Service (BDaaS). To accommodate larger, faster, low latency, and high IO workloads, they were forced to scale up their existing blade servers but discovered their NFS storage environment wasn’t sufficient to support the new workloads. They ended up redesigning their underlying hardware platform, including compute, storage, network, and security.

It seems obvious that a lot of “renovation” could be saved by understanding which services will “live in the house” first. So why do IT organizations often fail to employ that foresight?