This article takes eight common misperceptions about virtualizing Hadoop and explains why each one is mistaken. The short explanations should clear up the confusion around these important topics.
Myth #1: Virtualization may add significant performance overhead to a Hadoop cluster.
This is a common question from users who are in the early stages of considering virtualizing their Hadoop clusters. Engineers at VMware (and some of its customers) have done several iterations of performance testing of Hadoop on vSphere over multiple years, with various hardware configurations. These tests have consistently shown that virtualized Hadoop performance is comparable to, and in some cases better than, that of a native equivalent.
I know, it’s been a while since I blogged. It’s been an insanely busy time here at VMware, especially for vSphere security. VMworld US and Europe vSphere security sessions were very popular! And since then, I’ve been traveling a whole bunch, meeting customers and talking about security operations. A recurring ask has been “How can I learn to run my vSphere and NSX environments more securely?”
Well, that is about to be answered! With input from Chris McCain and me, and the tireless work of the VMware Education team putting the content together, I’m proud to say there is now a course for SDDC Security Operations!
Entitled “Security Operations for the Software Defined Data Center”, the course is for vSphere admins who are getting pressured to run their infrastructure in a more secure fashion. And based on the crowds in my VMworld sessions, this should be SUPER popular!!!
Here’s a quick overview of the course and its objectives:
In the VMware Security Operations for the Software-Defined Data Center course, we teach you how to use the VMware Software-Defined Data Center (SDDC) product portfolio and tools to better manage administrator access, harden your VMware vSphere® environment, and secure data at rest and in motion. We also cover compliance and automation to help you ensure your deployments align with your security policies.
Describe the concepts involved in securing a software-defined data center and protecting the data in the data center
Manage vSphere administrator access to hosts and the VMware vCenter Server™ system based on identified job roles and requirements
Implement best-practice security of vSphere components based on organizational security policies
Configure data protection for data at rest and data in motion
Manage protection for virtual machines, endpoints, and networks
Use micro-segmentation to protect and manage multitier applications and network data
Perform activity monitoring and logging, and explore relevant logs to meet compliance requirements
Use VMware NSX™ security groups, policies, and tags to automate deployment and security processes
Use automation to respond to security-related events
So, where can you learn more? VMware Education! Here’s the link
If you take the course, please send me some feedback. A lot of hard work went into it, especially by the VMware Education folks. We’re already talking about an update late next year to incorporate “future” stuff.
Throughout this blog post I’ll highlight some of the enhancements that have been brought to the vSphere Web Client in 5.5 Update 3. This is especially important as we see customers continue to use the legacy vSphere Client (also referred to as the legacy C# client). Our goal is to make the Web Client everyone’s primary management tool for vCenter Server & vSphere, and continuing to improve performance has been an essential requirement in doing that.
In the first part of this series we provided a high level view of the benefits of using Virtual Volumes enabled storage for database operations. In the second part of this series we examined in more detail how Virtual Volumes can improve the backup and recovery capabilities for business critical databases, specifically Oracle.
The backups for Oracle can be Database consistent or Crash consistent. In this part we will look at Crash consistent backup and recovery and also how database cloning is simplified by the use of VVol.
The Hadoop-based system running on vSphere that is described here was architected by Rajit Saha (who provided the material for this blog) and a team from VMware’s IT department.
This article describes the technical infrastructure for a VMware internal IT project that was built and deployed in 2015 for analyzing VMware’s own business data. Details of the business applications used in the system are not within the scope of this article. The virtualized Hadoop environment and modern analytics project was implemented entirely on the vSphere 6 platform.
The key lesson that we learned from this implementation is that you can start at a small scale with virtualizing big data/Hadoop and then scale the system up over time. You don’t need to wait for a large amount of hardware to become available to get started.
One question I’m commonly asked (weekly, if not daily) is what the perfect pCPU to vCPU ratios are that I should plan for, and operate to, for maximum performance. I wanted to document my perspective for easy future reference.
There is no common ratio and, in fact, this line of thinking will cause you operational pain. Let me tell you why.
VMware released NSX-v (NSX for vSphere) 6.2 back on August 20, 2015. With its release the NSX team introduced support to use NSX-v as a load balancer for the vSphere Platform Services Controller (PSC) for highly available deployments (Release Notes). This is a key new feature that enables customers to further leverage existing NSX-v deployments to simplify their vSphere infrastructure while providing additional HA capabilities for the PSC. This can be a fairly straightforward undertaking when there is an existing vCenter being used for management (e.g. a management cluster).
There is a second scenario, however, that requires some consideration. What if you’re deploying a new vSphere and NSX-v environment where a management vCenter does not already exist? Romain Decker, a Solution Architect in VMware’s Software-Defined Datacenter (SDDC) Professional Services Engineering team, has put together a great blog post on the VMware Consulting Blog that walks through that exact scenario. It provides step-by-step instructions on how to work around this chicken-and-egg problem using the ability to easily repoint a vCenter Server to an alternate PSC in vSphere 6.0 Update 1.
To learn more about configuring NSX-v as a load balancer for the vSphere Platform Services Controller, read Romain’s full blog post at:
In the first part of this series we provided a high level view of the benefits of using Virtual Volumes enabled storage for database operations. In this post, we will examine in more detail how Virtual Volumes can improve the backup and recovery capabilities for business critical databases, specifically Oracle.
The backups for Oracle can be Database consistent or Crash consistent. In this part we will look at Database consistent backup and recovery.
The solution requires VVol enabled storage. We leveraged SANBlaze VirtuaLun as the backend storage for the backup and recovery exercise, using the VirtuaLun 7.3 emulator from SANBlaze. This emulator is VVol enabled and is one of the first VVol certified storage solutions available.
In September we announced that VMware Tools 10.0.0 had been released and that VMware is now shipping VMware Tools outside of the vSphere releases. Since then, we have received a lot of feedback from the community, customers, and internal folks alike. I would like to let everyone know that we have listened and that we continue on our path to make the VMware Tools lifecycle (and the ESXi lifecycle, for that matter) easier and less painful than it may appear today.
If you’ve done any research into the high-availability options available for vCenter Server 6.0, hopefully you have had a chance to read the VMware vCenter Server 6.0 Availability Guide written in collaboration with Technical Marketing and Global Support Services as well as KB 1024051. And you might have noticed particular sections that refer to the vCenter Server Watchdog. But what exactly is the vCenter Server Watchdog?
Enabled “out of the box” in 6.0, the vCenter Server Watchdog provides better availability by periodically verifying the status of vCenter Server. It does this in two ways:
The PID Watchdog monitors the processes running on vCenter Server.
The API Watchdog uses the vSphere API to monitor the functionality of vCenter Server.
If any services fail, the Watchdog attempts to restart them. If it cannot restart the service because of a host failure, vSphere HA restarts the virtual machine running the service on a new host.
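The check-and-restart behavior described above can be illustrated with a minimal Python sketch. This is not how the vCenter Server Watchdog is implemented. The `Service` record, its three callables, and `check_once` are hypothetical names invented for illustration. It only shows the general pattern: a process-level check, an API-level check, and a restart attempt when either fails.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Service:
    """Hypothetical service record; not a vCenter Server data structure."""
    name: str
    process_alive: Callable[[], bool]  # PID-style check: is the process running?
    api_responds: Callable[[], bool]   # API-style check: does it answer a request?
    restart: Callable[[], bool]        # returns True if the restart succeeded


def check_once(services: List[Service]) -> List[str]:
    """Run one watchdog pass.

    Returns the names of services that were unhealthy and could not be
    restarted, so that a caller can escalate them.
    """
    unrecovered = []
    for svc in services:
        # A service counts as healthy only if both checks pass.
        if svc.process_alive() and svc.api_responds():
            continue
        # Either check failed: attempt a restart and record a failed attempt.
        if not svc.restart():
            unrecovered.append(svc.name)
    return unrecovered


if __name__ == "__main__":
    # Simulated service that is down but restarts cleanly (assumed behavior).
    state = {"up": False}

    def bring_up() -> bool:
        state["up"] = True
        return True

    demo = Service("demo", lambda: state["up"], lambda: state["up"], bring_up)
    print(check_once([demo]))
```

In this sketch, any name left in the returned list stands for the case where a service-level restart is not enough, which the text above says is handled by vSphere HA restarting the virtual machine on a new host.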
That sounds slick, right? Well, let’s dive in and take a look at each of these watchdogs in detail.