Today, we are excited to officially welcome Cloudera to the VMware family. VMware and Cloudera have entered into a partnership agreement meant to help users of Cloudera’s Hadoop distribution, CDH4, run it in the cloud. As part of this announcement, VMware has tested and certified Cloudera’s Enterprise Big Data software to run on vSphere 5.1, and Cloudera is now part of the VMware Ready and Technical Alliances Partner (TAP) program.
This month at EMC World, VMware CEO Pat Gelsinger stated that over 500,000 Hadoop installations exist today on bare metal servers, with compute and data tied to the same physical server. By breaking compute and data apart and putting them on fast-to-deploy vSphere virtual machines, big data becomes inherently more accessible, compute times can improve by up to 13%, and datacenters can be optimized to provide more types of data services without adding more hardware.
The announcement comes at a time when the volume of data is exploding and, according to PwC’s 5th Annual Digital IQ Survey, 83% of the survey’s top-performing companies believe that harnessing Big Data will give their firms a competitive advantage. As such, many CIOs are formally aligning their agendas to invest in big data this year.
In fact, a study released earlier this year by NewVantage Partners found that investment is already happening faster than we think. 85% of its 50 respondents (most with more than 30,000 employees) said they were already investing in Big Data. Many of these are still in Phase I, where they are experimenting with a pilot project. At this point in the adoption cycle, they are finding value and planning how to expand investment and increase the ROI of data mining.
This is the point at which both VMware and Cloudera hope to begin working with these data teams. As they gain experience and focus in servicing their own data needs, they will be looking for ways to do this more efficiently in the datacenter as well as to expand Big Data’s reach within the organization, making it more accessible for employees. This is when they should consider moving their Hadoop data loads to the cloud for the following reasons:
- Set up new compute processes in minutes, not hours or days
- Better hardware utilization and consolidation by running mixed workloads
- Performance improvements through pooling of resources
- High Availability/Fault Tolerance through vSphere Enterprise & Enterprise+
- Make Big Data projects more accessible by offering Hadoop-as-a-Service
Cloudera’s CDH4 is available now to run on vSphere 5.1 and above. Additionally, both companies have agreed to work together to support calls from customers running the two technologies together, although each company will provide patches and technical support directly only for its own products.