
Big Data on vSphere: Two Customer Case Study White Papers Published


Two new white papers are now available on the work done at Adobe on virtualizing Hadoop. The VMware-authored paper, Adobe Deploys Hadoop as a Service on VMware vSphere, focuses on the business background and justification for virtualizing the workload. It also describes how the central Technical Operations function implemented Hadoop-as-a-Service to meet the needs of the business units and data analysis groups that require Hadoop as a platform. The paper also details the use of the vSphere Big Data Extensions tool, which was used heavily in the project, and the connection to vRealize Automation that forms the basis for the cloud offering at Adobe.

The second, complementary white paper on the same architecture, Virtualizing Hadoop in Large-Scale Infrastructures, was written by the EMC consulting team that supported the project. The EMC paper focuses on the technical reference architecture for the Proof-of-Concept (POC) conducted in late 2014, the results of that POC, the performance tuning work, and the physical topology that was deployed using Isilon storage. The two papers were written in concert by the two organizations and should be read together for a full picture of the Hadoop virtualization project. The system is now live at Adobe Digital Marketing, hosted on their Virtual Private Cloud, and is being used by different groups within the big data community there. Together, the papers also provide an outline reference architecture for use in other installations. Watch this space: more technical case studies are in the works.

Speaking of technical reference material for Hadoop on vSphere, here is the current list of technical papers and websites available for learning more about the subject:

Big Data/Hadoop on VMware vSphere – Reference Materials

Deployment Guides

Reference Architectures

Customer Case Studies

Performance Studies

There are some very useful best practices in the first two technical papers.

vSphere Big Data Extensions (BDE)

Other vSphere Features and Big Data