Posts Tagged ‘ bigdata ’

Josh Simons

Hadoop Performance on vSphere 5.1

April 19, 2013

We’ve just published a third Hadoop performance paper, written by VMware performance expert Jeff Buell, which looks in detail at the relative performance of a bare-metal 32-node Hadoop cluster compared to a range of virtual clusters with up to 128 VMs. The executive summary is that while we saw a 13% performance degradation in a head-to-head comparison of a 32-node physical cluster against a 32-VM virtual cluster (one VM per host) running on the same hardware and running the same tests, virtualized performance can be increased significantly — to the point where virtualized Hadoop actually runs a bit faster than physical — by increasing the number of VMs per host. We’ve seen this effect before with Hadoop...

Read more

Josh Simons

Virtualizing Big Data

October 26, 2011

Analysis of large-scale, often unstructured data is becoming increasingly important within both the Enterprise and the HPC community. This is perhaps one of the most apparent areas where the convergence of HPC and Enterprise requirements can be seen, since the tools and algorithmic approaches required are often the same or very similar. I imagine, for example, that the large-scale, graph-oriented “social network” analyses done by companies like Facebook are quite similar to the “anti-social network” analyses done by Homeland Security and the Intelligence community. Unsurprisingly, many VMware customers are interested in running Big Data workloads and are looking for guidance about how best to do this in a virtual environment. To help, we have published a whitepaper that examines Hadoop performance using local...

Read more

Josh Simons

Media that Gets It

September 6, 2011

Media outlets usually write about technologists, but today I’d like to reverse that. I met at VMworld last week with William Wallace, one of the principals from what I refer to as “insideSTAR” (STAR as in regexp “*”) and learned that they will be adding inside-BigData to their existing inside-HPC and inside-Cloud news coverage. This excites me because it is an embodiment of the IT convergence I’ve been writing and talking about for the last few years: specifically, the increasing commonality between emerging Enterprise and HPC application and infrastructure requirements. See, for example, these blog posts: It is Time for Low Latency, Summer of RDMA, HPC Clouds: A Bad Idea? I view insideSTAR’s universe this way: Those...

Read more

Josh Simons

Our Joint VMware / AMAX HPC Collaboration

April 27, 2011

I am excited that we have now started our joint HPC exploration with our partner, AMAX. Based on an initial meeting on the show floor at VMworld in San Francisco last year, we decided to work together to examine several aspects of virtualized HPC of mutual interest. Areas where we see converging requirements between HPC and Enterprise customers are of particular interest to VMware as an Enterprise software company looking at broader markets and to AMAX as a dynamic computing solutions provider to HPC, Enterprise, and now Cloud customers. We are starting with Hadoop since scale-out data analytics is rapidly becoming an important workload in the Enterprise while Data Intensive Computing is simultaneously rising...

Read more