Cloudera Gets More Cloudy: Partners and Certifies CDH4 on vSphere

Today, we are excited to officially welcome Cloudera to the VMware family. VMware and Cloudera have entered into a partnership agreement meant to help users of Cloudera’s Hadoop distribution, CDH4, run it in the cloud. As part of this announcement, VMware has tested and certified Cloudera’s Enterprise Big Data software to run on vSphere 5.1, and Cloudera is now part of the VMware Ready and Technical Alliances Partner (TAP) program.

This month at EMC World, VMware CEO Pat Gelsinger stated that over 500,000 Hadoop installations exist today on bare-metal servers, with compute and data tied to the same physical server. By breaking compute and data apart and putting them on fast-to-deploy vSphere virtual machines, big data becomes inherently more accessible, compute times can improve by up to 13%, and datacenters can be optimized to provide more types of data services without adding more hardware.

The announcement comes at a time when the volume of data is exploding and, according to PwC’s 5th Annual Digital IQ Survey, 83% of top-performing companies believe that harnessing Big Data will give their firms a competitive advantage. As such, many CIOs are formally aligning their agendas to invest in big data this year. Continue reading

Breaking the Mindset: Why Hadoop Can and Should Move Past Bare-Metal Deployments to Virtualization

Whenever we’ve dealt with something for a while, our way of thinking about it becomes a habit. Hadoop deals with a lot of data. Currently, the record is 100 petabytes in a Facebook cluster that analyzes log data. Because it was built by the likes of Google and Facebook to handle such large data volumes and performance demands, it was originally built to run on bare-metal servers. Because virtualization wasn’t an option from the get-go, the notion that you can’t safely run that much data on a movable virtual machine has largely gone unchallenged.

However, as time has gone on and technology has allowed for persistent storage in the cloud, organizations have started to rethink this paradigm. In fact, several companies are using Hadoop and big data today to gain competitive advantage. And while they are running it on virtualization, they are not moving the data. There are other advantages as well.

VMware’s Big Data product line marketing manager, Joe Russell, spoke with Roberto Zicari this week in an interview on ODBMS.org that helps articulate not only why Hadoop can run on virtual infrastructure using Project Serengeti, but also why companies should consider it to save time and make Hadoop more usable. Continue reading

New RabbitMQ 3.1.0 Release Available

RabbitMQ 3.1.0 is now available for immediate download.

Announced this morning on the new Pivotal blog, where RabbitMQ now resides, this version includes enhancements to garbage collection, consumption, requeuing, memory use, and dead lettering.

For those on Mac OS X, there is a newly packaged, standalone release of RabbitMQ that doesn’t require a separate Erlang install.

Some key new capabilities include eager synchronisation of mirror queue slaves, automatic cluster partition healing, and improved statistics (including charts) in the management plugin. There are also many enhancements and bug fixes to the server, the Java client, the Erlang client, and a number of plugins, including the federation, old-federation, shovel, Web-STOMP, STOMP, and MQTT plugins, as well as the consistent hash exchange.

RabbitMQ’s blog post on the topic shares screenshots of several new features, including the new charts and filters.

5 Steps to Mainframe Modernization with a Big Fast Data Fabric

To drive growth, many companies are looking to innovate by ramping up analytical, mobile, social, big data, and cloud initiatives. For example, GE is one growth-oriented company that just announced heavy investment in the Industrial Internet with GoPivotal. One area of concern for many well-established businesses is what to do with their mainframe-powered applications. Mainframes are expensive to run, but the applications that run on them are typically very important, and the business cannot afford to risk downtime or any degradation in service. So, until now, the idea of modernizing a mainframe application has often faced major roadblocks.

There are ways to preserve the mainframe and improve application performance, reliability, and even usability. As one of the world’s largest banks has found, big, fast data grids can provide an incremental approach to mainframe modernization and reduce risk, lower operational costs, increase data processing performance, and provide innovative analytics capabilities for the business, all based on the same types of cloud computing technologies that power internet powerhouses and financial trading markets. Continue reading

Webinar Recap: Pivotal Opens For Business, GE Gets 10% Stake and How Pivotal Plans to Deliver Next-Generation PaaS

Pivotal is now open for business!

Pivotal, first announced in December, is a new venture started by VMware and EMC that is focused on Big Data and Cloud Application Platforms. Formally launched as a stand-alone entity today, Pivotal is led by former VMware CEO Paul Maritz, who has been working as Chief Strategy Officer at EMC since last August.

In a webinar today, Maritz not only confirmed that the new initiative is now a stand-alone business with 1,250 employees from VMware and EMC, but also surprised listeners with an announcement that General Electric is making a strategic investment of $105 million in Pivotal. GE’s Vice President and Corporate Officer Bill Ruh joined the webinar and said GE will hold a 10% stake in the new company. GE CEO Jeff Immelt also joined the call. The investment brings the value of the newly launched Pivotal to roughly $1 billion.

GE also announced this morning that its Software Center is standardizing on several of Pivotal’s technologies, essentially making GE the first public customer to endorse the new company. Continue reading

15% Discount for Spring Java Training in May

Training is a great way to speed up development, learn how to improve the performance and usability of your applications, and generally build confidence in your skills. This month, SpringSource is offering Java developers a 15% discount on all VMware trainings, including Core Spring, Spring Web, Enterprise Integration, and Hibernate classes.

To secure your 15% discount, be sure to use the promo code springcustomerpromo during registration (the promo is not available for partners). The qualifying classes for May 2013 are listed below:

Step 1: Core Spring

Americas

7 Myths on Big Data—Avoiding Bad Hadoop and Cloud Analytics Decisions

Hadoop is an open source legend built by software heroes.

Yet, legends can sometimes be surrounded by myths—these myths can lead IT executives down a path with rose-colored glasses.

Data and data usage are growing at an alarming rate. Just look at the numbers from analysts: IDC predicts a 53.4% growth rate for storage this year, AT&T claims 20,000% growth in its wireless data traffic over the past 5 years, and if you take a look at your own communications channels, it’s guaranteed that the internet content, emails, app notifications, social messages, and automated reports you get every day have dramatically increased. This is why companies ranging from McKinsey to Facebook to Walmart are doing something about big data.

Just like we saw in the dot-com boom of the 90s and the web 2.0 boom of the 2000s, the big data trend will also lead companies to make some really bad assumptions and decisions.

Hadoop is certainly one major area of investment for companies looking to solve big data needs. Companies like Facebook that have famously dealt well with large data volumes have publicly touted their successes with Hadoop, so it’s natural that companies approaching big data first look to the successes of others. A really smart MIT computer science grad once told me, “when all you have is a hammer, everything looks like a nail.” This functional fixedness is the cognitive bias to avoid amid the hype surrounding Hadoop. Hadoop is a multi-dimensional solution that can be deployed and used in different ways. Let’s look at some of the most common preconceived notions about Hadoop and big data that companies should know about before committing to a Hadoop project: Continue reading

How fast is a Rabbit? Basic RabbitMQ Performance Benchmarks

One of the greatest things about RabbitMQ is the community that surrounds it. With open source at its roots, people come together to share their code, their knowledge, and their stories of how they’ve deployed it in their projects. At a recent meetup near Nice, France, database engineer Adina Mihailescu gave a presentation on choosing messaging systems. Supported by Muriel Salvan’s benchmark comparing ActiveMQ, RabbitMQ, HornetQ, Apollo, QPID, and ZeroMQ, the presentation offered some interesting performance comparisons that we’d like to share with you.

In a single-laptop benchmark, Salvan ran four different scenarios to gain some insight into the performance of the default setups for these messaging solutions. Each test had one process dedicated to enqueuing and another dedicated to dequeuing. Message volumes ranged from 200 to 20,000 to 200,000 messages, and message sizes from 32 to 1,024 to 32,768 bytes. Both persistent and transient queues and messages were used. Continue reading
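To make the shape of such a test concrete, here is a minimal sketch of the one-enqueuer, one-dequeuer pattern described above. It is not Salvan's benchmark and does not talk to RabbitMQ: Python's in-memory `queue.Queue` stands in for a broker, threads stand in for the two processes, and persistence is not modeled. The function name `run_scenario` and the four volume/size pairings in `SCENARIOS` are assumptions made for illustration; the excerpt does not say which combinations the four scenarios used.

```python
import queue
import threading
import time


def run_scenario(message_count: int, message_size: int) -> dict:
    """Push message_count payloads of message_size bytes through a queue.

    One thread only enqueues and one thread only dequeues. queue.Queue is an
    in-memory stand-in for a message broker, not RabbitMQ itself.
    """
    q: queue.Queue = queue.Queue()
    payload = b"x" * message_size
    received = []  # sizes of the messages the consumer pulled off the queue

    def producer() -> None:
        for _ in range(message_count):
            q.put(payload)

    def consumer() -> None:
        for _ in range(message_count):
            received.append(len(q.get()))

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    return {
        "messages": message_count,
        "size_bytes": message_size,
        "received": len(received),
        "bytes_received": sum(received),
        "seconds": elapsed,
    }


# Hypothetical pairings of the volumes and sizes quoted in the post.
SCENARIOS = [(200, 32768), (20_000, 1024), (20_000, 32), (200_000, 32)]

if __name__ == "__main__":
    for count, size in SCENARIOS:
        result = run_scenario(count, size)
        # Guard against a zero elapsed time on very coarse clocks.
        rate = result["messages"] / max(result["seconds"], 1e-9)
        print(f"{count:>7} msgs x {size:>5} B: {rate:,.0f} msgs/s")
```

Numbers from a stand-in like this say nothing about any real broker; the point is only the structure of the measurement: fix a volume and a payload size, time a dedicated producer and consumer, and report messages per second.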

10 Ways to Make Hadoop Green in the CFO’s Eyes

Hadoop is used by some pretty amazing companies to make use of big, fast data, particularly unstructured data. Huge brands on the web like AOL, eBay, Facebook, Google, Last.fm, LinkedIn, MercadoLibre, Ning, Quantcast, Spotify, StumbleUpon, and Twitter, as well as brick-and-mortar giants like GE, Walmart, Morgan Stanley, Sears, and Ford, use Hadoop.

Why? In a nutshell, companies like McKinsey believe the use of big data and technologies like Hadoop will allow companies to better compete and grow in the future.

Hadoop is used to support a variety of valuable business capabilities—analysis, search, machine learning, data aggregation, content generation, reporting, integration, and more. All types of industries use Hadoop—media and advertising, A/V processing, credit and fraud, security, geographic exploration, online travel, financial analysis, mobile phones, sensor networks, e-commerce, retail, energy discovery, video games, social media, and more. Continue reading

Upcoming Webinar: Paul Maritz on Pivotal and The New Platform for the New Era

The cloud, mobile applications and big, fast data are fundamentally changing how applications are built and modernized today. To speed this transformation at the enterprise level, Pivotal, the new venture by VMware and EMC, will host a live streaming event on April 24th at 10:00 am Pacific/1:00 pm Eastern with a special announcement and an unveiling of its plans to build “A New Platform for a New Era”.

The Pivotal platform will unite data, application, and cloud fabrics, helping enterprises to develop faster, understand more, and succeed at an even greater scale. It is a platform that makes the consumer-grade enterprise a reality.

Pivotal brings together a prodigious set of technologies and talent from a number of EMC and VMware entities, which include Greenplum, Cloud Foundry, Spring, GemFire and other products from the VMware vFabric Suite, Cetas, and Pivotal Labs.

Paul Maritz, the Pivotal Leadership Team, and special guests will unveil this platform, and make a special announcement during a live streaming event on Wednesday, April 24th at 10:00 am Pacific/1:00 pm Eastern.

Sign up for the event at gopivotal.com and follow @gopivotal on Twitter for updates.