Tag Archives: HDFS

VMware’s Serengeti – Virtualized Hadoop at Cloud-scale

Not long ago I covered the topic of Big Data adoption in the enterprise. In that post, I described how Serengeti enables enterprises to respond to common Hadoop implementation challenges resulting from the lack of usable enterprise-grade tools and the shortage of infrastructure deployment skills.

With the latest release of the open source Project Serengeti, VMware continues its mission to deliver the easiest and most reliable virtualized Big Data platform. One of the most distinctive attributes of a Serengeti Hadoop deployment is that it can easily coexist with other workloads on existing infrastructure.

Serengeti-deployed Hadoop clusters can also be configured with either a local or a shared, scale-out data storage architecture. This storage layer can even be shared across multiple HDFS-based analytical workloads. And, in the future, this could potentially be extended to other, non-HDFS-based data engines.

The elasticity of the underlying vSphere virtualization platform helps Serengeti achieve new levels of efficiency. This architecture enables organizations to share their existing infrastructure with Big Data analytical workloads to deliver optimal storage capacity and performance.

Continue reading

2 Ways Fast/Big Data Impacts the IT Org Structure

Though my background includes time as a developer, architect, and CTO, much of my time today is spent discussing applications with senior IT executives. I manage an application development division of a national VAR and focus on the vFabric stack from top to bottom. One of the challenges I face is trying to provide application-centric consulting services to operations/infrastructure teams who (a) don’t really own the decision of app software infrastructure, (b) don’t understand it, or (c) worse, in some cases, don’t care. Recently, I’ve come to love my job for two primary reasons:

1. “Cloud” technologies are forcing the Operations teams and the Application teams to “share” responsibility for overall IT efficiency. The cloud concept of an on-demand, elastic infrastructure is knocking down political walls and silos that have evolved over the past decades in IT. Nowhere is this more evident than at VMware, where the vFabric and vSphere product lines are starting to blur (e.g. vCenter -> vCloud Director -> Application Director). Finally, I have something to talk about with the Infrastructure folks that gets them excited! Perhaps it is the needed automation of infrastructure that brings Ops to the Apps side. Or perhaps it is an elastic architecture that brings Apps over to the Ops side. In any event, the two teams are brought together and work together more in cloud solutions.

Continue reading

Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database

Aadhaar was conceived as a way to provide a unique, online, portable identity so that every single resident of India can access and benefit from government and private services. The Aadhaar project has received coverage from all possible media: television, press, articles, debates, and the Internet. It is seen as an audacious use of technology, albeit for a social cause. UIDAI, the authority responsible for issuing Aadhaar numbers, has published white papers, data, and newsletters on the progress of the initiative.

A common question to the UIDAI technology team in conferences, at events, and over coffee is: what technologies power this important nation-wide initiative? In this blog post, we wanted to give a sense of several significant technologies and approaches.

Fundamental Principles

While the deployment footprint of the systems has grown from half a dozen machines to a few thousand CPU cores processing millions of Aadhaar-related transactions, the fundamental principles have remained the same:

Continue reading