New Serengeti Release Extends Cloud Computing Support for Hadoop Community

Today VMware is shipping a significant new release of Serengeti, its open source big data virtualization project, called M4 or version 0.8.0. Designed to make it easier for Hadoop users to deploy, run, and manage mixed-workload clusters on a virtualized platform, this release broadens support across the distributions of the Hadoop community, adding support for Cloudera CDH4, MapR, and HBase. Serengeti M4 also includes updated performance configuration improvements and a hardware reference architecture guide.

This release comes at a perfect time for an exploding data market. This year, the world will create 4 zettabytes of new data, and more than 80% of that will be unstructured data that does not fit in a traditional database management system. At the same time, businesses are learning to harness that data and use it to improve their business.

A popular strategy for succeeding in the data market is Hadoop, an open source data framework that allows for the massive distributed processing of large data sets across clusters of nodes using simple programming models. Additionally, Hadoop offers a scalable file system (HDFS) that lets users store huge amounts of data on inexpensive disks in commodity servers. The powerful framework has spawned many new startups in Silicon Valley and has enterprise IT departments clamoring to harness the power of this technology. Huge web applications like Facebook, LinkedIn, Yahoo! and eBay all rely on Hadoop to process and store data for hundreds of millions of users.

While these companies have large-scale deployments of Hadoop, its reach goes far beyond just the big applications. By internal estimates, VMware believes there are over 250,000 active Hadoop clusters in production today, most of which are pilot implementations with fewer than 20 nodes. By next year, we expect this number to double to 500,000 active clusters, with increasing scale and complexity among these deployments. Other experts agree this growth is not slowing, with Gartner expecting this number to increase by 800% in five years and IDC stating it will have a compound annual growth rate (CAGR) of over 60% through at least 2016.

Serengeti M4: Moving the Hadoop Community to the Cloud

A successful open source project, Hadoop has developed a market of several proven distributions. However, the project was initially developed to run directly on bare metal servers, not within a virtual machine. As these projects grow in size and adoption, customers are increasingly looking for better ways to optimize workloads across servers and to accelerate deployment of new clusters and Hadoop-based applications using virtualization and cloud computing.

Serengeti 0.8.0 now supports all the major Hadoop distributions, adding new support for Cloudera CDH4, MapR, and HBase to its existing support for Apache Hadoop, Pivotal HD, Hortonworks, and Cloudera CDH3, as well as Apache Pig and Apache Hive. This gives the broader Hadoop community the freedom to work with the distribution they choose while saving time and money by automating deployment and management of Hadoop clusters.

For more information on why you should consider deploying Hadoop in the cloud, see VMware’s whitepaper, Virtualizing Apache Hadoop.

New Features in the Serengeti M4 Release

Besides extending support to new distributions of Hadoop, the Serengeti 0.8.0 release also includes the following new capabilities:

  • The ability to deploy a ready-to-use HBase instance with full integration with MapReduce, the Thrift API, and the RESTful API.
  • The ability to deploy HDFS persistent storage with HBase.
  • HMaster HA (HBase) and Name Node HA (CDH4 and MapR) in an active and hot standby configuration, with ZooKeeper coordinating failover.
  • Pooling for temp data across multiple compute nodes, which is automatically released when no longer in use, reducing bandwidth constraints and improving performance.
  • Embedded HBase, Pig, Hive, and Hive Server configurations for CDH4 and MapR.
  • Improved performance settings for disk mounts and virtual SCSI controllers.
  • Improved default Hadoop configurations that match best practices.

This entry was posted in Serengeti.
Stacey Schneider

About Stacey Schneider

Stacey Schneider has over 15 years of experience working with technology, with a focus on sales and marketing automation as well as internationalization. Schneider has held roles in services, engineering, and products, and was the head of marketing and community for Hyperic before it was acquired by SpringSource and VMware. She is now a product marketing manager across the vFabric products at VMware, including supporting Hyperic. Prior to Hyperic, Schneider held various positions at CRM software pioneer Siebel Systems, including Group Director of Technology Product Marketing, a role in which her contributions earned her a patent. Schneider received her BS in Economics with a focus in International Business from the Pennsylvania State University.
