10 Ways to Make Hadoop Green in the CFO’s Eyes

Hadoop is used by some pretty amazing companies to make use of big, fast data, particularly unstructured data. Huge web brands like AOL, eBay, Facebook, Google, Last.fm, LinkedIn, MercadoLibre, Ning, Quantcast, Spotify, StumbleUpon, and Twitter use Hadoop, as do brick-and-mortar giants like GE, Walmart, Morgan Stanley, Sears, and Ford.

Why? In a nutshell, companies like McKinsey believe the use of big data and technologies like Hadoop will allow companies to better compete and grow in the future.

Hadoop is used to support a variety of valuable business capabilities—analysis, search, machine learning, data aggregation, content generation, reporting, integration, and more. All types of industries use Hadoop—media and advertising, A/V processing, credit and fraud, security, geographic exploration, online travel, financial analysis, mobile phones, sensor networks, e-commerce, retail, energy discovery, video games, social media, and more.

At first glance, it sounds like many of the above business needs were already solved by conventional data warehouses, business intelligence, and statistical analysis programs. This is not the case—the conventional systems begin to fail when the data sets become too large, include fast-growing unstructured data formats, or face both of these issues. With size and complexity issues, traditional BI systems can become too expensive. This is why Hadoop was invented.

Simply put, Hadoop follows the MapReduce model to slice data into chunks of work, spread the work across a large number of commodity servers, and aggregate the results back into a single output. Its parallel computing approach out-scales the old models and is more cost-effective at doing so.
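To make the slice, spread, and aggregate idea concrete, here is a minimal single-machine sketch in Python. It is not Hadoop's API: the function names, the word-count task, and the use of worker threads as stand-ins for commodity servers are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, chunk_size):
    """Slice the input into fixed-size chunks of work."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key so each key can be reduced on its own."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: aggregate each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, chunk_size=2, workers=4):
    chunks = split_into_chunks(lines, chunk_size)
    # Worker threads stand in for the commodity servers of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = ["big data big value", "fast data", "big clusters"]
    print(word_count(sample))
```

In a real cluster the chunks would live on different machines and the grouping step would move data across the network, but the shape of the computation is the same.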

Effectively Managing Hadoop from the CFO’s Eyes

In the early days of the web and enterprise apps, everyone was so enamored with the potential for growth and productivity that both business and IT teams spent money prematurely. We ended up with a massive number of underutilized servers that cost us an arm and a leg to operate. Then we spent more to virtualize these resources, get better utilization out of our datacenters, and reduce our overhead.

With the big data technology trend, we are facing the same excitement around Hadoop. It’s going to be an investment area for the next decade or two, and your CFO is going to see this coming. This time around, we can spend IT dollars much more wisely by putting Hadoop on virtualized infrastructure from the beginning. For those of us who have learned the painful TCO lessons of the past and understand the economics of virtualization, here is a list of ten key, financially sound cloud infrastructure requirements that should be part of any Hadoop project:

  1. Initial Hadoop projects should be explored for the most pressing issues in the company and start by aligning with the CEO and CFO’s top needs and goals.
  2. Hadoop investments should run with the same data center efficiency and cost effectiveness as other virtualized platforms that have high server consolidation ratios and require less CapEx and OpEx than non-virtualized environments.
  3. Hadoop pilots should identify a big problem, make the scope concise, and complete quickly to prove the time-to-value and identify future costs and risks thoroughly. We all learn by doing—don’t drag out the time to value by over-engineering.
  4. Hadoop must be able to co-locate with existing applications and run on existing virtualized hosts. This approach should accommodate a Hadoop pilot without new hardware or help manage shared infrastructure budgets in a cost-effective manner.
  5. Hadoop nodes should use the concept of time sharing. For example, when email, database, web, or ERP applications are idle, the compute power available should be transferred to Hadoop nodes that are analyzing improvements in business performance.
  6. The Hadoop infrastructure should be able to scale up or down elastically, on-demand, and across clouds for burst compute needs. This capability would allow you to expedite a big analysis of your company’s performance by temporarily adding new Hadoop nodes on a third-party cloud service to increase capacity.
  7. Hadoop VMs should not require significant resources to scale, provision, deploy, replicate, or move because a cloud-centric, virtual machine infrastructure can accommodate this.
  8. Hadoop should be available to the company as a shared service. This is one of the most cost-effective ways to provide Hadoop as a service. In this model, it is available to all departments based on chargeback accounting. Even with shared services, virtualization still allows for enough isolation to meet independent business and security needs.
  9. Hadoop should not require expensive, hardware-based high-availability or fault-tolerance (i.e., no-downtime) frameworks. Distributed computing is meant for commodity computing in the cloud.
  10. Hadoop training, at least at a high level, should be provided to every IT person who engages with various business units and departments—Hadoop attracts talent and paves careers.
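The chargeback accounting mentioned in item 8 can be sketched with simple arithmetic. The example below assumes one possible policy, splitting a shared cluster's cost in proportion to the node-hours each department consumed. The function name, department names, and dollar figures are hypothetical and chosen only for illustration.

```python
def chargeback(total_cost, node_hours_by_department):
    """Split a shared cluster's cost in proportion to node-hours consumed."""
    total_hours = sum(node_hours_by_department.values())
    if total_hours == 0:
        # Nobody used the cluster, so nobody is billed under this policy.
        return {dept: 0.0 for dept in node_hours_by_department}
    return {
        dept: round(total_cost * hours / total_hours, 2)
        for dept, hours in node_hours_by_department.items()
    }


if __name__ == "__main__":
    # Hypothetical monthly figures, for illustration only.
    usage = {"marketing": 500, "finance": 300, "operations": 200}
    for dept, share in chargeback(10_000, usage).items():
        print(f"{dept}: ${share:,.2f}")
```

Because each share is rounded to cents, the shares may differ from the total by a cent or two. A real billing policy would need a rule for that remainder and for how to treat idle capacity.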

To learn more about how VMware is helping virtualize Hadoop clusters, check out Project Serengeti.

This entry was posted in Serengeti.
Adam Bloom

About Adam Bloom

Adam Bloom has worked for 15+ years in the tech industry and has been a key contributor to the VMware vFabric Blog for the past year. He first started working on cloud-based apps in 1998 when he led the development and launch of WebMD 1.0’s B2C and B2B apps. He then spent several years in product marketing for a J2EE-based PaaS/SaaS start-up. Afterwards, he worked for Siebel as a consultant on large CRM engagements, then launched their online community and ran marketing operations. At Oracle, he led the worldwide implementation of Siebel CRM before spending some time at a YouTube competitor in Silicon Valley and working as a product marketer for Unica's SaaS-based marketing automation suite. He graduated from Georgia Tech with high honors and an undergraduate thesis in human-computer interaction.
