Category Archives: Case Study

How Instagram Feeds Work: Celery and RabbitMQ

Instagram is one of the poster children for social media success. Founded in 2010, the photo-sharing site now supports upwards of 90 million active users. As with every social media site, part of the fun is that photos and comments appear instantly so your friends can engage while the moment is hot. At PyCon 2013 last month, Instagram engineer Rick Branson shared how Instagram had to transform the way these photos and comments show up in feeds as it scaled from a few thousand tasks a day to hundreds of millions.

Rick started off his talk by demonstrating how traditional database approaches break, calling them the “naïve approach”. In this approach, to display a user’s feed, the application directly fetches all the photos posted by the accounts the user follows from a single, monolithic data store, sorts them by creation time, and displays only the latest 10:

SELECT * FROM photos
WHERE author_id IN
(SELECT target_id FROM following
WHERE source_id = %(user_id)d)
ORDER BY creation_time DESC
LIMIT 10;
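
To make the naïve approach concrete, here is a minimal, hypothetical sketch that runs the same query against an in-memory SQLite database from Python. The table columns, the sample rows, and the `latest_feed` helper are assumptions made for illustration, not Instagram’s actual schema or code. The sketch also uses SQLite’s `?` placeholders in place of the `%(user_id)d` parameter shown above.

```python
import sqlite3


def latest_feed(conn, user_id, limit=10):
    """Naive feed read: one query against a single store, newest first."""
    query = """
        SELECT id, author_id, creation_time FROM photos
        WHERE author_id IN
            (SELECT target_id FROM following WHERE source_id = ?)
        ORDER BY creation_time DESC
        LIMIT ?
    """
    return conn.execute(query, (user_id, limit)).fetchall()


# Assumed, simplified schema: just enough columns to run the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY,
        author_id INTEGER,
        creation_time INTEGER
    );
    CREATE TABLE following (source_id INTEGER, target_id INTEGER);
""")

# User 1 follows users 2 and 3, but not user 4.
conn.executemany("INSERT INTO following VALUES (?, ?)", [(1, 2), (1, 3)])

# 18 photos with increasing timestamps, authors cycling 2, 3, 4.
photos = [(author, t) for t, author in enumerate([2, 3, 4] * 6)]
conn.executemany(
    "INSERT INTO photos (author_id, creation_time) VALUES (?, ?)", photos
)

feed = latest_feed(conn, user_id=1)
for photo_id, author_id, creation_time in feed:
    print(photo_id, author_id, creation_time)
```

Every feed request repeats the subquery, the sort, and the limit against the one data store, so the full cost of assembling a feed is paid each time a user looks at it.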

Instead, Instagram chose to follow a modern distributed data strategy that will allow them to scale nearly linearly.

How Indeed.com Handles 35 Million Job Postings Per Day Using RabbitMQ

Indeed.com is hosting a tech talk that will be a case study of how they scaled their aggregation engine to handle 35 million job postings per day using RabbitMQ. The talk will be hosted at Indeed’s engineering site in Austin, TX at 7 pm on March 27th.

For those not familiar with Indeed’s scale, the job search company is one of the largest websites in the world. According to Alexa.com, Indeed is currently the 224th biggest website in the world, and in cities like Atlanta and Chicago, it’s the 55th most popular website overall. According to research by SilkRoad, 2 out of every 5 hires came via Indeed (based on data from 150,000 hires).

As expected, the engineering team behind this large-scale application has to support some very large numbers. In a recent post on their company blog, the Indeed team shared just how big those numbers are:

  • More than 100 million monthly unique visitors
  • More than 3 billion searches per month
  • More than 1000 searches per second
  • 50 country-specific sites in 26 languages
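
As a rough sanity check, the per-month and per-second figures above agree with each other. The sketch below assumes a 30-day month and evenly spread traffic, neither of which comes from Indeed; real peaks would sit above this average.

```python
# Back-of-the-envelope check of the search figures quoted above.
SEARCHES_PER_MONTH = 3_000_000_000   # "more than 3 billion searches per month"
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month

average_per_second = SEARCHES_PER_MONTH / SECONDS_PER_MONTH
print(round(average_per_second))  # about 1157, i.e. "more than 1000" per second
```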

The scale of their application, both in terms of processing throughput and geographic diversity, means that the team relies on a messaging layer powered by RabbitMQ.

Scaling and Modernizing .NET and Java: SQLFire Performance Test Blows Away Traditional RDBMS

We all know the devil is in the details when it comes to technology.

Yet our recent vFabric SQLFire performance test (a benchmark from the vFabric SQLFire Best Practices Guide) is certainly worth a review if you need to scale a Java app, .NET app, or other legacy data source.

If you don’t know what vFabric SQLFire is, it is basically what happens when Apache Derby gets married to vFabric GemFire:

  • Apache Derby is used for its RDBMS components, JDBC driver, query engine, and network server.
  • The partitioning technology of GemFire is used to implement horizontal partitioning features of vFabric SQLFire.
  • vFabric SQLFire enhances Apache Derby components such as the query engine, the SQL interface, data persistence, and data eviction, and adds components like SQL commands, stored procedures, system tables, functions, persistence disk stores, listeners, and locators, to operate a highly distributed and fault-tolerant data management cluster.

Messaging Architecture: Using RabbitMQ at the World’s 8th Largest Retailer

Today, we are pleased to have a guest blogger from a VMware customer share with us their story of how RabbitMQ transformed their business by “solving some really interesting problems”. The following is sent courtesy of Pablo Molnar of MercadoLibre:

If you haven’t heard of MercadoLibre (NASDAQ: MELI), we are the largest e-commerce ecosystem in Latin America. Our website offers a wide range of services to sellers and buyers throughout the region including marketplace, payments, advertising, and e-building solutions. Our products are present in over 14 countries, and the company is ranked as the 8th largest online retailer in the world. We were also on Fortune’s list of the fastest growing companies in 2012, and we use RabbitMQ to solve some interesting problems.

About Our Technology Stack and How RabbitMQ Helps

In terms of technology infrastructure, MercadoLibre is fully committed to the open source development model. Most of our apps are written in Grails, Groovy, and NodeJS, but we don’t stick to any one language or framework. We entrust tool selection to the software engineers on each team. Almost all applications are hosted by our in-house cloud provisioning system, which is implemented on OpenStack and runs more than 7,000 virtual instances at the moment. We have also successfully launched applications using emerging storage solutions like Redis and MongoDB. With an average of 20 million requests per minute and 4GB of bandwidth per second, our traffic management layer is crucial, and most of the routing work is done by Nginx proxy servers. Our labs department includes a huge Apache Hadoop cluster to perform complex analytical queries, and we are experimenting with real-time data processing using Apache Kafka and Storm.

4 Reasons You Should Consider Modernizing Mainframe Apps in 2013

IT organizations are facing significant challenges maintaining legacy mainframe applications: challenges ranging from the high cost of proprietary hardware and software, to the attrition of people with qualified mainframe skills and experience, and the inability to support modern computing demands of mobile and big fast data.

Cloud computing offers an opportunity to rationalize and modernize application portfolios, which can include migrating legacy mainframe apps to the cloud. Unfortunately, many IT organizations see the prospect of modernizing mainframe apps as a “mission impossible”: the path forward is too cloudy, and the costs and risks are too great.

As a result, many resign themselves to living with the burdens of a legacy mainframe environment. And while maintaining the status quo may appear to be the best option, over time it only intensifies the challenges associated with maintaining mainframe apps. Eventually the business loses confidence in IT’s ability to deliver, and costs continue to rise without corresponding value.

The Best VMware vFabric Stories of 2012 & What’s In Store for 2013

As this year comes to a close, it’s time to reflect on the past year and start planning for the new one. The vFabric team has had some major achievements this year, introducing several new products to the market including the innovative vFabric Application Director, the widely anticipated Project Serengeti to enable rapid cloud deployments for Hadoop, and a new tool for vFabric Suite users called vFabric Administration Server (VAS). We also announced a new VMware Cloud Applications Marketplace to help further accelerate application development with a professionally moderated library of enterprise-grade, ready-to-use application components that can be run on any cloud.

Next year is going to be even bigger with the Pivotal Initiative, where several of the products covered on this blog will be following the new venture. This is still in the planning stages, so we expect to share the plans for our products alongside the formal communications from each of the companies involved. (Sorry, no extra information is available right now.)

One thing we will be doing in early 2013 is bringing the conversation about how you manage applications together with the conversation about how you manage virtual infrastructure. To that end, we will be moving all topics on Application Performance Manager, AppInsight, Application Director, Hyperic, and Spring Insight to the VMware Management Blog as of January 1st. To keep up with the management topics, please be sure to follow us @vmwareappmgmt and @vmwaremgmt.

In the meantime, we’d like to reshare with you the top 20 stories we had for 2012, and invite you to comment here on what stories you would like to see us cover on either blog for 2013.

Cloud Diaries: Turning 13 Datacenters into 6? How vFabric Application Director Helps

Data center consolidation.

Those three words often mean a lot of things – a lot of work, a lot of change, a lot of cost savings, a lot of leadership, and a lot of coordination.  Of course, the payoff of doing it right can also be outstanding.

We had the opportunity to gain personal, anonymous observations from a senior technical architect of a European consulting firm who knows firsthand that data center consolidation can create value, citing “moving thirteen datacenters run by thirteen teams to six data centers run by one team is the catalyst for huge improvements in many areas.” Our architect’s company provides recommendations, architecture, installation, customized solutions, and operations services for IT. In conversation with VMware, they explained that deployment automation is a critical requirement of many of their clients’ consolidation plans, and they pointed out how vFabric Application Director is fundamental to the approach.

Why Hyperic is Going to Support PostgreSQL Only As a Backend Database

The next release of Hyperic is coming up soon and the biggest change is to the backend. In the next release, we will only support one database, namely PostgreSQL. Those of you who have been with Hyperic for as long as I have may be surprised considering our history with PostgreSQL, but as you read through this blog, it will start to make sense.

History of PostgreSQL and Hyperic

For the last few years, Hyperic has supported only two databases for production use at scale: Oracle and MySQL. This in itself was a big change, since at one point PostgreSQL was our bread and butter. Hyperic was originally designed on PostgreSQL 7.x. As an open source project, PostgreSQL has a very easy license for distribution. As a startup company, we had to get our product out into the marketplace quickly and affordably, so PostgreSQL made sense.

How it Works: VMware’s Own Internal Self-Service Cloud

One of the hottest new products in the vFabric portfolio is vFabric Application Director.

Many at VMware are excited about it simply because of what it means to our everyday work life. Earlier this year, we published a story on “How VMware IT Reduced Provisioning Time by 80% Using vFabric Application Director and More.”

Now that number is up to 90%. Here’s an overview of what the business workload lifecycle management implementation looks like under the hood.

As shared in the earlier post, our goal was to automate the end-to-end application life-cycle management in a private cloud and eventually across the clouds. Automation by definition speeds things up and makes them less error prone, but in this case, it also meant that VMware’s IT organization could decouple itself from the everyday operations of the app and product teams it serviced. This split between IT and DevOps is a goal for many organizations today who are looking to be more agile, save money and maintain strong IT governance.

To achieve it, VMware IT automated several key processes across organizations including:

  • Cloud & Catalog Management
  • Workload Blueprint Design
  • Workload Automation
  • Self-service Portal
  • Change Management

Naturally, we looked first to eating our own dogfood and built these processes on top of several VMware products including VM Studio, vCloud Director, vFabric Application Director, vCenter Orchestrator and Service Manager.

How Even the Ocean (Data) Is In The Cloud

Recently, VMware worked with the Ocean Observatories Initiative to discuss an interesting case study that affects us all. The U.S. has built an ocean of big data on the ocean itself. Currently, we are collecting about 8 terabytes a day, or roughly 3 petabytes a year, of data about the ocean in order to more efficiently and safely study the body of water that covers over 70% of the Earth.

The Ocean Observatories Initiative (OOI) is a 25-year program responsible for managing a networked set of hundreds of sensor instruments that sit in the ocean, take measurements, send data back to a massive data infrastructure, and make data sets and reports available to oceanographers, scientists, educators, and the public on a very broad scale. This system, quite literally, is a Hubble Telescope for observing the ocean. While this mega-system has an amazing history and tons of interesting capabilities, we think it’s pretty cool that VMware vSphere and vFabric RabbitMQ play key roles.