

Messaging Architecture: Using RabbitMQ at the World’s 8th Largest Retailer

Today, we are pleased to have a guest blogger from a VMware customer share with us their story of how RabbitMQ transformed their business by “solving some really interesting problems”. The following is sent courtesy of Pablo Molnar of MercadoLibre:

If you haven’t heard of MercadoLibre (NASDAQ: MELI), we are the largest e-commerce ecosystem in Latin America. Our website offers a wide range of services to sellers and buyers throughout the region, including marketplace, payments, advertising, and e-building solutions. Our products are present in over 14 countries, and the company is ranked as the 8th largest online retailer in the world. We were also on Fortune’s list of the fastest growing companies in 2012, and we use RabbitMQ to solve some interesting problems.

About Our Technology Stack and How RabbitMQ Helps

In terms of technology infrastructure, MercadoLibre is fully committed to the open source development model. Our apps are primarily written in Grails, Groovy, and NodeJS, but we don’t stick to any particular language or framework. We entrust tool selection responsibilities to the software engineers on each team. Almost all applications are hosted on our in-house cloud computing platform, built on OpenStack, with more than 7,000 virtual instances at the moment. We have also successfully launched applications using emerging storage solutions like Redis and MongoDB. With an average of 20 million requests per minute and 4 GB of bandwidth per second, our traffic management layer is crucial, and most of the routing-rule work is handled by Nginx proxy servers. Our labs department runs a huge Apache Hadoop cluster to perform complex analytical queries, and we are experimenting with real-time data processing using Apache Kafka and Storm.

MercadoLibre’s system was initially designed, built, and run as a single monolithic structure. Two years ago, we initiated a deep technical overhaul of our systems. The goal was to decentralize by migrating the large system into an open API-based platform and enterprise service bus (ESB). This would enable each of the separate business units to operate as an independent business and be responsible for its own success. Today, from the final user’s perspective, using the platform is a seamless experience that feels just like a single application. Behind the scenes, the site’s user experience coexists within a decentralized model and is integrated with dozens of independent applications. Each app is fully owned and controlled from inception to production by a separate team.

Since the decentralized architecture gives us the flexibility to choose the right tool set for the job, we can do things like use an Elasticsearch persistence approach for the items-search module while the view-item-page component stores the same data in MongoDB. Both solutions will hold the same dataset, but each will have the data shaped and persisted in the most efficient way for its requirements. This kind of architecture is really powerful, but it also leads us to a big challenge: how do we keep the same representation of data synchronized across all the required applications?

Real-Time Updates and News Feeds with RabbitMQ

Here is where RabbitMQ helps us build a flexible enterprise service bus. Any application that must consume certain types of events can connect to the ESB and start receiving real-time updates. As a result, every time there is a new listing on the platform, a seller updates an item, a bid is placed, or any other event occurs that could affect the representation of the data, a message is sent through RabbitMQ so each consumer is aware of the news and reacts properly to the event.

Clustering and failover capabilities, plus the flexibility of the AMQP model, made RabbitMQ an excellent solution for building our service bus. For readers who are not familiar with RabbitMQ and AMQP concepts, there are basically three important entities: exchanges, bindings, and queues. Exchanges are the entry point of any message, queues are the containers from which consumers fetch messages, and bindings are rule sets that connect exchanges to queues. This is just a quick, simplified description. If you are new to RabbitMQ or AMQP, check out the high-level overview on the RabbitMQ website.

If we look at the “search” and “vip” (view-item-page) applications mentioned above and map them to these messaging concepts, it would look like this diagram:

All the CRUD operations take place in the “items-app”. After any alteration of item data (create, update, and delete), a message is sent to the “items-feed” exchange. This exchange is a fanout type, meaning that any message is routed to all bound queues regardless of the routing key. Finally, each subscribed application consumes messages from its own queue. It is worth noting that we delegate the binding and queue declarations to each consumer, to avoid manual administration of objects or changing code every time a new consumer is added.
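To make the fanout pattern concrete, here is a minimal sketch of what such a subscriber could look like using the RabbitMQ Java client. This is not our production code; the exchange name matches the one above, while the host and queue names are purely illustrative:

import com.rabbitmq.client.*;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a subscriber ("search-app") that owns its queue and binding.
public class SearchAppConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbitmq.example.internal"); // illustrative broker host

        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // Fanout exchange: every message is copied to all bound queues,
        // regardless of the routing key.
        channel.exchangeDeclare("items-feed", BuiltinExchangeType.FANOUT, true);

        // The consumer declares its own queue and binding, so adding a new
        // subscriber never requires touching the publisher.
        channel.queueDeclare("search-app.items", true, false, false, null);
        channel.queueBind("search-app.items", "items-feed", "");

        DeliverCallback onMessage = (consumerTag, delivery) -> {
            String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
            System.out.println("Received item event: " + body);
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };
        channel.basicConsume("search-app.items", false, onMessage, consumerTag -> { });
    }
}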

Let’s take a closer look at a typical message sent to the “items-feed” exchange:

{"item_id": "MLA149074678", "seller_id": 61264586, "category_id": "MLA22230"}

Actually, the whole representation of an item is much larger, but in the example above we are sending just three fields. Why are we not sending the complete item?

Basically, it is because we cannot guarantee the order (or avoid duplication) of messages on the consumer side. Let me explain with a real-world scenario.

When a new item is listed, there is an automatic background process that activates it. So, it’s pretty common to send two messages with a minimal gap between them, in the following order:

#1: {"item_id": "MLA149074678", "seller_id": 61264586, "category_id": "MLA22230", "status":"active_pending"}

#2: {"item_id": "MLA149074678", "seller_id": 61264586, "category_id": "MLA22230", "status":"active"}

Because of the nature of the broker and multi-threading constraints on the consumer side, there is a real chance that message two arrives before message one. This could lead to an inconsistent state of data. To prevent this behaviour, we encourage each message to contain only static data, so that the consumer can process it in an idempotent way. If required, we can always pull the rest of the data from the Items API endpoint (e.g., https://api.mercadolibre.com/items/MLA149074678).
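As an illustration of that idempotent style (again, not our production code; the regex-based JSON parsing is only there to keep the sketch self-contained), a consumer can treat each message purely as a notification and always re-fetch the current item:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: the message only tells us which item changed; the consumer re-fetches
// the authoritative state from the Items API, so handling stays idempotent even
// if messages arrive out of order or duplicated.
public class IdempotentItemHandler {

    private static final Pattern ITEM_ID = Pattern.compile("\"item_id\"\\s*:\\s*\"([^\"]+)\"");
    private static final HttpClient HTTP = HttpClient.newHttpClient();

    public static void handle(String messageBody) throws Exception {
        Matcher m = ITEM_ID.matcher(messageBody);
        if (!m.find()) {
            return; // not an item event we understand
        }
        String itemId = m.group(1); // e.g. "MLA149074678"

        // Always pull the current representation instead of trusting the
        // (possibly stale) message contents.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.mercadolibre.com/items/" + itemId))
                .GET()
                .build();
        HttpResponse<String> response = HTTP.send(request, HttpResponse.BodyHandlers.ofString());

        // Persist or index the fresh item data here (application specific).
        System.out.println("Current item state: " + response.body());
    }
}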

Ok, now let’s add a new use case to our scenario for the “moderations-app.” The aim of this application is to moderate certain free-text fields of the item to ensure they comply with our terms of service. So, every update the user makes to fields like title, subtitle, and description must be moderated by this app.

The simplest solution might be just to add a new queue like the others and start consuming the events. However, this approach has harmful consequences: the queue will receive many unnecessary events, possibly leading to poor performance. This specific app doesn’t care about every item event; it only wants to be informed when one of the three mentioned fields is updated. Is that possible?

Sure thing. It’s likely that many applications will only need to consume certain types of events. To enable that, we attach metadata about the event itself to each message (such as which fields were updated, or whether it corresponds to a new listing), along with some other handy conditions. The metadata is carried in the headers of the message. So, suppose the seller updates the price and the title of an item. The earlier example, updated with the header metadata, would look like this:

Message payload:
{"item_id": "MLA149074678", "seller_id": 61264586, "category_id": "MLA22230"}

Message headers:
["type": "update", "price": true, "title": true]

The AMQP protocol has different kinds of exchanges, each with different routing capabilities. Specifically, there is a “headers exchange” designed to route messages based on attributes defined in the message headers (see “Headers Exchange” in http://www.rabbitmq.com/tutorials/amqp-concepts.html). By using the proper binding and the proper exchange, we are able to consume only the messages that match specific conditions. This is a really powerful and flexible feature for us because we not only route messages based on content, but also push the logic of custom event consumption onto the subscribers. This is possible because all entity declarations are part of the AMQP protocol, and each consumer application is responsible for binding its queue with its own custom condition.

Putting all these new concepts together with the last scenario, the “moderations-app” could be shown as follows:

Instead of using the fanout exchange, the “moderations-app” binds its queue to the headers exchange with the required condition. When the ‘x-match’ binding argument is ‘any’, just one matching argument is enough to accept the message into the queue. When ‘x-match’ is set to ‘all’, all conditions must match for the message to be admitted. Note that an exchange-to-exchange binding makes all messages available in the headers exchange.
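A sketch of those declarations with the Java client could look like the following. The headers exchange name (“items-feed-headers”) and queue name are assumptions for illustration; only the “items-feed” fanout exchange comes from the examples above:

import com.rabbitmq.client.BuiltinExchangeType;
import com.rabbitmq.client.Channel;
import java.util.HashMap;
import java.util.Map;

// Sketch: a headers exchange fed from the fanout exchange via an exchange-to-exchange
// binding, and a queue bound with an x-match condition so "moderations-app" only
// receives title/subtitle/description updates.
public class ModerationsAppBinding {

    public static void declare(Channel channel) throws Exception {
        // Exchange-to-exchange binding: every message published to "items-feed"
        // is also routed through the headers exchange.
        channel.exchangeDeclare("items-feed", BuiltinExchangeType.FANOUT, true);
        channel.exchangeDeclare("items-feed-headers", BuiltinExchangeType.HEADERS, true);
        channel.exchangeBind("items-feed-headers", "items-feed", "");

        // x-match=any: a single matching header is enough to deliver the message.
        Map<String, Object> condition = new HashMap<>();
        condition.put("x-match", "any");
        condition.put("title", true);
        condition.put("subtitle", true);
        condition.put("description", true);

        channel.queueDeclare("moderations-app.items", true, false, false, null);
        channel.queueBind("moderations-app.items", "items-feed-headers", "", condition);
    }
}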

Conceptually, this is how the real-time news feed is implemented. Additionally, it is worth mentioning that our production environment has a RabbitMQ cluster of three nodes, with HAProxy in front to handle client reconnection logic. The message rates of the cluster are pretty high (1k-5k msg/sec), and there are more than 20 different applications consuming the items feed in order to replicate data.

Event-Driven Applications with RabbitMQ

I must confess that in the past, here at MercadoLibre, we abused offline processing done by scheduled jobs a bit. Imagine yourself working on a cool new feature when, suddenly, you realize that some tasks could be performed asynchronously to gain performance. A common approach was storing all the necessary data in a table and then creating a job that runs every X amount of time to process the pending work. It worked OK in the past, but there were hidden implications to this approach:

  • You probably need to deal with batches. When the job starts, it is likely to fetch many pending tasks at once. Then you have to process them all together and split the results into the ones that failed and the ones that didn’t. Finally, you have to update the task states in the table. That means at least two database operations for every run and (potentially) one more if there are failed tasks.
  • Scaling issues. You probably need to add or change code if the job has to run concurrently across multiple instances so that tasks do not overlap. This can also require extra effort on database indexes for each job.
  • The pending tasks are not processed in real time because the job is not an active consumer. Processing a task depends entirely on the job scheduler.

Not cool…

Let’s break away from this database-centric approach and start coding lightweight solutions with the help of RabbitMQ.

Instead of storing the pending task in a table, why don’t we just send a message to a queue? An active consumer will be eagerly waiting for new messages and will fetch them one at a time. This allows us to focus on processing a single message and properly report the outcome of the operation to the broker. Unlike a database operation, the acknowledgement is implemented in the AMQP protocol, making it much faster. If (for any reason) there is an error, the broker will automatically resend the message; if we need to handle errors differently, we can redirect them to another queue. The result is a solution that is more maintainable, elegant, and clean.
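Here is a minimal sketch of such a worker using the RabbitMQ Java client. It shows the error-queue variant described above; the queue names and the process() step are only illustrative:

import com.rabbitmq.client.*;
import java.nio.charset.StandardCharsets;

// Sketch of a lightweight worker: it fetches one task at a time, acknowledges
// successes, and parks failures in a separate error queue.
public class TaskWorker {

    public static void start(Channel channel) throws Exception {
        channel.queueDeclare("tasks", true, false, false, null);
        channel.queueDeclare("tasks.errors", true, false, false, null);

        DeliverCallback onTask = (consumerTag, delivery) -> {
            String task = new String(delivery.getBody(), StandardCharsets.UTF_8);
            boolean succeeded;
            try {
                process(task); // application-specific work
                succeeded = true;
            } catch (RuntimeException e) {
                succeeded = false;
            }
            if (succeeded) {
                // Tell the broker the task is done so it is never redelivered.
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } else {
                // Redirect the failed task to the error queue (default exchange),
                // then acknowledge the original so this worker moves on.
                channel.basicPublish("", "tasks.errors", null, delivery.getBody());
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            }
        };

        channel.basicConsume("tasks", false, onTask, consumerTag -> { });
    }

    private static void process(String task) {
        System.out.println("Processing task: " + task);
    }
}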

Additionally, when the processing workforce is not enough, we just need to spin up more workers to scale horizontally. With modern cloud computing infrastructure, this can be achieved by launching more VMs. The broker will then automatically balance the load across consumers. The ultimate benefit is that idle workers consume tasks as soon as they arrive in the queue: a real-time experience!
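One related knob worth noting (an assumption about a typical setup, not something covered above): limiting the prefetch count makes the broker hand each worker only one unacknowledged task at a time, so freshly launched workers start draining the queue immediately instead of an already-busy worker buffering the backlog. With the Java client, that is a one-line call on the worker’s channel before basicConsume:

import com.rabbitmq.client.Channel;

// Fair dispatch sketch: each worker holds at most one unacknowledged task at a time.
public final class FairDispatch {
    public static void apply(Channel channel) throws java.io.IOException {
        channel.basicQos(1); // call before basicConsume on the worker's channel
    }
}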

Our Conclusions

We have concluded that an event-driven approach achieves better results on multiple fronts.

With our desire to give something back to the community, we have published a GitHub organization (https://github.com/mercadolibre) with several interesting open source projects like Chico UI, BigCache, and BigQueue, among others. Finally, we recently opened the platform to any external developer. You can check out the documentation of our RESTful APIs on the developer website (http://developers.mercadolibre.com).


About the Author: Pablo Molnar has been working in software development for more than 8 years. His expertise is in large-scale Java; however, in recent years he has specialized in Groovy and Grails technologies. With a strong technology background, he enjoys the design and development of large, distributed, and scalable systems. Pablo is a graduate of the National Technological University, Argentina, with a degree in Systems Engineering. Today, he is a Technical Leader at MercadoLibre (NASDAQ: MELI), working on the open platform of REST APIs. In his spare time, he keeps himself entertained experimenting with cutting-edge technologies and discovering movie plot holes.
