In its sophomore year, the RabbitMQ Summit was perfectly timed a month after RabbitMQ 3.8 was released. With quorum queues and observability changes, this release is a game changer for RabbitMQ users. It was great to learn all the ways this release was improving the user experience. As always, it was also great to hear directly from users at WeWork, Zalando, Bloomberg, and FreeNow (formerly MyTaxi).
@DormainDrewitz leading the panel discussion at #RabbitMQ summit pic.twitter.com/IyHdMq1eT8
— Madhav Sathe (@madhav_sathe) November 4, 2019
As with last year, the summit included a panel of RabbitMQ experts. Carl Hörberg from CloudAMQP returned from last year, joined by Michael Klishin and Jack Vanlightly from Pivotal, and Ayanda Dube from Erlang Solutions. Here are my notes from moderating this esteemed panel. You can compare them with last year's 😉
Why is RabbitMQ relevant today?
We wasted no time getting to the exciting questions. The RabbitMQ project launched a dozen years ago. That's given the project time to mature, but a lot has happened in the industry. Cloud has become mainstream. Projects like Apache Kafka and Kubernetes have arrived on the scene and taken off in popularity. Reactive applications are changing how application components talk to each other. So, is RabbitMQ still relevant? Is it "cloud-native"?
An emphatic yes, says the panel, and here’s why:
- It's cloud-friendly. It supports automation and is available in all the clouds.
- It's observable, particularly with the new features in 3.8.
- It's great for decoupling application components. This is key for cloud-native application design.
- It's very flexible and supports lots of use cases, from financial services to Internet of Things (IoT).
Scaling was a mixed story. On the one hand, spinning up new clusters is seamless and a lot has improved since 3.6. But the core engineering team acknowledged that there is still work to do—although scaling beyond, say, one million messages per second may be an edge use case.
Only famous names from the RabbitMQ community on stage right now; a bunch of professionals about how widely #RabbitMQ is used today, and how its usage is just growing and growing. #RabbitMQSummit pic.twitter.com/HrL2QC1Pd4
— Lovisa Johansson (@lillajja) November 4, 2019
Most exciting features in RabbitMQ 3.8
Released in early October, RabbitMQ 3.8 has three key new features: quorum queues (for high availability), feature flags (to improve the upgrade process), and support for Prometheus and Grafana (for monitoring and observability). All three are exciting, but when pressed, the panel was most excited about… (drumroll)… quorum queues!
Here's why quorum queues are so exciting:
- They enhance the overall safety of the system.
- They’re more network efficient. With the previous model of mirrored queues, you paid a lot for intra-network communication.
- They’re based on Raft, which is great for distributed systems.
- It will be exciting to watch them get faster.
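To make the Raft point concrete, here is a minimal sketch of the majority arithmetic that Raft-based systems rely on: a group of replicas keeps making progress only while more than half of them are available. The function names are illustrative and are not part of RabbitMQ.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replicas < 1:
        raise ValueError("a queue needs at least one replica")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still available."""
    return replicas - quorum_size(replicas)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} replicas: majority={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Note that four replicas tolerate no more failures than three, which is why odd replica counts are the usual choice.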
Monitoring and observability improvements came in second. Interestingly, the core engineering team has already improved quorum queue memory performance by up to 25% because of what they learned from running load tests with enhanced observability.
Upgrading to RabbitMQ 3.8
With all the exciting features in 3.8, you might wonder why everyone isn't running it already. Looking at data from RabbitMQ hosting service CloudAMQP, a third of its users haven't even moved to 3.7 yet! There are many reasons for this, but it turns out that upgrading RabbitMQ under load with minimal downtime can be tricky. Feature flags will make this process easier going forward. With feature flags, operators can perform rolling upgrades of a cluster, rather than requiring a cluster-wide shutdown. So, what should users be thinking about today as they plan their upgrades to 3.8?
- Make sure you've provisioned more resources.
- Consider a blue-green deployment.
- Check out the upgrade guide.
Tuning RabbitMQ for performance
RabbitMQ performance is always a hot topic. From design, to resource planning, to monitoring, there are lots of things to consider when tuning RabbitMQ for performance. With RabbitMQ 3.8, some of these considerations are changing. Here's what the panelists highlighted.
- Performance optimization starts with metrics. Make sure you have a monitoring system in place before going into production. That data is critical to understanding the causes of performance issues.
- Designing for optimal performance is a discipline in itself. Make sure you have the right resources for your cluster. Running out of memory is the most common issue, often due to a high number of connections.
- Don't use features (replication, persistence, etc.) that you don't need. The sheer number of feature combinations has exploded. This can make it hard to predict performance, so use a load tester to understand your system.
- Be careful how you use dead lettering. If you dead-letter a whole queue at the same time, you'll see a performance hit.
- Be mindful of data locality. Which node a client connects to affects inter-node traffic and latency, in particular for consumers. The HTTP API can be used to discover which node hosts the master replica for a particular queue. This is an area of improvement in clients and at the protocol level.
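As a rough illustration of the data locality tip, the sketch below asks the management plugin's HTTP API for a queue's details and reads the `node` field of the response. The base URL, credentials, and queue name are assumptions chosen for the example; the panel did not prescribe a specific approach.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def queue_url(base_url: str, vhost: str, queue: str) -> str:
    """Build the management API URL for one queue.

    The vhost is percent-encoded, so the default vhost "/" becomes "%2F".
    """
    return (f"{base_url.rstrip('/')}/api/queues/"
            f"{quote(vhost, safe='')}/{quote(queue, safe='')}")


def hosting_node(queue_info: dict) -> str:
    """Return the node name reported for the queue in the API response."""
    return queue_info["node"]


def fetch_hosting_node(base_url: str, vhost: str, queue: str,
                       user: str, password: str) -> str:
    """Query the management API with HTTP basic auth and return the node."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        queue_url(base_url, vhost, queue),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return hosting_node(json.load(response))


# Example call (assumed values; requires a reachable management endpoint):
# fetch_hosting_node("http://localhost:15672", "/", "orders", "guest", "guest")
```

A consumer could use the returned node name to decide which cluster member to connect to, though how node names map to reachable hostnames depends on the deployment.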
Incredibly insightful panel discussion between @vanlightly, @dube_aya, @carlhoerberg and @michaelklishin. We're all ears. #rabbitmq #rabbitmqsummit pic.twitter.com/4LsXzCtNSS
— Erlang Solutions (@ErlangSolutions) November 4, 2019
Client behavior
One of the things that makes RabbitMQ so appealing is the wide range of languages supported by its client libraries. But there is inevitably some variation in what's available to different clients. One common challenge is how clients behave when a node goes down. How does it reconnect?
- Project Reactor adds some useful capabilities, but that's only available for Java.
- There's a knowledge gap for many users when it comes to the underlying AMQP protocol. However, many users don't need to know it and are unlikely to learn it. (Last year, Gavin Roy gave a great talk on this, by the way.) There may be ways to abstract away how to create new connections and reuse connections in client libraries.
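One way to picture the reconnection question is a small failover loop that cycles through the cluster's nodes and backs off between attempts. This is only a sketch under stated assumptions: `connect` is a hypothetical callable standing in for whatever a real client library uses to open a connection, and the delays are arbitrary. It does not describe how any particular RabbitMQ client behaves.

```python
import itertools
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(
    hosts: Sequence[str],
    connect: Callable[[str], T],      # hypothetical: opens a connection to one host
    max_attempts: int = 6,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each host in turn, doubling the wait after every failed attempt."""
    if not hosts:
        raise ValueError("at least one host is required")
    last_error = None
    for attempt, host in zip(range(max_attempts), itertools.cycle(hosts)):
        try:
            return connect(host)
        except ConnectionError as error:
            last_error = error
            # Skip the wait after the final attempt; there is nothing left to retry.
            if attempt < max_attempts - 1:
                sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ConnectionError(
        f"no node reachable after {max_attempts} attempts"
    ) from last_error
```

Injecting `sleep` keeps the loop easy to test without real waiting. A production client would typically also need to re-declare topology and resume consumers after reconnecting, which this sketch leaves out.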
How healthy is the RabbitMQ community?
There are lots of ways to measure the health of an open source community. Some are admittedly vanity metrics (stars on GitHub?), so it takes a few angles to get a clear picture. I asked the panel how they measure the RabbitMQ community's health. Here's what they had to say:
- The Google Groups mailing list is large and growing, but not growing quickly. Admittedly, the search experience is (ironically) not great and, as a Google-based service, it's not available to all.
- Pull requests are also growing, but not rapidly.
- The vendors with offerings around RabbitMQ get a lot of inquiries, from all over. Interestingly, 84codes' hosted RabbitMQ offering (CloudAMQP) sees continuous growth and demand for larger clusters, but its Kafka offering (CloudKarafka) hasn't seen as much growth.
I'll add that the This Month in RabbitMQ blog series surfaces a steady pace of community writings and resources published around the Internet. If you haven't checked that out, the team began publishing a monthly round-up of updates about a year ago on the RabbitMQ Blog.
Learn more about RabbitMQ 3.8 and how to upgrade:
– Webinar: What's new in RabbitMQ 3.8
– Webinar: Understanding RabbitMQ: For Developers and Operators
– Guide to upgrading RabbitMQ