Why I Love Redis

Some History

I fell in love with Redis in the Spring of 2010, when the stars of the MTV show “The Buried Life” appeared on Oprah[1]. Shortly thereafter, I had a chance to send my first love letter, by featuring Redis in the original launch announcement of what became OpenStack (at the time known only as Nova)[2].

I joined Pivotal some years after they acquired Redis, and have been the sponsor of our partnership with Redis Labs since they took over as custodian of the Redis open source project, even using my keynote at RedisConf last year to announce the “McKenty Postulate.”

Correcting Some Misconceptions

Like most developers, when I first encountered Redis I treated it as a “cache,” an in-memory-only accelerator for workloads that didn’t have time for a round-trip to a “real” database, and a way to prevent an avalanche of requests from beating said database to death. But I quickly realized something amazing – Redis can save everything to disk, through point-in-time snapshots or an append-only log of every write! In other words, REDIS IS A PERSISTENT DATABASE.
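
For the curious, here is a minimal sketch of what switching on that persistence could look like from application code. Redis has two persistence mechanisms, RDB snapshots and the append-only file (AOF); this sketch only touches the AOF. The helper name `enable_append_only` is hypothetical, and `client` is assumed to be a redis-py style client exposing `config_set` and `config_get`, talking to a server that allows `CONFIG SET`.

```python
def enable_append_only(client):
    """Ask a running Redis server to log every write to its append-only file.

    `client` is assumed to be a redis-py style client (anything exposing
    config_set and config_get). This helper is illustrative, not part of Redis.
    """
    # Turn on the append-only file so each write is appended to a log on disk.
    client.config_set("appendonly", "yes")
    # Flush that log to disk roughly once per second.
    client.config_set("appendfsync", "everysec")
    # Read the setting back so the caller can confirm it took effect.
    return client.config_get("appendonly")
```

With redis-py installed and a local server running, something like `enable_append_only(redis.Redis(decode_responses=True))` would be the call to try. In production these settings would more likely live in `redis.conf` than be set at runtime.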

One of the best parts of getting started with Redis is how easy it is to, literally, get started: installing Redis gives you a server binary, which you… run, by typing `redis-server`. But when run in this fashion, it presents some limitations – it’s a single instance of Redis, running on a single server. And many folks don’t realize you can go beyond this.

Redis Labs (the current employer of Salvatore, aka @antirez, aka the BDFL of Redis) provides “Redis Enterprise”, a sharded and clustered version of Redis with none of the scaling limitations of open source Redis, and few of the operational challenges of the OSS clustering projects. In other words, REDIS SCALES.

Cognitive Overhead and Mental Models and Affordances

My day job involves coaching the development, operations and leadership teams within Fortune 500 companies on how to make continuous delivery work within their organizations. Most of the ideas our team shows up with come out of organizational psychology and the science of managing responsible autonomy, and there’s really only one secret: Cognitive Overhead is real, and it’s a silent killer.

In order to grapple with abstractions (such as “Infrastructure” or the “Economy”), we have to build a mental model that allows us to hide the endless details of such a complex system, without losing too much fidelity in our interactions. And any time we force a given team (like the developers, for example) to maintain two different mental models for the same problem space, we’re killing their efficiency.

So this is one of the reasons I love Redis: It allows me to use the same mental model for stored data, cached data, and the data structures within my application itself. No ORM is required.
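
To make the "same mental model" point concrete, here is a rough sketch of how the data structures already sitting in a Python program line up with Redis commands: a dict becomes a hash, a list becomes a list, a set becomes a set. The function name `to_redis_command` is hypothetical and is not part of any Redis client. It only builds the command as a tuple, so it runs without a server.

```python
def to_redis_command(key, value):
    """Map a plain Python value onto the Redis command for its closest equivalent.

    Illustrative helper only: returns the command as a tuple of arguments
    instead of sending it anywhere.
    """
    if isinstance(value, (dict, list, tuple, set, frozenset)) and not value:
        # Redis does not store empty hashes, lists or sets.
        raise ValueError("empty collections have no Redis representation")
    if isinstance(value, dict):
        # dict -> hash: HSET key field value [field value ...]
        flat = [item for pair in value.items() for item in pair]
        return ("HSET", key, *flat)
    if isinstance(value, (list, tuple)):
        # list -> list: RPUSH key element [element ...]
        return ("RPUSH", key, *value)
    if isinstance(value, (set, frozenset)):
        # set -> set: SADD key member [member ...] (sorted only for stable output)
        return ("SADD", key, *sorted(value))
    # anything else is treated as a plain string value
    return ("SET", key, value)


print(to_redis_command("user:1", {"name": "Ada", "lang": "py"}))
print(to_redis_command("queue", ["first", "second"]))
print(to_redis_command("tags", {"redis", "cache"}))
```

Assuming a redis-py client named `client`, each tuple could be passed along with `client.execute_command(*command)`. In practice you would more likely call the client's own `hset`, `rpush` and `sadd` methods directly.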

Lies, Damn Lies and Statistics

Remember when I said that Redis was a persistent database? I’ve probably triggered an army of angry database defenders who have some nuanced opinions about persistence and consistency models, so let’s aim for an early cease-fire, shall we?

In any distributed system (meaning any datastore that has to be bigger than a single server, or more available than a single data center), no persistence model is perfect. The tradeoffs that get made are always around adding extra “9s” to the likelihood that your data is consistent and available.

But for the tens of thousands of applications that use Redis as a long-term, durable datastore – it’s persistent *enough*. (Maybe an interesting index would be the dollars spent and operator hours of effort required per “9”, per bit.) 

A few notes about OSS and performance

As with most open source projects, there are a number of different ways to cluster and scale out Redis. But for most large enterprises, especially those facing mission-critical scenarios (think Black Friday for a major retailer), the popular solution is Redis Enterprise from Redis Labs. It supports the open source Cluster API, and has been benchmarked to sustain sub-millisecond latency as concurrent users grow by thousands, if not millions, in a matter of seconds. (Linear scaling means that by simply adding machines to the cluster, you can take Redis from 10M ops/sec with 6 servers, to 30M ops/sec with 18 servers, to 50M ops/sec with 26 servers – all while sustaining that sub-millisecond latency.)
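
A quick back-of-the-envelope check of those scaling figures, using only the numbers quoted above (the helper name `ops_per_server` is just for illustration):

```python
# Cluster sizes and throughput figures exactly as quoted in the paragraph above.
benchmarks = [(6, 10_000_000), (18, 30_000_000), (26, 50_000_000)]


def ops_per_server(servers, ops_per_sec):
    """Average throughput each server contributes to the cluster total."""
    return ops_per_sec / servers


for servers, ops in benchmarks:
    per_server_millions = ops_per_server(servers, ops) / 1e6
    print(f"{servers:>2} servers: {per_server_millions:.2f}M ops/sec per server")

# Prints:
#  6 servers: 1.67M ops/sec per server
# 18 servers: 1.67M ops/sec per server
# 26 servers: 1.92M ops/sec per server
```

The first two data points work out to the same per-server throughput, which is what linear scaling looks like. The third comes out somewhat higher per server.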

Are there faster data solutions? Probably. Have I ever needed something faster than Redis? Not yet.  

Conclusions

Are there places where Redis isn’t the right solution? Of course. For some microservices and most legacy workloads, a SQL database at the heart of your application is still the right choice, whether or not you add Redis to improve performance in some spots. And if you have a need for enterprise caching that supports transactions (with commit/rollback), event handlers, continuous querying, and in-cache code execution, then something like Apache Geode (and Pivotal Cloud Cache) is a legit option. (It can be especially helpful when you’re grappling with an older monolith, as it allows you to remove some key bottlenecks without changing any source code.) But as the most beloved data store in the world (and far and away the most downloaded image on DockerHub outside of Linux, with 819M downloads at last count), Redis is a natural choice any time you’re trying to make your developers happy. Which should be always, right? 

PS – Modules!

As if durability, scale, high performance, low cognitive overhead and an ENORMOUS developer community weren’t enough, as of the summer of 2016 Redis now has a plugin model that allows it to natively deal with JSON data (replacing most use cases for a dedicated document store), full-text search (covering a ton of indexing requirements), graphs(!) and more. Salvatore has even demonstrated an in-place machine learning module.

While developers get huge benefits from limiting the sprawl of mental models, your operations teams will thank you for addressing a dozen different use cases with a single technology.

[1] I had been managing their web presence for a number of years, which included a community site where any new user could create a “bucket list” for themselves. 

[2] Sadly, Redis was forcibly removed from Nova and replaced with MySQL later in 2010. In retrospect, making OpenStack “easier” for contributors by backing away from a completely stateless architecture should have been a warning sign of future community challenges.