An update about Redis 2.6 and Sentinel

Thursday, 23 August 12
I'm back to work after two good weeks of vacation, and I hope you enjoyed your days of rest as well. It's a bit hard to start again after many days of pause, but at the same time it feels good to work on Redis 2.6 and Redis Sentinel again. This blog post is an update about the state of these two projects.

Redis 2.6 ETA

When Redis 2.6 will ship as a stable release is one of the most frequently asked questions in the Redis community lately. Actually there is no ETA for Redis 2.6 stable, because the idea is to do a stable release only once the test coverage improves and a few more known bugs are fixed.

You can check the open issues with the 2.6 milestone for a complete list.

About the coverage, that's the current state of affairs.

But the reality is that my suggestion to you as a Redis user is not to care about labels like release candidate or stable. What matters, after all, is just the actual level of stability. Redis 2.6 will reach production-level stability before it goes out of release candidate, and I'll make sure to make this information public.

What is production ready?

There are probably many more software engineering books than there should be ;) So despite the general title of this section, my goal here is to state what stable means for me and for the Redis project. After all there are a number of possible metrics, but mine is very simple and has worked very reliably over the last three years.
A Redis branch without active development (major changes to the source code that are not just additions without impact on what we already have) is stable when the number of critical bugs discovered in the last 8 weeks is close to zero.

In this context a critical bug is a bug that corrupts data, crashes the server, or creates an inconsistency between master and slave, and that, at the same time, is not triggered by an edge case so extreme that it is unlikely anyone would run into it involuntarily.

Note that bugs discovered both in the beta release and in the current stable release are not counted here, as they are not specific to the beta release.
Usually in a new Redis beta release there is a period where bugs are rarely discovered because too few users are actually using the release. At some point more and more users start to switch to the new release, so the rate of reports rises. Later we fix enough of the major things that the remaining bugs are hard enough to discover that for months no critical bug is found at all. This is when something can be considered production ready for me.
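
The metric above is easy to express in code. What follows is only a sketch in Python: the function name, the report dates, and the choice of a threshold of exactly zero for "close to zero" are assumptions made up for illustration, not something taken from the Redis issue tracker.

```python
from datetime import date, timedelta


def is_production_ready(critical_bug_dates, today, window_weeks=8, threshold=0):
    """Return True if at most `threshold` critical bugs were reported
    in the `window_weeks` weeks before `today`.

    threshold=0 is an assumed reading of "close to zero".
    """
    cutoff = today - timedelta(weeks=window_weeks)
    recent = [d for d in critical_bug_dates if cutoff < d <= today]
    return len(recent) <= threshold


# Hypothetical report dates for critical bugs specific to the new branch.
reports = [date(2012, 5, 2), date(2012, 6, 20), date(2012, 8, 10)]

# A critical bug was reported two weeks before this date: not ready.
ready_now = is_production_ready(reports, today=date(2012, 8, 23))

# More than 8 weeks after the last report: ready.
ready_later = is_production_ready(reports, today=date(2012, 10, 20))
```

Bugs that also affect the current stable release would simply be left out of the list, since they are not specific to the beta release.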

Is 2.6 production ready?

Redis 2.6 is still not production ready because critical bugs are still found at too high a rate, even if this rate is slowing down consistently. We need more weeks of work.

Also there are critical known issues, such as Issue #614, that need to be addressed by changing a few things in the 2.6 internals.

Once I feel that the discovery of a critical bug is as likely in 2.4 as in the current 2.6 release candidate, I'll make sure to inform the community. I hope that this will happen in one or two months at most, as the rate at which bugs are discovered in the 2.6 branch is encouraging, but I can't predict the future, so take this ETA with a grain of salt.

About issue #614

Issue #614 is a pretty interesting affair IMHO, as it shows how an error in the design, namely not picking the simplest approach that could possibly work, caused a number of issues in the history of Redis development.

Basically while Redis is a simple system, once you start mixing blocking operations such as BRPOP / BLPOP / BRPOPLPUSH with replication and scripting, things can easily get pretty convoluted.

In the old days we already had blocking operations such as BRPOP: operations that block the client if there is no data to fetch from the list (because it is empty, that is, non-existing). Once an element is pushed on the list, the first client waiting in the queue for this list is unblocked, as there is finally data to pick.

However in the past there were only two ways to push elements into a list: LPUSH or RPUSH. But guess what, if a list is empty these two operations are the same, so conceptually there was actually only one way to push elements into a list.

In this simplified world, if you push an element and there is another client waiting for it, you can just pass the element to the client! So it's a no-op from the point of view of the data set, the replication link, and the append only file. What a great premature optimization, eh? So my mistake was to handle blocking operations in a synchronous way: Redis currently unblocks and serves the client waiting for a push directly in the context of the execution of the push operation.
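
To make the synchronous design concrete, here is a toy model in Python. It is not the Redis source code: the ToyServer class, its fields, and its method names are all hypothetical, and a list of tuples stands in for the replication link and the append only file.

```python
from collections import deque


class ToyServer:
    """Toy model of the old design: a push serves a blocked client directly."""

    def __init__(self):
        self.lists = {}            # key -> deque of elements
        self.waiting = {}          # key -> deque of blocked client names
        self.delivered = []        # (client, element) pairs handed to clients
        self.replication_log = []  # stand-in for replication link and AOF

    def brpop(self, client, key):
        # Only the blocking case is modeled: the list is assumed to be empty.
        self.waiting.setdefault(key, deque()).append(client)

    def lpush(self, key, value):
        waiters = self.waiting.get(key)
        if waiters:
            # Hand the element straight to the first waiting client.
            client = waiters.popleft()
            if not waiters:
                del self.waiting[key]
            self.delivered.append((client, value))
            return  # nothing stored, nothing replicated: a no-op
        self.lists.setdefault(key, deque()).appendleft(value)
        self.replication_log.append(("LPUSH", key, value))


server = ToyServer()
server.brpop("client-1", "mylist")  # client-1 blocks on the empty list
server.lpush("mylist", "a")         # goes directly to client-1
server.lpush("mylist", "b")         # nobody is waiting: stored and replicated
```

In this model the first push never touches the data set or the log, which is exactly the shortcut that becomes a problem once pushes can have more than one effect.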

But then we introduced BRPOPLPUSH, which can push as a side effect of waiting for an element. Later we also introduced variadic list push operations. Things were no longer as simple as before. For instance, if there is one client waiting for elements in mylist, and another client does LPUSH mylist a b c d, then we should replicate only LPUSH mylist b c d, as the first element was consumed by the blocking client. Well, that and a zillion other more complex cases, as BRPOPLPUSH can in turn push an element to another list that has clients waiting with blocked operations, and so forth.
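
The LPUSH mylist a b c d case can be sketched as a small Python function. This is only an illustration of the rewriting problem, not the actual Redis code: the function name and the representation of commands as tuples are assumptions.

```python
def variadic_lpush_sync(key, values, waiting_clients):
    """Model a variadic LPUSH under the synchronous design.

    Elements are handed to waiting clients first (in queue order); only the
    elements that really end up in the list appear in the command that has
    to be replicated. Returns (delivered, replicated_command).
    """
    waiting = list(waiting_clients)  # do not mutate the caller's list
    delivered = []
    stored = []
    for value in values:
        if waiting:
            delivered.append((waiting.pop(0), value))
        else:
            stored.append(value)
    # If every element was consumed, there is nothing to replicate at all.
    replicated = ("LPUSH", key, *stored) if stored else None
    return delivered, replicated


delivered, replicated = variadic_lpush_sync(
    "mylist", ["a", "b", "c", "d"], ["client-1"]
)
```

Here delivered is [("client-1", "a")] and replicated is ("LPUSH", "mylist", "b", "c", "d"): the command sent to slaves is no longer the command the client issued.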

Eventually I reworked the core so that everything worked, in the form of a few recursive functions and a more complex replication layer that was able to alter the commands in the replication and AOF link, and that was also able to replicate multiple commands for a single call.

But guess what? This does not work with scripting. We can't rewrite scripts to exclude a few elements from their effects. Ultimately the reality is that my implementation of this stuff sucked.

The fix is conceptually easy: it is just the simplest thing that could possibly work. I simply need to rewrite that code to avoid serving blocking clients in the context of the push operation. On push we'll just mark the keys that had clients waiting for data and that actually received data. Then, once the command returns, we can serve those clients, in two easy separate steps. So there is no longer any need to alter commands in the replication link to half-push stuff. We just push everything, so that the effects of the push operation (or script) are fully reflected in the dataset, in the replication link, and in the AOF file. Then there is the pop stage.
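
The two steps can be sketched with a toy Python model. Again this is not the Redis implementation: the class and method names are hypothetical, and logging the pop stage as a plain RPOP is an assumption made here only to show that no command needs to be rewritten.

```python
from collections import deque


class ToyServer:
    """Toy model of the two-step design: push fully, then serve waiters."""

    def __init__(self):
        self.lists = {}            # key -> deque of elements
        self.waiting = {}          # key -> deque of blocked client names
        self.ready_keys = []       # keys with waiters that received data
        self.delivered = []        # (client, element) pairs handed to clients
        self.replication_log = []  # stand-in for replication link and AOF

    def block(self, client, key):
        self.waiting.setdefault(key, deque()).append(client)

    def lpush(self, key, *values):
        # Step 1: push everything and replicate the command unmodified.
        lst = self.lists.setdefault(key, deque())
        for value in values:
            lst.appendleft(value)
        self.replication_log.append(("LPUSH", key, *values))
        # Only mark the key: no client is served inside the push itself.
        if key in self.waiting and key not in self.ready_keys:
            self.ready_keys.append(key)

    def serve_ready_keys(self):
        # Step 2: after the command returns, serve the blocked clients.
        ready, self.ready_keys = self.ready_keys, []
        for key in ready:
            lst = self.lists.get(key)
            waiters = self.waiting.get(key)
            while lst and waiters:
                client = waiters.popleft()
                value = lst.pop()  # BRPOP takes from the tail
                self.delivered.append((client, value))
                # Assumption: the pop stage is logged as an ordinary pop.
                self.replication_log.append(("RPOP", key))
            if lst is not None and not lst:
                self.lists.pop(key, None)
            if waiters is not None and not waiters:
                self.waiting.pop(key, None)

    def execute_lpush(self, key, *values):
        self.lpush(key, *values)
        self.serve_ready_keys()


server = ToyServer()
server.block("client-1", "mylist")
server.execute_lpush("mylist", "a", "b", "c", "d")
```

The log now contains the full LPUSH followed by a separate pop, so the same approach also works when the push happens inside a script: the script is replicated as it is, and the pops come after.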

Long story short, I'll fix this in the coming days by rewriting part of the core of Redis 2.6. This will not help stability, as it touches proven (but partially broken) code, but it will give us fewer bugs in the future.

Redis Sentinel

I couldn't be happier with Redis Sentinel: it was just an idea a few weeks ago, and now it's a working system. Because Redis Sentinel is completely self-contained, I'm also tempted to merge it into 2.6 once it is stable and tested enough. We need a few more weeks and many more users to check the real degree of stability of Sentinel.

However there are a few more things that should be addressed ASAP, and this is my short term TODO list of things you'll see committed in the coming days and weeks:

  • Support slave priority. Sentinel actually already has this concept internally, but Redis slaves don't publish a priority in INFO output currently. The lower the priority, the more suitable for promotion a slave is. A priority of zero however means: never promote me as a master.
  • SLAVEOF sentinel://mastername/ip:port,ip:port,ip:port... Now that we have Sentinel we can use it to make configuration simpler. This form of SLAVEOF will simply query a Sentinel among the listed ip:port pairs to discover what the current master with the specified name is, and replicate from it.
  • Support for AUTH in Sentinel.
  • Check for -BUSY and send SCRIPT KILL before going into ODOWN condition.
  • INFO command for Sentinel.
  • A number of minor but important changes to the state machine that can improve reliability of Sentinel under unexpected conditions.
  • Update the documentation.
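
Two of the items above, slave priority and the sentinel:// form of SLAVEOF, can be illustrated with a short Python sketch. Both features are still on the TODO list, so nothing here reflects real Sentinel code: the function names, the tuple layout, and the example addresses and ports are hypothetical.

```python
def pick_slave_for_promotion(slaves):
    """Pick a slave from a list of (name, priority) pairs.

    Lower priority means more suitable for promotion; a priority of zero
    means the slave must never be promoted. Returns None if no slave is
    eligible. Ties go to the first slave listed (an arbitrary choice here).
    """
    candidates = [(name, prio) for name, prio in slaves if prio > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[1])[0]


def parse_sentinel_target(url):
    """Split sentinel://mastername/ip:port,ip:port into its parts.

    Returns (master_name, [(ip, port), ...]).
    """
    prefix = "sentinel://"
    if not url.startswith(prefix):
        raise ValueError("not a sentinel:// target: %r" % url)
    master_name, _, address_list = url[len(prefix):].partition("/")
    addresses = []
    for pair in address_list.split(","):
        host, _, port = pair.rpartition(":")
        addresses.append((host, int(port)))
    return master_name, addresses


chosen = pick_slave_for_promotion([("s1", 100), ("s2", 0), ("s3", 10)])
target = parse_sentinel_target(
    "sentinel://mymaster/10.0.0.1:26379,10.0.0.2:26379"
)
```

A slave would then ask each listed Sentinel in turn for the current master with that name, and the first answer would be used as the replication target.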

Well, I should also start to blog more, even at the cost of getting a few less things done every week. Please, if I don't post an update on Redis every week, ping me on Twitter and tell me I'm a charlatan ;)