Everything about Redis 2.4

Friday, 29 July 11

A few months ago I realized that the cluster support for Redis, currently in development in the unstable branch, would take some time to ship in a stable release, and would require significant changes to the Redis core.

At the same time Pieter and I already had a number of good things not related to cluster in our development code: delaying everything until the stable cluster release was not acceptable. So I took a different path, forking 2.2 into 2.4, and merging my and Pieter's developments (at least the ones compatible with the 2.2 code base) into this new branch. In other words 2.4 was possible because git rules.

2.4 delayed the work on unstable, but this was a good compromise after all. And now the effort has finally reached a form that is close to stable, as we are at release candidate number five. You can find Redis 2.4-rc5 in the download section of the Redis site, and in a few weeks this will be rebranded Redis 2.4.0-stable if no critical bugs are discovered.

This article is going to show you in detail all the new things introduced in Redis 2.4. Before continuing: no, scripting is not included. It will be released with Redis 2.6, which will be based on the Redis unstable code base instead. Redis 2.6 is planned for this fall.

The following is a summary of all the changes contained in 2.4. We'll show every one in detail in the course of the article.

Small sorted sets now use significantly less memory.
RDB Persistence is much much faster for many common data sets.
Many write commands now accept multiple arguments, so you can add multiple items into a Set or List with just a single command. This can improve the performance in a pretty impressive way.
Our new allocator is jemalloc.
Less memory is used by the saving child, as we reduced the amount of copy on write.
INFO is more informative. However it is still the old 2.2-like INFO, not the new one in unstable that is composed of subsections.
The new OBJECT command can be used to introspect Redis values.
The new CLIENT command allows for connected clients introspection.
Slaves are now able to connect to the master instance in a non-blocking fashion.
Redis-cli was improved in a few ways.
Redis-benchmark was improved as well.
Make is now colorized ;)
VM has been deprecated.
In general Redis is now faster than ever.
We have a much improved Redis test framework.

Everything on this list was coded by me and Pieter Noordhuis, but feedback from users was really helpful. Special thanks go to Hampus Wessman, who spotted and fixed interesting bugs.

VMware kindly sponsored all our work as usual. Thanks!

Memory optimized Sorted Sets

One of the most interesting changes in Redis 2.2 was the support for memory optimized small values. Why represent a Redis List as a linked list if it only has 10 elements, for instance? If you have a billion lists this is going to take a lot of space, since there are a lot of pointers, many allocations each with its own overhead, and so forth.

So we introduced the ability to switch encoding on the fly. Lists, Sets, Hashes, all start encoded as a single blob that uses little memory, even if it requires O(N) algorithms to do things that are otherwise O(1). But once a given threshold is reached Redis converts these values into the old representation. Since N is bounded by the threshold, the amortized time to perform the operation on the element is still O(1), but we use a lot less memory. Many datasets are composed of millions of small lists, hashes, and so forth.
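To make the idea concrete, here is a minimal sketch of a value that starts as one compact blob and converts itself once a threshold is crossed. It is a toy model and not the Redis implementation: the class name, the one-byte length prefix, and the threshold of 4 entries are all assumptions picked for the demo, and a plain Python list stands in for the general representation.

```python
class SmallList:
    """Toy list: one compact blob while small, a regular list once it grows."""

    MAX_COMPACT_ENTRIES = 4  # hypothetical threshold, chosen only for this demo

    def __init__(self):
        self.encoding = "compact"
        self._blob = bytearray()  # each entry: 1 length byte followed by the payload
        self._items = None        # general representation, used after conversion
        self._count = 0

    def _decode_blob(self):
        # Walk the blob entry by entry: this is the O(N) part.
        items, pos = [], 0
        while pos < len(self._blob):
            size = self._blob[pos]
            items.append(bytes(self._blob[pos + 1:pos + 1 + size]))
            pos += 1 + size
        return items

    def push(self, value: bytes):
        if self.encoding == "compact":
            if self._count < self.MAX_COMPACT_ENTRIES and len(value) < 256:
                self._blob.append(len(value))
                self._blob.extend(value)
                self._count += 1
                return
            # Threshold reached: convert once to the general representation.
            self._items = self._decode_blob()
            self._blob = None
            self.encoding = "general"
        self._items.append(value)
        self._count += 1

    def get(self, index: int) -> bytes:
        if self.encoding == "compact":
            return self._decode_blob()[index]  # linear scan, but N is tiny
        return self._items[index]
```

The caller never sees the switch: `push` and `get` behave the same before and after the conversion, only the memory layout changes.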

However in Redis 2.2 we applied this optimization to everything but Sorted Sets. Redis 2.4 finally brings this optimization to Sorted Sets as well, as we discovered that there are many users also using data sets with many many small sorted sets. And this brings us to the next point...

Faster RDB persistence

If our small values are encoded as blobs, this means we can do something very interesting from the point of view of persistence: these values are already serialized!

The kind of representation we use for small values does not have pointers or the like. The only change that we required was to put all the integers (lengths and relative offsets in the encoding format) in an endianness independent form. I used little endian encoding as this means no conversion most of the time.

This is a huge win from the point of view of RDB persistence. In Redis 2.2, saving a hash with ten fields represented as a zipmap (this is one of our special encoding formats) required iterating the hash and saving every field and value as a different logical object in the RDB format.

Now instead we save the serialized value as it is in memory! Many datasets are now an order of magnitude faster to load and save. This also means that Redis 2.2 can't read datasets saved with 2.4.
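A minimal sketch of why a pointer-free blob with fixed-endian integers is "already serialized". The layout below (a 4-byte pair count, then 4-byte length prefixes) is a hypothetical one invented for illustration, it is not the real zipmap or RDB format. The point is only that the same bytes can be written to disk and read back on any host without walking individual fields.

```python
import struct


def encode_small_hash(fields: dict) -> bytes:
    """Pack field/value pairs into one blob using little-endian length prefixes."""
    out = bytearray(struct.pack("<I", len(fields)))  # number of pairs
    for key, value in fields.items():
        for item in (key, value):
            out += struct.pack("<I", len(item))      # "<" forces little endian
            out += item
    return bytes(out)


def decode_small_hash(blob: bytes) -> dict:
    """Rebuild the dict from the blob; needed only when the value is accessed."""
    (count,) = struct.unpack_from("<I", blob, 0)
    pos = 4
    result = {}
    for _pair in range(count):
        parts = []
        for _item in range(2):
            (size,) = struct.unpack_from("<I", blob, pos)
            pos += 4
            parts.append(blob[pos:pos + size])
            pos += size
        result[parts[0]] = parts[1]
    return result


blob = encode_small_hash({b"name": b"redis", b"version": b"2.4"})
# "Saving" is just writing blob as-is; "loading" is just reading it back.
restored = decode_small_hash(blob)
```

On a little-endian machine the `"<I"` packing is a plain memory copy, which matches the "no conversion most of the time" remark above.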

Variadic write commands

Finally many write commands are able to take multiple values! This is the full list:

SADD set val1 val2 val3 ... -- now returns the number of elements added (not already present).
HDEL hash field1 field2 field3 ... -- now returns the number of elements removed.
SREM set val1 val2 val3 ... -- now returns the number of elements removed.
ZREM zset val1 val2 val3 ... -- now returns the number of elements removed.
ZADD zset score1 val1 score2 val2 ... -- now returns the number of elements added.
LPUSH/RPUSH list val1 val2 val3 ... -- return value is the new length of the list, as usual.

The speed at which Redis processes commands is usually not bound by the time needed to alter the data set, but by the time spent in I/O, dispatching, and sending the reply back. So packing many values into a single command gives some applications an impressive speed improvement.

Just an example:

> redis-cli del mylist
(integer) 1
> ./redis-benchmark -n 100000 lpush mylist 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
====== lpush mylist 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ======
  100000 requests completed in 1.28 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

99.93% <= 1 milliseconds
99.95% <= 2 milliseconds
100.00% <= 2 milliseconds
78247.26 requests per second

> redis-cli llen mylist
(integer) 2101029

Yes, we added two million items into a list in 1.28 seconds, with a networking layer between us and the server. Just saying...
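To see where the saving comes from, here is a minimal sketch that encodes the same 21 values once as a single variadic LPUSH and once as 21 separate LPUSH requests, using the Redis multi-bulk request format (`*<argc>`, then `$<len>` before each argument). The helper name `encode_command` is mine, not part of any client library, and the sketch only counts requests and bytes on the wire; it does not talk to a server.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in the Redis multi-bulk request format."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


values = [str(n) for n in range(21)]  # the 0..20 payload used in the benchmark

# One request carrying every value, versus one request per value.
variadic = [encode_command("LPUSH", "mylist", *values)]
one_by_one = [encode_command("LPUSH", "mylist", v) for v in values]

print("variadic:  ", len(variadic), "request,", sum(map(len, variadic)), "bytes")
print("one by one:", len(one_by_one), "requests,", sum(map(len, one_by_one)), "bytes")
```

Besides the smaller payload, the variadic form means one dispatch and one reply instead of 21, which is where most of the time goes.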

You may ask why we turned only a specific set of commands into variadic versions. We did it for all the commands where the return value would not require a type change, nor depend on the number of arguments. For all the rest there will be scripting... doing this and more for you :)

Jemalloc FTW

The jemalloc affair is one of our most fortunate uses of external code ever. If you follow Redis development you know I'm not exactly the kind of guy who gets excited about linking some big project to Redis without some huge gain. We don't use libevent, our data structures are implemented in small .c files, and so forth.

But an allocator is a serious thing. Since we introduced the specially encoded data types Redis started suffering from fragmentation. We tried different things to fix the problem, but basically the Linux default allocator in glibc sucks really, really hard.

Including jemalloc inside of Redis (no need to have it installed on your computer, just download the Redis tarball as usual and type make) was a huge win. Every single case of fragmentation in real world systems was fixed by this change, and also the amount of memory used dropped a bit.

So now we build on Linux using jemalloc by default. Thanks jemalloc! If you are on OS X or *BSD you can still force a jemalloc build with make USE_JEMALLOC=yes, but those other systems have a sane libc malloc so usually this is not required. Also a few of those systems use jemalloc-derived libc malloc implementations.

Less copy-on-write

Redis RDB persistence and the AOF log rewriting system are based on the fork() memory semantics of modern operating systems. While the child is writing the new AOF or an RDB file, it is cool to have the operating system preserving a point-in-time copy of the dataset for us, but every time we change a page of memory in the parent process that page gets duplicated. This is known as copy-on-write, and is responsible for the additional memory used by the saving child in Redis.
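The point-in-time behavior can be observed with a few lines of code. This is a minimal POSIX-only sketch (it relies on `os.fork`), and the function name and the tiny dict used as "dataset" are made up for the demo. The child plays the role of the saving process: it reads the data only after the parent has already changed it, and still sees the old value.

```python
import os


def child_sees_snapshot() -> tuple:
    """Return (value seen by the child, value seen by the parent)."""
    dataset = {"key": "old"}
    go_r, go_w = os.pipe()    # parent -> child: "you may read now"
    res_r, res_w = os.pipe()  # child -> parent: the value the child saw

    pid = os.fork()
    if pid == 0:
        # Child: stands in for the process writing the RDB or AOF file.
        try:
            os.close(go_w)
            os.close(res_r)
            os.read(go_r, 1)  # block until the parent has modified its copy
            os.write(res_w, dataset["key"].encode())
        finally:
            os._exit(0)       # leave without running the parent's cleanup code

    os.close(go_r)
    os.close(res_w)
    dataset["key"] = "new"    # parent write: the kernel copies the touched page
    os.write(go_w, b"x")
    seen_by_child = os.read(res_r, 16).decode()
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(res_r)
    return seen_by_child, dataset["key"]


print(child_sees_snapshot())
```

Every page the parent touches while the child is alive exists twice in memory, so the fewer pages the parent dirties during a save, the less extra memory is used.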

We made different changes in the past in order to reduce copy on write, but one of the last changes needed was still not implemented, related to the internal working of our hash table iterator. Finally 2.4 has this change. It is interesting to note that I made the mistake of back porting this change into Redis 2.2. This was responsible for many bugs in the recent history of Redis 2.2. In the future I'll continue to be conservative as I was in the past, and will make just minimal changes in stable releases.

The additional copy on write in Redis 2.2 looked like a bug, but fixing that bug with a patch involving several changes to the core was surely not a good idea. Waiting for the next release is almost always the right thing to do in the case of non critical bugs.

More fields in INFO

The new INFO in unstable is much better compared to the one in 2.2 and 2.4. It was not a good idea to backport it into 2.4 as the code was too different, but the new 2.4 INFO has a few interesting new fields, especially these two:

used_memory_peak:185680824
used_memory_peak_human:177.08M

Your RSS and your fragmentation rate are usually related to the peak memory usage. Now Redis is able to hold this information, and this is very useful for memory related troubleshooting.

So for instance if you have an RSS of 5 GB but your DB is almost empty, are you sure it has always been empty? Now you just have to look at this field.
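If you want to read these fields from a script, INFO output is easy to parse since it is one `name:value` pair per line. A minimal sketch follows; the function names are mine, and the sample text is just the two lines shown above rather than a full INFO reply.

```python
def parse_info(text: str) -> dict:
    """Turn INFO output (one name:value pair per line) into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def human_megabytes(num_bytes: int) -> str:
    """Format a byte count as megabytes with two decimals."""
    return "%.2fM" % (num_bytes / (1024 * 1024))


sample = "used_memory_peak:185680824\nused_memory_peak_human:177.08M\n"
info = parse_info(sample)
peak = int(info["used_memory_peak"])
print(peak, human_megabytes(peak))
```

Comparing the peak with the current memory usage tells you quickly whether a large RSS is just the leftover of a past spike.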

Two new introspection commands: OBJECT and CLIENT

The DEBUG command was already able to show a few interesting pieces of information about Redis objects. However you can't count on DEBUG, as this command is not required to be stable over time, and should never be used except to hack on the Redis code base.

The OBJECT command exposes some interesting information about Redis values in a space that is accessible and usable by developers.

You can find the full documentation of the OBJECT command here.

Another interesting new command is the CLIENT command. Using this command you are able to both list and kill clients. I'm sorry but I still have to write the documentation for this command, so here I'll show an interactive usage example:

redis 127.0.0.1:6379> client list
addr=127.0.0.1:49083 fd=5 idle=0 flags=N db=0 sub=0 psub=0
addr=127.0.0.1:49085 fd=6 idle=9 flags=N db=0 sub=0 psub=0

We got the list of clients, and some info about what they are doing (or not doing, see the idle field). Now it's time to kill some client:

redis 127.0.0.1:6379> client kill 127.0.0.1:49085
OK
redis 127.0.0.1:6379> client kill 127.0.0.1:49085
(error) ERR No such client
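Since every line of CLIENT LIST is a sequence of `name=value` pairs, picking the clients to kill from a script is straightforward. The following is a minimal sketch working on the output shown above. The function names are mine, and I'm assuming the idle field is expressed in seconds; the addresses it returns are what you would pass to CLIENT KILL.

```python
def parse_client_list(text: str) -> list:
    """Parse CLIENT LIST output into one dict per connected client."""
    clients = []
    for line in text.splitlines():
        if not line.strip():
            continue
        clients.append(dict(pair.split("=", 1) for pair in line.split()))
    return clients


def idle_clients(clients: list, min_idle: int) -> list:
    """Return the addr of every client idle for at least min_idle (assumed seconds)."""
    return [c["addr"] for c in clients if int(c["idle"]) >= min_idle]


output = (
    "addr=127.0.0.1:49083 fd=5 idle=0 flags=N db=0 sub=0 psub=0\n"
    "addr=127.0.0.1:49085 fd=6 idle=9 flags=N db=0 sub=0 psub=0\n"
)
clients = parse_client_list(output)
print(idle_clients(clients, 5))
```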

Non blocking slave connect

Redis master-slave replication was already a non blocking process for almost everything but the connect(2) call performed by the slave to the master.

This is finally fixed. A small change, but with a significantly better behavior compared to the past. We still need to fix a few things about replication, but we'll make other changes in order to make replication better for cluster. Redis Cluster uses replication in order to maintain copies of nodes, so you can expect that as cluster evolves replication will also evolve.
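For readers not familiar with the technique, this is roughly what a non blocking connect looks like at the socket level. It is a minimal sketch of the general pattern and not the Redis code: the function names are mine, and `select` with a timeout stands in for the event loop that a real server would use to be notified when the socket becomes writable.

```python
import errno
import os
import select
import socket


def start_connect(host: str, port: int) -> socket.socket:
    """Begin a TCP connect without blocking the caller."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex((host, port))
    # EINPROGRESS is the normal answer: the handshake continues in background.
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


def finish_connect(sock: socket.socket, timeout: float) -> bool:
    """Once the socket is writable, SO_ERROR says whether the connect succeeded."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return False
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
```

Between `start_connect` and `finish_connect` the process is free to keep serving clients, which is exactly what a slave could not do while stuck inside a blocking connect(2).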

Better redis-cli and redis-benchmark

Redis-cli is now able to do more interesting things. For instance you can now prefix a command with a number to run the command multiple times:

redis 127.0.0.1:6379> 4 ping
PONG
PONG
PONG
PONG

Another interesting change is the ability to reconnect to an instance if the link goes down: redis-cli retries the connection every time you type a command.

Finally redis-cli can now be used to monitor INFO parameters together with grep. In the following example we display the memory usage every second.

./redis-cli -r 10000 -i 1 info | grep used_memory_human
used_memory_human:909.22K
used_memory_human:909.22K
used_memory_human:909.22K

Redis-benchmark was also improved, and now you can specify the exact command to benchmark, which is an awesome change. You can see an example run of the new redis-benchmark in the section about variadic commands.

Other improvements

An important change in Redis 2.4 is that it is the last version of Redis featuring VM. Redis will warn you that it is not a good idea to use VM, as we are no longer going to support it in future versions of Redis, as already discussed many times.

Also the new test suite is much faster; we have a full article about this change. The new continuous integration is also helpful, running our code base under valgrind multiple times every hour.

Another interesting change is the colorized make process ;) You may think this is just a fancy thing, but actually it is much simpler to see compilation warnings this way.

I hope you'll enjoy Redis 2.4, and a big thank you to all the Redis community! Since it's Friday, have a good weekend :)

Posted at 17:46:47