Testing the new Redis AOF rewrite
Tuesday, 13 December 11
Redis 2.4 introduced variadic versions of many Redis commands, including SADD, ZADD, LPUSH and RPUSH (HMSET was already available in Redis 2.2). So for every Redis data type we now have a way to add multiple items with a single command, for example:
- LPUSH mylist item1 item2 item3
- SADD myset A B C
- ZADD myzset 1 first 2 second 3 third
- HMSET myhash name foo surname bar
However, the AOF was still rewritten using a single command for every element inside an aggregate data type: for instance, a three-element list required three separate LPUSH calls in the rewritten AOF file.
Finally Redis 2.6 (which will be forked from the current unstable branch, just with the cluster code removed) introduces the use of variadic commands for AOF log rewriting. The result is that both rewriting and loading an AOF file containing aggregate types, and not just plain key->string pairs, will be much faster.
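To visualize the difference, here is a conceptual sketch of what the rewritten AOF contains for a three-element list (the real file stores the commands in the Redis protocol format, so this is an illustration, not the literal file content):

    OLD rewrite, one command per element:
    LPUSH mylist item1
    LPUSH mylist item2
    LPUSH mylist item3

    NEW rewrite, a single variadic command:
    LPUSH mylist item1 item2 item3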
How much faster?
We'll start by checking the speed gain that can be obtained with a real world dataset where very few keys contain aggregate data types: the database of lloogg.com. Since lloogg was designed in the early stage of Redis development, when Hashes were still not available, it stores a lot of user counters as separate keys, so most keys just contain a string (a huge waste of memory, but I still have to find the time to modify the code). However, around 5% of the keys are sorted sets. This is an excerpt from the full output of Redis Sampler against the lloogg live DB:
    TYPES
    =====
    string: 95480 (95.48%)
    zset: 4469 (4.47%)
    list: 48 (0.05%)
    set: 3 (0.00%)

As you can see this is far from the ideal dataset to make the new AOF changes look good, still the result is significant:
- Time needed to rewrite the AOF log, and size of the resulting file with the OLD rewrite: about 12 seconds, 569 MB
- Time needed to rewrite the AOF log, and size of the resulting file with the NEW rewrite: about 9 seconds, 479 MB
- Time to BGSAVE, for reference: about 9 seconds, file size: 344 MB.
- Time to load the RDB: 7.156 seconds
- Time to load the OLD AOF: 15.232 seconds
- Time to load the NEW AOF: 12.589 seconds
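For the curious, here is a minimal sketch of how numbers like these can be collected (the AOF file name and the grep pattern are my assumptions, not details taken from the test itself):

    redis-cli BGREWRITEAOF       # start the rewrite in background
    redis-cli INFO | grep aof    # poll until the rewrite is done
    ls -lh appendonly.aof        # size of the resulting file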
Bigger gains
To test the new code with a database that better represents a use case where most of the keys hold aggregate values, I created a dataset of 1 million hashes containing 16 fields each. Fields are reasonably sized, like field:1 => value:1, field:2 => value:2, and so forth.
I used this Lua script to create the dataset:
    local i, j
    for i=1,1000000 do
        for j=1,16 do
            redis.call('hmset','key'..i,'field:'..j,'value:'..j)
        end
    end
    return {ok="DONE"}

(Note: if you use the latest unstable branch you can run it with: redis-cli --eval /tmp/script.lua)
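Incidentally, since this post is about variadic commands, the same dataset can also be created issuing a single HMSET per key instead of sixteen. A minimal, untested variation of the script above:

    local i, j
    for i=1,1000000 do
        -- build the HMSET argument list: key name, then 16 field/value pairs
        local args = {'key'..i}
        for j=1,16 do
            args[#args+1] = 'field:'..j
            args[#args+1] = 'value:'..j
        end
        redis.call('hmset', unpack(args))
    end
    return {ok="DONE"}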
Now the same metrics as above but against this new dataset:
- Time needed to rewrite the AOF log, and size of the resulting file with the OLD rewrite: about 17 seconds, 851 MB
- Time needed to rewrite the AOF log, and size of the resulting file with the NEW rewrite: about 10 seconds, 440 MB
- Time to BGSAVE, for reference: about 4 seconds, file size: 158 MB.
- Time to load the RDB: 1.888 seconds
- Time to load the OLD AOF: 31.946 seconds
- Time to load the NEW AOF: 17.512 seconds
Why is the RDB file still so much smaller and faster to load? Because since Redis 2.4 BGSAVE directly outputs the encoded version of the value, if the value is internally encoded as a ziplist, an intset or a zipmap. This is a huge advantage, both while loading and saving the database, and it could easily be implemented in the AOF rewrite as well. However, I'm currently not doing it, since in the next versions of Redis we'll probably have an option to rewrite the AOF log in the RDB format itself... so with the unification of the two systems a lot of problems will be reduced.
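As a side note, if you want to check whether a given value is using one of these special encodings, the OBJECT ENCODING command will tell you (the exact reply, zipmap versus ziplist for small hashes, depends on the Redis version, so take this output as an illustration):

    redis-cli OBJECT ENCODING key1
    "zipmap"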