Sorting in key-value data model

Friday, 13 March 09
The key-value data model is an emerging paradigm in many forms, from key-value stores to caching systems, where data is represented in form of a map between keys and values. Example of real world systems using this model are memcached, a distributed caching system, Tokyo Cabinet and Redis, two key-value databases.

This data model is at the same time old and new: as old as LISP's associative lists, and even older, but at the same time a new emerging paradigm for scalable applications.

In order to exploit the full potential of key-value stores, more advanced operations and design patterns on the key-value model are needed besides the Set-ket, Get-key, Delete-key basic operations of Dictionary data structures. For instance one additional and very useful operation is an atomic increment/decrement.
set x 10
incr x => 11
incr x => 12
The atomic increment operation is particularly useful since it allows to obtain an unique identifier for an object.
id = incr nextId => 100
set obj_<id> "My new object"
Multiple instances of a program can access simultaneously the key-value store and still be sure to obtain unique references (keys) where to store objects, in order to reference this new objects by key in other parts of the key-value data space using simply the identifier.

Some key-value stores like Redis support Lists as values. List support allows to model a large number of problems in a simpler way, expecially when atomic operations to append or consume elements from the list are provided.

This two features, atomic increments and lists as values can be combined in a very powerful way. For example imagine to have the following problem: the key-value store contains keys representing objects with an unique reference (identifiers), this objects can be news in a social news site like Reddit.
id = incr NextId => 1
set news_url_<id> "http://foobar.org"
set news_title_<id> "My foobar story"
push mylist 1

id = incr NextId => 2 set news_url_<id> "http://antirez.com" set news_title_<id> "The blog you are reading right now" push mylist 2

id = incr NextId => 3 set news_url_<id> "http://someothersite.com" set news_title_<id> "Some Other Site" push mylist 3
Later I'll be able to ask a specific range of the list (for example the latest 10 elements added) and populate a web page asking for the news_url_<id> and news_title_<id> keys.

While to retrieve things in the same or reverse chronological order is very useful (for example the 'new sumbissions' page of a Reddit alike site needs such a feature), there are times when we want to perform sorting of this items. To generate the front page of a social news site is such an example of application demanding sorting.

This article is a proposal for a sorting strategy in a key-value data model.

Sorting and weight keys

In order to perform sorting we need another player in our data schema, that is, weights. Our three news with ID 1, 2 and 3 can have different weights generated accordingly to the news age and score (the score may result from user votes). One of our social news system will be in charge of updating this scores performing set operations on score keys:
set score_1 124
set score_2 4461
set score_3 -50
We can now define a new SORT operation on our key-value data model that takes as input key holding a list, and a pattern to retrieve scores for every element of the list:
sort mylist by score_* => 3 1 2
Elements of the list are compared by the value of the score_<id> key. A real world implementation could provide a way to specify if the sorting is done ascending or descending, and the range of values we want to retrieve in order to implement pagination or to take only the first N top items.

Sorting into a distributed environment: key tags

One of the greatest benefits of the key-value data model is that the dataset can be partitioned into N different servers just hashing the key. This can be a problem when sorting is required since the scores needed to sort a given list may be distributed across different stores. This problem can be resolved making the sorting server able to access the other stores to lookup the score keys in other servers: while this can be a good solution the latency time may slow down the sorting operation too much.

An alternative is to make sure that all the scores needed to sort a given key are stored in the same servers of the key holding the list. In order to do so instead to define the hashing of the key as a simple operation that hashes the whole key, only a specific subset of the key is hashed if the key contains a special delimiter.

For example the key "foobar" can be hashed as a full 6 bytes key, but if the pattern [...] is found inside the key, only the part of the key between [ and ] is hashed. As long as the part of the string inside the [] delimiters is the same multiple keys will hash to the same value, and will be stored in the same server.
Hash(mylist[homenews]) == Hash(score_10_[homenews]) == Hash(score_20_[homenews])

Implementation

What was described in this article will soon appear on the Redis key-value store. If you have feedbacks about how to improve this model please leave a comment here. Thank you.

vote on reddit or hacker news
40462 views*
Posted at 12:02:55 | permalink | 24 comments | print
Do you like this article?
Subscribe to the RSS feed of this blog or use the newsletter service in order to receive a notification every time there is something of new to read here.

Note: you'll not see this box again if you are a usual reader.

Comments

Matteo Merli writes:
13 Mar 09, 14:37:32
Hi, this approach for sorted items is really interesting. Reading this I would think of a system in which when you insert a key you can specify an index (or more than one) like:

put( Key, Value, [MyIdx1=X, MyIdx2=Y] )

this way the (Key,Value) will be inserted with usual hashing, and then the key will be added also to the specified indexes. Now every index will be routed (as a key) with hashing to the corresponding server.

then I could have something as :

get_range( MyIdx1, 1, 50 ) -> [K1, K2, ...]

that will involve the query only to the server that owns MyIdx1. Anyway, this doesn't solve the problem when selecting using more indices, as the could resides on different servers.
antirez writes:
14 Mar 09, 13:45:30
@Matteo: thanks for your comment. For now we have a single ordering weight vector, the idea is that if you have more than one, there should be a client that will take care to sync into an unique weight all the other indexes.

About the problem of N servers, your method is workable of course, but I think that key tags are a more general way of making sure a set of N keys will reach the same server.
Tube Home writes:
31 Oct 10, 09:56:51
great article, so informative and at the same time easy to understand. and I really liked the way you used reddit as an example)
800 numbers writes:
22 Nov 10, 14:43:23
I must say that the key value data model is catching attention in a pretty large scale recently and I think that their ability to represent data in the mode of a map between keys and values very efficiently is what that makes them more applicable and practicable. What is interesting about the key value data model is that it is new practically but old theoretically, which thereby makes this the end choice for many users!
sonnerie gratuite sony ericsson writes:
07 Dec 10, 08:35:48
I am very enjoyed for this site. Its an informative topic. It help me very much to solve some problems. Its opportunity are so fantastic and working style so speedy. I think it may be help all of you. Thanks a lot for enjoying this beauty article with me. I am appreciating it very much! Looking forward to another great article. Good luck to the author! all the best!
Top Rated Dating Websites writes:
08 Dec 10, 20:22:19
Great post Antirez! Do you have the direct experience with using it in enterprise model?
Glee Episodes writes:
11 Dec 10, 05:50:09
You have valid points but I think there's still something missing about this article. I'll have to read this again. Still... thanks.
bathroom vanities writes:
16 Dec 10, 01:45:07
They need to make this conference in 2 day, they make a seperate day fro media and for the customer. This i9s a good idea to make sure all of them get the best services from microsoft. http://www.esleepmasters.com/Bathroom_Vanities_s/2...
burberry shoes writes:
17 Dec 10, 02:44:18
More please, this information helped me consider a few more things, keep up the good work. http://www.watcheshall.com/burberry-shoes.html
Hell Tube writes:
16 Jan 11, 21:28:09
About the problem of N servers, your method is workable of course, but I think that key tags are a more general way of making sure a set of N keys will reach the same server.
SelviLiems writes:
18 Jan 11, 11:46:25
Love your sibkling is a must. You can depend on them if you having nobody no more in this world. We have the same blood and this make us need to love our siblings. http://www.centredegenetique.ca/
T writes:
01 Feb 11, 11:10:47
this seems extremely familiar to what MySpace had implemented 4-5 years ago for the cache system.
F writes:
10 Feb 11, 06:22:33
f
<a href="http://www.newbalenciagabag.com">Christian louboutin boot</a> writes:
16 Feb 11, 07:46:28
Thanks for sharing the slides. I downloaded the pdf because i will use it from time to time.
[url=http://www.newbalenciagabag.com]Christian louboutin boot[/url] writes:
16 Feb 11, 07:47:09
there must be some kind of life turmoil going on for these celebrities to gain that much weight within a few years.
<a href="http://www.zocdoc.com/obgyns/westlake-166pm">Westlake OBGYNs</a> writes:
17 Feb 11, 12:25:06
Finally I understand how to do it. I am sure that I can adapt it on my future projects.
<url="http://www.sebwill.com">Best Contractor</url> writes:
23 Feb 11, 17:09:41
Thanks for the key value sorting help!
<a href="http://www.mapleleafpromotions.com/Beer-Mugs.html">Beer Mug</a> writes:
31 Mar 11, 08:56:00
This theatre is really huge! It's looks so elegant and fantastic.
næsekorrektion writes:
16 Apr 11, 03:28:59
I completely agree with you. I really like this article. It contains a lot of useful information. I can set up my new idea from this post. It gives in depth information. Thanks for this valuable information for all. And of course nice review about the Veterans fight and their present prospect.
http://www.cosmedia.dk/naesekorrektion-naeseoperat...
itunes258 writes:
20 Apr 11, 06:48:51
I just can’t stop reading this. Its so fresh, so filled with updates that I just didn’t know. I am delighted to see that people are in fact writing about this subject in such a elegant way, presenting us all diverse parts to it. You’re a fine blogger. Please carry on with it. I can’t wait to read what’s after that. http://www.pcgamesupply.com/buygames/itunes-gift-c...
china wholesale clothing writes:
20 Apr 11, 11:39:40
The situation in US troops baracks are not as good as people may think off. There are some lacks of essentials here and there and this does not do any good to the troops there. http://www.tradesparq.com/search/all/china+wholesale+clothing-manufacturers.php
yamaha motorcycles writes:
11 May 11, 14:38:32
Thank you for such a fantastic blog. Where else could anyone get that kind of info written in such a perfect way? I have a presentation that I am presently working on, and I have been on the look out for such information.
http://www.dirtbikepages.com.au
Tactical Gear writes:
12 May 11, 09:43:13
Pretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I'll be subscribing to your feed and I hope you post again soon.
http://www.junnicairsoft.com
Airsoft gun writes:
13 May 11, 09:31:19
I completely agree with you. I really like this article. It contains a lot of useful information. I can set up my new idea from this post. It gives in depth information.
http://airsofgun.yolasite.com/
comments closed