Sorting in key-value data model

Friday, 13 March 09

The key-value data model is an emerging paradigm in many forms, from key-value stores to caching systems, where data is represented in form of a map between keys and values. Example of real world systems using this model are memcached, a distributed caching system, Tokyo Cabinet and Redis, two key-value databases.

This data model is at the same time old and new: as old as LISP's associative lists, and even older, but at the same time a new emerging paradigm for scalable applications.

In order to exploit the full potential of key-value stores, more advanced operations and design patterns on the key-value model are needed besides the Set-ket, Get-key, Delete-key basic operations of Dictionary data structures. For instance one additional and very useful operation is an atomic increment/decrement.

set x 10
incr x => 11
incr x => 12

The atomic increment operation is particularly useful since it allows to obtain an unique identifier for an object.

id = incr nextId => 100
set obj_<id> "My new object"

Multiple instances of a program can access simultaneously the key-value store and still be sure to obtain unique references (keys) where to store objects, in order to reference this new objects by key in other parts of the key-value data space using simply the identifier.

Some key-value stores like Redis support Lists as values. List support allows to model a large number of problems in a simpler way, expecially when atomic operations to append or consume elements from the list are provided.

This two features, atomic increments and lists as values can be combined in a very powerful way. For example imagine to have the following problem: the key-value store contains keys representing objects with an unique reference (identifiers), this objects can be news in a social news site like Reddit.

id = incr NextId => 1
set news_url_<id> "http://foobar.org"
set news_title_<id> "My foobar story"
push mylist 1

id = incr NextId => 2
set news_url_<id> "http://antirez.com"
set news_title_<id> "The blog you are reading right now"
push mylist 2

id = incr NextId => 3
set news_url_<id> "http://someothersite.com"
set news_title_<id> "Some Other Site"
push mylist 3

Later I'll be able to ask a specific range of the list (for example the latest 10 elements added) and populate a web page asking for the news_url_<id> and news_title_<id> keys.

While to retrieve things in the same or reverse chronological order is very useful (for example the 'new sumbissions' page of a Reddit alike site needs such a feature), there are times when we want to perform sorting of this items. To generate the front page of a social news site is such an example of application demanding sorting.

This article is a proposal for a sorting strategy in a key-value data model.

Sorting and weight keys

In order to perform sorting we need another player in our data schema, that is, weights. Our three news with ID 1, 2 and 3 can have different weights generated accordingly to the news age and score (the score may result from user votes). One of our social news system will be in charge of updating this scores performing set operations on score keys:

set score_1 124
set score_2 4461
set score_3 -50

We can now define a new SORT operation on our key-value data model that takes as input key holding a list, and a pattern to retrieve scores for every element of the list:

sort mylist by score_* => 3 1 2

Elements of the list are compared by the value of the score_<id> key. A real world implementation could provide a way to specify if the sorting is done ascending or descending, and the range of values we want to retrieve in order to implement pagination or to take only the first N top items.

Sorting into a distributed environment: key tags

One of the greatest benefits of the key-value data model is that the dataset can be partitioned into N different servers just hashing the key. This can be a problem when sorting is required since the scores needed to sort a given list may be distributed across different stores. This problem can be resolved making the sorting server able to access the other stores to lookup the score keys in other servers: while this can be a good solution the latency time may slow down the sorting operation too much.

An alternative is to make sure that all the scores needed to sort a given key are stored in the same servers of the key holding the list. In order to do so instead to define the hashing of the key as a simple operation that hashes the whole key, only a specific subset of the key is hashed if the key contains a special delimiter.

For example the key "foobar" can be hashed as a full 6 bytes key, but if the pattern [...] is found inside the key, only the part of the key between [ and ] is hashed. As long as the part of the string inside the [] delimiters is the same multiple keys will hash to the same value, and will be stored in the same server.

Hash(mylist[homenews]) == Hash(score_10_[homenews]) == Hash(score_20_[homenews])

Implementation

What was described in this article will soon appear on the Redis key-value store. If you have feedbacks about how to improve this model please leave a comment here. Thank you.

vote on reddit or hacker news

69813 views^*

Posted at 08:02:55 | permalink | 3 comments | print

Do you like this article?
Subscribe to the RSS feed of this blog or use the newsletter service in order to receive a notification every time there is something of new to read here.

Note: you'll not see this box again if you are a usual reader.

Comments

Matteo Merli writes:

13 Mar 09, 10:37:32

Hi, this approach for sorted items is really interesting. Reading this I would think of a system in which when you insert a key you can specify an index (or more than one) like:

put( Key, Value, [MyIdx1=X, MyIdx2=Y] )

this way the (Key,Value) will be inserted with usual hashing, and then the key will be added also to the specified indexes. Now every index will be routed (as a key) with hashing to the corresponding server.

then I could have something as :

get_range( MyIdx1, 1, 50 ) -> [K1, K2, ...]

that will involve the query only to the server that owns MyIdx1. Anyway, this doesn't solve the problem when selecting using more indices, as the could resides on different servers.

antirez writes:

14 Mar 09, 09:45:30

@Matteo: thanks for your comment. For now we have a single ordering weight vector, the idea is that if you have more than one, there should be a client that will take care to sync into an unique weight all the other indexes.

About the problem of N servers, your method is workable of course, but I think that key tags are a more general way of making sure a set of N keys will reach the same server.

800 numbers writes:

22 Nov 10, 09:43:23

I must say that the key value data model is catching attention in a pretty large scale recently and I think that their ability to represent data in the mode of a map between keys and values very efficiently is what that makes them more applicable and practicable. What is interesting about the key value data model is that it is new practically but old theoretically, which thereby makes this the end choice for many users!

comments closed

antirez weblog