Zulfiqar's weblog

Architecture, security & random .Net


Redis vs Couchbase

Posted by zamd on October 10, 2014

Recently in my project, we experienced latency issues which required us to introduce a caching layer into our architecture. I evaluated Couchbase & Redis as potential technology choices and decided to go with Redis as it nicely fits our data & computation model. In this post, I’ll briefly share our requirements, our data model and the factors which led us to choose Redis over Couchbase.

Our latency issues were twofold:

  • Large data sets returned from various corporate services, such as stock levels, products & prices. This data is mostly reference/semi-static and is cacheable (for hours at least)
  • Running computation (matching, copying & intersection) on large lists of objects, and copying them in and out of the cache

The data structures involved in our solution were sets of 2-tuples, each containing a TPNB (an identifier) and a number, which aligns nicely with the Redis sorted set data structure, where each member of the set has a value and a score.

Canonical Data Model

Stock Count
TPNB        Count
018616111   3
018616112   4

Ledger Stock Level
TPNB        Level
018616111   4
018616112   7

Stock Count Variance
TPNB        Variance
018616111   -1
018616112   -3

Product Price
TPNB        Price
018616111   9.99
018616112   2.99

Category-H71
TPNB
018616111
018616112
018616113
018616114

The power of Redis comes from the fact that you can perform operations directly on the sets inside the cache. Couchbase has no such facility: you have to retrieve the target documents into the application server, apply your logic/computation there, and store the resulting document back in the cache. The Redis architecture significantly reduces data copying in and out of the cache by pushing computation to the data.

We started by storing the ledger stock level as a sorted set. Each [TPNB, Level] pair was stored as a distinct member of the set, and we used the standard ZADD command inside a transaction to populate it:

multi
zadd sc.03211.ledger 4 018616111
zadd sc.03211.ledger 7 018616112
exec

When we start the stock count, we simply union the ledger set with a non-existent set to create our count variance set. This is a single instruction in Redis, and no data is copied in or out of the cache:

zunionstore sc.03211.variance 2 sc.03211.ledger non-existent-set

To start the count itself, we need the same product boundary as the ledger set but with all the product counts reset to ‘0’. For this we again used a union operation, this time with a weight of ‘0’, to clone the ledger position as our ‘start count’ position:

zunionstore sc.03211.count 2 sc.03211.ledger non-existent-set WEIGHTS 0 0

Couchbase’s memcached-type buckets have a maximum value-size limit of 1MB, which means we couldn’t store our data objects as single key-value items; even the 20MB limit of Couchbase-type buckets would be a stretch in some cases.

Redis has no comparable value-size limit, and a single sorted set can hold up to 2^32 – 1 (over 4 billion) members.

Once the initial data structures were set up, we use ZINCRBY commands (with positive and negative increments) inside a transaction to increment the count and decrement the variance in real time, driven by our inventory scan event stream.
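
For illustration, processing a single scan of one of the sample products would look something like this, using the key names from the examples above:

multi
zincrby sc.03211.count 1 018616111
zincrby sc.03211.variance -1 018616111
exec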

We categorize our variance by simply intersecting the relevant ‘category set’ with the ‘variance set’ and storing the result as a ‘categorized variance’ set. Again, this is a single Redis instruction requiring no data copying:

zinterstore sc.03211.variance.h71 2 sc.03211.variance product.cat.h71 WEIGHTS 1 1
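
ZINTERSTORE aggregates by summing the weighted scores by default, so for the intersection to carry the variance values through unchanged, the category set presumably holds its members with a score of 0 – a sketch using the sample TPNBs from the Category-H71 table:

zadd product.cat.h71 0 018616111 0 018616112 0 018616113 0 018616114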

Redis also offers the ability to run custom logic inside the cache as Lua scripts. In the example below, we used a custom script to price the variance by multiplying each product count by its unit price:

local leftSetId = KEYS[1]
local rightSetId = KEYS[2]
local destSetId = KEYS[3]

-- ZRANGE ... WITHSCORES returns a flat array: member1, score1, member2, score2, ...
local data = redis.call('zrange', leftSetId, 0, -1, 'WITHSCORES')

for k, v in pairs(data) do
    if k % 2 == 1 then
        -- odd indexes hold the member (TPNB); the next entry holds its score
        local val = v
        local leftScore = tonumber(data[k + 1])
        local rightScore = tonumber(redis.call('zscore', rightSetId, val))

        if rightScore then
            -- store the product of the two scores (e.g. count * unit price)
            redis.call('zadd', destSetId, leftScore * rightScore, val)
        end
    end
end

return 'ok'
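
Assuming the script above is saved as price-variance.lua (the price set and destination key names below are illustrative), it can be run with redis-cli’s --eval helper, or with EVAL/EVALSHA from a client library, passing the three key names:

redis-cli --eval price-variance.lua sc.03211.variance sc.03211.price sc.03211.variance.priced

KEYS[1] drives the loop, KEYS[2] supplies the unit prices and KEYS[3] receives the priced result.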

Redis has built-in master/slave replication, and its Sentinel platform adds monitoring and automatic failover on top of it. A cluster of master/slave nodes can be deployed across data centres, and Sentinel provides auto-failover when the master goes down. Sentinel publishes key failover events over Redis pub/sub, and clients can use these events to reconfigure themselves to the new master once a failover is completed.
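
For example, a client (or an operations script) can watch for a completed failover by subscribing to the +switch-master channel on any Sentinel node (26379 is the default Sentinel port):

redis-cli -p 26379 subscribe +switch-master

Each +switch-master message carries the master name along with the old and new address, which is enough for a client to repoint its connections.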

Redis Cluster is currently under development and will enable auto-sharding and transparent failover on top of the current replication model. The currently proposed model is very similar to Couchbase: dumb clients simply try a random instance and get redirected if the target instance doesn’t own the key, while smart clients cache the cluster map and always go to the correct host based on that map.
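
In the dumb-client case, the redirection looks like a normal command that comes back with a MOVED error pointing at the right node (the hash slot and address below are purely illustrative):

get sc.03211.count
(error) MOVED 866 10.0.0.12:6380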

Currently, proxy solutions like twemproxy can be used to provide auto-sharding on top of Sentinel-managed replication.

In summary, Redis was more than a simple key/value store for us. The fact that we can run computation inside the cache to filter, merge and group data was the key differentiator for Redis over Couchbase.

The clustering and high-availability support in Redis is still a bit lacking, which makes it a risky choice as the master/primary data store of an application. In such scenarios, Couchbase should be the preferred technology choice.

 


Posted in Couchbase, Redis

Queues and Workers

Posted by zamd on October 10, 2014

In the last post, I briefly talked about the architecture of the project I’m currently leading. We have a clear read/write separation in the architecture, and for the past few sprints we have been pushing more & more work onto the write path, which has made our write pipeline a bit heavy. Our challenge is to quickly process the huge volume of tags flowing through the write pipeline. Just to give you an idea of the numbers:

The business process involves doing an “all clothing” stock count every Monday between 06:00 and 09:00 AM. For our UK deployment, we are aiming at 600 stores, each holding an inventory of roughly 80k garments. To get good accuracy from RFID tags, all inventory must be counted at least twice. The usual process is that a group of people count the backroom and the shop floor in parallel – they then swap and count again. So the math looks like this:

80k garments * 600 stores = 48 million; counted twice = 96 million tag reads

These need to be processed within 3 hours (10,800 seconds), which equates to roughly 9,000–10,000 commands/tags per second hitting our backend service.

We have already spent quite a bit of time optimizing the inventory pipeline, and the 99th percentile latency is below 100ms, which is reasonably good considering we are using NHibernate & Oracle and calling a bunch of backend services. There is further juice we can extract out of the inventory pipeline, but realistically, to process all this load we need to scale out the system. We kind of knew this from day one, so we designed the system in a way where commands are pretty much queueable after some simple invariant checks.

We came up with a very simple scale-out model of running multiple workers behind a set of queues – we started by re-hosting our domain in a worker process (a simple console application; the plan is to use NSSM in production). This simple model works great: workers compete for the commands, a single worker reads a command under peek-lock and runs the unit of work (UOW). If the UOW cannot be committed, the message is simply retried on another worker, and in most cases the transient failure (a small race condition :-)) gets resolved on a subsequent retry.

With this simple model, we were able to get a throughput of over 2000 tags/second using 16 workers on a single beefy machine.

There is a huge number of duplicates in our scenario, so our next step was to detect and remove duplicates before they hit our workers. Ideally the messaging system should do this for us – we use Tibco EMS, but unfortunately EMS doesn’t have any built-in de-duping functionality.

We also use Redis as the read store in our architecture, so we decided to build the de-duping (on publish) functionality in Redis using simple get/set operations. The results were awesome: we can de-dup a batch of 50 tags in 0.3ms.
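
Conceptually, the check is just a conditional set per tag: a reply of 1 means the tag is being seen for the first time and is forwarded on, while 0 means it is a duplicate and can be dropped. A minimal sketch – the key scheme and tag identifier are illustrative, and in practice the checks are pipelined per batch:

setnx sc.03211.seen.30462-0001 1
expire sc.03211.seen.30462-0001 10800

The EXPIRE keeps the de-dup keys around only for the duration of the count window.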

This one change has significantly reduced the problem size for us, as there are at least 100% duplicates in a stock count (every garment is counted at least twice) and there is no way to avoid them on the client/sender side. Efficiently de-duping them on the server means our workers only have to process half of the load – ~5,000 tags/sec.

Another interesting pattern we have seen is around large UOWs, which become very inefficient to process synchronously as a single unit of work. In these situations, a worker simply breaks the larger UOW into ‘N’ smaller UOWs, which are queued and then processed in parallel. The downside here is that the code becomes a bit tedious, as we are reading a message from the queue, breaking it down into smaller messages and writing them back to the queue. It’s not perfect, but it gives us a nice way to break up & parallelize large UOWs (and we have plenty of those).

Posted in .net, Architecture, Redis