This doesn't make any sense to me. Why store the values in Redis and use Ruby to...

meritt · on Oct 16, 2012

Because they use MongoDB, and as its usage increases we'll see more and more people "discover" ways to do things RDBMSs solved in the 80s.

Just like this also has nothing intrinsically to do with Redis and could be any regular ole key-value store.

samstokes · on Oct 16, 2012

It looks like one of the things they're counting is clicks, so they could potentially have some pretty large datasets.

I don't know how well Mongo's map-reduce works, but in Postgres, COUNT(star) [1] does not perform well for very large tables (e.g. 100 million rows). You wouldn't want to be doing a COUNT(star) once per minute for each customer that had their dashboard open on a plasma screen.

Of course, there are other solutions to that problem: generate the counts on some more feasible schedule and cache them; have a read replica used for analytics queries; shard by customer and have no large customers.
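For what it's worth, the "generate the counts on a schedule and cache them" option can be sketched roughly like this. It is only an illustration: `CachedCount`, `expensive_count`, and the 60-second TTL are made-up names and numbers, and the fake clock exists purely so the demo runs instantly.

```python
import time


class CachedCount:
    """Serve a count from memory, recomputing at most once per `ttl` seconds."""

    def __init__(self, compute, ttl=60.0, clock=time.monotonic):
        self._compute = compute      # hypothetical: the expensive query, e.g. a COUNT
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: run the expensive query once and remember it.
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value


# Demo with a fake clock and a fake "expensive query" that records each run.
calls = []
fake_now = [0.0]


def expensive_count():
    calls.append(1)
    return 100_000_000 + len(calls)


clicks = CachedCount(expensive_count, ttl=60.0, clock=lambda: fake_now[0])
first = clicks.get()     # runs the query
second = clicks.get()    # served from the cache
fake_now[0] = 61.0       # pretend a minute has passed
third = clicks.get()     # cache expired, so the query runs again
```

Every dashboard refresh inside the TTL window reuses the cached number, so the expensive query runs once per minute in total instead of once per minute per open dashboard.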

I don't know whether their scale strictly requires the Redis solution, but in any case there are situations where it's not as simple as "throw it in Postgres and use an aggregate function".

[1] "star" instead of an asterisk to avoid HN thinking I'm trying to write in italics.

pestaa · on Oct 16, 2012

Note that recent versions of Postgres can now COUNT against indices so no need to do a full table scan.

samstokes · on Oct 17, 2012

This certainly helps, but if your indices are several GB in size, even an index scan is a nontrivial expense.

avand · on Oct 16, 2012

To be fair, we may have over-engineered this solution. I've been doing some work with compound indexes with Mongo and they seem to be performing really well. Maybe I'll have to write a guest post for MongoHQ too ;)

Appreciate the time and thoughtful response.

mbell · on Oct 16, 2012

The article said they could rebuild the count if they needed to, so something about each click is being stored. If you were using Postgres, you'd just set up a trigger on that table to increment the click values (stored in another table) as appropriate. No aggregate function needed.
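A rough sketch of that trigger pattern, using SQLite from Python as a stand-in so it runs without a server. Postgres trigger syntax is different (a trigger function plus `CREATE TRIGGER`), and the table, column, and trigger names here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Raw events: one row per click (hypothetical schema).
    CREATE TABLE clicks (id INTEGER PRIMARY KEY, link_id INTEGER NOT NULL);

    -- Running totals, maintained by the trigger below.
    CREATE TABLE click_counts (link_id INTEGER PRIMARY KEY, total INTEGER NOT NULL);

    CREATE TRIGGER bump_click_count AFTER INSERT ON clicks
    BEGIN
        -- Create the counter row on first sight, then add one to it.
        INSERT OR IGNORE INTO click_counts (link_id, total) VALUES (NEW.link_id, 0);
        UPDATE click_counts SET total = total + 1 WHERE link_id = NEW.link_id;
    END;
""")

conn.executemany("INSERT INTO clicks (link_id) VALUES (?)", [(1,), (1,), (2,)])
conn.commit()

# Reading a total is now a primary-key lookup, not an aggregate over clicks.
counts = dict(conn.execute("SELECT link_id, total FROM click_counts"))
```

The raw `clicks` rows are still there, so the totals can be rebuilt from them if the counter table is ever lost.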

samstokes · on Oct 16, 2012

Certainly you can implement this same pattern without Redis. Triggers in Postgres would be a reasonable way to do it. I didn't say you can't do this in Postgres, I said you can't do it with COUNT().

It does indeed sound like they're storing every click: that's precisely why using aggregate functions would be expensive.

guywithabike · on Oct 16, 2012

You absolutely can do it without COUNT. Just increment a counter value, the same way they're doing it in Redis.

pestaa · on Oct 16, 2012

Why was this downvoted? Adding 1 to any number field is just as atomic an operation as an increment is in Redis.
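To make that concrete, here is a minimal sketch of an in-database increment, again with SQLite standing in for whatever SQL database you use. The `counters` table, the `increment` helper, and the key name are assumptions for the example. The point is that the addition happens inside a single `UPDATE`, not as a read followed by a write in application code.

```python
import sqlite3


def increment(conn, name, amount=1):
    """Add `amount` to the named counter, creating it at 0 if it is missing."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO counters (name, value) VALUES (?, 0)", (name,)
        )
        # The database does the arithmetic; the app never reads the old value.
        conn.execute(
            "UPDATE counters SET value = value + ? WHERE name = ?", (amount, name)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")

for _ in range(3):
    increment(conn, "clicks:42")  # hypothetical key name

total = conn.execute(
    "SELECT value FROM counters WHERE name = ?", ("clicks:42",)
).fetchone()[0]
```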

samstokes · on Oct 17, 2012

Presumably it was downvoted (not by me) because it's refuting a claim I didn't make:

samstokes: I said you can't do it with COUNT()

guywithabike: You absolutely can do it without COUNT

Then we agree.

mbell · on Oct 16, 2012

Agreed, this seems to fall into the bucket of "rediscovering" something that was solved decades ago.

stephen_mcd · on Oct 16, 2012

It's the blog of a hosted Redis service.

Make sense?

mikkelewis · on Oct 16, 2012

I completely agree. I worry about how everyone is introducing all these new technologies into their stack when their existing tools work fine.

Introducing new services/technologies just complicates things at both the application and ops layers (even though the author did say he was using RedisToGo).

avand · on Oct 16, 2012

We definitely could have done this in Postgres.

Our API is a Rails project and though we do a lot of work that's "off the rails," so to speak, we do try to follow a convention where possible. A convention that's worked well for us is to use Postgres only for database-backed models in the app. So when it came time to solve this problem, Redis seemed like a good choice.

Map/reduce isn't an option here. First, it's too slow. Second, it's asynchronous by nature, which doesn't work well when trying to load a webpage. We did consider using it in the background, however, to generate the counter-caches.

Thanks for reading!

gizzlon · on Oct 16, 2012

This makes a lot of sense to me. Why bother your main DB with a lot of simple "queries" like these when:

1) They're so easy to move out to something else

2) Your main DB probably runs on very expensive hardware, unnecessarily expensive for things like this. This probably does not need that expensive SAN backing it. Not all data is created equal.

3) Redis does this faster

4) Redis is simpler

Of course there are some advantages to storing everything in the same DB, but to me, this seems like a good example of "use the right tool for the job". But, I haven't actually tried this in Redis, so what do I know..

jeromeparadis · on Oct 16, 2012

I suppose it really depends on the use case. For example, there probably aren't any tables corresponding to an API call. If I wanted to collect stats on different API calls and other stuff not related to my models, I would prefer avoiding a database count update and would rather make a single incr call to a Redis key.
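Something like the following is what I have in mind. It is a sketch only: the `stats:api:` key scheme and `record_api_call` are made up, and `FakeRedis` is a tiny in-memory stand-in so the snippet runs without a server. With redis-py, the client would be a `redis.Redis` instance, whose `incr(name, amount=1)` returns the new value.

```python
class FakeRedis:
    """In-memory stand-in exposing only incr(), so this runs without a server."""

    def __init__(self):
        self.data = {}

    def incr(self, name, amount=1):
        self.data[name] = self.data.get(name, 0) + amount
        return self.data[name]


def record_api_call(client, endpoint):
    """Bump the per-endpoint counter and return its new value."""
    # Hypothetical key scheme: one counter per API endpoint.
    return client.incr(f"stats:api:{endpoint}")


client = FakeRedis()  # assumption: swap in redis.Redis(...) for a real server
record_api_call(client, "users.show")
record_api_call(client, "users.show")
latest = record_api_call(client, "links.create")
```

No table, model, or migration is needed for a new stat; the first `incr` on a key creates it.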

Groxx · on Oct 16, 2012

In general, absolutely. Redis tends to have very high write IO compared to your average DB though, so that's a definite advantage when gathering very-frequent actions.