Snappy Dashboards with Redis (togo.io)
66 points by avand on Oct 15, 2012 | 44 comments



This doesn't make any sense to me. Why store the values in Redis and use Ruby to do simple math? If you're using Postgres already, just store the stats in Postgres and use an aggregate or window function to grab the stats. (And collect those stats with triggers, in the first place.) If you're using Mongo already, just grab your stats with a map reduce query.
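
For example, something like this (a rough sketch; the Click model and column names are my assumptions, not from the article):

  # Per-customer daily click counts from a plain SQL aggregate
  Click.where(customer_id: customer.id)
       .group("date_trunc('day', created_at)")
       .count
  # => a hash of day => count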


Because they use MongoDB, and as its usage increases we'll see more and more people continue to "discover" ways to do things that RDBMSes solved in the '80s.

Likewise, this has nothing intrinsically to do with Redis; it could be done with any regular ole key-value store.


It looks like one of the things they're counting is clicks, so they could potentially have some pretty large datasets.

I don't know how well Mongo's map-reduce works, but in Postgres, COUNT(*) does not perform well for very large tables (e.g. 100 million rows). You wouldn't want to be doing a COUNT(*) once per minute for each customer that had their dashboard open on a plasma screen.

Of course, there are other solutions to that problem: generate the counts on some more feasible schedule and cache them; have a read replica used for analytics queries; shard by customer and have no large customers.

I don't know whether their scale strictly requires the Redis solution, but in any case there are situations where it's not as simple as "throw it in Postgres and use an aggregate function".

[1] "star" instead of an asterisk to avoid HN thinking I'm trying to write in italics.


Note that recent versions of Postgres can now answer COUNT against indices, so there's no need to do a full table scan.


This certainly helps, but if your indices are several GB in size, even an index scan is a nontrivial expense.


To be fair, we may have over-engineered this solution. I've been doing some work with compound indexes with Mongo and they seem to be performing really well. Maybe I'll have to write a guest post for MongoHQ too ;)

Appreciate the time and thoughtful response.


The article said they could rebuild the counts if they needed to, so something about each click is being stored. If you were using Postgres, you'd just set up a trigger on that table to increment the click counts (stored in another table) as appropriate. No aggregate function needed.
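
Roughly like this (an untested sketch; the clicks table, the click_counts rollup table, and all column names are invented for illustration):

  # Rails migration installing a Postgres trigger to maintain the counts
  class AddClickCountTrigger < ActiveRecord::Migration
    def up
      execute <<-SQL
        CREATE FUNCTION bump_click_count() RETURNS trigger AS $$
        BEGIN
          -- assumes the (customer_id, day) row exists; real code would upsert
          UPDATE click_counts
             SET count = count + 1
           WHERE customer_id = NEW.customer_id
             AND day = date_trunc('day', NEW.created_at);
          RETURN NEW;
        END;
        $$ LANGUAGE plpgsql;

        CREATE TRIGGER clicks_count AFTER INSERT ON clicks
          FOR EACH ROW EXECUTE PROCEDURE bump_click_count();
      SQL
    end

    def down
      execute "DROP TRIGGER clicks_count ON clicks"
      execute "DROP FUNCTION bump_click_count()"
    end
  end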


Certainly you can implement this same pattern without Redis. Triggers in Postgres would be a reasonable way to do it. I didn't say you can't do this in Postgres, I said you can't do it with COUNT().

It does indeed sound like they're storing every click: that's precisely why using aggregate functions would be expensive.


You absolutely can do it without COUNT. Just increment a counter value, the same way they're doing it in Redis.


Why was this downvoted? Adding 1 to a number field is just as atomic an operation as an increment in Redis.
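
i.e. a single statement like this (sketch; the model name is assumed):

  # One atomic UPDATE; no read-modify-write race and no COUNT(*)
  ClickCount.where(customer_id: id, day: Date.today)
            .update_all("count = count + 1")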


Presumably it was downvoted (not by me) because it's refuting a claim I didn't make:

samstokes: I said you can't do it with COUNT()

guywithabike: You absolutely can do it without COUNT

Then we agree.


Agreed, this seems to fall into the bucket of "rediscovering" something that was solved decades ago.


It's the blog of a hosted Redis service.

Make sense?


I completely agree. I worry about how everyone introduces all these new technologies into their stack when their existing tools work fine.

Introducing new services/technologies just complicates things at both the application and ops layers (even though the author did say he was using RedisToGo).


We definitely could have done this in Postgres.

Our API is a Rails project, and though we do a lot of work that's "off the rails," so to speak, we try to follow convention where possible. A convention that's worked well for us is to use Postgres only for database-backed models in the app. So when it came time to solve this problem, Redis seemed like a good choice.
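
For the curious, the pattern boils down to roughly this (a simplified sketch, not our exact code; key names are illustrative):

  # Bump a per-day counter when an event comes in
  redis.incr "clicks:#{customer_id}:#{Date.today}"

  # Read a 30-day series back in one round trip
  keys = (0..29).map { |n| "clicks:#{customer_id}:#{Date.today - n}" }
  counts = redis.mget(*keys).map(&:to_i)  # nils (missing days) become 0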

Map/reduce isn't an option here. First, it's too slow. Second, it's asynchronous by nature, which doesn't work well when trying to load a webpage. We did consider using it in the background, however, to generate the counter-caches.

Thanks for reading!


This makes a lot of sense to me. Why bother your main DB with a lot of simple "queries" like these when:

1) They're so easy to move out to something else

2) Your main DB probably runs on very expensive hardware, unnecessarily expensive for things like this. This probably doesn't need an expensive SAN backing it. Not all data is created equal.

3) Redis does this faster

4) Redis is simpler

Of course there are some advantages to storing everything in the same DB, but to me this seems like a good example of "use the right tool for the job". But I haven't actually tried this in Redis, so what do I know...


I suppose it really depends on the use case. For example, there probably aren't any tables corresponding to an API call. If I wanted to collect stats on different API calls and other things not related to my models, I'd prefer to avoid a database count update and would rather make a single INCR call to a Redis key.


In general, absolutely. Redis tends to have very high write IO compared to your average DB, though, so that's a definite advantage when recording very frequent actions.


This is cool and, I believe, a good strategy for many things.

> If Redis isn’t available, for whatever reason, we could rebuild the gaps from the canonical data in Mongo. We’ve never had to do this.

I'm not sure this is so straightforward.

So, this is a guest post by Sqoot hosted by togo.io, which owns redistogo.com. Can anyone explain redistogo.com's pricing? I understand the value of not running your own service dependencies, but the redistogo.com prices seem really high to me. I was under the impression Redis is fairly easy to manage. What kinds of operational tasks does redistogo.com perform for a Redis instance that would warrant such high prices?


I pinged the guys over at RedisToGo to get you some more clarity on the pricing.

Speaking for ourselves, we've historically preferred to have someone else manage our infrastructure. Setting up Redis is fairly trivial, I agree. However, setting up a secure machine out on the internet and making sure it's always there is less trivial. Rather than juggle an ops/engineer role, we just double down on engineering. I'm sure at some point this may need to change. Hopefully, when that day comes we can afford a sysadmin!


This is a pretty standard way to handle counts and doesn't seem to have much to do with redis. In an SQL database you'd just do it with a trigger and your application wouldn't need to know a thing about it.


I've never used SQL triggers. We host our Postgres database with Heroku, and as a result I think I've ruled out database-level solutions.

On a related note, it feels good to know that everything our app needs to run is in the code base. Back in my C# days, I remember relying on stored procedures that were configured manually at the database level. Rails fights that with migrations and the callback chain too, so I guess that thinking has sunk in for me.

Thanks for the comment.


> We host our Postgres database with Heroku. As a result, I think I've ruled out database level solutions.

What does hosting with Heroku have to do with using a trigger?

> On a related note, it feels good to know that everything our app needs to run is in the code base.

Except the database schema and any additional indexes you need to make it not perform terribly. All basic setup, just like creating triggers.

> Rails fights that with migrations and the callback chain too, so I guess that thinking has sunk in for me.

The problem with doing stuff like this in a callback is that an additional query is sent to the database, which can be a big performance problem if the insert load is high. I generally agree that complex logic should be avoided in triggers and is better left in the application code in most cases, but incrementing a counter is about as simple as it gets.


Well thanks for the insights. I'll keep triggers in mind when I come across a problem like this in the future.


For the record, Heroku doesn't affect your ability to use triggers. It's possible our ancient "shared database" infrastructure simply didn't support it, but the new starter tier plans certainly do.


Sounds like the typical caching approach to me: "avoid making big queries by caching your data in a RAM-based key-value store". This has been common knowledge among webdevs since LiveJournal introduced memcached back in 2003. Or am I missing something?


This works great. One thing I've started doing is caching those MGETs in memcached. Redis is fast, but it's also single-threaded and, depending on how you're using it, can become CPU-bound, causing timeout errors while a busy Redis instance handles lots of writes/reads. So, similar to MySQL, I've started guarding multiple Redis reads with a single memcache GET. Feels crazy, but maybe correct?
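
Roughly this shape, in case it helps anyone (a sketch using the dalli gem; method and key names are made up):

  # Serve the series from memcached; only fall through to Redis on a miss
  def counts_for(customer_id, days = 30)
    memcache.fetch("counts:#{customer_id}:#{Date.today}", 60) do  # 60s TTL
      keys = (0...days).map { |n| "clicks:#{customer_id}:#{Date.today - n}" }
      redis.mget(*keys).map(&:to_i)
    end
  end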


I'm surprised that Redis cannot handle your traffic. What are you throwing at it? Or are you on EC2, with frequent BGSAVEs or AOF enabled on an EBS volume?


Rackspace Cloud - no AOF, only BGSAVEs. Currently my save frequency is:

  save 1900 1
  save 1300 10
  save 160 10000
Perhaps there's a better way for me to tune this? Most of the data I store in Redis is temporal, so I don't mind losing it. But I do store stats on feature usage for reporting in my admin dashboard, similar to what this article describes, and that stuff I'd like to keep around. Even so, if I lost a few hours or even a day, I'm not going to lose sleep.

I should add that I also use Redis to handle some pretty large calculations: ZRANGEs for distance calculations, intersections, and so on. I did some benchmarking and found that, at least in Ruby (1.9.3), it was more efficient to load the sets into Redis and do the sorting and intersecting there: fewer GC hits and faster sorts. I'm thinking it might be good, for these one-off frequent compute tasks, to run a Redis instance with no save at all, especially considering I expire/delete the keys immediately after running my calculations.


Have you tried running the redis-benchmark tool to see how your figures compare to other published stats?


I've been working on a dashboard that uses Redis as well. Just wondering: why not take advantage of sets/lists/zsets for your date-related keys? With lists you can do an easy LRANGE instead of that loop of GETs you're doing now.

Also, if you don't do this already, look into using bitsets to track users. As long as user IDs are integers, it's really easy and saves a lot of space.
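
For anyone curious, the bitset trick looks like this (sketch; key names are just illustrative):

  # Flip one bit per user id when user 123 does something today
  redis.setbit "active:#{Date.today}", 123, 1

  # Daily actives = number of set bits
  redis.bitcount "active:#{Date.today}"

  # Weekly uniques: OR the daily bitsets together, then count
  days = (0..6).map { |n| "active:#{Date.today - n}" }
  redis.bitop "or", "active:week", *days
  redis.bitcount "active:week"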


Nothing specific about redis or new here. Of course if you can easily cache a pre-calculated value, that'll save you time and CPU.


Well, to be fair, the cached counts are incremented in Redis through the INCR command, so there's no pre-calculation here. It can also be interesting to aggregate float metrics using the INCRBYFLOAT command.
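
e.g. rolling up revenue per day (sketch; the key name is illustrative):

  redis.incrbyfloat "revenue:#{Date.today}", 19.99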


Why not store & update count in database. Rather Redis. Also I could never appreciate the redistogo. I can get much powerful redis instance on EC2 for much lesser price.


nitpick: "The values are almost inconsequential since THEIR just numbers" should be either THEY ARE or THEY'RE


Good god. I can't believe I missed that. Thanks. I'll make sure that gets fixed.


This is exactly what I'm building http://www.instahero.com to solve. You don't have to build your own infrastructure, just write the relevant bit of code (or select a template) and you have a dashboard.


Thanks for sharing. I agree with @bunkat here (though I didn't read the whole page). I find the language around analytics apps (including big boys like Mixpanel) to be generally vague. If you're catering to a primarily developer audience, you may consider just showing me how easy it is. StatHat does a good job of this.



I think your value prop could be honed a bit. I tend to read more on a site than most people will, but I still couldn't get through the wall of text without losing interest and leaving. An example of 'write this code and get this!' would have kept me around.


That sales letter converts better than the actual page, but you're right. Here's some sample code: http://www.instahero.com/blog/2012/10/11/using-instahero-gai...


I use Fnordmetric, a Ruby/Redis dashboard. Works great.


Please always include a link: https://github.com/paulasmuth/fnordmetric


That's slick.



