Reddit, Stumbleupon, Del.icio.us and Hacker News Algorithms Exposed (seomoz.org)
67 points by marrone on July 3, 2008 | 28 comments



These are not the algorithms but an overview of the formulas.

For example, the HN algorithm is outlined in several places, e.g. http://news.ycombinator.com/item?id=38704 and on the pg Arc site, and the reddit algorithm is:

  def hot(ups, downs, date):
      s = score(ups, downs)
      order = log(max(abs(s), 1), 10)
      sign = 1 if s > 0 else -1 if s < 0 else 0
      seconds = epoch_seconds(date) - 1134028003
      return round(order + sign * seconds / 45000, 7)
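
To try the function above, the two helpers it calls need definitions. The sketch below assumes `score` is simply upvotes minus downvotes and `epoch_seconds` is seconds since the Unix epoch; neither definition is given in this thread, so treat both as assumptions.

```python
from datetime import datetime, timedelta, timezone
from math import log


def epoch_seconds(date):
    """Seconds since the Unix epoch (assumed helper; expects an aware datetime)."""
    return date.timestamp()


def score(ups, downs):
    """Net votes (assumed helper)."""
    return ups - downs


def hot(ups, downs, date):
    # Same body as the function quoted above.
    s = score(ups, downs)
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003
    return round(order + sign * seconds / 45000, 7)


# The reference instant used by the formula, as an aware UTC datetime.
t0 = datetime.fromtimestamp(1134028003, tz=timezone.utc)
older = hot(10, 0, t0)
newer = hot(100, 0, t0 + timedelta(seconds=45000))
```

Under these assumptions, each tenfold increase in net score adds 1 to the result, and so does each 45000 seconds (12.5 hours) of submission time for a positively scored post.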


the delicious one is wrong.


Thanks for reading the post. What do you think is wrong with the del.icio.us formula?


The actual algorithm is more complicated than that, is all.


eheh... any chance you can elaborate on that? Give us a few more details?


You just demonstrated one of the coolest things about posting at news.yc. You never know who you're gonna run into. Even when they're replying directly to you.

edit: Hint, click his name...


:)


This is why I love Hacker News.


Use of true clock time in these formulas penalizes posts that are submitted at low-traffic times.

Using a virtual clock, such as a tick count based on traffic (number of other submissions, votes, pageviews, etc.), could adjust for this.
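
A minimal sketch of that idea, assuming a hypothetical `VirtualClock` that advances one tick per site event and a simple power-law decay for ranking (the class, the function name, and the decay form are all illustrative, not taken from any of the sites discussed):

```python
class VirtualClock:
    """Counts activity events (submissions, votes, pageviews) instead of seconds."""

    def __init__(self):
        self.ticks = 0

    def record_event(self, weight=1):
        # Each unit of traffic advances the clock; quiet periods barely move it.
        self.ticks += weight

    def now(self):
        return self.ticks


def rank(points, submitted_tick, clock, gravity=1.5):
    """Decay a post's points by its age in ticks rather than wall-clock time."""
    age = clock.now() - submitted_tick
    return points / (age + 1) ** gravity


clock = VirtualClock()
submitted = clock.now()
fresh = rank(10, submitted, clock)   # no traffic yet, so no decay

for _ in range(3):                   # three events happen on the site
    clock.record_event()
aged = rank(10, submitted, clock)
```

A post submitted at 4 a.m. ages only as fast as the site is active, so it is not buried before anyone has had a chance to see it.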


Good observation. I believe a similar concept is used in high frequency trading (in markets such as foreign exchange). So derivatives are not dt, but du where u is a surrogate for time in ticks.


That's an interesting idea!


For reddit, this is the public algo released with their code. I think we should not assume that's the algo they use on reddit.com. Think about it: they could easily release a basic algo and keep the good one for their own internal use.

At the very least, even reddit's blog post announcement of the code mentioned they're not open sourcing the spam detection code. Surely that's part of the ranking algo in that it determines if a submission gets ranked or not.

For the SU one, I think there is more to the story, but that's for another post...


I don't think they'd bother trying to fake people out by releasing a different ranking algorithm than they used on the site. There wouldn't be any point anyway. Knowing how to get onto the frontpage of a venetian-blind site like Digg might be useful to spammers, but knowing the ranking algorithm of a bubble-up site like Reddit or News.YC wouldn't help them much.


It really doesn't get much better than (p - 1)/t^n, where n is larger if you want age to be more of a handicap.
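
As a rough sketch of that formula, with `p` as points, `t` as age, and `n` as the age exponent (the function name and the choice of hours as the unit are assumptions for illustration):

```python
def decayed_score(points, age_hours, n=1.5):
    """(p - 1) / t^n: the submitter's own vote is discounted, age is penalized."""
    if age_hours <= 0:
        raise ValueError("age_hours must be positive to avoid dividing by zero")
    return (points - 1) / age_hours ** n


mild = decayed_score(9, 4, n=1.5)    # 8 / 4**1.5
harsh = decayed_score(9, 4, n=2)     # 8 / 4**2
```

With the same points and age, the larger exponent gives the lower score, which is what "age is more of a handicap" means in practice.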


Not to be an ass, but how hard can they be anyway? It's not a formula like a^2 + b^2 = c^2.

It is just some variables that they tweaked until they fit.


I thought karma had an influence on how something appears. If not what is the point of karma?


To keep you in line, unfortunately.


Yippee! That's great. But how did you find out delicious's and stumbleupon's? By reverse engineering? Were these found out (or guessed) by constantly keeping track of the ranking of content?

Joshu, I think the del.icio.us formula is right, or they might just be using a different time window than 1 hour (maybe 2 hours or something else), because finding out what's popular doesn't require anything complex. Simple math as specified in Danny's post is enough for their task.

Go Danny! Go!!!


Umm, yeah. To make it more clear, when you refer to 'they' in your comment, you are actually talking about Joshu!


I would trust user 'joshu' on matters related to del.icio.us.


God!

Gojomo, is that really Joshua, the founder of Delicious and Memepool? I just guessed it when you mentioned the username in single quotes. ;) I wouldn't say a word about the algo then. :D


Click on joshu's user name.


Yeah, Jax. You know, he might just be the founder of it.


It would be nice to have a less misleading title. I clicked to find out what Digg's algorithm is.


Wait - the title doesn't include Digg, so how exactly is it misleading? Maybe you're being sarcastic and my sarcasm detector just isn't well-tuned.


The title was edited, apparently. When I clicked, it said Digg, and the article still has a section on Digg.


It can't be too hard to find .yc's, since it's open. I hope this doesn't send a flood of SEOs ;)


Mathematically speaking, would it make a difference to use p^2 / t instead of p / t^2?
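
One way to check this numerically, as a sketch with made-up posts: for ranking purposes only the ordering matters, and for positive values p^2 / t orders posts the same way as its square root, p / t^0.5. So it behaves like a much smaller age exponent than p / t^2, and the two can disagree about which post comes first.

```python
def rank_a(p, t):
    """p^2 / t"""
    return p ** 2 / t


def rank_b(p, t):
    """p / t^2"""
    return p / t ** 2


# Hypothetical posts as (points, age): one young with few points, one older with many.
young = (10, 1)
old = (40, 4)

a_prefers_old = rank_a(*old) > rank_a(*young)      # 400.0 vs 100.0
b_prefers_young = rank_b(*young) > rank_b(*old)    # 10.0 vs 2.5

# p^2 / t gives the same ordering as p / t^0.5 (square root is monotonic).
posts = [(10, 1), (40, 4), (5, 2), (30, 9)]
by_a = sorted(posts, key=lambda x: rank_a(*x))
by_sqrt = sorted(posts, key=lambda x: x[0] / x[1] ** 0.5)
```

In this example p^2 / t puts the older, higher-scoring post first, while p / t^2 puts the younger one first.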



