Top domains by aggregate score on HN from the past year (hntrending.com)
193 points by vinnyglennon on Dec 7, 2018 | 46 comments



Very cool. Here's a similar ranking that takes average scores into account. It surfaces a lot of the smaller domains that get a lot of HN love: https://hnleaderboard.com


This sort of thing is an excellent seed for host authority if you're making a search engine. Often the trick is to find places where humans have curated 'good' web sites out of the sea of all possible web sites. HN is a good source of data because you get several human signals: one is upvotes, the other is comment points. Between the two you can create a rank for both how non-spammy a site is and how controversial it is.


Until the spammers find out that Google values those results and supply you with an endless stream of drivel and trash to wade through.

(cough cough reddit)


@shawn, it seems you have been shadowbanned.


He already knows but continues to harass HN. I know the dude personally, he is deranged and obsessed about how HN is being "manipulated by the mods."

Shawn: we know HN is curated. The mods do a good job, that is why it is not a shithole like reddit.

Find a new hobby instead of being obsessed with someone else's platform like an angsty fixated five year old.


This is actually much more useful than the OP's list, at least for me. OP's list has a lot of high volume, low score stuff taken into account.


Thanks for that reminder (I was going to say: “I wish I could sort this by average or something instead”).

Neither is perfect though. A lot of the best articles come from the BBC. Yet because so many people submit (and resubmit) BBC articles, they have an awful per-posting score.

I think you’d want to pre-filter by submissions that actually got some traction (made it to the front page?) and then look at the score distribution.
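A minimal sketch of that pre-filtering idea. The sample data is made up, and the 50-point `min_score` threshold is only an assumed stand-in for "made it to the front page", since the thread does not define one:

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample data: (domain, score) per submission. Not real HN numbers.
SUBMISSIONS = [
    ("bbc.com", 1), ("bbc.com", 2), ("bbc.com", 310), ("bbc.com", 150),
    ("example.org", 60), ("example.org", 3),
]

def scores_with_traction(submissions, min_score=50):
    """Group scores by domain, keeping only submissions at or above min_score.

    min_score is an assumed proxy for "got some traction".
    """
    by_domain = defaultdict(list)
    for domain, score in submissions:
        if score >= min_score:
            by_domain[domain].append(score)
    return dict(by_domain)

def summarize(by_domain):
    """Return {domain: (count, median, max)} for the filtered scores."""
    return {d: (len(s), median(s), max(s)) for d, s in by_domain.items()}

filtered = scores_with_traction(SUBMISSIONS)
summary = summarize(filtered)
```

With the resubmission noise dropped, the heavily resubmitted domain is judged only on the posts that actually took off.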


And here's yet another searchable/filterable HN archive https://hn.algolia.com. You can find more info on the (free) HN API here https://github.com/HackerNews/API


https://hnify.com, an offline one-pager of all the top stories/comments over the last 2 days, is mainly based on this Algolia API, an excellent tool.


Here's the top 10 all-time sorted by avg (from the top 100):

    blog.samaltman.com 216.35
    paulgraham.com 114.9
    blog.ycombinator.com 108.09
    stripe.com 105.63
    jacquesmattheij.com 91.98
    blog.mozilla.org 57.8
    hacks.mozilla.org 48.78
    www.apple.com 46.98
    www.marco.org 46.54
    googleblog.blogspot.com 37.41

I guess that's a bit biased toward ones submitted infrequently.


top general news site: nytimes.com

top multi blog site: medium.com

top tech news site: techcrunch.com

top social media site: twitter.com

top journal site: arxiv.org

top cloud provider site: aws.amazon.com

top for-profit company tech blog: blog.cloudflare.com

top personal blog: drewdevault.com

top disregard of users' data site: www.facebook.com


>top disregard of users' data site

LOL


I feel like bbc.com and bbc.co.uk should be combined and rank 6 overall on this list.
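Merging the two could be as simple as an alias table applied before summing. This is only a sketch: the `ALIASES` map and the sample totals are hypothetical, and only the BBC pair comes from this thread.

```python
from collections import Counter

# Assumed alias table; only the BBC pair is taken from the comment above.
ALIASES = {"bbc.co.uk": "bbc.com"}

def canonical(domain):
    """Lowercase, drop a leading 'www.', then apply the alias table."""
    domain = domain.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return ALIASES.get(domain, domain)

def merge_totals(totals):
    """Sum per-domain scores after mapping each domain to its canonical name."""
    merged = Counter()
    for domain, score in totals.items():
        merged[canonical(domain)] += score
    return dict(merged)
```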


Is this just cumulative? It seems to me that ordering by average score would be better. Is GitHub simply pushed to the top for being low rank, high volume?


It is cumulative. I can probably add an option to view by average - but lots of domains pop up with only 1-2 stories that had lots of upvotes.
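For illustration, here is one way both views could be computed while hiding the one-hit domains. It assumes items shaped like the HN API's story objects (a `url` and a `score` field); the `min_stories` cutoff is an arbitrary assumption, not how hntrending.com works.

```python
from collections import defaultdict
from urllib.parse import urlparse

def domain_stats(items, min_stories=3):
    """Return {domain: (total, average, count)} for domains with enough stories.

    items: dicts with optional 'url' and 'score' keys.
    min_stories: assumed cutoff that hides domains with only a story or two.
    """
    scores = defaultdict(list)
    for item in items:
        url = item.get("url")
        if not url:  # text posts such as Ask HN carry no url
            continue
        host = urlparse(url).hostname
        if host:
            scores[host].append(item.get("score", 0))
    return {
        host: (sum(s), sum(s) / len(s), len(s))
        for host, s in scores.items()
        if len(s) >= min_stories
    }
```

Sorting the result by the first tuple element gives the cumulative ranking, and sorting by the second gives the average ranking.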


This seems a frequent problem. What is the most robust formula for sorting by average while boosting by frequency? Something like avg(ratings) * log(len(ratings))? Maybe that curve needs to be tweaked based on the use case?

I wish sites like amazon had something like this, since sorting by average rating is completely useless if you have a long tail.
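A quick sketch of the formula proposed above, next to a Bayesian-style shrunk average. The shrunk average is a commonly used alternative and not something from this thread; `prior_mean` and `prior_weight` are assumed tuning knobs.

```python
import math

def log_boosted_average(ratings):
    """avg(ratings) * log(len(ratings)), as proposed in the comment.

    Note that log(1) == 0, so an item with a single rating scores 0.
    """
    if not ratings:
        return 0.0
    return (sum(ratings) / len(ratings)) * math.log(len(ratings))

def bayesian_average(ratings, prior_mean, prior_weight=5):
    """Shrink the mean toward prior_mean (alternative, not from the thread).

    prior_weight behaves like a number of phantom ratings at prior_mean.
    """
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))
```

One visible difference: the log version grows without bound as the count rises, while the shrunk average converges to the item's own mean once real ratings outnumber the phantom ones.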


To me, the interesting number is average points per submission. It's surprising to see how badly medium.com, forbes.com, and theregister.com do by that metric, and how well stripe.com, ifixit.com, and blog.rust-lang.org do...


Shout out to danluu.com — literally the top average score for the last year and in the top few for last three years (per https://hnleaderboard.com/ ). Probably my favorite blogger and it seems popular with HN as well.


Hello - I created the site. I would encourage you to take a look at the 'links' section, which is something you probably have not seen before. It aggregates URLs found in comments and ranks them by count for a variety of sites - including XKCD!:

https://hntrending.com/links/all/xkcd/index.html

You can view summaries of Wikipedia articles or abstracts of arXiv papers on the site as well.

I hope you find it useful!


Very interesting and well done. Something seems off for the 'ask' page though. "Top Ask HN stories from the past year" but most only have a few points.


I’ll look into it, thanks.


Well done! Did you consider using other metrics (like median instead of mean) or other techniques (e.g. removing outliers, calculating confidence intervals, considering the standard deviation, etc.)? I'm aware that things can become complex (to visualize and interpret) and maybe not many people would be interested in something more complex than the mean. :)
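To make those options concrete, here is a small sketch using only the standard library. The 10% trim and the normal-approximation interval are assumptions picked for illustration; HN scores are heavily skewed, so the interval in particular is rough.

```python
import math
from statistics import mean, median, stdev

def robust_summary(scores, trim=0.1):
    """Return mean, median, trimmed mean, and a rough 95% interval for the mean.

    Expects a non-empty list of scores. trim is the fraction dropped from
    each end before computing the trimmed mean.
    """
    ordered = sorted(scores)
    k = int(len(ordered) * trim)  # how many values to drop from each end
    trimmed = ordered[k:len(ordered) - k] if k else ordered
    m = mean(ordered)
    # Normal-approximation interval; stdev needs at least two values.
    if len(ordered) > 1:
        half = 1.96 * stdev(ordered) / math.sqrt(len(ordered))
    else:
        half = 0.0
    return {
        "mean": m,
        "median": median(ordered),
        "trimmed_mean": mean(trimmed),
        "ci95": (m - half, m + half),
    }
```

A single viral story drags the mean far from the median and the trimmed mean, which is exactly the outlier effect the comment is asking about.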


Mozilla and AWS are the top listed that aren't news/info sites, though Mozilla sometimes also posts general tech stuff.


Does this have any meaning to anyone? As someone that enjoys submitting articles and trying to find patterns in their success or failure, I do not see much significance in where the story came from as opposed to a catchy title or subject. I'm often surprised by what catches and what flops for what that's worth.


Hello. I created the site. I think it can be useful to browse domains for certain sites. For example, Github.com can be used to browse popular projects. aws.amazon.com can be browsed to keep up with large AWS announcements, and www.reddit.com can be used to see top posts on reddit.

example: https://hntrending.com/domains/year/github.com/

Why did I create this site? Because it can be very difficult to keep up with tech news. There are great new stories on HN every single day. I simply can't keep up. The solution? A way to browse top stories from the past week/month/year or all time. I am letting the wisdom of the crowds decide for me what is most important, but it is nice to be able to take a vacation without worrying I will miss something big. I still read HN and other tech sites almost daily, but this helps me review the more popular stories which can be both interesting and fun.


Check skimfeed.com, though it only has top stories from the last 48+ hours.


I find it surprising that nytimes takes 2nd overall, top in news category. It means nytimes is doing things better than others, and the news industry is very motivated to figure out how to make digital work.


Glad to see The Economist in top 20


Quite interesting.

I am surprised by Wikipedia’s performance. How does an encyclopedia (even an excellent and ever-evolving one) beat out the Washington Post for topical news?


The wikipedia links submitted that make the front pages are almost automatic reads for me. They aren't news so much as obscurities, and are usually pretty fascinating.


I agree that they can be good. It’s just that whatever someone will surface in 2020 was probably sitting there in 2012. Why didn’t we look at it in 2012? Surely there is a listing of obscure topics. But somehow it becomes a must read in the HN context. (I do it too)


In terms of product organisations, the top ones on the list are Mozilla, AWS, Cloudflare, then Apple ... wouldn’t have guessed that (apart from maybe Apple).


Thanks hacker news for using up all my limited article passes on the first day of the month.

Has anybody else found the web button largely useless these days?


There is GitHub, news pages, blogs, social media, and then apple.com, ...


The Atlantic is such a quality publication, glad to see it is read.


Much more interesting would be domains of interest stats.


Haha, that was more depressing than imagined.


Surprised arxiv is only 22.


That the guardian is so high is extremely worrying.


More than the BBC? Some of their articles are barely above clickbait, and their political pieces can be completely biased.


I await examples of the biased BBC articles.


It oughtn't be. You might not like their opinion pages -- yes, they're typically left-wing -- but The Guardian's reporting is absolutely top notch; amongst the best in the world.


No. They constantly blur editorial features and news. The most shared posts are ludicrous hysterical nutters that they give a platform to because it generates clicks.


How do you differentiate between opinionated/politically biased pieces and general reporting? What are the criteria for evaluating good (i.e. trustworthy) reporting? Curious to know.


Opinion pages are clearly set apart from news reporting. In The Guardian, opinion pieces appear in the "Comment is Free" section.


They usually say that the piece is an opinion piece.



