HN will be down Saturday morning while we switch servers
225 points by pg on Feb 15, 2013 | 118 comments
Rtm has lined up a new, faster server for us. He currently plans to switch over to it at 6 am EST on Saturday. He says the site will be down for around 10 minutes, but you know how these things go...



If even rtm, tenured professor of computer science at MIT, can't escape from sysadmin duty, I really have no hope.


    Compared to system administration, 
    being cursed forever is a step up.
-- Paul Tomko


Do you think you could go down for a week?

That's generally how long I need to break an addiction.


I recommend using StayFocusd: https://chrome.google.com/webstore/detail/stayfocusd/laankej...

It lets you limit the number of minutes you can spend reading sites (and, importantly, links from those sites). I give myself 10 minutes per day until 9pm.


I actually use StayFocusd, but I always have my phone and tablet, which don't work with plugins!


Add me to the list - I'm in for $100 if we can make it a one week outage. $250 if we can stretch it out to two weeks.

I'm sorry, but 10 minutes on a Saturday morning does nothing for me.


My router at home blocks Reddit and HN except from 5-7pm and 11:45-midnight (and all day Saturday). Worked amazingly well to cut back the addiction. Now if I could just get HTTPS blocking to work on FB and Twitter.


I do the same. It works pretty well; it's crazy when you start catching yourself trying to go to one of them out of habit.


I don't know why, but I find myself opening new tabs without thinking about it. On Facebook? Ctrl+T, "fa", down arrow, enter. Why? Who knows, I was already there... That's when I noticed that I have some major problems.


You can lock yourself out easily with noprocrast and minaway set to a week or two (say 60 x 24 x 10). At least one prominent HN'er claims to have done this when needing a break.
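For reference, the arithmetic behind that value (minaway is specified in minutes), as a quick sketch:

```python
# minaway is in minutes: 60 min/hour * 24 hours/day * 10 days
minutes_per_hour, hours_per_day, days = 60, 24, 10
minaway = minutes_per_hour * hours_per_day * days
print(minaway)  # 14400 minutes, i.e. a ten-day lockout
```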

Happy to have been of assistance; please send the $250 to GiveWell. ;-)


Without meaning to sound snarky or anything, isn’t there something else at work if you have to externally block sites to stop you procrastinating?

I would have thought getting to the root cause of your “HN addiction” would probably be a better alternative.


OK, I'll bite: how do I go about finding the root cause?


Editing hosts files works wonders.
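For anyone unfamiliar, that means adding entries like these to the hosts file (/etc/hosts on Unix-like systems, C:\Windows\System32\drivers\etc\hosts on Windows):

```
# Point HN at localhost so the browser can't reach it
127.0.0.1  news.ycombinator.com
```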


2,000 startups will be launched on Monday morning as a result of this downtime.


... broken window fallacy ?


Not sure what you mean by that, but I think it'll be closer to:

    "HN's down!"
    "Fuck it, let's get some work done."


I misread it as, "2000 startups are created to replace functionality." :P


For all the flak HN gets about fnid problems, etc., it's still awe-inspiring to me that such a popular site runs on a single server, with so little admin time ;)


Just how popular is HN?

Anyone got info on uniques or pageviews a month?


We currently get just over 200k uniques and just under 2m page views on weekdays (less on weekends).


Thank you for creating a wonderful source of information!


Does a single server support this level of traffic? That is pretty impressive - it would be great to learn about the server's hardware and software configuration.


Old server: two Xeon E5450 chips, 3.0 GHz, 8 cores total, 24 GB RAM.

New server: one Xeon E5-2690 chip, 2.9 GHz, 8 cores total, 32 GB RAM.


Wow, nice, that was exactly the HW I was predicting: the Xeon with the fastest single-core performance under Turbo Boost.


I don't know whether things have changed since, but some time ago (a year or more?), IIRC pg mentioned that the site was running on a single core. (I don't recall his saying what type of core.)

P.S. I see that abstractbill beat me to it, both in stating here and in originally acquiring this information:

https://news.ycombinator.com/item?id=5229548


Around 40-60 requests per second, of which almost everything can be cached. Almost any single server should suffice.
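As a back-of-the-envelope check against the traffic numbers quoted upthread (just under 2M page views on weekdays):

```python
# Rough sanity check: average request rate implied by ~2M page views/day.
page_views_per_day = 2_000_000
seconds_per_day = 24 * 60 * 60  # 86,400
avg_rps = page_views_per_day / seconds_per_day
print(f"average: {avg_rps:.1f} req/s")  # peaks of 40-60 req/s imply a ~2-3x peak/average ratio
```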


But you would need at least 30 Heroku dynos to handle that reliably ;)


As of October 5th 2011:

"Also, a traffic update: HN now gets over 120k unique ips on a weekday, and serves over 1.3 million page views."

http://ycombinator.com/newsnews.html

Additionally:

http://www.alexa.com/siteinfo/ycombinator.com


Don't you dare mention Alexa stats here.


Why? (genuine question)


They are known to be very wrong.


Just out of curiosity, what are the specs of the old vs the new server?


Can you please post a picture of the server so we can see all the sweet LEDs on network cards?



I call foul. This doesn't even look like it's powered up.

Edit: Wow. 3.5 years on HN and my first down-voted post. I guess I should have added a sarcasm tag or stuck with keeping comments "safe".


> 3.5 years on HN and my first down-voted post

Honest? Then, if I were you, I'd try to be a bit less agreeable from time to time. I get plenty of downvotes. I'd hope even pg gets his share; that would show sane critical thinking.





What about http://imgur.com/BOaFPtn

So compact yet so powerful.


Ah the good old days, when computers were so simple and yet so mysterious and exciting at the same time.


I don't know, I still find them exciting -- although PC hardware isn't as fun as it used to be.


Not one of these?

http://ed-thelen.org/comp-hist/vs-bendix-g15.jpg

[one of the first computers I ever wrote code for - a three-address machine: first the source operand, second the destination, third the address of the next instruction. The memory was a rotating drum]


Mel?



That would have been really cool, and a true hacker's machine, but I imagine that the days when Lisp needed (or benefited from) custom hardware are long gone.


Love the symmetry of the photo...triangle. (I suspect with the orange background that it was taken at the YC office? http://m.inc.com/?incid=96)


Haha, epic response.


// 3 hours ago · 1,439 views

Impressive.


In case anyone's wondering, the colo appears to be The Planet.

    whois -h whois.arin.net 174.132.225.106

    Name:      6a.e1.84ae.static.theplanet.com
    Address:   174.132.225.106
    NetRange:  174.132.0.0 - 174.133.255.255
    CIDR:      174.132.0.0/15
    OriginAS:  AS36420, AS30315, AS13749, AS21844
    NetName:   NETBLK-THEPLANET-BLK-15
    NetHandle: NET-174-132-0-0-1
    Parent:    NET-174-0-0-0-0
    NetType:   Direct Allocation
    RegDate:   2008-06-17
    Updated:   2012-02-24
    Ref:       http://whois.arin.net/rest/net/NET-174-132-0-0-1


*SoftLayer. The Planet is no longer, but a lot of their reverse DNS entries still point to .theplanet.com.


I had forgotten that, but it was actually a merger:

http://www.datacenterknowledge.com/archives/2010/11/10/softl...

In any case, they never got the block changed at ARIN. We switched providers at one point within the last 6 months, and the old colo already changed the ARIN block.


Does all of HN run on one single server?


I think a lot of people underestimate how powerful a single well configured server can be. You don't need Heroku/AWS for everything.


Also, people underestimate the power of serving out of RAM. It's not unreasonable to serve 20-30K QPS off a single server if the work it needs to do is limited to minimal request parsing and fetching some data from main memory. That's about 2.5 billion requests/day, fully loaded. Granted, I'm thinking something more like memcached than a fully-formed webserver, but an in-memory webserver that stores its data in hashtables (like news.yc) and has a really fast templating language, or just writes output directly to the socket, could probably come close.
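A toy version of that idea, as a sketch (Python standing in; HN itself is written in Arc, and the handler name and page contents here are invented):

```python
# Minimal in-memory webserver sketch: pages live in a hashtable and
# responses are written straight to the socket. No disk or DB on the
# request path.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGES = {  # everything served from RAM
    "/": b"<html><body>front page</body></html>",
    "/item": b"<html><body>an item</body></html>",
}

class RamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the hot path quiet
        pass

def make_server(port=0):
    # port 0 asks the OS for a free port; call .serve_forever() to run
    return HTTPServer(("127.0.0.1", port), RamHandler)
```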


I use redis for this exact reason -- I prerender over 2,000 page templates twice a day and store them in RAM. The app server has to do a little processing before sending the pages to users -- it picks a different template depending on whether the user's logged in or not, and then substitutes the user's info into the template (for logout/profile links). The session info is also stored in redis. This lets me reboot the server and be ready to serve pages again almost as soon as it's back up. With all the data, redis uses about 300-400MB of RAM on a 64-bit Debian VM.

I use a VPS for my site, and on a VPS, the only thing you're allocated that you can depend on always being available is RAM. The processor cores might be shared with a busy user, and you can't always depend on high disk I/O speeds.
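A sketch of that flow under stated assumptions: a plain dict stands in for redis here, and the page/template names and markup are invented for illustration.

```python
# Prerender-then-substitute sketch: templates are built ahead of time and
# cached in RAM; per-request work is just picking a template by login
# state and filling in user info.
from string import Template

cache = {}  # redis stand-in: template name -> prerendered template text

def prerender():
    # in the setup described above this runs twice a day for 2,000+ pages
    cache["home:logged_in"] = "<nav>$username | logout</nav><p>content</p>"
    cache["home:anon"] = "<nav>login</nav><p>content</p>"

def render(page, user=None):
    # per-request: anonymous users get the static page as-is
    if user is None:
        return cache[f"{page}:anon"]
    return Template(cache[f"{page}:logged_in"]).substitute(username=user)

prerender()
```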


Not just a single server, a single process on a single core (last time I asked pg at least).


And I recall the last time I looked at the source (when it was released with Arc), it didn't use a database either. All the data is stored as files.


A link to (possibly an old version of) the source for those interested: https://github.com/nex3/arc/blob/master/lib/news.arc


Interesting read.

I'd love to see my median karma on my profile. Much more robust to outliers than average.

https://github.com/nex3/arc/blob/master/lib/news.arc#L2613


Just like Viaweb.


Files???? That's a joke right???


Why should it be? You can get a long long way by treating the filesystem as a database. The first engineers at Amazon used the same technique a lot, as do I.


Because you always end up building your own database out of flat files and that is always worse than using an existing one.


If it were always worse, then every developer doing this must be stupid. Here are some ways in which a filesystem is "better":

- Zero administration

- Only configuration setting is the directory

- Trivial to test

- Trivial to examine with existing tools, backup, modify etc

- Works with any operating system, language, platform, libraries etc

- Good performance characteristics and well tuned by the operating system

- Easy for any developer to understand

- No dependencies

- Security model is trivial to understand and is a base part of operating system

- Data is not externally accessible

Many existing databases have attributes that aren't desirable. For example, they tend to care about data integrity and durability at the expense of other things (e.g. more administration, lower performance). For a use case like HN, losing 1 out of every 1,000 comments wouldn't be that big a deal - it isn't a bank.

Consider the development, deployment and administrative differences between doing "hello world" with a filesystem versus an existing database. Of course this doesn't always mean filesystems should be used. Developers should be practical and prudent.

TLDR: YAGNI, KISS, DTSTTCPW
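To make the comparison concrete, here's a minimal filesystem-backed store in the spirit of that list. This is a sketch, not HN's code; the write-to-temp-then-rename trick is one common way to keep readers from ever seeing a half-written value.

```python
# "Filesystem as database": each key maps to one file in a directory.
# Keys are assumed to be safe filenames (no path separators).
import os
import tempfile

class FsStore:
    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key)

    def put(self, key, value: bytes):
        # write to a temp file, then atomically rename over the target
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(value)
        os.replace(tmp, self._path(key))  # atomic on POSIX

    def get(self, key) -> bytes:
        with open(self._path(key), "rb") as f:
            return f.read()
```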


You also get automatic buffering of the data, courtesy of the OS page cache.


ACID transactionality.

Thank you, drive through.


TIL: DTSTTCPW


And yet, here you are, on a site run off a flat file database.


Just a nitpick: a "flat file" database suggests encoding all the data in a single file.

Using a filesystem as a database is a little different, as filesystems are databases in their own right.

The question to ask is "is the data I want to store in my 'database' enough like the data stored in a filesystem that I can just use the filesystem as my database?"


A commenting site that:

- Has average latency over 500ms when not under load

- Performs quite poorly under load (I hate to bring it up, but the most recent example was Aaron Swartz's passing. Anyone who used HN then to get news knows how poorly HN performs under load)

- Is restarted every week or two because it leaks memory

- Keeps XSRF tokens in memory and loses them across restarts

- Doesn't have a full markup language

HN is quite poorly-featured compared to typical commenting sites. People use HN because pg is here. He could remove half the features on the site (bold & italics... what features are there even to remove beside nested commenting?) and retain 90% of the audience.


>People use HN because pg is here.

Nothing personal against pg - but I'm here more because of everyone else - the caliber of the discussion, for a news/tech site is quite high, IMHO - and that's due to everyone, not just pg.


Well, I guess we can agree to disagree. HN is popular for me because of the participants. pg, as epic and central as he is to ycombinator, doesn't play that much of a role on HN in terms of moderating and directing conversations, or even, in recent years, participating that much.

With regards to the commenting site itself, I can think of no more viscerally enjoyable a forum I've ever participated in, with the possible exception of *Forum on MTS. There is nothing whatsoever that I would change about it, with the one possible exception of tweaking the markup so you could add fixed-width text/lists that wrap over multiple lines. It's the only additional feature I've ever wanted out of HN. There is beauty in its simplicity. [Edit: Okay, I would also move the upvote/downvote arrows a bit for mobile usage. It's almost impossible to hit the right one without a lot of zooming.]

And, with the rare exception of an MSM hit, the performance is more than adequate for an environment that should encourage reading, digesting, and composing.


You're probably seeing the artificial delay introduced for commenters that can't maintain at least a 4.0 comment point average or people who aren't signed in. Site runs like butter for me.


Site runs faster for people not logged in, too.


Given that your average is below 4.0, this hypothesis seems questionable. Can you give a cite for this 4.0 rule?


I didn't even know who pg was until I'd been using this site for years. So no.


Actually I'd never heard of pg or YC until they bubbled up into my consciousness from reading HN.


You never saw the domain name?


It was just a domain name.


A worthwhile criticism, except...

How many ads do you see on HN? What's the ad revenue? What are the operating costs?


So, to reiterate your argument:

1. We don't matter to pg because we don't generate profit and are instead a slow drain on resources

2. Therefore this site must be a relic of the 90s written in an ad-hoc collection of mzscheme macros


That's a completely accurate mischaracterization of what I've said, yes.


Just as one example, I think you could make a fairly convincing case that the official version of git uses a "filesystem as a database" system with great success.

I doubt it would be improved by using something more "proper".
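As a rough illustration of git's layout (simplified: real git zlib-compresses each object and prefixes a type/size header before hashing):

```python
# Git-style content-addressed storage: an object's path is derived from
# the hash of its content, sharded by the first two hex digits the way
# .git/objects is.
import hashlib
import os

def object_path(root, content: bytes) -> str:
    digest = hashlib.sha1(content).hexdigest()
    return os.path.join(root, "objects", digest[:2], digest[2:])

def store(root, content: bytes) -> str:
    # identical content always lands at the same path (free deduplication)
    path = object_path(root, content)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(content)
    return path
```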


Well, no. There are some exceptions, but most databases add a whole lot of bloat you don't necessarily need. Simple files can be just as fast or even faster than using a big database - which is the most important metric to me.


Even if that was true, I'd still tell people to start with flat files for a new project. It's like the advice to do a job yourself before hiring for it: You'll be better equipped to judge how well a database is managing your data if you've already done it yourself.


I find it interesting how easy it is to criticize a functioning system based on some aspect of non-conformance with some hypothetical ideal. The idea that HN wouldn't run on a database is no less astounding than the idea that much of facebook runs on PHP. Design of real systems is often messy and imperfect and deviates from the ideal due to necessity of optimizing one or another factor that may not be obvious.


Facebook doesn't run on plain PHP and hasn't for a while... it's compiled to C++ (via HipHop).


Keep in mind that a single modern physical server that is decently configured (12-16 cores, 128GB of ram) is the processing equivalent of about 16 EC2 m1.large instances.


It would be cool to see some performance "brag numbers" posted after the cutover!


Given that you've got performance issues and a fairly limiting deployment model, I never understood why you didn't get the most absurdly overpowered machine possible. (I assume you're not, because if you were, you'd be upgrading every ~6mo or so as faster single-core machines come out)


Would the CPU really be the bottleneck for HN?


I assume it's CPU and cache/memory bandwidth.

If I were speccing a machine for HN, naively, it would be a competition between a Xeon "enterprise" CPU with huge cache and memory bandwidth (interleaved up) and a gaming/desktop CPU with maxed-out single-core performance. Xeons can do single-core Turbo Boost now, so the E5-2690, which goes up to 3.8GHz, is probably the best bet, but a desktop i7-3970X at 4.0GHz might be an option if you don't need ECC (forgoing ECC also gets you a slight speed improvement on memory).


What are we supposed to do on Saturday morning?



Let's be honest here... 6 AM? It's a good time to be asleep.


That's 11am in the UK, noon in Western Europe, 4:30pm in India, and 8pm in Japan.

Also, it'll be 3am on the West Coast, so some people will still be enjoying their Friday night.

http://everytimezone.com/#2013-2-16,-60,6be


This is just idle speculation, but are there any stats on HN traffic? I have always wondered just how many folks read it, how often, etc. I heard a million accounts being bandied around at one point, and I can't tell whether that's excessive or not.


It was said up in the comments.

120k unique IPs per day. 1.3 million page views.

source: http://ycombinator.com/newsnews.html


Newest data straight from pg in this thread: 200k / 2M.

http://news.ycombinator.com/item?id=5230201


Well, it looks like I'll be getting 10 minutes of work done I didn't plan on doing.


Oh, maybe on the maintenance landing page you could post a big bunch of static links to, say, the top 50 most-voted articles ever! (Or suchlike.) Something like that would keep us busy for a while.


I have a bad habit of checking Hacker News to see whether my wifi is up or down... I'm more productive when the wifi is down because I'm not tempted to read articles. Saturday will be a productivity day!


Profile, procrast settings. Use them! They are amazing!


I usually just block it in my hosts file, but that's a good idea! I'll try it


Will the beefier machine mean the lifetime of the fnids can be increased?

(Funny surprise: the one for this comment box expired before I submitted this comment.)


For a moment there I read "will be shut down". Phew.


I guess tonight would be the wrong time to post my Show HN project that I've been working on then.


Geez, could we stop posting these 'site X is down!' threads? It's gotten so bad that they're being posted preemptively. ;)


Hi. Could someone please explain to me why it is necessary for a service (e.g. HN) to go down while people play with the (increasingly amorphous and abstract) back end? Is it 1990?

Sorry, just hit a nerve. Like doing some OS updates (Windows) and then needing to reboot to "complete the installation". I'm sorry. That totally sucks.


Because the single server that runs it is being turned off and another one is being turned on? It's probably not worth the time to write something to sync processor state from one to the other, or to clone it with some sort of vMotion-type thing.


HN's backend isn't very abstract. The site doesn't really use JavaScript and resides on a single server.

And I agree with you on the Windows issue -- except that ten-plus years ago I did, in fact, reboot and/or shut down my computer.


If you're using a mac http://selfcontrolapp.com/ works pretty well.


10 minutes!? 10 minutes!?

oh gosh, it's okay, I'm just kidding :P


Where will we ask "Is HN down?"


You know this will tickle funny bones more than calm panicked ones, right?


Wheee - currently much faster, although the true test will be on Monday, around 5pm GMT.


It's Friday night. My favorite band (http://www.theanatomyoffrank.com/music) is playing a house show in my town, which is the best kind of show because it's BYOB. Then, my best friend from Texas is in town for the night, and we're meeting up. I'm definitely going to go tear it up and create some memories.

And yet..... all I can think of is to stay up all night creating a "replacement HN" just for Saturday morning. I just started using Django and it would be perfect for this. Must. Resist. Must. Live. Real. Life.



