The right way to deal with frozen processes on Unix (phusion.nl)
69 points by dRiek on Sept 21, 2012 | 18 comments



I'm not sure if that's Darwin terminology, but calling a process that's blocking on I/O or in a tight loop "frozen" certainly isn't traditional in UNIX - you're much more likely to say hung or stuck. In Linux specifically the term has a completely different meaning - "freezing" a process is intentional and involves putting the process into a cgroup and then removing all CPU shares to prevent it from executing, which suspends the process and allows for sleep/hibernation/etc.


The "frozen" terminology is amusing to me since I would normally say that processes in tight loops are "burning CPU".


I have always described such processes as "wedged".


I don't mean to disparage other Ruby application servers, but this is why I use Passenger. It's not that I believe other Ruby application servers aren't good, or that their authors don't understand Unix as well as the guys at Phusion, but there's a real dedication to stability and predictability inside Phusion that I don't see elsewhere.

You can achieve the same results with other app servers, but you're going to have to do a lot of the heavy lifting yourself. I'm not ashamed to admit that I have a lot more confidence in Passenger's solution than I do in my own.


OK, total noob question here. Could we achieve the same thing with something like pgrphack from daemontools?

    pgrphack sh -c "processes" 
Kill the pid for the sh ("agent") and you thereby kill all the processes?

Again, sorry for the noob question. I'm still learning and making mistakes.


No. To instruct kill() to kill a process group, you have to specify the PID of the process group leader as a negative number. Otherwise kill() will kill only a single process.


But won't all the processes in my example have the PGID of sh?


Yes they do, but that is irrelevant. kill(pid) kills the process specified by 'pid'. kill(-pid) kills the process group specified by 'pid'.
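
A minimal sketch of that difference in Python, whose `os.kill` passes its arguments straight to the kill() system call. The `sh -c` command is only a stand-in for an "agent" with children, and `start_new_session=True` is just one way to make the shell a process group leader (an assumption of this sketch, not a claim about how pgrphack works). The pipe is a helper for the demo: it reaches EOF only when every process that inherited it has exited.

```python
import os
import select
import signal
import subprocess

def spawn_group():
    """Start `sh` as a process group leader with two background children.

    Returns (proc, read_fd). Every process in the group inherits the write
    end of a pipe as stdout, so read_fd hits EOF only once all have exited.
    """
    read_fd, write_fd = os.pipe()
    proc = subprocess.Popen(
        ["sh", "-c", "sleep 30 & sleep 30 & echo ready; wait"],
        stdout=write_fd,
        start_new_session=True,  # setsid(): the shell's PID becomes the PGID
    )
    os.close(write_fd)           # only the group holds the write end now
    os.read(read_fd, 16)         # wait for "ready": both children are forked
    return proc, read_fd

def all_exited(read_fd, timeout=1.0):
    """True if every process holding the pipe's write end has exited."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    return bool(ready) and os.read(read_fd, 1) == b""

def demo(target_whole_group):
    """Send SIGTERM to the leader only, or to its whole group."""
    proc, read_fd = spawn_group()
    pgid = proc.pid              # the leader's PID doubles as the group ID
    try:
        # Positive PID: one process. Negative PID: the whole process group.
        os.kill(-pgid if target_whole_group else pgid, signal.SIGTERM)
        proc.wait()
        return all_exited(read_fd)
    finally:
        try:
            os.kill(-pgid, signal.SIGKILL)  # clean up any survivors
        except ProcessLookupError:
            pass
        os.close(read_fd)

if __name__ == "__main__":
    print("kill(pid):  everything gone?", demo(False))
    print("kill(-pid): everything gone?", demo(True))
```

With the positive PID only the shell dies and the two `sleep` children keep running, even though they share its PGID; with the negative PID the whole group receives the signal.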


What if I just use userland kill(1) utility? Is it possible to kill all processes under a PGID using kill(1)?

Say the PGID I get for sh is 321. If I do

    kill [signal] 321
that will not kill all the processes having PGID 321?

If it would not kill them, then couldn't we modify kill(1) to be able to call kill() with a negative integer as you describe?

Sorry for the noob questions. I am still learning and making mistakes.


Isn't "frozen" in usual Unix terminology any SIGSTOPped process?


That's suspended to me.


It's nice to see traditional debugging made easy. This is stuff that you try to teach people as a sysadmin/ops guy and they never pay attention. Yay!


The page is unfortunately not loading, and there doesn't seem to be a Google cache for the page.


There was a slight interruption of service, but the blog has been restored now. Our apologies for the inconvenience.


The server is not responding right now and Google Cache is empty. Has anyone got a copy of this?


In related news, Hongli Lai has solved the halting problem.


You do realize that it's quite easy to solve the halting problem in most cases, right?

The halting problem is unsolvable because _very particular_ (and pathological) cases are unsolvable. If a process is in a loop between the same instructions at the same states, it's very easy to tell that it's not going to halt -- the only challenge is that you can't make this determination for all processes all the time.
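
To make the "same state twice" argument concrete, here is a toy sketch. It assumes a deterministic program whose entire state fits in one hashable value; `step`, the three example programs, and the step budget are all hypothetical. A real process has far too much state (memory, registers, pending I/O) for this to be practical as written.

```python
def run_until_halt_or_cycle(step, state, max_steps=10_000):
    """Classify a deterministic computation by watching for repeated states.

    `step(state)` returns the next state, or None when the program halts.
    States must be hashable and capture everything that affects the future.
    """
    seen = set()
    for _ in range(max_steps):
        if state in seen:
            return "loops forever"   # same state twice: same future forever
        seen.add(state)
        state = step(state)
        if state is None:
            return "halts"
    return "unknown"                 # budget ran out: no verdict either way

# Toy programs over a single integer register (hypothetical examples).
def countdown(n):       # halts once n reaches 0
    return None if n == 0 else n - 1

def mod_cycle(n):       # 0 -> 1 -> 2 -> 0 -> ... and never halts
    return (n + 1) % 3

def count_up(n):        # never halts, but never repeats a state either
    return n + 1

if __name__ == "__main__":
    for program in (countdown, mod_cycle, count_up):
        print(program.__name__, "->", run_until_halt_or_cycle(program, 5))
```

The first two cases are decided quickly. The third is the kind of case the check cannot settle: the state keeps changing, so no finite amount of watching proves it will never halt.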

Mathematically, note that the concept of an oracle for the halting problem is well-defined (and a useful concept).


I was mostly making a jibe at the notion that there is a single, correct, and reliable method for identifying stuck/hung processes.

There are in fact fairly reliable heuristics for noting when things are going pear-shaped. The edge cases get sticky though.



