Hacker News new | past | comments | ask | show | jobs | submit login

> This stuff was all well understood four decades ago. Much of it was forgotten outside the mainframe world, because threads and multiprocessors didn't make it to microprocessors for several more decades.

Here's an important distinction to make: this stuff was well understood in theory, but the practice is a bit different.

Semaphores are a neat theoretical concept but not a very good practical parallel programming paradigm. Semaphores are easy to reason about when writing proofs by induction that a parallel programming algorithm is working correctly, which is why they are still at the core of parallel programming education.

But when writing practical multithreaded programs, mutexes and condition variables are a lot more practical. Typically each mutex is coupled with one or more condition variables to wait/signal for conditions such as "queue not full" and "queue not empty". Incrementing and decrementing numeric counters is a very clumsy way to maintain a state of any kind. Implementing a semaphore requires grabbing a spinlock and doing this several times is unnecessary when you could just grab one mutex, check for the appropriate condition(s) and then wait/signal on the correct condition.

It is rather unfortunate that parallel programming is still primarily being taught in the theoretical manner (at least I was) using primarily semaphores, leaving too little emphasis on the practical implementation. It is important to understand the theory and know how to do the proofs but it is equally important to apply this into practice.

Every time I see someone "implement" a semaphore using a pthread mutex and condition variable, I cry a little. I've seen this several times in production code.

> It's very simple. That's the real use case for P and V. Linus' note indicates that in 1999 he didn't know this.

I'm pretty sure Linus was joking and he did understand what semaphores are used for. Every computer science curriculum has a course on parallel programming with bounded buffer producer consumer/problems, readers/writers problems and other "toy problems" solved using semaphores, followed by proof by induction that the solution is correct. And Linus did, eventually, finish a degree on computer science.




Semaphores are useful for rate limiting - for example say you have a connections pool with an upper bound, so naturally the number of threads that can acquire a connection and do things with it is limited by the size of that connections pool.

And you don't necessarily need a mutex or to actually put the thread to sleep. Semaphores are also relevant when speaking of asynchronous stuff (e.g. Futures), in which case you can easily do CAS operations on an atomic reference holding an immutable queue of promises. Slightly inefficient under high contention, but non-blocking and gets the job done.


While I do not disagree with you, I still do think that even these cases mutexes and conditions are more practical.

Take the "rate limiting" example you mention (also one of Linus' examples in the OP). You initialize a semaphore to `max_concurrent_connections` and call `semaphore_down()` when you enter the connection handling sequence and `semaphore_up()` when you're done. Now this works fine and is an idiomatic example of using semaphores.

However, in the real world, this kind of situation rarely happens in isolation. What happens if you need to terminate the application for whatever reason, and do so cleanly? If you're under contention, you might have a dozen threads waiting for the semaphore go up and your only option is to kill them. Or implement some logic for this case (using another semaphore) to make sure the application hasn't been terminated while we were waiting for the rate limiting semaphore.

You can implement this cleanly using mutexes and conditions, by creating a "killable semaphore" synchronization primitive. While this is similar to a semaphore as an idea, it's very difficult to implement it if semaphores are the only primitive you have. Additionally, you probably want some kind of timeout if the queue is full.

So in practice you need something like:

    while(true) {
        socket = accept();
        int status = rate_limiting_enter(my_rate_limiter);
        // NOTE: someone else may call rate_limiting_terminate()
        if(status == OK) {
            service(socket);
            rate_limiting_leave(my_rate_limiter);
        } else if(status == TIMEOUT) {
            send_busy(socket);
        } else if(status == KILLED) {
            send_terminate(socket);
            break;
        }

        close(socket); // this must be called or resources leak
    }
Now, while this is theoretically very similar to a semaphore, it has other real world priorities (like timeout and termination) which are very difficult to implement using Dijkstra -style semaphores with only P() and V() operations (and you definitely need more than one semaphore).

This has been the case in almost every practical multithreaded programming scenario I've had. The solution could be thought of using semaphores (and I frequently do) but in practice, there's always some real world conditions (timing, contention, errors) that must be met.

It is very trivial to implement semaphore-like synchronization primitives using mutexes and conditions but not vice versa.

Semaphores are very good for textbook examples and a mental model but not so much in practical software.


Termination in queued systems is moderately hard. You have to drain out the queues. In Go, you can close a channel at the write end and wait for the reader to reach EOF. But if the reader is stuck waiting for something, there's a problem. Especially if it's waiting to write another channel. If you close a channel written by another task, that task will panic when it writes to the closed channel. You can't close channels to force shutdown in Go unless a panic is acceptable.

This is a classic problem with bounded buffers, re-invented four decades later.


Using two semaphores to implement a bounded buffer has the advantage that writers and readers don't interfere with eachother until the queue is either empty or full. Of course, that might be the only good use of semaphores, and a direct implementation can be better then one using standard semaphores.

Semaphores can be implemented in a way that doesn't require a spinlock in the fast path. Still not as fast as you can make a mutex, but a lot better than the pseudocode given in many classes (which is almost always wrong, too.)


In practice, the semaphores available for many systems are considerably slower than using architecture provided atomic increment operations, which means implementing a bounded buffer with two counters and two mutexes ends up being faster.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: