Mutexes and Semaphores Demystified

Groxx · on Oct 24, 2010

>I hope this article helps you understand, use, and explain mutexes and semaphores as distinct tools.

Uh... it didn't. In fact, the very example they give of using a bathroom very much allows for both semaphores and mutexes.

  initial state:  mutex: released; semaphore: 1 (signalled);
  use bathroom:   mutex.lock;      semaphore.decrement;
  leave bathroom: mutex.release;   semaphore.increment;

In this way, a semaphore can be viewed as "consuming" the "key", and creating a new one when finished, while a mutex is descriptively grabbing and returning.

They go on to state that a 2-counting semaphore cannot accurately control 2 bathrooms - true, it takes two binary semaphores or two mutexes, or other data. But at this point we're still at the two being identical; all that's been shown is that the claim that a single n-semaphore cannot be used in this manner. And neither can a single mutex. So the claim is they're not identical, but the example shows them behaving identically (the pseudo-code doesn't align with the example, and is meaningless anyway, as you can (and frequently do) .wait on a semaphore until it is signalled, just like a mutex release).

A significantly better example would be to show a single n-semaphore in a producer/consumer setup vs mutexes, where as many as n units can wait for consumption. With an n-semaphore, you just increment when you .push, and decrement when you .pop; you'll never pop from an empty stack, and will wait until a .push + .increment before continuing, and will never over-fill your stack. Simple, safe, and efficient in both code and computing power. Optimal, even - all producers produce until all are unable, and all consumers consume until all are unable.

With mutexes, this is not as easily possible; you'll have to test multiple mutexes, probably associated with an array location, to find one to safely push into / pull from. This means multiple checks to push / pull, and complex (if possible, haven't attempted to actually do it) mutex setups to allow consumers to both act immediately on new data and to not get in each other's way, in a similar manner to an n-semaphore.

edit: a quick example of the mutex's problem in this example:

assume n producers and n consumers, with mutex setup which prevents a producer from overloading an entry, or a consumer from over-emptying (takes 2 mutexes). If producer 1 makes two quickly, the second unit will languish until consumer 1 finishes the first unit and can claim it - no other consumer can work on that slot without many more mutexes controlling access to others' slots, if it's even possible to construct something which does this without spin-waits (continually checking all mutexes), which are very wasteful. I'd imagine it is, but I don't care to work out how to do it, as semaphores make it easy.

forensic · on Oct 25, 2010

Is it fair to say a mutex is just a special case of a semaphore that only counts to 1?

panic · on Oct 25, 2010

A mutex is more than that. That's the entire point of the article. A mutex implementation needs to deal with issues like priority inversion, where higher-priority threads continually interrupt a lower-priority thread holding a naively-implemented mutex, starving any other thread (even a high-priority thread) that needs the resource it protects.

Groxx · on Oct 25, 2010

Which is identical to a binary semaphore. If the low priority thread which decremented the semaphore does not allow others to work while it itself is interrupted - ie, it does not increment when blocked - it has the same problem.

The difference, as I've seen it, is that mutexes more often have code to account for lock escalations and shared locks, and sometimes other potential problems, which are deemed unimportant for semaphores because they're used where such things are not needed. It's entirely possible to create a semaphore which does account for these - trivial, even, to make a binary semaphore have everything a mutex has. Similarly, a mutex does not need to have such capabilities to be considered a mutex.

All of which is simply meta-behaviors, not things which are part of their definitions. Debating about current implementations is useful only in terms of current implementations, not in defining something's inherent capabilities.

forensic · on Oct 25, 2010

Starvation is equally an issue with semaphores.. how does that make a mutex different?

panic · on Oct 25, 2010

First of all, it's not clear to me that priority inversion is an issue with semaphores as normally used. If a high-priority consumer thread is waiting on a low-priority producer thread, there may just be no data ready to be consumed. There's no reason to raise the priority of the producer thread if, for example, it's just waiting for some external input.

Second, how would you solve priority inversion for general semaphores? With mutexes, you know what thread is holding the mutex, and can bump its priority correspondingly. With semaphores, there's no guarantee that the same thread that incremented the semaphore will decrement it later.

Groxx · on Oct 25, 2010

"As normally used" is part of the problem. How something is normally used has nothing to do with what it is.

General semaphores? I'll easily grant that would be difficult. Not impossible, but likely wasteful and a PITA.

Binary semaphores, which is where the debate lies? Store a whoDecremented thread pointer, cleared on increment. Which is functionally identical to Dekker's algorithm[1] and Eisenberg and McGuire's algorithm[2], except they require storage for all possible waiting threads, instead of the single storage of a semaphore.

edit: links: [1]: http://en.wikipedia.org/wiki/Dekker%27s_algorithm [2]: http://en.wikipedia.org/wiki/Eisenberg_McGuire_algorithm

inscitekjeff · on Oct 25, 2010

There is a relevant question and answer set for this topic on Stack overflow at http://stackoverflow.com/questions/187761/recursive-lock-mut...