The more I play around with Erlang, the more it makes absolute sense. This is concurrency done right. And I say this having done concurrent programming in Java and C (both pthreads and LAM/MPI).
The share-nothing, message-passing paradigm beats the pants off Java concurrency. Thread pools and monitor-based locking are fine if only 10% of your codebase is going to be concurrent, but they get old fast.
MPI is pretty cool. LAM is a decent implementation. But as with any C API, there is a lot of low-level cruft involved in doing it right. pthreads is both cumbersome and fragile. Also, kernel-level threads, while they have scheduling benefits, tend to be more expensive to spawn/kill.
Erlang really abstracts away all this messiness and provides non-blocking send and selective receive with message buffers. If only it had better string literals. (Implementing infinite loops with infinite recursion also gives me chills. But then I guess you couldn't implement a while loop with single value variables.)
Breaking a program into lots of autonomous pieces that communicate via message passing is a great way to help isolate complexity. Even if it didn't have any benefit for performance on a multi-core system or cluster, it would probably still be worth doing to keep everything from turning into a big ball of mud.
Of course, that's Unix pipes all over again. Unix pipes and pattern matching from Prolog. :) A lot of Erlang stuff makes more sense if you've used Prolog, by the way. (Though Erlang is almost certainly a more practical language...)
Selective receive seemed unimportant, but I read the PDF presentation at the bottom to find out why he thought it was important. And I think he's right.
Basically it allows your process to simulate synchronous calls (by sending, then entering a state where you won't receive any other messages besides the one you're waiting for). And in certain cases, you _really_ want those synchronous calls (sometimes, it just doesn't make sense to handle any other events until the synchronous call returns).
He uses a hardware example to make it clear -- say you have to issue commands to hardware in a certain order, but with delays between them. In a non-blocking system, you have to write a complicated state machine to handle all the events and buffer the ones which are not "hardware is ready for next command". With selective receive, you just wait between hardware calls: send to the hardware, receive only the "hardware ready" message, send again. Everything else stays in the mailbox until you're ready for it.
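To make the hardware example concrete, here's a rough sketch in Python (not Erlang, and not taken from the presentation). The Mailbox class, drive_hardware, and the "hardware_ready" message name are all made up for illustration; a real Erlang process gets this behaviour from the receive construct itself.

```python
import threading
import time


class Mailbox:
    """Toy mailbox with non-blocking send and blocking selective receive."""

    def __init__(self):
        self._msgs = []
        self._cond = threading.Condition()

    def send(self, msg):
        # Non-blocking: append to the buffer and wake any waiting receiver.
        with self._cond:
            self._msgs.append(msg)
            self._cond.notify_all()

    def receive(self, match, timeout=None):
        # Block until a message satisfying match() is available.
        # Non-matching messages stay buffered in arrival order.
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            while True:
                for i, msg in enumerate(self._msgs):
                    if match(msg):
                        return self._msgs.pop(i)
                remaining = None
                if deadline is not None:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        raise TimeoutError("no matching message")
                self._cond.wait(remaining)

    def pending(self):
        # Snapshot of the messages that have not been received yet.
        with self._cond:
            return list(self._msgs)


def drive_hardware(mailbox, commands, issue):
    """Issue commands in order, waiting for 'hardware_ready' between them."""
    for cmd in commands:
        issue(cmd)
        # Only the ready signal is accepted here; other events are deferred.
        mailbox.receive(lambda m: m == "hardware_ready", timeout=5.0)
```

The point is that drive_hardware reads as straight-line code: no explicit state machine, and any unrelated events that arrive in the meantime are still sitting in the mailbox afterwards.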
I'm curious how fast message sends need to be to be considered "fast". IIRC, until recently Erlang used a separate heap and mailbox for each process, so each message send required a full copy of the message. Is that fast enough? If not, then Erlang itself didn't really have Erlang-style concurrency.
I think that Erlang now uses a shared heap for messages with message passing just being a pointer copy into the receiving process's mailbox. But concurrent shared-heap GC is much harder to get right than per-process heaps, so I'm asking mostly to simplify implementation of my own concurrent programming language. :-)
Nope, as of now each process has a separate copy, even when they are both on the same node. "The exception is large binary objects, which are handled specially by placing them outside the normal data areas and using reference counting to keep track of them. They are not copied when sent as messages between processes on the same node."
http://erlang.org/pipermail/erlang-questions/2006-September/...
The closest one I know of is Haskell, but where Haskell can be superior in some ways in a single OS process, it gets totally spanked by Erlang when it comes time to leave one OS process, including going out over a network to other machines. And since you really need that if you want high availability, Haskell's out of the running for the systems I use Erlang for.
(Erlang is also much easier for other programmers to read.)
Haskell is also missing OTP. However, IMHO the more fundamental problem is that Haskell doesn't have anything like Erlang's fundamental, simple message passing, including process monitoring. Fix that and an OTP library just as good as Erlang's (or better) will fall right out; fail to fix that and it'll never catch up on that front.
There's Termite (http://code.google.com/p/termite/), based on Gambit Scheme, which looks interesting. I haven't been able to get Gambit to build on OpenBSD/amd64, though.
There's also ConcurrentLua for Lua (http://concurrentlua.luaforge.net/), which looks like a deliberate attempt to port Erlang's concurrency model to Lua. I hadn't worked much with Erlang when I checked that out, but it has potential. IIRC the only thing missing from Ulf Wiger's checklist is selective receive, and adding a second receive function that takes a (message->bool) function probably wouldn't be difficult. It already has asynchronous message passing, distributed computing, very cheap processes, and process monitoring.
There's also ConcurrentLua (http://concurrentlua.luaforge.net/), which is a more deliberate attempt to port Erlang-style concurrency to Lua. It doesn't have pattern matching for selective receive, IIRC, and doesn't enforce lack of local state, but is otherwise mostly there. It implements lightweight processes via Lua coroutines, which are cheap co-operative mini-threads, the other concurrency model that (IMHO) makes sense.
I've played with it a bit and it seems well-designed, but at the time I hadn't done much serious with Erlang, and felt it would make more sense to learn the Erlang idioms (supervisor trees, etc.) from Erlang itself and come back to ConcurrentLua later. I'm very interested in it, though - Lua is one of my favorite languages.
It also seems to be the work of one grad student, who since graduated (and may have moved on). OTOH, it's a rather small project, since Lua (as with Scheme and Termite) already provides most of the necessary infrastructure in the core language (it's mostly a scheduling wrapper for coroutines and some distribution primitives), so forking it wouldn't be terribly difficult.
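For anyone who hasn't seen the "scheduling wrapper for coroutines" idea, here's a minimal sketch using Python generators in place of Lua coroutines. It is not ConcurrentLua's actual code; scheduler and worker are hypothetical names, and a real library would add mailboxes and distribution on top.

```python
from collections import deque


def scheduler(procs):
    """Round-robin over generator-based 'processes' until all have finished."""
    ready = deque(procs)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)          # run the process until it yields control
            ready.append(proc)  # still alive: go to the back of the queue
        except StopIteration:
            pass                # process finished; drop it


def worker(name, steps, log):
    """A cooperative process: does one unit of work, then yields."""
    for i in range(steps):
        log.append((name, i))
        yield  # explicit yield point; nothing preempts us in between
```

Since nothing is preempted, each process is only as well-behaved as its yield points, which is the usual trade-off with cooperative scheduling.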
I've checked out both concurrentlua and luaproc in the past and they seem quite similar (both use cooperative scheduling via coroutines). I initially liked luaproc more because its channel-based message passing seems to me to be inherently more flexible than message passing based on process IDs. luaproc leads to a particularly elegant implementation for the chameneos-redux benchmark, for example.
concurrentlua seems more maintained, though; it showed up on luaforge more than a year ago and its latest update was in May, which to me means that the creator is interested in continuing the project. I believe this is the paper that introduced concurrentlua:
Thanks! I was looking for the ConcurrentLua PDF - it isn't linked on the project site anymore.
IIRC, luaproc was put up on luaforge because people on the mailing list read the paper and started asking if the source was available.
There are also about a half dozen other concurrency packages for Lua, which at first seems rather odd. As a language, it has quite a few semi-official extensions, but unlike Scheme (with its standard and several implementations), it's because the language designers are very serious about keeping the core language small and portable for embedding. You can import the ~800k of source into your project tree and fork it as necessary, compile it down to a ~200k library, etc. In practice, this means that people contribute Lua extensions that help it work nicely with whatever funky C++ game engine they're using, etc., as well as more general libraries.
"It doesn't have pattern matching for selective receive, IIRC"
So http://metalua.luaforge.net/ has an implementation of pattern matching as one of its shipped examples. Do you know if ConcurrentLua could be used with pattern matching, or is it designed such that, as it stands, PM would be useless?
I don't think it does -- the receive function only takes a timeout[1]. If it did pattern matching, it'd probably take a table of potential patterns, or a table array of pattern->functions, or somesuch.
Having wondered about this before, I just checked the source. In message.lua, receive just takes an optional timeout. It'd be easy to have it take an optional function argument that tests each message in turn (since the mailbox is just a list of messages, modified in place) and returns true if that message should be used now. I'll test and submit a patch for it later. Not as flexible as pattern matching, but PM would be better off in the base language. As for integrating Metalua's PM with selective receive, I'm not sure.
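Roughly what I have in mind, sketched in Python rather than Lua so it's not tied to ConcurrentLua's internals. The receive signature and the accept argument here are my own assumptions, not the library's API, and the timeout handling is left out.

```python
def receive(mailbox, accept=None):
    """Remove and return the first message that accept() approves.

    mailbox is a plain list, modified in place. With accept=None this is an
    ordinary receive that takes the oldest message. Returns None if nothing
    currently matches (a real version would wait or time out instead).
    """
    for i, msg in enumerate(mailbox):
        if accept is None or accept(msg):
            return mailbox.pop(i)
    return None


def is_reply(msg):
    # Hypothetical message shape: dicts carrying a "tag" field.
    return msg.get("tag") == "reply"
```

Calling receive(mailbox, is_reply) skips over anything that isn't a reply and leaves it queued in the original order for a later plain receive(mailbox).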