Just like with Python, why would you even care about the GIL?
Writing single multithreaded apps with low-level locking hardcoded everywhere is now quite clearly NOT the right way to build software. If you don't use locks, i.e. only use lock-free data structures and immutable state, then you won't care about the GIL. And you can use multiple processes and interprocess communications in place of threading. On Linux, the difference in performance between threads and processes is very small. Most people who complain about the GIL have not even profiled multithreading versus multiprocessing. They are just bound and determined to reinvent the wheel in their own code base.
There is no reason why you can't leverage C++ (ZeroMQ) or Erlang (RabbitMQ) to do the hard bits and write the rest of your app in nice simple Ruby (or Python) scripts that are designed according to the Actor Model.
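To make "designed according to the Actor Model" concrete, here is a minimal sketch in Python. It is only an illustration: the `CounterActor` class and its message names are made up, and an in-process `queue.Queue` stands in for the mailbox that ZeroMQ or RabbitMQ would provide between real processes.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, a mailbox, one thread draining it."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only ever touched by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        # Messages are handled one at a time, so no lock is needed on _count.
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)

    def send(self, message, reply_to=None):
        self._mailbox.put((message, reply_to))

    def stop(self):
        self.send("stop")
        self._thread.join()


actor = CounterActor()
for _ in range(3):
    actor.send("increment")

reply = queue.Queue()
actor.send("get", reply_to=reply)
count = reply.get()  # the mailbox is FIFO, so all three increments ran first
print(count)
actor.stop()
```

The calling code never takes a lock itself; the only synchronization lives inside the queue.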
> Writing single multithreaded apps with low-level locking hardcoded everywhere is now quite clearly NOT the right way to build software.
It is a very opinionated view that writing multithreaded code with locking is not "the right way". Of course, using locks requires disciplined engineering, but there are applications where you might want threads in a single process rather than multiple processes, although that may be less common in the domains where Ruby is popular.
Multiple processes and IPC are not a whole lot easier than writing multithreaded code, especially if you have a nice concurrency framework with channels, etc. available.
> If you don't use locks, i.e. only use lock-free data structures and immutable state, then you won't care about the GIL.
Unfortunately, lock-free data structures and immutability will not get you around the GIL. The GIL guards the interpreter's internal data structures; it is held practically all the time while code is being interpreted and released only when blocking I/O is happening (at least that's the way it works in Python).
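A rough way to see this for yourself, assuming a standard CPython build with the GIL enabled: run the same number of threads on a blocking call and on pure-Python arithmetic, and compare wall time. The helper names are invented for this sketch, and `time.sleep` stands in for blocking I/O since it also releases the GIL.

```python
import threading
import time


def run_in_threads(task, n_threads):
    """Run task() in n_threads threads and return the elapsed wall time."""
    threads = [threading.Thread(target=task) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def blocking_task():
    # Stand-in for a blocking read/recv: the GIL is released while waiting.
    time.sleep(0.2)


def cpu_task():
    # Pure-Python bytecode: the running thread holds the GIL throughout.
    total = 0
    for i in range(200_000):
        total += i * i


blocking_elapsed = run_in_threads(blocking_task, 4)
cpu_elapsed = run_in_threads(cpu_task, 4)
print(f"4 blocking threads: {blocking_elapsed:.2f}s")
print(f"4 CPU-bound threads: {cpu_elapsed:.2f}s")
```

The four sleeps overlap, so the blocking case should take about as long as one sleep. On a GIL build the CPU-bound case should take about as long as running `cpu_task` four times back to back, however many cores the machine has.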
Not being able to write multithreaded code that actually runs in parallel is a weakness in CPython and CRuby, and finding ways to avoid locking the GIL would make them better.
> Most people who complain about the GIL have not even profiled multithreading versus multiprocessing.
This is a fair point, but the issue might be about memory usage, not speed. A unicorn setup might have two to eight worker processes to service HTTP requests. Even with copy-on-write-friendly garbage collection, the memory usage of each additional process is significant. On the other hand, a thread-based solution (using JRuby, for example) can maintain a threadpool with hundreds of worker threads because the cost of an additional thread is nearly negligible.
Why would you need hundreds of worker threads? If you're using an event-loop in each process (which you probably want if only to minimize context-switching overhead, and is how Unicorn does it), then you need only one process per physical core in the machine. Anything else will just sit in the runqueue and cause context switches.
There is definitely annoying memory overhead with multiprocess (vs. multithreaded) architectures, but it's on the order of 2x-8x, not 100x. And that's 2x-8x the code size of the application, not data size - you only need duplicate interpreter objects, anything at the app or framework level (like templates or data files) can be stored in read-only shared memory or just COW'd with no writes. (It's technically not even every interpreter object - a number of function objects are completely static data that will never have additional references made, and so COW means they'll be shared perpetually between processes.)
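As back-of-the-envelope arithmetic for that claim (the numbers are hypothetical, picked only to show the shape of it): shared copy-on-write pages are paid for once, and only the private interpreter objects scale with the worker count.

```python
def total_resident_mb(workers, shared_mb, private_mb):
    """Shared pages are counted once; private pages once per worker."""
    return shared_mb + workers * private_mb


# Hypothetical numbers, for illustration only.
SHARED_MB = 400   # templates, data files, static code objects (COW, never written)
PRIVATE_MB = 50   # interpreter objects each process ends up owning

one = total_resident_mb(1, SHARED_MB, PRIVATE_MB)
eight = total_resident_mb(8, SHARED_MB, PRIVATE_MB)
print(one, eight, round(eight / one, 2))  # prints: 450 800 1.78
```

With these made-up figures the private part grows 8x (50 MB to 400 MB), but the machine's total goes up by less than 2x.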
>> If you don't use locks, i.e. only use lock-free data structures and immutable state, then you won't care about the GIL.
Unless you're trying to do something that requires parallel processing. Crypto cracking was one example recently.
>> And you can use multiple processes and interprocess communications in place of threading.
This adds complexity.
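For a sense of the ceremony involved, here is a minimal sketch of handing work to a second process over a pipe. Assumptions: a POSIX system where the "fork" start method is available, and a hypothetical `square_worker` standing in for real work. A threaded version could simply call the function.

```python
import multiprocessing as mp


def square_worker(conn):
    """Child process: receive numbers over the pipe, send back their squares."""
    while True:
        item = conn.recv()
        if item is None:  # sentinel: the parent asked us to stop
            break
        conn.send(item * item)
    conn.close()


def squares_via_process(numbers):
    # Assumption: POSIX "fork" start method. Windows would need "spawn"
    # plus an `if __name__ == "__main__":` guard around the caller.
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=square_worker, args=(child_conn,))
    proc.start()
    child_conn.close()  # the parent keeps only its own end of the pipe

    results = []
    for n in numbers:
        parent_conn.send(n)                 # pickled and copied to the child
        results.append(parent_conn.recv())  # pickled and copied back

    parent_conn.send(None)
    proc.join()
    parent_conn.close()
    return results


print(squares_via_process([1, 2, 3]))  # prints: [1, 4, 9]
```

Start method, sentinel, serialization and shutdown order are all things you now own, and none of them exist in the single-process version.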
>> There is no reason why you can't leverage C++ (ZeroMQ) or Erlang (RabbitMQ) to do the hard bits and write the rest of your app in nice simple Ruby (or Python) scripts that are designed according to the Actor Model.
This also adds complexity.
Basically, it would be nice if threads in Python acted like they do in other languages, rather than just pretending to.
There are tiers for performance. A few hundred (or thousand) ops/sec -- a GIL isn't going to hurt too much. A few thousand -- well-designed IPC is fine. A few tens of thousands -- lock-free gets important. Hundreds of thousands (or millions) -- userspace/kernel transitions and interrupt servicing can dominate, and an application's interactions with the OS need to be very carefully managed...
It all depends on how much you want to get out of your hardware.
The reason you care is that if you remove the GIL, you can write a 1:1 or green-threaded M:N library (actor model like Erlang, or dataflow like Oz, or whatever other model floats your boat) once in Ruby, and then write all the application code you want using just Ruby.
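To illustrate what "green-threaded" means here, a toy sketch in Python (not Ruby, and not any real library's API): generators act as green threads and a round-robin loop is the scheduler. This is only the N:1 case, with every green thread multiplexed onto a single OS thread. An M:N library would run several such loops on separate OS threads, and that is where the GIL gets in the way.

```python
from collections import deque


def worker(name, steps, log):
    """A green thread: does one step of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative yield point


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished; drop it
        ready.append(task)  # still alive; back of the queue


log = []
run([worker("a", 2, log), worker("b", 3, log)])
print(log)  # prints: ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```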
Using scripts running in separate OS level processes -- no matter what infrastructure you use to connect them -- isn't necessarily the ideal solution.
Based on our experience running large telephony applications, I would say that the threading approach, using JRuby as it provides a stable GIL-free interpreter, is vastly superior both in resource handling and developer/sysadmin productivity, also known as "less headaches".
Our experience at Google (using first CPython and then Java) was that the single-threaded multi-process approach led to better developer productivity but worse sysadmin productivity. It made reasoning about the behavior of the code significantly easier and wasted less time tracking down race conditions, livelocks, and deadlocks, but it meant that SREs and infrastructure teams had to spend more time managing memory consumption and dealing with deployment and monitoring hassles.
So it's a trade-off where the downsides often get pushed off into another group. As a developer, I really miss the CPython solution, which was a lot simpler and seemed more robust. But then, I wasn't the one responsible for pushing out new code or monitoring. I do think there were various optimizations we could've made to our other tools that might've compensated for the need to run more server processes, and wish we'd tried that before jumping to "Let's use a GIL-free language."
We are a pretty small team, and aside from one person, most of us double as ops at some point. Not trying to generalize at all, but at our scale, it is far more productive to go with something GIL-less and as self-managing as possible, such as the BEAM VM languages.
Incidentally, Erlang/Elixir describe applications as a collection of "in process processes", essentially allowing you to reason much like you were doing single threaded processes but without the real system implications.
What I see as the most important thing here is that developers have a better pulse on resource needs of their own software, since we do a lot of load testing before pushing anything to production, and the ops team can take our baseline data and handle applications almost as a black box.
On the other hand, Docker essentially makes all my points null by itself, so there is room for all approaches.