I love how much thought and polish went into this, just like so many other Erlang features in recent years: counters, persistent_term, thread progress, ETS scaling, process groups...
There's been talk of a JIT for quite a few years, but they didn't want to merge one until they had something that's easy to maintain and doesn't introduce significant performance regressions. The end result looks really solid.
So this JIT supports everything but HiPE, will be included in OTP 24, is production ready(???), and dramatically improves performance across almost everything? Seems too good to be true. What has the bug-finding process been so far in this PR?
This is far from the first JIT effort for BEAM. I'm sure the authors have rolled many years of learnings and experience into this one. Still, you make a good point, it will need a lot of testing by the community before I'd trust it in production.
Probably a case of the low-hanging fruit. With most projects you start off with a bunch of improvements that have very few side effects, but you quickly start to encounter trade-offs: new improvements introduce more development overhead, and open-ended things are generally hard to make fast.
To make something faster may mean calcifying part of the architecture: the code handles every scenario the design defines now, but it can't handle other scenarios we might have wanted to consider later. Those are now harder to implement, or run counter to our assumptions. People don't like it when you take back things you previously gave them, which means either no new feature, or a huge opportunity cost for that feature (because it means reimplementing other features as well as adding a new one).
I don’t think this is true at all... I mean, the PR changes 350-odd files and adds a crazy JIT engine. It’s hardly low-hanging fruit... I don’t know enough about how this is implemented to say whether it will make future features harder, but they state in the PR that maintainability was very important to the design, so I’ll assume it won’t be a weight around their necks.
I may be missing some context, but can't it be done similarly to how we handle data from the outside world in TS?
A runtime type guard asserts that the shape of the data conforms to a particular type; from that point forward the compiler trusts the guard and assumes the message is of that type.
In a dynamically typed world, I doubt one could do much better.
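A minimal sketch of the kind of guard described above, in TypeScript. The `UserMsg` shape and `isUserMsg` name are invented for illustration; the point is that the shape is checked once at the boundary, and everything downstream relies on the static type:

```typescript
// Hypothetical message shape, purely illustrative.
interface UserMsg {
  id: number;
  name: string;
}

// Runtime type guard: validates the shape at the boundary.
// The `x is UserMsg` return type tells the compiler to narrow
// the type wherever the guard returns true.
function isUserMsg(x: unknown): x is UserMsg {
  return (
    typeof x === "object" &&
    x !== null &&
    typeof (x as { id?: unknown }).id === "number" &&
    typeof (x as { name?: unknown }).name === "string"
  );
}

const incoming: unknown = JSON.parse('{"id": 1, "name": "joe"}');
if (isUserMsg(incoming)) {
  // From here on the compiler treats `incoming` as UserMsg.
  console.log(incoming.name.toUpperCase()); // prints JOE
}
```

Erlang's pattern matching in `receive` plays a similar boundary role at runtime; the open question in the thread is how much a compiler could statically trust beyond that point.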
The thing is, messages are far more a part of your programming/architecture in Erlang than in other languages. You are basically writing 'nanoservices' all the time. I saw quite a lot of people mentioning this in discussions, including here, for instance [0].
Can someone break this down and explain what this means in practical runtime performance gains?
I know it's probably "it depends", but in Elixir let's say you were doing a lot of string parsing or mucking around with a bunch of Enum functions or doing a bit of math. Will this put Elixir's performance on par with at least Python and Ruby in those cases?
I know this feature is meant for Erlang but I'm assuming it applies back to Elixir too?
It does apply to Elixir. It will help performance all around, but it mostly does so at the per-instruction level by reducing the dispatch cost that the interpreter pays (as well as managing to specialize things a little better than what the fixed instruction tables can express).
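To illustrate the dispatch cost being talked about, here is a toy stack-machine interpreter in TypeScript. The opcodes are invented, and the real BEAM interpreter uses a more efficient threaded dispatch than a `switch`, but the idea is the same: every instruction pays a dispatch step before doing any useful work, and a JIT removes that step by emitting the work directly as machine code:

```typescript
// Invented toy opcodes, not BEAM instructions.
enum Op { Push, Add, Halt }

// Classic interpreter loop: the switch is the dispatch cost,
// paid once per instruction executed.
function interpret(code: number[]): number {
  const stack: number[] = [];
  let pc = 0;
  while (true) {
    switch (code[pc++]) {
      case Op.Push: stack.push(code[pc++]); break;
      case Op.Add:  stack.push(stack.pop()! + stack.pop()!); break;
      case Op.Halt: return stack.pop()!;
    }
  }
}

// Computes 1 + 2.
const result = interpret([Op.Push, 1, Op.Push, 2, Op.Add, Op.Halt]);
console.log(result); // prints 3
```

A JIT would instead compile that program to straight-line native code equivalent to `return 1 + 2;`, with no per-instruction branching at all.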
To get an idea of the instruction stream of the BEAM (not the same as .beam asm), you can use the erts_debug module:
    iex> :erts_debug.df(String)
This will dump a BEAM machine instruction stream to a file named Elixir.String.dis in your current working directory. You'll see things like:
It's been a C codebase so far. Introducing C++ makes this no longer the case. This has strong implications for both development and deployment. I'm perplexed since there exist battle-tested code generation engines written in C (e.g. dynasm). Correct me if I'm wrong but it doesn't look like asmjit is anything special in that regard.
I can't imagine Joe Armstrong being happy about this.
Dynasm depends on Lua for pre-processing into pure C. I guess you could write an Erlang version? There's a Ruby meta-assembler in Safari JavaScriptCore!
AsmJit was designed to integrate well with C code bases. It uses C-style error handling (no exceptions) and provides an easy-to-use API. I don't think it's a big deal to use it in a C project; other C projects use AsmJit without any issues.
Given that all major C compilers are now written in C++, this is yet more proof of where the world is going.
Using C++ doesn't mean one needs to use everything from it, and improving C while keeping its semantics will hardly lead to anything much different from what C++ already offers.
C++ generally compiles orders of magnitude slower, has far worse interop and platform support and makes it really easy to unwittingly add all sorts of horrendous bloat. I'd personally still rather write code in C++ than C, but I agree that Joe Armstrong would probably be horrified, and for good reasons.
Some background on the journey to BeamJIT: https://drive.google.com/file/d/1hHCs90kDX_wJ9AbLzGNu5bZKdIV...