Rhizome – A pedagogical example of a JIT for Ruby, implemented in Ruby

mike1o1 · on June 21, 2021

I really appreciate the recommended reading order mentioned in the repository. Many times when looking at a repository as a learning resource it can be pretty daunting to know where to start, so I'm glad to see that in the readme.

adenozine · on June 21, 2021

Out of curiosity and for possible discussion, do you have any hard and fast methods for approaching a medium-to-large unfamiliar codebase?

In the past, I've tried going through the repo's history and mapping the dependencies between different files over time, to better understand which classes or types are the most widespread. In dynamic languages, I really don't know how I'd start; I'd probably just see how it's invoked and go depth-first from there.

inopinatus · on June 21, 2021

To understand code, start with data.

For database-driven applications, that means, read the DB schema first. For everything else, look at their in-memory equivalents.

"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds (2006)

mike1o1 · on June 21, 2021

It depends on the type of app. If it's a rails app, I usually start with user.rb or whatever the equivalent is (account.rb or something) as those usually have most of the functionality. From there, I'll either start looking at the routes config and go from there, or look at some of the base controllers to get a sense of things (i.e. ApplicationController or maybe AuthenticatedController).

For non-rails web apps (and rails apps), I'll usually find a portion of the UI and just start tracking from the front-end to the back-end. Something like finding some text on the page, and trying to reverse back to where that particular piece of text was defined and what steps it took to get there (which view, helper, controller, etc.)

For non web-apps, I don't have any good techniques, unfortunately.

pizza · on June 23, 2021

- file size

- earliest committed files

- most recently committed files

- tests / examples folders

If it's a popular library, I also go to github and search for "import <library name>" and go to the code results, and look at a few examples of files from different projects that use that library to see how it can actually be used
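The "file size" heuristic from the list above is easy to script. This is only a rough sketch: the helper name `largest_files`, the default `**/*.rb` pattern and the limit of 10 are my own assumptions, not something the commenter specified.

```ruby
# Hypothetical helper: list the biggest files under a directory, on the
# assumption that large files often hold a lot of the core logic.
def largest_files(root, pattern: "**/*.rb", limit: 10)
  Dir.glob(File.join(root, pattern))
     .select { |path| File.file?(path) }       # skip directories
     .map { |path| [path, File.size(path)] }   # pair each path with its byte size
     .sort_by { |_path, size| -size }          # largest first
     .first(limit)
end

# Only print when run directly as a script.
if __FILE__ == $PROGRAM_NAME
  largest_files(".").each { |path, size| puts "#{size}\t#{path}" }
end
```

The same shape would work for the "earliest" and "most recent" heuristics if you sorted on commit dates from the repo history instead of on size.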

kgilpin · on June 22, 2021

I recently gave a talk at RailsConf that might interest you. I stepped through some ways to make a code base "auto-document" itself - including Swagger/OpenAPI, end-to-end code and data flows, and database schema. The talk is available as a 4-part blog series, starting here - https://dev.to/appland/we-need-a-better-way-to-communicate-a...

adenozine · on June 21, 2021

Thunderbong with an excellent link!

Chris Seaton is one of the most influential programmers out there for me. I'm so interested in just about everything he's touched. What a guy. Ruby will never die so long as people have ideas like his, though not all can follow through and create such cool things!

I'm curious to see where performant ruby goes in the midst of Crystal. I quite like Ruby for exploring, Pry is an unrivalled repl experience, but then Crystal is very fast and quite efficient for big things. I like the idea of types guiding me through a large codebase that I might not be familiar with.

We'll just have to keep hacking and see what happens!

vinceguidry · on June 21, 2021

It should be kept in mind that the Ruby ecosystem has quite a bit of depth to it and there are solutions that have been around for quite some time to make it useful in places you wouldn't ordinarily think to. The performance issue has many angles to it.

There are lots of different ways to extend Ruby with code in other languages such as C or Rust, there's DragonRuby if you wanna make games, you can run Ruby on the JVM, JRuby and nowadays TruffleRuby, there's even a slimmed-down Ruby suitable for use in embedded contexts, mruby, which is what I replaced all of my Crystal code with. (if you go this route, the best way I've found is to compile your own mruby with whatever you want in it, and put a #!/path/to/mruby shebang at the top of your scripts. Compile if you need even more perf, I found JITted mruby to be more than sufficient.)

Crystal isn't a bad language, but the only thing it shares with Ruby is a small subset of syntax. It's sorely lacking in maturity and in libraries and tools. It's unfortunate because Crystal is an idea worth exploring, but its proximity to Ruby means people will always position it against Ruby, and it will always fall short. And with types now in Ruby that's one less reason to pick Crystal over Ruby.

multiplegeorges · on June 22, 2021

> And with types now in Ruby that's one less reason to pick Crystal over Ruby.

The story around types in Ruby is very early. There are "competing" ways to do it and they are really not dev ergonomic.

I think Ruby has a loooong way to go with typing to take it off the list of things to use Crystal for.

vinceguidry · on June 22, 2021

It's not really. The earliest type system for Ruby as far as I can recall is sorbet, and it's been around since at least 2017.

chrisseaton · on June 22, 2021

Type systems for Ruby go back to at least 2007 https://rubybib.org/#madsen2007

colesantiago · on June 22, 2021

Wasn't Rust also lacking libraries as well?

Example: Last time I checked couldn't find a firebase library for Rust, so gave up on it.

Rust seems a bit immature in the library department no? But didn't stop it becoming popular. I bet Crystal will do the same.

vinceguidry · on June 22, 2021

Crystal is about a year older than Rust. If it was going to get popular it would have by now.

ksec · on June 22, 2021

>Crystal is about a year older than Rust. If it was going to get popular it would have by now.

You mean at least a year younger ?

Rust

>The language grew out of a personal project begun in 2006 by Mozilla employee Graydon Hoare,[18] who stated that the project was possibly named after the rust family of fungi.[29] Mozilla began sponsoring the project in 2009[18] and announced it in 2010.

Crystal

>Work on the language began in June 2011

vinceguidry · on June 22, 2021

I guess. It probably would have been better to say they're about the same age. I went by Rust's first pre-alpha release, maybe I should have used an earlier event.

pizza234 · on June 21, 2021

> I'm curious to see where performant ruby goes in the midst of Crystal

Crystal is a statically typed language, so the traditional performance considerations about static <> dynamic languages apply. AFAIK Ruby is also particularly hard to optimize, due to its very dynamic nature (pretty much every concept can be changed at runtime).
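To make "pretty much every concept can be changed at runtime" concrete, here is a small sketch that changes what `+` means on integers while the program is running and then puts it back. The method name `demonstrate_runtime_redefinition` and the alias `original_plus` are made up for the illustration. A compiler that had turned `a + b` into a plain machine add would have to notice the redefinition and throw that code away.

```ruby
# Illustration only: temporarily redefine Integer#+ and observe that
# already-defined code picks up the new behaviour.
def demonstrate_runtime_redefinition
  add = ->(a, b) { a + b }
  before = add.call(1, 2)

  # Reopen Integer at runtime and replace +, keeping the original under an alias.
  Integer.class_eval do
    alias_method :original_plus, :+
    define_method(:+) { |other| original_plus(other).original_plus(1) }
  end
  patched = add.call(1, 2)

  # Restore the built-in behaviour and remove the temporary alias.
  Integer.class_eval do
    alias_method :+, :original_plus
    remove_method :original_plus
  end
  restored = add.call(1, 2)

  [before, patched, restored]
end

p demonstrate_runtime_redefinition # => [3, 4, 3]
```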

> Crystal is very fast and quite efficient for big things

It can't currently be called "very fast", being such a young language; it will depend on how much manpower is put into it in the long term. Parallelism is not even stable yet, which is a considerable factor.

I personally think of it as a pleasant language to write small tools/scripts. If parallelism had been implemented from day 1, I would have actually used it.

michaelcampbell · on June 21, 2021

I don't see the age of a language as at all relevant to how many ops/s, or whatever other measure of speed you want to use, it has.

It's either faster than what you consider "very fast", or it is not; how long the language has been around is, IMO, a complete non-sequitur.

That it can be made, possibly, much faster than it is I guess can probably weigh in here, but that's not what they were talking about.

LAC-Tech · on June 21, 2021

Ruby 3 is like 10x faster than it was in 1.8.7 when I started using it.

That still doesn't make it a speed demon. But it does make it 'fast enough' for me.

cutler · on June 22, 2021

With all this negativity about Ruby's speed it's easily forgotten that Ruby is as fast as Python for most tasks with the exception of some numeric libraries which are really just C extensions. They are both interpreted languages and that hasn't stopped Python from dominating the world of programming.

burlesona · on June 21, 2021

I was disappointed that this isn't a complete working project, but I have to say the documentation is well-written and informative. This seems like a great learning project.

chrisseaton · on June 21, 2021

Yeah, sorry, it was designed to be deep but not broad, and of course it's unfinished. It's a bit of a shame.

The reading list is the starting point https://github.com/chrisseaton/rhizome#how-to-read-this-repo..., as is the code in lib of course. You can also run the experiments to generate programs before and after passes.

Someone who knows about basic things like bytecode might like to start reading at https://github.com/chrisseaton/rhizome/blob/main/doc/constru... and may find that then starts to be new information.

stormbrew · on June 21, 2021

I did something very similar to this (though I never got to JITing to actual machine code, it was on my roadmap, and there were differences in goals[1]) back in the 1.8.x days and the reality is that writing any kind of ruby interpreter is an astoundingly complex task. It is not a simple language and it has a lot of awkward corners.

I spent months getting it to the point where it could just properly run rubyspec, and then months more just making it pass a decent number of its tests.

I can't imagine this has gotten any easier since then, it would be a hell of a project to make anything like this a complete working project.

[1] it's kind of horrifying to me now but it's still up on my github at github.com/stormbrew/channel9 -- the actual goal was a multilanguage vm where you can implement languages in themselves. It was originally written in ruby and eventually the bytecode interpreter core was rewritten in C++. The OP project is much nicer, more directed, and better documented by far than what I wrote there though.

alberth · on June 21, 2021

I wonder what the folks at Shopify think of this given that they are doing a huge amount of performance work with Ruby/Truffle-ruby.

Off topic: If Ruby could achieve the performance that Crystal provides, I'd wager the adoption rate would be huge.

Malp · on June 21, 2021

This is created by Chris Seaton, who works on TruffleRuby at Shopify

alberth · on June 21, 2021

Lol. I couldn’t remember his name from his HN comments.

Thaxll · on June 21, 2021

It's just impossible, ruby / python will always be fast'ish never really fast because the foundation was not built for it.

nirvdrum · on June 21, 2021

I don't believe it's impossible, although it's certainly a large undertaking. TruffleRuby already optimizes some "slow" features of Ruby quite well. E.g., it's able to inline blocks and JIT compile metaprogramming features. I haven't really kept up with all that Crystal is doing these days, but if you can optimize the hard parts of Ruby, you eventually just get into the traditional trade-offs between AOT and JIT compilation.

(Full disclosure: I work on TruffleRuby, in case that matters.)

jordanthoms · on June 21, 2021

JS used to be thought of as a slow language, then better VMs came along and now it's fast. I think it's much more a function of how many resources have been expended on developing a fast VM than the fundamentals of the language, though it will always be much harder to make a fast VM for Ruby than something like Lua.

vorticalbox · on June 21, 2021

What you lose in execution time you generally gain in development.

mioasndo · on June 21, 2021

> ruby

> doubt

goldenkey · on June 21, 2021

> "What you lose in apples you gain in oranges"

I didn't know different units could be compared as if they had an equivalence ratio...

michaelcampbell · on June 21, 2021

This feels like it is said in bad faith because I suspect you know exactly what is meant.

If you really want to be pedantic, 1_000_000 user operations are to be done. If you write something that does 1_000_000 user operations/second, and it takes 6 mos to write, it'll be done in 6 mos and 1 sec.

If it only does 500 ops/sec but takes 1 day to write, it'll be done in about 1 day and 33 minutes (1_000_000 ops at 500 ops/sec is 2,000 seconds).

Which one is faster?
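For anyone who wants to check the arithmetic, a quick sketch. The helper `total_days` is made up, and I'm assuming "6 mos" means roughly 182.5 days. Running 1_000_000 operations at 500 ops/sec takes 2,000 seconds, about 33 minutes, so the slow version finishes a little over one day in.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Total elapsed days = time to write the program + time to run it once.
def total_days(ops:, ops_per_second:, days_to_write:)
  run_days = ops.fdiv(ops_per_second) / SECONDS_PER_DAY
  days_to_write + run_days
end

# Assumption: "6 mos" is taken as 182.5 days.
fast_language = total_days(ops: 1_000_000, ops_per_second: 1_000_000, days_to_write: 182.5)
slow_language = total_days(ops: 1_000_000, ops_per_second: 500, days_to_write: 1)

puts format("fast runtime, slow to write: %.3f days", fast_language)
puts format("slow runtime, fast to write: %.3f days", slow_language)
```

The gap narrows as the job gets rerun or the operation count grows, which is the part the thread goes on to argue about.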

sitkack · on June 22, 2021

Whole system optimization is the key. One can't optimize in isolation. Will it run once? What are the inputs? How much memory will it take? How important is the result?

Now we are in the regime where parallelism and memory bandwidth efficiency are more important than constant factors of speed increase. 1k, 10k or 100k cores will be commonplace; even with slow interpreted code, the code that scales to more cores faster will win, even against C++. We already have more cores than we know what to do with.

mioasndo · on June 22, 2021

You're ignoring the cost of running the code, latency, and the many overheads and potholes of horizontal scaling. There's a reason large codebases are being rewritten from python, ruby, php to java, go, rust.

sitkack · on June 23, 2021

Those large codebases are being rewritten in languages that support large dev teams and the tooling that static type systems can support.

Horizontal scaling isn't the reason and nor is it the bugaboo you make it out to be. Vertical scaling might get you a factor of 100 on modern hardware over dynamic languages, vertical with threading might get you to 1000x, massive parallelism afforded by the coming exponential jump in core counts is going to get folks to 10-50k increases in throughput.

mioasndo · on June 23, 2021

> Those large codebases are being rewritten in languages that support large dev teams and the tooling that static type systems can support.

Of course that's one of the reasons, and that goes directly against the notion that python and ruby are somehow cheaper or more efficient to develop in.

> Horizontal scaling isn't the reason and nor is it the bugaboo you make it out to be.

The point is horizontal and vertical scaling don't automatically resolve the drawbacks of slow code. Using 10x the resources to make up for slow code is generally not a solution if performance actually matters.

> Vertical scaling might get you a factor of 100 on modern hardware over dynamic languages, vertical with threading might get you to 1000x, massive parallelism afforded by the coming exponential jump in core counts is going to get folks to 10-50k increases in throughput.

Not sure what you're trying to argue here. Yes 'dynamic' languages are slower, yes you can vertically and horizontally scale both. I don't see how that's relevant for a comparison of languages.

vorticalbox · on June 21, 2021

languages like python usually have shorter time to market at the cost of it running slower.

So sure they can't be directly compared but that doesn't mean that one cannot affect the other.

mioasndo · on June 22, 2021

> languages like python usually have shorter time to market at the cost of it running slower.

Let's see the data on that. Then let's see the cost of maintenance vs Java.

rattray · on June 21, 2021

Related project for those interested in alternative/experimental ruby implementations: Artichoke, a ruby for Wasm built with Rust. https://www.artichokeruby.org/

dleslie · on June 21, 2021

The code is quite legible; I recommend having a peek.

jonnyone · on June 21, 2021

Wonder if the name is a Deleuze/Guattari nod.

hestefisk · on June 21, 2021

A thousand plateaus of Ruby compilers? On compilers and schizophrenia?

chrisseaton · on June 21, 2021

Sorry, it's just a reference to the ruby-red control flow edge flowing through the intermediate representation graph nodes, like a root. The same line in the logo.

And it starts with R, so the idea was I could write a new command-line option -rhizome to enable it, but that would actually run a library hizome.rb, with -r being the standard option to load libraries in Ruby, so letting me add a new option to an unmodified Ruby.
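A small sketch of the `-r` trick described above, for anyone who wants to see it work. The contents of `hizome.rb` here are a stand-in (it just sets a global), and the helper name `run_with_rhizome_flag` is made up; the real library would presumably switch the JIT on instead.

```ruby
require "tmpdir"
require "open3"
require "rbconfig"

# Writes a stand-in hizome.rb to a temp directory, then starts an unmodified
# Ruby with -rhizome. Ruby reads that as -r (require) plus the name "hizome".
def run_with_rhizome_flag
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "hizome.rb"), "$rhizome_enabled = true\n")
    out, status = Open3.capture2(
      RbConfig.ruby, "-I", dir, "-rhizome", "-e", "puts $rhizome_enabled.inspect"
    )
    [out.strip, status.success?]
  end
end

p run_with_rhizome_flag # => ["true", true]
```

The `-I` flag is only there so the child process can find the temporary file; with `hizome.rb` already on the load path, `ruby -rhizome` alone would be enough.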