I really appreciate the recommended reading order mentioned in the repository. When you approach a repository as a learning resource, it can be pretty daunting to know where to start, so I'm glad to see that in the readme.
Out of curiosity and for possible discussion, do you have any hard and fast methods for approaching a medium-to-large unfamiliar codebase?
In the past, I've tried looking at the history of the repo and making maps of the dependencies between different files over time, to better understand which classes or types are the most widespread. In dynamic languages I really don't know where I'd start; I'd probably just see how the code is invoked and work depth-first from there.
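If it helps, here's roughly what I mean by mapping history, as a minimal sketch (the commit-size cutoff and the top-20 limit are arbitrary choices of mine):

    # rough co-change map: count how often pairs of files appear in the
    # same commit, as a crude proxy for coupling between them
    # (run from the root of a git checkout)
    require "open3"

    log, _status = Open3.capture2("git", "log", "--name-only", "--pretty=format:---")

    pairs = Hash.new(0)
    log.split("---").each do |commit|
      files = commit.split("\n").reject(&:empty?)
      next if files.size > 50 # skip huge merge/vendoring commits
      files.combination(2) { |pair| pairs[pair.sort] += 1 }
    end

    # the most frequently co-changed pairs are usually the most coupled
    pairs.sort_by { |_, n| -n }.first(20).each do |(a, b), n|
      puts format("%4d  %s <-> %s", n, a, b)
    end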
For database-driven applications, that means: read the DB schema first. For everything else, look at the in-memory equivalents.
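For instance, with a SQLite-backed app the whole schema is one query away (this assumes the sqlite3 gem; the app.db filename is just a placeholder):

    # dump every table's CREATE statement straight out of SQLite;
    # reading these first tells you what the app is actually about
    require "sqlite3"

    db = SQLite3::Database.new("app.db")
    rows = db.execute("SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name")
    rows.each { |(sql)| puts sql, "" }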
"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)
"Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds (2006)
It depends on the type of app. If it's a Rails app, I usually start with user.rb or whatever the equivalent is (account.rb or something), as those usually have most of the functionality. From there, I'll either start looking at the routes config and go from there, or at some of the base controllers to get a sense of things (e.g. ApplicationController or maybe AuthenticatedController).
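For anyone who hasn't tried this: config/routes.rb reads like a table of contents for the app. A made-up example of the kind of thing you'd be skimming:

    # config/routes.rb -- each entry maps URLs to a controller#action,
    # so the file doubles as a feature index before you read any code
    Rails.application.routes.draw do
      root "dashboard#show"
      resources :users                                    # -> UsersController
      resource :session, only: [:new, :create, :destroy]  # -> SessionsController
    end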
For non-Rails web apps (and Rails apps too), I'll usually find a portion of the UI and just start tracing from the front end to the back end: something like finding some text on the page and working backwards to where that particular piece of text was defined and what steps it took to get there (which view, helper, controller, etc.).
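The first step of that trace can be as dumb as a recursive search. A sketch (the string and the glob pattern are placeholders):

    # naive front-to-back trace: find every file that mentions a piece
    # of UI text, then follow the view -> helper -> controller trail
    needle = "Forgot your password?"
    Dir.glob("app/**/*.{erb,rb,yml}") do |path|
      File.foreach(path).with_index(1) do |line, n|
        puts "#{path}:#{n}: #{line.strip}" if line.include?(needle)
      end
    end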
For non-web apps, I don't have any good techniques, unfortunately.
If it's a popular library, I'll also go to GitHub, search for "import <library name>", and look at a few files in the code results from different projects that use the library, to see how it's actually used.
I recently gave a talk at RailsConf that might interest you. I stepped through some ways to make a code base "auto-document" itself - including Swagger/OpenAPI, end-to-end code and data flows, and database schema. The talk is available as a 4-part blog series, starting here - https://dev.to/appland/we-need-a-better-way-to-communicate-a...
Chris Seaton is one of the most influential programmers out there for me. I'm so interested in just about everything he's touched. What a guy. Ruby will never die so long as people have ideas like his, though not all can follow through and create such cool things!
I'm curious to see where performant ruby goes in the midst of Crystal. I quite like Ruby for exploring; Pry is an unrivalled REPL experience. But Crystal is very fast and quite efficient for big things, and I like the idea of types guiding me through a big codebase that I might not be familiar with.
We'll just have to keep hacking and see what happens!
It should be kept in mind that the Ruby ecosystem has quite a bit of depth to it, and there are solutions that have been around for quite some time that make it useful in places you wouldn't ordinarily think to use it. The performance issue has many angles to it.
There are lots of different ways to extend Ruby with code in other languages such as C or Rust. There's DragonRuby if you want to make games. You can run Ruby on the JVM with JRuby and, nowadays, TruffleRuby. There's even a slimmed-down Ruby suitable for use in embedded contexts, mruby, which is what I replaced all of my Crystal code with. (If you go this route, the best way I've found is to compile your own mruby with whatever you want in it and put a #!/path/to/mruby shebang at the top of your scripts. Compile your scripts if you need even more perf; I found JITted mruby to be more than sufficient.)
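Concretely, the shebang trick looks like this (the interpreter path is a placeholder; mark the script executable with chmod +x first):

    #!/path/to/mruby
    # runs under your custom mruby build when executed directly;
    # RUBY_ENGINE reports "mruby" here rather than "ruby"
    puts "hello from #{RUBY_ENGINE}"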
Crystal isn't a bad language, but the only thing it shares with Ruby is a small subset of syntax. It's sorely lacking in maturity, libraries, and tooling. It's unfortunate, because Crystal is an idea worth exploring, but its proximity to Ruby means people will always position it against Ruby, and it will always fall short there. And with types now in Ruby, that's one less reason to pick Crystal over Ruby.
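"Types now in Ruby" presumably refers to RBS, the signature language bundled since Ruby 3. A made-up example of what it looks like: signatures live in a .rbs file checked by tools like Steep or TypeProf, while the .rb file stays ordinary untyped Ruby.

    # user.rbs -- type signatures for a hypothetical user.rb
    class User
      attr_reader name: String

      def initialize: (name: String) -> void
      def greeting: () -> String
    end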
> Crystal is about a year older than Rust. If it was going to get popular, it would have by now.
You mean at least a year younger?
Rust:
> The language grew out of a personal project begun in 2006 by Mozilla employee Graydon Hoare,[18] who stated that the project was possibly named after the rust family of fungi.[29] Mozilla began sponsoring the project in 2009[18] and announced it in 2010.
I guess. It probably would have been better to say they're about the same age. I went by Rust's first pre-alpha release; maybe I should have used an earlier event.
> I'm curious to see where performant ruby goes in the midst of Crystal
Crystal is a statically typed language, so the traditional performance considerations about static vs. dynamic languages apply. AFAIK Ruby is also particularly hard to optimize, due to its very dynamic nature (pretty much every concept can be changed at runtime).
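To make "everything can change at runtime" concrete: even core operators are just methods, so a compiler can't assume 1 + 2 is machine addition without guarding against redefinition. For example:

    # monkey-patching a core operator at runtime; any JIT for Ruby
    # has to deoptimize gracefully when something like this happens
    class Integer
      def +(other)
        "surprise"
      end
    end

    puts 1 + 2  # => surprise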
> Crystal is very fast and quite efficient for big things
It can't currently be said that such a young language is "very fast"; that will depend on how much manpower is put into it over the long term. Parallelism isn't even stable yet, which is a considerable factor.
I personally think of it as a pleasant language for writing small tools/scripts. If parallelism had been implemented from day one, I would actually have used it.
With all this negativity about Ruby's speed, it's easily forgotten that Ruby is as fast as Python for most tasks, with the exception of some numeric libraries that are really just C extensions. They are both interpreted languages, and that hasn't stopped Python from dominating the world of programming.
I was disappointed that this isn't a complete working project, but I have to say the documentation is well-written and informative. This seems like a great learning project.
I did something very similar to this (though I never got to JITting to actual machine code, which was on my roadmap, and there were differences in goals[1]) back in the 1.8.x days, and the reality is that writing any kind of Ruby interpreter is an astoundingly complex task. It is not a simple language, and it has a lot of awkward corners.
I spent months getting it to the point where it could just properly run rubyspec, and then months more just making it pass a decent number of its tests.
I can't imagine this has gotten any easier since then; it would be a hell of an undertaking to make anything like this into a complete working project.
[1] it's kind of horrifying to me now, but it's still up on my GitHub at github.com/stormbrew/channel9 -- the actual goal was a multilanguage VM where you could implement languages in themselves. It was originally written in Ruby, and eventually the bytecode interpreter core was rewritten in C++. The OP's project is much nicer, more directed, and far better documented than what I wrote there, though.
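To give a flavor of those awkward corners (my own example, nothing specific to channel9 or the OP): a successful match against a regexp literal with named groups silently injects new local variables into the enclosing scope, which the parser has to know about before the code ever runs.

    # `word` does not exist before this line; the =~ against a
    # regexp *literal* with a named capture creates it as a local
    if /(?<word>\w+)/ =~ "hello world"
      puts word  # => hello
    end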
I don't believe it's impossible, although it's certainly a large undertaking. TruffleRuby already optimizes some "slow" features of Ruby quite well. E.g., it's able to inline blocks and JIT compile metaprogramming features. I haven't really kept up with all that Crystal is doing these days, but if you can optimize the hard parts of Ruby, you eventually just get into the traditional trade-offs between AOT and JIT compilation.
(Full disclosure: I work on TruffleRuby, in case that matters.)
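For a sense of what "JIT compiling metaprogramming" means in practice (a toy illustration, not a TruffleRuby benchmark): every #send below pays for a method lookup on a conventional interpreter, whereas a specializing JIT can often inline it down to a plain attribute read.

    class Point
      attr_reader :x, :y
      def initialize(x, y)
        @x = x
        @y = y
      end
    end

    # dynamic dispatch in a hot loop: the call target is only known
    # at run time, which is exactly what inline caches and a
    # specializing JIT are good at recovering
    p = Point.new(1, 2)
    sum = 0
    1_000_000.times { sum += p.send(:x) + p.send(:y) }
    puts sum  # => 3000000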
JS used to be thought of as a slow language; then better VMs came along, and now it's fast. I think speed is much more a function of how many resources have been expended on developing a fast VM than of the fundamentals of the language, though it will always be much harder to make a fast VM for Ruby than for something like Lua.
This feels like it is said in bad faith because I suspect you know exactly what is meant.
If you really want to be pedantic: 1_000_000 user operations are to be done. If you write something that does 1_000_000 user operations/second, and it takes 6 months to write, it'll be done in 6 months and 1 second.
If it only does 5 ops/sec but takes 1 day to write, it'll be done in ~3.3 days (1_000_000 / 5 = 200,000 seconds, or about 2.3 days of runtime, plus the day of writing).
Whole system optimization is the key. One can't optimize in isolation. Will it run once? What are the inputs? How much memory will it take? How important is the result?
Now we are in a regime where parallelism and memory bandwidth efficiency are more important than constant-factor speed increases. 1k, 10k, or 100k cores will be commonplace; even with slow interpreted code, the code that scales to more cores faster will win, even against C++. We already have more cores than we know what to do with.
You're ignoring the cost of running the code, latency, and the many overheads and potholes of horizontal scaling. There's a reason large codebases are being rewritten from python, ruby, php to java, go, rust.
Those large codebases are being rewritten in languages that support large dev teams and the tooling that static type systems can support.
Horizontal scaling isn't the reason, nor is it the bugaboo you make it out to be. Vertical scaling might get you a factor of 100 on modern hardware over dynamic languages, vertical with threading might get you to 1000x, massive parallelism afforded by the coming exponential jump in core counts is going to get folks to 10-50k increases in throughput.
> Those large codebases are being rewritten in languages that support large dev teams and the tooling that static type systems can support.
Of course that's one of the reasons, and that goes directly against the notion that python and ruby are somehow cheaper or more efficient to develop in.
> Horizontal scaling isn't the reason, nor is it the bugaboo you make it out to be.
The point is that horizontal and vertical scaling don't automatically resolve the drawbacks of slow code. Using 10x the resources to make up for slow code is generally not a solution if performance actually matters.
> Vertical scaling might get you a factor of 100 on modern hardware over dynamic languages, vertical with threading might get you to 1000x, massive parallelism afforded by the coming exponential jump in core counts is going to get folks to 10-50k increases in throughput.
Not sure what you're trying to argue here. Yes, 'dynamic' languages are slower; yes, you can vertically and horizontally scale both. I don't see how that's relevant to a comparison of languages.
Related project for those interested in alternative/experimental ruby implementations: Artichoke, a ruby for Wasm built with Rust. https://www.artichokeruby.org/
Sorry, it's just a reference to the ruby-red control-flow edge flowing through the intermediate representation's graph nodes, like a root (the same line as in the logo).
And it starts with R, so the idea was that I could add a new command-line option, -rhizome, to enable it. That would actually run a library, hizome.rb, since -r is the standard option to load libraries in Ruby, letting me add a new option to an unmodified Ruby.
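In other words (the file contents here are made up, but the flag behavior is standard Ruby): `ruby -rhizome app.rb` parses as `-r hizome`, so simply having a hizome.rb on the load path turns the "option" on.

    # hizome.rb -- loaded by `ruby -rhizome app.rb`, exactly as if
    # app.rb began with `require "hizome"`; whatever runs at load
    # time is what the fake -rhizome "option" enables
    puts "rhizome loaded"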