General notes:
- Semicolons inconsistent usage makes Rust harder to process. Is that semicolon extra? Is it not? Will that be a compilation error? The single keystrokes saved will be paid for in cognitive load. Programming is already a demanding task; this isn't helping.
- Chaining multiple statements on a single line is bad for various reasons.
Dependencies / headers / modules:
- C++ is better precisely because it's more grained on the initialization side. The paragraph starting with 'Rust wins, here', is just a general turnoff to read the rest of the article.
When I read the title 'Rust vs. C++: Fine-grained performance' I was honestly expecting either a table with performance times across various algorithms, or a savvy optimization guide.
I wasn't expecting a LoC duke-it-out.
Input file processing
- Both are conceptually boilerplate code. There is more for Rust apparently.
- There is an interesting &⊛env:: in there that makes the language look like it needs a bit more work. Three operators just to do something with the first object?
- Are they intentionally trying to copy C++ with the :: operator but removing visually arbitrary semicolons? Just replace the :: with something else.
Data structure and input setup
- Remove the 'let' from the language, and it does look cleaner than the C++.
Input state machine
- Why isn't "(0..7).rev()" "(7..0)"
- On the bottom part, lambda-like usage will make it less comprehensible (and probably undebuggable the way it's written)
General lack of performance numbers
- "I found that iterating over an array with (e.g.) “array.iter()” was much faster than with “&array”, although it should be the same..."
"Curiously, changing scores to an array of 16-bit values slowed down (earlier versions of) the C++ program by quite a large amount – almost 10% in some tests – as the compiler yields to temptation and forces scores into an XMM register. The Rust program was also affected, but less so."
I didn't write this blog post, though I did help the author optimize their Rust code when they dropped by #rust, so I'll respond to the generic Rust stuff rather than those details.
> Semicolons inconsistent usage
You've asserted this, but not demonstrated it: semicolon usage is consistent in Rust. Rust is not C++, so it doesn't necessarily follow the same rules, but it is consistent, even though it may be different.
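A minimal sketch of the rule in question, using made-up function names (nothing here is from the blog post): a semicolon turns an expression into a statement, and a block evaluates to its final expression when that expression has no trailing semicolon.

```rust
// Toy functions for illustration only; none of these names come from the post.

// No semicolon after the last expression: it becomes the function's value.
fn double(x: i32) -> i32 {
    x * 2
}

// The same rule applies to `if`/`else` blocks used as expressions.
fn sign(x: i32) -> &'static str {
    if x < 0 { "negative" } else { "non-negative" }
}

// With a trailing semicolon the value is discarded and the block evaluates to `()`.
fn discard(x: i32) {
    double(x);
}

fn main() {
    println!("{} {}", double(21), sign(-3));
    discard(1);
}
```

Adding a semicolon after `x * 2` in `double` would be a compile error (the body would evaluate to `()` instead of `i32`), so the "is that semicolon extra?" question is answered by the compiler rather than left to the reader.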
> C++ is better precisely because it's more grained on the initialization side.
Can you elaborate on this? I'm not exactly sure what you're saying.
> There is an interesting &⊛env:: in there that makes the language
> look like it needs a bit more work.
See above: this kind of thing is about how Rust views coercion and heap allocation: explicit, not implicit.
> Just replace the :: with something else.
We tried that. We still ended up with :: as a scope operator.
> Remove the 'let' from the language, and it does look cleaner than the C++.
`let` has a few advantages, in that it allows you to take advantage of patterns, and makes initialization explicit.
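As a rough illustration of both points, with hypothetical types and names that are not from the post: `let` takes a pattern on its left-hand side, and it marks exactly where a binding is introduced, even when the value is assigned later.

```rust
// Hypothetical helpers for illustration only.

fn split_pair(pair: (u32, u32)) -> u32 {
    // `let` accepts a pattern, so a tuple is destructured in one step.
    let (high, low) = pair;
    high * 10 + low
}

struct Point {
    x: i32,
    y: i32,
}

fn manhattan(p: Point) -> i32 {
    // Struct patterns work the same way.
    let Point { x, y } = p;
    x.abs() + y.abs()
}

fn late_init(flag: bool) -> i32 {
    // Declaration and initialization may be separate; the compiler rejects
    // any read of `value` unless every path has assigned it first.
    let value;
    if flag {
        value = 1;
    } else {
        value = 2;
    }
    value
}

fn main() {
    println!("{}", split_pair((4, 2)));
    println!("{}", manhattan(Point { x: -3, y: 4 }));
    println!("{}", late_init(true));
}
```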
> Why isn't "(0..7).rev()" "(7..0)"
Because ranges always iterate forward, from start to end. rev() reverses the direction.
I get the feeling that Rust design is concerned more about write-ability of the code than readability. Code has to be maintained, read by random people, and comprehended. It will grow to hundreds of megabytes on large projects, and it will take time to compile.
With the above in mind:
> You've asserted this, but not demonstrated it: semicolon usage is consistent in Rust. Rust is not C++, so it doesn't necessarily follow the same rules, but it is consistent, even though it may be different.
Writing code is consistent and unaffected, code comprehension (debugging, integration efforts, etc) by third parties will be hampered.
>Can you elaborate on this? I'm not exactly sure what you're saying.
Not importing the entire library will result in faster compilation. Unless rust has a concept similar to precompiled headers, etc. I apologize at this point. I'm not sure what Rust does in that respect, but my initial thoughts were that as the project grows, the parser will slow on includes. The internal symbol resolver will slow on lookups because too many things are included.
All of the above is only if you are trying to match what C++ does for large teams. I guess it's up to the developers of the language what direction they want to take it in. More agile or more enterprise oriented.
(From personal experience: a long time ago I was on a team where the codebase took 12 hours to compile on a powerful cluster. Any gains, even of a few hours, changed schedules drastically.)
>See above: this kind of thing is about how Rust views coercion and heap allocation: explicit, not implicit.
I understand your viewpoint, however three operators on a single object still seems conceptually excessive to me.
>Because ranges always iterate forward, from start to end. rev() reverses the direction.
Since both are supported, maybe the parser can be made smarter?
> I'm not sure what you're referring to here.
Well this piece of code:
    let scores = words.iter()
        .filter(|&word| word & !seven == 0)
        .fold([counts[count];7], |mut scores, &word| {
            for place in 0..7
                { scores[place] += (word >> bits[place]) & 1; }
            scores
        });
Is not easily read or debugged ( on which line do you set the breakpoint, etc).
Outside of minor things like this, I like what Rust is trying to do. I think it's a great effort and I do look forward to where it will lead.
> Not importing the entire library will result in faster compilation. Unless rust has a concept similar to precompiled headers, etc. I apologize at this point. I'm not sure what Rust does in that respect, but my initial thoughts were that as the project grows, the parser will slow on includes. The internal symbol resolver will slow on lookups because too many things are included.
The compilation time problems associated with headers are problems with headers: avoiding headers makes it easy to avoid the problems. The crate system of Rust is somewhat similar to precompiled headers, and C++ is in fact in the process of putting something similar (modules) into the standard.
Interestingly, this makes the Rust compiler easier to work with: the basic things of lexing, parsing, name resolution etc. are not at all bottlenecks (any particular file is only parsed once, there's not megabytes and megabytes of headers to parse for every invocation). This means these parts don't have to be micro-optimised, and hence don't get hit by an increase in complexity due to those optimisations.
Concerns about internal symbol resolution aren't a problem: hashmaps give (expected) O(1) access no matter how many elements they hold, and even using a tree has O(log n) access (i.e. going from 1_000 symbols to 1_000_000 only doubles the time for a look-up, and the next doubling happens at 1_000_000_000_000 symbols).
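The tree arithmetic can be checked with a short sketch. `tree_lookup_steps` is a made-up helper that assumes a perfectly balanced binary tree and counts one comparison per level; a real symbol table will differ in its constants.

```rust
// Hypothetical helper: number of levels in a balanced binary tree holding
// `n` symbols, i.e. ceil(log2(n)) for n >= 1.
fn tree_lookup_steps(n: u64) -> u32 {
    // The bit length of (n - 1) equals ceil(log2(n)).
    64 - (n.max(1) - 1).leading_zeros()
}

fn main() {
    for n in [1_000u64, 1_000_000, 1_000_000_000_000] {
        println!("{:>16} symbols -> about {} steps", n, tree_lookup_steps(n));
    }
}
```

Under that assumption the three sizes come out at roughly 10, 20 and 40 steps, which is the doubling pattern described.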
> Since both are supported, maybe the parser can be made smarter?
Not sure what this has got to do with the parser (7..0 parses fine, it just creates an empty iterator), but implicitly reversing ranges is one of the most annoying features of R: it makes it annoyingly hard to do things like `x..y - 3`.
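A small sketch of the behaviour, with illustrative helper names that are not std APIs: a range whose start is not below its end is simply empty, which is what makes an expression like `x..y - 3` safe when the subtraction crosses below `x`.

```rust
// Illustrative helper: collects the indices visited by `start..end`.
fn visit(start: usize, end: usize) -> Vec<usize> {
    (start..end).collect()
}

// Hypothetical use case: everything from `x` up to three before `y`.
fn all_but_last_three(x: usize, y: usize) -> Vec<usize> {
    // saturating_sub avoids underflow when y < 3 (an addition for this sketch).
    (x..y.saturating_sub(3)).collect()
}

fn main() {
    // Descending order has to be requested explicitly.
    let down: Vec<usize> = (0..7).rev().collect();
    println!("{:?}", down); // [6, 5, 4, 3, 2, 1, 0]

    // start >= end yields nothing rather than counting down.
    println!("{:?}", visit(7, 0)); // []

    // 5..3 is empty; an implicitly reversing range would visit elements here.
    println!("{:?}", all_but_last_three(5, 6)); // []
}
```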
> Well this piece of code:
Two things: the author is effectively golfing here (they say they want the code to fit on a single page), and, the Rust version can be written just like the C++ version, with a loop:
    let mut scores = [counts[count]; 7];
    for &word in words.iter() {
        if word & !seven == 0 {
            for place in 0..7 {
                scores[place] += (word >> bits[place]) & 1;
            }
        }
    }
Yeah, it should be able to handle that easily, since both bits and scores have static length 7. Note, however, that the loop version does no more or less [] indexing than the iterator/fold version.
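For anyone who wants to compare the two forms side by side, here is a self-contained sketch. The types, the `base` parameter (standing in for `counts[count]`) and the sample data are all assumptions, since the post's real declarations are not shown in this thread.

```rust
// Assumed element type; the blog post's actual type is not shown here.
type Letters = u32;

// Iterator/fold form, shaped like the snippet quoted from the post.
fn scores_fold(words: &[Letters], seven: Letters, bits: [u32; 7], base: Letters) -> [Letters; 7] {
    words
        .iter()
        .filter(|&&word| word & !seven == 0)
        .fold([base; 7], |mut scores, &word| {
            for place in 0..7 {
                scores[place] += (word >> bits[place]) & 1;
            }
            scores
        })
}

// Plain loop form: same filter, same indexing, one statement per line.
fn scores_loop(words: &[Letters], seven: Letters, bits: [u32; 7], base: Letters) -> [Letters; 7] {
    let mut scores = [base; 7];
    for &word in words.iter() {
        if word & !seven == 0 {
            for place in 0..7 {
                scores[place] += (word >> bits[place]) & 1;
            }
        }
    }
    scores
}

fn main() {
    // Made-up inputs: `seven` allows the low seven bits, `bits` maps place -> bit index.
    let seven: Letters = 0b0111_1111;
    let bits = [0, 1, 2, 3, 4, 5, 6];
    // The middle word has bit 7 set, so the filter rejects it.
    let words = [0b0000_0101, 0b1000_0001, 0b0100_0001];

    println!("{:?}", scores_fold(&words, seven, bits, 0));
    println!("{:?}", scores_loop(&words, seven, bits, 0));
}
```

Both functions index `scores` and `bits` the same number of times; the loop form mainly gives each step its own line.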
> I get the feeling that Rust design is concerned more about write-ability
> of the code than readability.
We're usually accused of the opposite: being explicit about things really helps in the long run. We've made several choices that we think, at least, help "in the large" but hurt "in the small". We can always be wrong, of course :)
> code comprehension (debugging, integration efforts, etc) by third parties will be hampered.
In what way? I still don't understand what you're getting at here.
> Not importing the entire library will result in faster compilation.
Rust's unit of compilation is a "crate", which gets compiled at once. Metadata is then stored in the artifact (.rlib) so that you can know the interface, etc, and you don't need to recompile the rlib when you change what aspects of the library you use. So I'm a _little_ bit unsure of the _exact_ details, but I think it's the same as precompiled headers, to my knowledge.
> where codebase took 12 hours to compile on a powerful cluster.
Well, I'm sure we'll get to really huge projects someday, but we tend to follow the "small packages" philosophy, so an initial compile might take a while, but subsequent ones aren't as bad. I just did a clean build of Servo, which is roughly 2MM LOC (575k Rust, 500k C, 1MM C++ (Spidermonkey, I'm guessing)), and it took 16 minutes on the first (debug) build, and 3 minutes on the second, after touch-ing the main file. Lots of the initial time is building the 170 (!) dependencies, which then don't need to be compiled again. (And your second build times will be different based on which sub-package you're building. Some are faster, some are slower.)
We also have a lot of compiler performance improvements coming down the pipeline, including incremental recompilation within a single crate.
> Since both are supported, maybe the parser can be made smarter?
It's been debated, but it's another edge case to remember. It's not clear that adding another special rule to the language is worth saving four characters.
> Is not easily read or debugged ( on which line do you set the breakpoint, etc).
I think it depends. I come from a functional background, so reading it feels fairly straightforward to me (though that for in a fold is a bit odd), and debugging should work as usual, though I don't feel the need to use a debugger a whole lot in Rust.
Thanks for taking the time to reply. Skepticism is good! Constructive criticism is the only way things move forward.
Edit: Unicode