I feel this blog post will be very helpful to people getting into Rust for the first time (or the second, etc., if it turned them off the first time). I enjoy Rust and write as many of my personal projects in it as I can.
Here's a quote that really resonated with my early experience with the language:
"However many of these [common rust patterns] are by convention, rather than by language design. Their prevalence depends in part on the authors' stylistic choice. This meant I was more reliant on examples and documentation than I am perhaps accustomed to."
I find, even today, it's impossible to check documentation or read "the book" alone and produce idiomatic Rust code. Finally one breaks down and asks on Stack Overflow or one of the chat or forum outlets and receives an answer which (almost always accidentally) comes across as, "Oh, didn't you know it?? We always express those ideas in this format here [example code]". I've come to accept and appreciate this way of learning. But it can be baffling and off-putting to newbies who wonder how they should have known.
I think that says something about the book. I initially read the book online as well, and was put off when actually trying to get something done and not succeeding.
Later, I bought "Programming Rust" by Jim Blandy, and it changed everything.
I'd second this book recommendation. It is not only the best Rust book out there, it is also one of the best books on any programming language I have read so far.
This is one of the books which I will gladly buy once more when it is updated.
Which edition? Because honestly I found Blandy's book of very little value in providing any explanation about Rust beyond demonstrating, page over page, Rust's equivalences to C++. It's really just a translation of C++ patterns into their Rust equivalents: no explanation, history, or context about language design or idiomatic patterns from the host language. I don't know the history of The Book's publication/development vs Blandy's book well enough to make this more than conjecture, but it seems like Blandy's book is a re-take on The Book.
Also, I am not saying The Book is any good beyond a tour of the language. Come to think of it, it's actually less like TC++PL or D&EC++, and more like A Tour of C++.
Maybe the 2020 edition is better than the one I got from 2017, but if things aren't entirely different, Rust to you is basically just a reapplication of C++ idioms for a language that targets a compiler backend which was historically developed to emit machine code for C++, shaped by the demands of C++ semantics?
Can you give an example of this? I'm very curious. As an experienced Rustacean, I kind of realize that there is a lot to idiomatic style, but at the same time I've assimilated it enough that it's often no longer obvious to me.
That seems pretty natural? The only reason why you don't have this in e.g. python is that iterators are often implemented as generators, so the "own type of iterator" is implicit. But I remember pretty much the same thing from my Java days, if you need to iterate you have to create an iterator type. It's a bit less visible due to type erasure but it's there nonetheless.
Yeah it's not intuitive, but there isn't really another way to do it, right? Every concrete iterator is specific to what it returns and what it iterates over.
I think it's documented under something like "How to implement Iterator" but if that's not where people will go looking when they need it, then it should be somewhere else.
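For anyone hitting this for the first time, here is a minimal sketch of the pattern being discussed (the `Countdown` type is made up for illustration): a type defines its own iterator, or is itself the iterator, and implements the `Iterator` trait with a single `next` method.

    struct Countdown {
        remaining: u32,
    }

    impl Iterator for Countdown {
        type Item = u32;

        // Each call hands out the next value, or None when exhausted.
        fn next(&mut self) -> Option<u32> {
            if self.remaining == 0 {
                None
            } else {
                let current = self.remaining;
                self.remaining -= 1;
                Some(current)
            }
        }
    }

    fn main() {
        // All the usual adapters (map, filter, collect, ...) come for free
        // once `next` exists.
        let values: Vec<u32> = Countdown { remaining: 3 }.collect();
        assert_eq!(values, vec![3, 2, 1]);
    }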
to be more "idiomatic" instead of doing the match on the Option. The methods like "unwrap_or...()", "map...()", "ok...()", "and_then()", etc. on Result/Option are very useful, if not a bit difficult to find the right one to use sometimes. Deeply nested match, "if let Some", etc. code becomes like a straight chain of method calls. In the end, I find that the Result/Option methods shorten code considerably and improve readability.
Also, instead of doing an unwrap() on optional values for assertion, sometimes I like:
    let my_list = vec![1,2,3];
    assert_eq!(Some(&my_list[0]), my_list.first());
Mostly just depends though since I don't care too much in tests.
I'm guessing they're referring to the `?.` "safe call" operator[0], which essentially acts as a mix of `map` and `andThen`, except only for property accesses and method calls, and can be combined with `let` for a more general map-like behaviour.
[0] where many conceive of the operation as `?` being a modifying operator to `.`
    fun process_item(input: Item?): Item? {
        return input?.plus(3)
    }
This would have to be modified in rust to apply to both error and option types, as well as working on stuff other than the . operator, but I think it could be made to work.
The main difference is that the ?. operator in kotlin works at the expression level, instead of the function level.
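For comparison, here is a rough Rust sketch of that snippet (using i32 instead of the hypothetical Item, since Rust has no `?.` for Option): `map` covers the expression-level case, while `?` bails out of the whole function.

    fn process_item(input: Option<i32>) -> Option<i32> {
        // Expression-level: apply the operation only if a value is present.
        input.map(|x| x + 3)
    }

    fn process_item_early_return(input: Option<i32>) -> Option<i32> {
        // Function-level: `?` returns None from the whole function early.
        let x = input?;
        Some(x + 3)
    }

    fn main() {
        assert_eq!(process_item(Some(1)), Some(4));
        assert_eq!(process_item_early_return(None), None);
    }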
You can create and use your own Option without problem; your Option will be crate::Option, whereas Rust's Option will be std::option::Option.
std::option::Option<Option> works, yay.
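A tiny sketch of that, assuming the enum lives at the crate root (the variant and function names are made up): the local type shadows the prelude's name, while the std one stays reachable by its full path.

    // A crate-local Option; the prelude's Option is now shadowed by name.
    enum Option {
        Yes,
        No,
    }

    // Some/None still refer to std's variants, so this really is
    // std::option::Option<crate::Option>.
    fn pick(flag: bool) -> std::option::Option<Option> {
        if flag { Some(Option::Yes) } else { None }
    }

    fn main() {
        assert!(matches!(pick(true), Some(Option::Yes)));
    }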
I think NULL in SQL is very different from NULL in most PLs, and is much better behaved. It is in fact much closer to an Optional than to a 0 pointer.
First of all, in SQL NULL is entirely your choice - no column is forced to accept NULL values.
Secondly, most SQL functions and operators treat NULL just like mapping over an optional: a < b works just like a.and_then(|a| b.map(|b| a < b)) (see the sketch after this comment).
The only exceptions are aggregation functions, which also do the more sane thing in my opinion, at least for most use cases: AVERAGE() over (a, NULL, b) is (a+b)/2.
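A sketch of that analogy in Rust (the function name is my own): comparing two nullable values yields a nullable result, and a NULL operand propagates, just like `a < b` in SQL.

    fn lt(a: Option<i32>, b: Option<i32>) -> Option<bool> {
        // Only produce a comparison when both sides are present.
        a.and_then(|a| b.map(|b| a < b))
    }

    fn main() {
        assert_eq!(lt(Some(1), Some(2)), Some(true));
        assert_eq!(lt(Some(1), None), None); // the "NULL" propagates
    }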
From that list I think the only really surprising one is "3 NOT IN (1, 2, NULL)" returning NULL. The fact that ARRAY_AGG(C) and COUNT(C) treat NULL differently is also a bit surprising, though in terms of usefulness I think the current behavior is not bad.
I also think the definition is relatively clear: NULL means "unknown"/"missing data".
The MIN, MAX, and AVERAGE of an empty list is N/A. SUM could have been 0, but then SUM(X) / COUNT(X) would be different from AVERAGE(X).
The count of tuples (COUNT(*)) in an empty relation is 0.
The count of pieces of data on column X (COUNT(X)) of a table ((1), (NULL)) is 1.
If we have 2 tables A(ID, C1), VALUES ((1, 2)), B(ID, C2), VALUES ((2, 3)), and try to ask what value B.C2 has for A.ID=1, the answer can only be NULL.
IS NULL and IS DISTINCT FROM NULL are explicitly treating NULL differently, so it's not that surprising that they can be both true and both false.
Records and Strings are not the same KIND of types, so I don't see anything surprising about the fact that one can contain NULLs while the other can't. It's like complaining that a record in Haskell can have a Maybe field, while a String can't have a Maybe character.
> The MIN, MAX, and AVERAGE of an empty list is N/A. SUM could have been 0, but then SUM(X) / COUNT(X) would be different from AVERAGE(X).
0/0 is mathematically undefined, so it is reasonable for that calculation to produce NULL on its own, and then SUM could do the sensible thing of returning the additive identity.
I still find it much more useful in practice for SUM to return a NULL and not 0. Especially when plotting data over a period of time, having the chart go down to 0 and then go back up to some value is extremely annoying, and the query that would produce the current behavior of SUM would be much more annoying if it were implemented the other way around, while it is extremely easy to transform NULL to 0 if you want.
But why is the SUM of no elements NULL, but the COUNT-star of no elements zero?
Sure, for any given case you can come up with some explanation. But there is no general explanation that covers many cases. And that leads to weirdness and surprises and bugs.
Option types are way better and always do what you expect.
> Option types are way better and always do what you expect.
No, I can implement the exact same weird functions with option types, with signatures
    sum: Array<Int> -> Optional<Int>
    count_elements: Array<Int> -> Int
FWIW, I personally prefer both functions to not return Optionals, but the point is that the mere presence of Option types in your type system (and the absence of null pointers) doesn't magically eliminate bad library design.
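To make that concrete, a minimal Rust sketch with the same (deliberately SQL-flavoured) behaviour; the names and the empty-input choices just mirror the point above, not advocate for this design.

    fn sum(xs: &[i32]) -> Option<i32> {
        // Mirrors SQL's SUM: no rows means "no answer", not zero.
        if xs.is_empty() {
            None
        } else {
            Some(xs.iter().sum())
        }
    }

    fn count_elements(xs: &[i32]) -> usize {
        // Mirrors COUNT(*): an empty input simply counts to zero.
        xs.len()
    }

    fn main() {
        assert_eq!(sum(&[]), None);
        assert_eq!(count_elements(&[]), 0);
    }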
> I think NULL in SQL is very different from NULL in most PLs, and is much better behaved. It is on fact much closer to an Optional than to a 0 pointer.
I beg to differ. Dereferencing NULL in C is generally well defined: SIGSEGV. Rust's unwrap() is a modest improvement on that: it reliably panics. Rust's main contribution is that its type system forces you to acknowledge the possibility that it might be null by making you write unwrap().
I'd take that any day over what SQL does when you "dereference" a NULL, by say 1 < null. It just propagates the error so it poisons whatever expression contains it. I avoid nulls whenever possible to avoid that.
If SQL were in fact closer to Rust's Option, then an attempt to use a possibly-null value/column with an operator that can't handle null would produce a type error. So writing could_be_null > 0 would be illegal; you would have to write ifnull(could_be_null, 0) > 0.
Dereferencing NULL in C is undefined behavior - it has no semantics in the C abstract machine. In practice, with a modern optimizing compiler, it can have very surprising behavior, including omitting further null checks after the dereference. While it's true that an explicit dereference is very likely to cause a SIGSEGV (these are Unix semantics, not C), dereferencing NULL + something is likely to lead to memory corruption (very ugly in v[length - 1] or v->field for v==NULL).
Also, the recommended way of working with Maybe/Optional values is map() or do-notation, not unwrapping. And map/do have exactly the same semantics as most NULL-based operations in SQL.
Dereferencing Null in C when running on a hosted OS is fairly predictable. Don't forget that a lot of C (and increasingly a lot of Rust) is written for embedded targets where that is usually not true (sometimes address 0 is the start of flash).
The only things you can rely on are what the C Spec says you can rely on, and that does not include the behaviour of address 0.
The only thing that I find weird is the description of the leaky abstractions for the pin interface in "traditional" OOP. I understand the general sentiment, but this particular case is a prime example where polymorphism expresses the exact same thing just as well (here in C++):
    #include <cstdint>

    class Pin
    {
    public:
        Pin(std::uint8_t pin) : _pin(pin) {}
        virtual ~Pin() = default; // base class could also be made abstract here
    private:
        std::uint8_t _pin;
    };

    class ReadPin : public Pin
    {
    public:
        ReadPin(std::uint8_t pin) : Pin(pin) {}
        double read();
    };

    class WritePin : public Pin
    {
    public:
        WritePin(std::uint8_t pin) : Pin(pin) {}
        void write(double level);
    };
You can then end up either with
    ReadPin sensor_pin = gpio.getReadPin(SENSOR);
or, if you for some reason want to keep the chaining, nothing prevents you from having a
Maybe it's just not the right kind of example, but I don't really understand how OOP's failure is that you can call write on a ReadPin. The author says:
> This kind of compromise happens often in the "kitchen sink" approach of OOP, where a child class might have access to inappropriate data or behaviour. Perhaps that data is relevant to a sibling class, but this is a sign of the abstraction leaking.
...which boils down to "if you violate the Liskov Substitution Principle, you are not using OOP correctly". I concede that this can and occasionally does happen in practice, but it's not like other languages are incapable of following the SOLID principles. It might just be easier not to do so.
I'm improvising here, but if pins can be switched from read to write (but can only be in one of those modes at a time), then in Rust you should be able to use the borrow checker to provide additional guarantees. As long as a pin is borrowed by a ReadPin, you can't create a WritePin, and vice versa.
As you say, you can use the same approach in C++, but this is one reason why the approach is extra interesting in Rust.
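A rough sketch of that idea in Rust (not the embedded-hal API, just illustrative types and names): ownership makes the read/write modes mutually exclusive, because converting the pin consumes it.

    struct Pin {
        id: u8,
    }

    struct ReadPin {
        pin: Pin,
    }

    struct WritePin {
        pin: Pin,
    }

    impl Pin {
        // Consuming `self` means a pin can be in only one mode at a time.
        fn into_read(self) -> ReadPin {
            ReadPin { pin: self }
        }
        fn into_write(self) -> WritePin {
            WritePin { pin: self }
        }
    }

    impl ReadPin {
        fn read(&self) -> f64 {
            0.0 // hardware access elided
        }
        // Give the underlying pin back so it can be reconfigured.
        fn release(self) -> Pin {
            self.pin
        }
    }

    impl WritePin {
        fn write(&mut self, _level: f64) {
            // hardware access elided
        }
        fn release(self) -> Pin {
            self.pin
        }
    }

    fn main() {
        let pin = Pin { id: 7 };
        let reader = pin.into_read(); // `pin` is moved; no WritePin possible now
        let _level = reader.read();
        let pin = reader.release();   // get it back to reconfigure
        let mut writer = pin.into_write();
        writer.write(0.5);
    }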
    match list.first() {
        Some(item) => {
            // Do something with resource
            println!("{}", item);
        }
        None => {}
    }
---------
Wouldn't this be better with design by contract, by having first() enforce length > 0 and then turning off assertions if desired for release code? Also, OP should have mentioned that not matching None is an error, otherwise there's no point.
> Wouldn't this be better with design by contract...
I don't believe so. No plans survive contact with "the enemy", and the enemy here is input that your system may not have control over (such as processing streaming data from an external source). If we simply turn off the check for release builds, this case would halt the system rather than provide an option to log the offending input and contextual information to an error channel and perhaps ignore or recover from the situation (as appropriate).
Maybe I misunderstand what you meant, but by having `first()` return an `Option`, you can verify at compile-time that the caller handles the empty case, whereas an assertion would not detect the issue until runtime.
If the caller wants to kill the program when it's empty, they can just call `.unwrap()` on the option to explicitly opt in to a runtime check.
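A short sketch of that contrast (the function names are mine): the first version is forced by the compiler to consider the empty case, the second opts into a panic explicitly.

    fn describe_first(list: &[i32]) -> String {
        match list.first() {
            Some(item) => format!("first item is {item}"),
            None => String::from("list is empty"),
        }
    }

    fn first_or_die(list: &[i32]) -> i32 {
        // Panics on an empty list, but only because we asked for it.
        *list.first().unwrap()
    }

    fn main() {
        assert_eq!(describe_first(&[]), "list is empty");
        assert_eq!(first_or_die(&[7, 8]), 7);
    }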
which would be straight UB in "safe Rust" as soon as you invoke it on an empty slice (or Vec), which is completely verboten.
What you could do is have a separate `NonEmpty` vector, either by construction or by assertion, for which `first` wouldn't be fallible (but other operations would be).
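A minimal sketch of the "by construction" variant (the type and method names are made up): construction is the only fallible step, so `first` can return a plain reference.

    struct NonEmpty<T> {
        head: T,
        tail: Vec<T>,
    }

    impl<T> NonEmpty<T> {
        // The emptiness check happens exactly once, here.
        fn new(mut v: Vec<T>) -> Option<Self> {
            if v.is_empty() {
                None
            } else {
                let head = v.remove(0);
                Some(NonEmpty { head, tail: v })
            }
        }

        // Infallible: a NonEmpty always has a first element.
        fn first(&self) -> &T {
            &self.head
        }
    }

    fn main() {
        let ne = NonEmpty::new(vec![1, 2, 3]).unwrap();
        assert_eq!(*ne.first(), 1);
    }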
> Also OP should have mentioned not matching none is an error otherwise there's no point
That actually is irrelevant to the story OP is telling. They're talking about null-safety, match-completeness is an orthogonal concern (and one they don't really care for here, they just want to operate on a present value).
I'm getting it too, but it's not happening everywhere. It seems like your selection has to include code and the cursor has to sit over whitespace in a limited horizontal range. Nothing looks off about the spans used to color the code so I'm guessing it's something strange about the sidebar.
> FP's conceit is trading performance for correctness.
This isn't even remotely the case. The most popular functional languages are substantially faster than the most popular imperative languages.
Haskell or OCaml both smoke the top 3 imperative languages: Python, Java, JavaScript.
The "conceit" is trading the conceptual model most people learn in high school/trade programming classes for correctness, with little to no bearing on performance.
> Rust is, as far as I'm aware, the first language that allows you to deploy functional concepts as zero-cost abstractions.
This is definitely not the case. You can deploy zero-overhead functional concepts in C++ using template programming, constexprs, etc. Also, "zero-overhead" compared to what? Plenty of functional language have abstractions that can reasonably be described as zero-overhead, like newtypes in Haskell or functorial modules in OCaml. You can even translate the non-monomorphically-recursive subset of Haskell to pure hardware, which is pretty much the definition of "zero overhead".
This article reads more like a regurgitation of taglines than original thinking.
I am not too familiar with FP languages, but I must say I have never seen an FP language in low-level embedded or GPU code.
Of course there are garbage collected imperative languages which are slower than, say, Haskell, but can Haskell's speed compare to the most efficient C implementation of the same code without making it unreadable?
This is what Rust's promise of "zero cost abstractions" means: if you went all in and handcrafted every pointer, it wouldn't become more efficient than it is in its abstracted form.
Again, I don't know enough about FP languages to make qualified statements, but I thought this was a Rust novelty.
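To illustrate what "zero cost" is getting at, a small sketch: on a release build the iterator pipeline below is generally expected to compile to essentially the same machine code as the hand-written loop (I'm not claiming bit-identical assembly in every case).

    fn sum_of_squares_iter(v: &[i64]) -> i64 {
        // The "abstracted" form: adapters and a fold.
        v.iter().map(|x| x * x).sum()
    }

    fn sum_of_squares_loop(v: &[i64]) -> i64 {
        // The hand-written form.
        let mut total = 0;
        for &x in v {
            total += x * x;
        }
        total
    }

    fn main() {
        let v = [1, 2, 3];
        assert_eq!(sum_of_squares_iter(&v), sum_of_squares_loop(&v));
    }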
https://clash-lang.org/ is a functional HDL. Compiles to VHDL, Verilog, or SystemVerilog. That's lower level than just about anything else you can get code for (I suppose SPICE is even lower level in this sense...)
I'm responding to the article, which made the claim that the tradeoff when using a functional language is performance vs correctness. This is obviously not true in any meaningful capacity, when the vast majority of imperative language deployments are using languages slower than the vast majority of functional language deployments. The fact that the most popular imperative languages are usually JIT compiled (not interpreted, except python) instead of statically compiled is not relevant to the argument.
The claim wasn't "most popular vs. most popular". Why don't you choose "highest performance `FP language` vs highest performance `imperative language`"?
Also, `FP language` and `imperative language` aren't well-defined concepts (nor can they be, frankly). These are paradigms that languages have varying degrees of support for. So the entire foundation of the argument is meaningless.
It's actually even worse than that. Functional programming is on pretty much all accounts FASTER than imperative programming by design. Imperative programming heavily relies on state mutation and branching. FP relies on immutability, purity and streaming. Essentially all those are properties that make code faster/parallelizable. Yes, you can usually write SOME imperative form of an algorithm that is faster than its FP counterpart, but generally it will be much less readable and maintainable. The beauty about FP is that you can get readable, maintainable code that is often on par or faster than "good looking" imperative code and also parallelizable. Writing imperative, parallel code is often an absolute nightmare, that any entry level programmer and even experienced programmers will get wrong.
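As a small illustration of the parallelism point (standard library only, scoped threads; the function is my own example): because the slice is only read, it can be split across threads with no locks and no data races.

    fn parallel_sum(data: &[i64]) -> i64 {
        let (left, right) = data.split_at(data.len() / 2);
        std::thread::scope(|s| {
            // Both threads borrow disjoint, immutable halves of the data.
            let l = s.spawn(|| left.iter().sum::<i64>());
            let r = s.spawn(|| right.iter().sum::<i64>());
            l.join().unwrap() + r.join().unwrap()
        })
    }

    fn main() {
        assert_eq!(parallel_sum(&[1, 2, 3, 4]), 10);
    }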
The best performing languages are imperative (C, C++, Rust), with functional languages being at least an order of magnitude slower both in benchmarks and real-life workloads. Whatever properties were supposed to make it faster/parallelizable have clearly not manifested themselves very well after decades.
Imperative and parallel code also does not have to be a nightmare at all, as evidenced by Rust and even Go.
> functional languages being at least an order of magnitude slower both in benchmarks and real-life workloads
Not even the most contrived benchmark games place functional languages an order of magnitude slower than C. Maybe 2-5 times slower on average. Of course, this doesn’t actually matter, because 95% of apps are performance constrained because of architectural reasons, and functional languages tend to scale better in terms of architectural asymptotics.
> Imperative and parallel code also does not have to be a nightmare at all, as evidenced by Rust and even Go.
Go is, by all accounts, a massive nightmare. I think this claim calls your judgement into question.
Both of your comments would have been upvoted if they didn't have offensive judgements in the last sentence.
You speak the truth, functional programming languages have excellent performance characteristics when compared to other GC'd or JIT'ed languages.
I'm not sure about FP leading to better architecture, it might just be the type system there, which Rust has adapted (in a slightly less powerful form) as well.
Maybe if you had used Go to solve the sort of problem that it's meant to solve you wouldn't have judged it such a nightmare.
Go was designed as a language with fairly good performance, but with limited capability so that programmers with limited skills would find it hard to get too far into the weeds. It is well suited to undemanding tasks.
Most programming tasks are undemanding, and most programmers have limited skills, so it is a good match for a great deal of routine work. It fits in the same niche as Java, where Java's OO apparatus imposes complications that are superfluous for simple applications, and an actual obstacle in most. So, a language that abandons all that is better in those applications.
Simple tools are not bad except when you need something more. So, my agreement was with the idea in mind of writing Pandoc in Go, which would be madness.
Thanks for the details. I hear what you've said. I agree that Go was built with limited capability in order to make it easy for junior developers. And I agree it can slow down more senior developers at times. Although not nearly as much as I expected when I first started using it.
I don't know Pandoc so I'll take your word for it. I'll also agree that there are entire classes of applications that Go isn't well suited for, but we'll have to disagree as to whether it's limiting for the types of applications it was created for. I must not be building the same class of application as you are.
By definition, it is not limiting for the types of applications it was created for. We might disagree on where to put the boundary around those applications. Historically, tools have always found uses beyond their intended purpose, and Go is unlikely to buck that.
It really comes down to whether, while working on a big system, you will discover a necessary task that a language cannot do well. This might be because it lacks a key feature, like bit twiddling operations, first-class functions, or operator overloading. It might be inherently just not fast enough to meet a deadline, or not consistently so. It might not know enough about types to prevent common mistakes, or might lack operations on types needed to direct compilation. It might lack the organizational features needed to make a large system manageable. That doesn't make it a bad language, but would make it an unwise choice for a project that might, as time unfolds, be found to need one or other such quality. Generally, the bigger a project is, the more unexpected requirements surface.
This is where we get to the idea of "dynamic range". A language with a wide dynamic range takes correspondingly long to learn. People skilled in it are harder to find and more expensive to hire, you have fewer of them ready to hand, and those you have are likely to be already busy. Yet, you might need more, and any you have more than earn their keep.
In theory purity unlocks optimizing potential, but in practice I don't think what you're saying holds true.
Mutating a value is almost always cheaper than allocating a new version of that value and collecting the old one. I think you're vastly understating the "you can usually" and overstating the "much less readable and maintainable".
Haskell is a pure functional language and uses linked lists as its main list data structure. This is because it is difficult to modify individual elements in an array in a purely functional manner without copying the entire array. Meanwhile, using arrays in C or Rust and changing individual elements is not only very easy, it's very fast as well.
The default string implementation in Haskell is built on a linked list of characters and it performs horribly, so Haskell has multiple string implementations. The default string implementation is still often used because it is more convenient. This is an example where writing more readable code also involves making it perform worse. Rephrased: Faster code is less readable in Haskell.
Immutable data means lack of aliasing problems (two pointers to a mutable cell). There are other languages that have this property, notably Fortran. It does allow many speed optimizations, in both parallel and single-thread cases. Yes, this is one of the reasons why Fortran is routinely faster than C on numeric code, and even Haskell sometimes is, too.
It is theoretically proven that any algorithm which uses mutable data can be converted to use immutable data with a memory and time penalty of at most an O(log n) factor. In absolute terms, such a penalty can be low, or pretty high (consider working with large bitmaps). There's no silver bullet.
Functional code is often preferred due to its clarity, both humans and machines have easier time reasoning about it, and even formally proving its properties. Sometimes it is just more important than raw speed, but sometimes allows for more aggressive optimizations (rewriting by the compiler) and higher speed.
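A sketch of the structural sharing behind those results, in Rust for consistency with the thread (a toy persistent list, not a production type): "updating" the front shares the entire old tail instead of copying it.

    use std::rc::Rc;

    enum List<T> {
        Nil,
        Cons(T, Rc<List<T>>),
    }

    fn push<T>(head: T, tail: &Rc<List<T>>) -> Rc<List<T>> {
        // O(1): the new node points at the existing, unchanged tail.
        Rc::new(List::Cons(head, Rc::clone(tail)))
    }

    fn main() {
        let empty: Rc<List<i32>> = Rc::new(List::Nil);
        let base = push(1, &empty);
        let a = push(2, &base); // shares base's nodes
        let b = push(3, &base); // also shares base's nodes
        let _ = (a, b);
    }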
>Functional code is often preferred due to its clarity, both humans and machines have easier time reasoning about it, and even formally proving its properties. Sometimes it is just more important than raw speed, but sometimes allows for more aggressive optimizations (rewriting by the compiler) and higher speed.
I'm curious if that's really the case in practice. Functional code allows for clever concise implementations, and clever concise implementations are notoriously hard to read. Go, for example, is often lauded for its readability due to its more verbose and non-clever nature.
edit: To clarify, I mean readability by the above-average programmer not by a world expert who has spent 15 years becoming one with the language.