I strongly disagree with this notion of "simplicity" as being attributable to scarcity of language features. Some of the languages that I felt were the easiest to use had quite a number of language features, but had simple semantics. I think Rich Hickey nailed this in his "Simple Made Easy"[1] talk. Complexity is not about additivity, it's about entanglement.
> Complexity is not about additivity, it's about entanglement.
This. And nothing reflects entanglement better than a formal semantics. English (or any other natural language) always lets you sweep it under the rug. The only objective measure of simplicity is the size of a formal semantics.
> nothing reflects entanglement better than a formal semantics
A formal semantics is just a way to translate from one formalism to another.
It's rather obvious that choosing the target formalism determines how simple the language will appear. When you talk about "formal semantics" you should specify which one: operational? denotational? axiomatic?
Strictly speaking, a compiler or an interpreter represents a formal semantics for a language: operational semantics rules are often very similar to the code of an AST interpreter, for example.
One could interpret your statement to mean that the smaller the compiler, the simpler the language, which means that assembly language was the simplest language all along!
For example, in your reddit post you claim that := is problematic, and indeed its semantics is tricky and often trips up beginner (and even experienced!) programmers. However, the semantics of := is not actually that complicated -- "declare every variable on the left-hand side that isn't already defined in the current scope; assign to the ones that are" -- and the errors stem from the fact that people assume that the scope lookup for := is recursive, which would arguably result in a more complicated formal semantics.
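To make the shadowing trap concrete, here's a minimal Go sketch (the function names are made up for illustration):

    package main

    import "fmt"

    func doSomething() error            { return nil }
    func doSomethingElse() (int, error) { return 0, fmt.Errorf("oops") }

    func main() {
        err := doSomething()
        if err == nil {
            // The if block is a new scope, so := declares a *new* err here,
            // shadowing the outer err instead of assigning to it.
            result, err := doSomethingElse()
            _, _ = result, err
        }
        fmt.Println(err) // still nil: the outer err was never updated
    }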
> For example, in your reddit post you claim that := is problematic, and indeed its semantics is tricky and often trips up beginner (and even experienced!) programmers. However, the semantics of := is not actually that complicated -- "declare every variable on the left-hand side that isn't already defined in the current scope; assign to the ones that are" -- and the errors stem from the fact that people assume that the scope lookup for := is recursive, which would arguably result in a more complicated formal semantics.
Clearer examples of unnecessary complexity in Go would be the function-scoped nature of "defer" (implicit mutable state is much more complicated than block scoping) and the inconsistent behavior of "nil" with the built-in collections (reading from a nil map returns a zero value, but indexing a nil slice panics).
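Both are easy to demonstrate; a minimal sketch:

    package main

    import "fmt"

    func main() {
        // defer is function-scoped, not block-scoped: every deferred call is
        // queued until main returns, even the ones written "inside" the loop.
        for i := 0; i < 3; i++ {
            defer fmt.Println("deferred from iteration", i)
        }
        fmt.Println("loop done") // prints first; the deferred lines print last, in reverse order

        // nil behaves differently across the built-in collections:
        var m map[string]int
        fmt.Println(m["missing"]) // reading a nil map: zero value (0), no panic
        // m["x"] = 1             // but writing to a nil map would panic

        var s []int
        // fmt.Println(s[0])      // indexing a nil slice panics (index out of range)
        s = append(s, 1)          // appending to a nil slice is fine, though
        fmt.Println(s)            // [1]
    }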
> A formal semantics is just a way to translate from one formalism to another.
Of course, we need to reach a gentleman's agreement regarding which formalism is a good “foundation” for defining everything else. My personal preference would be to define all other formal systems in terms of rules of inference.
> It's rather obvious that choosing the target formalism determines how simple the language will appear. When you talk about "formal semantics" you should specify which one: operational? denotational? axiomatic?
I am fine with any, as long as the same choice is made for all languages being compared. What ultimately interests me is proving a type safety theorem, that is, a precise sense in which “well typed programs don't go wrong”, so perhaps this makes a structural operational semantics more appropriate than the other choices.
> Strictly speaking, a compiler or an interpreter represents a formal semantics for a language: operational semantics rules are often very similar to the code of an AST interpreter, for example.
> One could interpret your statement to mean that the smaller the compiler, the simpler the language, which means that assembly language was the simplest language all along!
Sure, but the target languages used by most compilers are often themselves very complex. Which means a realistic compiler or interpreter most likely won't be a good benchmark for semantic simplicity.
>Of course, we need to reach a gentleman's agreement regarding which formalism is a good “foundation” for defining everything else. My personal preference would be to define all other formal systems in terms of rules of inference.
If you are interested in defining "low cognitive load" that's a poor choice, in my opinion.
>I am fine with any, as long as the same choice is made for all languages being compared. What ultimately interests me is proving a type safety theorem, that is, a precise sense in which “well typed programs don't go wrong”, so perhaps this makes a structural operational semantics more appropriate than the other choices.
I'm not aware of any such thing. The kinds of formal semantics that academics prefer deal very poorly with the realities of finite execution speed and memory, while the kinds that practitioners use (which usually aren't referred to as "formal semantics" but rather "what does this compile to") deal very poorly with output correctness.
However, this has little to do with cognitive load: even if such a formal semantics existed, it doesn't necessarily mean it would be easy for a human mind.
> Sure, but the target languages used by most compilers are often themselves very complex. Which means a realistic compiler or interpreter most likely won't be a good benchmark for semantic simplicity.
If you agree that formal semantics is just a translation from one formalism to another, you can't claim that a formalism A is semantically more complex than formalism B without picking a formalism C as a reference point.
> If you are interested in defining "low cognitive load" that's a poor choice, in my opinion.
I'm interested in “low cognitive load without sacrificing technical precision.” It's a much harder goal to achieve than “low cognitive load if we hand-wave the tricky details.”
> However, this has little to do with cognitive load: even if such a formal semantics existed, it doesn't necessarily mean it would be easy for a human mind.
Which is exactly my point. I only consider a language simple if its formal description is simple.
> If you agree that formal semantics is just a translation from one formalism to another, you can't claim that a formalism A is semantically more complex than formalism B without picking a formalism C as a reference point.
No disagreement here. I even stated my personal choice of C.
> I'm interested in “low cognitive load without sacrificing technical precision.”
You don't seem to be interested in low cognitive load at all, otherwise:
> No disagreement here. I even stated my personal choice of C.
you would have attempted to motivate your choice of reference point in terms of cognitive load. Even if induction mathematics were the way the human mind worked (which it isn't), it's very different from CPUs, and there is a cognitive load (and a semantic distance) in going from mathematics to CPUs.
> Even if induction mathematics were the way the human mind worked (which it isn't)
Even if it isn't how the human mind works, it's how computing itself works. Would you take seriously a physicist who denies gravity? I wouldn't take seriously a computer scientist who denies structural induction.
> The only objective measure of simplicity is the size of a formal semantics.
If we accept that, then simplicity alone is not a desirable goal. Something may well be formally simple but at the same time incompatible with human cognition. Indeed, that may not be objective, but since when do we value things only by objective measures? That the only objective measure of simplicity may be the size of formal semantics does not mean that it is the most useful measure of simplicity (if we wish to view simplicity as possessing a positive value that implies ease of understanding).
But for writing actual programs, the complexity of using matters as much as the complexity of understanding.
(I recognize that this doesn't invalidate the point you are trying to make in the parent post. They aren't incompatible with human understanding. They're incompatible with writing programs in a reasonable amount of time, though.)
In general, the simpler the language, the more complex the code to implement the solution in that language, and so the harder it is to understand the code. But the more complex the language, the simpler (and easier to understand) the code, but the language itself is harder to understand.
It's almost like you want the language to have the square root of the complexity of the problem.
(This is in general. The big way around this is to pick a language that is well-suited for your particular problem.)
If you want an alternative explanation for simplicity, I'd say simplicity implies flexibility.
Designing a simple implementation of something means keeping it as close as possible to the essence of what you've designed it for, and by doing so you've made it more universal, and therefore more flexible/adaptable.
So I sort of agree with you here, but only as a partial converse:
> If all the formal semantic models for a language are unwieldy then you've probably got a non-simple language.
Now, "simplicity" is a mental construct, a language UX construct. To handle this, I think of "unwieldy" as a bit of a technical term. What does it mean to be unwieldy? It means that there is significant non-ignorable complexity.
Significant here must be defined almost probabilistically, too. If there is significant complexity which is ignorable across 99/100 real-world uses of a language then it really should win some significant points.
Ignorable complexity is also an important concept. It asks you to take empirical complexity measures (you mention Kolmogorov complexity; sure why not?) and temper them against the risk of using a significantly simpler "stand-in" semantic model. I accept that the stand-in model will fail to capture what we care about sometimes, but if it does so with an acceptable risk profile then I, pretty much definitionally, don't care.
Now that I've weakened your idea so much, it's clear how to slip in justifications for really terrible languages. Imagine one with a heinous semantics but a "tolerable" companion model which works "most of the time".
From this the obvious counterpoint is that "most of the time" isn't good enough for (a) large projects (b) tricky problems and (c) long support timelines. Small probabilities grow intolerable with increased exposure.
---
But after all this, we're at an interesting place, because we can now talk about real languages as being things with potentially many formally or informally compatible formal or informal semantic models. We can talk about how complexity arises when too few of these models are sufficiently simple. We can also talk about whether or not any of these models are human-intelligible and measure their complexity against that metric instead of something more alien like raw Kolmogorov complexity.
So here's what I'd like to say:
> Languages which hide intolerable complexity in their semantics behind surface simplicity are probably bad long-term investments.
and
> Languages which have many "workably compatible" semantic models, each of which is human-intelligible, are vastly easier to use, since you can pick and choose your mode of analysis with confidence.
and
> Value-centric semantic models (those ones with that nasty idea of "purity" or whatever) are really great for reasoning and scale very well.
In particular, I'm personally quite happy to reject the assertion made elsewhere that value-centric semantics are not very human intelligible. On the other hand
> Simple operational semantic models are also pretty easy to understand
> Now, "simplicity" is a mental construct, a language UX construct.
My take on “simplicity” is very computational. To me, a programming language is a system of rules of inference, whose judgments are of the form “program is well-formed” (which covers syntax and type checking) and “program does this at runtime” (a reduction relation, a predicate transformer semantics, or whatever fits your language's dynamics best). Then, simplicity is just some measure of the language's size as a collection of rules of inference. Also:
0. Undecidable rules of inference (e.g., type reconstruction for a Curry-style System F-omega) are considered cheating. Undefined behavior (e.g., C and C++) is also considered cheating. Cheating is penalized by considering the entire language infinitely complex.
1. Languages (e.g., ML's module system) are allowed to be defined by elaboration into other languages (e.g., System F-omega). Elaboration into a language that cheats is considered cheating, though.
> To handle this, I think of "unwieldy" as a bit of a technical term. What does it mean to be unwieldy? It means that there is significant non-ignorable complexity.
I don't see any complexity as ignorable at all. I just see some complexity as worth the price - but you, the programmer, need to be aware that you're paying a price. For instance, the ease with which one can reason about Haskell programs (without the totally crazy GHC extensions) justifies the increased complexity w.r.t., say, Scheme.
> Significant here must be defined almost probabilistically, too. If there is significant complexity which is ignorable across 99/100 real-world uses of a language then it really should win some significant points.
This is ease of use, which is subject to statistical analysis; not simplicity, which is not.
I don't want to deny that those "quantitative" measures exist. I want to cast doubt that they're the dominant mechanism for modeling how real people think when they're accomplishing a task in a formal system.
Generics solve an occurrence of too much entanglement. That is, they solve the entanglement of an abstract "shape" of computation with a specific set of type definitions. Generics actually allow you to not think about an additional dimension of your program (i.e., the exact types a computation or data type can be used with).
Haskell programmers famously point this out with the observation that a generic fmap is safer than one that has knowledge of the concrete types it uses. The type signature of fmap is this:
fmap :: Functor f => (a -> b) -> f a -> f b
In practice, what this means is that you can be assured that your fmap implementation can only apply the passed function over the value(s) wrapped in the functor, because of the fact that it cannot have visibility into what types it will operate on.
In golang, because of a lack of generics, you can write a well-typed fmap function, but it will inherently be coupled with the type of the slice it maps over. It also means the author of such a function has knowledge of all the properties involved in the argument and return type of the function passed, which means the writer of an fmap can do all kinds of things with that data that you have no assurances over.
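For illustration, a sketch of that coupling in (pre-generics) Go; the names are made up, and the point is just that the concrete element type gives the implementation room to misbehave in ways the caller can't rule out:

    package main

    import "fmt"

    // A well-typed "fmap", but only for []int and func(int) string. Because it
    // knows the concrete types, nothing stops its body from inspecting or
    // special-casing the values it maps over.
    func mapIntsToStrings(f func(int) string, xs []int) []string {
        out := make([]string, 0, len(xs))
        for _, x := range xs {
            if x == 42 { // legal precisely because the element type is known
                out = append(out, "surprise")
                continue
            }
            out = append(out, f(x))
        }
        return out
    }

    func main() {
        double := func(x int) string { return fmt.Sprint(x * 2) }
        fmt.Println(mapIntsToStrings(double, []int{1, 42, 3}))
        // [2 surprise 6] -- no parametricity guarantee for the caller
    }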
Exactly. Parametricity is the killer feature of statically typed functional languages. This is why it saddens me when Haskell and OCaml add features that weaken parametricity, like GADTs and type families.
Without either GADTs or type families, two types `Foo` and `Bar` with mappings `fw :: Foo -> Bar` and `bw :: Bar -> Foo` that compose in both directions to the identity are “effectively indistinguishable” from one another in a precise sense. If you have a definition `qux :: T Foo`, for any type function `T` not containing abstract type constructors, you can construct `justAsQuxxy :: T Bar` by applying `fw` and `bw` in the right places.
With either GADTs or type families, this nice property is lost.
However, GADTs and TFs are completely opt-in, so it seems a bit of a stretch to construe this as a generally bad thing. IME it's not as if library authors are arbitrarily (i.e. for no good reason) using GADTs or TFs instead of plain old type parameters in their APIs.
Reflection, downcasts and assigning `null` to pointers are completely opt-in in Java too.
With respect to type families, I'm probably being a little bit unfair. Personally, I don't have much against associated type families. (Although I think Rust handles them much more gracefully than GHC.) But very important libraries in the GHC ecosystem like vector and lens make extensive use of free-floating type families, which I find... ugh... I don't want to get angry.
> Reflection, downcasts and assigning `null` to pointers are completely opt-in in Java too.
No, they're not -- not in the same sense, at least. A GADT/TypeFamily is going to be visible in the API. None of the things you mentioned are visible in the API.
> A GADT/TypeFamily is going to be visible in the API.
That only works if you're never going to make abstract types. Which I guess is technically true in Haskell - the most you can do is hide the constructors of a concrete type. But the ability to make abstract types is very useful.
Don't get me wrong, I love Haskell. It's precisely because I love Haskell that I hate it when they add features that make it as hard to reason about as C++. (Yes, there I said it - type families are morally C++ template specialization.)
If a type is abstract then the rest is up to the implementation of functions that operate on the data type -- and that could be hiding all kinds of nastiness like unsafePerformIO and the like. Yet, we usually don't care about that because it's an implementation detail.
Am I missing some way to "abuse" GADTs/TFs to violate the abstraction boundary or something like that? (I seriously can't see what you think the problem is here. I mean, you can equally well abuse unsafeCoerce/unsafePerformIO to do all kinds of weird things to violate parametricity, so I don't see why GADTs/TFs should be singled out.)
It's something weaker. Consider the groupoid of Haskell types and isomorphisms. Without GADTs and type families, all type constructors of kind `* -> *` are endofunctors on this groupoid.
Note 1: And there are higher-kinded analogues, but I hope you get the idea from this.
Note 2: There are also exceptions, like `IORef` and friends.
Sorry, for some reason the “reply” link didn't appear below your post until after I had written my reply to Peaker. My reply to you is exactly the same:
> How do you have a large set of language features with them not interacting?
The ability to write interesting programs in a language comes from the interaction between its features. The real problem is features that interact in unpleasant ways, which almost always results from a lack of foresight on the language designer's part.
> In C++, RAII interacts with exceptions, which is the point but isn't exactly pleasant.
The interaction between control effects (of which exceptions are a particular case) and substructural types (of which C++'s RAII is a very broken particular case) is certainly nontrivial [0], but this doesn't mean we should give up on either feature. Control effects make it easier to design and implement extensible programs. Substructural types allow you to safely manipulate ephemeral resources, such as file handles, database connections or GUI objects.
> The interaction between control effects (of which exceptions are a particular case) and substructural types (of which C++'s RAII is a very broken particular case) is certainly nontrivial
A nitpick, but what constitutes an effect is rather arbitrary. An effect in the PFP sense is not an operational definition (other than IO) but a linguistic one. This is why I think that handling errors well, handling mutation well and handling IO well are three completely different problems that are only accidentally bundled into one by PFP for no cognitive/empirical reason other than that the lambda calculus happens to be equally challenged by all three.
There is a fourth effect, which is just as operational as IO (and thus a "truer" effect than errors or mutation) and is often the most interesting, yet it happens to be the one that baffles PFP/LC most: the passage of time. This is why there are usually two ways to sleep in PFP languages, one considered an effect, and the other is not (but happens to be much more operationally disruptive, and thus a stronger "effect").
I was talking only about control effects, not I/O or mutation. Control effects are basically stylized uses of continuations, with less insanity involved.
I understand. I just said that classifying non-linear transfer of control (whether exceptions or proper continuation) as an effect at all is quite arbitrary, and is just a common usage in the PFP world.
Of course, substructural types are also a language concept (that does indeed interact badly with non-local jumps), which is why I said it was a nitpick about the use of the word "effect".
> I just said that classifying non-linear transfer of control (whether exceptions or proper continuation) as an effect at all is quite arbitrary, and is just a common usage in the PFP world.
What exactly makes it arbitrary? It's pretty sensible, even if you don't have substructural types.
> Of course, substructural types are also a language concept (that does indeed interact badly with non-local jumps)
Control effects and substructural types don't interact “badly”. They just require care if you want them together. If you desugar control effects into delimited continuations (that is, normal higher-order functions), it becomes clear as daylight how to correctly handle their interaction with substructural types.
The word "effect" in the PFP world denotes anything that a language-level function does which may affect other functions and is not an argument or a return parameter. That definition is not valid outside of PFP/LC, because it defines as effects things that are indistinguishable from non-effects in other models of computation. E.g., it calls assignments to certain memory cells "effects" while calling assignments to other memory cells non-effects.
Again, my (very minor) point is that the word "effect" as you use it simply denotes a PFP linguistic concept rather than an essential computational thing. The only reason I mention it is that the word "effect" has a connotation of something that's real and measurable beyond the language. That's true for IO and time (computational complexity, which, interestingly, is not generally considered an effect in PFP), but not true for jumps (or continuations) and mutation.
> delimited continuations (that is, normal higher-order functions)
Again, you are assuming PFP nomenclature. Delimited continuations do not require language-level functions at all, and higher-order functions can be defined in terms of delimited continuations just as the opposite is true. Delimited continuations are no more higher-order functions than higher-order functions (or monads, rather) are delimited continuations. PFP is not the only way to look at abstractions and not the only fundamental nomenclature.
Purity can be defined very nicely against the arrows in a compositional semantics of a language and then effects follow as reasons for impurity.
This is absolutely just a choice. It all ends up depending upon how you define equality of arrows. You could probably even get weirder notions of purity if you relax equality to a higher-dimensional one.
So, it's of course arbitrary in the sense that you can just pick whatever semantics you like and then ask whether or not purity makes much sense there. You point out that "passage of time" is an impurity often ignored and this is, of course, true since we're talking (implicitly) about "Haskell purity" which is built off something like an arm-wavey Bi-CCC value semantics.
A much more foundational difference of opinion about purity arises from whether or not you allow termination.
I'd be interested to see a semantics where setting mutable stores is sufficiently ignored by the choice of equality as to be considered a non-effect. I'm not sure what it would look like, though.
> A much more foundational difference of opinion about purity arises from whether or not you allow termination.
Termination or non-termination? One of the (many) things that annoy me about PFP is the special treatment of non-termination, which is nothing more than unbounded complexity. In particular, I once read a paper by D.A. Turner about Total Functional Programming that neglected to mention that every program ever created in the universe could be turned into a total function by adding a 2^64 (or high enough) counter to every recursive loop, without changing an iota of its semantics; therefore, termination cannot offer a shred of added valuable information about program behavior. Defining non-termination as an effect -- as in F* or Koka (is that a Microsoft thing?) -- but an hour's computation as pure is just baffling to me.
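For what it's worth, the counter trick looks something like this in Go (a hedged sketch; the names and the particular loop are mine): the fuel bound makes the function total by construction, yet no feasible execution can distinguish it from the unbounded version.

    package main

    import "fmt"

    // Total by construction: the fuel bound guarantees termination, but it is
    // so large that no run you could ever observe behaves differently from the
    // unbounded (possibly non-terminating) loop.
    func collatzSteps(n uint64) (steps uint64, ok bool) {
        const fuel = uint64(1) << 62 // the "high enough counter"
        for i := uint64(0); i < fuel; i++ {
            if n <= 1 {
                return i, true
            }
            if n%2 == 0 {
                n /= 2
            } else {
                n = 3*n + 1
            }
        }
        return 0, false // out of fuel: observationally the same as never returning
    }

    func main() {
        fmt.Println(collatzSteps(27)) // prints a (finite) step count and true
    }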
> I'd be interested to see a semantics where setting mutable stores is sufficiently ignored by the choice of equality as to be considered a non-effect. I'm not sure what it would look like, though.
I think both transactions and monotonic data (CRDTs), where mutations are idempotent, are a step in that direction.
And of course that's true! Trivially so, though, in that we could do the same by picking the counter to be 10 instead of 2^1000, since we don't appear to care about changing the meaning of the program.
If we do, then we have to consider whether we want our equality to distinguish terminating and non-terminating programs. If it does distinguish, then non-terminating ones are impure.
Now, what I think you're really asking for is a blurry edge where we consider equality modulo "reasonable finite observation", in which something different might arise.
But in this case you need partial information so we're headed right at CRDTs, propagators, LVars, and all that jazz. I'm not for a single second going to state that there aren't interesting semanticses out there.
Although I will say that CRDTs have really nice value semantics with partial information. I think it's a lot nicer than the operational/combining model.
> If we do, then we have to consider whether we want our equality to distinguish terminating and non-terminating programs.
But this is what bugs me. As someone working on algorithms (and does not care as much about semantics and abstractions), the algorithm's correctness is only slightly more important than its complexity. While there are (pragmatic) reasons to care about proving partial correctness more than total correctness (or prioritizing safety over liveness in algorithmists' terms), it seems funny to me to almost completely sweep complexity -- the mother of all effects, and the one at the very core of computation -- under the rug. Speaking about total functions does us no favors: there is zero difference between a program that never terminates, and one that terminates one nanosecond "after" the end of the physical universe. Semantic proof of termination, then, cannot give us any more useful information than no such proof. Just restricting our computational model from TM to total-FP doesn't restrict it in any useful way at all! Moreover, in practical terms, there is also almost no difference (for nearly all programs) between a program that never terminates and one that terminates after a year.
Again, I fully understand that there are pragmatic reasons to do that (concentrate on safety rather than liveness), but pretending that there is a theoretical justification to ignore complexity -- possibly the most important concept in computation -- in the name of "mathematics" (rather than pragmatism) just boggles my mind. The entire notion of purity is the leakiest of all abstractions (hyperbole; there are other abstractions just as leaky or possibly leakier). But we've swayed waaaay off course of this discussion (entirely my fault), and I'm just venting :)
I don't think at all that "value semantics" without any mention of complexity is an end in and of itself. Any sensible programmer will either (a) intentionally decide that performance is minimally important at the moment (and hopefully later benchmark) or (b) concern themselves also with a semantic model which admits a cost model.
Or, to unpack that last statement, simulate the machine instructions.
I'm never one to argue that a single semantic model should rule them all. Things are wonderful when multiple semantic models can be used in tandem.
But while I'd like to argue for the value of cost models, at this point I'd like to also fight for the value-based ones.
Totality is important not because it has a practical effect. I vehemently agree with how you are arguing here to that end.
It's instead important because in formal systems which ignore it you completely lose the notion of time. Inclusion of non-termination and handling for it admits that there is at least one way in which we are absolutely unjustified in ignoring the passage of time: if we accidentally write something that literally will never finish.
It is absolutely a shallow way of viewing things. You're absolutely right to say that practical termination is more important than black-and-white non-termination.
But that's why it's brought up. It's a criticism of certain value-based models: you guys can't even talk about termination!
And then it's also brought up because the naive way of adding it to a theorem prover makes your logic degenerate.
> And then it's also brought up because the naive way of adding it to a theorem prover makes your logic degenerate.
Well, I'd argue that disallowing non-termination in your logic doesn't help in the least[1], so you may as well allow it. :) But we already discussed in the past (I think) the equivalence classes of value-based models, and I think we're in general agreement (more or less).
[1]: There are still infinitely many different ways to satisfy the type a -> a (loop once and return x, loop twice, etc. all of them total functions), and allowing (and equating) all of them loses the notion of time just as completely as disallowing just one of them, their limit (I see no justification for assuming a "discontinuity" at the limit).
It's not the type (a -> a) which is troubling, it's the type (forall a . (a -> a) -> a) which requires infinite looping. It's troubling precisely because the first type isn't.
Oh, I see. It's an element in the empty set, which is indeed very troubling for constructive logic. Well, they're both troubling in different ways. Your example is troubling from a pure mathematical soundness perspective, and mine is from the "physical"[1] applicability of the model.
[1]: The relationship between classical math and computation is in some ways like that of math and physics, except that physics requires empirical corroboration, while computation is a kind of a new "physical" math that incorporates time. In either case the result can be the same: the math could be sound but useless. In physics it may contradict observation; in computation it can allow unbounded (even if not infinite) complexity.
It causes trouble for non-constructive logics, too. Any logic with an identity principle will be made inconsistent with the inclusion of `fix : forall a . (a -> a) -> a`.
By yours are you referring to `forall a . a -> a`? I don't see how that principle is troubling at all.
It is troubling in the same way, but more subtly, and it has to do with the interpretation of the logic rather than the logic itself. The problem with (a -> a) -> a is that you can prove any a. Now, this is indeed a problem if you're trying to use types to prove mathematical theorems (one interpretation). But what if you're using types to prove program correctness (second interpretation, this one computational)? Why is it troubling? Well, it's troubling because you may believe you've constructed a program that produces some result of type x, but really you haven't, because somewhere along the way, you've used a (a->a)->a function (or forall a b. a->b). But the thing is that from one interpretation you really have succeeded. Your type is populated, but it is populated with a nonterminating function. Why is that a problem? It's a problem because it may cause me to believe that I have a program that does something, while in reality that program is useless.
Now back to my issue. Suppose that somewhere along the way you rely not on a non-terminating function but on a high-complexity function (e.g. a function that factors integers). You may then believe you've constructed a program, but your program is not only just as useless as the non-terminating one, but useless in the same way. A program that takes 10000 years is much more equivalent to a non-terminating program than to one that completes in one second. Your types are still populated with "false" elements, and so your logic, while now useful for proving mathematical theorems, may still prove "false" programs, in the sense of useless programs.
HOWEVER, what I said has a practical flaw, one that still makes excluding non-termination while allowing high complexity useful. And that is that it's much easier for human beings to accidentally create programs with infinite complexity than to accidentally create programs with a finite but large complexity. I don't know if we have an answer as to why exactly that is so. It seems that there are many cases of "favored" complexity classes, and why that is so is an open problem. Scott Aaronson lists the following as an open question[1]:
The polynomial/exponential distinction is open to obvious objections: an algorithm that took 1.00000001^n steps would be much faster in practice than an algorithm that took n^10000 steps! But empirically, polynomial-time turned out to correspond to “efficient in practice,” and exponential-time to “inefficient in practice,” so often that complexity theorists became comfortable making the identification... How can we explain the empirical facts on which complexity theory relies: for example, that we rarely see n^10000 or 1.0000001^n algorithms, or that the computational problems humans care about tend to organize themselves into a relatively-small number of equivalence classes?
Nevertheless, it is important to notice that what makes non-termination-exclusion useful in practice is an empirical rather than a mathematical property (at least as far as we know). Which is my main (and constant) point: that computation and software are not quite mathematical, but in many ways resemble physics, and so relying on empirical (even cognitive) evidence can be just as useful as relying on math. The two should work in tandem. It is impossible to reason about computation (more precisely, software) with math alone; there are just too many empirical phenomena in computation (and software in particular) for that to make sense. I feel (and that may be a very biased, wrong observation) that the software verification people do just that, while the PLT people (and by that I don't mean someone like Matthias Felleisen, but mostly PFP and type theory people) do not.
How can that look in practice? Well, observing (empirically) that the complexity spectrum is only sparsely populated with programs humans write (and that's true not only for instruction counts but also of IO operations, cache-misses etc.), perhaps we can create an inferrable type system that keeps track of complexity? I know that integer systems with addition only are inferrable, but I'm not sure about multiplication (I don't think so, and I know division certainly isn't). Perhaps we can have a "complexity arithmetics" that is inferrable, and allows "useful rough multiplication" even if not exact multiplication? A Google search came up with some work in that direction: http://cristal.inria.fr/~fpottier/slides/fpottier-2010-05-en... (I only skimmed it).
I don't agree with pron overall, but he does have a point. Termination and algorithmic complexity do matter, and the techniques Haskell programmers advocate for reasoning about programs have a tendency to sweep these concerns under the rug. This is in part why I've switched to Standard ML, in spite of its annoyances: No purity, higher kinds, first-class existentials or polymorphic recursion. And no mature library ecosystem. But I get a sane cost model for calculating the time complexity of algorithms. And, when I need laziness, I can carefully control how much laziness I want. Doing the converse in Haskell is much harder, and you get no help whatsoever from the type system.
As an example, consider the humble cons list type constructor. Looks like the free monoid, right? Well, wrong. The free monoid is a type constructor of finite sequences, and Haskell lists are potentially infinite. But even if we consider only finite lists, as in Standard ML or Scheme, the problem remains that, while list concatenation is associative, it's much less efficient when used left-associatively than when used right-associatively. The entire point to identifying a monoid structure is that it gives you the freedom to reassociate the binary operation however you want. If using this “freedom” will utterly destroy your program's performance, then you probably won't want to use this freedom much - or at least I know I wouldn't. So, personally, I wouldn't provide a Monoid instance for cons lists. Instead, I would provide a Monoid instance for catenable lists. [0]
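To make the cost asymmetry concrete, here's a rough Go sketch (a hypothetical persistent cons list with a copy counter); appending this way is linear when right-associated and quadratic when left-associated, even though the result is the same either way:

    package main

    import "fmt"

    // A minimal persistent cons list.
    type node struct {
        head int
        tail *node
    }

    var copies int // counts the nodes copied by appendList

    // appendList returns xs ++ ys; every node of xs is copied, ys is shared.
    func appendList(xs, ys *node) *node {
        if xs == nil {
            return ys
        }
        copies++
        return &node{head: xs.head, tail: appendList(xs.tail, ys)}
    }

    func fromSlice(s []int) *node {
        var out *node
        for i := len(s) - 1; i >= 0; i-- {
            out = &node{head: s[i], tail: out}
        }
        return out
    }

    func main() {
        a, b := fromSlice([]int{1, 2, 3}), fromSlice([]int{4, 5, 6})
        c, d := fromSlice([]int{7, 8, 9}), fromSlice([]int{10, 11, 12})

        copies = 0
        _ = appendList(appendList(appendList(a, b), c), d) // ((a++b)++c)++d
        fmt.Println("left-associated copies: ", copies)    // 18

        copies = 0
        _ = appendList(a, appendList(b, appendList(c, d))) // a++(b++(c++d))
        fmt.Println("right-associated copies:", copies)    // 9
    }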
By the way, this observation was made by Stepanov long ago: “That is the fundamental point: algorithms are defined on algebraic structures.” [1] This is the part Haskellers acknowledge. Stepanov then continues: “It took me another couple of years to realize that you have to extend the notion of structure by adding complexity requirements to regular axioms.” [1]
Of course, none of this justifies pron's suspicion of linguistic models of computation.
> Of course, none of this justifies pron's suspicion of linguistic models of computation.
Of course. :)
But my view stems from the following belief that finally brings us back to your original point and my original response: there can be no (classical) mathematical justification to what you call linguistic models of computation because computation is not (classical) math, as it does not preserve equality under substitution. The implication I draw from this is not quite the one you may attribute to me such as an overall suspicion, complete rejection or dismissal of those models, but the recognition that their entire justification is not mathematical but pragmatic, and that means that the very same (practical) reasons that might make us adopt the (leaky) abstraction of those models, might lead us to adopt (or even prefer) other models that are justified by pragmatism alone -- such as empirical results showing a certain "affinity" to human cognition -- even if they don't try to abstract computation as classical math.
Of course, computation is more foundational. It's mathematics that's just applied computation.
> as it does not preserve equality under substitution
You just need to stop using broken models.
> but the recognition that their entire justification is not mathematical but pragmatic
I don't see a distinction. To me, nothing is more pragmatic to use than a reliable mathematical model.
> the (leaky) abstraction of those models
Other than the finiteness of real computers, what else is leaky? Mind you, abstracting over the finiteness of the computer is an idea that even... uh... “less mathematically gifted” languages (such as Java) acknowledge as good.
> such as empirical results showing a certain "affinity" to human cognition
Experience shows that humans are incapable of understanding computation at all. But computation is here to stay, so the best we can do is rise to the challenge. Denying the nature of computation is denying reality itself.
No computation preserves equality under substitution. If your model assumes that equality, it is a useful, but leaky abstraction.
> Other than the finiteness of real computers, what else is leaky?
The assumption of equality between 2 + 2 and 4, which is true in classical math but false in computation (if 2+2 were equal to 4, then there would be no such thing as computation, whose entire work is to get from 2 + 2 to 4; also, getting from 2+2 to 4 does not imply the ability to get from 4 to 2+2).
> Experience shows that humans are incapable of understanding computation at all.
Experience shows that humans are capable of creating very impressive software (the most impressive exemplars are almost all in C, Java etc., BTW).
> The assumption of equality between 2 + 2 and 4, which is true in classical math but false in computation
Using Lisp syntax, you are wrongly conflating `(+ 2 2)`, which is equal to `4`, with `(quote (+ 2 2))`, which is obviously different from `(quote 4)`. Obviously, a term rewriting approach to computation involves replacing syntax objects with syntactically different ones, but in a pure language, they will semantically denote the same value.
Incidentally:
0. This conflation between object and meta language rôles is an eternal source of confusion and pain in Lisp.
1. Types help clarify the distinction. `(+ 2 2)` has type `integer`, but `(quote (+ 2 2))` has type `abstract-syntax-tree`.
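A rough Go analogue of that distinction (the types and names are mine): the syntax tree for 2 + 2 is a different value, with a different type, from the number it evaluates to.

    package main

    import "fmt"

    // Expr is object-language syntax: the analogue of (quote (+ 2 2)).
    type Expr interface{ Eval() int }

    type Lit struct{ N int }
    type Add struct{ L, R Expr }

    func (l Lit) Eval() int { return l.N }
    func (a Add) Eval() int { return a.L.Eval() + a.R.Eval() }

    func main() {
        twoPlusTwo := Add{Lit{2}, Lit{2}} // a syntax tree, not a number
        four := 4                         // a number, not a syntax tree

        fmt.Println(twoPlusTwo.Eval() == four) // true: evaluation relates them
        fmt.Printf("%T vs %T\n", twoPlusTwo, four)
        // main.Add vs int: the types keep the object/meta levels apart
    }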
> very impressive software
For its lack of conceptual clarity. And for its bugs. I'm reduced to being a very conservative user of software. I wouldn't dare try any program's most advanced options, for fear of having to deal with complex functionality implemented wrong.
> Using Lisp syntax, you are wrongly conflating `(+ 2 2)`, which is equal to `4`
It is not equal to 4; it computes to 4. Substituting (+ 2 2) for 4 everywhere yields a different computation with a different complexity.
> but in a pure language, they will semantically denote the same value.
The same value means equal in classical math; not in computation. Otherwise (sort '(4 2 3 1)) would be the same as '(1 2 3 4), and if so, what does computation do? We wouldn't need a computer if that were so, and we certainly wouldn't need to power it with so much energy or need to wait long for it to solve the traveling salesman problem.
> For its lack of conceptual clarity. And for its bugs.
That's a very glass-half-empty view. I for one think that IBM's Watson and self-driving cars are quite the achievements. But even beyond algorithmic achievements and looking at systems, software systems that are successfully (and continuously) maintained for at least a decade or two are quite common. I spent about a decade of my career working on defense software, and that just was what we did.
If you can't distinguish object from meta language, I'm afraid we can't have a reasonable discussion about computing. This distinction is crucial. Go get an education.
If you don't understand what I'm saying -- and that could be entirely my fault -- you can just ask. If you (mistakenly) assume that by 2 + 2 I mean the expression "2 + 2" rather than the computation 2 + 2, why not assume that you may have missed something (which is the actual case) rather than assume that I don't understand the basics (which is not)?
Since I don't wish to discuss this topic further with rude people, but I do wish to explain my point to other readers, I'll note that the entire concept of computational complexity, which is probably the most important concept in all of computer science (and is at the very core of computation itself -- there can be no computation without computational complexity), is predicated on the axiom that in computation 2+2 does not equal 4 (in the sense that they are "the same"), but is computed to be 4. If 2+2 were actually 4, there would be no computational complexity (and so no computation).
As a matter of fact, an entire model, or definition of computation (another is the Turing Machine) called lambda calculus is entirely based on the concept that substitution is not equality in the theory of computation, by defining computation to be the process of substitution (which is what lambda calculus calls reductions). If 4 and 2+2 were the same (as they are in classical math), there would be no process, and the lambda calculus would not have been a model of computation but simply a bunch of trivial (classical) mathematical formulas.
Indeed, some people confuse the LC notation with classical mathematical notation (which it resembles), and mistakenly believe that 2+2 equals 4 in LC in the same sense that it does in math (I assume because the same reductions preserve equality in math). This is wrong (in LC, reductions do not preserve "sameness" but induce -- or rather, are -- computation). In their defense, LC does make this fundamental distinction easy to miss by hiding 100% of what it is meant to define -- namely, computation -- in operations that classical mathematicians associate with equality[1], and in itself it does not have a useful formulation of complexity[2]. Nevertheless, those people might ignore computational complexity, which is the same as ignoring computation itself, and while they may turn out to be great mathematicians, you would not want them specifying or writing your traffic signal or air-traffic control software.
[1]: Although I believe most notations take care not to separate consecutive reductions with the equals sign but with an arrow or a new line, precisely to signify that reduction is not equality. Also, unlike in math, LC reductions are directional, and some substitutions can't be reversed. In this way, LC does directly represent one property of time: its directionality.
[2]: The challenge complexity poses to LC is great, and only in 2014 was it proven that it is not just a model of computation but one of a "reasonable machine": http://arxiv.org/abs/1405.3311
Computation is something different. Models like call by push value make this very clear. LC does as well, though, but LC tends to be joined up with an equality semantics which intentionally sweeps computation under the rug for simplicity.
This is a big hairy problem in untyped LC, though, since untyped LC has non-termination and therefore is not normalizing. This is what I mean when I say that taking non-termination seriously is one way to force "time" and "computation" back into models. It means that LC has no decidable beta-equivalence the way that, say, simply typed LC does.
So anyway, you're wrong to say that LC has no notion of complexity—people count reduction steps all the time—but right to say that often this is intentionally ignored to provide simpler value semantics. It's foolish to think of this as equivalent to LC, though.
This paper is interesting. I think what they prove was at least folk belief for a long time, but I've never seen a proof.
> you're wrong to say that LC has no notion of complexity
I didn't say that it has no notion of complexity; I said it "does not have a useful formulation of complexity", as reduction step counts are not very useful in measuring algorithmic complexity, at least not the measures of complexity most algorithms are concerned with.
> It's foolish to think of this as equivalent to LC, though.
Oh, I don't think that at all, which is why I specifically said that some people make the mistake of confusing LC reductions with classical substitutions (equality). They may then think that computation can be equational (false), rather than say it may sometimes be useful to think of computation in equational terms, but that's an abstraction -- namely, a useful lie -- that has a cost, i.e. it is "leaky" (true).
Most people consider garbage collection to be a net win in terms of simplicity. Have you thought about why? Not every feature interacts with other features in complicated and error prone ways.
I believe that garbage collection is a net win because it allows software to be composed in simple ways when it would otherwise be difficult to compose.
I can pass data from one part of the program to another without coordinating both parts to respect the same memory management convention, and without having to pass that information from one place to another. This makes it easier to compose software, and in particular to reuse software like libraries (that frequently end up as layers between one component and another). For a concrete example, in a Java program I can simply publish an event into a Guava EventBus [1] without worrying where it will end up at the time I write that code. There's no real risk that I'll end up with a memory leak. I can connect two things together that weren't designed to be used together, and I can do it while inserting intermediate layers that transform, copy, record, measure, that data.
Garbage collection significantly reduces the amount of coordination necessary between unrelated parts of the code base, thereby improving code reuse. This is what I would claim is a less commonly recognized win, beyond the more commonly recognized wins from eliminating classes of obvious mistakes. EventBus is just one random example that involves plugging things together - the same effect is present all over Java libraries, from logging frameworks to collections to concurrent data structures.
I think the politest description I can provide of the experience of tracking down GC bugs is that they interacted with other features in complicated and error prone ways.
But was that code in the GC implementation, or in your program? Because if it's in the implementation, then that is a different matter. We have to distinguish between simplicity of implementation and simplicity provided to the user. I agree that if it is not implemented correctly, it can be a net loss in simplicity.
I think that's what he's appealing to, but I have a hard time reconciling that sentiment with many design characteristics of Go. Go's type system, for instance... I don't think he fully grasps "what he's trying to solve" by having a static type system in golang, when the language has things like unsafe casting, null pointers, a lack of parametric polymorphism, etc. As a programmer tool, it's hugely weakened by these design decisions... there are large classes of properties about code that are simply impossible (or much more complicated) to encode using types in golang. And yet in their literature on some of these subjects, they make an appeal to simplicity [1]. I think there's a disconnect here between theory and practice.
From reading the abstract, it sounds like this addresses precisely the topic of "Worse is Better" [1].
There are frequent misunderstandings of this essay -- the argument isn't as coarse as "crappy software wins", or "release early and often".
The tradeoff is: do you want a simple interface (MIT style) or a simple implementation (NJ style)? If you want a simple interface, you have to hide a bunch of complexity underneath. If you want a simple implementation, you punt on some hard things and expose it to the user.
Despite its authorship, and marketing as a better C (which is the epitome of NJ style), Go is MIT style! The concurrency model hides a lot of stuff under the covers: sync vs async syscalls, segmented stacks, etc. And GC is probably the ultimate MIT-style feature, in that the interface is very simple (just pretend you have infinite memory), but implementations are tremendously complex -- basically each GC is unique and its own research project.
EDIT: From paging through the slides, this seems to be basically the gist of the last half, but I don't see that he mentioned "Worse is Better" or MIT vs NJ style...
It is very easy to implement, and leaves a lot of the complexity (null pointers, generics, etc) to the user instead of solving it in the language.
This is a classic example of worse-is-better.
An MIT-style language would be Rust, which indeed struggled to get to v1.0 and Go vs. Rust plays out quite similarly to what worse-is-better would predict.
C won over Lisp not because “worse is better”, but because C's advantages over Lisp (performance on cheap machines) were more pronounced than the other way around (“safety” achieved by means of lots of runtime checking - by no means was it possible to statically rule out errors). Lispers fancy their language of choice the pinnacle of computer science, but the falsity of this claim becomes evident under close scrutiny.
> Go vs. Rust plays out quite similarly to what worse-is-better would predict.
I see a brighter future for Rust in the long run, for the following reasons:
First, Go doesn't do enough for the programmer: Haskell and Erlang are safer and easier to use, and have much better parallelism and concurrency stories. What kind of silly value proposition is “safer than C” or “less verbose than Java”? Rust offers something genuinely new: half-assed attempts at “better C++s” have existed for a while, but “safer C++s” haven't.
Second, Rust plays much more nicely with other languages. Rust is no more difficult than C to call from, say, Python. On the other hand, Go seems to be more of an all-or-nothing proposition: “If you want to use Go libraries, use them from applications written in Go.”
Seems like a "boo Go" echo chamber in here, but as someone that moved their entire backend infrastructure to Go, the offering was so compelling it was worth going against common wisdom and actually rewriting huge pieces of infrastructure code. And I'm hardly alone - many huge companies felt the same value proposition and made the same decisions.
Go isn't perfect for many things, but where it's placed itself - among often headless networking infrastructure projects and backends, it's exceedingly good at what it offers.
When I tried Go, things went like this: First, I deliberately wrote a program with a data race, which the data race detector didn't detect. Then, I honestly tried to write a race-free program, and again the data race detector didn't complain. A couple of hours later, I realized that the second program also contained a data race.
And this 'win' is ill-defined. Both won in their own markets. C was great at what it did, and so was Lisp. All the research and crazy ideas spawned by high-level symbolic thinking are still bearing fruit today. It's not a win as in market domination, but as in spreading its genes and still being alive.
I also think Lisp is not truly "better" in the sense "worse-is-better" is ascribing. But if we suspend disbelief - then the Go vs. Rust story matches the one described in "Worse is better".
Perhaps Rust will win out in the end and "worse is better" won't prevail. Despite liking the idea of Rust very much, I don't think it will.
This worldview is overly simplistic. There is no indication that languages ever "win out in the end" and there is no reason to suspect that the industry is not large enough to support healthy ecosystems for both Go and Rust, who, again, are not competitors.
> An MIT-style language would be Rust, which indeed struggled to get to v1.0 and Go vs. Rust plays out quite similarly to what worse-is-better would predict.
Not having to implement a userland OS scheduler, a production-quality garbage collector, and an industrial-strength compiler backend from scratch strikes me as a fairly New Jersey-style approach, don't you think?
(I think this whole "Rust vs. Go" as "MIT vs. New Jersey" argument is pretty silly. There are many factors going into why Go released 1.0 before Rust did, and both communities have consensus that you shouldn't treat Rust vs. Go as a horse race anyway.)
There isn't one language to rule them all. An advantage of Go over Rust is easier tooling due to a simpler language, an advantage of Rust is a richer type system. These are in opposition.
Rust can reuse nearly all C tools, like kcov, gdb, perf, and valgrind. On top of that, it has a solid set of its own tools. For example, it has rustfmt, cargo, and rustdoc (having the ability to test your Rust documentation is great).
The only thing it truly misses is an IntelliJ IDE for Rust and possibly a Rust-oriented coverage system.
> An advantage of Go over Rust is easier tooling due to a simpler language
I'm curious how "tooling" is easier in Go than in Rust. To my mind, dynamic tooling (instrumentation, profiling, debugging, etc.) is much easier in Rust than in Go, because Rust uses the OS facilities and looks much like native code, while Go's runtime reinvents the world.
This is in fact what we've seen—Rust didn't have to write custom profilers, custom debuggers, language-specific race detectors, and so on, because the native system tools work just fine.
(NB: I'm not saying Go made the wrong decision with a userland scheduler; I'm objecting to the idea that it's somehow simpler to do that. It's much more complex.)
I should have been more specific about what I meant by tooling: source code manipulation and analysis; go-oracle and goimports are a couple of examples. There are many more in the Go community.
Are they useful? I'm not sure. They feel useful, but I have little trust in my judgment. When I switched back in the day from VS to emacs, I didn't seem to miss all of the refactoring tools VS provided.
A richer type system is not in opposition to good tooling, quite the opposite. Rust's strong type system enables fantastic static analysis, which, for example, obviates the need for anything like Go's data race detector.
Writing tooling for C++ is difficult because of how entangled all the language's features are (and its wonky grammar certainly doesn't help). In addition, C++'s type system is probably even weaker than Go's, having inherited a lot of slop from C.
That said, if you're talking about source code formatters, I'm pretty sure that clang-format predates gofmt (and surely clang-format was not the first automatic source code formatter for C++). The coup for Go was that they got the community at large to standardize upon and champion gofmt, which is not something that C++ has ever matched.
I'm thinking more of difficulty in writing rustfmt compared to gofmt. There are other examples, like go-oracle or goimports, which as far as I know don't have Rust equivalents.
How useful the extra tooling is, I don't really know.
> I'm thinking more of difficulty in writing rustfmt compared to gofmt.
The extra stuff that rustfmt does stems from the fact that rustfmt wants to aesthetically format code well, matching the style that the community had settled on prior to its introduction (e.g. avoiding long lines, lining up parameters), while gofmt is fine with less aesthetically pleasing code (long lines, all parameters on one line) as long as the implementation of gofmt remains simple. There's nothing stopping anyone from implementing a bare-bones rustfmt in the style of gofmt and trying to push its style onto the community, but Nick had different goals.
You're right, the comparison isn't as valid as I thought; rustfmt has improved impressively since I last looked at it. 'go fix' might be a better comparison. What's the Rust equivalent?
"Go vs. Rust" is not a valuable comparison because Go and Rust have wildly divergent goals and intended use cases. By the same token you could argue that Rust is a worse-is-better Haskell.
I think that code readability is different from system comprehensibility, and that readability is merely a subfactor to comprehensibility. Readability often means the time it takes for me to grok a function, module, or unit of code, and often what fits on my eyeball.
System comprehensibility is how long it takes for me to learn the minimal set of atoms in order for me to comprehend an entire system.
Rob Pike criticizes map and reduce and other functional idioms as not machine simple. And although Rob Pike doesn't say this, I might argue on his behalf that functional code has a mild reputation for being less readable.
But I would argue that functional idioms permit tasteful tradeoffs between machine simplicity and system comprehensibility. I would also argue that when a system gets to a certain size, I would sometimes be okay with sacrificing some readability and machine simplicity for system comprehensibility. I would also say that although I might lose some performance advantages with a functional approach, the improvement to system comprehension might mean an improved cognitive capacity for the human to optimize for distributed multi-core contexts, which might offset the losses from machine simplicity.
> I would also say that although I might lose some performance advantages with a functional approach, the improvement to system comprehension means better human optimization of distributed multi-core contexts.
You don't even need to bring in parallelism for functional approaches to gain in performance. Here's an example (in Rust) of how you'd implement collecting an iterator into a vector using the "machine simple" approach:
let mut array = vec![];
for element in some_iterator {
    array.push(element);
}
Now for the functional, less "machine simple" approach:
let array: Vec<_> = some_iterator.collect();
Which is faster? Without knowing the details of the compiler, you might think the former is faster, because there are fewer function calls and less indirection. But this is wrong, as the compiler will trivially inline "collect" for you and eliminate any indirection.
In fact, it turns out that the second approach is often significantly faster, because the standard library contains the equivalent of:
let mut array = Vec::with_capacity(some_iterator.size_hint().0);
for element in some_iterator {
    array.push(element);
}
That is, it uses the size_hint() method to avoid reallocating during the collection operation, reducing the number of memory allocation and array copy operations to O(1) instead of O(log n).
What are the chances that you'd do this important optimization manually? Not high: it adds noise, it's esoteric, and it's easy to forget. But if you use the functional idiom, you benefit from the optimized approach every time. That is the true benefit of abstraction, which is really the key point here. "Machine simple" and "fast" are frequently in opposition, and too often people think that the former implies the latter.
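For what it's worth, the same trade-off shows up in Go, where the equivalent preallocation is a manual step that is easy to skip. A minimal sketch (the element count is arbitrary):

package main

import "fmt"

func main() {
    n := 100000

    // The "machine simple" version: append into a nil slice and let the
    // runtime reallocate as it grows (roughly O(log n) reallocations).
    var naive []int
    for i := 0; i < n; i++ {
        naive = append(naive, i)
    }

    // The manual optimization: preallocate with a capacity hint so append
    // never reallocates. Both versions behave identically, which is exactly
    // why the optimization is easy to forget.
    tuned := make([]int, 0, n)
    for i := 0; i < n; i++ {
        tuned = append(tuned, i)
    }

    fmt.Println(len(naive), len(tuned))
}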
An excellent example. This is also one of those things that is hard to profile away, because it tends to produce a very flat profile: you're probably forgetting to use with_capacity()/reserve() all over the place, not just in a few very costly cases[1]. So you end up wasting, say, 1% of CPU time all over the place... add up enough of these time-wasters and soon you're wasting significant amounts of CPU time for no good reason.
Obviously, this kind of thing isn't a magic bullet, but in general I'd say the key for any kind of program is to express your intent at the highest level possible. This benefits both optimization and human readers! (That obviously means that the programming language must support such abstraction in a non-leaky way.)
Not really... the wasted time is still all malloc() / realloc() / whatever under array.push(), and a half-decent callgraph profiler will catch that. array.push() will score far higher than it ought to, and the bulk of its time will be the reallocation.
Source: I've used oprofile, and later perf, to hunt exactly this kind of performance bug.
Of course, fixing a pervasive issue like that is much more tedious than just finding it. :-(
This talk has a lot of very weak points. He made the claim that more features hurt readability (~6:10), because when you are reading you have to waste time thinking about why the programmer chose the set of features he did to write the code. To make that kind of claim without qualifications is just ridiculous - if it were true, why add any feature to a programming language at all? Especially if one believes readability is the most important thing in a programming language, which Rob Pike says he does. It may be true that some features hurt readability, but it is obviously not the case that all features hurt readability for all people all of the time.
It is also strange that he makes so many claims about readability, considering it is the most subjective attribute of a programming language there is. For example, some code that would have been completely unapproachable to me a couple of years ago is perfectly readable to me today, because I've learned new concepts.
Some history might help. Go was created largely as a reaction to writing server-side C++. I have sympathy, after writing (but mostly reading) C++ code bases. I swore never again.
I react in horror to the features being added to CUDA and SyCL to make them closer to C++.
I still hate C++ with a passion; grepping around in Gecko is much harder than in Servo.
> because when you are reading you have to waste time thinking about why the programmer chose the set of features he did to write the code.
It seems to me that what Pike is getting at here is how easy it is to build mental models of the code, the language, the underlying system, down through all its layers, and, yes, the code's original author and any subsequent authors.
I share your frustration at the subjectivity of readability claims. There's actually starting to be good academic research on code readability, for instance this paper:
Oh, or this investigation into programming language syntax for learnability!
> Recent studies in the literature have shown that syntax remains a significant barrier to novice computer science students in the field. While this syntax barrier is known to exist, whether and how it varies across programming languages has not been carefully investigated. For this article, we conducted four empirical studies on programming language syntax as part of a larger analysis into the, so called, programming language wars.
As someone maintaining an 18 year old code base, I've long held the same opinions as Rob. I often describe myself as the dumbest person on my team and then go on to say how fortunate it was that I was here first. All my smart people need to dumb stuff down enough that I can understand it. Which means we have a prayer of supporting it.
Obviously I can't force everything through that filter, some problems require complex solutions.
Hacker News is full of really bright people who like the expressiveness of Haskell et al. So it's not surprising that Rob's not going over well here. Just saying that us dumb people really appreciate his efforts.
> Hacker News is full of really bright people who like the expressiveness of Haskell et al. So it's not surprising that Rob's not going over well here. Just saying that us dumb people really appreciate his efforts.
Some of us use Haskell because we're also dumb people.
There is very little to no empirical evidence either way. Take all of the demands for generics and see if you can find any empirical data to back up the demands.
> He made the claim that more features hurt readability (~6:10), because when you are reading you have to waste time thinking about why the programmer chose the set of features he did to write the code. To make that kind of claim without qualifications is just ridiculous - if it were true, why add any feature to a programming language at all?
Because if you didn't the programming language would never do anything at all.
That one part was a weak part. If he claimed that having less features helps writeability I would have completely agreed though. It might take more writing to accomplish what you want, but frequently the choices you are making are a lot simpler, so you spend less time on them and more time just writing the code that you need to write.
"If he claimed that having less features helps writeability I would have completely agreed though."
I suspect most people disagree with you and think, for example, that the features in high level languages make it easier to program in those languages than in low level ones.
I'm trying to think how the simple/complex from this talk compares to the simple/complex from "Simple Made Easy".
The part about how features should be orthogonal does seem similar to the "not entangled" (and to DRY / SRP).
The part about visible surface area -- garbage collection is "simple" because you can't explicitly talk to it -- seems to be a closer match for "easy" tho, as do the examples (smell test: anything with "and" in the name probably means there are entangled concepts).
...huh. Which means that his concept of "simple" does in fact contain multiple things entangled together, or in other words is complicated. ;-)
I can't even begin to measure the amount of time I've wasted trying to make Go code look as concise as what I can have trivially in C, where things like macros and do { } while() are available.
Instead of "simplicity" saving me time as Mr. Pike suggests, it leaves me dissatisfied with the results after wasting my time in attempts to prevent my dissatisfaction.
This probably isn't an issue for someone new to programming without the expectations experience with other languages may bring, but for a seasoned C programmer Go often feels like a regression.
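For reference, a minimal sketch of the usual Go substitute for C's do { } while() (the digit-counting body is arbitrary):

package main

import "fmt"

func main() {
    // C's `do { n /= 10; digits++; } while (n != 0);` has no direct Go
    // equivalent; the usual substitute is an endless for with a bottom test.
    n, digits := 0, 0
    for {
        n /= 10
        digits++
        if n == 0 {
            break
        }
    }
    fmt.Println(digits) // prints 1 even when n starts at 0, which is the point of do/while
}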
IMHO, what largely makes Go useful over C is the language-level support for coroutines, not because of the simplicity of the "go" keyword or channels, but because it eliminates the need to design and integrate with snowflake event loops or the specifics of blocking/non-blocking file descriptors.
As a result, packages are written without concern for how the caller's event loop will be integrated, you just write blocking code everywhere and it can be used by every Go program with significantly less potential for impedance mismatch. The garbage collection and overall elimination of the need for allocating and freeing memory also contributes to this; no longer do you have to understand the resource life-cycle strategy of some C library you've decided to use, or if it's using the C library's allocator or its own, or if it's capable of using your allocator via some elaborate and unique initialization scheme.
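A minimal sketch of what that looks like in practice; fetchStatus is a hypothetical library function that simply blocks, and the caller decides how much concurrency to wrap around it:

package main

import (
    "fmt"
    "net/http"
    "sync"
)

// fetchStatus is ordinary blocking code: no callbacks, no futures, and no
// event loop for the caller to integrate with.
func fetchStatus(url string) (int, error) {
    resp, err := http.Get(url)
    if err != nil {
        return 0, err
    }
    defer resp.Body.Close()
    return resp.StatusCode, nil
}

func main() {
    urls := []string{"https://example.com/", "https://example.org/"}

    var wg sync.WaitGroup
    for _, u := range urls {
        wg.Add(1)
        go func(u string) { // the caller chooses the concurrency, not the library
            defer wg.Done()
            status, err := fetchStatus(u)
            fmt.Println(u, status, err)
        }(u)
    }
    wg.Wait()
}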
Unfortunately Golang's concurrent execution model is also what makes it a pretty terrible language for system programming. Golang's inescapable and largely opaque underlying use of threads is a constant source of pain for anyone programming operating system facilities which influence only the calling thread's state, particularly state inherited by child threads.
I often encounter misuse of runtime.LockOSThread() in attempts to overcome this problem, but as of now it is completely inadequate, because the Go scheduler creates new OS threads on an as-needed basis from whichever OS thread happens to be executing at the time. This leaks whatever unique state the thread may have into a newly created, unlocked OS thread, which may then go execute any goroutine - likely the very thing you were attempting to prevent by using runtime.LockOSThread().
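A minimal sketch of the runtime.LockOSThread() pattern being described; the per-thread state change itself is elided, since it depends on which OS facility is involved:

package main

import "runtime"

// pinAndMutate pins the goroutine to its OS thread before touching
// calling-thread-only state (think of something like unshare(2) or a
// per-thread scheduling attribute; the call is omitted here).
func pinAndMutate() {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()

    // ... modify calling-thread-only state here ...
    //
    // The pitfall described above: while that state is in effect, the runtime
    // may clone a new OS thread from this one, and that new, unlocked thread
    // inherits the state and is free to run any goroutine.
}

func main() {
    pinAndMutate()
}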
There's plenty more to whine about, but I'll stop here.
As someone who reads far more C code than I write, those macros you seem to be missing infuriate me. They hide far too much from me when I'm trying to understand the code I'm reading.
And as far as do/while goes, why would you hide part of the state determining when your loop exits at the end of the loop, just to save a few lines at the top? I am naturally going to try to read your code from top to bottom, not top to bottom and back to the top. The more you make me skip around, the longer it will take me to understand what you are doing. 90% of my job is maintaining code, not writing new code. Programmers who write as if that isn't the case just make the long-term cost of maintaining their software larger.
I absolutely agree with everything Rob Pike says. But in practice the problem is still the trade-off. How limited should a language be? What features should a language absolutely have to make the developer's life easy while still giving them some room to express themselves the way they want? I think Go fails if one considers that trade-off. Often with Go, a developer ends up doing the job of the compiler, because its flat type system doesn't allow a certain degree of expressiveness (or worse, forces the developer to use reflection, i.e. setting the type system aside altogether and using the "generic" interface{} type).
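A minimal sketch of the interface{}-based workaround being described (index is a hypothetical helper):

package main

import "fmt"

// index is the kind of "generic" helper the comment above describes: with no
// type parameters available, the only general signature is interface{}, so
// the compiler can no longer check element types for us.
func index(haystack []interface{}, needle interface{}) int {
    for i, v := range haystack {
        if v == needle {
            return i
        }
    }
    return -1
}

func main() {
    // A []string can't even be passed directly; it has to be copied into a
    // []interface{} by hand, and type mistakes only surface at runtime.
    words := []string{"a", "b", "c"}
    boxed := make([]interface{}, len(words))
    for i, w := range words {
        boxed[i] = w
    }
    fmt.Println(index(boxed, "b")) // 1
    fmt.Println(index(boxed, 42))  // -1: silently comparing string against int
}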
While C++ is one extreme in terms of complexity, the other extreme, Go, is problematic as well. I've yet to find a language that strikes a good balance between the two. But Go certainly showed that the tooling is as important as the language itself, and that's a great part of its success.
"Philosophically", no question, I agree with Rob Pike. In practice, things are a bit more nuanced.
I have been thinking about the same thing: that the simplest languages have a minimal set of "orthogonal" features; though it never occurred to me to describe them in those terms.
But hey, it's Rob Pike. I was bound to walk away with something after listening to him talk :)
The mere presence of choice incurs a tangible mental cost, even if that choice is not exercised.
In the case of programming languages, this is not only a development cost, it's a maintenance cost as well. But this concept seems to apply to many more things than just programming.
Choices are made at the time of writing the code.
And complex code has proven to be very easy to write.
So the mental cost of all the existing choices must be minimal.
Some quibbling here in the comments about whether Pike uses the right definition of "simple". There is of course a context to Pike's notion of the word:
> The key point here is our programmers are Googlers [...] They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.
That quote is a hilariously denigrating claim to make of your coworkers.
Even if it were true (which I don't believe for a second), you've created a language released to the community at large. Your users are not just google engineers. If you want to encourage broader acceptance of your language, you're going to need to make a good faith attempt to listen to their requests.
If this is all tongue-in-cheek, then what does it really amount to? A good-natured ribbing of Google employees? I'm in total agreement that programming is hard, but I don't think it's hard because of languages that pack in features; it's hard because of languages that don't ensure reasonable semantics or provide you the tools to enforce them. Programming is hard because it's inherently powerful... languages that allow you to circumscribe that power are what keep me sane, not particularly the ones you can fit on an index card.
Which is hilarious given Google's hiring processes.
Any genius who is able to invert binary trees on whiteboards, count the balls in a vending machine, do a C++ compiler course, describe the random code in a magazine, ...
can surely cope with a modern programming language.
I think it was meant as a jab at the audience and the general public, more like "Not even Google engineers can appreciate a brilliant language, so who the hell do you think you are?"
Actually, I think they do outsource a lot, and those are the actual target audience, reading between the lines of statements like "having junior devs straight out of school" or looking at the quality of the Android tooling.
Or at least wrong when we talk about general programming.
Go isn't a 'bad' language. Go is simply another tool for a particular job, and it does that job well, because of its simplicity.
However, not every problem is so easy to solve with this simplicity. Some problems don't fit well with the Go approach. And I always hate it when people argue "x is superior because of fewer/more features"; mostly, these things emerge when the problem emerges.
I agree with him in principle although I agree with a lot of the other counterpoints made here too.
The biggest reason that features get added to languages to make them ubiquitous is that in MANY cases companies marry their infrastructure to a particular language. With that infrastructure commitment you end up invested in doing everything with that particular language, so if another language has a feature or methodology that would be good for a particular use case, your existing investments prevent you from being able to EASILY introduce a new language just for that feature.
Microservice architecture goes a long way toward countering those investments. Heroku actually does a great job of removing language-specific infrastructure investments from your process, which makes it easier to introduce language-per-purpose approaches.
The first mention sounded like C++. For verbosity (second mention), I'd guess Java. The third mention sounded like JavaScript where he said something to the effect of unexpected tangling, but those are guesses based on what Google uses and Rob's brief description.
[1] http://www.infoq.com/presentations/Simple-Made-Easy