This whole article is a reflection of a belief that I really don't like but that is all too common these days. It's the idea that nothing is better than anything else, everything is a trade-off, and strong positions are inherently wrong.
I don't buy it.
It's a tacit assumption that, somehow, everything is on the Pareto frontier. Profoundly implausible. In the background, it's basically a hybrid appeal to moderation (all extreme positions are wrong) and popularity (if so many people are doing something, it can't be wrong). Never mind the fact that popularity is almost meaningless, driven by fashion more than anything, and that what's "extreme" is arbitrary and largely driven by popularity itself...
This is how we get "worse is better"; this is how we get standards that are compromises ideal for nobody but still widely adopted because they're standard or because that seems like the thing to do or because that's what everyone else is doing or because "engineering".
The worst part is that this all goes unsaid, assumed correct by default. Just the fact that we're having the discussion the article wants, and not this meta-discussion, implicitly elevates its premise!
Some things are better than others. A strong debate between strong positions is healthier than an artificial social force reverting everyone back to the mean.
Now, all this is not to say the usual message about types is perfect. It could definitely be improved in many different ways. But not by muzzling, weakening or qualifying: not by most of the article's suggestions.
Especially ironic are the rhetorical techniques the author uses to argue against rhetoric. The one I personally find the most grating—beyond the ambient implication that everything is equal, which I've just written about—is establishing cheap credibility by aligning himself with the camp he's arguing against. "I like X, but..." Or, in this case, "I use OCaml, but..."
Then again, my pointing this out is probably ironic in the same way. Or is it?
I agree with your general sentiment, but that's not what I got out of this article. The article doesn't say much about dynamic vs static typing; it merely says something about the arguments around dynamic vs static typing, and it's spot on in that respect. I personally believe static typing to be better than dynamic typing, but at the same time most of the arguments for static typing that I read on the internet are terrible or incredibly exaggerated.
> This whole article is a reflection of a belief that I really don't like but that is all too common these days. It's the idea that nothing is better than anything else, everything is a trade-off, and strong positions are inherently wrong.
I feel you've missed the point of the article.
The author isn't making a judgement on whether static type systems are better than dynamic type systems; instead, he's suggesting we increase the quality of the debate by discarding poor arguments.
It seems more unrealistic to assume everyone agrees on the same cost-benefit function. We are different people and value different things. So in any attempt at a proof involving what should be valued more, you have to state your assumptions and they're generally debatable.
There's also the bit about refusing to agree to the same cost-benefit function because it commits you to a standard that you can't back away from. I mean, look at the global warming debate, for example.
> Some things are better than others. A strong debate between strong positions is healthier than an artificial social force reverting everyone back to the mean.
I agree in principle, but the fact that there are good arguments on either side of the debate, and that neither side has been able to conclusively prove that one approach is decidedly better than the other, turns the debate into an argument over preference rather than absolutes, which is a very different kind of debate.
But I would even say something stronger: when it comes to programming languages, much (if not most or even practically all) of the argument must eventually come down to psychology. If a programming language that can produce no bugs and still express useful programs were ever created, yet that language were hard for people to use (because of human psychology), or even if people simply didn't like using it -- I would say that that language is nearly useless, except for research purposes or maybe some domains where this property matters above all others (say, safety-critical embedded systems). So even if there is a "right answer" as you imply, the debate is in the realm of psychology -- not math or computer science -- and psychology, as we all know, does not usually yield hard answers even when it has answers at all. So while studying the properties of programming languages is computer science, their overall utility is (almost) purely within the realm of the social sciences.
To repeat an example I've given a few times: if your job is to catch a ball, you can't argue that teaching you to compute the ball's trajectory with a Newtonian formula is better than practicing a few times simply because the former is more mathematically rigorous. Which is better depends on which approach better fits your psychology.
So here's my strong position: if someone debating programming languages denies that their general utility is mostly a matter of psychology rather than math, then that person is wrong!
"Better" is often a relative term based on somebody's perspective. To rate something as being "better", you have to clearly define what "better" is and have some sort of agreement that "better" is better for everybody.
If you ignore the "dissenters" then you effectively disempower those people and silence their perspective. Is that "better"?
There's a cost of reducing potential novelty. There's a benefit of increasing focus (perhaps increasing implemented novelty of downstream scopes).
Certainly very sensible points. But I take some issue with the bits towards the end:
> Fetishising Curry-Howard
Hey, I happen to find Curry-Howard sexy, you have no right to shame me!
> To specify and check liveness or reachability, we need to represent programs as some kind of flow graph, where flows are very likely to span multiple functions and multiple modules.
But no, this is exactly what basing systems on Curry-Howard (with terminating functions) buys you. If I give an HTTP response handler in Coq the type
HttpRequest -> HttpResponse
I know it is live, because the function is total. And similarly, dead code paths can be gotten rid of by eliminating a proof of False.
So this lets you state liveness and unreachability. Of course, the proofs of those statements are very different from program-analysis-style systems. In a whole-program analysis, the analyser goes over the entire program, and at the end it says "well, I could not find anything calling this function, so I guess it's dead". In a Curry-Howard-style system, you need to look at the function and figure out when it should be called and what preconditions need to be true for it to work properly. But this is exactly why people like type systems: you can think about programs modularly rather than all at once.
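As a minimal Haskell sketch of the "eliminating a proof of False" point (assuming only Data.Void from base): an uninhabited type marks a code path as statically dead.

import Data.Void (Void, absurd)

-- If we can show the Left case is impossible, we demand a Void witness;
-- the only way to "handle" it is absurd, so the branch is provably dead.
handle :: Either Void Int -> Int
handle (Left v)  = absurd v
handle (Right n) = n + 1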
I suppose, but outside certain mathematical systems that are designed that way, assuming that all functions are total is unrealistic. The languages we use every day are Turing-complete languages and infinite loops can happen.
You don't have to assume everything is total, you just need to track in the type system which things are and which aren't. E.g. you could also give it a type like

HttpRequest -> IO HttpResponse
Having a type for total functions gives you the option to state and prove totality properties, even if you maybe don't want to go through the effort of doing that for every function you write.
You don't need to go all the way to IO, you can make every function total using Partial:
data Partial a = Now a | Later (Partial a)
It's straightforward to do things in Partial, since it's a Functor, Applicative and Monad:
instance Functor Partial where
fmap f (Now x) = Now (f x)
fmap f (Later x) = Later (fmap f x)
instance Applicative Partial where
pure x = Now x
(Now f) <*> (Now x) = Now (f x)
(Later f) <*> x = Later (f <*> x)
f <*> (Later x) = Later (f <*> x)
instance Monad Partial where
return = pure
(Now x) >>= f = f x
(Later x) >>= f = Later (x >>= f)
Partial is so trivial that everything can be made co-recursive by wrapping all function bodies in "Later". All of the "Later" wrappers will get nested together by the Applicative/Monad; everything will "just work". The one missing piece is the following function, which isn't total:
runPartial :: Partial a -> a
runPartial (Now x) = x
runPartial (Later x) = runPartial x
We could name it `unsafeForcePartial` and treat it with care, like we currently do with `unsafePerformIO`.
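To make the pattern concrete, here's a small sketch building on the definitions above (findFrom and runFor are hypothetical names): an unbounded search that is nonetheless total in Partial, because every recursive call is guarded by Later, plus a fuel-limited runner that, unlike runPartial, is total.

-- Unbounded search: productive (co-recursive), hence total in Partial.
findFrom :: (Int -> Bool) -> Int -> Partial Int
findFrom p n
  | p n       = Now n
  | otherwise = Later (findFrom p (n + 1))

-- A total alternative to runPartial: spend at most n steps of "fuel",
-- returning Nothing if the computation hasn't finished yet.
runFor :: Int -> Partial a -> Maybe a
runFor _ (Now x) = Just x
runFor n (Later p)
  | n <= 0    = Nothing
  | otherwise = runFor (n - 1) p

-- e.g. runFor 100 (findFrom (> 41) 0) == Just 42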
Yep, although it doesn't actually diverge; that's the point ;)
The classic "loop" function, which is completely polymorphic, does diverge:
loop :: a
loop = loop
"loop" will never return a value, since to perform one 'step', it must perform an 'infinite' amount of computation.
The function you've written doesn't diverge, since it will immediately return a value "Later x" after one step. As long as we're using laziness, it doesn't matter that "x" itself is 'infinite':
loop' :: Partial a
loop' = Later loop'
Note that we can do an equivalent thing in strict languages, by eta-expanding our definitions into thunks:
data Partial' a = Now a | Later (() -> Partial' a)

loop'' :: () -> Partial' a
loop'' _ = Later loop''
We can use "loop :: a" in place of "undefined :: a", since they have the same type. We can't use "loop' :: Partial a" in the same way, since its type is different.
"loop' :: Partial a" is a bit like a 'phantom type', where the "a" in its type signature never actually occurs in the value. In fact, we can define a type which doesn't contain "Now", and is thus guaranteed to never terminate:
data Inf a = Inf (Inf a)
Of course, the "a" in "Inf a" is a true phantom: it doesn't affect the values at all. We can get rid of it to obtain:
data Loop = Loop Loop
We can use this type to "drive" main loops for servers, operating systems, games, etc. in a total language:
server :: Loop -> IO Loop
server (Loop x) = do
  req <- readRequest
  sendResponse (process req)
  Loop <$> server x
"Loop" is equivalent to "Stream ()", where:
data Stream a = Cons a (Stream a)
Of course, "Stream a" is to "[a]" what "Inf a" is to "Partial a": it's just missing the base-case ("[]" and "Now a", respectively).
Interestingly, we can think of "Partial a" as being like "[a]", except that it stores elements in "[]" rather than ":". Alternatively, we can think of "Partial a" as being like "Nat" (the Peano numbers), which uses an element of "a" in place of "Zero".
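Spelling out that last analogy as code (a sketch, with Nat defined the usual way):

data Nat = Zero | Succ Nat
-- data Partial a = Now a | Later (Partial a)
-- Now x   plays the role of Zero (but carries an `a`)
-- Later p plays the role of Succ p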
Koka [1] does this; I find Daan's experiences in using it to build Madoko [2] to be very enlightening. Basically, I think he had to give up on total-ness more than he would have liked.
This article says many things that need to be said, but I take issue with this:
> No sane person can argue that we want to bifurcate a programming language into distinct base-level and type-level fragments.
Well, I must be insane then. I much prefer writing type expressions in a language free from, say, mutation and iteration.
Restricting languages aids not only computers which analyze programs, but also humans who create programs. It's much easier to remember, say, "I'm not allowed to mutate anything, there's no syntax for it", than, "I can use mutation, but if I use it in such-and-such a way, then the compiler won't be able to statically determine the type, unless I have version 4.2.3 or later".
I think most of the article is as bad as the part you are criticizing. The part you refer to is just particularly bad in its presentation. Making unsupported claims while declaring oneself correct and anyone who disagrees insane is neither helpful nor productive.
There is precedent that these types of splits don't make things as safe as advocates believe. Amusingly, the precedent is part of the foundational bits of modern theoretical CS :)
The short version is to look at the early twentieth-century work on formal systems, languages and metalanguages, etc., and then notice how those particular sandcastles got kicked over by the incompleteness theorems.
By the way, personally I tend to look at type systems in terms of the tradeoffs they make. It certainly seems to me like the diminishing-returns point -- where writing verifiable code within a "more expressive" type system becomes more work than writing unverifiable code in a "less expressive" system and dealing with any bugs after the fact -- has already been passed by some of the popular FP languages.
I think you misunderstand me. I'm not talking about computation languages. I'm talking about type languages (as was the OP). As in, the grammatical construction in which types are expressed. For example, OCaml has a separate type language (the module system) from its computation language. Languages like Python and Elixir do not (you can define classes as part of a computation).
The OP is advocating that the type language be one and the same as the language in which computation is performed. I'm suggesting that this leads to usability issues, as it's not clear from the syntax which type expressions are and are not usable by the compiler to check/optimize/etc. the value-level computation.
This is entirely orthogonal to the point you seem to be making about expressive vs. inexpressive type systems.
Bravo. This is the best thing I've read anywhere about the general issue of types, both static and dynamic. It is even-handed, acknowledging the benefits and limitations of both static and dynamic languages, and incredibly well-informed. I now have something to send to the next person who threatens to send me a copy of PFPL.
That's because we don't actually have any concrete evidence one way or the other. This is the biggest problem with the whole static/dynamic debate.
Until there are studies that clearly demonstrate that projects written in statically typed languages have statistically fewer defects, are developed faster, or are easier to maintain, it's all anecdotal evidence.
There are tons of large real-world projects developed in both static and dynamic languages, and there's very little indication that projects developed in dynamic languages have significantly more defects or that they're more difficult to maintain.
One can debate how viable the evidence is, but at least it is some evidence.
However, there does appear to be some evidence that static typing helps with program comprehension/navigation, for example: http://dl.acm.org/citation.cfm?id=2568299
That's exactly my point: the best anybody has managed to say is that there may be some slight difference. Even then it's difficult to draw any sweeping conclusions, since there's a lot of variation between languages in each family as well.
> Now that I've ranted my heart out, here's a coda that attempts to be constructive. As I mentioned, part of the problem is that we equivocate when talking about “types”. Instead of saying “type” or “type system”, I encourage anyone to check whether anything from the following alternative vocabulary will communicate the intended meaning. Deciding which term is appropriate is a good mental exercise for figuring out what meaning is actually intended.
> * data abstraction
> * data type
> * predicate, proposition
> * proof, proof system
> * interface specification [language]
> * [compile-time] checker
> * specification, verification
> * invariant (the thing that “types” are usually specifying!)
I add precondition, postcondition, state. As well, I think the space in between has importance: map, transformer, relation, command. Formalization is tricky, from my understanding.
"Rich Hickey's transducers talk gave a nice example of how patterns of polymorphism which humans find comprehensible can easily become extremely hard to capture precisely in a logic."
I haven't watched his talk, but Rich Hickey also commented on /r/Haskell with the types. It doesn't seem like it was that difficult to capture. There are also other comments that approach the types from different angles and achieve some success. Maybe this isn't a good example for this thesis?
Rich brought this up in his Strange Loop talk [1]. All the attempts he had seen either disallowed perfectly valid uses or allowed invalid ones. I believe to really guarantee early termination and a few other tricky invariants of reducers you would need a linear type system. [2]
To the degree that it allows invalid types, it's weakly typed. That's fine as a compromise (no type system is perfect), but the same impulse that says you should type in the first place also says you should only allow valid types. It's a challenge type systems should aspire to solve.
That you can't type it isn't an argument against types, but that you can't type it and that's perfectly fine with you suggests you don't actually think types are that important.
Those, I believe, are fairly easy to capture in the `Moore b r -> Moore a r` formulation. It also captures linearity.
That said, I think the more general formulation is something like Fold or Traversal in Lens, which operates polymorphically over the whole thing with a type that is a refinement of
Profunctor p => p a b -> p s t
where such a type generalizes all of the concrete Haskell formulations.
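For reference, the (roughly stated) lens encodings those names refer to are:

type Traversal s t a b =
  forall f. Applicative f => (a -> f b) -> s -> f t

type Fold s a =
  forall f. (Contravariant f, Applicative f) => (a -> f a) -> s -> f s

Specialising the profunctor shape above recovers these van Laarhoven forms.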
Yep, that's right. There seem to be a few factors involved:
- Most of that discussion happened before a full description of transducers was up anywhere, so a lot of these details are just completely missing.
- Haskell's type system is not expressive enough to capture certain interesting properties, like linearity. (To be clear, it's quite possible to translate any arbitrary transducer into Haskell -- but the type system will not prevent all the mistakes it is possible to make.)
- There are different classes of transducers that Haskell people actively would not want to have identical types (i.e. stateful vs. stateless, early-terminating vs. unbounded). Many of the discussions revolve around only one or a few of these classes -- typically the stateless unbounded version, which turns out to have a particularly simple encoding, but it definitely varies.
In my defence, on the question of header typing, I did mention that these were not things that couldn't be done in dynamically typed languages, but rather that they were more dangerous in such languages; it is the assurance of correctness that languages like Rust give you. A strongly typed header scheme is absolutely possible in Python, but the risks are considerably higher.
I certainly agree that there were a few points that I didn’t substantiate or provide sufficient evidence for—I was deliberately intending to be broad and wanting people to think about these things, rather than going down rabbit holes. I’m afraid that half an hour really isn’t long to talk about such things in detail.
Because the /newest page is fortune's fickle wheel, where worthy essays like this one are lost, more frequently than not, in the flood of spammy chaff.
I thought it was worthwhile, so I submitted it again. We got lucky this time ;)
I take issue with the article regarding "Pretending that syntactic overhead is the issue". Or rather, I would say it's an incomplete way of looking at it. Non-statically typed languages (I'm thinking reasonably full-featured languages like Perl, Python or Ruby, not Bash) are, IMHO, go-to languages for certain tasks because they allow you to get results quickly. How?
* Less typing (there is freedom in being able to cobble together a reasonably complex program in a single file of a few hundred lines in an editor, vs a number of files and thousands of lines in an IDE).
* Advanced features! It is only now that closures, first-class functions, etc, start to trickle down to mainstream, statically typed languages. And they do make for shorter programs (see, eg, the post on expired links on HN).
Of course, it's not to say that non-statically typed programs don't suffer from a number of drawbacks, but dismissing conciseness as something trivial does not, I think, accurately reflect reasons for why people use this kind of language.
For reference, here's the difference: Strong typing ensures all results are well-defined, as the result of a specification, and it uses type information to do that. Weak typing throws away type information, so it can no longer uphold such specifications.
Note that auto-conversion and error on mismatch are both evidence of strong typing in action, because both behaviors rely on type information in order to occur.
> If somebody stays away from typed languages, ...
These people who stay away from typed languages, who are they and what are they using?
Shell? Awk? Assembler?
When is a language untyped: is it when you can misuse an operand by applying the wrong operation, with no diagnosis and strange, non-portable consequences? Or is when the character string "1.3" can be used as an arithmetic operand, automatically becoming a floating-point approximation of "1.3"? (If that is "untyped", how is it that "1.3" is identified as a character string, and how is the context identified as requiring an arithmetic operand?)
Or is it untyped when wrong type operands do something unpredictable, but dumb?
There's the formal definition of type and there's the colloquial definition of type. A type in the context of the article means a formal type: it is a syntactic classifier that is part of the static semantics of the language. The properties you describe (string/number exceptions) are part of the dynamics of the language, which is a separate concept. Dynamically typed languages quite often use boxed values with attached runtime tags, which are colloquially referred to as "types" even though they have nothing to do with static types.
If you read the article in the context of "type" meaning "static type" it will make more sense.
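To make the tag/type distinction concrete, here's a minimal interpreter fragment (a sketch in Haskell; Value and tagOf are hypothetical names): the object language's "dynamic types" are runtime tags on a single static type of the host language.

-- One static type inhabited by every value of the dynamic language;
-- the constructors are the runtime tags.
data Value = VNum Double | VStr String | VBool Bool

-- Inspecting tags at runtime: the colloquial sense of "type".
tagOf :: Value -> String
tagOf (VNum _)  = "number"
tagOf (VStr _)  = "string"
tagOf (VBool _) = "boolean"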
Does that mean I can search and replace every occurrence of "type" in the article with "static type"?
Are you sure dynamic types have nothing to do with static types? What about compilers for dynamic languages which reason about type, like that if the program calls (symbol-name x), it is inferred that x must hold a string, so that a subsequent (cos x) is inconsistent? Is such a compiler reasoning about something which has nothing to do with static type?
> What about compilers for dynamic languages which reason about type, like that if the program calls (symbol-name x), it is inferred that x must hold a string, so that a subsequent (cos x) is inconsistent?
That's irrelevant; if the program containing (symbol-name x) and (cos x) works in an interpreter but not in the compiler then the compiler is broken. Note that throwing-exceptions/triggering-errors/etc. is "working", since it's a valid form of control flow which the program could intercept and carry on. Aborting immediately (like a segfault) isn't "working". Triggering undefined behaviour also isn't "working", but since most of these languages lack any real defined behaviour, it's assumed that "whatever the official interpreter does" is correct.
Usually, these compilers completely acknowledge that their behaviour is incorrect WRT the interpreter. They embrace this and define a new, static language which just-so-happens to share a lot of syntax and semantics with that dynamic language. For example, HipHop for PHP could not compile PHP; it could only compile a static language which looks like PHP. Likewise for ShedSkin and Python. PyPy even gives its static language a name: RPython.
Hence, these compilers don't show that runtime tags and static types are the same/similar. Instead, they show that sometimes programmers without static types will use tags to do some things that static types are often used for.
> if the program containing (symbol-name x) and (cos x) works in an interpreter but not in the compiler then the compiler is broken.
The compiler isn't broken.
The interpreter throws an exception, which meets your own definition of "works".
The compiler gives an opinion "x has an inconsistent type". In this manner, the code also "works" in the compiler.
You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.
Compilers for dynamic languages do not preserve all interpreted behaviors. This is usually explicitly rejected as a goal: users must accept certain conventions if they want code to behave the same in all situations.
For instance, some Common Lisp implementations re-expand the same macros during evaluation, which is friendly toward developers when they are modifying macros. But in compilation, macros are expanded once.
Usually these compilation-interpretation discrepancies are obscure; we are not talking about obscure features here. It's a basic tenet of type checking in a compiler that it will flag things that will throw type-related exceptions at run time. You cannot say that type checking compilers for dynamic languages are simply not allowed because type mismatches are well-defined behavior; that's simply outlandish.
> The compiler gives an opinion "x has an inconsistent type".
> You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.
In this example, the compiler cannot give "x" the static type of "string" and also allow the program to handle errors when "x" is used in places where strings aren't allowed. If the static types don't match, the result is undefined; that's what static typing means.
If the compiler causes an exception to be thrown, then either:
- That's just an implementation-specific quirk, which just-so-happens to be the way this compiler behaves when it hits this undefined case; it's not part of any spec/documentation and useful only to hackers trying to exploit the system.
- We accept that the compiler didn't assign "x" the static type of "string" after all; it actually assigned it "string + exception + ..." (which may be the "any" type of the dynamic language, or some sub-set of it)
The situation isn't one of static typing, but of static type checking. Check (parent (parent (parent yourpost))) where it says
> What about compilers for dynamic languages ...
A compiler for a dynamic language that checks types does not bring about static typing. The language in fact remains dynamic.
What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.
The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.
Static typing indeed means that we use the result of the static analysis to remove all traces of type from the program, and only run it when its type information is complete and free of conflicts.
The dynamic language optimizer can in fact take advantage of its findings to eliminate run-time type checks where it is safe to do so.
> What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.
> The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.
What is a "conflicting type"? In the dynamic language, there is only one static type (any = int + string + float + (any -> any) + array(any) + ...), so there can't be any conflicts.
It's fine to have a compiler try to specialise the types of variables beyond that, eg. narrowing-down the type of "x" in the example to "string + float", based on how it's used (or just leave it as "any").
A "conflict" would imply that a variable is used in ways that don't match the inferred type; but the inference is based on how it's used.
PS: Static type inference, like your example of (symbol-name x) and (cos x), has been going on for at least 40 years, since ML. The article mentions OCaml, which is a descendant of ML and hence has full type inference (you can leave out all annotations and the type-checker will figure them out). Depending on the type system, this may or may not be possible. For example, types in Agda, Coq and Idris are full programs, so inferring them is undecidable.
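As a tiny illustration of ML-style full inference (Haskell here, same lineage; no annotations supplied anywhere):

-- GHC infers the most general type:
--   compose :: (b -> c) -> (a -> b) -> a -> c
compose f g x = f (g x)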
Since dynamic languages only have one static type, static type inference is trivial. So is type-safety: there's no way to cast a value of one type to an incompatible type, since there are no other types.
Inferring tags doesn't work in general: when I say `print x` it doesn't mean that `x` must be a string; I might want Python to raise a "TypeError" exception, so that I can jump to the handler where all the real work takes place. Of course that's bad practice, but exceptions are only "errors" by convention; in the same way that False is only a failure by convention.
> Are you sure dynamic types have nothing to do with static types?
"Static type" does not really mean "class of values." That is a common use of static types, but they can also be used for things like tracking what resources are available/released or describing what exceptions a function might raise.
Types are usually understood as proofs: in this case, a proof about whether the tag of x is suitable for passing to cos without causing an error. (If you like, read "dynamic type" for "tag".)
So the proof and the tag are related in that one is about the other, but that does not mean they are the same thing.
Indeed, just like a basket with three oranges in it, the numeral 3 written on a piece of paper, and a die with the three-dot face turned up, are not the same thing. They just potentially serve as representations of the same thing.
Of course something at compile time isn't the same thing as at run time. They exist at different times in different spaces. Compile time might be on an x86 build machine, and run-time on an embedded ARM device.
I didn't say that types are tags at a different time, I said they are different things.

And only concrete types are about representation. Abstract, universal, existential and uninhabited types are all useful and meaningful concepts without any particular relation to representation.
But a concrete value, like a symbol or string, has a concrete type, and that better agree with what its tag information says (if it has it).
Sure, expressions like "for all x: function from x to x" may not have concrete instantiations: there is no thing that is type tagged that way. It could be, of course; there is no limitation on what constitutes a type tag. Any compile time information, whether it is a universally quantified expression or whatever, can be represented at run-time also. In practice, that isn't done.
Quite sure, in a dynamically typed language there is only a single static type inhabited by all values. If the compiler is reasoning about classes of values at runtime then it does indeed have nothing to do with static typing.
In the run time image for a dynamic language, there may be a "cell" or "value" representation which holds anything. That doesn't mean this is the "type" of that language.
It may certainly be a static type in another language; that language which is used to implement an interpreter or virtual machine for the dynamic language. E.g. a Lisp interpreter in C has some "value" typedef which can hold a number, string, symbol, ...
That type is a C type, though, not a Lisp type.
A language can't be confused with the one it compiles into, or which is used to write an interpreter for it.
Ultimately, everything is interpreted by a processor, through some machine language representation which uses typeless words.
But what it's reasoning about coincides with the dynamic type which will exist at run time, so that means we cannot say the two have nothing to do with each other. It's the same knowledge, just known at different times, right?
Just because we have stuffed some knowledge into tags that exist at run time (and chosen suitable machine representations to make that possible) doesn't mean it's a different class of knowledge.
Types are propositions for which the text of your program is the proof. Type theory arose as an alternative to set theory in order to deal with Russell's paradox[0]. Types have been around for a lot longer than computers have. They have nothing at all to do with a running program.
What is missing from the above articulation of a view is that the run-time state of the program is also text. (It's made up of the symbols "0" and "1", at one level.) There can be propositions about that text.
Such a proposition pertains to a mathematical function. A running computer program does not contain mathematical functions, merely representations of them.
It is akin the difference between the theory of gravity and a falling rock. You can describe the rock with all manner of equations but you will not find any equations inside the rock.
> A running computer program does not contain mathematical functions, merely representations of them.
A printed computer program text does contain mathematical functions, and not merely representations of them.
Because of this difference, logical propositions about structures in the program text are types, whereas those about structures in the running program are not.
Furthermore, this has something to do with gravity and rocks: the running program is analogous to a falling rock, but the non-running program is like ... something else.
Somehow, compilers are exempt: they are running programs, which work with types. The source of this exemption is that they operate on other programs; a proposition inside a compiler is a representation about the program being compiled, not about anything in the compiler. Running programs may not entertain propositions about their own typing, because typing is purely prior to run-time. Type does not exist at run-time, period. The propositions of run-time typing are not types, but some kind of system of data representations that mimics types.
This is a bit of a nit to pick, but once again, I'm mystified as to why the HN front page mods changed the title of this submission.
I had it as: "The Seven Deadly Sins of Talking About Types", which also happens to be the title of the blog post itself, give or take a definite article.
>If the original title begins with a number or number + gratuitous adjective, we'd appreciate it if you'd crop it. E.g. translate "10 Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys." Exception: when the number is meaningful, e.g. "The 5 Platonic Solids."
If a (some might say, literary) allusion has no place on Hacker News, just because it happens to have a number in it, then the small-minded literalists have really taken over this joint.
You don't need to know the seven deadly sins to know anything about types. They could have had 5 or 8 sins having to do with types and the content wouldn't have changed. That's pretty much the definition of unnecessary.
But the content doesn't really relate to the allusion in any way. It's not like the seven sins listed are analogues to gluttony, lust, sloth, etc. There just happen to be seven of them.
These guidelines are really strict, but I have to say I like the results in general. I can usually tell by the title if something is likely to contain information that's interesting to me, unlike with many news sites.
Oh good, a discussion on typed vs untyped. I expect this will be a solid exchange of ideas, with each side trying to learn something from the other side.
Did you read the article? It is actually quite refreshing in its tone:
> Patronising those outside the faith
> If somebody stays away from typed languages, it doesn't mean they're stupid, or lack education, or are afraid of maths. There are entirely valid practical reasons for staying away.
Please drop the condescending tone.
> People who I enjoy discussing issues with bow out of the discussion the same way they do when people bring up emacs vs. vim. They know they aren't going to hear anything new, so why bother?
That's how I felt after reading this: "why did I bother?"
'Loose' typing isn't a term normally used when talking about type systems. Most people work with a two-axis system: "strong" vs "weak" and "dynamic" vs "static". But very, very few languages are 'weakly' typed, to the point of it almost being a silly way to talk about them. Ruby, for example, is dynamic and strong.
> Most people work with a two-axis system: "strong" vs "weak" and "dynamic" vs "static".
I dispute this.
First of all, I think that most programmers use a one-axis system of strong/weak, where "strong" means approximately "the safety that I like", and "weak" means approximately "lacking the safety that I like", a system in which they have no words for "additional safety that I do not value". I'm not aware of any useful definition of these words, nor any specific definition widely agreed upon by any meaningful non-partisan class of speakers (e.g. "working programmers" or "published academic type theorists"). Can you correct me?
As such, I think that saying "very, very few languages are 'weakly' typed" is unlikely to be understood by your audience in the sense that you intended it (whatever sense that was).
I agree that the dynamic/static axis seems to be in wide use, and I'm unaware of massive divergences in usage. Although note that many people spell the word "static" using the letters ['s', 't', 'r', 'o', 'n', 'g'].
I seem to be inside your parent's "most people" and you seem to be outside of it. I have no idea if the word "most" is appropriate, but at least "some" is.
The sense in which that post intended "weakly typed" is the one in which it means that types are checked neither at compile-time (statically) nor at runtime (dynamically). For instance, (most?) assembly is "weakly" typed, because I can "pass" whatever data I want into a subroutine without it noticing whether or not that data has the correct type. On the other hand, Ruby is "strongly" typed, because I get "NoMethodError" if I try to call a method on a type that doesn't implement that method. And Rust is also "strongly" typed because I get an error from the compiler if I try to call a method on a type that doesn't implement that method. But Ruby is "dynamically" typed because I get that "NoMethodError" at runtime, while Rust is "statically" typed because I get that compiler error at compile-time.
I guess you know all this stuff, but you seem to think the terminology is less common than I do!
Yeah. And all three of us are outside of my "most programmers", which is unsurprising, because we are reading comments on a post about type systems, which is a weird thing to do.
I encourage you to go ask some random C++ programmers, perhaps a bunch of EA employees, whether C or PHP is more "strongly typed", and see what they say. Note that because of void pointers, I believe that C fits your definition of "weakly typed", and as far as I know, PHP does not fit your definition (and btw I agree with you about what steveklabnik intended, though we may both be wrong).
And yeah, we're primarily disagreeing about what classes of people agree on what terminology. This is why I agree with chrisseaton in the cousin-thread, that there is no good definition... people assume other people know what "strong" means, and yet they mostly disagree.
And I'd like to repeat that as far as I can tell, if you survey the type-theorist literature, I think that type theorists
a) have mostly stopped using those stupid vague labels, and
b) back when they did use them, each author used them for something different
and repeat my entreaty for evidence to the contrary.
I agree with everything you said except that I think there are good definitions, even if they are not widely agreed upon. That is to say, I believe the definitions I laid out are good in the sense that they are consistent and provide a good framework in which to view things. That doesn't mean I think they're rigorous.
I would be very interested to hear the results of your survey of C++ programmers (or one of programmers more widely). I have no idea what the results would be.
>nor any specific definition widely agreed upon by any meaningful non-partisan class of speakers (e.g. "working programmers" or "published academic type theorists")
How are "academic type theorists" partisan exactly? That sounds a lot like saying "I haven't seen a definition of medicine that is widely agreed upon by both biologists and homeopaths". Of course not, but those are not equal opposite sides. One side is correct, and the other is not.
"Working programmers" and "published academic type theorists" are examples of meaningful-to-me hopefully-non-partisan classes of speakers. I can't manage to parse what I wrote any other way, I don't know what you read.
In this context, advocates of a particular programming language, or particular paradigm, are what I had in mind by "partisan", in that I believe them to be likely to have relatively-agreed-upon definitions of "strong" (within their language community), and I also believe those definitions to be amusingly self-congratulatory.
Also, btw, while I think that homeopathy as a "medical discipline" is contemptible, it's pretty stupid to say "biologists are correct about the meaning of this word, and homeopaths are not". The way that words work is that they mean what people use them to mean. This, in fact, is my point about the phrases "strong typing" and "weak typing"; in my experience and my literature surveys, these phrases are NOT used consistently, and thus i.m.o. if you wish for your audience to hear the thing you intend to mean, you're better off avoiding those phrases.
As a supposed Haskell advocate you are really doing a very good job of being thoroughly unpleasant. Knock it off. It reflects very badly on the Haskell community.
There's no good accepted definition of weak vs strong typing in the literature, but I don't think many people would agree that PHP or Perl are weakly typed. I would say they are both strongly, and dynamically, typed. Can you argue otherwise?
I'd say that was an example of strong typing. The + operator in both languages converts types if required. But it's a proper type conversion - it's not reinterpreting the existing data as another type.
If you are happy with a method like Ruby's puts converting objects to a string, why is + any different? They both do type conversion if required.
In my mind an example of a weakly typed language would look like this:
print 1 + "1";
> 399299293
Here my imaginary language has taken the object that represents the string "1", it has ignored the type and has reinterpreted the string object (which happens to be a pointer to a character array) as if it were an integer. This is what C does - which is why it is a weakly typed language in my mind.
C is also a weakly typed language, but it implicitly casts the string object to an integer memory address instead of implicitly parsing the string into an integer like PHP and Perl do.
JavaScript, also being weakly typed, will opt to perform string concatenation, so 1 + "1" = "11".
In strongly typed languages, you can expect an exception rather than an implicit type coercion, when you pass an argument or operand of the wrong type.
But that doesn't rule out that in a strongly typed language like Ruby you can still do something like monkey-patch the String and Fixnum classes to provide custom polymorphic add methods.
>> In strongly typed languages, you can expect an exception rather than an implicit type coercion, when you pass an argument or operand of the wrong type.
Scala and Java's + operator also do implicit conversions to string. Do you think Scala and Java are both weakly typed?
You said that you think Ruby and Python are strongly typed because they don't do these implicit conversions. But they do: Ruby's Kernel#puts and Python's print convert to string.
Also, is 0.5 + 1 ok? One of those operators is a float and the other an integer. In almost every language we get an implicit conversion from integer to float. Does that mean all these languages are weakly typed? Or is conversion to string some kind of exception for some reason?
> JavaScript, also being weakly typed, will opt to perform string concatenation, so 1 + "1" = "11".
How is this an example of weak typing when type information is essential to doing it?
> In strongly typed languages, you can expect an exception rather than an implicit type coercion, when you pass an argument or operand of the wrong type.
That's static typing, which is orthogonal to what we're talking about.
I think your definition is "good", even if it isn't widely agreed upon. There may be other "good" definitions, but I don't think the one in your parent post is one of them, because it doesn't seem to permit a "strongly typed" language to define methods that take different types and convert them before operating on them, which would make all languages "weakly typed".
>The + operator in both languages converts types if required.
But that is what weak typing is.
>In my mind an example of a weakly typed language would look like this
Just because it is converting the other way doesn't mean it is a different thing. The fact that one happens to convert in the direction that pleases you doesn't mean it is "strong" now. If it is converting, then it is weak.
But most languages have some operation that will implicitly convert values to a different type. In fact Ruby and Python both do this (puts and print are examples), and they were examples given of a strongly typed language.
I didn't give those examples. Strong/weak isn't binary, it is a scale. A language that does more implicit conversion is weaker than one that does less.
It's the difference between conversion and casting:
Casting, explicit or implicit, throws away type information and reinterprets the same data in a different way. Data, here, means bytes, or some other low-level representation that makes up what the types apply to.

Conversion, explicit or implicit, uses type information to either change the data, or prove that no change is needed, in order to use the data in a different way.
But that isn't what we're talking about. Changing a number to a string and using it as a string is exactly as bad as changing a string to a number and using it as a number.
Only if the strong type system is specified to return errors on those kinds of mismatches, as opposed to using the type information to perform a conversion and give another result which is just as well-defined as an error.
That's the point: Strong typing ensures all results are well-defined, as the result of a specification, and it uses type information to do that. Weak typing throws away type information, so it can no longer uphold such specifications.
"weak typing" usually means a type system (static or dynamic) where there are many implicit coercions going on. "strong typing" requires these casts to be written explicitly.
The only languages I know which would fit your definition of "weak typing" are Forth and assembly (no implicit casts ever, no type checking at all). Maybe Tcl, but I'm not sure. Anyway, using your definition, every other language would become "strongly typed". And this wouldn't be very practical, I think, because then we'd have no way of distinguishing the type systems of, for example, Perl and Python.
> "weak typing" usually means a type system (static or dynamic) where there are many implicit coercions going on. "strong typing" requires these casts to be written explicitly.
This makes no sense, because both of those definitions assume the runtime and/or compiler has the same amount of type information available to it, and the only difference is what the runtime and/or compiler chooses to do with it.
Weakly typed languages have less type information, either because they never had it or because they allowed the programmer to throw it away.
Casts and conversions are not the same thing: a cast throws away type information so the same data can be used in a different way; a conversion uses the type information to change the data (or prove no change is needed) in order to use it in a different way.
C has casts, Javascript has conversions. C is weakly typed, Javascript is strongly dynamically typed.
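A sketch of that distinction in Haskell terms, with unsafeCoerce standing in for a C-style cast:

import Unsafe.Coerce (unsafeCoerce)

-- Conversion: uses the type to compute a new, well-defined value.
convert :: Int -> String
convert = show            -- convert 42 == "42"

-- Cast: discards the type and reinterprets the representation;
-- the result is not well-defined (undefined behaviour, as in C).
cast :: Int -> String
cast = unsafeCoerce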
"A variable can be anything" isn't loose typing. It's just that the type expressions that describe the program are not (necessarily) compile-time constant expressions.
In many "variable can be anything" languages, you cannot for instance add the character string "3" to the number 3 to obtain 6. You get an error, like "the string '3' is not an arithmetic operand".
So even though the variable which holds that "3" can be anything, the arithmetic add operation doesn't take any kind of value.
Languages of this type can be subject to static checking, which can identify potential problems. Mature dynamic languages have good compilers with such checks. Instead of being told "you cannot run this program because of such and such a type mismatch", you get a softer diagnostic like "there is probably a type mismatch here". (And if the program is run anyway, the mismatch is still caught at run time and turned into an exception---if the execution of the program steps into the code with the values which trigger the problem.)
No. Some (more or less) modern languages have loose typing. Some modern languages have quite strict typing (Haskell for instance, but there are others).