This article is full of FUD and completely misses the point of the static/dynamic distinction (flame war, if you want). It's a perfect example of someone arguing to make a point without even trying to understand the opposite position.
No one has ever argued that dynamic languages are more expressive than static languages. This is impossible, as long as we're considering Turing-complete languages. Consider the fact that the Harmony/EcmaScript 5 reference compiler was implemented in OCaml. However, it is equally futile to argue that they are less expressive[1].
"you are imposing a serious bit of runtime overhead to represent the class itself [...] and to check [...] the class [...] on the value each time it is used."
Users of dynamic languages simply trade runtime efficiency for compile-time efficiency. What's wrong with that?
The point of having multiple languages is that different languages make different things simple and easily expressible. Sure, you can do dynamic programming in Haskell or OCaml, but the language is going to work against you in some way, requiring you to specify your intention in a particularly lengthy and awkward way (think Java, except Java makes you do that for any kind of programming).
[1] Ironically, the author fails to point out the one way in which static languages are more "expressive" than dynamic languages: overloading on a function's return type. I have yet to see that in a dynamically typed language.
tl;dr: All Turing-complete languages are equally expressive, but different languages make different things simple.
I have not used dynamic languages enough to be sure of what I say below, but this is my current understanding:
>> No one has ever argued that dynamic languages are more expressive than static languages. This is impossible, as long as we're considering Turing-complete languages.
As I think you note in your last sentence, "expressive" here is not meant to imply whether something can be expressed in the language or not. It's rather about being able to express the concept in the language in a way that is no more or less complex than the concept itself, and expresses no more or less than the concept itself. So the Turing-complete argument does not really apply. (John McCarthy in fact had to say the same thing about Turing machines themselves -- that while they help to understand limits of any machine, for AI programs they are at too low a level to help real humans build new insights into AI beyond understanding those limits.)
>> Users of dynamic languages simply trade runtime efficiency for compile-time efficiency. What's wrong with that?
1. A program is run more times than it is compiled (hopefully). So the goal is really to make the effective combination of the two more efficient in a given usage scenario. On the other hand, with a dynamically typed language, if N seconds are shoved away from compilation, at least N seconds will be added to run-time.
2. The bigger issue I have is that dynamic languages often turn compile-time "errors" into run-time errors. Run-time errors take longer to detect and correct and the process of doing so is bound to see less automation. This is not to say of course that dynamic typing is always an issue.
> 1. A program is run more times than it is compiled (hopefully). So the goal is really to make the effective combination of the two more efficient in a given usage scenario.
You're not counting development time, or rather, cost.
Even if I eventually need the efficiency of a statically typed language, dynamic languages let me save the time that I would have spent satisfying the type checker on code that didn't make it into that version.
> On the other hand, with a dynamically typed language, if N seconds are shoved away from compilation, at least N seconds will be added to run-time.
It's unclear that that's true. In fact, the cost of compile-time type checking is typically more than the cost of run-time type checking during much of development.
> Run-time errors take longer to detect and correct and the process of doing so is bound to see less automation.
Run-time errors aren't detected until run-time but since folks with dynamic languages get to run-time faster, they're often detected earlier.
As to "less automation", I don't see it. Do you have an automated system that corrects type errors?
I partly agree with what you said, but just for the sake of completeness:
>> Even if I eventually need the efficiency of a statically typed language, dynamic languages let me save the time that I would have spent satisfying the type checker on code that didn't make it into that version.
This conversion does not sound so trivial to me generally. At times converting from a dynamic language to a statically typed language requires either a change of code architecture or a massive refactoring operation.
>> Do you have an automated system that corrects type errors?
Not corrects, just detects. I have at times faced issues when code runs for half an hour and then crashes just because the given object could not be converted to the needed new type at run-time. My reaction: "Sigh! I wish the compiler told me earlier during compile-time!"
> This conversion does not sound so trivial to me generally. At times converting from a dynamic language to a statically typed language requires either a change of code architecture or a massive refactoring operation.
I didn't claim that it was necessarily trivial. However, I'm happy to claim that it is very rarely required. What is it that they say about premature optimization?
That said, I think that you're overstating the amount of work. As our Python friends keep demonstrating, the amount of code that is actually performance-critical is usually fairly small and can be handled by special means once you get things relatively stable.
And, when it is required, it may be an indication of massive success. If you wrote the first version, said massive success may make the conversion someone else's problem. :-)
> Not corrects, just detects. I have at times faced issues when code runs for half an hour and then crashes just because the given object could not be converted to the needed new type at run-time. My reaction: "Sigh! I wish the compiler told me earlier during compile-time!"
No optimization produces speedups in every situation.
As to that particular problem, I develop in dynamic languages so I "never" have that problem after a significant amount of run-time.
To me, it's important to remember that dynamically and statically typed languages give you different rope with which to hang yourself, so it's best to program with that in mind.
To put it another way, it is generally agreed that it's a bad idea to write Fortran in other languages even though it is possible. Why would you think that it would be a good idea to write static code in a dynamic language or the reverse?
About your [1] ... dynamic languages can support multimethods (i.e. method overloading on steroids) so I don't get how a dynamic language can't do overloading based on return type.
The difference between a static and a dynamic language would be when this dispatch is made (compile-time versus runtime).
Overloading on return type is fundamentally impossible in dynamic languages, since they dispatch on values, and the return value can not be known before the function is called.
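To make the "dispatch on values" point concrete, here is a minimal Python sketch. It assumes nothing beyond the standard library's functools.singledispatch; the describe function is a made-up example. The implementation is chosen at run time from the type of the argument that was actually passed, so there is nothing comparable to dispatch on for a return value that does not exist yet.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback used when no more specific implementation is registered.
    return "something else"


@describe.register
def _(value: int):
    # Chosen at run time because the argument's class is int.
    return "an integer"


@describe.register
def _(value: str):
    # Chosen at run time because the argument's class is str.
    return "a string"


print(describe(3))
print(describe("three"))
print(describe(3.0))
```

A statically typed language with return-type overloading can pick the implementation from the type the call site expects, which is information a value-based dispatcher never sees.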
This is an impressive and informative argument. It's a shame that it's aimed at a straw man.
I use the term "dynamic language" to refer to a well-recognized (though not perfectly agreed upon) set of languages. I didn't choose the term "dynamic," and I've never tried to argue that since they're dynamic, they're better/more fun/more expressive than non-dynamic languages. Hence the straw man.
I use the term simply to distinguish between, say, the set {Java, C, C++} and the set {Python, JavaScript, Ruby}. Most people are aware of this usage and immediately understand the distinction. You could call the latter set "cranberry languages" instead of "dynamic languages" and it would be okay. As long as everyone understands the usage, it's a functional (no programming language pun intended) term.
It's not so much a straw man as it is a misnomer. What he's calling a "dynamic language," I (and I think most others) would call a "dynamically typed language." It's possible to have a dynamically typed, static language - with the dynamic keyword, that's what C# is, for example.
I try to avoid any discussion of "typing" when it comes to programming languages because I recognize both my own ignorance in the nitty gritty of language design and the general lack of consensus in the meaning of "x typing" terminology.
I found this article extremely difficult to read, as he seemed more concerned with enunciating through italics than with expressing his point in a clear and unambiguous manner.
I am pretty sure the Carnegie Mellon remark was just saying that a CMU CS graduate would have learned this already. Which is true, since the blog author (Bob Harper) teaches a course there which covers this very topic!
Funny how he treats 'static typing' as the natural order and 'dynamic typing' as the special case. I would have pegged it the other way round. Machines are typeless; it's all numbers until somebody puts (or doesn't put) a type system on it.
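A small Python illustration of the "it's all numbers" view, using only the standard struct module: the same four bytes read as a float or as an unsigned integer, depending purely on which format you ask for.

```python
import struct

# Pack 1.0 as a 32-bit IEEE 754 float (little-endian), giving four raw bytes.
raw = struct.pack("<f", 1.0)

# Read the very same bytes back as an unsigned 32-bit integer.
as_int = struct.unpack("<I", raw)[0]

print(raw.hex())
print(hex(as_int))  # 0x3f800000
```

Nothing in the bytes says which reading is "right"; that information lives in the program, not in the machine.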
Plus it's rather meaningless. Bald is a hair color. Clear is a paint color. Silence is a syllable.
Your assumption is that the natural order of computation is digital computers. I think some will argue that type systems are inherent in the world. Mass has a type, velocity has a type. mv has a type that is a function of mass and velocity.
Computation is about expressing the type system that is inherent in the computation. Modern computers not having types are an artifact of their implementation, but not of computation per se.
Whoa there. Church-Turing tells us that everything that is computable is computable by a universal Turing machine. It also tells us that a universal Turing machine and the lambda calculus are computationally equivalent. The lambda calculus is untyped. (There are typed lambda calculi, but they are not computationally more powerful than the lambda calculus. Some of them are less so.)
Right. Computability says that you can compute any computable function with the lambda calculus. But I'm saying something broader, which is that anything you're trying to model with computation has a type system. IOW, computability theory says that one can model any computable function, while type theory says that one can properly constrain it.
In essence you need both. You can compute if the number five is greater than the color red, but you may want to ensure that you never do.
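As a rough sketch of that last example in Python (the Color enum and the helper name are invented for illustration): the comparison is rejected, but only when it is actually executed, which is the run-time flavor of "ensuring you never do". A static checker would aim to reject the same expression before the program runs.

```python
from enum import Enum


class Color(Enum):
    RED = 1


def five_is_greater_than(value):
    """Return the result of 5 > value, or None if the comparison is rejected."""
    try:
        return 5 > value
    except TypeError:
        # Plain Enum members define no ordering against integers,
        # so Python refuses the comparison at run time.
        return None


print(five_is_greater_than(3))
print(five_is_greater_than(Color.RED))
```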
and search for the big threads. This direction of the argument (dynamic languages are static languages) was made most forcefully, in my recollection, by Frank Atanassow.
However, the other direction _also_ applies: static languages are dynamic languages. This was first pointed out in LtU, to the best of my memory, by Kevin Millikin, here:
Someone ought to write up the old days of LtU holy wars somewhere. They taught me more about programming languages than everything I ever read anywhere else.
And as in most holy wars, most arguments thrown around are dumb and without consequence to us, the developers that have to work.
Fact: I know of a single statically typed language that could be used for real-world usage and that gets operator-overloading right, and that is Haskell, a language whose author list contains many PhDs.
Another fact: it is perfectly within reach of a talented sophomore student to implement a dynamically typed language with support for multimethods, which would get operator-overloading right, in the course of a single semester.
And yet another fact: dynamic versus static really is runtime versus compile-time and many languages to which we refer as being "static" or "dynamic" are in fact somewhere in between.
That's because -- real language designers ship.
And it would be perfect if the compiler would tell you all kinds of things about your algorithm at compile-time, but you have to draw the line somewhere and start making compromises. And did I mention the halting problem? Yes, that's a problem for compile-time too.
One of the things that a 'dynamic' type system (or as the author would see it, a restriction to a single type) imposes on its creators is the need to make that single type as useful as possible. This includes things like being able to easily lookup an object's class, documentation, methods, properties, uses (is it a sequence? is it a map?, etc.). As a result, I find the design of the core libraries of dynamic languages like Clojure better thought out than, say, something like .NET. It's simple things like being able to treat a string as a sequence (Haskell, a statically-typed language, does this), an 'object' as a map, a map as a sequence, etc. I wonder if it has to do with a sort of de-centralization of control; being dynamic seems to make a language more malleable and open to experimentation. Accomplishing the same things in a statically-typed language requires more planning (which means that after it ships, it's too late).
I really appreciate a well-thought out static language like Haskell, but it still has some way to go for the common programmer. For now, I don't think that the practical outcome of this article is for programmers to abandon dynamic languages.
"This includes things like being able to easily lookup an object's class, documentation, methods, properties, uses (is it a sequence? is it a map?, etc.)."
Most decent OO, statically-typed compilers allow for runtime type information. Delphi (native) and the .NET compilers allow you to query all sorts of information about a given object instance, including its class name, properties, methods, etc. Here's some links on Delphi's native compiler RTTI and attributes:
Object instances in environments like .NET and Delphi are inherently dynamically-typed.
Statically-typed languages provide the safety of checking the "stupid stuff" during compilation without sacrificing the flexibility of using dynamic types in the form of classes. And many do so without any major compilation overhead (Delphi and .NET compilers are very fast).
To paraphrase an old joke about mathematicians: "He must be a computer scientist!" "How do you know?" "His answer is absolutely correct, but has no practical value."
Most of the points he makes do not touch any practical problems of this divide. For example: How do I get my XUnit tests to run if one of the functions/methods/whatever in a file does not compile because the software has changed? Ruby really excels at that: it fails at runtime. Java? Not so, it will break my whole test file at compile time. Any number of similar examples can be found. Yes, it's a problem that the language marrying both of these properties has not been found yet. But Haskell certainly isn't the one.
As for your comment, if I were in the mood for trolling I would say: likewise. If you introduce a static error in a file, hopefully a tool will detect it, and the sooner the better. That you can't even run any code in the file because it makes no sense any more won't prevent your unit tests from running on other files (or at least a subset) if your program has at least a semi-correct architecture. At this point you just have to use the diagnostic information the compiler gave you to fix the syntactic/typing problem, instead of trying to fix it at a later stage for a greater cost. There is absolutely no point in trying to test a program that static analysis has proved to be wrong.
In a theoretical sense: yes. But for practical reasons, I do not have one test per compilation unit, but multiple. That's why I chose XUnit as an example: classes are groups of tests, while the actual tests are written in methods. So it makes sense to defer (or at least to be able to defer) "broken parts" to have a better understanding of exactly how much is broken. And "just have to use" is sometimes not that easy if for example your task is ripping out a whole subsystem and fitting a new one. You would really like to see progress there instead of "tiny feature Y is not provable by your compiler, so I won't tell you whether feature X, Z and B work". So I am interested in whether only a part of the compilation unit makes sense. Ruby does that: I can tell my test suite that I fully expect this test to be broken at the moment and that I want that reported. In Java, I'm lost until I have satisfied the complainer - ehm - compiler.
That's actually the reason why I know more than one Java and C shop that use Ruby/Python/similar for their test suite.
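The same workflow can be sketched in Python rather than Ruby, as an illustration only. The names feature_x, feature_y and SubsystemTests are made up; unittest.expectedFailure is the standard-library decorator. One test refers to a function that no longer exists, yet the rest of the class still loads and runs, and the broken test is reported as an expected failure.

```python
import unittest


def feature_x():
    return 42


class SubsystemTests(unittest.TestCase):
    def test_feature_x(self):
        self.assertEqual(feature_x(), 42)

    @unittest.expectedFailure
    def test_feature_y(self):
        # Hypothetical: feature_y was ripped out and not yet replaced.
        # The NameError is raised only when this one test body runs.
        self.assertEqual(feature_y(), 1)  # noqa: F821


def run_suite():
    """Run the test class and return the collected result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SubsystemTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
print(result.testsRun, len(result.expectedFailures), len(result.errors))
```

In a compiled language the missing function would typically stop the whole test file from building, so neither test would run.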
"Now I am fully aware that “the compiler can optimize this away”, at least in some cases, but to achieve this requires one of two things (apart from unreachable levels of ingenuity that can easily, and more profitably, be expressed by the programmer in the first place). Either you give up on modular development, and rely on whole program analysis (including all libraries, shared code, the works), or you introduce a static type system precisely for the purpose of recording inter-modular dependencies."
Or you can use a tracing JIT. You monitor at runtime the actual values taken by the variables and produce type-specialized code.
"Since every value in a dynamic language is classified in this manner, what we are doing is agglomerating all of the values of the language into a single, gigantic (perhaps even extensible) type."
Yes, that's right, but you're overlooking the upside of doing things this way. What this gives us is the ability to define new types -- to extend the universal type, if you want to put it that way -- at runtime. No longer do we need this strict separation between compilation time and runtime; no longer do we need the compiler to bless the entire program as being type-correct before we can run any of it. This is what gives us incremental compilation, which (as I just argued elsewhere, http://news.ycombinator.com/item?id=2345424) is a wonderful thing for productivity.
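A minimal Python sketch of what "defining new types at runtime" can look like. The helper make_record_type and the Point example are invented for illustration; the built-in three-argument type() call is what actually creates the class while the program is running.

```python
def make_record_type(name, field_names):
    """Build a brand-new class at run time from a name and a list of fields."""
    fields = tuple(field_names)

    def __init__(self, **values):
        for field in fields:
            setattr(self, field, values[field])

    # type(name, bases, namespace) creates a new class object on the spot.
    return type(name, (object,), {"__init__": __init__, "fields": fields})


# The field list could just as well come from user input or a config file.
Point = make_record_type("Point", ["x", "y"])
p = Point(x=1, y=2)
print(Point.__name__, p.x, p.y)
```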
"[...] you are depriving yourself of the ability to state and enforce the invariant that the value at a particular program point must be an integer."
This is just false. Common Lisp implementations of the CMUCL family interpret type declarations as assertions, and under some circumstances will warn at compile time when they can't be shown to hold. Granted, not every CL implementation does this, and the ones that do don't necessarily do it as well as one would like; plus, the type system is very simple (no parametric polymorphism). Nonetheless, we have an existence proof that it's possible at least some of the time (of course, it's uncomputable in general).
"[...] you are imposing a serious bit of run-time overhead to represent the class itself (a tag of some sort) and to check and remove and apply the class tag on the value each time it is used."
For many kinds of programming, the price -- which is not as high as you suggest, anyway -- is well worth paying.
In particular, dynamicity is necessary whenever data live longer than the code manipulating them. If you want to be able to change the program arbitrarily while not losing the data you're working with, you need dynamicity. In dynamic languages, the data can remain live in the program's address space while you modify and recompile the code. With static languages, what you have to do is write the data into files, change your program, and read them back in. Ah, but when you read them in, you have to check that their contents are of the correct type: you've pushed the dynamicity to the edges of your program, but it's still there.
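Here is a small sketch of data outliving the code that manipulates it, in Python. The Account class and its methods are hypothetical. The objects in accounts are created once and stay in memory; the method is then replaced, and the existing objects pick up the new behavior without being written out and read back in.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def describe(self):
        return f"balance: {self.balance}"


# The "live" data: created once, never serialized.
accounts = [Account(10), Account(25)]
before = [a.describe() for a in accounts]


def describe_v2(self):
    return f"balance: {self.balance} (v2)"


# Replace the code while the objects stay in the address space.
Account.describe = describe_v2
after = [a.describe() for a in accounts]

print(before)
print(after)
```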
For this reason, database systems -- the prototypical case of long-lived data -- have to be dynamic environments, in which types (relational schemata, e.g.) can be modified without destroying the existing data.
So to argue -- rather arrogantly, I might add -- that dynamic languages are really static languages is to overlook an operational difference that is a commonplace to anyone who uses both.
Let me expand just a little on my point about files. When your data are in files, they're just "raw seething bits" (to quote a colorful phrase I once heard); they have no type structure. To turn them into structures in memory, you have to parse and validity-check the contents. (This can be an expensive operation!)
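The parse-and-check step might look roughly like this in Python. The load_point function and the x/y layout are assumptions made up for the example; json.loads is the standard-library parser. The text coming off disk carries no guarantees, so every field has to be checked before the rest of the program can rely on it.

```python
import json


def load_point(text):
    """Parse JSON text and check that it really holds a point with numeric x and y."""
    data = json.loads(text)  # raises ValueError (JSONDecodeError) on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name in ("x", "y"):
        value = data.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"field {name!r} must be a number")
    return (data["x"], data["y"])


print(load_point('{"x": 1, "y": 2.5}'))
```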
Dynamic languages give you a middle ground: the stuff in memory is more structured than "raw seething bits", but less structured than data in a statically typed program. This is often very handy, as it's much more convenient to operate on data in memory; the slight performance cost relative to fully statically typed data is often no big deal.
Isn't this a property of the implementation rather than the language, though? Couldn't you have a compiler option to keep the type tags attached to the bits, sacrificing some of the speed benefits of static typing but keeping the primary advantage of proofs/testing?
You would still need to verify the data is still typed correctly. You just can't trust bits in general.
I've come to think of a strongly-typed program as being like a cell. Once you cross the cell boundary and it lets you pass, you can trust it and do all the wonderful type-based magic and optimizations that we know and love, but you have to get past that cell boundary first. And as annoying as that may be, poking a gaping wound in the cell wall and just jamming stuff in is likely to have bad results. Even between two cells of the same type, you really ought to do the checking, lest it turn out they are not quite as alike as you thought (versioning issues).
(And of course in both theory and practice you still can't truly fully "trust" it even after it gets past the cell wall, but at some point you hit the limits of what you can verify. Real-world cells have the same problem too.)
1) If CL can only sometimes enforce invariants about type, and in special cases, I'd argue that the original statement was true. You'll still be deprived in many (probably most) cases. (People have survived falls where their parachutes didn't open, but in an essay most people wouldn't say "You can of course jump out of a plane without a parachute" despite the existence proof. And saying "Although there have been cases where people have survived, you can't generally expect with any reasonable success rate to survive a fall from an airplane without a working parachute" just isn't good writing.)
2) I don't get the database analogy. When I add a column to a table, the size of the table on disk does indeed change. Not to mention the schema can be changed statically. If you add/remove/modify a Field in a FieldList, the type doesn't change. So that really isn't a static vs dynamic issue.
About 1): I think the point was that CL can only sometimes enforce invariants at compile time. If the compiler can't prove that something is always true or always false, it will simply emit code to do a runtime check.
But as a general principle, you can't reliably or consistently do this in a dynamic language in any sort of guaranteed fashion, which was implicit in the original statement being refuted.
Even with a runtime check, you still CAN pass in a bad value, and the program will just fail, either locally or globally.
(Not to mention that the paragraph immediately following the refuted statement starts with 'Now I am fully aware that “the compiler can optimize this away”, at least in some cases, but to achieve this requires one of two things (apart from unreachable levels of ingenuity that can easily, and more profitably, be expressed by the programmer in the first place)...')
1) I think the right question to ask here is, how often is the lack of compile-time checking a problem in practice, and how big a problem is it? I've worked extensively in both static and dynamic languages, and my experience has been that in dynamic languages, while I do miss static type checking occasionally, it happens quite a bit less often than your parachute analogy would lead one to imagine.
Look at it this way. Programs have lots of important properties. Some of these we have figured out how to encode as types so that we can verify them statically. But (although research in this area is ongoing) there are still a lot of important properties we require of our programs that can't be statically verified. The upshot is, we have to test them. Yes, testing is necessarily imperfect, but we have to do it anyway. In my experience, in the course of testing a program in a dynamic language, the kinds of errors that would have been caught by static typing are relatively easy to find by testing; usually, the errors that are hard to find by testing are also well beyond the scope of static typing. Maybe that will change eventually, though I'm skeptical, but it's certainly the state of type systems in common use at the moment.
So the parachute analogy is grossly exaggerated; it's more like leaving your shoelaces untied.
When Doel says "you're depriving yourself of the ability to state and enforce the invariant...", it's not clear whether he means "always" or "at least sometimes". I concede that he could have meant the latter, in which case you're right, my counterargument fails. But as I've just argued, there are lots of other invariants that we can't statically enforce anyway, and they tend to be the more important ones.
2) A table is a bag (multiset) of tuples of some specific type (schema). When you add a column, the table now has a different type, because it's now a bag of tuples of a different type.
Imagine that instead of using a database you keep all your data in a running process of a program written in a static language. (Let's ignore the possibility that the program or the machine might crash.) How are you going to structure the data? To make use of static typing, you need to use collections of records (or "structs" or whatever you like to call them). How then would you add a field to one of these record types? There's no way to do it without recompiling your program, and there's no way to do that without either losing your data or dumping them to files and reading them back in to the new version of the program.
Now, you could store your data differently, by explicitly modelling rows as maps from keys to values. But what's the range type of your map? Why, it's the union of the types of the possible data values you might want to store; you've lost static typing for the data values, and will have to check at runtime whether each value you read out of one of these row maps is of the correct type.
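A sketch of the row-as-map design, with Python type hints standing in for a statically typed language. The Value and Row aliases and the get_int helper are invented for the example. The declared value type is a union, so a checker can say nothing more precise about any individual cell, and the check moves to run time.

```python
from typing import Dict, Union

# The range type of the map: a union of everything a cell might hold.
Value = Union[int, str, float]
Row = Dict[str, Value]


def get_int(row: Row, column: str) -> int:
    """Read a cell that is supposed to be an int, checking at run time."""
    value = row[column]
    if not isinstance(value, int):
        raise TypeError(f"column {column!r} holds {type(value).__name__}, not int")
    return value


row: Row = {"id": 7, "name": "widget"}
row["price"] = 9.5  # adding a "column" needs no recompilation

print(get_int(row, "id"))
```

Adding a column is now trivial, but asking for get_int(row, "name") fails only when that line runs, which is the trade described above.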
Programs written in Python are often augmented with programs written in C. Didn't Python become popular BECAUSE it worked seamlessly with C programs? Use both types of languages together.
But I'm interested in whether type-inferring languages will ever take off, bringing the best of both worlds, i.e. the code terseness of "dynamic" and the compile-time checking of "static". I'm not sure if types can be changed dynamically in these types of languages, though.
> In particular, dynamicity is necessary whenever data live longer than the code manipulating them. If you want to be able to change the program arbitrarily while not losing the data you're working with, you need dynamicity. In dynamic languages, the data can remain live in the program's address space while you modify and recompile the code. With static languages, what you have to do is write the data into files, change your program, and read them back in.
Not completely true. While I don't know of many use cases for doing what you're talking about, there is at least one that I know of, which is debugging.
In the case of debugging, you can change code with static languages almost arbitrarily while the data stays live in the program's address space. One thing that is not possible, with current implementations, is changing the type definition. But otherwise changes can be made to code w/ data never leaving main memory (I probably do this 10 times per day).
Now it is theoretically possible to also change type definitions, but that would require a notion of constructor that builds the new type from the old type. In some cases this can effectively just be an interface, in which case it's really just a no-op.
"Another part of the appeal of dynamic languages appears to be that they have acquired an aura of subversion. Dynamic languages fight against the tyranny of static languages, hooray for us! We’re the heroic blackguards defending freedom from the tyranny of typing! We’re real programmers, we don’t need no stinking type system!"
I don't know anyone who prefers dynamic languages who actually thinks this way. I program in both statically- and dynamically-typed languages, and I prefer dynamically-typed languages because it's less code I have to write.
I do like the direction C# is going with the dynamic type. I want a statically typed language, but I want the ability to have dynamically extensible type classification -- when I want it.
Are you referring to extension methods? Or the "dynamic" keyword/type? I've only used the dynamic type a little bit, can you use it to solve the type vs representation problem he is referring to?
Yes. Basically it works by using the class hierarchy to define types. And w/ subtype polymorphism you get new types that are also classified under their base types. And with dynamic you now get the ability to join the "uber type" and be classified dynamically.
Here is why the article doesn't make sense to me: I frequently find that languages in the "dynamic" group (python, ruby, js, smalltalk etc) feel much, much more similar to languages in the "ultra strict" type group (haskell, ocaml, etc), than either does to, say, the medium group (c(#|+)*, java). If the case were really that there is some sort of natural order along the type strictness lines, wouldn't it be that python felt closer to c than to haskell in terms of power and expressiveness?
What you're seeing there is the distinction between "manifest typing" [1] and "implicit typing" [2]. With manifest typing, you actually have to tell the compiler the specific type. With implicit, it will figure it out, either because (as discussed in the article) everything is the same type "Object", or because it will do type inference. Most modern type inference has started with Hindley-Milner [3], but has moved beyond it in various ways with varying degrees of justification and success. Raw HM doesn't seem to be enough to work with in practice, but moving beyond it gets you into the realm of undecidability unpleasantly quickly. But some progress is being made.
The C(/#/++)/Java languages had a lot of people convinced that being statically typed required manifest typing. Including me. I thought I was against static typing; what I was actually against was manifest typing.
Please don't use "strict" regarding a language's type checking. It's already used in domain theory for functions which rely on arguments being well-defined and computable. It is one of Haskell's defining features that by default, functions and data structures are not strict.
While the basic perspective of this post is interesting, I find the author's rhetoric entirely off-putting.
Example:
"There are ill-defined languages, and there are well-defined languages. Well-defined languages are statically typed, and languages with rich static type systems subsume dynamic languages as a corner case of narrow, but significant, interest."
I would like to see justification for the claim that there does not exist a single well defined language with runtime type enforcement.
This tone runs through the entire article, which is mostly begging the question rather than supporting the premise. I'd love to see the author's point illustrated by contrasting Haskell/ML and Python/Scheme/Ruby code listings or something similar.
Instead the article merely restates its premise in an attempt to make an impression, rather than to inform. Disappointing.
The author apparently has knowledge and experience, but attributing psychological traits to tools like programming languages is a poor trick and a fallacy.