I'd have to ask how much Scala and how much Clojure you know to challenge my statement about the equivalence of the code and the reduction in size? (no offence intended - a genuine question)
I like Scala. I'd hoped it would become my company's "One True Language". It has a lot less ceremony than Java. I went thru the very painful 2.7 to 2.8 transition. I attended Scala Lift Off! 2010 and Scala Days 2011. I think Scala is very impressive - a (much) better Java. I hope it will succeed and I still follow it closely.
I've always liked functional programming, and Lisp, and Clojure hits a particular sweet spot for me. It also happens to hit a sweet spot for our company - our developers like dynamic scripting languages and Clojure fits that better than Scala. Scala _allows_ you to adopt a functional style but doesn't enforce it (and, to be honest, makes you work at it) - Clojure makes you work at deviating from the functional path.
Clojure has less ceremony than Scala, is more functional (or at least more purely functional), and - in my opinion - makes far fewer demands on developers in terms of what you must learn to be productive...
I've written a substantial prototype in Scala, which included half a dozen Neo4j-backed implementations of scala.collection traits. As for Clojure, I've only really toyed with it, but I have written meaty code in several other Lisp dialects. So while I am by no means an expert in either, I'd say I'm qualified to be skeptical of your claims. Furthermore, I have no vested interest in either language. I have a future production project in mind for each of them.
I'm going to ignore the discussion about functional vs OOP, and team sweet spots, and cognitive load of the Scala type system, and all of that. I agree with and disagree with several of your points, but they are not relevant.
Scala, being a statically typed language, has some additional "ceremony" over Clojure. That's for sure. But, in my experience, that ceremony is a relatively small fixed cost of around 5 to 10 lines per "big scary top-level thinggie" in the problem and solution domains.
In general, I'd expect the Clojure source code to be shorter by about 5%, not 75%. In some cases, either Scala or Clojure will be significantly shorter than the other, depending on whether or not the type system or macros are employed cleverly.
Regarding your specific claim -- "The form of the Clojure code mostly follows the form of the Scala code" -- I am beyond skeptical. I do not trust your account of "mostly the same functions".
Without seeing concrete examples of how the code shrank so significantly, your anecdote is not useful to any reader.
Thanx for taking the time to give some background. I posted the anecdote on the Clojure list just by way of sharing some of the progress we were making with Clojure. If someone had actually asked me, I'd have probably said "No, don't post this on HN" because it's an isolated anecdote - not a general case.
A couple of folks have picked on the "Clojure code follows the Scala code" claim. Since the code embodies a fair bit of proprietary business logic, I can't share it - which is a shame because it would clearly show the structure really is very similar with almost all of the same function names.
I'll see if I can find some time to sit down with the two code bases side by side and produce a more detailed comparison which would answer some of the more critical voices here.
A couple of folks mentioned library support, and part of the increased conciseness in the Clojure probably comes from that - in particular, in Clojure, a SQL query result is a sequence of maps whereas I had to construct a collection wrapper around ResultSet in Scala (remember, the code was written back in 2009 - there are better SQL abstractions available for Scala now as third party libraries).
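To make the "sequence of maps" point concrete, here is a minimal sketch in Python (not the original Scala or Clojure, which can't be shared). It uses the standard sqlite3 module with an in-memory database. The `sites` table, its columns and the `query_as_maps` helper are made up for illustration.

```python
import sqlite3

def query_as_maps(conn, sql, params=()):
    """Run a query and return each row as a plain dict (column name -> value)."""
    cursor = conn.execute(sql, params)
    columns = [desc[0] for desc in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO sites VALUES (?, ?)", [(1, "alpha"), (2, "beta")])

rows = query_as_maps(conn, "SELECT id, name FROM sites ORDER BY id")
names = [row["name"] for row in rows]
print(names)
```

Once rows are ordinary maps, the usual collection functions apply directly and no wrapper type is needed, which is roughly the saving described above.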
Using parallel collections in Scala would more closely mirror the approach we now use in Clojure - and that would shrink the Scala code a bit - but parallel collections weren't available when the code was written. However, using agents in Clojure - which would more closely mirror the old Scala code - would only add a few lines.
I think the Scala code could be made quite a bit more concise because of the advances in Scala over the last two years. I suspect I could make the Clojure code a little bit more concise too. I don't believe I could make the Scala code more concise than the Clojure code.
> I like Scala. I'd hoped it would become my company's "One True Language".
With all of these exciting languages that target the JVM, I think it makes a lot of sense to have a company's "One True Language" simply be the JVM itself.
That allows a programmer to use Scala, Clojure, JRuby, or Jython to initially write software quickly. Anything that becomes a core library can be rewritten in Java for efficiency and so that bindings can be added for all of the high-level languages. Then projects themselves can be written in whatever language best suits the individual employees.
I was a little disappointed in the point where he said he rewrote it in 15 hours, and was 75% lighter code... and didn't explain why.
Makes me think a few things...
- The original code was bloated and needed a rewrite anyway, and hence was performing badly. Most importantly, having written lots of code that has evolved over the years myself, I have taken my own code (written in the same language) and cleaned it up to be half the size it was beforehand.
- The Clojure rewrite was incomplete compared to the Scala version, and that totally invalidates his comparison.
- Would the out-of-memory issue and performance difference be solved by not using actors with known memory leaks and swapping to Akka?
Would love a follow-up on this as I'm slowly learning Scala as a hobby ATM.
Yes, switching to Akka would almost certainly have solved the OoM problems. That's the whole point. We had to switch from the original code. The choices were: move to Akka, wait for 2.10 (when Akka replaces the default actors implementation - and whatever migration was involved), or rewrite. Given that we're getting heavily into Clojure, it seemed a worthwhile spike to create a Clojure version to see how it performed...
Was the original code bloated? I don't think so. I've written a LOT of code over 30+ years in dozens of languages. When the Scala code was written two years ago I was very impressed with its conciseness compared to Java (and other languages). My experience has been that Scala is 3-4x more concise than Java in general (and sometimes as much as 10x more concise) but Clojure is even more concise.
If you're learning Scala, don't let this put you off: Scala is an incredibly impressive language. Odersky and his team have done an amazing job.
I have a comparison too: a rather different kind, but no less reasonable for this sort of game, and with all source available -- http://www.hxa.name/minilight/#comparison
It says that languages do not vary by factors of 10. Considered as pure 'lexical structure' the range is about 2 or 3 between small and large. Anything more than that must have some reason -- most probably library support.
Of course, library support is important, and quite fair to be compared in lines -- it represents real work that has to be done. But it seems worth trying to distinguish it as a library matter -- or whatever else.
Scala is up to "10x more concise" than Java, and your numbers suggest a 4x reduction from Scala to Clojure (260 LOC vs. 1000 LOC), given that they represent more or less the same structure in both languages.
This would result in a 12x (4 x 3) to 40x (4 x 10) reduction from Java to Clojure. Which is hard to believe.
Take this example where [edit] Python is compared to Java with a 1.7x size reduction:
Could you give an example that reduces this code by 12x, or even by 40x? The biggest reduction in LOC would probably be the reduction in constructor size with Scala case classes.
Your math is a bit off: Scala can be _up to_ 10x more concise than Java. This specific Clojure code was either 3x or 4x more concise than this specific piece of Scala code (based on char count or line count).
My experience over the last few years suggests that Scala code can be anything from 2-3x to 10x more concise than Java. My experience also suggests that Clojure code can easily be up to 10x more concise than Java. I don't think too many people would argue with that as a general rule (or at least, not too many people who've programmed in all three languages :)
I'll take a look at the link you posted and respond in more detail later.
Could you elaborate on the point where my math is off?
You mentioned a range from 3-10x between Scala and Java, and we have a 4x multiplier with your code (260 LOC Clojure vs. 1000 LOC Scala). This would result in a 12 - 40x range which I quoted in my comment.
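For what it's worth, the arithmetic being argued over can be written out. This is only a sketch: the ratios are the gut-feel figures quoted in this thread, not measurements, and multiplying them assumes the same ratio holds for the same piece of code, which an "up to" figure does not promise.

```python
# Conciseness ratios quoted in the thread (estimates, not measurements).
scala_vs_java = (3, 10)            # "3x to 10x" more concise than Java
scala_loc, clojure_loc = 1000, 260

# About 3.85x, which the thread rounds to 4x.
clojure_vs_scala = scala_loc / clojure_loc

# Naive compounding of the two ratios.
low = scala_vs_java[0] * clojure_vs_scala
high = scala_vs_java[1] * clojure_vs_scala

print(f"Clojure vs Scala: {clojure_vs_scala:.2f}x")
print(f"Implied Clojure vs Java: {low:.1f}x to {high:.1f}x")
```

With the unrounded line counts the implied range is roughly 11.5x to 38.5x, slightly below the 12x to 40x quoted.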
I also find it a little discomforting that "Scala is 3-4x more concise than Java in general" becomes "my experience over the last few years suggests that Scala code can be anything from 2-3x" in a matter of minutes.
You're sure about your numbers? Not wanting to be offensive, but it sounds a little like you pull them out of thin air.
I think you're striving for absolutes where only relatives exist.
You're right that I said 3-4x in one place and 2-3x in another. Sorry. It's a gut feeling. Is it 2x, 3x, 4x? One of those. Any of those depending on the code. Somewhere in the middle. It's going to depend on the specific code. You can't reasonably suggest that language X is always Nx more concise than language Y, surely?
So, somewhere in that space, Scala is substantially more concise than Java (will you allow "substantially"?) and Clojure seems to be substantially more concise than Scala.
A little bit disappointed about the down votes, I'd hope we as an industry take numbers not so lightly and find "Is it 2x, 3x, 4x? One of those." no longer comfy.
I'd also hope we strive more towards "Extraordinary claims require extraordinary evidence" and more science than folklore.
1.7x is probably misleading here. The example in the post is less than 100 lines long, which makes it very difficult to tell how the Python code could be shorter. For example, in this tiny example program, the author creates an "Artist" class:
    class Artist:
        def __init__(self, name):
            self.name = name
        def __str__(self):
            return self.name
Anyone familiar with Python will realize that if this is the only functionality you need, then a bare string could be used instead, eliminating this code entirely.
At the end of this article (Update 3), the author links to a blog post that describes a reduction in lines of code from 90,000 to 12,000 after a re-write [1]. He claims that this reduction is more the product of increased domain knowledge than any sort of language benefit.
I think that really gets at the crux of the issue. Rapid prototyping, ease of understanding, clarity, and ease of modification could not only be as important as code size, but could also directly influence it.
"Anyone familiar with python will realize that if this is the only functionality you need, than a bare string could be used instead, eliminating this code entirely."
Your argument was: the LOC is too high compared to Java because a class is used instead of a String? Or is there a shorter syntax for constructors, methods and attributes?
Well, the only methods implemented in this class are the __init__ method, which is a constructor and the __str__ method which tells Python what to spit out when print is called.
Since Python has the capability of duck-typing, if I had an instance of the Artist class, and a string with the artist's name in it, I could use them in the exact same way with regards to the functionality implemented in the Artist class. That is, I could type "print some_artist" and either would print out the name.
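A small sketch of that duck-typing point, written for Python 3 (so `print(...)` rather than the `print some_artist` statement above). The `label` helper and the sample name are hypothetical, added only to show both values going through the same code path.

```python
class Artist:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

def label(artist):
    """Format anything that str() can convert; hypothetical helper."""
    return "Artist: " + str(artist)

as_object = Artist("Miles Davis")
as_string = "Miles Davis"

# Both calls produce the same text, so the class adds nothing here.
print(label(as_object))
print(label(as_string))
```

The equivalence only holds while printing is all the class does. As soon as Artist grows other behaviour, the bare string stops being a substitute.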
But, my larger point was that lines of code is tied in to a much more complex relationship that the developer has with their language and the problem domain. And that this is kind of a silly example to claim that Java is 1.7x more verbose than Python.
Exactly... in this case, the functionality of Artist() is a subset of the functionality of str(). In Python, you would just name the variable "artist", assign a string to it, and be done with it.
It's interesting that rewriting in Clojure was "less work and more timely" than migrating to Akka or waiting for Scala 2.10. Any insight as to why? Having written no Scala, I'm not sure what the Akka changes would entail.
Not sure what you mean - if code is statically type safe, it will also be dynamically type safe. Moving from Scala to Clojure is not really affected by the type system.
The original Scala code was written back in November 2009. Akka wasn't really the "accepted production solution" back then. There's no question that it is _today_.
We used actors to create a pool of workers in order to run some slow processes in parallel. The simplest equivalent in Clojure was to run the processes via pmap. Cheap, quick and very effective. We didn't have as much control over the parallelism but it didn't really matter for our use case.
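As a rough illustration of the "map a slow function over the inputs in parallel" approach, here is a Python analogue using the standard library's thread pool. It is not Clojure's pmap and not the code discussed here. `slow_process`, `parallel_map` and the worker count are stand-ins chosen for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_process(n):
    """Stand-in for a slow, independent job (hypothetical)."""
    time.sleep(0.05)
    return n * n

def parallel_map(fn, items, max_workers=4):
    """Apply fn to each item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))

results = parallel_map(slow_process, range(8))
print(results)
```

The trade-off matches the one described above: very little code, but only coarse control over the parallelism (here, just the pool size). In CPython a thread pool like this mainly helps when the slow work is I/O-bound.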
We had about 1,000 lines of Scala that we'd developed as we learned Scala. Today we have more experience in Clojure than we did in Scala back then. Writing 260 lines of Clojure with the Scala code as a guide took a couple of days, including writing a new set of unit tests.
Learning enough about Akka to migrate to it would have taken us some time - and would still have left us with three languages in the mix. By replacing this Scala code with Clojure, we only had two languages in the mix, which was also a win.
I'm surprised that you chose to use actors to parallelize your code. I always thought they were more valuable for concurrency. Back before Scala 2.8 came out with parallel collections, when I needed parallelism I wrote my own pmap-like functions. It was really easy, probably easier than learning the whole concept of actors, let alone an implementation of them, and I got the performance speedup I was looking for.
It sounds to me like you got a large code reduction because in Scala you used a heavyweight concurrency solution to a lightweight parallelism problem.
I just went back and looked at the code overhead of using actors over what it might look like with 2.9's parallel collections and the difference would be about 25-30 lines so I don't think that counts as a potentially "large code reduction". Wrapping three bits of code in actors doesn't add much ceremony (which is in fact a great example of how concise Scala can be when adding concurrency!).
(And parallel collections came in 2.9, not 2.8 - we'd already migrated, painfully, from 2.7 to 2.8)
[edit: See my comment below on how this might be a culture thing]
I wonder how 15 hours in the post become a couple of days here on HN. But of course, the post was written as an anecdote and should be taken as that.
Could you elaborate on what makes Akka "a language" for you? I assumed Akka actors are, for all practical purposes, a drop in replacement for scala actors.
More people should, btw, read "Anecdotal Evidence and Other Fairy Tales" in Hacknot's book:
15 hours = 2 x 7.5 hours - 2 days? (seems like basic math to me :)
Akka is _not_ a language - I'm not sure where you got that impression? (please explain so I can respond to it - no judgement). Based on the Akka documentation (pre-1.0), it did not seem to be completely a "drop in replacement for scala actors" but I hope it's moving that way (and I'm very pleased that Scala is adopting Akka as the default implementation!).
Ah, I see. No, we have another language in the mix which is not relevant to this discussion. The "One True Language" comment was a bit tongue in cheek since I doubt we'd really end up with just a single language across all tiers. I had initially hoped Scala would become our primary backend language but it looks like that "honor" will go to Clojure as we're doing more and more work with that.
Excellent, the title says anecdote to set the right context. There have been other discussions on how much the knowledge of writing one solution helps in writing another solution in a different language, and what is attributable to the new language itself.
It takes a little away from the story that it took them a lot of time to discover a widely published performance and memory problem with Scala's actors.
"after a lot of poking around, we became fairly convinced it was due (at least in part) to the default actor implementation in Scala"
"widely published performance and memory problems with scala actors" - remember the code dates back to November 2009 when that was not common knowledge...
The post was probably a little unclear. I did not understand that you poked around to find the OoMs in 2009; I just assumed it was more recent, as the Clojure post was from September 2011.
No, we didn't hit the OoM regularly until we had more traffic as we moved more sites onto the new system and it wasn't clear that there was an admitted problem with the Scala actors at first either.
These days folks seem much more comfortable about admitting there's a problem with the built-in actor implementation.
Numbers of lines of code and improved readability alone are worth switching.
As a consequence, less code means less consumption of resources, for Java it is very true.))
Just by looking at the code one could see that Clojure makes the development process for JVM less annoying and frustrating while Scala makes it even more verbose and complicated. ^_^
"The form of the Clojure code mostly follows the form of the Scala code, most of the same functions"
Well evidently not. Otherwise, the code would not have seen a 75% reduction in lines of code.
What specific transformations resulted in this delta?