Clojure Implemented in Pure Python (github.com/halgari)
122 points by gdw2 on Feb 29, 2012 | 53 comments



This is an exciting idea, but I think the home page focuses far too much on these hypothesized advantages of a dynamic VM. Not only are the advantages unproven, it's hard to see how this project represents anything novel in that aspect given that ClojureScript already exists.

The things that come to my mind are things like Numpy, Scipy, PySide, boost.python and all other sorts of Python bindings to C, C++ and Fortran code that aren't readily available on the JVM. Also, the startup time that the JVM can't match, the Python standard library and the other random bits people have developed that make Python such a wonderfully flexible scripting environment.

If you have excellent interop, there will be plenty of interest in this project regardless of how fast or slow it is. If it's fast, all the better.


I agree. Native access to legacy linear algebra functionality exposed by Numpy is the #1 reason this project seems exciting to me. I don't really care if the bulk of the code is O(10) slower so long as the numerical bits are O(1000) faster, which is a typical result when comparing LAPACK to the JVM equivalents.


(Pedant warning)

I think you may be abusing Big-O notation a little here since, technically O(10) is equivalent to O(1000)—and both are O(1).


Haha, thanks for the warning. I am reminded of another time I ended up abusing the Landau notation in a numerical analysis class (or was it probability?) some years ago. The homework was to get an N^2 log(N) algorithm of some kind, but I became concerned about certain large constant factors which I could determine exactly. However, these constant factors could be written as a series in another parameter, which I couldn't approximate in a certain limit and... well, you see where I'm going. There were lots of O's with slightly different semantics on the same paper.


It's fairly clear in the context that he's using O(x) in the informal 'order of' meaning: 'order of 10', 'order of 1000' etc.


I agree, and had no problem understanding his intended meaning but I was irked by the unnecessary misuse of technical notation to attempt to get across an idea that would have been perfectly easy to write without it (e.g., "10 times" and "1000 times").


The interesting thing about Python is that while it's fast to start, it can be slow to stop. I've certainly seen that in CPython; I don't know if it's something that affects PyPy as well.


Huh. I think Common Lisp would be a much better choice than Python. It's much faster, for starters. And it lets you do some fairly low-level stuff if you want.

In general I think Common Lisp's virtues as an implementation substrate for other languages are much greater than most people appreciate. It is flexible, expressive, and fast. Its dynamicity comes in very handy. And some of its vices -- its sheer size, its lack of orthogonality, and its occasionally archaic naming conventions -- are much less problematic for a language implementation task than they are for general programming.

There are exceptions, of course. You wouldn't want to implement C++ in Common Lisp. But for dynamically typed languages it ought to be a leading candidate.


Interoperability between Clojure and Python is promising, because it lets you do Lisp programming with Python libraries and data structures.

If you want to do Lisp programming with CL libraries, you could just use CL.

I'm not sure what the advantage of Clojure in CL is over just using CL.


I doubt CL-Clojure is faster than PyPy-Clojure.


Why? CL compilers (SBCL at least) generate really fast code.


It does not matter how good a compiler you use to compile an interpreter; it stays an interpreter. With PyPy you get a JIT compiler. Sure, you could maybe write a JIT with CL, but that's a lot more work.


I don't understand. Do you mean that you can't compile Clojure into Common Lisp the way you can compile it into Python/PyPy?


You can, but I think you'd run into the same issues mentioned here: http://clojure-py.blogspot.com/2012/02/ive-been-asked-many-t...

Basically, reflection (aka dynamic dispatch) is a major performance killer if you don't implement a tracing JIT.


CLOS implementations have had to deal with the problem of extreme dynamism for a long time, so there has been quite a bit of optimization work associated with it. Polymorphic inline caches help a lot, for example.

That said, a good tracing JIT is probably better; but on the other hand, SBCL has profited from many years of optimization work on CMU CL. It would be interesting to see some real-world performance comparisons between SBCL and PyPy.
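
To make the inline-cache idea concrete, here is a minimal sketch in Python. It is not how any CLOS implementation or PyPy actually does it; `InlineCache`, `Circle`, and `Square` are hypothetical names made up for illustration, and the sketch ignores cache invalidation when a class is redefined. The point is only that a call site can remember, per receiver type, which method the last lookup resolved to.

```python
class InlineCache:
    """A toy polymorphic inline cache for a single call site.

    It remembers, per receiver type, which function a named lookup
    resolved to, so repeated calls with the same types skip the lookup.
    """

    def __init__(self, method_name, max_entries=4):
        self.method_name = method_name
        self.max_entries = max_entries
        self.entries = {}  # receiver type -> resolved function
        self.hits = 0
        self.misses = 0

    def call(self, receiver, *args):
        cls = type(receiver)
        func = self.entries.get(cls)
        if func is None:
            # Slow path: full dynamic lookup through the class hierarchy.
            self.misses += 1
            func = getattr(cls, self.method_name)
            if len(self.entries) < self.max_entries:
                self.entries[cls] = func
        else:
            # Fast path: the type was seen before at this call site.
            self.hits += 1
        return func(receiver, *args)


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # crude pi, keeps the arithmetic exact


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


site = InlineCache("area")
shapes = [Circle(1), Square(2), Circle(3), Square(4)]
areas = [site.call(s) for s in shapes]
print(areas, site.hits, site.misses)
```

With two receiver types at one call site, the first call for each type takes the slow path and later calls hit the cache.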


Wouldn't SBCL's type inference (plus type hints) help with this? I'm no compiler expert, but I think Clojure (with the possible exception of multimethods, which are slow even on the JVM) would map well to CL (maybe even better than to the JVM).


PyPy is not just a Python interpreter. The PyPy team built a toolchain that can create a JIT compiler out of an interpreter (read a bit about the PyPy project and you will be enlightened). Such a toolchain does not exist for Common Lisp.


Interesting! Thanks for the pointer.


Here's a quick benchmark on my machine (an EeePC 1001HE). I used reduce1 with clojure-py because there doesn't seem to be a reduce BIF. I don't know if that affected this benchmark at all:

Python:

    (time (reduce1 + (range 100000)))
    Elapsed time: 3882.57193565 msecs
    4999950000
PyPy:

    user=> (time (reduce1 + (range 100000)))
    Elapsed time: 259.984970093 msecs
    4999950000
Clojure via Java Hotspot:

    user=> (time (reduce + (range 100000)))
    "Elapsed time: 75.35225 msecs"
    4999950000
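
For reference, the same reduction can be timed in plain Python without clojure-py. This is only a sketch: `time_reduce` is a hypothetical helper name, and no timings are quoted because they depend on the machine and the interpreter.

```python
import operator
import time
from functools import reduce


def time_reduce(n):
    """Sum the integers 0..n-1 with reduce; return (total, elapsed msecs)."""
    start = time.perf_counter()
    total = reduce(operator.add, range(n))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return total, elapsed_ms


total, elapsed_ms = time_reduce(100000)
print("Elapsed time: %.3f msecs" % elapsed_ms)
print(total)
```

Running it under both CPython and PyPy gives a baseline for how much of the gap above comes from the host interpreter rather than from clojure-py itself.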


Those numbers are a bit small to get a full view of the way the PyPy JIT works:

  user=> (time (reduce1 + (range 100000)))
  Elapsed time: 903.011083603 msecs
  4999950000
  user=> (time (reduce1 + (range 1000000)))
  Elapsed time: 777.748823166 msecs
  499999500000
  user=> (time (reduce1 + (range 10000000)))
  Elapsed time: 8764.57095146 msecs
  49999995000000
  user=>

Clojure via Hotspot:

  user=> (time (reduce + (range 100000)))
  "Elapsed time: 174.768593 msecs"
  4999950000
  user=> (time (reduce + (range 1000000)))
  "Elapsed time: 267.60421 msecs"
  499999500000
  user=> (time (reduce + (range 10000000)))
  "Elapsed time: 1131.77367 msecs"
  49999995000000
  user=>
But yes, we still have some room for improvement.


Here's a blog post explaining the issues involved in a bit more detail: http://clojure-py.blogspot.com/


That looks nice. I have to ask: Clojure on the JVM is fast: can a PyPy runtime really compete after Hotspot has a chance to optimize?


I'm less concerned with speed on this platform. I think it'd be excellent even if only suitable for small scripts that need to start up fast.


All VM JITs compile to statically-typed machine code, so it's doubtful that building a VM from the ground up for dynamic code is going to yield major performance improvements. There are some enhancements that give VM-level visibility into dynamic invocation (like INVOKEDYNAMIC in JVM 7), but this has a relatively small performance impact on real-world codebases. In addition, the modern JVM is a sophisticated, highly-tuned machine with millions of man-hours on the books. It's unlikely to be eclipsed any time soon.


This is not true at all. PyPy implements a tracing JIT that compiles specific loops for each set of types run through the interpreter. This means it is actually possible to have Python code that runs faster than C code in some rare cases: http://morepypy.blogspot.com/2011/07/realtime-image-processi...


Proving that there are cases where it's faster than C doesn't prove that it's faster than the JVM ;) There are plenty of cases where JVM code will be faster than C too...


Er, what about Java interop? The notions of atoms, agents and refs? How are you going to translate these into "pure Python"?


CLR Clojure doesn't have Java interop either. Neither does ClojureScript. I don't think it's a requirement for a Clojure implementation.

Atoms, refs and agents seem like an integral part of the Clojure language however.


There's absolutely nothing stopping us from implementing atoms, refs, and agents on clojure-py. Sure, the GIL won't help at all, but PyPy is working hard to fix that.
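
As a rough illustration of the atom case, here is a minimal sketch in Python. It is not clojure-py's implementation: the `Atom` class and its method names are assumptions made for this example, and it uses a lock for the compare-and-set step rather than a hardware primitive.

```python
import threading


class Atom:
    """A toy Clojure-style atom: one reference, updated atomically."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def deref(self):
        return self._value

    def compare_and_set(self, old, new):
        """Set to new only if the current value is still old (identity check)."""
        with self._lock:
            if self._value is old:
                self._value = new
                return True
            return False

    def swap(self, func, *args):
        """Apply func to the current value, retrying if another thread won."""
        while True:
            old = self._value
            new = func(old, *args)
            if self.compare_and_set(old, new):
                return new


counter = Atom(0)


def bump_many(n):
    for _ in range(n):
        counter.swap(lambda v: v + 1)


threads = [threading.Thread(target=bump_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.deref())
```

As in Clojure, the function passed to `swap` may run more than once under contention, so it should be free of side effects.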


Is this a hobby project or are you aiming toward getting adoption? I will confess that I don't understand the point. If we accept the Church-Turing thesis, then there's no reason to think that, by the time you catch up to Clojure's performance on the JVM, it won't have been optimized beyond the measurement you made. The only real guarantee you can make is that you will have added an abstraction layer to Python.

While I definitely think this is a cool project, when I think about using it in production it strikes me as more Sisyphus than Prometheus.


First of all, it should be mentioned that this is not an abstraction layer. Clojure-py functions are actual Python functions (not classes as they are on the JVM). This means that the speed of clojure-py is almost exactly the same as that of Python code.

Secondly, there is some benefit to not having to worry about static typing in a dynamic language. Anyone want to explain how to read a binary file in Clojure? Here's a hint: it takes the use of FileInputStream and DataInputStream. In clojure-py it's as simple as (py/open "foo.bin" "r").
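
For comparison, the plain-Python side of that call looks roughly like the sketch below. The file contents and layout are made up so the example is self-contained, and it passes "rb" because Python 3 treats "r" as text mode.

```python
import os
import struct
import tempfile

# Write a small binary file first so the example is self-contained.
# The layout (two little-endian unsigned 32-bit integers) is made up here.
path = os.path.join(tempfile.mkdtemp(), "foo.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<II", 7, 42))

# Reading it back needs only the built-in open and the struct module.
with open(path, "rb") as f:
    data = f.read()

first, second = struct.unpack("<II", data)
print(first, second)
```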

And thirdly, why are we writing a dynamic language on a static VM? Fast Clojure code on the JVM these days takes little hints. You have to tag parameters with ^Integer and ^Double to kick the compiler type inference into gear. None of this is required on a dynamic VM. In fact, it's completely pointless.


Ok, what's the use case? What is the benefit to this versus regular ol' Python? In what situations might I write clj-py, assuming identical performance to CPython or PyPy, instead of just Python?

I agree that this is a really cool project and I plan on reading through the source when I have time strictly as a learning exercise, but I cannot think of a use case.


Consider the case of a person that prefers programming in Clojure to programming in Python... :)


I am such a person. That's why I program in Clojure -- yes, including a couple of Clojure repos at work (after getting sign-off from my boss). I also love Python. Best tool for the job and all that. Rewriting Clojure in Python is as absurd to me as rewriting JavaScript or Ruby in "pure Python". You get nothing meaningful except different syntax.


I agree, if you're into Python, use Python. However, I find things like macros, protocols, and multimethods drastically reduce the amount of coding that has to be done to accomplish a certain task. But I'm biased there ;-)


Well, don't get me wrong, I love Clojure and have been writing more Clj than Py the last few weeks.


If nothing else, lispreader.py will be useful for reading Clojure data in Python. Think communicating with nREPL and reading the forms that come back.
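
To give a feel for what such a reader does, here is a minimal sketch in Python. It is not clojure-py's lispreader.py: `read_string`, `Symbol`, and the helpers are hypothetical names, and it handles only integers, simple strings without escapes, symbols, lists, and vectors, reading a single form.

```python
import re
from collections import namedtuple

Symbol = namedtuple("Symbol", "name")

# One token: a delimiter, a double-quoted string (no escapes), or an atom.
# Commas count as whitespace, as they do in Clojure.
TOKEN = re.compile(r'[\s,]*(?:([()\[\]])|"([^"]*)"|([^\s,()\[\]"]+))')
CLOSERS = {"(": ")", "[": "]"}


class _Close(object):
    """Marker returned when a closing delimiter is read."""

    def __init__(self, char):
        self.char = char


def tokenize(text):
    text = text.rstrip(", \t\r\n")
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("cannot read at position %d" % pos)
        pos = match.end()
        yield match.groups()


def _read(tokens):
    try:
        delim, string, atom = next(tokens)
    except StopIteration:
        raise ValueError("unexpected end of input")
    if string is not None:
        return string
    if atom is not None:
        try:
            return int(atom)
        except ValueError:
            return Symbol(atom)
    if delim in CLOSERS:
        items = []
        while True:
            item = _read(tokens)
            if isinstance(item, _Close):
                if item.char != CLOSERS[delim]:
                    raise ValueError("mismatched delimiter %r" % item.char)
                break
            items.append(item)
        # Lists become tuples and vectors become Python lists in this sketch.
        return tuple(items) if delim == "(" else items
    return _Close(delim)


def read_string(text):
    """Read the first form in text and return it as Python data."""
    form = _read(tokenize(text))
    if isinstance(form, _Close):
        raise ValueError("unexpected %r" % form.char)
    return form


print(read_string('(reduce1 + [1 2 3] "ok")'))
```

A real reader also has to cover keywords, maps, sets, floats, string escapes, and reader macros, which is where most of the work is.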


I have to admit I'm not hip to all the newfangled doings with alternative Clojure targets. To me, though, Java interop is one of the core differentiators for Clojure from other langs aside from its lispness. But you're right, it's not required.


Really nice. I hope it will make Clojure scripts faster to start than with the JVM, and thus a real alternative as a scripting language.


ClojureScript scripts can already do this. :)


Are there many options for running ClojureScript scripts on a system outside of Node.js?


You want more command line JavaScript runtimes? There's Rhino, but it's slower than Sunday with your in-laws...

Truth is, you only need one, as long as it's a good one.


Does every ES3 browser count?



PyClojure (in the link) didn't actually take it to the extent we are. clojure-py goes so far as to implement all the standard collections in pure Python. So [1 2 3] in clojure-py is a PersistentVector, not a Python list.
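
To show what that difference means in practice, here is a simplified sketch of a persistent vector in Python. It is not the clojure-py implementation: it uses 4-way branching and no tail buffer to stay short (Clojure's vector uses 32-way branching), and the method names `conj`, `nth`, and `assoc` simply mirror the Clojure functions.

```python
BITS = 2            # 4-way branching keeps the example small
WIDTH = 1 << BITS
MASK = WIDTH - 1


def _new_path(level, value):
    """Build a one-element branch that is `level` bits tall."""
    node = (value,)
    while level > 0:
        node = (node,)
        level -= BITS
    return node


def _insert(node, level, i, value):
    """Return a copy of node with index i set, copying only one path."""
    idx = (i >> level) & MASK
    if level == 0:
        child = value
    elif idx < len(node):
        child = _insert(node[idx], level - BITS, i, value)
    else:
        child = _new_path(level - BITS, value)
    return node[:idx] + (child,) + node[idx + 1:]


class PersistentVector:
    """A toy immutable vector backed by a trie of tuples."""

    def __init__(self, count=0, shift=0, root=()):
        self.count = count
        self.shift = shift  # bits consumed by the levels above the leaves
        self.root = root

    def __len__(self):
        return self.count

    def nth(self, i):
        if not 0 <= i < self.count:
            raise IndexError(i)
        node, level = self.root, self.shift
        while level > 0:
            node = node[(i >> level) & MASK]
            level -= BITS
        return node[i & MASK]

    def conj(self, value):
        i = self.count
        if i == WIDTH << self.shift:
            # The root is full: grow the trie by one level.
            root = (self.root, _new_path(self.shift, value))
            return PersistentVector(i + 1, self.shift + BITS, root)
        root = _insert(self.root, self.shift, i, value)
        return PersistentVector(i + 1, self.shift, root)

    def assoc(self, i, value):
        if not 0 <= i < self.count:
            raise IndexError(i)
        root = _insert(self.root, self.shift, i, value)
        return PersistentVector(self.count, self.shift, root)


v1 = PersistentVector().conj(1).conj(2).conj(3)
v2 = v1.assoc(0, 99)
print([v1.nth(i) for i in range(len(v1))])  # the old version is untouched
print([v2.nth(i) for i in range(len(v2))])
```

Unlike a Python list, an update returns a new vector and leaves the old one valid, while the two versions share every subtree that was not on the modified path.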


Why? If you're going to rewrite the collections, why not in Clojure? Speed?


Mostly because we have to have these collections inside the compiler. So it's a bit hard to write collections in a language that has no compiler yet. Chicken-and-egg issue.


Well now that you have a compiler you can rewrite it in clojure-py to make clojure-py-in-clojure-py :)


This is awesome.


Please remember kids, language X implemented in language Y, probably means language X > language Y. (And if you question this, first implement a small, elegant language, and a large "pragmatic" one and then talk to me about it.)


Or language X > language Y in some domains, but language X is not practical in some other domains - see Lua/C[++]


What do you mean by "language X > language Y"?


It means that after casting language X and language Y to int values, language X is greater than language Y (that's my understanding).



