> A quick look at the code implies that various data structures are used under the hood for what looks like one data structure in the language. That's a lot of complexity! I'm not sure that's a tradeoff I'd be happy to make. It makes it harder to reason about performance.
I don't think that's true. A persistent array map is a single data structure. It's just a persistent data structure, optimized for non-destructive (functional) updates. At a high level it's not too different from finger trees, which may be more famous. When I used to write Clojure, I never felt any need to reason about its internals, for performance or any other reason. It simply feels natural.
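For example, an "update" is non-destructive: you get a new map back and the original is untouched.

    (def m  {:a 1})
    (def m2 (assoc m :b 2))
    m   ;; => {:a 1}        (unchanged)
    m2  ;; => {:a 1, :b 2}  (a new map, sharing structure with m)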
The actual details of how the underlying data structures work aren't that hard to grasp (see: https://youtu.be/wASCH_gPnDw?t=1817), but in practice you don't even need to go that far.
At a high level there are really just a few things you need to keep in mind, the biggest being the big-O guarantees of the core data structure interfaces, which are listed here: https://clojure.org/reference/data_structures. Those guarantees hold across the different underlying implementations, and there are plenty of things you can tune to get a 2x, 10x, 50x improvement in performance before resorting to anything that depends on which specific data structure is being used under the hood.
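One such knob, for instance, is transients: same interfaces and big-O guarantees, but a large constant-factor win for batch updates. A minimal sketch (not a benchmark):

    (defn build-persistent [n]
      (reduce #(assoc %1 %2 %2) {} (range n)))

    (defn build-transient [n]
      (persistent!
        (reduce #(assoc! %1 %2 %2) (transient {}) (range n))))

    ;; comparing (time (build-persistent 100000)) with
    ;; (time (build-transient 100000)) typically shows a severalfold difference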
Isn't it similar to what happens in other languages?
For example, in C++ std::maps are implemented as RB trees, std::vectors allocate 50% (sometimes 100%) more memory than requested, etc. All of that happens under the bonnet in order to fulfill the performance guarantees of the language.
IIRC, one of the most widely quoted memes in the Clojure space goes along the lines of "simple != easy". Or better: (not= :simple :easy)
Traditionally most C++ implementations use RB trees, but the standard doesn't require it to be so; all data structure requirements are described via big-O notation.
Kind of. If tomorrow someone comes up with funky zombie trees with the same big-O characteristics, or even better ones, they can ship them in their C++ standard library.
And then some clever C++ developers who have code that depends on RB-tree behaviour would cry that their cheese was moved and their code doesn't work as before, even though nothing prevented funky zombie trees from being shipped instead.
I think the author's referring to the multiple implementations of persistent maps (i.e. APersistentMap). Clojure's generated docs[1] say there are four implementations (PersistentArrayMap, PersistentHashMap, PersistentStructMap, PersistentTreeMap), and this Stack Overflow answer[2] explains that the latter two are rarely used and the former two make up the bulk of actual APersistentMap instances, with the language's core functions for persistent maps changing the implementation used under the hood as they see fit for performance (basically, using PersistentArrayMap if the number of keys is small, and PersistentHashMap otherwise). You can see this, for example, in PersistentArrayMap's implementation of assoc[3], which replaces itself with a PersistentHashMap once the map grows past the hardcoded threshold (HASHTABLE_THRESHOLD = 16 array slots, i.e. 8 key-value entries).
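You can watch the switch happen from the REPL; a quick sketch, assuming recent Clojure behavior:

    (type (reduce #(assoc %1 %2 %2) {} (range 8)))
    ;; => clojure.lang.PersistentArrayMap

    ;; one more key and the concrete type changes out from under you:
    (type (reduce #(assoc %1 %2 %2) {} (range 9)))
    ;; => clojure.lang.PersistentHashMap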
So like you say, your experience as an "end user" makes it seem like it's just one data structure at least in interface, though in implementation it's actually doing some performance optimization stuff which, I think I'm willing to grudgingly grant to the author, does make performance harder to reason about in principle. Though in my (very unserious) experience, I think I've enjoyed languages that take a possibly incomplete stab at getting performance right over ones that draw a line leaving it entirely out of scope and declare victory.
I'll qualify that a little and share that I have been bitten by these internals before, though the fault was mine: one time, I was unwittingly relying on an implementation detail of PersistentArrayMap, which is that it preserves insertion order. PersistentHashMap doesn't, and my code was mysteriously breaking when the number of keys grew past the promotion threshold, until several hours later I learned much of what I've just shared above.
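A minimal repro of that gotcha, assuming the same promotion behavior as above:

    (keys (reduce #(assoc %1 %2 %2) {} (range 8)))
    ;; => (0 1 2 3 4 5 6 7)          ; array map: insertion order preserved

    (keys (reduce #(assoc %1 %2 %2) {} (range 9)))
    ;; => e.g. (0 7 1 4 6 3 2 5 8)   ; hash map: no ordering guarantee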
> does make performance harder to reason about in principle.
The thing is, you don't reason about performance; you measure it.
Once you've obtained measurements, you can make decisions based on the outcome of said measurements.
If the runtime (and I'm not just talking about Clojure here) makes a decision affecting performance and it's suboptimal, your measurements _should_ reveal it. For example, by increasing your input size, the performance graph should show a clear inflection point where this happens.
The real question is whether you _can_ force the runtime to do something at the programmer's behest, rather than it being a hidden decision that you cannot control or change. I believe in Clojure you can choose the backing data structure, but I'm not sure if you can do it after the fact?
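For what it's worth, Clojure does let you pick the backing structure explicitly via the constructor functions, though as far as I know an array-map still promotes as it grows, so the choice doesn't stick "after the fact":

    (type (array-map :a 1))   ;; => clojure.lang.PersistentArrayMap
    (type (hash-map :a 1))    ;; => clojure.lang.PersistentHashMap
    (type (sorted-map :a 1))  ;; => clojure.lang.PersistentTreeMap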
> The thing is, you don't reason about performance; you measure it.
This is only half-true. You do reason about it first and profile later. For example, I want to know whether my assignment operator in C++ will run in constant time.
Recently I tried to jump into Scheme (I'm OK-ish in Clojure), and man, what a mission to install Scheme on Linux.
I first tried Gerbil Scheme, but Ubuntu seems to have the wrong version of Gambit (Gerbil Scheme runs on top of Gambit Scheme, for those who don't know).
That took a few hours of compiling, reading, and trying different versions, and it always ended in some deadlock issue.
Then I tried Guile, which was a little better and easier, but I soon got stumped when I wanted to access a MySQL DB. Yes, there are packages, but I couldn't install those packages for the life of me!
In the end, Chicken Scheme was a surprisingly easy and complete solution for a hit-the-ground-running Scheme.
Now, I'm probably a noob when it comes to Scheme, but not when it comes to fiddling with and massaging packages and programs on Linux. Trying to learn Scheme but spending a few hours just to get it installed was what killed it for me.
Clojure is certainly not perfect, but you can get up and running much faster, in my opinion. Oh and it uses square brackets for parameters :)
One (non?) problem with Scheme is that it has a long history of being a language implementation research device. This is why it is relatively easy to implement, and why there are so many implementations.
At least it didn't go the way beautiful Standard ML went!
Not sure why you went with Gerbil/Gambit. Gambit is usually marked as a hard-core Scheme, and you usually go with it if you need an excellent optimizing Scheme compiler without too much dancing around it (it has a minimal interface). For Guile, you usually rely on OS package management to get modules, or go with Guix.
But if you want a "download, unpack it, run" Scheme, you go with Racket [1]. Tons of documentation, a good-enough JIT compiler, and usually everything works. You get an amazing IDE with it (DrRacket) with many advanced features yet to be found in modern IDEs for popular languages.
> Oh and it uses square brackets for parameters
Most modern Scheme implementations make square brackets and parentheses equivalent. E.g.
    (let ((a 1)
          (b 2)) ...)
    ;; is the same as
    (let ([a 1]
          [b 2]) ...)
If you've gone with Chicken Scheme, I recommend installing it from source (it has very few dependencies, so it isn't a time-wasting process). The reason is that with the version you might get from the OS package manager, static libraries aren't shipped (at least that was the case in Fedora), and you won't be able to build statically linked executables.
The other upside of using Chicken is that its FFI interop with C is really streamlined. From one of my projects: in order to expose the `getpass` C stdlib function in my program, I only had to write the following line:
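    ;; something along these lines, via Chicken's foreign-lambda FFI:
    (define getpass (foreign-lambda c-string "getpass" c-string))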
Installing Clojure, because of its dependence on Java and the plethora of IDEs, is a major pain in my experience. I have tried several times and usually just give up because I find it difficult to get up and running with a simple project that allows me to write code. I don’t understand the seemingly complicated project systems.
It takes literally just a few minutes to download, install, and get going with Racket, Elixir, and F#. It’s the main reason I have not got into Clojure yet, despite trying.
Yep, your experience is (sadly) not even remotely unusual, and Cognitect have repeatedly demonstrated that they don’t care about this issue (which will come back to bite them as the community stagnates).
The “best” advice I have for Clojure beginners is to follow this guide: https://calva.io/get-started-with-clojure/, which will ultimately land you in a solid VSCode-based IDE environment for Clojure.
That’s not how I personally like to approach a new language, mind you (REPL from the command line plz), but I’ve pretty much given up trying to get Clojure beginners started there, as there are just too many moving parts that can go wrong and too many unjustifiable frictions.
I think it's just bad online tutorials and guides. It's not any more complicated in practice than any other language I've tried, maybe with the caveat that getting a REPL running and connected is more work than with an IDE based purely on static analysis.
Besides having the best mascot, just curious why you would start with Gerbil, especially as a newbie? Surprised you didn't just pick Racket.
Not an expert, but I believe you should use Guix for anything Guile-related.
> Oh and it uses square brackets for parameters :)
For many Lispers this is a negative, because they see the minimal (no) syntax value proposition of Lisp as a big positive. Besides, if you use Emacs you can easily (if you are a Lisper) make it do something like highlighting parameters, without introducing additional syntax in the underlying language.
Gambit Scheme also just works. That's what I use on FreeBSD, macOS (M1 and x86-64), Ubuntu, and Raspbian. I had a version running under Plan 9 as well, but that was a long time ago.
I don't think the person you're responding to is attempting to make a case against Clojure in general, just explaining why it's not something they are personally interested in.
To a degree. Interop is useful, but you can get by with most things without it. Long term you will learn interop, as it's oftentimes easier to write a small piece of code to use a Java lib directly rather than trusting community wrappers.
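For example, calling into java.time directly is a one-liner; a minimal sketch, no wrapper lib involved:

    (import '(java.time LocalDate))
    (.plusDays (LocalDate/now) 7)   ;; => the LocalDate one week from now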
The problem is that the stacktrace shows you all the gory JVM details that are irrelevant to your Clojure code. Even when your code is 100% Clojure, the Java classes that are the foundation for Clojure shine through when you get the stacktrace.
It's been a while since I last used Clojure, so I don't know if the situation has improved in the meantime.
For what it’s worth, it’s very easy to filter stack traces in most Clojure tooling. You don’t have to look at anything but your code, or just Clojure code etc. Obviously you’re still looking at Java exceptions for the most part, but I’ve never found being on top of the JVM as icky as some others obviously do.
Even such a filter is too often useless. Like, 4 out of 5 times. I rely more on logging and parameter inspection. E.g.:
    (defn foo [a b]
      (def a a)   ;; capture the locals as namespace-level vars...
      (def b b)   ;; ...so they can be inspected from the REPL later
      (/ a b))
If the function fails, I have captured its input parameters right before the exception occurred, and now I can check from the REPL whether 'b' is 0. From the REPL I can change the value of 'b' with a simple `(def b 42)`, or change the very definition of 'foo', re-evaluate it, and run it again with a simple `(foo a b)`.
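Concretely, a session might look something like this (hypothetical values):

    (foo 10 0)   ;; => ArithmeticException: Divide by zero
    a            ;; => 10   (captured by the inline def)
    b            ;; => 0    (there's the culprit)
    (def b 42)
    (foo a b)    ;; => 5/21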
That's not an IDE-agnostic approach.
IIRC, after modifying and re-evaluating the function the debugger is currently stopped in, the debug session is terminated and the evaluation context, i.e. the values of the input parameters, is lost. Even if the session can be restored, I'm not sure it survives a REPL restart. And even if it does, I'm not sure it survives a reboot, unlike my approach.
Moreover, this approach with defs allows you to inspect every function, even macro-generated ones. And if committed, such a "debug session" is persistent across every instance where the code is deployed: from any development machine, through all test and staging machines, right into production. (So when combined with some remote REPL access... you see what I mean?)
Another thing is that I like it when my code strongly expresses the notion of compositionality, i.e. when it looks something like this:
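    ;; (a sketch of the shape I mean)
    (->> x
         foo1
         ((fn [p] (def s1 p) p))   ;; comment in/out as needed
         foo2)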
Here I introduce and comment in/out the '(fn [p] ...)' expressions as needed. That kills two birds with one stone: 's1' captures the output of 'foo1', which is at the same time the input of 'foo2'.
FYI, you're writing this post on a device which is a product of millions of ideas and inventions, all of them refined, abstracted, and parametrized thousands of times, if not more.
My browser is a C++ program; say, 10 layers are typical. Then there's libc + UI Linux code, usually in C, so even fewer layers; say, 10 more. Then there are syscalls annoying the kernel from time to time; pretty shallow, btw, 5.
That's roughly 30. Maybe 40.
The Java + Clojure combo doesn't impress next to that, not at all.
I'm sorry but this as an argument is weak. Yes, it is true, the stacktraces are terrible, bordering on useless noise, but that doesn't mean that "exceptions & stacktraces" is the only possible approach when it comes to error handling.
Clojure marries the elegance of Lisp and functional programming with an extreme practicality. The practicality of it is precisely _why_ it is hosted on the JVM ecosystem.
There is a very common reaction to Clojure (also reflected throughout the original post): balking at its pragmatic decisions. But Clojure's pragmatism is precisely what makes it so significant and wonderful.
My issue isn't that. It's that every Clojure project I've seen has been at a Java shop, where eventually some Java dev decides it's too hard or they don't have the time to learn Clojure, and it just gets rewritten in Java. I've seen this happen with a few Scala projects as well.
That's true, but the other way around is also true: nobody would adopt Clojure at a professional shop if it weren't for an existing Java shop where some devs thought they'd rather use Clojure and started adding it progressively to their existing stack.
There are plenty of professional shops that aren't Java shops to begin with. There are also plenty of professional multi-language shops that aren't JVM shops.
On multiple measures Clojure is the most used Lisp, while also being the youngest, which I think speaks a lot to the adoption benefits of being on the JVM and on JS.
Maybe it'll bite it long term, hard to say, but it definitely allowed it to quickly catch up to CL and Scheme and pass them in popularity and adoption.
The JVM ecosystem pioneered a lot of what I see as awful anti-patterns in language development:
- Language-exclusive build chain: Not that Make is the be-all and end-all of build tools but it worked with everything without prejudice. To understand the basics of a build system, you only needed to know one tool. All attempts at fixing Make's flaws fail to understand this fundamental recipe for success. Now when you switch languages, you need to relearn that layer on top of the language itself. And the tools all do the same thing but with their own syntax, commands, and eccentricities.
- Language-exclusive packages and repositories. Once upon a time, you learned your OS's package manager and you were done. Now you need to learn a new package system for every language out there. Again with their own syntax, commands, versioning, and eccentricities.
- Needing language-aware parsers to determine dependencies. You can't depend on spatial locality or compiler-output dependency trees to know whether to rebuild, reinstall, etc. You had to parse out the org.whatever's and resolve that back to a jar or class, which was clumsy and mostly wrong.
- The OS doesn't know how to run your program. What's so hard about building your executable so it can be natively run by the OS? Everyone has to write the same `#!/bin/sh exec java -jar ...` wrapper script. Java programs worked best by assuming they were going to be pariahs hanging out in /opt, never actually integrating with anything.
- Runtime versioning and installation was painful. I disliked having to suddenly manage environment variables just to run something. It continues today with python2 vs python3, GOPATH, etc. On top of that, JREs weren't a simple "curl | bash", and no one was allowed to repackage the JRE to make it more user-friendly.
I'm not sure I agree here. Bazel / Make / Ninja still exist. Most of the tooling beyond that usually leverages some knowledge of the community's conventions as well as actual knowledge from the code itself (Cargo, for example, understands the module system in Rust). You kind of touch on this by mentioning "language-aware parsers" for dependency management, but this existed in C and C++ long before the JVM came along, so it's weird to place the blame there.
Make has many things wrong with it, but I wouldn't say that "one build tool to rule them all" is something it did well. Even with stuff like CMake or GNU Autotools, the final makefiles are almost exclusively used for C projects. I rarely see Make used in any other context, and I'm not convinced that's a result of language-specific tooling rather than some intrinsic property of Make itself.
> Language-exclusive packages and repositories
The proliferation of packaging systems is frustrating, but would you rather learn pip (PyPI) / cargo (crates.io) / npm (npmjs.com) across several OSes and platforms, or learn 5 different OS / platform package managers? I know for certain there's nothing like that on Windows!
This isn't the JVM's fault, and I think you're looking at it in the wrong order here. These language-specific package managers and repos are made to make the language run consistently in different environments. Most repository maintainers do not and cannot have the bandwidth to package everything under the sun for every programming language.
> The OS doesn't know how to run your program
Oof. The program you're running is the JVM, which takes a jar file as input and does some work as output. The OS knows how to run that just fine. Doesn't this criticism basically disqualify every single useful program that isn't a compiled C binary on Linux? ELF isn't that hard, but I'm also not sure we need to tie all our boats to that sail either.
If I'm trying to be charitable, I think you're experiencing a lot of frustration with your tooling and expecting something out of the JVM and programming in general that isn't necessarily true. You may just want to stick to C / C++ and tooling built 40+ years ago, but dismissing most of the tools that people use every day seems a bit "cut off your nose to spite your face."