Tour of our 250k line Clojure codebase (redplanetlabs.com)
373 points by grzm on June 3, 2021 | 227 comments



> Detecting an error when creating a record is much better than when using it later on, as during creation you have the context needed to debug the problem.

This is a great insight no matter what language or framework you're operating in. Laziness has its virtues, but invariants, validity checks, run-time type checks, etc. should all be performed as early (and often) as possible - it's much easier to debug issues when they are proximate to an object or structure being created or modified than when that data type is being used much later.


Yes, and this is where statically typed languages shine (in my opinion). I like programming in a style that makes heavy use of the type system to enforce this.

For example, when writing an API endpoint to create a task I would typically deserialise the JSON into a CreateTaskRequest. If the object is created without exceptions I can be sure it is valid. CreateTaskRequest implements the ToTask interface. The service layer takes only objects of this interface and converts them into a Task object that gets persisted. The persisted Task then gets converted into a TaskResponse so only valid JSON comes back out.
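
Roughly, that flow might look like this in Kotlin (names, fields, and validation rules are made up for illustration; the actual JSON (de)serialisation is left to whatever library you use):

    import java.time.LocalDate

    interface ToTask { fun toTask(): Task }

    // Deserialised straight from the request body; construction/conversion fails on invalid data.
    data class CreateTaskRequest(val title: String, val dueDate: String) : ToTask {
        init { require(title.isNotBlank()) { "title must not be blank" } }
        override fun toTask() = Task(title, LocalDate.parse(dueDate))
    }

    // The domain object the service layer persists.
    data class Task(val title: String, val dueDate: LocalDate)

    // The shape of the JSON that goes back out.
    data class TaskResponse(val title: String, val dueDate: String) {
        constructor(task: Task) : this(task.title, task.dueDate.toString())
    }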

Lots of classes and interfaces but they are all small and with a single purpose.


This is more or less how my teams' back end REST API server code is written as well.

It's far, far superior to the idiotic God classes most Java devs tend to write - the ones covered in attributes that are sometimes populated, other times not, and annotations for ten different purposes. Just no. Don't use the same single class to parse incoming requests, persist records at the innermost db layer, serialize the outgoing JSON, etc.

The sane way to do it is to have a dedicated class to parse incoming requests, another dedicated class that represents a validated and decorated record to be persisted (that doesn't have an id attribute, because it hasn't been persisted yet), another that represents a persisted record (with an id now, that is guaranteed to never be null), another that represents the outgoing response etc etc.
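
To sketch just the persisted vs not-yet-persisted distinction (Kotlin here for brevity; names are invented):

    // Validated and ready to be persisted - deliberately has no id at all.
    data class NewRecord(val name: String)

    // Returned by the persistence layer - the id exists and can never be null.
    data class PersistedRecord(val id: Long, val name: String)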

A common criticism of this style is that it's wordier, there are too many types, and it can be hard to follow if you're not used to it. The thing is, that's the price you pay for more accurately modelling what's actually going on at each step. The "use a single God class for everything" alternative is "easy to follow" only because it omits important differences that really are there at each step - they don't go away, they're just hidden.


I feel the same way. Much better to encode the complexity of what you're modeling in types instead of ignoring the inherent complexity. By precisely encoding the shape of the data in types you are forced to engage with the values it can represent, which leads to better understanding and better code in my experience - similar to how writing tests can lead to better code, except the feedback is much faster.


That's basically how most modern frameworks do it.

Django-Rest-Framework does that with view + serializer + validators.

A more recent example, with a leaner and cooler implementation, is FastAPI, where type hints are also used to declare validation on your endpoint, while a model is used for serializing the response.


> If the object is created without exceptions I can be sure it is valid.

Without some kind of custom validation system in place (JSON schema, property attributes, etc), that doesn't tell you much. With most serialization libraries I've used, the default settings would let you deserialize {} into any class without an exception and all the properties would just have the default value for their type. Using stricter settings gets you a little more, but definitely no guarantee of a logically valid state. If I want to go even one step past what little validation static typing provides, and I usually do, I'd rather just take the type noise completely out of my data and move all validation to a single place.


At the end of the day, when we're working with external data, no language is a silver bullet. But you can get pretty far by doing https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va... and yelling at your colleagues when they don't.

FWIW, this sort of thing can be done with a dynamic language, too. It's true that some static languages happen to be really far ahead of the curve with this sort of thing. But it's also true that some dynamic languages put you in a better position on this front than many of the most popular static languages do.

(And for those of us who really do love an intractable quagmire, there's always JavaScript.)

For my part, I tend to find this debate to be mostly a distraction, because the influence of the language's type discipline is quite small relative to the influence of the programmer's coding discipline. In the group I'm working with currently, the most committed fans of static typing tend to also be the ones who have the greatest tendency to assume, "If it compiles, it works," and proceed to check in glaring bugs. I don't bring that up in order to point fingers at static typing proponents (I tend to prefer static myself, though my preference is not particularly strong) so much as to point out that we should be wary of memes that subtly encourage us to become complacent.


That's an interesting perspective (and now that you mention it I can think of someone at work who loves types and also tends to break stuff). I myself love static types because it allows me to avoid certain classes of errors and I try my hardest not to introduce bugs. Static typing also greatly informs my workflow. I tend to practice type driven development to the extent possible so when I go back to dynamic languages I feel like I'm coding with one hand tied behind my back since I can't encode my assumptions in types and rely on the type checker to help me validate them.


Just as bad are those that check in code that works, but no longer makes logical sense when you read the code. The following is a simple case of what I'm talking about:

    var flag_is_unset = flag_is_set

I generally land more on the dynamic side of things, but there are certainly problem domains where I love types. The more closed and "mathy" the domain, the better I think types fit. I just wish it was less an all-or-nothing choice, and that people were less religious about it.


I keep wishing someone would take C#'s concept of dynamic references and run with it.

Optional static typing, like what you get in Typescript or MyPy, is interesting, but defaulting to dynamic and making static opt-in undermines a lot of the potential benefit of static typing. I don't think that it works the other way around, though. My hunch, which I am not particularly prepared to defend, is that, at least as long as you've got good type inference, defaulting to static and making dynamic opt-in does let you keep most of the practical benefits of dynamic typing.

I guess it's kind of like unsafe code blocks, also from C#: Pointers and weak typing can be very useful in some circumstances, but I'm generally much happier having the compiler try to guarantee as much as it can, and then sometimes be able to tell it, "Nah, hang on, I got this one."


> defaulting to dynamic and making static opt-in undermines a lot of the potential benefit of static typing. I don't think that it works the other way around

Yes, there's certainly truth to this. However, I write a lot of TypeScript and it provides an _exquisite_ on ramp in that you can add types gradually and telling the compiler not to worry about it can be super useful, especially if you're dealing with poorly behaved third party stubs that may have diverged from the actual implementation. The fact that you can use TypeScript as a super-duper linter _or_ try to maximally leverage its powerful type system is a huge strength. It's also very helpful for adoption. As frustrated as I get sometimes with TypeScript's unsoundness and the lack of pattern matching, the fact that I get to use it instead of JavaScript (and have also got half the company using it!) is a huge win.


Strongly depends on the language. If you have ML-style data types (eg ML, Haskell, OCaml, Rust) and some way to abstract types (ML structs, Haskell modules hiding type constructors, OCaml modules, Rust modules) then it is generally possible to design your data structures in such a way that invalid states cannot be represented. If a user has an optional first/last name but if one is specified then so is the other, your user type has a name of type (eg) Maybe (string, string) [1]. If your deserialisation framework just fills in default values for everything or lets you have unused fields then it is, in my opinion, broken. If you have a field that should be a positive number, you should have a type for a positive number that fails to deserialise if you give it a negative number.

[1] the caveat is that it somewhat sucks to change these restrictions and therefore the types. Compilers can hopefully make these refactors easier, but they may still suck. In languages like Clojure, you mostly have to try to write programs/tests to be resilient to any reasonable changes to the data structures that you can imagine.
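
For anyone who doesn't read ML, roughly the same idea in Kotlin (type names invented for the example): the name is either wholly absent or wholly present, and a non-positive amount simply cannot be constructed.

    // Either both parts of the name exist, or there is no name at all (User.name == null).
    data class FullName(val first: String, val last: String)

    // Construction fails for non-positive values, so a User can never hold one.
    data class PositiveAmount(val value: Long) {
        init { require(value > 0) { "amount must be positive, got $value" } }
    }

    data class User(val name: FullName?, val balance: PositiveAmount)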


That depends on the language and deserialisation library used. And that is also the reason why I have multiple classes.

My CreateTaskRequest class, which is used for deserialisation only, does not have defaults (unless intentional), so it throws when required values are not present in the source JSON. It also takes care of date format/UUID parsing. The output is always fully valid and typed, or it throws. It's a 1-to-1 mapping of what is described in the API docs.

The service layer that takes CreateTaskRequest and converts it to Task for persistence is where the business logic validation happens (valid foreign keys, date ranges, uniqueness checks, etc). Cleanly separated from deserialization.

For reference: I use Kotlin with Jackson which has great support for this.


You implement your own custom validation. The point is to encode the fact that you've validated a value in its type. So you have e.g. DeserializeJSON<Foo>(String) -> Foo, and then ValidateFoo(Foo) -> ValidatedFoo. And then all the business code works on ValidatedFoo.


I know the point, I'm a C# dev by trade, but if I'm going to implement validation that goes beyond static type checking, which is most of the time, I'd rather put it all in a single place and not have to deal with the type nuisance in every single line of code I write.


Types are a huge help for implementing validation in one place - by having separate types for non-validated and validated versions, you can get the compiler to ensure that every code path actually goes through validation. Static type checking isn't in opposition to custom validation code, it supports it.
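
A minimal sketch in Kotlin, with a hypothetical Foo: ValidatedFoo's constructor is private, so the only way to get one is through validate(), and any function that asks for a ValidatedFoo is therefore guaranteed by the compiler to sit downstream of validation.

    data class Foo(val email: String)

    class ValidatedFoo private constructor(val email: String) {
        companion object {
            // Returns null instead of throwing; the caller has to handle the failure path.
            fun validate(foo: Foo): ValidatedFoo? =
                if ("@" in foo.email) ValidatedFoo(foo.email) else null
        }
    }

    // Business code only ever sees data that went through validate().
    fun businessLogic(foo: ValidatedFoo) = println("processing ${foo.email}")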


Isn't this as far from Clojure's "it's just data" as it gets? The author was making the case for not doing this, ie. turning data into Java-style classes and interfaces.


> Lots of classes and interfaces but they are all small and with a single purpose.

That's the side effect. You ultimately end up with more code and not less even though it saves you from type checks.

There are trade offs for both.


One thing I wish for is some kind of "type tags". Being able to express concepts like List[Widget], List[Widget, Nonempty], List[Widget, Nonempty, Sorted], Vector[User, Sorted], etc. - or even, more generic, <Container>[<T>, NonEmpty] (where <Container> and <T> are parameters, like in C++ templates) - without implementing an explicit new type for each. Logic verification through typing would then involve not just changing "main" types, but also adding and dropping "tags" from the "set of tags" attached to the "main" type. This should cut down on the amount of boilerplate.

Hell, in the extreme, perhaps types in general could be generalized as a set of tags?

(See also, https://news.ycombinator.com/item?id=27168893)


This is quite similar to units (e.g. F# units of measure) or taints (Perl, for tracing user input and detecting unsafe usage).

A large part of the point of these systems is that you can write code which generalizes across the different subtypes (different units, tainted vs untainted values) yet it still passes through invariants by default (values calculated from tainted input are themselves tainted, a unit-quantified value multiplied by a scalar retains its unit, and so on).

I've done similar things with syntax trees in compilers. With multiple passes, you have a representation for the output of the parser, then you might do type annotation, rewrites due to coercion etc., symbol binding and overload resolution, constant expression evaluation, and so on in separate passes. If the input and output trees for each pass have different types, it's easier to keep track of what's going on, of what invariants hold for any given tree node, or indeed which node types are permissible to exist in the input or output of a pass (e.g. you don't want unbound overload calls after overload resolution).
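
As a rough illustration (Kotlin, with invented node types): the parsed tree refers to calls by name, the resolved tree can only refer to them by symbol, so an unresolved call cannot even be represented in the output of the resolution pass.

    sealed interface ParsedExpr
    data class ParsedLiteral(val value: Int) : ParsedExpr
    data class ParsedCall(val name: String, val args: List<ParsedExpr>) : ParsedExpr

    data class FunctionSymbol(val name: String, val arity: Int)

    sealed interface ResolvedExpr
    data class ResolvedLiteral(val value: Int) : ResolvedExpr
    data class ResolvedCall(val target: FunctionSymbol, val args: List<ResolvedExpr>) : ResolvedExpr

    // The resolution pass is the only place a name can turn into a symbol.
    fun resolve(expr: ParsedExpr, symbols: Map<String, FunctionSymbol>): ResolvedExpr = when (expr) {
        is ParsedLiteral -> ResolvedLiteral(expr.value)
        is ParsedCall -> ResolvedCall(
            symbols[expr.name] ?: error("unresolved call to ${expr.name}"),
            expr.args.map { resolve(it, symbols) }
        )
    }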

One problem is that tags probably aren't enough when you're dealing with data structures rather than simple zero-dimensional values. Some methods shouldn't be called, or some fields shouldn't be accessed, if a value has a type with the wrong tag. But if you have a wholly different type, then you can end up reimplementing a lot of the same logic, once per type, simply to get the types to flow through.

One solution I've used a couple of times for the tree problem is tree grammars. That is, a data structure which encodes a description of a valid tree and can encode invariants like nodes of type T need to have attributes of type Y with values satisfying predicate P, and between N and M children of type U, V or W, and so on. The grammar is defined in terms of an abstract supertype, and the concrete subtypes have the grammar specific to their phase baked in. This is a hybrid between static and dynamic typing, a compromise necessary for languages like Java without much expressiveness in the type domain.


When you have higher-kinded types you can build that kind of thing yourself. Dependent types go further in that direction. Take a look at Idris.


I will, thanks. I never had a chance to catch up on the "state of the art" in typing systems.

I'm not even sure if my "types as a set of tags" idea makes much sense - perhaps it decays to what is typically understood as types. Or perhaps it hits computability problems.

I did some mental experiments on a "set of tags" type system last night, and I quickly realized the complexity will be around deciding when to keep a tag (property) on a type, and when to drop it. With a set of tags being open, a function could not possibly know about them. This creates a problem.

Consider a function like: Sort([List[T], ...], [Function([T, T] -> Bool)]) -> [List[T], Sorted, ...]. Takes a list and a comparator, sorts the list, attaches a "Sorted" tag to return type, retains all other tags. Looks like a reasonable definition. Let's look at the use cases below (with P being some applicable predicate):

- Sort([List[Int]], P) -> [List[Int], Sorted] -- OK!

- Sort([List[Int], NonEmpty], P) -> [List[Int], Sorted, NonEmpty] -- Awesome, exactly what I want!

- Sort([List[Int], Foobar], P) -> ...? Should it be [List[Int], Sorted, Foobar]? But what if Foobar designates some business rule that depends on the element ordering? How is Sort supposed to know?

Without knowing a tag, we'd sometimes want a function to drop it, and other times want the function to retain it. I have no idea how to solve this at compile time (or even at runtime, without crippling the tagging system).


Would it be ok if you could define relations between tags in code integrating them?

    Contradicts sorted foobar
    Independent sorted nonempty
    Independent foobar nonempty
Additionally, there should be a tag that contradicts everything by default.


A language like Kotlin can do some of these things using delegates, interfaces and extension methods.

For example a MutableList<T> can be dropped down to a List<T> which is not mutable. And a generic conversion from Collection<T> to NonEmptyCollection<T> should be trivial to write as an extension method.
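
Something along these lines, I imagine (NonEmptyCollection isn't a stdlib type, so this is just a guess at its shape):

    // A wrapper that cannot be constructed empty, delegating everything else to the wrapped collection.
    class NonEmptyCollection<T>(private val items: Collection<T>) : Collection<T> by items {
        init { require(items.isNotEmpty()) { "collection must not be empty" } }
    }

    // The generic conversion: null signals "was empty", so the caller has to deal with it.
    fun <T> Collection<T>.toNonEmptyOrNull(): NonEmptyCollection<T>? =
        if (isEmpty()) null else NonEmptyCollection(this)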


Can Kotlin handle multiple "tags" on a type as a set, and not a sequence? I'm not familiar with the language, so I'll use a C++ analogy. If you tried to tag types in C++, you'd end up with something like:

  TaggedType<std::vector, NonEmpty, Sorted>
but such a type is strictly not the same as:

  TaggedType<std::vector, Sorted, NonEmpty>
What I mean by "set" instead of a "sequence" is to have the two lines above represent the same type, i.e. the order of tags should not matter.


You can maybe get there in kotlin with a generic type with multiple constraints in a where clause. Let’s say you have Sorted and NonEmpty as interfaces (could be empty marker interfaces so they behave like tags). Then you can write a method

  fun <T> doSomething(values: T) where T: Sorted, T: NonEmpty {}
And that function will take any type that has both Sorted and NonEmpty interfaces.


You don't need Kotlin to do this, even Java can do it:

    <T extends Sorted & NonEmpty> void doSomething(T values)


> There are trade offs for both

Exactly.

Personally I like that bit of extra code because it gives every class one reason to exist. There are no conflicts so the type system can be used fully without ambiguity.

I’ve always disliked the way early rails promoted fat models that combined serialisation, deserialisation, validation, persistence, querying and business logic in the same class.


I personally bounce back and forth about this. My experience is probably colored by the fact that I'm doing this in C++. Boilerplate gets annoying there (and attempts to cut it down tend to produce lots of incomprehensible function templates). I like the idea of using types to encode assertions at a fine granularity. I dislike the amount of tiny little functions this creates. I also dislike that the resulting code is only navigable with an IDE - otherwise you spend 50% of your time chasing definitions of these little types.


Ok yes, C++ might not be the greatest language for this.

My experience here is mostly from Kotlin which is a great language for this. Nullability, extension methods, (reified) generics, data classes, delegates, etc can all help reduce boilerplate.


You should take a look at Scala. Lots of things that have to be special-case language features in Kotlin become just a straightforward use of higher-kinded types or a combination of a couple of existing language features.


I think it is always important to understand that the data models and the approaches to modelling the world are totally different between Clojure and strongly typed languages. I’m going to ignore C++ and Java style languages because I think they give a Clojure-style model of the world without any of the benefits of a well-suited programming language or a type system that can enforce new invariants.

In the ML-family of language, you have a few key things:

1. Primitive types like ints, bools, strings, floats, arrays, not much else.

2. Product types which are records/triples of other types

3. Sum types which are proper tagged unions of types. Including things like optional or result (aka or-error) types, and also list (= nil or cons) types.

4. Abstract types which are types whose representations are hidden. You can have an abstract type called “hour_of_day” which is secretly backed by an int but which you can only interact with by using conversion functions or eg something that adds two values (mod 24).

5. Polymorphic types: you can have a list type which can be a list of units or a list of floats but not really a list of a mix of arbitrary different things.

The idea is to represent with these types a model of the world in such a way that only valid states of the world can be constructed. A user’s bank account balance isn’t an int, it’s a positive_dollars and if you try to do a transaction to make it negative, that isn’t possible as you can’t construct a suitable positive_dollars value. This can have annoying difficulties for maintainability because it is tedious to invert or change a one-to-one or one-to-many relation (eg previously 1 tax_number per person, now 1 person per uk_tax_number and one or two people per us_tax_number) and hard to represent with types a many-to-many relation like “every person has at least 1 bank account, and every bank account is associated with at least one person, and the bank accounts associated with person A have A amongst their associated persons.”

The promise is that the type system makes the practice and correctness of these refactorings easier.

In Clojure, the data model is more like:

1. There is a rich set of primitive, atomic types, eg strings, ints, but also dates and symbols and keywords and fractions and Uris and so on

2. There are collections like lists (of logically unalike objects), vectors (of logically alike objects), hash tables, and sets.

Data is built up out of e.g. hash tables of keywords to objects. The language has a rich set of features for acting on these types so one can do a lot with hash tables, whereas in an ML system only a few operations are available with a record type (eg constructing, reading fields, maybe updating them) and functions that do general things with any record can't really be written. Clojure is a language for manipulating data in general more than a framework for writing functions to manipulate your small, strict data types.

Clojure tries to model the world in a metarational way, accepting that it is unlikely that one can write a strict scheme capturing all and only valid states and instead programs should try to allow for the possibility of extra or missing information. You don’t want to care about whether you have a us_person or an eu_person so much as whether your data has a :person/preferred-name field. Issues with relations come up less because those relations are not trying to be forced into rigid types (they may be enforced by a database though—another cultural difference between Clojure and ML-family languages).

Fundamentally, I think the differences stem more from philosophies about modelling the world than type systems.


> Fundamentally, I think the differences stem more from philosophies about modelling the world than type systems.

It seems like people mistake the battleground for the countries. If you only saw the US and Japanese battles in early World War II, you'd think they were tiny nations of ships, soldiers and aircraft fighting over a handful of islands. It wouldn't be clear that there were full countries with big populations backing all this, and that they were thousands of miles apart.

So Rich Hickey gives a talk criticizing types, and a Haskeller responds with a blog post, and you think that types are the difference between the camps. But it's not. It's just that this is the spot that's close enough to home to reach, but contentious enough to fight over.


> Yes, and this is where statically typed languages shine (in my opinion).

Indeed. This feels like a just-before-the-moment-of-realization situation.

The endless cycle between "more dynamic" and "more static" continues it seems.

I wonder if there is any correlation between experience in the field and static vs. dynamic vs. "fail fast dynamic".

(I'd say Erlang falls in the latter category and it has a pretty good track record for reliability, but so does Python. It's an imperfect axis for sure.)


I like this design for making requests, but I don't like it for serving requests. Because if you don't just make every field a String in CreateTaskRequest, and you want to examine it in some way, then you're losing information. E.g. if you make a field an int, and they provide a non-int value, your CreateTaskRequest can't hold that value so it's just gonna have a 0.

A class full of strings feels like a code smell - but that is the proper representation of serving the request. And the alternative is needlessly restrictive. Ultimately it feels like pointless ceremony.


If a field is an int and the incoming data has a string the parsing shouldn't succeed - instead it should give an error like "expected an int but saw a string". Maybe include the string it saw in the error. Add in the JSON path for bonus debugging points.

You shouldn't get to the point of having a CreateTaskRequest with a 0 that shouldn't be there.


My concern with all-or-nothing validation of the sort you're suggesting is that sometimes the invalid data is data that isn't your concern and doesn't matter to the task you're trying to accomplish. It doesn't even have to be a mistake on the sender's side; maybe something evolved, maybe you made the mistake. I prefer defining the minimum I need for a specific function, instead of defining exactly what I expect. For much the same reasons as we scribble in the margins sometimes.


When encoding types for requests like this, I'm exactly as strict as I need to be: no more, no less. Any assumption about the shape of the data that my code will make is encoded in the parser, but the parser is very generous with everything else.

For example, if a request has a field I didn't expect, that's not an error, I simply ignore that field because I obviously didn't need it. Likewise if requests start using a more specific schema, such as now sending only positive integers. If, however, the schema changes such that a field I rely on is dropped or changes type, my code will eventually fail anyway as soon as it makes an assumption about the data. Why not fail at the point of entry instead of saving it for later?
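
A rough sketch of that policy (the Map<String, Any?> stands in for whatever your JSON library hands you; field names are invented): unknown keys are silently ignored, and only the fields this code actually relies on are checked, right at the point of entry.

    data class CreateTaskRequest(val title: String, val priority: Int)

    fun parseCreateTask(fields: Map<String, Any?>): CreateTaskRequest {
        // Extra keys in `fields` are simply never looked at.
        val title = fields["title"] as? String
            ?: error("expected a string field 'title'")
        val priority = (fields["priority"] as? Number)?.toInt()
            ?: error("expected a numeric field 'priority'")
        return CreateTaskRequest(title, priority)
    }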


That sounds like it might work. I was commenting based on my experience of people deciding there had to be a single representation for an idea/type no matter what it was used for, across multiple projects and repositories. I was not dismissing appropriate validation at a boundary, just trying to point out that, at least in my experience, people get way too wound up about enforcing unnecessary consistency.


I understand and have done that before with XyzRequest classes. What I'm saying is the field shouldn't be an int. It's not the proper level of abstraction for what it represents, and any of: tossing out, transforming, or hiding under layers of abstraction the data that comes into your system like this is a bad idea and is the sort of thing that makes people hate OOP. Especially when perfectly fine alternatives exist for this.


Interesting. I'd say the proper level of abstraction for the incoming JSON is some structure that can represent all valid JSON. Then you convert from that JSON thing into the form you expect - in this case a CreateTaskRequest (with int fields and whatnot) - which would be the proper level of abstraction for the code that deals with creating tasks.

Is that substantially different from what you suggest?


You're writing more code than you have to though. More code = more bugs.


Just for clarification: Do you equate type ascriptions to "more code"?


I don’t agree there. Most of these extra classes are just type declarations with no methods at all.

While the total LoC written might be higher, the amount of written logic in which you can introduce bugs is less.


I've caused myself a lot of problems by using/creating types that don't make sense. Coding myself into a corner. So I don't think it's fair to not count type declarations as code.


For any type A -> B operation, it's possible to fail - that is the whole point of parsing into type B, to catch when B's invariants would not be satisfied. The bug could be as simple as neglecting to handle that failure scenario.

Some languages make this less likely (Haskell, Rust) but most mainstream languages will happily let you introduce this bug.


> invariants, validity checks, run-time type checks, etc. should all be performed as early (and often) as possible

Having worked on numerous large-scale distributed systems, I strongly disagree (and I would think the author would too).

There is a tradeoff to strictness, which is coupling. You want to validate and type check as much as necessary, but not more. If you are over-coupled in a distributed system, it can make it difficult to change your system incrementally - and incremental change is exactly what you have to do with distributed systems. A canonical example of this is closed enums, which don't tolerate unknown values. Good luck adding a new enum value if every part of your system is strictly validating that. Each part of the system should validate what it needs in order to guarantee that that system can operate correctly, and no more.
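
To make the closed-enum example concrete, a hedged Kotlin sketch: the closed Status enum rejects any value it has never seen, so every consumer has to be redeployed before a new value can appear on the wire, whereas the sealed "open" variant keeps the raw value around so intermediate services can pass it through untouched.

    enum class Status { ACTIVE, SUSPENDED }

    // Open representation: unknown values survive instead of blowing up.
    sealed interface StatusValue
    data class Known(val status: Status) : StatusValue
    data class Unknown(val raw: String) : StatusValue

    fun parseStatus(raw: String): StatusValue =
        Status.values().firstOrNull { it.name == raw }?.let(::Known) ?: Unknown(raw)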


It's not a matter of distributed systems, but a matter of systems where only some parts are updated in prod at the same time. You could easily imagine a huge release where the whole codebase gets pushed to prod at the same time. Then the whole issue of versioning APIs disappears. However, it would require more discipline (probably unachievable at Google scale), so most companies prefer the slowly-burning garbage fire of versioned APIs and backward/forward compatibility of messages.


While atomic big-bang releases have their benefits (and drawbacks), I don't think there's any way to avoid dealing with data backward compatibility in at least some way. Old versions of data always exist in your system (in databases, message queues, replay logs, caches, retry loops in external parties, even on the wire at the time of the release in some cases). While explicit and long term API versioning may not be needed in some release processes, a strategy for coping with old data becomes necessary past a certain scale. Migrating all extant data (not just your RDBMS) at the same time as the big-bang release is not practical.


> Each part of the system should validate what it needs

Easier said than done. Each part of the system can't know what it needs, since you've just added a new thing it might need to contend with.

The point of closed enums is to force these checks, even if all you're doing is adding a new fall through to ignore it, since at least that means you've considered it.


> Each part of the system can't know what it needs,

Yeah it can. It needs what it needed before.

If you have a system that is shipping stuff to customers, it doesn't have to care about the new childhoodPetName field that you've added for a different part of the system.

This is especially relevant if you handle any kind of description of the real world, e.g. in robotics or medicine, and have multiple distributed systems and codebases all communicating with each other.

The inability to gracefully ignore cases that don't fit the expected data model is a weakness most contemporary programming languages and type systems share.


If the field is childhoodPetName and other case labels are e.g. mothersMaidenName, and the common operation is to perform input sanitization, then this new field would have introduced a security hole if it's just ignored.

It's a matter of correctness enforced at the language level, and why Rust's match statement enforces exhaustive checks at compile time by default.


I agree this is where enum coupling is beneficial. What I'm saying is that, if those fields are later used for other things in your system (like persisting to a DB or running reports or something), you likely don't want to be re-validating the strings as closed enums, that's probably over-coupled.

I don't think there's a silver bullet here, or some rule of thumb that can save us from these schema issues. It's messy and imperfect; we have static type systems that can help us locally but they are not global silver bullets. Unfortunately we still have to use our judgment to find the right balance, because the "perfect" solution requires knowledge of the future.


How would it have introduced a security hole?

By definition the field is ignored by your program, therefore even if there is any malicious text in there, e.g. raw html, the system in question can't do anything harmful with it as it is ignored.

Sanitisation occurs at the type level, so if the field contains data of the type DangerousRawHTML, and there is a component in your system that just blindly renders any field besides those of type SanitisedHTML, then that is the location of your vulnerability. A type system like Rust's would also catch this.
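
In Kotlin terms (types hypothetical), the point is simply that the renderer's signature only accepts sanitised input, so no amount of ignored fields upstream can sneak raw HTML into it:

    data class DangerousRawHtml(val raw: String)
    data class SanitisedHtml(val html: String)

    // Escapes angle brackets; a real implementation would do much more.
    fun sanitise(input: DangerousRawHtml): SanitisedHtml =
        SanitisedHtml(input.raw.replace("<", "&lt;").replace(">", "&gt;"))

    // Accepts only SanitisedHtml, so raw input cannot reach it by accident.
    fun render(html: SanitisedHtml): String = html.html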

Don't conflate "correct rust program semantics" with "correct real-world model semantics".


> One of the coolest parts of our codebase is the new general purpose language at its foundation. Though the semantics of the language are substantially different than Clojure, it’s defined entirely within Clojure using macros to express the differing behavior. It compiles directly to bytecode using the ASM library. The rest of our system is built using both this language and vanilla Clojure, interoperating seamlessly.

Actually this sounds quite horrible.


JSX would be an (early) example of this class of language. XML is equivalent to s-expressions. So React.js (drunkly) is just a reactive language embedded in a normal language with seamless ability to hop between them. I elaborate about this equivalence here https://www.reddit.com/r/Clojure/comments/mavg81/the_distinc... . The brilliance is they did this so smoothly that nobody noticed, and with dynamic types, and without the typed people even noticing!


The elephant in the room is Babel, which is essentially a macro processor for JavaScript.


React is not a "reactive language", and JSX is an abomination, so I'm not sure this is a rebuttal to the GP's "this sounds horrible" point.


Imagine starting a job as a Clojure dev and being told you have to learn a new language with substantially different semantics.


Everyone is doing it already, to a large extent. Where "an API" ends and "a programming language" starts is a matter of opinion. It's a fuzzy boundary.


I think it really comes down to how well the new semantics fit the problem domain. If it's a much better fit than the alternative, it's a net win. But just because you can define new semantics, doesn't make it a good idea.

This is Nathan Marz we're talking about though; he's on the short list of people who can both accurately identify the need and create something workable to fill it. If he told me it was necessary for a particular problem domain, I (personally) would think long and hard before I disagreed.


Different semantics for different folks. For me, this sounds interesting. I'll defer judgment until I learn more.


I know people who work at RPL, am 1000% confident that anyone who joins knows _exactly_ what they are getting into (and this kind of crazy Nathan Marz rabbit hole is exactly why they signed up).


> Actually this sounds quite horrible.

You understand the use case well enough to criticize it?


I can only guess regarding the use case. I can still make the statement that I personally do not think that the coolest part of a codebase should be a new general-purpose language at its foundation. To me this reeks of not-invented-here syndrome, the inner-platform effect and KISS "violations". IMO when you provide new tooling you want your developers to be able to stand on the shoulders of giants and let them choose the fights they want to fight. Idiomatic programming is very helpful in this regard. If programming new languages with macros is idiomatic in Clojure, so be it, but I doubt that.


None of the smells you mentioned are anywhere close to absolutes.


Of course, that is the nature of the concept.


I have dabbled in Clojure and Julia, and both of them support macros to make it easier to develop DSL-looking syntax. I have always wondered how difficult it is to debug macro-heavy code - whereas other languages, such as Zig, cite 'no macros' as a feature.


> I have always wondered how difficult it is to debug macro-heavy code

It isn't much harder than regular code. `macroexpand' is your friend! Particularly when wrapped by your IDE[0] into a "macrostepper" tool, which lets you macroexpand into an overlay[1], step by step.

Ultimately, macros are just functions - if slightly peculiar ones (they receive unevaluated arguments, and their return values are treated as if they were the code at the point of invocation). You can unit-test them just like any other function.

That's not to say there aren't macros that are very tough to debug. But the problem isn't macro-heavy code, as in code containing a lot of macros. The problem is heavy macros - monstrosities with complex internal logic, expanding their inputs into large amounts of generated code, hidden state, etc. Complex macros like these take skill to write[2], but they're also rare to see.

--

[0] - By which I mean Emacs, though I guess there are Clojure IDEs now too.

[1] - I.e. a read-only view replacing the code you're macroexpanding on screen.

[2] - First rule: minimize the amount of code within the actual macro definition, move all logic into functions to be called by the macro, and unit test those extensively.


In my experience with Julia it's no easier or harder than any other code. Macros tend to be pretty transparent, and it's easy to see what code they generate, although I've never had to do so. Most of the time, a library with a macro-based DSL (e.g. JuMP) has a function-based DSL underlying it that's just more verbose.


I kind of like Rust's version: macros are limited and clunky (and really clunky if you want to do anything super fancy), which means they're available, but I only end up using them when they're really necessary (like I'd be copying and pasting huge amounts of code and there's no other way to avoid doing that)


I concur. I think it's nuts when companies do this. There's another well-known SaaS that basically invented their own language but I can't recall who. It's the ultimate ego food for the lead engineer imho.

What a company needs is good leadership at every level, not unique tools, and not "10x engineers".



oh yeh that's it!

a complete head case, nevertheless he's a million times more successful than I so perhaps I'm the one who's nuts.


> I think it's nuts when companies do this.

This can go either way. Yes, it can be abused. DSLs can be very useful and powerful.

In any case, I think it is better to not to rush to judgment for this particular case.


I would claim in most cases it is abused. Besides that, I criticize the sentiment, not the execution.


I thought the same. Clojure is already a niche language, but then you go and niche the niche by implementing another language on top.

Could work out if it's essential to offering a 10x service, but I doubt the multiplier is high enough to warrant the engineering cost of developing and maintaining another language on top of Clojure.

Also, there would be many layers to get to the machine code (custom lang -> Clojure -> bytecode -> JIT -> machine code); would this impact performance in a way that makes the program too slow and in need of rewriting?


If you disagree please explain your perspective...


This is what I imagine experienced Clojure developers can squeeze out of a language like Clojure. I would venture that they can train a junior programmer in a couple of weeks, and make them productive very fast.

I guess they could make it work in any language but judging by the description clojure is indeed a great fit, due to the macro capabilities, flexibility and solid runtime via JVM.


Yeah, it's nice to see such thoughtful adoption of language facilities clearly oriented towards creating a successful and maintainable codebase.

The Clojure team has always done a nice job expressing the rationale for specific language features, and these rationales often lean towards solving problems that system designers historically faced.

Oftentimes, I think folks think of modern languages in regard to their syntax, tooling, ergonomics, etc; however, to me, the more interesting benefit in adopting a modern language is in how its inbuilt features address design problems that earlier generation languages exposed.

For a real world example of what I'm talking about, you can google "clojure expression problem" and find compelling articles about how Clojure solves this with protocols.

Providing a toolkit for attacking categories of problems inherently gets people focused on the fact that these problems exist in the first place when they may not even recognize them otherwise, and regardless of the choice of language, it leads to better design oriented thinking in the context of larger and more complex systems.


Clojure doesn't solve the expression problem and the expression problem tends to be trivial in dynamically typed languages.

Strictly speaking the expression problem is only defined for _static_ type systems because it's concerned with _static_ type safety and extensions without recompilation.


It does solve it and doing so was a key design goal of protocols. It's right there in the docs:

There are several motivations for protocols:

Avoid the 'expression problem' by allowing independent extension of the set of types, protocols, and implementations of protocols on types, by different parties

I agree that it's a concern in static type systems, but the same issue rears its head when defining methods which operate on specific types of strongly typed data, so no, it's not always trivial, nor is it a problem exclusive to statically typed languages, and if it was, there wouldn't be a need to introduce a new abstraction for which addressing the problem is a key goal.


> would venture that they can train a junior programmer in a couple of weeks, and make them productive very fast

I've got some experience with functional languages, and a good amount with functional features in OO languages. Started learning Clojure this week, and was pleasantly surprised with how quickly I could get working projects up and running. Less upfront learning curve than Elixir, which was unexpected.


I've found I typically reach for Clojure when I need to do something on the JVM and want a better Java than Java.


Clojure's value prop is:

1) default immutability (same simple data structures used in every library -> ecosystem composes better)

2) portable code across multiple host platforms (jvm, node, browser)

3) metaprogramming

IMO, in 2007, immutability on the JVM was a competitive value prop, but in 2021+ it is nothing special. It is the combination of the three things which is a competitive value prop today. Metaprogramming in particular is a mostly unexplored frontier, because it is very hard to do well, and very easy to do badly. Default immutability is kind of a necessary starting point to do metaprogramming well.


> in 2007, immutability on the JVM was a competitive value prop, but in 2021+ it is nothing special

Can you elaborate on this? What do you think has changed in that time to make it "nothing special"?


immutability as a library is available in basically all mainstream languages now and mainstream frameworks leverage it (react, any UI framework, spark, any data framework or database); JS vms are competitive with the JVM and the JVM might even be losing ground in the cloud; typescript is a monster and is letting people explore haskell concepts in industry applications; scala is way better in 2021 than it was in 2011; every new PL can compile to JS and supports immutability. Clojure's sweet spot is currently developing sub-languages embedded in Clojure like RPL is doing (we are doing exactly the same thing at hyperfiddle). That is still too hard to do in typescript imo. Maybe it is possible in scala 3 but it took them 10 years of pure autism to figure out how to do monadic IO in scala due to how complex scala is, so i'd expect it to take another 10 years to figure out how to do metaprogramming in a commercially viable way, I'd be happy to be proven wrong.


"10 years of pure autism" -- This made me smile.


As far as the JVM goes only Clojure and Scala have real immutability built-in, ie. persistent data structures. Kotlin's default immutability is more like read-only.


I think Clojure is still a great choice when your concurrent solution to a problem is amenable to working with stale data. If it is then the correctness of your design will rely heavily upon immutability, and the bugs that can be introduced by not using an immutable-first language like Clojure just isn't worth the risk.


I think that's also why Clojure use peaked right before Java 8 was released; once Java became a better Java, and libraries started to be designed around more ergonomic APIs than wiring up objects that needed miles of stateful configuration code, the pressure that drove you out of Java and into Clojure began to diminish.


Hardly. Despite being Java-compatible Clojure's philosophy is diametrically opposed to that of Java/OOP so anyone who has experienced the benefits of Clojure would hardly return to Java simply because it managed to fake FP with Single Abstract Methods.


Well, maybe there were not as many die-hard lispers within the Clojure community and many were looking for something that handled better than Java 7. I know two other former Clojure programmers that went to Scala, but also find Java passable enough to just use it rather than Clojure.


Interesting take since Clojure and Java are two very different languages. And unlike for example Kotlin, Clojure does not try to be a better Java than Java. But true though Clojure leverages the power of the jvm.


It was a better Java in many ways.

The doto macro was a relief back in the days when Java APIs insisted on being designed around stateful setters and getters rather than the builder pattern; it allowed you to operate on such unfortunately designed objects in single logical blocks, simulating it all being one expression.

Proxy allowed you to instantiate anonymous inner classes implementing only the methods of an interface that you needed, the rest you could omit; in Java you have to put them all, empty, which necessitated that you use an IDE to generate them, and back then IDEs were not as nice as today.

Those two alone made interacting with contemporary Java libraries so much easier.

It also was convenient to go from Java collections to Clojure collections and vice versa.


" We were not out to win over the Lisp programmers; we were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp."

-- Guy Steele


Interop from Clojure -> Java is still incredibly easy, though. I often will just wrap a Java library rather than look for a pre-existing Clojure implementation because it's so easy. Of course, most Java libs don't value immutability and functional patterns, but you can still push this kind of interop to the edges of your system and keep everything else in pure Clojure.


Depends on the style the Java library you are trying to use is written in. I have had mixed results. Especially the Java 8 functional style had some challenges. I might have the relevant SO question somewhere.


Basically you get a new version of a Lisp Machine, that you can sneak up in lots of places, whereas with Common Lisp it isn't so easy to do so.


What I heard from colleagues that work with Clojure is that it is a horrible language where the default way of writing code is an imperative programming style where contexts are passed around and updated. Far from the concepts of functional programming.


What is this? Are you being serious?

I've worked in multiple Clojure shops and it has always been amazing.

All the core code and business logic etc. (kernel) is implemented functionally, and the interface with the outside world (shell) is implemented imperatively.

Clojure being a horrible language is a really hot take, and it being based on hearsay makes your comment look like it's not in good faith.


As with all functional programming languages, if you limit the use to some specific areas you can get a lot done with a few easily understood lines of code.


Quite the opposite. Functional style is especially useful in larger codebases. However, I think a functional strongly typed language is often easier to get right than a weakly typed one. I mostly write F# and Clojure and based on my experience I would go for F# any day over Clojure but at the same time I would also go for Clojure over Java as well.

I do not know where your views are coming from, sounds like 2nd hand experience rather than 1st one.


As I stated in the beginning, I do not work with Clojure. Pure functions are something I use in any programming language. I do have hands-on experience with a lot of different functional languages though, including F#. From what I understand it is very common to use a context-based style of coding in Clojure. That is at least what I heard, and Google does not seem to disagree...


Please be specific with your links and sources.


I’m not sure I understand what you mean. I don’t consider Clojure to be imperative at all — everything is immutable by default, and you actually have to go through considerable efforts to write things in an imperative style.

When I compare Clojure to another functional language I know well, Haskell, one of the things I really feel it lacks is proper pattern matching and currying; yes there are libraries that you can use, but it’s just not idiomatic to do in Clojure. I would not, however, assert that it’s an imperative language.

Could you care to elaborate on what exactly you find is lacking in Clojure?


It's not difficult to write imperative code in Clojure; [1] is an example. Immutability-by-default makes you work to mutate things, for sure, but that in itself doesn't make the language inherently functional.

[1] https://clojuredocs.org/clojure.core/let#example-542692c7c02...


Your link might be pointing to the wrong example. I assume you wanted to show an example of an atom.


Nope. The sequence of statements in the (let) block is imperative. Atoms are actually an example of how the values of references are updated functionally. Clojure has facilities for both imperative and functional style.


Lexical shadowing isn't an imperative thing. Imagine the form:

    (let [a 10
          a (+ a a)
          a (+ a a a)]
      a)
as being like so:

    (let [a 10]
      (let [a (+ a a)]
        (let [a (+ a a a)]
          a)))
Which you can now re-imagine as the functional composition:

    (+ (+ 10 10) (+ 10 10) (+ 10 10))
So it has evaluation semantics which allow you to reduce it as described in lambda calculus using α-conversion and β-reduction.

And this is different to the imperative style, because in imperative you can do:

    int a = 10;
    a = ++a + ++a;
Where `a` is now equal to 23, but in Clojure:

    (let [a 10
          a (+ (inc a) (inc a))]
      a)
You now have `a` equal to 22 instead.

This is the difference between the variable substitution evaluation semantic of lambda calculus and the imperative semantics.


Nice strawman. Now, how would you restructure the let bindings (without the println) at https://clojuredocs.org/clojure.core/let#example-542692c7c02... equivalently? Wouldn't the ordering of the function executions matter? Could you re-order the bindings arbitrarily and get the same result? I don't think you can. That makes it imperative-as-opposed-to-declarative, even though it may not be imperative-as-a-synonym-for-side-effecting.


You do the same thing:

    (let [a (take 5 (range))]
      (let [{:keys [b c d] :or {d 10 b 20 c 30}} {:c 50 :d 100}]
        (let [[e f g & h] ["a" "b" "c" "d" "e"]]
          (let [_ (println "I was here!")]
            (let [foo 12]
              (let [bar (+ foo 100)]
                [a b c d e f g h foo bar]))))))
And now you can simply evaluate it using variable substitution again.

The difference is that when you say `(let [a 10])` the variable `a` is a constant now; you can effectively replace all occurrences of it within the scope by `10`, and you can do this at compile time.

That's why people say the symbol `a` is bound to the value 10, and not the variable `a` points to the value 10.

Effectively you cannot change `a` within the scope anymore, all use of `a` in that scope will be equal, and so within that scope you can change their ordering freely.

What I think is confusing you here is that `let` gives you syntax sugar so all the scopes are flattened, but each new binding pair is actually inside a nested scope, and so `a` is still immutable. In this case, it matters almost never, but as I showed, it still can, because in an imperative language you could still try to mutate the variable even in the same expression, and that is impossible to do in Clojure.


> each new binding pair is actually inside a nested scope

Can you point me to something that demonstrates this? I've both decompiled this form and traced its compilation:

  (defn foo [a]
    (let [b a
          b (+ b a)]
      b))
and it does create two different symbols with a name of "b", which I get has the same net effect of not mutating the original "b", but I would not consider that "nested scope". It feels more like SSA to me, which is typically quite sequential in nature, which is what I associate with the term "imperative".


Ok, came up with another example. See how let:

    (let [a 1
          a (+ (inc a) (inc a))
          _ (println a)
          a (+ a a)]
     a)
Can be rewritten simply as a nested composition of anonymous functions:

    ((fn[a]
      ((fn[a]
        ((fn[_]
          ((fn[a]
            a)
           (+ a a)))
         (println a)))
       (+ (inc a) (inc a))))
     1)
The latter representation you would consider functional programming no? Well the let is just syntactic sugar for it and can be rewritten in terms of only composed anonymous functions (aka lambdas).

And like I said previously, you can reduce it with variable substitution, which gets rid completely of all the variables, at compile time, and the evaluation will give the same result:

    (do
     (println (+ (inc 1) (inc 1)))
     (+ (+ (inc 1) (inc 1))
        (+ (inc 1) (inc 1))))
From the let form, or from its corresponding anonymous function form:

    ((fn[_]  
      ((fn[]
        (+ (+ (inc 1) (inc 1))
           (+ (inc 1) (inc 1))))))        
     (println (+ (inc 1) (inc 1))))
In this case you see more clearly that the side-effect causes impurity, since it can't really be reduced, it's only in this case that order dependence matters, and so we can't eliminate the wrapping function, and this relies purely on Clojure's left to right argument evaluation ordering which allows you to mix/match side-effects within pure functions with predictable effect timing.

So here's what happens: you reduce that function into a do-block (which is Clojure's imperative form):

    (do
     (println (+ (inc 1) (inc 1)))
     ((fn[]
       (+ (+ (inc 1) (inc 1))
          (+ (inc 1) (inc 1)))))) 
And now you can further reduce the pure parts, which takes us back to what we had when we reduced the let:

    (do
     (println (+ (inc 1) (inc 1)))
     (+ (+ (inc 1) (inc 1))
        (+ (inc 1) (inc 1))))
And finally this can be reduced further:

    (do
     (println (+ 2 2))
     (+ (+ 2 2)
        (+ 2 2)))

    (do
     (println 4)
     (+ 4
        4))
To the fully reduced form:

    (do
     (println 4)
     8)
This reduction can all happen in parallel or in any order, and the result will always be the same.

Ok, and this last bit is very important: this is what people mean when they say that in functional programming the order of execution doesn't matter. The side-effects must still be sequenced in their correct order, but all the computation can happen in arbitrary order, because the computation doesn't rely on a sequence of instructions like it does in the imperative programming paradigm. Instead it relies on this "reduction" process I described, where, as you saw, you are free to reduce each part in whatever order you want - you'll always end up with the same thing in the end.


Thanks for your willingness to engage and instruct on this. I've learned a lot. Cheers!


It helps me to have my understandings challenged, I could have been wrong, and going through trying to explain myself helps with me better understanding things too, so cheers to you as well!


Maybe this:

    (let [a 10
          a (+ (let [a 20] a) a)]
     a)
The thing is, you can't go by the resulting compilation; there is a correspondence between CPS and SSA, and SSA lets the JVM better optimize things.

Let me try to better explain myself. For me, the qualification here is that the form exhibits semantics that are compatible with functional programming, and can be evaluated using a lambda calculus computational model. Those semantics are in turn incompatible with the imperative ones, since they prevent you from doing some things that imperative semantics would allow, as I demonstrated with my prior example.

The let form does dictate a series of expressions to be executed sequentially from top to bottom, but the binding of their result to a name is done functionally, not imperatively.

This still models a data-flow, and the fact it lets you shadow prior names doesn't change the functional nature of it.

The form simply expresses a composition of functions and how their inputs/outputs connects.

That means it doesn't actually require any separate mutable memory location, even if the Clojure implementation for it might use some.

That's inherently the conceptual model of a functional computation, and it is not that of an imperative one.

    (let [a 1
          b 2
          a (+ a b)]
     a)
Models a data-flow that you can think of as a multitree:

      1   2
       \ /
       (+)
        |
        3
What that means is you never need the variables; they are simply a syntactic convenience. In fact, that's why they are not called variables but bindings: they are simply syntactic labels, and there is no real requirement to have a separate memory location that sequential instructions are allowed to mutate. The way I see it, that fact makes it functional, not imperative.

If you look at the multitree, at each level you are free to evaluate the nodes in any order, but if there is an input/output dependency, then you must respect that ordering.

At the end of the day, we could argue forever that we just have different taxonomies. I consider imperative programming the computational model where you give instructions in sequence which dictates mutations on memory locations.

I consider functional programming the computational model where you declare a composition of expressions that can be reduced using variable substitutions.

With this in mind, let qualifies as a functional construct, because it defines such a composition and can be reduced through substitution.

Now in practice, the Clojure let form reduces this sequentially: left to right within each form, and top to bottom between each pair of bindings. And since Clojure allows you to mix in imperative code anywhere, you can have a println in it and know that it'll run after all that came before and before all that comes after. But semantically it is still functional. And that's why you can't modify the bindings: they do not conceptually represent memory locations which you can mutate.
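
A tiny sketch of what I mean (the values are arbitrary): the bindings evaluate top to bottom, so any side effects are sequenced, but each name still just labels the value of a pure expression:

    (let [x (do (println "first") 1)
          y (do (println "second") (+ x 1))]
      (+ x y))
    ;; prints "first" then "second", and evaluates to 3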


Also, I don't think your (incorrect / incomplete) transliteration of the C code example to Clojure demonstrates anything we don't already know; namely that `(inc a)` in Clojure has different semantics from `++a` in C. Of course the Clojure expression you wrote has a different semantic interpretation from the C expression you wrote; they're different expressions. The fact that `++a` is side-effecting doesn't make Clojure code that depends on sequencing not imperative (imperative-as-opposed-to-declarative, that is).


The difference is that it is not possible to modify the value of `a` in the scope that it is bound to. Try as you want, but you won't be able to write an `inc` that modifies the value of `a` within that scope.
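
A quick demonstration (the values are arbitrary): shadowing in a nested scope never touches the outer binding:

    (let [a 10]
      (let [a (inc a)]
        (println a))  ;; prints 11
      a)              ;; the whole form still evaluates to 10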

The declarative nature is that you're declaring that you want `a` to mean 10 within some scope. That's why you say "let `a` be 10 in current scope", that's your intention here, for `a` to be 10.

After you've declared that, it holds true no matter what.

Whereas in the imperative style you say: put 10 at place `a`. This is variable assignment; there's a place which is referred to as `a`, and inside that place you can set values and change them at will.

At any point you can instruct the language to change what value is at place `a`, there are no restrictions to the instructions you can give.

It's a bit of an oversimplification to claim that imperative programming is distinguished from declarative programming by assuming a lack of ordering in the latter.

Functional programming is not prevented from expressing order, rather it is less able to express random accidental order at the operational semantics level.

At the end of the day, it's very much about the computational model: imperative is based on the Turing machine model, and functional on the lambda calculus. Because the latter doesn't depend on a global mutable running state, it is said to be declarative, in the sense that what you see is what you get; you don't need to keep track of what memory currently holds to proceed to the next step.

That said, you're right that Clojure supports some forms of imperative programming as well, but let isn't one of them; your parent commenter pointing to atoms was more on point. That's what you'd need to do to get back a more imperative `let`:

    (let [a (atom 10)
          a (+ (swap! a inc) (swap! a inc))]
     a) 
 
This will now give you 23: the first swap! returns 11, the second returns 12, and the second binding names their sum.


"The sequence of statements in the (let) block is imperative."

The println in the example is imperative, as it has side-effects, but a let block having an ordering of bindings is not inherently imperative. You can replace any let block with a set of nested functions:

   (let [foo 12
         bar (+ foo 100)]
     [foo bar])

   ((fn [foo]
      ((fn [bar]
         [foo bar])
       (+ foo 100)))
    12)
Sequential ordering of 'statements' is not automatically imperative. On the other hand, atoms are imperative, as they are mutable and therefore side-effectful.


Hmm. I usually think of "imperative" as "as opposed to declarative", and sequential ordering is one of its hallmarks in that context. And for Clojure atoms specifically I think of "functional" as in Okasaki's "purely functional data structures".

Language is hard.


I use Clojure quite a bit and I don't write anything imperatively.

But passing contexts I can see. There is a not uncommon pattern that one can use of keeping a large map with state in it.

However it's completely compatible with pure functional programming.


At first I thought your comment was sarcasm, but it does appear to be serious.

There's nothing "imperative" (as commonly understood) in 98% of Clojure code out in the wild.

"Contexts" or "context updates" are also very rare, unless required by a non-Clojure JVM or JS library.

Can discuss further if you reference a specific open-source code example with "context" or "imperative style".


Well it's a bit preposterous of you to say that without actually having tried Clojure in good faith.

But, since we all make opinions from others, let me provide you a balance of opinions by giving you mine.

I'm a senior engineer and I currently use Clojure professionally. I find it to be a lovely, fun and productive language, my favorite one to date actually. I have prior professional experience with C++, C#, ActionScript 3, JavaScript, Scala, Kotlin and Java. And of all of those, Clojure is my favourite to work with.

The "context' pattern you may have heard of, I believe I know what it refers too, and it is not an imperative pattern at all, let me explain.

There's often a case where a piece of functionality will be implemented by multiple functions composed together. For example, an API handling a request will delegate to sub-functions which could in turn call down to more sub-functions.

In that scenario, it can happen that the sub-functions need data about the context of the call to the parent function, or they need data generated by the prior sub-functions that were called. In my example, let's say the request object to the API dictates various options and has the request details, and each option is relevant to different sub-functions: maybe one needs the username, while the other needs the cart-id, both of which were passed on the request, but maybe that other function also needs the user permissions, which were obtained from the prior sub-function.

Ok, so say:

    someAPI({username: "foo", cart-id: 123}) {
      permissions = get-permissions(username);
      retrieve-cart-items(username, permissions);
    }

Now as you have more and more of these forming a coherent whole, in Clojure there is a pattern where people will say, all this initial data and the data generated by the intermediate steps becomes the "context" data for the operation as a whole.

    (defn some-api [context]
      (-> context
          get-permissions
          retrieve-cart-items
          :cart-items))
Where context is at first a map of the request data:

    {:username "foo",
     :cart-id 123}
And after the call to "get-permissions" it is a map of the initial request data and the intermediate data added by get-permissions:

    {:username "foo",
     :cart-id 123,
     :permissions [:can-view-cart]}
And after the call to retrieve-cart-items it now also contains the cart-items:

    {:username "foo",
     :cart-id 123,
     :permissions [:can-view-cart]
     :cart-items [:pen, :paper]}
Thus all implementing functions for the some-api operations are designed so they take an initial context map and return a context map with added context.

And generally in Clojure you'd use the destructuring syntax to indicate what in the context map you depend on:

    (defn get-permissions
      [{:keys [username] :as context}]
      ...)
So you see that get-permissions expects the :username key to be present on the context.

But this is still fully functional, because none of the functions share a mutable reference to a shared context; they take an immutable context as input and return a new immutable context as output, which just happens to also include all the key/values of the context they received.

Sometimes people also pass in dependencies using this context pattern, which is a form of dependency injection through parametrization, but you combine all dependencies into one parameter object.
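
To make the shape of those steps concrete, here is a minimal runnable sketch (the bodies are stubs returning canned data, not real implementations):

    ;; Each step takes the context map and returns it with its own
    ;; results assoc'ed in.
    (defn get-permissions
      [{:keys [username] :as context}]
      (assoc context :permissions [:can-view-cart]))

    (defn retrieve-cart-items
      [{:keys [cart-id permissions] :as context}]
      (assoc context :cart-items [:pen :paper]))

    (-> {:username "foo" :cart-id 123}
        get-permissions
        retrieve-cart-items)
    ;; => {:username "foo", :cart-id 123,
    ;;     :permissions [:can-view-cart], :cart-items [:pen :paper]}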


Very good explanation.


While I've used all of them, some of their listed libraries are a little dated IMO. Of course this is true of almost all mature codebases.

For the sake of the less experienced, I'd point them to these substitutions in particular:

Compojure: Reitit (which they used in the front end too, so a migration may be in progress) would be my preference for backend routing.

Component: Integrant takes Component's ideas, but prefers the flexibility of multimethods operating on plain data to the typed records approach for defining systems.

Schema: Once very popular, but superseded by clojure.spec (bigger in scope) and to a lesser degree Malli for data schemas.

Potemkin: Avoid, handy for some internal code organisation purposes but hostile to tooling and debugging IMO.


In a large codebase, the cost of switching is huge, and if older libraries work well, there is often no clear incentive to do so.

Clojure in general being very stable and backwards-compatible makes it even easier to just continue using older libraries. So what if the library is "dated"? If it works well, why not continue using it?

(I speak from my own experience)


Totally agree, I didn't say they shouldn't continue using them.


We have a Clojure codebase that's about 100k lines. Honestly I'm kinda fed up with it. Certain 3rd party libs we've used have been abandoned. We wrote our own libs for a major framework and it is falling behind.

Too many "I'm very clever" functions that are hard to understand and also have subtle bugs.


Seconded. I work in a Clojure codebase that we're trying to get out of. There are just dead libraries everywhere and stuff that's maintained by one person and gets no updates at all. That, or we just end up making functional "wrappers" around Java libraries, and at that point we might as well just write straight Java.

Also, yeah, everyone wants to be so damn smart, having macros within macros within macros, that no one knows what the original intent of the code is anymore.

The REPL-based development I find also breeds a really bad mentality of forgoing building a deployment process; instead people just REPL in and make a bunch of changes, and prod rarely matches what's checked into GitHub.


I've been using Clojure for almost 10 years and writing macros has always been discouraged in the community. You don't see too many of them in the wild, and for good reasons.

If you're writing macros on a daily basis, you better have a really good reason for it.


The codebase I'm in is 10+ years old and macros are everywhere. Were macros super hyped at one point in the past? I'm fairly new to Clojure, just by virtue of being part of this system rewrite, but everywhere in the code I'm working in I see stuff like in the article, where there just seem to be macros that accomplish nothing except obscuring already built-in constructs for the sake of maybe saving a few lines of code or combining one or two statements into one macro.

It's maddening to just be constantly mentally unpacking this stuff and searching through docs, only to find out "Oh, it's not a Clojure thing, someone wrote a macro for this".


No, I don't think they were hyped at any point.

They are used in certain libraries like https://github.com/ptaoussanis/timbre but for things that are simply not possible without macros, for example (timbre/spy (+ 1 1)) will actually print both the expression and the result:

DEBUG [ss.experimental.scratch:1] - (+ 1 1) => 2

This is one of my favorite debugging tools. IMO it beats a debugger hands down! Printing can also be disabled with a global library flag, without code modification.
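
To illustrate why this needs a macro rather than a function, here is a minimal spy-like sketch (illustrative only, not timbre's actual implementation): the macro sees the unevaluated form, so it can print the expression alongside its value.

    (defmacro spy [form]
      ;; result# is a gensym so the expansion doesn't capture user names
      `(let [result# ~form]
         (println "DEBUG -" '~form "=>" result#)
         result#))

    (spy (+ 1 1))
    ;; prints: DEBUG - (+ 1 1) => 2
    ;; => 2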

Perhaps if the macros are "simple" they can be unpacked relatively easily. I do understand how mentally challenging that can be for somebody who's just starting with Clojure. I've been using Clojure for a long time and only just recently became more comfortable with macros after I made a conscious effort in that direction. I'm still far from an "expert" in them.


I’m sorry you are having trouble with Clojure.

One thing to keep in mind is that Clojure has excellent interop with Java, and it is trivial to write a small Clojure wrapper that just does the critical things for your dependency injection system, e.g. startup and shutdown, and just pass through a reference to the Java object when you need to make function calls.

Another thing to keep in mind is that Clojure code is extremely stable, so pure Clojure libraries often require minimal maintenance, and code written for Clojure 1.3 runs on 1.10 99.99% of the time with no modifications.


I can see where some of that is valuable, and the whole pull of the JVM ecosystem means you get to lean on a lot of other packages and libraries. But the majority of what I've seen is that a lot of the packages we use are just thin wrappers around existing Java, and most of the time those already have functional paradigms anyway; something like resilience4j, for example.

The other large issue I've found is applying performance and monitoring tools such as Datadog to our Clojure apps. The Java tracing brings the app to a crawl based on how it instruments the compiled Clojure code. Since Clojure is a dependency itself, the core language ends up having instrumentation code injected and the app grinds to a halt. Something the Datadog team has not been able to solve.

Overall, unfortunately, I've just found Clojure isn't really particularly good at anything. It can hog memory, lib support seems iffy, and codebases never seem to have this "wow" factor that lots of people mention about Clojure's elegance. I just don't see the promised land.


Yes, we have some horrid macros. I think the intent was to hide infrastructure details from business logic, but it turned into a big mess.


I wonder if they'd be abandoned if companies using them considered these 3rd party libraries valuable enough to contribute/fund the development.


It sounds like Clojure just isn't your cup of tea. Both of those problems exist in every other language I've ever used.


Are those "clever" functions pure?


Would you mind sharing the abandoned libs/framework?

When you say you have too many "I'm clever" functions, do you mean within code your team wrote, or in the ecosystem at large?


it is a couple of the clojurewerkz libs.

the "I'm very clever" defns and macros that I'm referring to were written by our team.

Most of the 3rd party libs are pretty good. Except for them being not maintained and not keeping up with the latest versions of databases, etc.


Obligatory evangelism: have you considered Kotlin as a JVM lang of choice? We use it on all our backends and we really love it.


I think it's a "to each their own" thing, honestly. We use Java, Kotlin, Scala and Clojure, and of all four my favourite by far is Clojure.

We also don't suffer from any of the problems mentioned above, had no issues with libraries and no one abuses macros or makes prod repl changes willy nilly, etc.


Java 16 is good enough for us.


I develop and maintain a 60k line Clojure+ClojureScript codebase by myself, so I can definitely confirm that Clojure does allow for smaller teams to maintain larger codebases :-)

I also fully agree with this:

> Detecting an error when creating a record is much better than when using it later on, as during creation you have the context needed to debug the problem.

I made it a rule to perform integrity checks as early as possible: when creating, accepting or transforming data. I use spec (but schema would work just as well here), and have lots of pre/post conditions in my code.
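
As a rough illustration of that rule (the spec and function here are hypothetical), checking data at the boundary can look like this:

    (require '[clojure.spec.alpha :as s])

    (s/def ::id pos-int?)
    (s/def ::email string?)
    (s/def ::request (s/keys :req-un [::id ::email]))

    (defn accept-request
      "Validates incoming data at the earliest point it enters the system."
      [request]
      {:pre  [(s/valid? ::request request)]
       :post [(map? %)]}
      (assoc request :accepted? true))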

I tend to settle on "simpler" solutions. I use mount, rather than component, because it uses the namespace hierarchy and requires less code and management. I use very few macros, and try to use simpler tools rather than more complex ones.

I noticed that these days roughly 30-40% of the code I write deals with integrity checks, anomalies and anomaly handling.


> I noticed that these days roughly 30-40% of the code I write deals with integrity checks, anomalies and anomaly handling.

Love to hear more about this.

My approach has been to convert errors into data and have behavior that deals with conveying these errors in different ways, but it's always felt too complex for the task. I've just got a lot of stuff dealing with handling errors in the different environments (jvm / browser / nodejs / rn) and across async and non-async code.
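
For what it's worth, a minimal JVM-flavoured sketch of the errors-as-data approach (the function is made up for illustration; the keys follow the cognitect.anomalies naming convention):

    (defn parse-age
      "Returns the parsed value, or an anomaly map describing the error."
      [s]
      (try
        (let [n (Long/parseLong s)]
          (if (pos? n)
            {:age n}
            {:cognitect.anomalies/category :cognitect.anomalies/incorrect
             :cognitect.anomalies/message  "age must be positive"}))
        (catch NumberFormatException e
          {:cognitect.anomalies/category :cognitect.anomalies/incorrect
           :cognitect.anomalies/message  (ex-message e)})))

    (defn anomaly? [x]
      (and (map? x) (contains? x :cognitect.anomalies/category)))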


I don't really like Component. It seems very clunky, and we had a lot of issues with it and a lot of incidental complexity in our codebase (now converted to Java). It was the first real system that did this sort of thing, but if I started a project now, I'd much rather use Integrant or Clip.

https://github.com/weavejester/integrant

https://github.com/juxt/clip

I haven't used Clip a lot yet, but my next project is definitely going to be with Clip.

For validation I have started to use Malli (https://github.com/metosin/malli). I really liked the idea behind clojure.spec but some of the implementation was a bit clunky and a bit too 'Rich Hickey', not sure how to describe it otherwise. Schema was again the first serious attempt at a library like that for Clojure so it was really nice when it came out. Malli sort of combines what is great about both.
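
For anyone who hasn't seen it, a minimal Malli sketch (the schema and data are made up):

    (require '[malli.core :as m])

    ;; Schemas are plain data, so they can be stored, transformed and
    ;; shared like any other value.
    (def Task
      [:map
       [:title string?]
       [:priority [:enum :low :medium :high]]])

    (m/validate Task {:title "write docs" :priority :high})
    ;; => true
    (m/explain Task {:title 42})
    ;; => a data description of why validation failed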


Really a great article! It's extremely rare to find people who truly understand Clojure/Lisp willing to share to this detail.

Having said that, the article also exposes some flaws of Clojure that I also found.

1. Clojure overly obsesses with expression-oriented, non-procedural style coding, while in reality many large-scale Clojure repos deploy their own macros to bring procedural style back (like letlocal in the article).

2. The builtin abstraction tools are almost always too simple to be useful.

But those are not big deals IMO.

At the same time, I also wonder if this article mentions any feature of Clojure that is truly unique to Clojure compared to other Lisp/Scheme languages. I wonder if the article would still make sense if we simply substituted every "Clojure" with "Racket" (obviously I know the ecosystems are not comparable).


Is there more info about the tool itself that claims 100X decrease in application development cost? Quite the claim.


I'm also curious. It seems like the website has been around for 2 years and hasn't changed much. If they wrote 250k lines in two years, that is around 340 lines a day. That seems like a rather large project to build before putting it in the hands of customers.


It does seem odd. Stealth product. Bold claims. 3 blog posts one of which is a funding announcement and the other two having nothing to do with the product. I am intrigued but also a bit skeptical.


Nathan Marz was the original creator of what became Apache Storm [1], which powered Twitter for some time. Skepticism is healthy, perhaps even warranted here, but I'm not betting against him just yet.

[1] https://en.wikipedia.org/wiki/Apache_Storm


He is also the creator of Cascalog (Hadoop query dsl in Clojure) and the Lambda architecture pattern.

Not lambda as we know it now popularised by AWS, but an architecture for stream processing where batch views from expensive and slow batch jobs are combined with speed views from stream processors into the final live result.

https://en.m.wikipedia.org/wiki/Lambda_architecture


100x increase in productivity is a silly, hyperbolic claim no matter who makes it. I'd even be skeptical of a 2x claim, because in 25 years I have yet to see any of these productivity plays actually pan out. What I have seen are small incremental improvements here and there, but you can't point to anything in the recent past that has improved productivity 10x or 100x (unless your old process was just total crap).

At best I would expect a small niche collection of very specific tasks to be improved, but definitely not applicable to general productivity.


You concluded something is impossible based on the fact that you have never seen it before? Talk about silly claims :)


Yes, that's how observation and personal opinion works.


Well, it should not be how personal opinions are formed; maybe you should look up first-principles thinking if you haven't yet. I don't know if their 100x claim will come true or not, but saying something is impossible merely because it has not been done in the past is clearly wrong.


> saying something is impossible

Point out where I said "impossible"? I said skeptical, which is a perfectly reasonable position to take. I'm perfectly happy to be wrong.


“Powered Twitter” is an overstatement. Storm was indeed used at Twitter for select streaming use cases, but it was a bit of a mess and ended up being rewritten from the ground up for 10x improvements in latency and throughput [1]. Marz was at the company for < 2 years. Lately, Twitter has been moving data processing use cases to GCP [2].

Storm is also not very well regarded in the stream processing community due to its restrictive model, poor guarantees, and abysmal performance [3].

I have nothing against Marz, but I do think skepticism is warranted until we see what they’ve built.

[1] https://blog.twitter.com/engineering/en_us/a/2015/flying-fas... [2] https://blog.twitter.com/engineering/en_us/topics/infrastruc... [3] I worked at Twitter for 3 years, then at Google on Millwheel and Streaming Dataflow.


if the cost of building a large scale, end to end application used to be:

$10M, the new cost will be $100k, or $10k, or less.

$1M, the new cost will be $10k, or $1k, or less.


It's still in stealth mode. They raised $5M in 2019: https://news.ycombinator.com/item?id=19565267


how exactly is it stealth if they have a blog and announced funding?


I meant that they didn't reveal the product yet.


I've heard from investors that it is similar to Darklang. It certainly has the same goals, but dunno if it's the same approach in any way. Will be interesting to see


In that case, I wouldn't go near it with a ten foot pole.


what? nothing's perfect, but i thought darklang was kinda cool to the extent i messed around with it. i even did a couple toy "useful" things with it and found it pretty fun.


I'm very dubious of this claim, or belief, really, which is what it is. There aren't even simplistic metrics to back it up.


The team is impressive, I was wondering what all this new internal language, 400 macros, etc., could be put towards, thinking they were stuck in over-engineering. But after seeing that promise for their app, I changed my mind. Something that’s capable of making you 100 times more productive probably does need that level of development.


I still don't know if efficiency is correlated with lines of code in a product. Sure, maybe a weak relationship at best, but the scale of the software does not necessarily mean it will be any more useful than something else.


I don't think twobitshifter's point was that "more code === better product" either but rather that if it's a product with big scope, it probably has more code in it than if it was a product with narrow scope.


I know Clojure well enough to call BS on that claim. Unless you compare what they do with the worst possible incompetent alternative.


If I want to learn Clojure, where is the best place to start?

I have a lot of experience with Python/Javascript now, and spent many years in C/C++/Objective C and Java. Also have some Go.


If you are into books, I'd recommend Clojure for the Brave and True, which is free to read online [1], and Living Clojure [2], in that order.

If you're into interactive kata-style problems, there's 4Clojure [3].

Also join the Clojurians Slack [4] for the community.

[1] https://www.braveclojure.com/clojure-for-the-brave-and-true/ [2] https://www.oreilly.com/library/view/living-clojure/97814919... [3] https://www.4clojure.com/ [4] https://clojurians.slack.com/


And the Clojurians' Zulip: https://clojurians.zulipchat.com


I personally started with this talk by Rich Hickey (person who made Clojure) https://www.youtube.com/watch?v=ScEPu1cs4l0

One of the best "why Clojure" talks that I've seen, especially for somebody coming from an OOP perspective, like myself.


Seconded. Reading books will show you how to write Clojure code but exposing yourself to Rich Hickey's presentations could change your approach to programming in general.



Eric Normand’s videos are amazing. If someone is interested in Clojure, I highly recommend his Clojure for Beginners videos (or whatever the obvious beginner tutorial is in case I messed up the name). Even if you never touch Clojure again, it’s worth it for the aha moment that the whole thing builds up to about using plain data instead of code.


The "Clojure for the Brave and True" ebook is a popular and free resource for beginners: https://www.braveclojure.com/


> And doing things dynamically means we can enforce stronger constraints than possible with static type systems.

Can someone please explain this to a novice like me?


Taken literally the claim isn’t true - a Turing complete type system can enforce any constraint that a Turing complete programming language can. But your everyday type systems typically can’t express concepts like “a list of at least 3 elements” or “a number which is a power of 2”.


I think he's referring to the fact that you have more information at runtime (the actual request) than you do at compile time.

But your distinction about the theoretical limitations vs the practical capabilities of current type systems is also a good point.


Pardon my ignorance, but can this spec thing automatically deduce that 8^n evaluates to "a number which is a power of 2" or log(2, "a number which is a power of 2") is an integer?

If yes, then I agree this is super helpful (and magical).

If not, how is this different from a normal constructor (with runtime input validation)?


No, it's not automatic like that. You define predicate functions that need to evaluate to true to pass. One of the reasons this is better than a normal constructor is that these predicate functions are attached outside of the function they're describing. So they're inspectable and composable.
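
For example, a sketch of what such predicate specs can look like (the spec names are made up), covering the two constraints mentioned upthread:

    (require '[clojure.spec.alpha :as s])

    ;; a positive integer with exactly one bit set is a power of two
    (s/def ::power-of-two
      (s/and pos-int? #(zero? (bit-and % (dec %)))))

    (s/def ::at-least-three (s/coll-of any? :min-count 3))

    (s/valid? ::power-of-two 8)        ;; => true
    (s/valid? ::power-of-two 6)        ;; => false
    (s/valid? ::at-least-three [1 2])  ;; => false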


Refinement types and dependent types can do all of that. But you are right that mainstream languages can't do that today. Hopefully that will change.


The biggest manifestation of this is clojure spec:

https://clojure.org/about/spec

If your impression is that this is like sugary unit tests: it is not. You can run specs during development while running an in-editor REPL; code gets evaluated as you type it, so to speak.

It is way more expressive than a type system and it is opt-in, but it doesn't give the same guarantees obviously. It is also not meant to be a type system but rather a tool to express the shape of your data. It is used for obvious things like validation but also for documentation (over time) and generative testing among other things.
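
A small sketch of the generative side (the spec is made up; s/exercise needs org.clojure/test.check on the classpath):

    (require '[clojure.spec.alpha :as s])

    (s/def ::port (s/int-in 1 65536))

    (s/valid? ::port 8080)   ;; => true
    (s/exercise ::port 3)    ;; => 3 generated [sample conformed-sample] pairs, usable as test data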


Most static type systems can enforce that a voting age is an Integer, say, but can't validate that the value is at least 18, for example.


They can; this is a misconception. Create a new type, and make it so that it's only produced by a function that checks that the age is at least 18.


That's a poor hack, it moves an invariant that could be enforced at compile time to not much more than a convention that has to be preserved by code review.

E.g. a colleague implements de-serialisation for your type but adds an empty constructor to make their life easier. You might not learn there's a hole in the boat until your first bug.


I wouldn't call it a poor hack. In a way it's a parser, and parsers aren't really poor hacks, though they can be abused. Sure, it's not perfect and I'd like it better if it was checked at compile time, but it's way better than using a plain int. Also, the moment you have a bug, tracking it is really easy: just list the functions that return this specific type.


Yes, but this can be easily worked around by creating custom types that wrap the integer (and can be unwrapped at compile time) and some conversion functions. So it's slightly more tedious, but saying one can't define complex constraints with static types is not quite correct.


This is not really true if you include ML languages. Most of the time you create a specific type for a value that is constrained. Ada has pretty good support for this, and ML languages too. Once you have a specific type you make all your functions accept only VoterAge instead of Int.


I think this would work with every language that has nominal and not structural typing. If you have structural typing, you have to wrap the int. For example, this is "branding" in Typescript. I'm not sure if there is a performance penalty and how big it is though.


FSharp can easily do this.


Turbo Pascal could.

In practice the only benefit is very similar to checked exceptions: you can't forget to check that the value is in range.


You can have arbitrary types. Like a PrimeInteger type or a NameStartingWithD type or a complex hashmap with types defined for each key, like found in most configuration files.


IMO the sentence is incorrect. Static type systems would "enforce the stronger constraint" at run time... same as the dynamic type system. Perhaps the dynamic type system can have a fancy linter that does something crazy like running your code... but I'm not aware of any such linter.


Most static type systems that can do what clojure.spec can do tend to include runtime assertion and type checks and do not erase type data from runtime (what some static type zealots call "uni-type" approach).

For example Ada's type system, which has equivalent of Common Lisp's SATISFIES construct, which implements a runtime type assert that can use all the power of the language.


Sorry, I don't understand

>Most static type systems that can do what clojure.spec can do tend to include runtime assertion and type checks and do not erase type data from runtime (what some static type zealots call "uni-type" approach).

When you don't erase the type data... you're gonna have more than one type. How is this a "uni-type" approach?


A popular (stupid) talking point in the stupid discussions about static/dynamic while missing that the axes were orthogonal, was for proponents of static types to claim that dynamic languages were "unityped" based on some convoluted logic about tagging data at runtime.


As someone who prefers static types, I agree with you that claiming dynamic languages are "unityped" is stupid. Javascript/Python/Clojure all literally have types.


> Perhaps the dynamic type system can have a fancy linter that does something crazy like running your code... but I'm not aware of any such linter.

Side-effecting names in Clojure codebases and libraries are pretty reliably annotated with trailing exclamation marks, looking like: `save!`.

To that end, running Clojure code blindly to test it and its types is fairly practical, coming up in cases like Ghostwheel [1], which uses this for generative testing of clojure.spec types, which can be much more sophisticated than what is commonly used in static systems, even with refinement types.

[1]: https://github.com/gnl/ghostwheel#staying-sane-with-function...


>which can be much more sophisticated than what is commonly used in static systems, even with refinement types

Can you elaborate on this? I have some experience with Clojure, and have been relatively unimpressed with spec. Everything it does I can do with (refinement) types (I think). Reading over the Spec documentation it constantly talks about predicates... which is exactly what a refinement type is.

Spec/Clojure has the problem where it validates... but doesn't parse. See: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...


Thank you for sharing this detailed rundown. One of the challenges that I faced as a new Clojure developer was understanding how all the parts fit together (and in fact, what parts I should care about in the first place). Example: is learning Component worth it? (yes)

Extremely glad to see this resource published! Thank you to the Red Planet team for releasing this!


> Our codebase consists of 250k lines of Clojure split evenly between source and test code. It’s one of the largest Clojure codebases in the world.

Aren't there many Clojure projects out there, or is the language generally not used for large projects? I have a JavaScript frontend written in Vue that is 150k+ lines of code, without any tests. Which would be 25k lines of code more than they have, disregarding their tests.

I'm not saying that more code is better, I just found it odd that 250k was considered one of the largest Clojure codebases in the world


In general clojure can be a very terse language. A lot of mileage per line of code, so to speak.


At least according to Paul Graham, one of the appealing things about Lisp is the economy of expression. Not many Clojurists would brag about how big their code base is, I think.

http://www.paulgraham.com/popular.html


I jumped to the comments first, and my initial thought was, “what would Nathan Marz think of all of this...and what is he investing in these days?”

I was honestly skeptical, reading all of this clojure-macro criticism (constructive for the most part), but trust was restored when I saw the author. Well done, sir! Excited to see the details.


Why hasn't Clojure provided any java.util.function interface integration? For a language that's supposed to "embrace the host" to get "ergonomic" access to the host libs, because it itself lacks an ecosystem, this seems weird.


This is actually one of my rare criticisms of Clojure. There's a patch, but there was apparently concern about perf. Personally, if I had to do a lot of interop like this, I'd pull it into a custom build, as it wasn't large. I've not kept up with it lately, though, as I haven't needed to do a lot with Kafka recently.
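
For context, the manual workaround looks something like this (illustrative; at the time of writing, Clojure fns don't implement the java.util.function interfaces directly, so you reify them when a Java API demands one):

    (import '(java.util.function Function))

    ;; Wrap a Clojure computation so a Java API expecting
    ;; java.util.function.Function can accept it.
    (def inc-fn
      (reify Function
        (apply [_ x] (inc x))))

    (.apply inc-fn 41)  ;; => 42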


Clojure precedes Java 8.


GP means why it hasn't kept up with the fundamentals of the host language; Java 8 was released in 2014.


I wonder why they are using Schema instead of Spec. Schema was popular before the latter existed, I didn’t realize anyone still uses it.


That is probably exactly why they are using it ... you don't get to 250k lines of code overnight.


My understanding is that the company is a few years younger than Spec, but perhaps the code base is very very old.


Schema is a good library. Everyone seems to use spec today, but Schema is easier to use than spec IMO.

Spec is also still in alpha, with spec2 still under development.


We used clojure.spec in production for years. "Alpha" just means the API could change.


I still prefer Schema because I find it much more readable.


Yes, I also prefer it. With Schema, the defined schemas are normal vars, which you can jump to with editor support. With spec I'm always searching for the definitions.
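
To illustrate the point about schemas being plain vars (the names here are made up):

    (require '[schema.core :as s])

    ;; Task is just an ordinary var holding a data structure, so editors
    ;; can jump to it like any other definition.
    (def Task
      {:title    s/Str
       :priority (s/enum :low :medium :high)})

    (s/validate Task {:title "write docs" :priority :high})
    ;; => {:title "write docs", :priority :high}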


I would like to ask if you are comfortable with Clojure protocols, because I tend to avoid them. What do you think about them?


I've been using Clojure for a while (8+ years) and I don't use them often. Unless you have a specific need (say, very high performance), regular Clojure maps give you most of the benefits without any obvious downsides.


> I would like to ask if you

Ask who?


Seeing that there is a need for type checking in Clojure. Has anyone used https://github.com/typedclojure/typedclojure in production?


this is wonderful - I've been curious about continuations and program state as a language construct. Not sure I understand the redplanet - type system et


This is the first time I'm looking closely at Clojure code and it seems wholly antithetical to the practice of software engineering. Not very readable, too many ways to create magical jank, odd coding conventions, and it looks like refactoring and maintenance would be a nightmare. I'm very glad the community has had the collective sense to not adopt this widely.


It's weird to dismiss something so easily after seeing it for the first time. Maybe you should give it more time.


By "looking closely at" do you mean you've actually grokked the language and used it to build anything?



