Haskell in Production: Standard Chartered (serokell.io)
186 points by NaeosPsy on May 3, 2023 | 160 comments


I've talked to a lot of people using Haskell in production over the years, and one of the things I've consistently noticed is a disconnect between the Haskell features that are often touted, and the Haskell features that people use in production.

This isn't always a quiet omission like "we didn't end up using this", either. The best example is that a lot of people end up doing significant work to disable lazy evaluation. Simon Peyton Jones is reported to have said that lazy evaluation has been kept "to keep us honest about side effects", i.e., that side effects become immediately painful with pervasive lazy evaluation. But in a production system, functional purity (no side effects) is a means to an end, and lazy evaluation comes at a pretty extreme performance cost. Not everyone disables lazy evaluation in production Haskell, because not everyone thinks they need to for performance, but I've definitely heard about it enough times that it seems like a relevant pattern.

Production users of Haskell do seem to be happy about types, years into projects, so I'd take away that types are a benefit, but I do wonder if one might get those same benefits in another language without the problems which seem to crop up with Haskell, such as laziness causing performance issues, monadic complexity, and difficulty hiring.


If you disabled lazy evaluation, you would not be able to write arbitrary "where" sections, with definitions ordered as you want them, including tying-the-knot definitions, which are not exceptionally rare.
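
For concreteness, here is a minimal sketch (standard Haskell, the names are mine) of the kind of definition being described: the where-bindings can be ordered however reads best and refer to each other, and the list is defined in terms of itself, which only works under lazy evaluation.

    -- `fibs` is tied in a knot with `rest`; under strict evaluation this
    -- definition would try to build an infinite list and never return.
    fibsUpTo :: Int -> [Integer]
    fibsUpTo n = take n fibs
      where
        fibs = 0 : 1 : rest                   -- refers to `rest`, defined below
        rest = zipWith (+) fibs (tail fibs)   -- refers back to `fibs`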

So no one disables lazy evaluation.

It is strict evaluation that is selectively enabled.

I also have to say that focusing on the types in the compiler development was the right thing, but what was not right was not focusing on automatic analysis and removal of thunk creation for lazy evaluation. There is now a project to implement GRIN (Graph Reduction Intermediate Notation) [1], which supports such analysis, and GRIN dates back to the early '90s.

[1] https://grin-compiler.github.io/

GRIN also automatically gives you many benefits of supercompilation.

We could have much better Haskell code now if GHC had decided to support several backends back then, not only STG/C--.


>So no one disables lazy evaluation. It is strict evaluation that is selectively enabled.

True of vanilla Haskell, but from the article:

> So we ended up with Mu, a whole-program compiler to a bytecode representation with a small runtime interpreter... Mu has a strict runtime, which makes program performance easier to analyse and predict – but prevents us from just copy-pasting some Haskell libraries that rely crucially on lazy evaluation.

If S&C is to count as a Haskell success story (which it's often touted as), then I think it's fair to point out that it's a codebase where strict evaluation is used throughout.


Mu was developed using Haskell, that's the point. The strict runtime and interpreter may or may not be built using Haskell, I don't know.

As far as I know, the author of Mu is Lennart Augustsson, also the author of Bluespec [1]. Bluespec compiles a language that is very reminiscent of Haskell into hardware gates. And Bluespec is relatively successful. Is there anything in the hardware gates' "evaluation model" that is related to the success of Bluespec?


I mean it's a point, but it's not really the point that the article is making:

>In this article of our Haskell in Production series, we interview José Pedro Magalhães from Standard Chartered – a multinational bank that has over 6 million lines of code written in their own dialect of Haskell called Mu.


Mu is strict because it was designed to target an existing strict runtime. That's all.


> Quite possibly, but I suspect there are good reasons why the existing runtime is strict.

This makes me wonder if this affects how code is typically written. Is it very Elm-like? Is composition more difficult? Is it very monomorphic?


Quite possibly, but I suspect there are good reasons why the existing runtime is strict.

In any case, regardless of the underlying motivation, it is certainly an interesting fact that one of the largest commercial Haskell(ish) codebases uses a strict variant of the language.


Mu is not Haskell, it is an entirely different language.

The Haskell codebase that made Mu possible is not strict.


> their own dialect of Haskell called Mu. [emphasis mine]

>Mu has a strict runtime, which makes program performance easier to analyse and predict – but prevents us from just copy-pasting some Haskell libraries that rely crucially on lazy evaluation. [emphasis mine]

If Mu were an entirely different language then you wouldn't be able to copy-paste any Haskell libraries. See also e.g. here: https://www.quora.com/Why-did-Standard-Chartered-need-its-ow..., https://www.youtube.com/watch?v=hgOzYZDrXL0


I can copy-paste some Haskell definitions into an Agda source file and vice versa. I can copy-paste some C code into a C++ source file and vice versa. Given the C preprocessor, some Lisp code can be moved into C or other languages.

The likeness of languages' syntaxes does not mean they are the same.


https://anil.recoil.org/papers/2011-cufp-scribe-preprint.pdf

> Mu is a true Haskell dialect in that code written in Mu may be compiled with a Haskell compiler.

> [...]

> Their experience with strict semantics has been positive. Particularly useful is the ease of obtaining meaningful stack traces, tracking resource usage, […]

My point was that "if S&C is to count as a Haskell success story (which it's often touted as), then it's fair to point out that it's a codebase where eager evaluation is used throughout." Please note the 'if'. The article presents S&C's large Mu codebase as a Haskell success story. In this context it's perfectly fair to point out that Mu has a strict semantics.

If you want to insist that Mu is an 'entirely different' language to Haskell then that's fine. As with natural languages, there's no objective line to be drawn between 'different dialect' and 'different language'. However, anyone who holds to this view obviously cannot follow the article in presenting a large Mu codebase as an example of Haskell in production.


What is described in the article is a subset of Haskell: "...as long as one is writing mostly vanilla Haskell you can’t tell the difference."

And then it proceeds to tell us that you can't tail recurse.

Mu is different language which uses Haskell ecosystem.

What I am interested in and what article does not tell is how big the actual Mu compiler is. I think it is in the hundreds of thousands of lines of code, 200K SLOC or more.

And it is still a big code base, and its use of Haskell is a success.


> So no one disables lazy evaluation.

That's objectively incorrect.


> It is strict evaluation that is selectively enabled.

Worth fighting over it? Zero snark intended.


Probably not!


A few minor corrections

> Simon Peyton Jones is reported to have said that lazy evaluation has been kept "to keep us honest about side effects"

Rather, one of the main benefits of laziness was to force purity and thereby to instigate the development of typed effects. By contrast, the reason that laziness has been "kept" is simply that switching to strictness now would break (almost) all Haskell programs!

> lazy evaluation comes at a pretty extreme performance cost

Not really. Haskell's implementation of laziness is very high performance and the compiler can easily determine when creating thunks is not necessary. The reason that lazy evaluation leads to performance problems is that programmers don't know how to use it correctly. Whether that's the fault of the language or the programmers is up for debate.


>The reason that lazy evaluation leads to performance problems is that programmers don't know how to use it correctly

It's because it's objectively harder to reason about performance with lazy evaluation as performance becomes non-local. You can no longer reason about a function's performance just by looking at its body, instead you have to know the nature of the inputs passed to it (whether they're thunks or materialised values).

Each single use of lazy evaluation also imposes a latency overhead, as the thunk/value must be behind a pointer. Yes, the compiler can remove unnecessary thunks, but that's "unnecessary" in the sense of "unnecessary given the semantics of the program", not "unnecessary given the problem the program's trying to solve". I.e. lazy-by-default will generally result in someone writing code that's lazier than necessary, unless they're very careful about annotating, for the compiler, everywhere laziness isn't needed.


> It's because it's objectively harder to reason about performance with lazy evaluation as performance becomes non-local.

This is oft-repeated, but has two interpretations. If you mean performance as in the number of steps needed to reduce an expression, lazy evaluation is superior to strict evaluation. If you mean the indirection needed to access a thunk, well, someone -- either the producer (strict) or the consumer (lazy) -- had to do that anyway. If GHC can determine that the demand for the value is strict, it won't generate the thunk in the first place. Instead it will provide a specialized calling convention via the worker/wrapper transform, which is standard. In the un-optimized case we still have dynamic pointer tagging (which is a transgression against the T of the STG machine), which serves as a quick check of whether we are dealing with a value or a thunk. So if by non-local performance you mean the indirection, modern GHC shows that is not true.

If you mean that space leaks are a non-local property and they affect performance, well, you are sort of right. But as with all programming, we have defensive patterns against that.

There are two types of space leaks: liveness leaks and strictness leaks. Only the first is a non-local property, i.e. they appear as a consequence of the composition of different code pieces. But given the default purity of Haskell, those can only appear on long-lived data references, which are syntactically marked by:

- IORef, MVars, TVars

- get/put pairs over a state environment

So what you do is specify in the types that values stored in those environments are to be evaluated before being stored, so references don't leak. I write about this here:

https://epicandmonicisnotiso.blogspot.com/2023/04/how-to-avo...

and how to specify a specific evaluation level in the types here:

https://epicandmonicisnotiso.blogspot.com/2023/04/presenting...

although that library has changed a lot to enforce the invariants.
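
For illustration, here is a minimal sketch (plain base, not the library from those posts) of the long-lived-reference pattern and the usual fix of forcing the value before it is stored:

    import Data.IORef (newIORef, readIORef, modifyIORef, modifyIORef')

    leaky :: IO Int
    leaky = do
      ref <- newIORef 0
      -- modifyIORef is lazy: the reference accumulates a growing chain of
      -- (+1) thunks that is only collapsed when the value is finally read.
      mapM_ (\_ -> modifyIORef ref (+ 1)) [1 .. 1000000 :: Int]
      readIORef ref

    fixed :: IO Int
    fixed = do
      ref <- newIORef 0
      -- modifyIORef' evaluates the new value before storing it, so the
      -- long-lived reference never retains unevaluated work.
      mapM_ (\_ -> modifyIORef' ref (+ 1)) [1 .. 1000000 :: Int]
      readIORef ref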


This reply neatly captures that functional programming tends to consider performance/efficiency in terms of number of reductions. Sadly other programmers tend to view it as wall clock time and the two don't correlate very well.


I just gave a list of possible meanings of what parent said. Only one of them is the numbers of reductions.


Sadly?

If a client comes to me and says one of the parts of their application was slow, and I come back to them saying I lowered the number of reductions with zero change in the wall clock time, they'll fire me, and rightly so.


To be nitpicky, wall clock time isn't the same as CPU time or power draw. You can increase efficiency at the same time as you increase wall clock time because they aren't necessarily the same metric.


I think you're overcomplicating this. In a language with lazy evaluation, you can't know what gets evaluated without looking at the whole program (in the worst case). It's in that sense that performance becomes non-local.

Here's a simple specific example. Take the Haskell expression [1..n], for some big n. There is no general answer to the question "what would be the performance implications of replacing [] with [1..n]" – it depends on the context of [].

In a strict language, you can say (roughly at least) that replacing [] with [1..n] will add at minimum a certain number of extra CPU cycles and a certain amount of additional allocation. This kind of reasoning is pretty rough and ready, but it is accurate enough to be useful for reasoning about performance in practice.
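
To make the contrast concrete, a small sketch (function names are mine) of how the cost of the same expression depends entirely on its consumer under lazy evaluation:

    cheap :: Int -> Int
    cheap n = head [1 .. n]        -- O(1): only the first cell is forced (assuming n >= 1)

    expensive :: Int -> Int
    expensive n = length [1 .. n]  -- O(n): the whole spine of the list is forced

    -- In a strict language both call sites would pay for building the full
    -- list, so the cost of [1..n] could be read off locally.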

I note that Simon Peyton Jones has said these exact words in a public talk:

"Laziness makes it much, much harder to reason about performance."

I think it's extremely unlikely that he's mistaken or confused on this point.


> In a strict language, you can say (roughly at least) that replacing [] with [1..n] will add at minimum a certain number of extra CPU cycles and a certain amount of additional allocation.

It's not at minimum, it's always the worst case cost of N in strict languages, whereas the lazy setting provides you with amortized complexity.


I think you might be misunderstanding what I meant by "at minimum". I'm talking about the case of replacing a [] constant in some arbitrary piece of code with something like [1..10000]. Given strict semantics you'll of course incur at least the time and space costs associated with constructing the list (that's the "at minimum" bit), but you might also incur additional costs depending on what the rest of the code does with the list. For example, it might execute some arbitrarily expensive computation if the list has more than 20 elements, or whatever.

I think you might have thought I was saying that given a strict semantics it was somehow still possible that you wouldn't necessarily incur the full time and space penalty for constructing n elements (which of course is not the case).


Interesting comment, but many parts of it flew over my head. Would you by chance have some suggestions on where I could better my knowledge of Haskell internals/FP runtimes? Of course your blog has been added to my to-read list :)

I used to know Haskell to an “intermediate level” but haven’t used it in years; I am more familiar with “traditional” runtimes/compilers, like the JVM, as a reference.


> By contrast, the reason that laziness has been "kept" is simply that switching to strictness now would break (almost) all Haskell programs!

Also, laziness is a big part of why Haskell exists in the first place. "Strict Haskell" would arguably feel more like ML, Agda, Idris, etc. than (lazy) Haskell. In which case, just use one of those; they're also decent languages ;)


Subtle correction: as far as I’m aware if you were to disable lazy evaluation (which you can now do via compiler pragma) on any Haskell program of significant complexity it would blow up. Common practice is to force evaluation of critical sections of code. That’s something the compiler can sometimes do for you.

Laziness in Haskell is a great way to write declarative code; you really are describing a concept rather than telling the computer what to do. That can lead to very clear and orderly source code.

Having said that, the drawbacks are significant: performance, memory usage, and debugging can all be a real pain. The last one sucks because it affects development of all kinds of programs, not just performance-critical (and challenged) ones.

Shout out for PureScript, which is essentially a strictly evaluated Haskell that runs in the browser, on servers, and on PaaS.

Difficulty hiring on a small scale in my experience is complete BS, there are still many passionate users looking for professional gigs. Once that pool is exhausted, I think a company should probably look outside the language for people with some type of FP experience and expect language-specific training as part of onboarding; whether that's worth the time and cost isn't entirely settled for me, but it's certainly a topic for debate.


> Subtle correction: as far as I’m aware if you were to disable lazy evaluation (which you can now do via compiler pragma) on any Haskell program of significant complexity it would blow up. Common practice is to force evaluation of critical sections of code. That’s something the compiler can sometimes do for you.

Anecdotally I can confirm this is true.

> Laziness in Haskell is a great way to write declarative code; you really are describing a concept rather than telling the computer what to do. That can lead to very clear and orderly source code.

100% agree.

> Having said that, the drawbacks are significant: performance, memory usage, and debugging can all be a real pain. The last one sucks because it affects development of all kinds of programs, not just performance-critical (and challenged) ones.

I think this largely depends on domain. Then, it's not clear if the best tradeoff is to go fully strict or develop better intuition for lazy evaluation in many of those cases.

> Difficulty hiring on a small scale in my experience is complete BS

I have a theory this meme is repeated by people who don't know where to find the many functional programming candidates.


If a candidate is primarily interested in your company because of your particular tech stack, they are probably the wrong candidate.

If your company needs to find candidates primarily on the basis of their familiarity with your particular tech stack, that is probably the wrong tech stack.


> If a candidate is primarily interested in your company because of your particular tech stack, they are probably the wrong candidate.

I disagree. I believe that most candidates are likely only interested in your salary, benefits, or other things not relevant to what your company does. A necessary state of affairs where worker rights and worker loyalty don't count for much in the face of the horribly named "right-to-work" laws.

Therefore a candidate that has a strong interest because of your tech stack is a positive where there wouldn't otherwise be one.


I agree that “here for the tech stack” is generally better than “here for the money”, but I’ve just been fortunate enough to work in situations where people were there for the mission and there for the people. It’s also really hard to blame people for being there for the money when, hey, we don’t know what kind of student loans or life financial situation they have. And when you don’t know a domain or company really well, “we pay more” is a harder-to-forge signal about prospective coworker quality than “oh our mission is really inspiring”.


> I’ve just been fortunate enough to work in situations where people were there for the mission and there for the people.

How is "being there for the mission and for the people" better than "being there for money and for tech stack"? The latter also has to do with people and missions, only the missions and the people are important and related to the candidate directly, and not through company owners' or hiring managers' goals (most likely motivated by prospects of monetary rewards too).


> How is "being there for the mission and for the people" better than "being there for money and for tech stack"

Tech stacks change; human stacks stay the same. Intellectual honesty isn't going to be obsoleted by some shinier virtue in 5 years, and if a company needs to pivot, it's still going to be the right tool for the job.


Typically tech stacks don't change for good reasons, just subjective reasons.

Why should I value the subjective decision of some new engineering manager that decided the tech stack should change so they can pad their resume?

Even if I'm there for the mission, this would give me pause.

If I was there for the tech stack alone, I'd quickly be looking for a new job.

The central point you seem to be making is "hiring for people there for the mission means employees friendlier to company changes/pivots". This feels valid, however the tech stack could affect execution of the mission. Or a given person could just hold the opinion that tech stack affects execution of the mission.

I guess my counter-argument to that then is it's not such a straight-forward win.

However my views are that tech stacks and programming languages matter a lot more than most give them credit for. See:

It's not what programming languages do, it's what they shepherd you to https://nibblestew.blogspot.com/2020/03/its-not-what-program...

So it's easy for me to recoil to hearing "right tool for the job" cargo-culted without real arguments justifying the comment.

Circling back to the central point, I do think I would bias towards hiring people that seem "there for the mission". I believe many people would probably be just pretending though, so it's not that great of a positive signal imo.

However we may differ in that I don't think I'd heavily avoid hiring people "there for the tech stack" any more than I'd try (and fail) to avoid hiring people "there for the money".


I think where we're ending up here is that—while all these points ("tool for the job", "here for the mission") may be true—they are often cited by people who are full of shit, so seeing them in job postings, interviews, etc doesn't really send any useful signal.


If you're working at a for-profit corporation, "the mission" is "money", so there's no distinction between "here for the mission" or "here for the money". Anyone who says otherwise is either naive about what the mission is, or lying (which I can't blame them for, because that's often the politically savvy move). Exceptions exist, but they're temporary because competition usually kills them.

Sadly, the mission is often money at nonprofits as well, because capitalism won't let you survive without it. But the "here for the mission" folks skew a lot more toward the "naive" side of the spectrum than the "lying" side. Again, exceptions exist.


I was on the naive side, you're not being unfair!


I’ve seen versions of the first statement several times from different users on HN, so know I’m not singling you out, but I believe this to be a deeply flawed view:

It’s not the candidate’s job to be interested in your company. As a founder, leader, or even hiring manager it’s your job. Create a positive working environment. Get to know your employees and what their goals and interests are. Do your best to find ways to incorporate their interests into their work and align it with the company’s needs. Build rapport and treat them with respect, so when you must ask them to do tasks that don’t align with their long term goals they won’t resent you.

If you can use your tech stack to get desirable talent in the door, that’s excellent. Now it’s your job to retain them, keep them productive and happy.

Regarding your second statement, every job ad I’ve ever seen for software engineering contains a list of preferable languages and libraries, and I’ve never doubted that all things being equal, preference would be given to candidates with greater familiarity in that stack. Should it ever be the primary criterion for hiring? Probably not.


> It’s not the candidate’s job to be interested in your company. As a founder, leader, or even hiring manager it’s your job.

It’s my job to do that for the right candidates. Positive working environment, caring about employee growth, long term goals—all these things you mention are extremely important, and they are what we should be competing on. If an engineer would forgo an opportunity that provides those just so they can compose monoids in the category of endofunctors, that’s unfortunate—but I am under no obligation to indulge them.

> Should it ever be the primary criterion for hiring? Probably not.

I explicitly referred to when it’s the primary concern; we don’t disagree here.


Surely there must be some lower bound of technology after which a candidate can be completely excused for turning down a job regardless of the factors you have said should be their criteria. Examples: Perl, COBOL, or punch cards.


There is—at which point you need to find candidates primarily by their willingness to work with your tech stack, which means you probably have the wrong tech stack. It's not like leadership at punchcard-dependent firms disagree, and think their tech stack is great; they just can't switch that choice on a dime, and often need those punchcard experts specifically for the get-off-the-punchcards effort.


Regarding hiring difficulty, some say it's a feature rather than a bug. There are always good candidates. And good engineers can learn and be proficient in any language. Haskell is certainly not harder than C++.

I see some issues though.

Some niche languages enthusiasts can be very picky and reluctant to work on other languages, and would pick Haskell when a simple python script would be more appropriate.

People from Academia interested in Haskell may be good in PL theory, and writing compilers or static analysers, but may lack in other fields such as system programming, which incidentally is harder in Haskell too. So you also need to find devops, system guys and so on who can also be productive in Haskell, and those are more rare.


> Some niche languages enthusiasts can be very picky and reluctant to work on other languages, and would pick Haskell when a simple python script would be more appropriate.

in my experience, these people also refuse to learn about any software engineering best practices as they are "for OO languages" and "not needed" in haskell


Hiring doesn’t really seem to be a problem for Haskell teams in my experience. Onboarding can be a little slower, but there are a lot of developers out there who want to work with Haskell. If anything, I would consider Haskell a hiring advantage.


Hiring for Haskell has never been a problem anywhere I've seen.

I've worked professionally in Haskell for 9 years now.

When working as a consultancy, whenever we hired for ourselves or our clients, we got 5x more applicants than we were looking for. Of those, around 80% got a hire recommendation (unfortunately we couldn't hire them all).

When hiring 1 role for our startup, we got 40 good applicants immediately, with a single post on Reddit.

In all cases we had the luxury of picking the best-fitting among many excellent engineers.

Maybe you'll face issues if you want to hire 100 people on the spot. But most companies don't have that problem.


If you want to hire five developers, Haskell makes that easier. If you want to hire five hundred developers, Haskell probably makes that harder, although it's not something I have experience with.


> But in a production system, functional purity (no side-effects) is a means to an end, and lazy evaluation comes at a pretty extreme performance cost.

In theory yes, in practice... not in my experience. Or at least, things are almost always fast enough in web backends.

> Not everyone disables lazy evaluation in production Haskell, because not everyone thinks they need to for performance, but I've definitely heard about it enough times that it seems like a relevant pattern.

Most people don't disable lazy evaluation in the way I think you mean it, but they use either BangPatterns selectively or StrictData.
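
For readers unfamiliar with those two, a minimal sketch (the names are mine) of what such selective strictness looks like:

    {-# LANGUAGE BangPatterns #-}
    {-# LANGUAGE StrictData   #-}

    -- With StrictData, every field in this module's data types is strict,
    -- so constructing a Totals never stores unevaluated sums.
    data Totals = Totals
      { totalCount :: Int
      , totalSum   :: Int
      }

    -- With BangPatterns, the accumulator is forced on every step of the
    -- fold instead of building a chain of thunks.
    sumStrict :: [Int] -> Int
    sumStrict = go 0
      where
        go !acc []       = acc
        go !acc (x : xs) = go (acc + x) xs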

Maybe there's widespread usage of `-XStrict` as well I don't know about.

Anecdotally I can say that trying to enable `-XStrict` in production caused performance issues that prevented us from trialing it further.

Laziness is frequently talked about as a negative, but it has big advantages in composition and code ordering, like another reply talked about with regard to where bindings enabling truly top-down program design.


> Anecdotally I can say that trying to enable `-XStrict` in production caused performance issues that prevented us from trialing it further.

hah this is in many ways indirect proof that laziness-by-default is improving performance. It may even be that a rote translation to strict (which `-XStrict` does) is causing _time_ leaks, which are actually pervasive in production strict-by-default code.


> hah this is in many ways indirect proof that laziness-by-default is improving performance.

I believe this is the case, but the times it saves you are invisible.

The occurrences where it bites you though are very visible and made out to be a much bigger deal than they usually are.


Personally I think this is a strength of Haskell that many people don't really recognize and appreciate. It's a language that provides a lot of flexibility. If you want it to be a pure research language, it's happy to do that. If you need to make some sacrifices in regards to purity so you can get stuff into production, you can do that too.

As someone with a lot of time spent experimenting with programming languages, I can't think of any that can be so excellent from a research/experimentation perspective that also share the real production usage that Haskell sees. Take for example Racket, an amazing and also extremely flexible research language. It's probably easier to get started with than Haskell, but has never seen the real world usage that I've seen from Haskell.


Small distinction that does not detract from your excellent point: the question is not about laziness vs. strictness since any reasonably large program uses both extensively. Rather, it is about the properties of laziness or strictness as a default evaluation strategy.


People take advantage of laziness without even realizing it in production all the time. In fact, people often get free performance gains due to laziness all the time!


> lazy evaluation comes at a pretty extreme performance cost

This is surprising to me. I understand it making code harder to debug because timing/ordering is less predictable, but I would expect lazy evaluation to help with performance if anything ("the code doesn't rely on a specific 'when', so the compiler/runtime are free to re-arrange stuff for optimization").

What am I missing?


> I do wonder if one might get those same benefits in another language without the problems which seem to crop up with Haskell

Not until other languages raise their game. Nothing compares to Haskell for productivity and I've tried many different languages professionally over the years (including Haskell). But yes, reasoning about what happens operationally is the big problem.


> Not until other languages raise their game. Nothing compares to Haskell for productivity and I've tried many different languages professionally over the years (including Haskell).

100% agree.

> But yes, reasoning about what happens operationally is the big problem.

What types of issues do you experience?


> What types of issues do you experience?

In GHC Haskell, it can be difficult to predict the lifetime of heap objects and control when objects are not shared (optimisations may or may not lift terms out of a local context).


> monadic complexity

In production Haskell I've seen it's typical you only need to understand how to use the common Maybe, Either, and IO monads. Maybe... sometimes... you need to create your own.

I actually think it's a bit of an anti-pattern, but this anti-pattern leads to a reality where the monadic complexity you allege does not exist.


In production Haskell you need to combine monads using e.g. monad transformers and this is significantly more complicated than just understanding how bind works. It also requires a lot of boilerplate code and any errors there are not obvious to track down.


Writing your own monad transformers is more complicated than understanding bind. Using standard ones isn't: swap out `State s a` for `MonadState s m => m a` (or whatever the relevant effects are) and put `lift` where the compiler tells you to, and you're 90% of the way there.

The other 10% is stacking the appropriate sequence of `FooBarT (BarBazT ... )` at the entry point to your program, which is admittedly pretty tedious.
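
As a concrete illustration of that workflow, here is a minimal sketch using the standard mtl classes (the function names and effects are made up):

    {-# LANGUAGE FlexibleContexts #-}

    import Control.Monad.IO.Class (MonadIO, liftIO)
    import Control.Monad.Reader   (MonadReader, ask, runReaderT)
    import Control.Monad.State    (MonadState, get, put, evalStateT)

    -- Instead of `State Int a`, write against `MonadState Int m => m a`;
    -- the same code then runs in any stack providing that effect.
    bumpCounter :: MonadState Int m => m Int
    bumpCounter = do
      n <- get
      put (n + 1)
      pure n

    greet :: (MonadReader String m, MonadIO m) => m ()
    greet = do
      name <- ask
      liftIO (putStrLn ("hello, " ++ name))

    -- The "10%": pick the concrete transformer stack once, at the entry point.
    main :: IO ()
    main = runReaderT (evalStateT program 0) "world"
      where
        program = do
          _ <- bumpCounter
          greet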


> In production Haskell you need to combine monads using e.g. monad transformers

In some workplaces, yes. However it's pretty tenable to use https://www.parsonsmatt.org/2018/03/22/three_layer_haskell_c... with no or few additional monads.

Alternatively, you can have just a few people that understand how to compose Monads and many others writing code within them.

I'm not sure if I think it's a best practice by any means, but it's a pattern I've seen that avoids having to include monad transformers in onboarding.


Also a much simpler alternative in my opinion to monad transformers is effectful:

https://github.com/haskell-effectful/effectful

Here's a talk on it:

https://www.youtube.com/watch?v=BUoYKBLOOrE


> Simon Peyton Jones is reported to have said that lazy evaluation has been kept "to keep us honest about side effects", i.e., that size effects become immediately painful with pervasive lazy evaluation.

IIRC he wrote or said that lazy evaluation was a mistake in hindsight.


Do you have a source for that? From what I recall, SPJ says that the grass is always greener on the other side. So if Haskell was strict by default, some of us would be pining for a lazy version of it.



Sadly no. I think it may have been the answer in an interview to what he would have done differently with all the hindsight experience he now has.



> Production users of Haskell do seem to be happy about types, years into projects, so I'd take away that types are a benefit, but I do wonder if one might get those same benefits in another language without the problems which seem to crop up with Haskell, such as laziness causing performance issues, monadic complexity, and difficulty hiring.

Rust. Rust is that language.

Rust's type system is good enough and close enough to Haskell without the footguns of other options (Java, C++, etc), with new paradigms that Haskell hasn't yet fully adopted (borrowing, etc) and consistency in build tooling, etc.

I hate to say it, but it's incredibly hard to suggest people run Haskell in production when Rust exists and you basically trade the difficulty of learning monads + mtl + ... for the difficulty of learning to live with the borrow checker.


> Rust's type system is good enough

Good enough for what?

> close enough to Haskell

Close enough for what?

> with new paradigms that Haskell hasn't yet fully adopted (borrowing, etc)

Honestly, I don't want to think about borrowing. For my gamedev/rendering tasks I have resource regions (as a library) and trust the runtime to do its thing.


Rust is good, but the experience of writing Rust isn't like the experience of writing Haskell.

Here is a good post on it:

https://serokell.io/blog/rust-vs-haskell


Really? I can’t imagine people turning off the lazy evaluation. It’s a deep part of the language. You can force things to evaluate strictly, but that’s not disabling it.


OCaml may be seen as a more pragmatic solution.


It cannot.

Arbitrary side effects, allowed by a strict-by-default language without types to constrain them, ruin the ease of software development that Haskell provides.

When your Haskell program works and walks, add bangs and it will run. Lately I've heard from my colleagues that it will run even without bangs, as GHC has improved.


I find that too:

Purity is Haskell's fundamental killer feature.

Other languages now also have quite sophisticated type systems, some of which can do specific checks beyond Haskell in certain areas (Rust's control flow analysis, TypeScript's ways to do "gradual" typing of JS), while Haskell can do more in general.

But only in Haskell can I look at a function's type and know it doesn't do any IO.

In development and production, this allows you to immediately skip over large amounts of code when debugging issues. Pure functions are a key reason why Haskell is good at correctness and maintainability.
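
A tiny sketch of what that looks like in practice (the names are hypothetical): the first signature rules out IO entirely, so that code can be skipped when chasing an IO-related bug.

    import Text.Read (readMaybe)

    newtype Port = Port Int deriving Show

    parsePort :: String -> Maybe Port            -- pure: no IO can happen in here
    parsePort s = Port <$> readMaybe s

    readPortFile :: FilePath -> IO (Maybe Port)  -- IO is visible in the type
    readPortFile path = parsePort <$> readFile path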


I used to entertain myself embedding various languages into Haskell, mostly hardware description related.

Usually, one goes with a static parameterization of a monad. But [1] shows that it is possible to have a dynamic parameterization, where underlying types of a state can change.

[1] http://blog.sigfpe.com/2009/02/beyond-monads.html

This means that we can control state types and, borrowing from the ST monad, we can control the introduction of resources and their use. If a certain free type variable is in the state, we can use the associated resource. Deletion of a resource is the deletion of a free variable from the state parameterization.

All this is from 2009 or so.

The borrow checker of Rust is a library in Haskell, if you are patient enough.

The purity of Haskell made many of these libraries possible, exactly because we can control different side effects.


Nim and D both have this feature. Among more popular languages, C++20 constexpr comes close: if you see a constexpr function, you know it's a "pure" function. Although there are still some things that can't be done in constexpr functions, the list is getting smaller and smaller with every new version.


Effect tagging in first order, monomorphic code is easy. Doing it in the higher order and polymorphic case is the tricky bit. Do Nim and D support that?


constexpr reduces general side effects. Can we reduce specific side effects?

For one very important example, consider software transactional memory. It has general side effects in the implementation behind the scenes, but inside transaction code they are not allowed.

What will C++ solution look like? stmconstexpr?


GCC has this: https://gcc.gnu.org/onlinedocs/libitm/C_002fC_002b_002b-Lang...

But I haven't seen any use for it, besides every GNU/Linux program having code linked in to call _ITM_registerTMCloneTable in case libitm is used.


I thought STM was dead. Seems to be minimal adoption within clojure and all the hardware which tried to do it has been disabled for being broken. What makes it a particularly important example?

The C++ plan of attack probably is more syntax though. Maybe parameterise constexpr, constexpr<input_only_io> or similar.


STM is very pervasively used in Haskell. It's usually among the first tools a production Haskeller would reach for when dealing with concurrency. It has died in other ecosystems mainly because it really shines when you have a lot of pure functions and a static type system, which is really up Haskell's alley. Any impure functions really spoil the soup.

Clojure's STM implementation is missing crucial combinators to combine different STM actions compared to Haskell (e.g. Haskell's `orElse`), which makes it much less useful. It's also more difficult to remember how to compose STM actions in a dynamically typed language, since you can't directly inspect the action at the REPL in the same way you would do with a simple data structure, which means those few Clojure codebases which use STM don't tend to use it pervasively, but instead confine it to a single place, because it gets too easy to make the equivalent of type errors in Haskell (but which are much more difficult to diagnose because they're not really type errors from the perspective of Clojure).

Or to put it another way, Haskell's STM implementation focuses on STM actions as first-class entities, which allows you to have all sorts of reusable actions scattered around your codebase or even stored in data structures (e.g. a list of actions). This is also why there are combinators for combining different STM actions (e.g. `orElse`) rather than just operators on transactional references.

This means you can build up a rich library of actions in different places throughout your codebase, then at the "very last moment" in your program you atomically combine different STM actions together (ensuring through atomicity that they all see a consistent view of the world).

Clojure's STM implementation on the other hand focuses exclusively on transactional references, which means it's difficult to build up a library of STM action "lego pieces" that you can reuse throughout your codebase. You can try to retrofit the same thing, but then you run into mismatches between different actions which is what I alluded to with type errors. This lack of composability is my view as to why STM has mostly withered in Clojure.
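
To illustrate the composability point, a minimal sketch using the standard stm package (the queues and names are mine): two independently defined STM actions are combined with `orElse` into a single atomic transaction at the call site.

    import Control.Concurrent.STM (STM, TQueue, atomically, orElse, readTQueue)

    data Job = Urgent String | Normal String deriving Show

    -- Two STM actions defined separately, as ordinary first-class values...
    takeUrgent :: TQueue String -> STM Job
    takeUrgent q = Urgent <$> readTQueue q

    takeNormal :: TQueue String -> STM Job
    takeNormal q = Normal <$> readTQueue q

    -- ...combined at the "very last moment": try the urgent queue first and
    -- fall back to the normal one, all within one atomic transaction.
    nextJob :: TQueue String -> TQueue String -> IO Job
    nextJob urgent normal =
      atomically (takeUrgent urgent `orElse` takeNormal normal)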


It is not dead in Haskell precisely because of controllable side effects. It is dead in some other languages precisely because of lack of controllable side effects. In .Net, for one example, STM execution was 4 times slower because of arbitrary modification of state [1].

[1] https://haskell-cafe.haskell.narkive.com/t6LSdcoE/is-there-a...

Also, the rule of thumb is that you have to use STM, and it is mandatory, if there is more than one developer working on a project or if you have more than one module in your solo project.


> I thought STM was dead.

My company codebase makes generous use of STM in Haskell. STM is the first concurrency solution I reach for in Haskell and I believe it's the idiomatic thing to do.


As for the importance of the example, it goes like this: a doubly linked list is a first-year programming exercise on a single processor; a doubly linked list is a master's thesis on a multiprocessor if you use locks, etc.; and it is again a first-year programming exercise if you use STM on a multiprocessor.

STM simplifies parallel execution of atomic consistent transactions significantly.


(I say "only in Haskell" disregarding some more sophisticated or specific languages such as Agda, Idris, Futhark, etc. but those are a lot more niche.)


Interesting interview. I applied for a Haskell job at Standard Chartered in Singapore about eight years ago. I didn’t really expect to get the job because the only Haskell experience I had was my own Haskell projects and I had written a short book on Haskell. The guy interviewing me and reviewing my take home programming test had done a PhD studying aspects of Haskell. It was an interesting experience for me.


Did you get the job?


No. The interviewer suggested that I get one year experience elsewhere, then reapply for the job. I started to do that, but a month into a Haskell gig, I received an offer to manage a deep learning team at Capital One, and pivoted.


I worked in that team from 2012 to 2014. I don't have any degree and didn't publish any books.

Perhaps they made the interviews more difficult since, or you got unlucky?


I just don’t think I had enough Haskell experience.


You won't need a Haskell PhD to work at a bank, even one that uses Haskell.


While Haskell jobs may appear scarce, Standard Chartered is often recruiting for Haskell devs:

https://discourse.haskell.org/search?expanded=true&q=Standar...


Thanks for finding these! More directly, here's my latest jobs post: https://discourse.haskell.org/t/haskell-jobs-at-standard-cha...


I'm shocked, in the first paragraph they assert that they handle 6 million lines of code. Is it common for the fintech industry to have codebases of that magnitude? I'm aware that for example OSes and browsers can achieve similar sizes (if I'm not wrong certainly more) but I'd have never thought that a fintech company could approach such size. It just seems way too much...

Edit: Question out of ignorance for people that regularly program in Haskell in the industry: I want to believe that the language has nothing to do with this and is just their business logic, right?


6 million is easy, in fact its probably a small-ish fraction of their total code as a firm. Especially for a bank where their processes are a bit more defined than (say) a hedge fund.

What do you think fintech companies do? (I'm not entirely sure either but I can potentially give a flavour of finance coding in general)

If you want to come up with prices for things then you'll need a bunch of mathematical optimization code, interpolation code etc, then an enormous chunk of market conventions (Quick! How do you calculate the yield to maturity for a Brazilian treasury bill?).

Having established this you will then need to build a system for getting market data to the right places to be used - and probably snapping data for historical record too. This could be as simple as drinking from the Bloomberg firehose, or reading from an SFTP server in the morning (no really), or pulling in trading data from an exchange (https://www.eurex.com/ex-en/data/trading-files/efpi for example). It all gets very messy.

And now, if you are a bank rather than merely a player in a market you also probably have your fingers in many fiddly boring pies too (e.g. retail banking, mortgages stuff like that).


That’s a fairly accurate overview. If you’re a bank, there is also code to handle market parameters/fits, software to handle positions/booking, exotic payoffs ( and the accompanying compiler to compile the dsl ), regulation, GUIs and custom tools for trading/sales, risks, etc.

6 million sloc is indeed a fraction of what you’ll find in an investment bank. It’s also (I think) what makes it very interesting: an IB is a huge software system, with a mixture of extremely legacy (30y) and bleeding edge tech, and very strange human made rules.


6 million lines of Haskell. A language known for the ease of creating domain specific languages and thus writing the business logic declaratively. You would assume that is what is also done in Standard Chartered.

6 million lines of Haskell maintained by around 40-50 engineers. Sounds like a lot of tech debt in the making.

Maybe the 6 million lines includes their fork of GHC?


40-50 people is tiny indeed, but Haskell is supposed to help quite a bit here. 6 million lines of c++ and 50 people could be very hard. I’ve seen 4 million sloc of c++ and 80 eng, and it is hard.

Investment banks are extremely complex systems - 6 millions sloc, even in a very expressive language, make sense to me.

What’s harder to grasp is the real complexity of an IB - it took me years to get that. 30 years of random, non homogeneous software evolution, coupled with 30 years of funky human made rules in 200 countries. It’s going to be big and messy! And I really wish it were easier.

To some extent, it’s a bit like Unicode, or font rendering, or the tax code : on paper it’s just a bunch of rules, but for real, it’s big and complex.


Perhaps it also includes (quasi) dead code that is not used anymore because regulations changed, and code for one-off analyses and experimentation?


I don't know about Standard Chartered specifically, but I remember one of the big US banks (I forget which one. JPMC or Goldman Sachs, probably - it was a while ago) telling us that they had more developers than Microsoft.

Don't sleep on these finance companies, they're big development shops.


Jamie Dimon said that they have enough engineers to build their own Windows, back in 2007/2008.


> I'm shocked, in the first paragraph they assert that they handle 6 million lines of code.

I work at a bank.

To answer your question - yes. The consumer website for the bank I work at was 9 million lines of code (at least as of 2019). A lot of it was bad code, some of it was dead code that was never deleted.


Yes it's hard to imagine what all that code's doing.

But I'm currently at a very small fintech, and I've got about half a million lines of code checked out. There is a lot of copy-pasted code, but I also have nowhere near all the repositories checked out.


I work on a trading desk at a financial firm. I just checked out all our applications, and they total 388444 non-comment lines of Java, 73889 of JavaScript, 13619 of Python, 9160 of C++, and some other things. The JS probably includes some vendored libraries. Also, this is just apps - we have a bunch of libraries used across our apps which will add tens or low hundreds of thousands more lines, but i have no systematic way to check them out, so i can't tell.

That's a single trading desk!


So many questions left unasked! In particular they have "recursion disabled by default" but "we have all the usual recursive combinators". Does this mean that they can't explicitly write recursion but may achieve recursive effects only through the combinators? I'd like to know more about this. Is it as expressive as recursion only through "properly optimised" combinators?


By combinators they mean things like folding/reducing over lists, i.e. combinators/higher order functions for operating on potentially infinite structures.

It's not as expressive as raw recursion since many uses of recursion are hard to express through combinators, albeit a great deal of research has been done in this area (look up recursion schemes).
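
As a rough illustration (plain Haskell, not Mu's actual combinators), here is the same computation written with explicit recursion and with a recursion operator:

    total :: [Int] -> Int
    total []       = 0
    total (x : xs) = x + total xs   -- explicit recursion

    total' :: [Int] -> Int
    total' = foldr (+) 0            -- same result; the recursion lives inside the combinator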


The reason is that we don't have a general solution for tail-call optimisation for computations made of a mixture of Mu code and C++ primitives. We have higher order functions defined in C++ that can be called from Mu, and can call back Mu functions passed to them. This back and forth between languages has occurred naturally in recursive definitions, and we don't have a way of reusing stack frames between the Mu interpreter and the native C++ code for these higher-order functions. We decided a long time ago to disable recursion by default, encouraging programmers to use recursion operators. We do however have a Mu language extension that enables recursion, which is used when the recursion depth is known to be bounded.


It has been argued that unrestricted recursion is like GOTO. I have some limited experience with "recursion schemes", they're quite expressive. But I feel like it's trying to give every possible pattern of recursion a name, to the point that it's almost becoming a joke - e.g. "zygohistomorphic prepromorphism".


I'm struggling to understand how possibly inventing your own language -- one not even compatible with the Haskell eco-system -- was ever a good idea in production.


It was written to replace an existing functional DSL called "Lambda", and to target Lambda's runtime.


Reminds of an epistemology joke.

The reason the moon is made of cheese is because it is made of yellow cheese.


Main reasons at the time:

- Strictness was preferred for consistency 1) with the C++ execution model (a lot of our code base is written in C++ by quants, and a typical Mu program will make extensive use of these external calls; they can even be higher-order, calling back into C++); and 2) because we have a lot of end-users who are not Haskell experts and are much more comfortable with a strict execution model.

- Code mobility is essential to what we do, and was not directly available in Haskell: we often capture computations on our users' desktops running Windows, and stream them to a Unix compute farm.

- GHC was not openly available as an API years ago; this is different now, which is why we have started a new implementation that is GHC-based.


Why is it any different to "bank python" used by e.g. JPMorgan or BAML? Or Google writing Dart?


I don't think they ever touched the interpreter. It's more like a huge framework on top of regular py.


It's an entire fork of the Python ecosystem. So not really that much different to what we have here. Maintaining an ecosystem of libraries and tools is the hard part, not a compiler or interpreter.


I guess I have the relatively unique distinction of having worked at Standard Chartered and used Haskell/Mu in anger from 2012-14, as a currency/interest rates options trader using it for volatility surface mapping and making pricing/market data tools. Offering thoughts:

> I’m the head of the Core Strats team at Standard Chartered Bank. Core Strats consists of over 40 developers using Haskell (well, actually Mu) as their main programming language.

wow - this team used to be 8-10 people? amazing to see it's grown so much. our old head used to be https://donsbot.com/about/ . i have a pet theory that Don and gang ensured Haskell's viability by building so much stuff at SCB that SCB was kind of pot-committed to Haskell, thereby ensuring that they would keep investing and hiring until the end of SCB. Similar move was played by Cheng Lou over at Facebook, for Reason/Rescript. if you want to grow a programming language at a bigcorp, embed it in a biz critical application and you have job security for life

> Our core financial analytics library, Cortex, consists mostly of Mu/Haskell code (over 6.5 million lines) and forms the basis of our price and risk engine across all asset classes. It’s also pretty full-stack; we use it for everything, from server-side long-running batches handling millions of trades a day to front-end desktop graphical user interfaces.

yep i can attest to this - i will say that Mu was the best developer experience i've had for a long time. It comes with its own storage/persistence layer (simple KV store iirc) but the joy of not having to choose or maintain or scale was great. also the entire internal package ecosystem was searchable via function signature (via Neil Mitchell's Hoogle) and versioned daily (so if any regressions happened you could just jump back day by day - we didn't use git in my day). i also missed if there was a testing suite.

> We can seamlessly call C++ functions from Mu and vice-versa, and pure Mu functions can be used in Excel formulae via our Excel add-in

yeah this was used a bunch. however the traders rarely touched this code so it probably required a lot of support from the devs and probably some misunderstandings where had.

> What benefits do you see for using Haskell in banking and fintech areas? One great advantage is static typing..

no no no. correct me if i'm wrong but don't large fintechs like stanchart and jane st use functional languages because it's so much better to parallelize for risk scenarios and complex option pricing?

> Does Standard Chartered have an in-house training program for upskilling Haskell developers? For training absolute beginners in Haskell we typically engage an external training partner.

ha. in my day i was just given Learn You A Haskell and told good luck!

SCB people reading along, i hope some of my FXO apps are still around. Coriolis, Corona, and i'm blanking on the main pricing one. 10 years wipes away a lot.


> Mu can easily take several minutes to compile a single large application. We will circumvent those problems by switching our front-end to use GHC directly

What does he mean by this?


I may be wrong, but if I remember correctly Standard Chartered uses their own “dialect” called Mu[1].

1 - https://icfp21.sigplan.org/details/hiw-2021-papers/14/Haskel...


He probably means that GHC will parse and type check faster. Switching the front-end doesn't sound like it will circumvent the whole program nature of their compilation. But I'm just guessing.


From what I understand: GHC takes care of the parsing of the source file, error reporting, etc. Then the AST (or an intermediate representation) is passed to the rest of the Mu compiler, which performs whole-program optimisations.


Gergo Erdi gave a talk on the GHC efforts of Standard Chartered last year:

https://www.youtube.com/watch?v=fZ66Pz7015Q


Gergo and I are working on a new implementation of the Mu compiler that uses GHC up to Core; from which we translate to the legacy Mu bytecode format.

The original Mu compiler is not designed for separate compilation, our new version is, based on GHC's binary representation of compiled interfaces with unfoldings.


I'm also interested. Just guessing, but I guess it means that while they use their own (slow compiling) 'Mu' dialect for most applications, they have switched to using 'standard' GHC for their front-end apps.


Frontend here refers to the first stage of the three-stage structure of compilers, ie. the functionalities such as parsing and type checking that come before the optimization and target code generation phases: https://en.wikipedia.org/wiki/Compiler#Three-stage_compiler_...


Didn't know that, thanks! But then I'm wondering how they can do parsing and type checking with GHC when they are using their own dialect, "Mu". How come GHC can parse that?


The differences between Mu and Haskell are not that large. They use GHC as a library in their front end. Gergo Erdi gave a talk on it last year:

https://news.ycombinator.com/item?id=35801844


GHC has a system for language extensions so they have developed a few proprietary ones for Mu. For the standard ones, see https://downloads.haskell.org/ghc/latest/docs/users_guide/ex...


I'm just starting to learn Haskell and I honestly can't fathom how it gets written in production. Everything requires so much more consideration, which is fun and valuable in some situations, but not when you just need to get the damn thing done in time for a deadline.

I'm not experienced enough to know, but it does feel like Haskell prioritizes "clever" code. Very beautiful, but hard to understand at a glance.


You can get features done very quickly in Haskell after the initial learning curve that most languages have. The type system lets you catch a lot of mistakes before runtime and in my experience things are more likely to work the first time in Haskell.

Testability does require some forethought but if your codebase always takes that into account then it's as easy as other languages.


Sounds like you just need more practice. Haskell is more expressive than most other languages and offers much more consistent and powerful abstractions. This does result in more productivity once the learning curve is overcome.


> I'm just starting to learn Haskell and I honestly can't fathom how it gets written in production. Everything requires so much more consideration, which is fun and valuable in some situations, but not when you just need to get the damn thing done in time for a deadline.

There are ways to write quick n dirty Haskell. Actually a super-power of Haskell is how easy it is to refactor from "quick n dirty" to "reasonable quality" without introducing defects.


> Actually a super-power of Haskell is how easy it is to refactor from "quick n dirty" to "reasonable quality" without introducing defects.

I call haskell "the language of maintainable spaghetti" in this spirit. It really is incredible how types can take 99% of the risk out of refactoring.

Indeed while I miss machine refactoring tools from Java, I've seen vanishingly few bugs result from manual refactoring in haskell.


It's sad you're getting downvoted only because you've commented (ever so slightly) negatively about people's language du jour.

I would argue that "Can people understand a unit of code without experience in that language/domain" is valuable, even if not the _most_ valuable aspect of a language. I also think dismissing this desire is usually an act of gatekeeping. "_I_ understand it so it's fine."


> We can seamlessly call C++ functions from Mu and vice-versa, and pure Mu functions can be used in Excel formulae via our Excel add-in.

wtf


What's weird about that? Excel is absolutely everywhere in finance, so if you have a special programming language that you consider to be part of your secret sauce it makes sense to integrate it into excel as well.

A friend who works as a risk manager at a major bank sometimes shares horror stories about the giant spreadsheet they have for some part of their holdings. They have disabled the "automatic recalculation" feature because it takes over 48 hours to recalculate everything. This is after running it through a special Excel compiler to speed it up.


Excel is used everywhere and for very important things.

I just never... thought about writing a Haskell plugin for it. It makes sense! Better than using some other language. Functional for Excel makes sense.

I love hearing horror stories about small (and large!) governments and companies abusing and misusing Excel. https://sheetcast.com/articles/ten-memorable-excel-disasters


It makes perfect sense; non-technical users and employees with different specializations (like actuaries) use lots of Excel, especially in fintech; and if you can share function implementations verbatim between your codebase and Excel, you get higher confidence for coherence.


It's not very different from using python to hook C/Fortran libraries together.


> Haskell in Production

Does not use Haskell.

Is it just me or is this... weird?


Sure, it's a bit weird but not that weird as they have a huge codebase in a specific domain. Also, they still end up using a lot of Haskell code (and semantics) although they write in a thin layer of "DSL" on top of Haskell and GHC (which itself is a lot of "Haskell in production").

> Mu has a strict runtime, which makes program performance easier to analyse and predict – but prevents us from just copy-pasting some Haskell libraries that rely crucially on lazy evaluation.


From what I understood they don't use Haskell underneath at all, but they copy paste code from Haskell and sometimes have to modify it.


They say their codebase includes 400k lines of Haskell code (plus 4.5M lines of Mu). Also, they use GHC which is a huge Haskell program, and they customize it with their own extensions etc.

As they copy Haskell library code and only sometimes modify it, I would count it as Haskell code until they modify it.


Mu is essentially Haskell. It's not clear to me where you got "does not use Haskell" from.


It's quite clearly different in many important ways. Not being lazy is a pretty huge one!


Ok, but we're not quite talking about the difference between Haskell and Clojure, for example.


Sure. But it is almost the difference between Haskell and OCaml.


I suppose it's all relative. Given some selection of enabled language extensions, are you still writing Haskell? I think in the minds of most people (at least on this forum) the difference is academic. Whether a given body of work in Haskell looks like Elm, or if it's all GADTs, data kinds, and type families, they'd both be met with "Surely nobody actually uses this in production! This is ivory tower nonsense! I can't read it! It's not practical! I just want to get stuff done!" ad infinitum.


I write TypeScript in production but I'm happy to call it JS


Should probably be 'Dialect of Haskell in Production'


Should probably look into Haskell. I have absolutely no clue what it's best suited for.


Here are some stories of what it's actually used for: https://old.reddit.com/r/haskell/comments/12vhz1c/genuine_qu...


Ha! I was also curious. Thank you for sharing.


Please allow me to share a blog post and a learning resource:

- Consider Haskell - https://gilmi.me/blog/post/2020/04/28/consider-haskell

- Learn Haskell by building a blog generator - https://lhbg-book.link


You might find this survey of Haskell's maturity of different use cases very useful:

https://github.com/Gabriella439/post-rfc/blob/main/sotu.md


I have yet to find a place in my own work where Haskell would strictly be a better choice, but god damn its fun to write.

That's not a knock on Haskell either, I have a more bog standard job than it likes.


It's a general purpose language, I wouldn't say it's more or less suited to this or that, other than the libraries available for it.


It is a garbage collected language so it is generally less suited for real time applications. As far as I am aware there aren't any implementations designed to run on very low memory embedded platforms.

As far as I am aware GHC's collector is not as good as Java's.


Writing parsers and tokenizers probably.
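
As a small, self-contained sketch of why this is a comfortable fit (all names are mine, no parsing library needed), here is a toy parser-combinator tokenizer:

    import Data.Char (isDigit)

    newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

    -- Consume one character satisfying a predicate.
    satisfy :: (Char -> Bool) -> Parser Char
    satisfy p = Parser $ \s -> case s of
      (c : rest) | p c -> Just (c, rest)
      _                -> Nothing

    -- One or more repetitions of a parser.
    many1 :: Parser a -> Parser [a]
    many1 (Parser p) = Parser go
      where
        go s = case p s of
          Nothing        -> Nothing
          Just (x, rest) -> case go rest of
            Nothing          -> Just ([x], rest)
            Just (xs, rest') -> Just (x : xs, rest')

    -- A number token: one or more digits.
    number :: Parser Int
    number = Parser $ \s ->
      case runParser (many1 (satisfy isDigit)) s of
        Nothing         -> Nothing
        Just (ds, rest) -> Just (read ds, rest)

    -- runParser number "123abc" == Just (123, "abc")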


That's what it's best for, but personally I use it for everything. If I ever get into low-level code I'll probably use Rust though.

You can confirm that parsers/tokenizers is ranked "best in class" here though:

https://github.com/Gabriella439/post-rfc/blob/main/sotu.md



