Good article. Mirrors my own experience in three ways:
1. Immutable datastructures make it a lot easier to reason about data flow and to write correct code.
2. Static typing helps write cleaner, more explicit APIs.
3. Programming effectively in a functional style is more difficult than banging out imperative code.
I wonder why they didn't try Scala. With Google's extensive Java infrastructure that seems like a more logical choice. Certainly the problems they discuss with Haskell's string handling and debugging support wouldn't be issues in Scala.
As I tried to explain in the paper, this started as an experiment, and not "let's write this in Haskell and use it in production".
But aside from that, the cluster balancer runs in just 5MB of RAM (RSS). I'm not familiar with Scala, but I doubt any JVM can run in that little; and the way we deploy this software, we want it to use as little memory as possible.
>3. Programming effectively in a functional style is more difficult than banging out imperative code.
As someone who is just starting to learn functional programming, do you believe this is because "we've" all been taught to think imperatively, and functional is just new to us? Or is it something inherent in functional style coding?
I think both are true. It is true that we aren't good at functional programming because we are not practiced. It is also true that functional programming makes some things intrinsically harder to do than in the imperative case. This is because modern functional programming gains its power by restricting the programmer, and building on those restrictions.
(Sometimes I get in trouble when I try to distinguish between "new functional" and "old functional", but despite superficial similarities they are diametrically philosophically opposed. Old functional, embodied by Lisp, works by trying to empower the programmer, and builds on that; new functional works by restricting the programmer and building on what guarantees we get from those restrictions. Haskell and Lisp may share some terminology when it comes to list manipulation but I would actually put them very, very far apart in my grand map of programming languages; they take one step together, the first lambda calculus step, and then immediately begin sprinting in opposite directions.)
Lisp was always multi-paradigm and supported functional programming.
I'm also not sure I'd go as far as to say "restricts" so much as "demarcates". Composing existing code has been very difficult in imperative languages (practically impossible when you have to deal with things like memory management and/or threading).
Some problems decompose cleanly that way, some don't. It's not hard to design a system where most is functional, though, and that's still useful. You've probably done that without considering it "functional programming"™, just trying to cut out global variables and whatnot. Decades ago, "structured programming" was the new thing, and manifestos appeared about how gotos were "considered harmful".
If you have a system where 95% of it is functional, cool, you've eliminated 95% of the places where you have to double-check for dataflow/mutability-related bugs. (To heavily paraphrase jerf's comment: functional programming is about accepting restrictions that make debugging simpler.) Make the system as clean as feasible, but don't get bent out of shape trying to get the last 5% FP-approved. It's probably less trouble to just know that's the hairy part.
The big idea in functional programming is that working with immutable data structures makes a lot of annoying bugs go away, and goddamnit, they're right. Does that mean you need to go 100% immutable? No - it's just a good default.
FP-style programming without (generational) garbage collection is really awkward, though - immutability means you're going to be creating a lot of temporary data structures. Good language implementations will recognize this and just stack-allocate them or let them go in the first collection.
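To make that concrete, here's a minimal sketch of the persistent-update pattern using the standard Data.Map from the containers package (the "hits" key and the bump name are just for illustration):

    import qualified Data.Map.Strict as Map

    -- Inserting into an immutable map returns a new map; the old one
    -- stays valid and shares most of its structure with the new
    -- version, so the "temporary data structures" are cheaper than
    -- they look, and the short-lived ones die in the first collection.
    bump :: Map.Map String Int -> Map.Map String Int
    bump m = Map.insertWith (+) "hits" 1 m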
You're right, but an FP language pushes you to do it and makes it easier. Why should I tear my arm off trying to do FP in Java or C++ when I can just use Haskell, or a language that isn't as strict as Haskell?
I'd say to all intents and purposes you can't do FP with Java, or at least you can't leverage some of the advantages of doing so, because you simply don't know where in the libraries there might be some mutable state. In that sense, Java's greatest strength, its vast ecosystem, is a fundamental weakness. The same is true of other JVM languages such as Scala and Clojure as soon as you start to leverage third-party JARs.
With Haskell, you know that something that claims to be pure is.
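For what it's worth, a minimal sketch of what "claims to be pure" means at the type level (the function names here are made up):

    import Data.Char (toLower)

    -- Pure by construction: no IO in the type, so no hidden mutable
    -- state or side effects are possible.
    normalize :: String -> String
    normalize = map toLower . filter (/= ' ')

    -- Effects must be declared in the type.
    loadConfig :: FilePath -> IO String
    loadConfig = readFile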
No, you aren't. In OCaml, you can use mutable references, they're just not the default. Erlang has process dictionaries. Etc.
Haskell is unusually strict in that regard, turning what is normally a design option into an all-or-nothing situation.
Edit: I know there are a few ways around it via monads, but the culture surrounding Haskell seems quite a bit more likely to treat state/purity as an all-or-nothing issue than other languages' cultures do. Maybe I'm making inaccurate assumptions about Haskell based on the loudest members of its community, though.
Not really. Mutable refs are provided by ST, which allows for chunks of your program where you are mutating state, but where the mutable state can't leak outside of boundaries you set.
(There is also the good-old-IORef, but I avoid IO for anything that's not input or output.)
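A minimal sketch of the ST pattern (sumST is a made-up name; the imports are from base):

    import Control.Monad.ST (runST)
    import Data.STRef (newSTRef, modifySTRef', readSTRef)

    -- Mutation happens inside runST, but the result is a pure Int:
    -- the type system guarantees the STRef cannot leak out.
    sumST :: [Int] -> Int
    sumST xs = runST $ do
      acc <- newSTRef 0
      mapM_ (\x -> modifySTRef' acc (+ x)) xs
      readSTRef acc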
I think it's fundamental. We don't go through our lives recursively. We do one thing, then we do another. When I first started writing ML, I found it convenient to assume my program already worked, and then write the next step. I don't think this is something people are naturally very comfortable with doing.
Good question. My opinion is that thinking in terms of recursive transformations of immutable datastructures is just fundamentally more difficult than the imperative equivalents. My earliest exposure to recursive algorithms was in AP computer science in high school and I remember a lot of the students really struggled with them, both those with previous imperative experience and without. This extra effort can pay off over time though because it makes the overall program easier to understand and incurs fewer bugs than the imperative style. You can also avoid a lot of this if you express your functions in terms of other elemental sequence functions (map, reduce etc) instead of explicit recursion.
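As a small illustration of that last point (standard Prelude functions; the example itself is made up):

    -- Explicit recursion: the shape of the traversal is spelled out
    -- by hand.
    sumSquaresRec :: [Int] -> Int
    sumSquaresRec []     = 0
    sumSquaresRec (x:xs) = x * x + sumSquaresRec xs

    -- The same function via elemental sequence functions: the
    -- traversal is delegated to map and sum, leaving only the
    -- per-element logic.
    sumSquares :: [Int] -> Int
    sumSquares = sum . map (^ 2)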
Doubtless others will disagree though. Maybe we don't all have the same mental strengths and weaknesses.
I think many programmers find it hard because experience with Algol-derived languages gives them bad assumptions - they expect all of the conceptual baggage of mandatory stack allocation, so recursion seems ludicrous and complicated.
Recursive cases in languages with tail-call optimization are like inductive reasoning in mathematics - "How long is a list? If it's empty, zero, otherwise one (for the head) plus however long the rest is."
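In Haskell that reads almost verbatim (len and lenTC are made-up names):

    -- Direct translation of the inductive definition.
    len :: [a] -> Int
    len []     = 0
    len (_:xs) = 1 + len xs

    -- Tail-recursive variant with an accumulator; an implementation
    -- with tail-call optimization runs this in constant stack space.
    lenTC :: [a] -> Int
    lenTC = go 0
      where
        go acc []     = acc
        go acc (_:xs) = go (acc + 1) xs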
Chalk me up as someone who disagrees. Personally, I find recursive transformations of immutable structures far easier to reason about than iterative modifications to mutable structures.
At least in some cases, I think that's the expected result: consider the difficulty of reversing a singly linked list in-place and reversing it applicatively.
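For comparison, the applicative version is a one-liner, while the in-place version has to juggle three pointers without losing the rest of the list. A sketch (rev is a made-up name; foldl and flip are standard Prelude):

    -- Persistent reversal: fold each element onto the front of an
    -- accumulator. The original list is untouched and still shareable.
    rev :: [a] -> [a]
    rev = foldl (flip (:)) []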
Perhaps the difficulty is in designing the appropriate data structures and identifying the correct transformation functions. In other words, designing a good FP program is difficult just like designing a good OO program is difficult.
It definitely depends on the algorithm. It also depends on the language. A lot of the difficulty of learning Haskell, for example, is in the type system. A dynamically typed FP lang like Clojure has a shallower learning curve.
It is true, but it's also unfortunate - I think a lot of Haskell's strengths come from its type system. So what makes Haskell worth learning also makes it hard to learn Haskell :)
Haskell's type system is unfortunate, because for most programmers, 80+% of the benefit comes from how it utterly nails the basics (checking variant type coverage in pattern matching, typeclasses), but the language community is focused on the cutting edge top .003%, because that's where the research papers are.
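For anyone who hasn't seen that basic win, a sketch with a made-up JobState type (the warning flag below is part of GHC's -Wall):

    {-# OPTIONS_GHC -Wincomplete-patterns #-}

    data JobState = Queued | Running | Done

    -- If someone later adds a constructor to JobState, GHC flags
    -- every pattern match that doesn't handle it.
    label :: JobState -> String
    label Queued  = "queued"
    label Running = "running"
    label Done    = "done"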
After working through a few of my own Haskell projects, I eventually found the type system disappointing. A better design would have made it even more sophisticated and harder to learn, IMO; that's the price to pay for being purely functional. Instead, I felt that certain design trade-offs were made which, in effect, make the type system awkward and tend to lead to inflexibility when writing ambitious, comprehensive programs.
It's also unfortunate - I think a lot of Clojure's strengths come from its macro system. So what makes Clojure worth learning also makes it hard to learn Clojure :)
Hey, I think you're right, but that's where the language can help you. You don't have to do the recursion by hand every time. You can have language constructs that are just as easy as normal loops but compile down to recursion. And if you really do need explicit recursion, you would have had to do the equivalent work in imperative languages too.
When I worked at Yahoo in an operations role ("production engineering", Yahoo's equivalent of Google's SRE), my team did something similar: we used OCaml in combination with Perl for one task, developing a mini-language for describing large clusters of machines (ranges of machines, querying and describing metadata such as a machine's membership in a cluster, designating special nodes, descriptions of clusters). Although we ended up switching to C and lex/yacc for this mini-language (due to compiler issues after the switch to a 64-bit OS; no idea if those issues were ever fixed), it was a great experience.
I've since moved on to a software engineering role, but there's something to be said about the freedom (e.g., picking your tools) that exists when you're writing software that only you are responsible for.
Weird. It seems like they didn't really know Haskell going into this; not having heard of "Either String" and not knowing that you can get a stack trace when you run your code in ghci.
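For readers who haven't met it, "Either String" is the standard base idiom of returning failure as a value instead of throwing. A minimal sketch (parsePort is a made-up example):

    -- Left carries the error message, Right the successful result.
    parsePort :: String -> Either String Int
    parsePort s = case reads s of
      [(n, "")] | n > 0 && n < 65536 -> Right n
      _                              -> Left ("invalid port: " ++ s)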