Lisp as the Maxwell’s Equations of Software (2012) (michaelnielsen.org)
128 points by graderjs on Nov 9, 2022 | 87 comments


I thought the same thing for a long time; I found s-expressions very elegant. Then I discovered the lambda-calculus, with its scary syntax but based on a process I could understand: text rewriting. So I had the idea of exploring the lambda-calculus using s-expressions, and patiently rebuilt booleans, pairs, lists, recursion and, beyond that, a genuinely complete Turing-complete language. For instance:

- http://lambdaway.free.fr/lambdawalks

- http://lambdaway.free.fr/lambdawalks/?view=lambdacode5

No need for cons, car, cdr and other historical names. We can rebuild everything from scratch, from abstractions and applications alone. No need for Lisp, Scheme, Common Lisp, Clojure and the rest, even if they remain excellent examples. So in my opinion the Maxwell's equations of computation are to be found in the lambda-calculus rather than in LISP and its dialects. We just need to make the syntax a little more human.
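
For a concrete taste of that rebuild, here is a minimal sketch in Scheme (the names TRUE, PAIR, etc. are my own illustrative choices, not taken from the linked pages): booleans and pairs built from nothing but abstraction and application.

    ; Church booleans: a boolean is a function that picks one of two values
    (define TRUE  (lambda (a) (lambda (b) a)))
    (define FALSE (lambda (a) (lambda (b) b)))

    ; Church pairs: a pair holds a and b and waits for a selector function
    (define (PAIR a) (lambda (b) (lambda (s) ((s a) b))))
    (define (FST p) (p TRUE))    ; select the first component
    (define (SND p) (p FALSE))   ; select the second component

    ; (FST ((PAIR 1) 2)) => 1    (SND ((PAIR 1) 2)) => 2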


I agree that the Maxwell’s Equations of Software should be based on lambda calculus, the binary version of which [1] admits a much more concise 29-byte self-interpreter:

        (λ11)(λλλ1(λλλλ3(λ5(3(λ2(3(λλ3(λ1 2 3)))(4(λ4(λ3 1(2 1))))))(1(2(λ1 2))(λ4(λ4(λ2(1 4)))5))))(3 3)2)(λ1((λ1 1)(λ1 1)))
[1] https://tromp.github.io/cl/Binary_lambda_calculus.html


I agree with you, but so far I have never been able to get my head around the logic of de Bruijn indices, the way that, for example, John Tromp does. (https://tromp.github.io/) But I don't despair of getting there one day...


A picture is worth a thousand words: https://en.wikipedia.org/wiki/De_Bruijn_index#/media/File:De...

It's literally just notation:

- It doesn't matter how you name variables. (if you want to sound cool: "lambda terms are invariant under alpha-conversion")

- Each new variable introduces a scope.

- Therefore, how scopes are nested defines the entire expression.

- Instead of the usual names, we represent variables with numbers, the index saying "how many scopes up" the binder is. Imagine it's Python, and you're pressing tab that many times.
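
A rough sketch of that last point in Scheme (my own toy term representation: a symbol, (lambda x body), or an application list):

    ; Convert a named lambda term to de Bruijn form: each variable becomes
    ; a count of how many binders sit between it and the one that bound it.
    (define (to-de-bruijn term env)
      (cond ((symbol? term)
             (let loop ((env env) (i 1))
               (if (eq? (car env) term) i (loop (cdr env) (+ i 1)))))
            ((eq? (car term) 'lambda)
             (list 'lambda (to-de-bruijn (caddr term) (cons (cadr term) env))))
            (else (map (lambda (t) (to-de-bruijn t env)) term))))

    ; (to-de-bruijn '(lambda x (lambda y x)) '()) => (lambda (lambda 2))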


Syntax and layout are worth yet a few more orders of magnitude of words. But that argument has slipped past LISPers forever. I'll just let them keep scratching their heads over why Python got so popular when LISP was around for so long and is /clearly/ superior.


I don't think LISPers are scratching their heads about this. Lisp syntax is optimized for meta-programming. Generating and processing code is built in and supported with central mechanisms: READ, EVAL, PRINT, s-expressions, macros, ...

Python is optimized towards simple, pseudo-code-like imperative OOP. That's easier for most programmers.
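
To illustrate the code-is-data point, a two-line Scheme sketch (eval's environment argument is R5RS-style):

    (define expr (list '+ 1 2 3))           ; code built as an ordinary list
    (eval expr (interaction-environment))   ; the list evaluated as code => 6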


What world are you living in where popularity is a way of judging quality? The masses love Marvel movies and fast food.


Python is popular because the majority of programs are extremely simple.

Look at how ugly decorators are in Python, when in Lisp you get them for free and they don't even need a name.
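
For example (a minimal Scheme sketch; "logged" is a made-up wrapper, not a standard procedure), a "decorator" is just an ordinary higher-order function applied at the definition site, with no special syntax:

    (define (logged f)              ; wrap any procedure with a logging step
      (lambda args
        (display "calling") (newline)
        (apply f args)))

    (define add (logged +))         ; (add 1 2) prints "calling", returns 3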


> As for example John Tromp knows how to do.

You might not have noticed the userid of the person you're replying to ;)


yes, I hadn't noticed that. I wish I could understand him better.


Note that the person you were responding to is John Tromp. :-)


Lambda calculus is Turing complete. But so are Turing machines. So why not have Turing machines be the Maxwell's equations of software?

Maybe there's an infinite number of mathematical theories that are Turing complete. Why is lambda calculus chosen as the foundational one?

I would argue the Maxwell's equations of software should be MORE foundational, not some arbitrarily specific Turing-complete language.

This is what I think it should be (link is a comment in the same thread): https://news.ycombinator.com/item?id=33533321


Because Turing machines are so cumbersome compared to lambda calculus that a universal Turing machine takes thousands of bits [1].

[1] http://www.verostko.com/tur-doc.html#B0,%20Universal%20Turin...


I find Lambda calculus and term rewriting much more elegant and easy to understand. I don't think there is a simpler Turing-complete language (maybe Game of Life?).

For Turing machines I recommend the book "The annotated Turing".


Well, if the Game of Life is simpler, why not the Game of Life?

What does it mean to be the Maxwell's equations? Does it mean absolutely most fundamental? Most practical? What?

Maybe there's something more fundamental than Turing machines. Thanks for the book recommendation.


> What does it mean to be the Maxwell's equations?

In a word: elegant.


It means much more than just being elegant!

Maxwell’s equations are arguably the most successful existing physical theory. They are incredibly accurate over a huge range of scales. They are used in essentially unaltered form all over modern day engineering and have astonishing predictive power.

On top of this, they are a useful tool without modification! They are the working tool for all electrical engineers. They’re not some lower-level substrate that exists in the background. They are used directly to model and generate other, simpler approximate theories (such as geometric and physical optics) which are powerful and elegant in their own right.

I don’t think Lisp, lambda calculus, Turing machines, or similar can make this kind of claim.


The lambda calculus is useful without modification. It can also be used to model other, simpler languages or logics which are powerful and elegant in their own right.

So, yes, since I've used the lc in these ways I believe the same claim can be made regarding the lambda calculus.


what it means to say maxwell's equations have 'predictive power' is that we can

1. take a situation observed or designed in the contingent universe,

2. translate it into the abstract entities maxwell's equations talk about,

3. deduce consequences in the world of abstract entities,

4. translate those consequences back into the contingent universe, and then

5. find that the consequences in the contingent universe are within narrow uncertainty bounds we translated from the abstract world of ideas

the meaning of turing universality is precisely that any turing-complete programmable system can be used to model any other logical or mathematical system, including other turing-complete systems, in exactly the same way that maxwell's equations model electromagnetism

for example, you can model risc-v execution in lisp and predict what a risc-v processor will do, you can model lisp execution in the λ-calculus and predict what a lisp interpreter will do, you can model the λ-calculus in a turing machine and predict what λ-reduction will do, and you can model a turing machine in a risc-v processor and predict what the turing machine will do

there is a significant sense in which this sort of modeling is much more perfect than the kind done with maxwell's equations

when we apply maxwell's equations, we are subject to measurement error in steps 1 and 5; our measurements are never complete and correct, and heisenberg's uncertainty principle strongly suggests that they never can be. and in step 3, because maxwell's equations are continuous-time continuous-space differential equations, we often also introduce numerical error in our calculations as well, because we usually have to integrate them numerically rather than algebraically

on the other hand, in the case of computational universality all the entities being discussed are discrete, algebraic, mathematically abstract entities, so our simulations are absolutely perfect unless we run out of memory or suffer a rare hardware error

obviously these universal machines are not limited to modeling other universal machines; we can also use lisp or turing machines or risc-v processors to model things like gravitation, taxation, or maxwell's equations. and they are obviously also the main working tool for all electrical engineers today, having displaced slide rules and load lines generations ago

ultimately, though, we are also using maxwell's equations (and other equations describing electromagnetism, like the ebers-moll transistor model) to design our electronic computers which we use to simulate lisp


I think you're confused about the premise of this discussion. The comparison is not between Maxwell's equations and lambda calculus, it's between lambda calculus and other theoretical constructs in CS.

Maxwell's equations are "the Maxwell's equations" of physics because they are "the most successful" physical theory: i.e., incredible accuracy, wide applicability, and concision. This is relative to other physical theories. (I guess you could debate whether this is actually the case---obviously this is all subjective, and not everyone agrees. Maybe someone thinks Einstein's field equations should really be "the Maxwell's equations" of physics.)

To say that the lambda calculus or anything else is "the Maxwell's equations" of computer science is to say that it has the same set of properties relative to other theoretical constructs in computer science. But is this actually the case? It seems like Turing machines, lambda calculus, etc. are all similar in terms of relative modeling utility and concision. I disagree that they are widely applicable---in practice, just about any commonly used programming language is far more widely applicable than any of these. And as you pointed out, by Turing universality, they're all logically equivalent---so the question of difference in modeling "power" is not relevant here.

I guess I just don't think it makes much sense to talk about anything in CS being "the Maxwell's equations" of CS. In physics, a wide variety of disparate physical phenomena are being modeled, with individual models having a quite varied range of power and applicability. In CS, things are much different.

Incidentally, even if you simulate Maxwell's equations on a computer using Lisp, a Turing machine, or whatever else, you will not get far if you insist on algebraically exact computations. :-) As you say yourself, you must integrate them numerically, which means numerical error is introduced through your discretization. But I honestly can't say I understand the point of including this information above.


there are lots of equivalent formulations of maxwell's equations (cf. https://en.wikipedia.org/wiki/Mathematical_descriptions_of_t... for a few); the ∇× ∇· one we most commonly use is quite a bit more compact and usable than maxwell's own formulation because vector calculus was, to a significant degree, invented to systematize maxwell's equations. there's apparently an even more compact form of them in terms of clifford algebras that i don't understand yet, \left(\frac{1}{c} \dfrac{\partial}{\partial t} + \boldsymbol\nabla \right) \mathbf F = \mu_0 c (c \rho - \mathbf J).

similarly there are lots of equivalent formulations of universal computation. lisp and the λ-calculus are analogous to the geometric-algebra formulation above and the more commonly used form in terms of vector field derivatives: they look very different, and they make different problems easy, but ultimately they all model the same thing, just like the various formulations of maxwell's equations

maxwell's equations are the maxwell's equations of classical electrodynamics, not of physics. without a lot of extra assumptions they won't get you very far in understanding why solid things are solid (which depends on the pauli exclusion principle) or why some nuclei break down or how transistors work or how stars can burn or why mercury's orbit precesses or why hot things go from red to orange to yellow when you heat them further

my point about modeling maxwell's equations in lisp (etc.) is that lisp (etc.) can model maxwell's equations (at least, as well as any other effective means of calculation we know of can model them — discretization and rounding error also happen when you numerically integrate with pencil and paper), so if we're looking for incredible accuracy, wide applicability, and concision, lisp (etc.) would seem to trump maxwell's equations — but good luck trying to build a computer without electromagnetism, maybe you can in some sense do it on a neutron star but not with atoms


elegance is a human attribute: opinionated and open to interpretation. So Maxwell's equations are an opinionated choice? I don't think so.


I was going to say I had heard that this was the intention of Lisp's creators; then I thought I had better check, and immediately found this:

Lisp had assignments and goto before it had recursion, and started as a dialect of Fortran! It was only later that Lisp programmers investigated the benefits of pure functions. Lisp was not based on lambda calculus, but rather Kleene's work on recursive functions. At the time, McCarthy had heard of lambda calculus but had not yet studied it!

https://crypto.stanford.edu/~blynn/lambda/lisp.html


The list processing in Lisp was inspired by FLPL: Fortran List Processing Language. This was more like a library of Fortran routines than a language, but in 1958 the term library for routines of code didn't even exist. Making new words of any kind was language development.

FLPL already used the CAR and CDR terminology in its naming, with extra characters: XCARF and XCDRF (and others). The "extract CAR function" and "extract CDR function".


As far as I understand, McCarthy was into Gödel, recursive functions, mathematics, and AI, and created Lisp to study these, not so much computability and the like. I think that's at least one line of thinking that causes people to say that McCarthy discovered Lisp instead of inventing it.


For the reasons you give, it is perhaps more reasonable to say that Lisp is a Maxwell's Equations of software. But there are several. Forth has another reasonable claim to it, despite unpopularity, and Turing Machines have a claim too, despite their extreme unpopularity as a programming methodology. I don't know array languages well enough to know but I bet they have one too.

There isn't really a unique set of Maxwell's equations in software.


i wasn't able to get a runnable forth to less than a couple of pages written in itself https://github.com/kragen/stoneknifeforth but schönfinkel's ski-combinators are maybe the simplest practical basis

    s f g x → f x (g x)
    k a b   → a
    i = s k k (one of many possible definitions)
maybe wolfram knows of a simpler alternative

my favorite is abadi and cardelli's object-calculus on which i based bicicleta. it has two reduction rules rather than the λ-calculus's one. using a hybrid of bicicleta's syntax and abadi and cardelli's

    {f = ς(v)b, g = ς(v)c, ...}{f = ς(v)d} → {f = ς(v)d, g = ς(v)c, ...}
    {f = ς(v)b, g = ς(v)c, ...}.f          → b[{f = ς(v)b, g = ς(v)c, ...}/v]
the first of these derives an inherited object with a new definition for method f. the second one invokes method f on an object, which is evaluated by replacing its self-variable v with the object itself throughout its body, using the standard β-reduction semantics with α-renaming that we're familiar with from the λ-calculus (b[x/y] means b but with x replacing y)
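
a loose scheme analogue of those two rules (my own toy encoding, not abadi and cardelli's): an object is an alist of methods, each a function of self; deriving conses a shadowing method, invoking passes the object itself to the chosen method

    ; object = alist of (name . method), method = function of self
    (define (update obj name method) (cons (cons name method) obj))
    (define (invoke obj name) ((cdr (assq name obj)) obj))

    (define counter
      (list (cons 'n     (lambda (self) 3))
            (cons 'twice (lambda (self) (* 2 (invoke self 'n))))))

    ; (invoke counter 'twice)                                => 6
    ; (invoke (update counter 'n (lambda (self) 10)) 'twice) => 20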

despite the simplicity of the semantics the ς-calculus is enormously more usable for actually writing down functions than the λ-calculus. here's factorial(10) in the notation above (untested)

    {fac = ς(env){
        n = ς(_)3,
        return = ς(o)
            (o.n < 2).if_true{
                then = ς(_)1,
                else = ς(_)o.n * env.fac{n = ς(_)o.n - 1}.return
            }.return
        }
    }.fac{n = ς(_)10}.return
bicicleta has a lot of syntactic sugar which reduces this to (tested)

    {env: fac = {fac:
        arg1 = 3
        '()' = (fac.arg1 < 2).if_true(
            then = 1
            else = fac.arg1 * env.fac(fac.arg1 - 1)
         )
    }}.fac(10)
in particular to be able to define infix operators inside the language, bicicleta rewrites

    x * y
as

    x.'*'{arg1 = y}.'()'
and analogously for -, <, etc.

they published a bunch of papers and a book on this but they were more interested in static typing than anything else. a paper on one imperative variation of the thing is http://lucacardelli.name/Papers/PrimObjImpSIPL.A4.pdf

you can implement a turing machine in a few lines of c, making it internally simpler than the other alternatives, but as you point out it's unusable except as a compilation target or to prove theorems about


> maybe wolfram knows of a simpler alternative

Wolfram on combinators:

https://writings.stephenwolfram.com/2020/12/combinators-a-ce...

not any simpler, but maybe these are:

https://en.wikipedia.org/wiki/Iota_and_Jot


oh yeah, how could I forget iota and jot, that was stupid of me


> but schönfinkel's ski-combinators are maybe the simplest practical basis

Okay, but how much code goes into creating those operators?


it depends on what you're implementing them in, but in a term-rewriting language all you need is the two lines above. in js you need to implement some kind of term-rewriting. in 02006 i wrote a compiler from λ-calculus (using \ for λ, so you can write for example \x.\y.x for λx.λy.x, which is equivalent to the k combinator) to an augmented sk-combinator language in js that's at http://canonical.org/~kragen/tmp/sk.html

the definitions of s and k in it are

    S: [3, function(f, g, x) { return App(App(f, x), App(g, x)) }],
    K: [2, function(k, _) { return Ind(k) }],
which are used as templates for matching by the 33-line-long sk_reduce function, which i'll spare you here

the λ-calculus parser, the compiler from λ-calculus to sk-combinators, the sk-combinator parser, and the sk-combinator evaluator together are 391 lines of js

if you just want to define the s and k combinators in js they look like this

    s = f => g => x => f(x)(g(x))
    k = a => b => a
but that is depending on the js interpreter to do the actual evaluation

more recently i implemented a term-rewriting language in i386 assembly, it's about 800 bytes and 400 instructions: http://canonical.org/~kragen/sw/dev3/qfitzah.s but i haven't quite figured out how to handle arithmetic, i/o, and variadic lists

i think you could probably implement sk combinatory logic in about half a page of c if you were okay with leaking memory, you don't really need the flexibility of variadic lists and arbitrary identifiers in the sk-combinator language. every term is either a function application (of one function to one argument), s, or k
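
as a sketch of how little is needed (in scheme rather than c, and with no attention to memory): every term is the symbol S, the symbol K, or a two-element application list, and reduction is just the two rules above

    ; reduce a term: 'S, 'K, or an application (f x)
    (define (reduce t)
      (if (not (pair? t))
          t
          (let ((f (reduce (car t))) (x (cadr t)))
            (cond
              ((and (pair? f) (eq? (car f) 'K))      ; ((K a) b) -> a
               (reduce (cadr f)))
              ((and (pair? f) (pair? (car f))
                    (eq? (caar f) 'S))               ; (((S f) g) x) -> ((f x) (g x))
               (reduce (list (list (cadar f) x) (list (cadr f) x))))
              (else (list f (reduce x)))))))

    ; i = S K K:  (reduce '(((S K) K) y)) => y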


> it depends on what you're implementing them in, but in a term-rewriting language all you need is the two lines above

Okay, so what do I type those two lines into to implement that? How many lines of code does that involve?

> more recently i implemented a term-rewriting language in i386 assembly, it's about 800 bytes and 400 instructions: http://canonical.org/~kragen/sw/dev3/qfitzah.s but i haven't quite figured out how to handle arithmetic, i/o, and variadic lists

Have you considered having a parameter stack that arithmetic operators pull their arguments from and push their results onto? Oh wait... ;-)


> so what do I type those two lines into to implement that? How many lines of code does that involve?

i just answered that in the comment you are replying to

> Have you considered having a parameter stack

this is a remarkably bad answer to my explanation of how much code i needed to write a self-compiling forth that generates machine code

what is wrong with you

i am sad now


Nice take on lambda calc! It looks like you're making use of beta-reduction for the eval engine. Shen has a more formal take on lambda calc, and I think it's useful for more advanced constructs like type checking. Shen actually uses K-lambda (a lispish version of lambda calc) as part of its compiler: Sequent-calc->Prolog->Shen->K-lambda->VM:

https://shenlanguage.org/TBoS/tbos_170.html


How is beta reduction not "formal"???


Sorry, I wasn't clear - beta-reduction is formal; I was trying to refer to extending the formal treatment of lambdas to cover proofs, formal logic, and typechecking. Shen does this for typechecking.


This is really cool, thanks for linking that.


My first goal was to write in a wiki using a light, extended HTML. I then discovered LISP and the lambda-calculus. I built an "inverted" evaluator working directly on s-expressions (without an AST) and ignoring plain words (which therefore don't have to be protected by apostrophes and quotation marks), as in basic HTML. I can thus write wiki pages that are lambdatalk programs, mixing pure text and code effortlessly. It's my all-purpose tool.


Do you have a write up on this tool somewhere? This sounds really cool!


Thank you, everything starts in this page http://lambdaway.free.fr/lambdawalks


No need for those pesky variable names that make term rewriting clunky either[1]! The entire hygiene problem just disappears. Your style guide's section on naming will also be greatly simplified.

[1] https://en.wikipedia.org//wiki/De_Bruijn_index


I agree with you, but so far I have never been able to get my head around the logic of de Bruijn indices, the way that, for example, John Tromp does. (https://tromp.github.io/) But I don't despair of getting there one day...


About "We just need to make the syntax a little more human", see for instance http://lambdaway.free.fr/lambdawalks/?view=sierpinsky4


The Curry-Howard correspondence suggests the analogy between computation with the lambda calculus and logical proof systems. As a non-expert, it seems to me that the same analogy extends roughly to a Lisp system.


The Curry-Howard Correspondence is about proof systems. You would have a very hard time finding a proof theorist that says it's okay to have an undecidable proof checker or an inconsistent one.

For that reason, only _typed_ lambda calculi have any hope of corresponding to a proof system. You could rephrase Lisp as a Unityped system, but I seriously doubt it would be consistent.

Perhaps a paraconsistent logic might work, but we're getting very exotic here.


> You could rephrase Lisp as a Unityped system, but I seriously doubt it would be consistent.

It's not. (x -> x) is always inhabited, so if you have a fixed point combinator (x -> x) -> x then every theorem is true.


Interesting, thanks


>So I had the idea to explore the lambda-calculus using s-expressions and patiently rebuilt booleans, pairs, lists, recursion and beyond a real complete Turing language, for instance:

Syntax is not semantics.

The only thing that s-expressions do that is mathematically interesting is make the abstract syntax tree of expressions explicit in a way that standard notation does not. The whole point of them is that they turn string rewriting into tree rewriting, which is vastly easier and makes writing your own DSLs a breeze.

From a practical point of view, they are also the simplest way to serialize any expression, which is why I can type them out in a text-only comment as they are. Meanwhile, doing the same with a math (or lambda calculus) expression requires a typesetting system like LaTeX to be understandable.

Lisp has decided that application is nameless in its syntax, which means that things like ((lambda (x) (+ x 1)) 2) work when semantically they mean (apply (lambda (x) (+ x 1)) 2). This makes programming with multivariable functions very clean but partial application incredibly messy, e.g. ((((lambda (x) (lambda (y) (lambda (z) (+ x y z)))) 1) 2) 3) (see the sketch at the end of this comment). This is not something that can be fixed, because the AST of partial application, when thought of as a tree, is indeed very messy. The only reason it works as well as it does in languages like Haskell is that, as long as you only have single-variable functions - goodbye thunks and multivariable functions - you can treat the partial-application tree as a list instead.

You can of course do the same in lisp using a DSL:

    '(partial
      ((lambda (x)) (lambda (y)) (lambda (z)) (lambda (w) (+ x y z w)))
      (1 2 3 4))
But it doesn't fit very well with the rest of the language and feels quite pointless.
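
For concreteness, the nesting described above, written out in plain Scheme (a minimal sketch):

    (define add3                   ; a curried three-argument function
      (lambda (x) (lambda (y) (lambda (z) (+ x y z)))))

    (((add3 1) 2) 3)               ; => 6, with the parenthesis pile-up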


> This makes programming with multivariable functions very clean but partial application incredibly messy

In lambdatalk, the implementation of lambdas does not make them closures (no free variables), but they accept partial application de facto, as well as more values than there are arguments. This compensates for the lack of closures. So, for instance:

((lambda (x)) (lambda (y)) (lambda (z)) (lambda (w) (+ x y z w))) (1 2 3 4)

is written this way

{{{{{lambda {x y z w} {+ x y z w}} 1} 2} 3} 4} -> 10

and obviously

{{lambda {x y z w} {+ x y z w}} 1 2 3 4} -> 10


I would change it a bit. The lambda calculus (in Martin-Löf type theory) is the Maxwell's equations of logic. But the lambda calculus has very little direct use in programming. There should be something else here.


Maxwell's Equations are an interesting analogy. As they were originally written, they were cumbersome and hard to understand. Along came Oliver Heaviside, who translated them into the notation of vector calculus, resulting in the form that remains in use today. But even the present form is not without its detractors, and I've read comments in this forum to the effect that a better notation would make them even less forbidding for students.

Algebra has a similar history too. You practically had to be a philosopher to understand algebra as described by Al-Khwarizmi's book -- there are no numerals or equations, just a wall of text. The notation of equations hadn't been invented yet. Today, thanks in part to gradual development of notation, we can teach algebra to schoolchildren.

I think there are probably many developments, where the brilliant invention, and the language that can be understood by the rest of us, had to come from two different people.


Related:

Lisp as the Maxwell’s Equations of Software (2012) - https://news.ycombinator.com/item?id=23687904 - June 2020 (46 comments)

Lisp as the Maxwell’s equations of software (2012) - https://news.ycombinator.com/item?id=9607843 - May 2015 (8 comments)

Lisp as the Maxwell Equations of Software (2012) - https://news.ycombinator.com/item?id=9038505 - Feb 2015 (122 comments)

Lisp as the Maxwell’s equations of software - https://news.ycombinator.com/item?id=3830867 - April 2012 (37 comments)


Also related, An Intuition for Lisp Syntax (2020) - https://news.ycombinator.com/item?id=32630675#32635785


Interestingly, the article uses the nowadays-standard 4-equation form of Maxwell's equations - but Maxwell himself used 20 equations [1]. The current version is due to Oliver Heaviside (who doesn't get enough respect for this).

As I understand things, Maxwell's work was held in high regard, but only became usable due to Heaviside's reformulation.

Whether that aspect (beautiful but unusable) is true and carries over to Lisp... dunno, I'm insufficiently familiar with Lisp. But the xkcd with brackets [2] comes to mind.

[1] https://en.m.wikipedia.org/wiki/History_of_Maxwell%27s_equat...

[2] https://xkcd.com/297/


Maxwell's equations address only part of physics. E.g. you can't describe anything nonlinear with them, like transistors.

Also, it is quite difficult to prove that Maxwell's equations have a unique solution for given boundary conditions. For the lambda calculus it is quite easy: there can only be one solution (confluence guarantees a term has at most one normal form).


If Lisp is the Maxwell's Equations, what is the Heaviside language?

https://en.wikipedia.org/wiki/Oliver_Heaviside


Heaviside promoted Maxwell's equations and created clearer, simpler versions of them. So I think it would be a language that offers list processing and functional concepts as features of a syntax that is easier for beginners. Something like Python or JavaScript.


Reminds me of Peter Norvig's sketch of Lisp in Python: https://norvig.com/lispy.html


I think a more astute argument would be that Backus-Naur Form is the Maxwell's equations, since that's what the definition of LISP is written in on the half-page screenshot that opens the OP's thesis.


Guy Steele's talk on metanotation:

https://www.youtube.com/watch?v=dCuZkaaou0Q


If this is true then Haskell's type system is Noether's Theorem.


Are you familiar with this? https://bentnib.org/conservation-laws.html


I think I might argue that the "Maxwell's Equations of Software" could be a set of things that proscribe and constrain, whose consequences are pervasive throughout.

For example:

Turing Machines and the Halting Problem

Queueing Theory

Shannon Entropy

The CAP Theorem


A lot of people don't agree, but I feel category theory is the Maxwell's equations of software.

What is the shortest distance between two points? A straight line. I calculate the solution to that problem; I don't design it. What's the best way to travel between two points? Do I take a car or a plane? Which is cheaper, faster, or more comfortable? I can't calculate a solution because the problem is too complex. Instead I design a solution to the problem. That is the fundamental difference between design and calculation.

Whenever you use the word "design" you are operating in a zone where humans have no foundational optimization theory. You are guessing, using your instincts, your common sense and your gut to find solutions for the problem. It's unlikely you'll hit the most optimal solution, but you can very likely arrive at a "good" solution.

If you squint, Category theory looks to fill this gap for designing programs. Depending on how you look at it, it looks like a fundamental theory of how to organize code. It can be thought of as a fundamental theory of abstraction, or a fundamental theory of interfaces.

Maybe another way to put it is that the concept of the interface is the Maxwell's equations of software.

Even Turing machines and the lambda calculus seem oddly specific, let alone Lisp. If we want something fundamental and general, I think category theory is it.

My guess is, if the advancement of computer science was allowed to play out for centuries and we got rid of all the historical baggage, some form of category theory would lie at the heart of it all.


Every time I try to understand what category theory is, I hit an impassable wall: I don't understand it. Alexander Grothendieck is much too intelligent for me. But I don't despair of getting there one day...


I've found Bartosz Milewski's lectures on category theory for programmers to be very understandable and enlightening. (And I'm weak on math) He explains it really well in plain terms.

https://www.youtube.com/playlist?list=PLbgaMIhjbmEnaH_LTkxLI...

He also has a series of blog posts on it, but I found the lectures especially helpful.


It's the idea that most structures in math can be put upon a foundation of a structure consisting of objects and the relationships between those objects.


Eugenia Cheng has a couple of very accessible books on Category Theory


I was going to make the same recommendation. In particular I'd recommend her new book, _The Joy of Abstraction_. It's a real textbook that goes up through the Yoneda Lemma, but it isn't too scary and it doesn't assume that the reader already knows any theoretical math.


Yes, that book in particular. I have Emily Riehl's Category Theory in Context which was not immediately accessible to me at all. Joy of Abstraction aims to be an on-ramp to that.


I read the first couple chapters. It's a bit too watered down in the beginning IMO. I couldn't get past the first part especially when she tries to relate category theory to feminism. Might come back to it later.

As a non-math expert and programmer, I highly, highly recommend Bartosz's stuff; I linked it in another branch under my original comment.


Bartosz's material is great, but it has a practical programming focus, so Cheng's books, which focus more on teaching Category Theory as mathematics, make a good complement.


Yeah, you're probably right. I'll dive back into her book eventually.

It's mostly that the beginning chapters don't get into the meat of CT quickly enough. Instead they spend a lot of time justifying things. I feel it's written for people who hated math even more than I did.


I'm not a math expert. I hated math. I learned it from very informal material:

https://bartoszmilewski.com/2014/10/28/category-theory-for-p...

and his associated youtube videos (which are EVEN better):

https://youtube.com/playlist?list=PLbgaMIhjbmEnaH_LTkxLI7FMa...


Thank you so much for these links, a lot to learn ...


Lambda Calculus is a much better representation of computation than Category Theory. I can teach somebody functional programming using Lambda Calculus fairly quickly. It would take a lot longer using Category Theory. I have studied quite a bit of Category Theory but I would still struggle to explain beta-reduction using it. Another system that is better to use, IMHO, is Martin-Löf-style Type Theory. Much simpler. Much more useful in practice.
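
For example, beta-reduction fits on a few lines (a deliberately naive Scheme sketch: subst does no capture-avoidance, which a real treatment would need):

    ; substitute v for variable x throughout body (naive: ignores shadowing)
    (define (subst x v body)
      (cond ((eq? body x) v)
            ((pair? body) (map (lambda (b) (subst x v b)) body))
            (else body)))

    ; one beta step: ((lambda (x) body) arg) -> body with arg for x
    (define (beta app)
      (let ((fn (car app)) (arg (cadr app)))
        (subst (car (cadr fn)) arg (caddr fn))))

    ; (beta '((lambda (x) (+ x 1)) 2)) => (+ 2 1)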


Well, if teaching length justified the title of Maxwell's equations, I would agree with you. CT does take longer to learn.

However, if foundational theory is what justifies the title, I believe CT is more foundational than lambda calculus.


I do not agree with this (however, I only have one grad category theory course). I think category theory is a fine tool for formalizing computation and thinking about type systems, but wouldn't approach it as a program design tool.


All of computation is related to types.

You are taking Type A and converting it to Type B. That is the entire point of computation. All else is abstractions on top of that and algorithms below.

Category theory is what lives on top. Haskell is a programming language and style that borrows very, very heavily from category theory. Getting a certain level of mastery in Haskell will help you see how programming is related to CT.


I have studied Haskell from this perspective. Like I said, you learn how to model computation and how the Haskell language works; you don't learn how to design programs. I have actually gotten many more program design ideas from traditional branches of math (analysis, algebra, etc.).


Then your comment doesn't make sense to me. If you designed programs in haskell then you designed them using concepts from category theory.

Category theory isn't just about how haskell works. You use it to design your programs. Haskell is basically a category theory framework.

I would say you didn't get very far, then, if you didn't come out of learning Haskell with the realization of how you design programs with CT.


> If you designed programs in haskell then you designed them using concepts from category theory.

It's the level of abstraction and focus. Category theory can describe "mapping over promises"; it doesn't give a lot of insight into how to design an internet communication protocol with low latency (for example).

> I would say you didn't get very far then

Sounds like you put a lot of value in your experience with this topic and I am unlikely to convince you.


I actually agree with you. It's not a good theory for finding optimal speed.

I would say it's more of a better theory for optimal organization and reuse of code. It's a theory of interfaces. If programming is all about abstraction then CT is the theory of design and abstraction.

I agree that detail, speed, and execution are important in computer science, but maybe these things aren't general enough, as the efficiency of speed also depends on which Turing-complete language (out of an unlimited set of languages) is used to drive the computation.

We also do have a general theory for optimizing speed: computational complexity theory, which is based on Knuth's little assembly language. This theory can help you converge on the best possible solution: not by calculation, but by quantifying the speed of an algorithm for definitive comparisons.

But then would complexity theory be the Maxwell's equations of programming? I think there's a good argument for that route, but I don't think that's what the OP is thinking about in his article.

The OP is thinking about design, because Lisp is definitely a design philosophy. Lisp won't help you design an internet communication protocol for low latency either.

Maybe you did study category theory deeply, but you're thinking about it from the angle of computational cost. That is certainly valid. Apologies for that comment. Haskell and CT, like any other design pattern out there, are usually not targeted specifically at optimizing speed, but more at optimizing code reuse and increasing modularity. They're for dealing with technical debt and handling the complexity of programming through abstraction.

If you're thinking about it from that angle, then I think the universe already has your answer: computational complexity theory is the Maxwell's equations of software. It's not a very complete theory, but it's much more concrete than all the other "design" stuff in computer science.


> Haskell is basically a category theory framework.

Haskell probably draws more from category theory than any other mainstream language, but in absolute terms that's still not very much. It's okayish for modelling cartesian closed categories, but if you want any more structure than that things get quite painful. Even something as simple as a category with finitely many objects requires stupid amounts of type-level boilerplate.


[flagged]


There is no best mental tool. This is design; we don't have methods to converge on optimums, so you can't really say any mental tool is better. It's all just different sets of primitives, and category-theory primitives, if you squint, feel like the proper theory of design.

Category theory is also so abstract that you can literally find it in everything. So it's not like it doesn't apply, it applies to everything.

When I say Haskell is a category theory framework, I mean that Haskell has primitives that are explicitly modeled after CT and named with the same mathematical names. You don't have to know CT from a mathematical angle to use Haskell, but that doesn't change the fact that you are using CT primitives like functors to build things.

Other languages have CT concepts like functors but they don't crystallize the concept in such a pure and explicit form. That's just my opinion on it. You are welcome to disagree.

But when you say garbage like "100% nonsense" it's just fucking rude and against the rules here at HN. Don't say it again. Speak respectfully or don't speak at all.


> But when you say garbage like "100% nonsense" it's just fucking rude and against the rules here at HN. Don't say it again. Speak respectfully or don't speak at all.

I found that comment to be seriously rude and disrespectful. Yes, I should have avoided the "100% nonsense" comment. However, that doesn't give you licence to be rude and abusive in return.


It is rude, but it is reasonable. When a child acts unruly and immature he must be disciplined with a forceful hand. I called you out harshly and the harshness was fucking deserved, no matter how miffed you get about it.

I mean you expect me to just stand here and get raped by your rudeness? No reasonable man with spine will take that shit in real life and that's also typically why you don't likely speak that way in real life.



