> Checking the correctness of proofs is a much easier problem than coming up with the proof in the first place.
Just so this isn't misunderstood: very little cutting-edge math can presently be coded in Lean. The famous exceptions (such as the results by Clausen-Scholze and Gowers-Green-Manners-Tao) have special characteristics which make them much more ground-level and easier to code in Lean.
What's true is that it's very easy to check whether a Lean-coded proof is correct. But it's hard and time-consuming to formulate most math as Lean code. It's something many AI research groups are working on.
> The famous exceptions (such as the results by Clausen-Scholze and Gowers-Green-Manners-Tao) have special characteristics which make them much more ground-level and easier to code in Lean.
"Special characteristics" is really overstating it. It's just a matter of getting all the prereqs formalized in Lean first. That's a bit of a grind to be sure, but the Mathlib effort for Lean has the bulk of the undergrad curriculum and some grad subjects formalized.
I don't think AI will be all that helpful wrt. this kind of effort, but it might help in some limited ways.
Yes, I definitely concur; I have spent significant time with it.
The main bottleneck is having the libraries that define the theorems and objects you need to operate at those levels. Everything rests on axiomatic foundations, and you need to build all of maths on top of them. Projects like Mathlib are getting us there, but it is a massive undertaking.
It’s not just that there is a lot of maths to go through; it’s also that most maths has not really been proven to this degree of exactitude, so there is much gap-filling to do when translating existing proofs, and the reasoning style might be quite distant from how things are expressed in Lean. Some maths fields are also self-consistent islands that haven’t yet been connected to the common axiomatic foundations, and linking them is a serious research endeavor.
Lean does allow you to declare theorems as axioms, though. It is not common practice, but you can skip high up the abstraction ladder and set up a foundation there if you are confident those theorems are correct. Even then, defining the mathematical objects themselves can be quite hard, quite apart from the proving.
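As a rough sketch of what that looks like (the statement below is an arbitrary illustration, not a real Mathlib lemma):

```lean
-- Postulate a statement without proof; Lean records it as an extra axiom.
axiom my_deep_theorem : ∀ n : Nat, 0 < n → ∃ m : Nat, n = m + 1

-- Downstream proofs can then use it like any other theorem:
theorem uses_it : ∃ m : Nat, 2 = m + 1 :=
  my_deep_theorem 2 (by decide)

-- `#print axioms uses_it` will report the dependence on `my_deep_theorem`,
-- so the trust boundary stays visible.
```

The kernel never checks an `axiom`, so everything built on top is only as trustworthy as the statements you postulated.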
Anyway, the complexity of the Lean language itself doesn’t help either. The mode of thinking you need to operate it is much closer to programming than maths, and for those who think the Rust borrow-checker is a pain, this is an order of magnitude more complex.
Lean was a significant improvement in ergonomics compared to the previous generation (Coq, Isabelle, Agda…), but still I think there is a lot of work to be done to make it mathematician-friendly.
Most reinforcement-learning AI for maths right now is focused on olympiad problems: hard, but quite low on the maths abstraction ladder. Often they don’t even produce a proof; they just solve problems that end with an exact result, and you only check that result. Perhaps the reasoning was incorrect, but if you do it for enough problems you can be confident the model is not just guessing.
On the other side of the spectrum you have mathematicians like Tao just using ChatGPT for brainstorming. It might not be great at complex reasoning, but it has a much wider memory than you do and it can remind you of mathematical tools and techniques that could be useful.
> Anyway, the complexity of the Lean language itself doesn’t help either. The mode of thinking you need to operate it is much closer to programming than maths, and for those who think the Rust borrow-checker is a pain, this is an order of magnitude more complex.
Could you elaborate on this? I'm interested to learn what the complexities are (beyond the mathematical concepts themselves).
Found something I wrote last year, see below, but off the top of my head:
Something like 5 different DSLs in the same language, most of them in a purist functional style that is familiar neither to most mathematicians nor to most programmers, with type-checking an order of magnitude stricter and more complex than in any programming language (that's the point of it), and with rather obscure errors most of the time.
It's really tedious to translate any non-trivial proof into this model, so usually you end up proving it again almost from scratch within Lean, and then the result is really hard to understand as written. Much of the information needed to understand a proof is hidden away as runtime data that is usually displayed via a complex VSCode extension. It's quite difficult to follow from the proof code itself, and usually it doesn't look anything like a traditional mathematical proof (even if they kind of try, by naming keywords with terminology similar to normal proofs and sprinkling in some unicode symbols).
I never feel like I'm doing maths when I'm using Lean. I'm fighting with the syntax to figure out how to express mathematical concepts in the style that it likes, always with several different ways of achieving similar things (the opposite of the Zen of Python). And I'm fighting with the type-checker to transform one abstract expression into another abstract expression (that's really what a proof is when it boils down to it), completely forgetting about the mathematical meaning, just moving puzzle pieces around with obscure tools.
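A tiny example of that puzzle-piece feeling (core Lean 4, no Mathlib). The goal state that each tactic manipulates is invisible in the source and only shown by the editor, so here it is written out as comments:

```lean
example (a b c : Nat) : a + (b + c) = (c + b) + a := by
  -- goal: a + (b + c) = c + b + a
  rw [Nat.add_comm b c]
  -- goal: a + (c + b) = c + b + a
  rw [Nat.add_comm a (c + b)]
  -- goal closed: both sides are now syntactically equal
```

Nothing mathematically interesting happens here; the whole proof is shuffling one expression into the shape of another, which is what most low-level Lean work feels like.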
And even after all of this, it is so much more ergonomic than the previous generation of proof-assistants :)
---
I think that the main reasons for Lean's complexity are:
- That it has a very purist functional style and syntax, literally reflecting the Curry-Howard Correspondence (function = proof), rather than trying to bridge the gap to more familiar maths notation.
- That it aims to be a proof assistant: it is fundamentally semi-automatic and interactive, which makes it a hard design challenge.
- A lot of the complexity is aimed at giving mathematicians the tools to express real research maths in it; on this front it has been more successful than any alternative.
- Because of this it has at least 5 different languages embedded in it: functional expressions of theorems, forward proofs with expression transformations, backward proofs with tactics, the tactics metaprogramming macro language, and the language to define data types (and at least 4 kinds of data types, each with its own syntax).
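For a feel of two of those embedded languages, here is the same trivial proposition proved once as a plain function (term mode, the Curry-Howard reading directly) and once with tactics (a backward proof):

```lean
-- A theorem statement is a type; a term-mode proof is literally a function
-- inhabiting that type (Curry-Howard: the proof IS the function).
theorem mp_term (p q : Prop) (hp : p) (hpq : p → q) : q :=
  hpq hp

-- The same statement in tactic mode: each tactic transforms the goal
-- backwards until nothing is left to prove.
theorem mp_tactic (p q : Prop) (hp : p) (hpq : p → q) : q := by
  apply hpq
  exact hp
```

Both compile to the same underlying term, but they are written in quite different sub-languages, and real developments mix them freely.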