But they aren't useful for that. Terry Tao uses them to improve his ability to use poorly-documented boilerplatey things like Lean and matplotlib, but receiving advice from them‽ Frankly, if a chatbot is giving you much better advice than a rubber duck, you're either a Jack-of-all-Trades (in which case, I'd recommend better tools) or a https://ploum.net/2024-12-23-julius-en.html Julius (in which case, I'd recommend staying away from anything important).
> With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.
I agree with his overall vision, but transformer-based chatbots will not be the AI algorithm that supports it. Highly-automated proof assistants like Isabelle's Sledgehammer are closer (and even those are really, really crude, compared to what we could have).
I recommend reading his interview with Matteo Wong, where he proposes the opposite: https://www.theatlantic.com/technology/archive/2024/10/teren...
> With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.
I agree with his overall vision, but transformer-based chatbots will not be the AI algorithm that supports it. Highly-automated proof assistants like Isabelle's Sledgehammer are closer (and even those are really, really crude, compared to what we could have).