I tried asking an electrostatics problem, which I assume is not very common in the training data for such a CS/maths-biased LLM. It's still going....
I like the tentativeness. I see a lot of: "wait", "but", "perhaps", "maybe", "this is getting too messy", "this is confusing", "that can't be right", "this is getting too tricky for me right now", "this is very difficult".
I find it harder not to anthropomorphise when comparing with ChatGPT. It feels like it's trying to solve the problem from first principles, but with the depth of high-school physics knowledge.
Of course. I make up my own test problems, but the questions I come up with are probably not totally unique, i.e. likely similar to something in the training data. I usually test new models with word problems and programming problems.