I understand it's just a language model, but it clearly has some embedded method of generating answers that are quite close. For example, it gets all 2-digit multiplications correct. It's highly unlikely that it has seen the same ordered set of six 3-digit integers, drawn from a space of 10^18 (or even all 10k 2-digit multiplications), and yet its answer is quite close. Notably, it also gets the same divisions wrong (for this small example) in exactly the same way.
I know of other people who have tried quite a few other multiplications and also saw errors that were multiples of 60.
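For anyone who wants to check this on their own examples, here is a rough Python sketch. It reproduces the two counts above (10^18 ordered six-tuples of 3-digit integers, 10k ordered 2-digit pairs) and tests whether a claimed product is off by a nonzero multiple of 60. The function name `error_report` and the `modulus` parameter are my own inventions for illustration, and the "claimed" value in the usage lines is a made-up number, not actual model output.

```python
from math import prod  # available in Python 3.8+

# Sizes of the input spaces mentioned above, counting leading zeros as allowed.
ordered_six_tuples = 1000 ** 6   # six ordered integers, each in 0..999
two_digit_pairs = 100 * 100      # ordered pairs, each in 0..99


def error_report(factors, claimed, modulus=60):
    """Compare a claimed product with the exact one.

    `factors` is a sequence of integers, `claimed` is the answer to check
    (for example, one copied from the model). The name and the default
    modulus of 60 are assumptions made for this sketch.
    """
    exact = prod(factors)
    error = claimed - exact
    return {
        "exact": exact,
        "error": error,
        # Relative size of the error; undefined when the exact product is 0.
        "relative_error": abs(error) / abs(exact) if exact else None,
        # True only for a wrong answer whose error is a multiple of `modulus`.
        "error_is_multiple": error != 0 and error % modulus == 0,
    }


if __name__ == "__main__":
    print(ordered_six_tuples == 10 ** 18, two_digit_pairs == 10_000)
    # Hypothetical claimed answer, deliberately set 120 above the true product.
    print(error_report([123, 456], 123 * 456 + 120))
```

Pasting in the real factors and the answer the model gave would show whether the multiple-of-60 pattern holds on a larger sample.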
It can repeat answers it has seen before but it can’t solve new problems.