One of the terms in the solution is a bit bigger than 2^51, so its square is around 2^102. Assuming the methods used actually need to compute these numbers, you’ll need something wider than a 64-bit int. Not sure how good GPUs are at bignum-ish stuff, but multi-word integer arithmetic isn’t the most obvious fit for them.
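
To make the size problem concrete, here's a rough Python sketch. The value `x = 2**51 + 1` is just a made-up stand-in for "a bit bigger than 2^51", not the actual term from the solution, and `mul_64x64_to_128` is a hypothetical helper name. It builds the 128-bit product out of 32-bit limbs, which is roughly the kind of multi-word work a 64-bit machine would have to do by hand; it isn't meant to be how any particular solver or GPU library does it.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_64x64_to_128(a, b):
    """Multiply two unsigned 64-bit values and return (hi, lo) 64-bit words.

    Uses four 32x32 partial products (schoolbook multiplication), so no
    intermediate value needs more than 64 bits.
    """
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    p0 = a_lo * b_lo  # contributes at bit 0
    p1 = a_lo * b_hi  # contributes at bit 32
    p2 = a_hi * b_lo  # contributes at bit 32
    p3 = a_hi * b_hi  # contributes at bit 64

    # Sum everything that lands in bits 32..63, keeping the carry.
    mid = (p0 >> 32) + (p1 & MASK32) + (p2 & MASK32)

    lo = ((mid & MASK32) << 32) | (p0 & MASK32)
    hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32)
    return hi & MASK64, lo


# Hypothetical stand-in for the large term: just over 2^51.
x = 2**51 + 1
square = x * x                # Python ints are arbitrary precision

print(square.bit_length())    # number of bits the exact square needs
print(square & MASK64 == square)  # would it survive 64-bit truncation?

hi, lo = mul_64x64_to_128(x, x)
print((hi << 64) | lo == square)  # two 64-bit words do hold it
```

The point is only that the exact square needs two 64-bit words, so every multiply turns into several narrower multiplies plus carry handling.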