The fast algorithm is less accurate in the least significant bits. In zX41ZdbW's first link above, there's a code comment that includes the exact count of numbers grouped by how inaccurate the results were.
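For anyone who wants to see what "less accurate in the least significant bits" looks like in practice, here's a rough sketch of that kind of tally. To be clear, `naive_parse` is a made-up stand-in I wrote for illustration, not the algorithm from the link; it just accumulates digits in floating point and divides once at the end. The comparison baseline is Python's built-in `float()`, which rounds correctly.

```python
import random
import struct
from collections import Counter


def ulp_distance(a: float, b: float) -> int:
    """Count how many representable doubles apart a and b are.

    Assumes both values are finite and positive, so their bit patterns
    order the same way as the values themselves.
    """
    ia = struct.unpack("<q", struct.pack("<d", a))[0]
    ib = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(ia - ib)


def naive_parse(s: str) -> float:
    """Hypothetical 'fast' parser (illustration only).

    Accumulates the digits in floating point, then scales by a power of
    ten. Rounding error creeps in once the digits exceed 53 bits.
    """
    int_part, _, frac_part = s.partition(".")
    value = 0.0
    for ch in int_part + frac_part:
        value = value * 10.0 + (ord(ch) - ord("0"))
    return value / (10.0 ** len(frac_part))


# Build decimal strings with 20 significant digits, more than a double holds.
rng = random.Random(0)
samples = [
    f"{rng.randrange(1, 10)}." + "".join(str(rng.randrange(10)) for _ in range(19))
    for _ in range(10_000)
]

# Group the inputs by how many ULPs the naive result is off.
hist = Counter(ulp_distance(naive_parse(s), float(s)) for s in samples)

for error, count in sorted(hist.items()):
    print(f"{error} ULP off: {count} inputs")
```

Bucket 0 is the inputs where both parsers agree bit for bit; anything in the other buckets differs only in the last bit or two of the mantissa.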
Someone needs to make a bumper sticker for developers that says “fast and wrong”. I’d imagine non-programmers would buy more of them for other reasons…
If it (Dragonbox?) is faster, why is it also less accurate?