Credit where it's due: the academics did publish the code they wrote on GitHub. But I don't know if anyone, reviewers or readers, actually took the time to read it, let alone understand why it throws doubt on the paper's conclusions.
Usually (at least in my specific niche of computer science), if the code is published at all, it's only published after the paper has been reviewed. This is partly to preserve anonymity during the review process, and partly because the code usually isn't seen as "part of the paper" (i.e. "the paper should stand on its own"). That said, I agree you could argue that for papers about benchmarks, the code should definitely be considered an essential part of the paper.