Those aren’t CS papers; social science and biology research have constraints that CS does not. I haven’t seen any evidence that there is anywhere close to that level of issue here. A couple of conferences have adopted artifact review, where an independent reviewer attempts to reproduce the experiments listed in the paper. Nearly all papers that participate end up passing.
One case where this happens a lot is papers that pick weak baselines as the state of the art. If you have the code, you can run it against better configurations of existing tools to see whether the promises still hold up.
Yes, that's what I tried to say. I can easily try a piece of code with my own data to verify that its results are plausible. I can't do that for a paper. So if I don't have the code, I might waste a significant amount of time trying to reproduce a fake paper.
You're saying that the majority of CS papers only appear to work because the analysis code has bugs?
And that checking the code (presumably also the analysis code) is easier than "understanding the idea"?
Neither of those rings true to me, but your mileage may vary.