he might be referring to the data in https://lmarena.ai/ they conduct blind tria...

michaelmrose · 2025-05-20T04:31:07 1747715467

In general, a quickly chosen "best answer" is perhaps not the best means of analyzing such output, because people are on average very, very stupid and, at the time of immediate reception, less than ideally situated to discern the quality of the output, especially if it concerns data that they aren't intimately familiar with.

For instance, the lawyers who submitted briefs with references to fake cases and fake precedents were presumably satisfied with the output at the time of reception, but less so when they got sanctioned for thousands of dollars for presenting lies to a judge in place of truth.