Yes, which is funny because there was an article on here with some pretty hard data which showed not much difference, or maybe even worse performance, from GPT-4 than from 3.5-turbo.
You'll likely refute that as your mind is already made up, but there you go, another conflicting and confusing data point.
What are you talking about? Just compare the output of 3.5 vs 4 yourself for a problem you are interested in; it’s a single click in the interface. Do you always need a study or an “expert” to make up your mind?
Benchmarks are good. You may be a less experienced software engineer than others (or maybe more experienced?), so you will tell me “ChatGPT x is insane bro”, but that's only a matter of perspective. A benchmark gives us facts outside of our own experience, not opinions.