Are they? Did you see the picture of the chicken with three legs? Because there's no human I know who would confidently assert that chicken has two legs.
Throw 1000 pictures of chickens at a human and ask how many legs each chicken has. If 999 of them have two, I bet you'll get "two" as the answer for the 1000th one too, no matter how obvious the extra leg is.
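To put a number on that intuition, here's a toy Bayes sketch (every probability below is an illustrative assumption, not a measurement): with a prior that strong, a quick glance that weakly signals "three legs" still gets overruled.

```python
# Toy Bayes update: why a strong prior can swamp a quick glance.
# All numbers here are illustrative assumptions.

prior_two = 0.999            # 999 of 1000 chickens seen so far had two legs
prior_three = 1 - prior_two

# Assume a quick glance is weak evidence: it only mildly favors "three"
# when the chicken really does have three legs, with some false alarms.
p_glance_given_three = 0.7   # P(glance says "three" | three legs)
p_glance_given_two = 0.1     # P(glance says "three" | two legs)

# Posterior that the 1000th chicken has three legs, given the glance says "three"
numerator = p_glance_given_three * prior_three
denominator = numerator + p_glance_given_two * prior_two
posterior_three = numerator / denominator

print(f"P(three legs | glance) = {posterior_three:.3f}")  # ~0.007: still answers "two"
```

Of course, if the viewer actually counts instead of glancing, the evidence becomes strong enough to beat the prior, which is exactly the distinction the rest of this thread argues about.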
Humans do things a lot harder than that every day in the form of QA in factories. Do they sometimes make mistakes out of repetition or boredom? Sure. Is that at all comparable to the failures in the paper? No.
So a human failure looks like "alarm fatigue": when asked the same question many times, a person might miss one or two?
Is that at all what is being exhibited here? Because it seems like the AI is being asked once and failing.
I don't disagree that humans might fail at this task in some situations, but I strongly disagree that the way the AI fails resembles, in any way, the way humans would fail.
If I were given five seconds to glance at the picture of a lion and then asked if there was anything unusual about it, I doubt I would notice that it had a fifth leg.
If I were asked to count the number of legs, I would notice right away, of course, but that's mainly because it would alert me to the fact that I'm in a psychology experiment, and so the number of legs is almost certainly not the usual four. Even then, I'd still have to look twice to make sure I hadn't miscounted the first time.
Ok, but the computers were specifically asked to count the legs and return a number. So you've made the case that humans would find this question odd and would likely increase their scrutiny, which makes a human error here even more unusual.