
This raises an interesting question. Do we want computers to see "correctly", or to see how we see?

Would a preferred computer vision system experience the Checker shadow illusion? http://en.wikipedia.org/wiki/Checker_shadow_illusion

If yes, computer vision will be as fallible as ours. If no, then there will always be examples, like the ones presented, where computers will see something different than humans.




The question is, what do you mean by "correctly"?

When we look at the Checker Shadow Illusion, our brains are automatically "parsing" that image into a 3D scene and compensating for the lighting and shadowing. The reason you see square B as being lighter than A is because, if you could reach into the image and remove the cylinder so it's not casting a shadow anymore, square B would be lighter than A. Our brain doesn't think about color in terms of absolute hex values. Instead it tries to compensate for the lighting and positioning of elements, assuming that they're similar to what we would see in real life—a challenge which, by the way, computer vision systems have always struggled with.
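
For what it's worth, you can check the "absolute hex values" point directly: the two squares really are the same color on screen. A minimal Python sketch, assuming you've saved the Wikipedia image locally; the file name and the (x, y) coordinates are hypothetical placeholders, so pick points that actually fall inside squares A and B in your copy:

    # Sketch: sample one pixel from square A and one from square B and
    # compare their raw values. File name and coordinates are placeholders.
    from PIL import Image

    img = Image.open("checker_shadow_illusion.png").convert("RGB")
    a_xy = (130, 200)   # some point inside square A (hypothetical)
    b_xy = (220, 280)   # some point inside square B (hypothetical)
    print("A:", img.getpixel(a_xy))
    print("B:", img.getpixel(b_xy))
    # In the standard rendering both lines print the same gray value,
    # even though B looks clearly lighter than A.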

That sounds pretty "correct" to me.


A computer vision system can have multiple ways of processing an image. So at the limit, it could interpret a scene in terms of what a human sees and also have a separate, better understanding of the scene.


The OP shows that computers DO NOT have a better understanding. It's evident they have no understanding at all; they are simply doing math on pixels and latching on to coincidental patterns of color or shading.

People recognize things by building a 3D model in their head, then comparing that to billions of experiential models, finding a match and then using cognition to test that match. "Is that a bird? No, it's just a pattern of dog droppings smeared on a bench. Ha ha!"
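
To make "doing math on pixels" concrete, here is a rough sketch of one well-known way to fool a classifier with nothing but pixel arithmetic (a fast-gradient-sign-style perturbation). This is a related illustration, not the method used in the OP, and `model` is a placeholder for any pretrained, differentiable image classifier:

    # Sketch: nudge every pixel in the direction that increases the model's
    # loss. The change is imperceptible to us but can flip the predicted
    # label. `model` is a placeholder for any differentiable classifier.
    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, image, label, eps=0.01):
        """Return a perturbed copy of `image` (a 1xCxHxW tensor in [0, 1])."""
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        adversarial = image + eps * image.grad.sign()
        return adversarial.clamp(0, 1).detach()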


How could I have better put "So at the limit, it could"?

I meant to talk about what some hypothetical future system could do (which I think was a reasonable context given the comment I replied to), not to characterize current systems.


Sorry, I re-read and see that.

To get there, computers will clearly have to change their approach utterly. A cascaded approach of quick math followed by a more 'cognitive' pass over possible matches could definitely improve on the current state of affairs.
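
Something like this, purely as a hypothetical shape for that cascade (both stages are placeholders, not an existing system):

    # Hypothetical sketch: a cheap "quick math" model proposes candidate
    # labels, then a slower verification pass tests each candidate against
    # the image before anything is accepted. `fast_model` and `verify` are
    # placeholders for whatever those stages would actually be.
    def cascaded_recognize(image, fast_model, verify, k=5, threshold=0.9):
        candidates = fast_model.top_k(image, k)     # [(label, score), ...]
        for label, _score in candidates:
            # Stage 2: e.g. check that the parts and geometry the label
            # implies are actually present in the image.
            if verify(image, label) >= threshold:
                return label
        return None  # refuse to answer rather than latch onto noise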


>People recognize things by building a 3D model in their head, then comparing that to billions of experiential models, finding a match and then using cognition to test that match. "Is that a bird? No, it's just a pattern of dog droppings smeared on a bench. Ha ha!"

So you're saying people are generative reasoners with very general hypothesis classes rather than discriminative learners with tiny hypothesis classes.

To which the obvious response is, yes, we know that. The question is how to make some computerization of general, generative learning work fast and well.
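
For anyone unfamiliar with the jargon, a toy contrast between the two kinds of learner (synthetic data and standard scikit-learn models, nothing specific to vision):

    # Toy contrast: a generative learner (GaussianNB) models how each class's
    # data is produced and classifies via Bayes' rule; a discriminative
    # learner (LogisticRegression) just fits a boundary between the labels.
    import numpy as np
    from sklearn.naive_bayes import GaussianNB            # generative
    from sklearn.linear_model import LogisticRegression   # discriminative

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)

    gen = GaussianNB().fit(X, y)            # learns p(x | class)
    disc = LogisticRegression().fit(X, y)   # learns p(class | x) directly
    print(gen.predict_proba([[1.5, 1.5]]))
    print(disc.predict_proba([[1.5, 1.5]]))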


People are far more than that. Lots of our brain is dedicated to visual modeling. Those 'hypothesis classes' are just the tip of the iceberg. For computers, they're the whole enchilada. To mix metaphors.


I don't think your description of how humans recognize things is true. You can do object recognition of a silhouette, or a 2D image, or an impressionistic rendering. http://www.sciencedaily.com/releases/2009/04/090429132231.ht...


We can't help but build real models of what we see - our retina/optic nerve are already doing this before our brain even receives the 'image'!

I can't help but believe some of the image recognition mentioned in your article, especially of icons, is built through previous experience with similar iconic images. Symbols for things become associated with the real things. It's a modern adaptation of a much older processing mechanism.


OK... but how is that pattern-matching different from what the computer is doing? Why is human pattern-matching "understanding" and computer pattern-matching not?


It's the second stage of cognitive engagement that makes humans different. Of course a field of static isn't a panda. The computer has no capacity to recognize the context.


I think I get your point now. It's OK if a human momentarily mistakes a random blob for a panda, but they should be able to figure out from other visual cues and context that it's not a panda. And it's that second part that's missing from the computer models?


That's it. Both consciously and subconsciously - there's a lot of image filtering going on that we're not aware of.


That would be interesting; it could flag inputs that are ambiguous to humans but not machines (or vice versa, or when there's a discrepancy at all) since it could suggest that something shady is happening.
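
Sketching that flagging idea (both models here are hypothetical placeholders; the point is just the comparison):

    # Hypothetical sketch: run the same input through a model tuned to mimic
    # human judgments and a plain machine classifier, and flag any input
    # where they disagree or where either is unsure.
    def flag_suspicious(image, human_like_model, machine_model, min_conf=0.8):
        h_label, h_conf = human_like_model.predict(image)
        m_label, m_conf = machine_model.predict(image)
        disagree = h_label != m_label
        unsure = h_conf < min_conf or m_conf < min_conf
        return disagree or unsure   # True => route to a human reviewer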


You can always fool any system into a paradox. Very grossly speaking, this comes out of Gödel's incompleteness theorem: that with any set of laws you can always get P=~P out of the set. How this paradox looks, acts, or feels is interesting and possibly artistic, as the OP shows. If anything, I think there is a bit of beauty, art, and cleverness in that.


That isn't quite right. With a sufficiently powerful formal system, you're forced to either have inconsistency or incompleteness - you're describing a system that is inconsistent. It's usually much better to have consistency and to sacrifice completeness. Then you'll have Ps that are true but unprovable, but at least you won't have P=~P which makes the system rather useless.


What does true but unprovable mean? What happens if you take such a proposition, negate it, and add it as an axiom?


I'm not a logician by a long shot, so I probably can't explain that correctly. I think Gödel found a way to make a logical proposition refer to itself, and then found a way to assert provability. He could then construct the sentence "this sentence is not provable". He showed that such a sentence must exist within any system of sufficient power. Thus the system must be self-contradictory (inconsistent), or the sentence must be true (and the system must be incomplete). I'm not sure if such a sentence still refers to itself when negated, so I can't answer the last one.
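
For reference, the standard textbook rendering of that construction (not anything from the parent comment): for a consistent, sufficiently strong formal system F, the diagonal lemma gives a sentence G_F that asserts its own unprovability,

    % G_F "says" that G_F is not provable in F:
    F \vdash \; G_F \leftrightarrow \lnot \mathrm{Prov}_F(\ulcorner G_F \urcorner)
    % If F is consistent, F proves neither G_F nor \lnot G_F,
    % so F is incomplete rather than inconsistent.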


My understanding of the incompleteness theorem is that, for a given set of axioms, there will be statements that are true but unprovable. Changing the axioms would change which statements are unprovable.

That being said, here is a much better resource than I am: http://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_t...


Are all humans susceptible to the checker shadow illusion? Or just those that have been trained to interpret flat arrangements of color as accurate representations of 3D scenes and objects? If you showed the checker illusion to someone who had never seen a photograph or representative painting, would they see different colors or the same color?

I don't know the answer. I do find it fascinating that something as simple as perspective (which we take for granted in a graphic like the checkers illusion) is a fairly recent technique, invented by Renaissance artists.


With good lighting control, you could set up the checkerboard illusion in the real world. That's part of the point.


Does it matter if we want them to do it or not? The NSA and China will build them to do that anyway.



