Detecting boundaries and straight lines is something that has been "easily" done for a while now. Categorizing complex images (such as differentiating between a cat and a dog) is still extremely difficult for computers. I'm not really surprised that Watson couldn't do it, I've only recently started hearing about preliminary breakthroughs in the feasibility of this sort of tech.
EDIT: you could send me example images and what you need from them. I could check how much I would need to extend it to handle your case.