I'd add robot manipulation in unstructured situations. Progress has been very slow, but has picked up a little in recent years. Stanford vision-guided robot assembly, 1973.[1] DARPA robot manipulation project, 2012.[2]
This seems like a compendium of metrics for processes on which AI is making progress now. Doing something like that seems like a fine idea - I can't judge the quality of these metrics, but it's hard to be excited by this.
However, what I think would be interesting would be for researchers to make a compendium of "human abilities", classifying and quantifying them as well as possible. One could then analyze the progress which AI could make towards emulating those capacities.
Obviously, this would be a rather crude measure, but it could at least give some idea of AI's progress toward new capacities.
There's no way we have reached human-level performance on image recognition tasks. My guess is that when there is ambiguity (e.g. 'ship' vs 'boat'), the AI is better at learning the answer the labellers chose. Humans haven't looked through the training data, so they use their real-life biases, which may not match those of the labellers.
Just a guess, but whether that is true or not we're definitely not at human-level performance.
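The labeller-convention hypothesis above can be illustrated with a toy simulation (all numbers here are invented for illustration, not taken from any real benchmark): on ambiguous images the "correct" label is simply whichever one the labellers' convention picked, so a model that absorbed that convention from training data scores higher than an equally capable human who guesses by their own prior.

```python
import random

random.seed(0)

N = 10_000
AMBIGUOUS_FRAC = 0.2  # invented: fraction of images where 'ship' vs 'boat' is a toss-up

def benchmark_accuracy(convention_agreement):
    """Accuracy measured against labels that follow the labellers' convention.

    convention_agreement: probability of matching the labellers' choice on an
    ambiguous image; unambiguous images are assumed always answered correctly.
    """
    correct = 0
    for _ in range(N):
        if random.random() < AMBIGUOUS_FRAC:
            if random.random() < convention_agreement:
                correct += 1
        else:
            correct += 1
    return correct / N

# A model trained on the labelled data picks up the labellers' convention...
model_acc = benchmark_accuracy(convention_agreement=0.95)
# ...while a human who never saw the training set agrees only by chance.
human_acc = benchmark_accuracy(convention_agreement=0.50)

print(f"model: {model_acc:.3f}")   # ~0.99 on the benchmark
print(f"human: {human_acc:.3f}")   # ~0.90, despite equal real-world ability
```

On this toy setup the model "beats" the human on the benchmark purely by matching the labelling convention, which is the sense in which benchmark scores can overstate progress toward human-level recognition.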
Well, the challenge is developing tests, more fine-grained than simple labeling, that humans perform better at.
Things like labeling and segmenting, or labeling according to natural-language instructions, or whatever.
I mean, AI has naturally made more progress in situations where progress can be exactly quantified, but the activity of creating tests is only somewhat separated from the activity of creating AIs. More tests in different directions could spur more research in those directions.
1) No robotics? No interaction with the physical world at all?
2) No measure of the AI's ability to teach others? How can you say an AI really understands if it can't then 1) teach what it has learned, and 2) understand which essential facts a tyro doesn't know or misunderstands?
3) No assessment of the AI's semantic interpretive skills, like those long emphasized by cognitive scientists, such as those in Doug Hofstadter's "Fluid Concepts and Creative Analogies" -- i.e. Miller analogies, double entendres, literary symbolism, poetry interpretation, and so on?
Without mastery of analogies, an AI will have all the cultural insightfulness of a portrait of leisure-suit Elvis in neon paint on velvet.
There is research on all the topics you mention; it's just missing here. But you could contribute and extend the list if you're familiar with these topics.
[1] https://archive.org/details/sailfilm_pump [2] https://www.youtube.com/watch?v=jeABMoYJGEU