I'd add robot manipulation in unstructured situations. Progress has been very slow, but has picked up a little in recent years. Stanford vision-guided robot assembly, 1973.[1] DARPA robot manipulation project, 2012.[2]
This seems like a compendium of metrics for processes on which AI is making progress now. Doing something like that seems like a fine idea - I can't judge the quality of these metrics, but it's hard to be excited by this.
However, what I think would be interesting would be for researchers to make a compendium of "human abilities", classifying and quantifying them as well as possible. One could then analyze the progress which AI could make towards emulating those capacities.
Obviously, this would be a rather crude measure, but it could at least give some idea of AI's progress toward new capacities.
There's no way we have reached human-level performance on image recognition tasks. My guess is that when there is ambiguity (e.g. 'ship' vs 'boat'), the AI is better at learning the answer the labellers chose. Humans haven't looked through the training data, so they use their real-life biases, which may not match those of the labellers.
Just a guess, but whether that is true or not we're definitely not at human-level performance.
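The labeller-convention hypothesis above can be illustrated with a toy simulation (all numbers here are invented for illustration, not taken from any real benchmark): on ambiguous images the "correct" label is simply whichever one the labellers' convention picked, so a model that absorbed that convention from training data scores higher than an equally capable human who guesses by their own prior.

```python
import random

random.seed(0)

N = 10_000
AMBIGUOUS_FRAC = 0.2  # invented: fraction of images where 'ship' vs 'boat' is a toss-up

def benchmark_accuracy(convention_agreement):
    """Accuracy measured against labels that follow the labellers' convention.

    convention_agreement: probability of matching the labellers' choice on an
    ambiguous image; unambiguous images are assumed always answered correctly.
    """
    correct = 0
    for _ in range(N):
        if random.random() < AMBIGUOUS_FRAC:
            if random.random() < convention_agreement:
                correct += 1
        else:
            correct += 1
    return correct / N

# A model trained on the labelled data picks up the labellers' convention...
model_acc = benchmark_accuracy(convention_agreement=0.95)
# ...while a human who never saw the training set agrees only by chance.
human_acc = benchmark_accuracy(convention_agreement=0.50)

print(f"model: {model_acc:.3f}")   # ~0.99 on the benchmark
print(f"human: {human_acc:.3f}")   # ~0.90, despite equal real-world ability
```

On this toy setup the model "beats" the human on the benchmark purely by matching the labelling convention, which is the sense in which benchmark scores can overstate progress toward human-level recognition.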
Well, the challenge is developing tests, more fine-grained than simple labeling, that humans perform better at.
Things like labeling and segmenting, or labeling according to natural-language instructions, or whatever.
I mean, AI has naturally made more progress in situations where progress can be exactly quantified, but the activity of creating tests is only somewhat separated from the activity of creating AIs. More tests in different directions could spur more research in those directions.
1) No robotics? No interaction with the physical world at all?
2) No measure of the AI's ability to teach others? How can you say an AI really understands if it can't then 1) teach what it has learned, and 2) understand which essential facts a tyro doesn't know or misunderstands?
3) No assessment of the AI's semantic interpretive skills, like those long emphasized by cognitive scientists, such as those in Doug Hofstadter's "Fluid Concepts and Creative Analogies" -- i.e. Miller analogies, double entendres, literary symbolism, poetry interpretation, and so on?
Without mastery of analogies, an AI will have all the cultural insightfulness of a portrait of leisure-suit Elvis in neon paint on velvet.
There is research on all the topics you mention; it's just missing here. But you could contribute and extend the list if you're familiar with these topics.
[1] https://archive.org/details/sailfilm_pump [2] https://www.youtube.com/watch?v=jeABMoYJGEU