Grasp2Vec: Learning Object Representations from Self-Supervised Grasping (googleblog.com)
106 points by rbanffy on Dec 11, 2018 | 9 comments



I'm one of the authors of the paper. Many folks from our lab (including myself) are super bullish on "using robots to supervise representation learning and using representation learning to supervise robots". Happy to answer any questions!


Why is everyone turning to a xxx2vec representation?

Obviously, NNs work well with one-hot and other vector representations. But I keep wondering why some sort of higher-order-input graph network (Kipf et al., etc.) is not more popular.

What has been your experience?


We were inspired by the (emergent) linearity property of Word2Vec representations (man - woman = king - queen). Vector addition (which is Abelian) makes sense for representing sets of objects, e.g.

(cup, table, ball) = (cup, table) + (ball)

Unlike word2vec (where this property magically arises from training a language model), we enforce this property in our training objective in order to learn good visual models.
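To make the objective concrete, here is a minimal sketch in plain Python/NumPy of the arithmetic-consistency idea (the random vectors and the contrastive loss below are stand-ins, not the paper's exact encoders or loss): embed the scene before and after the grasp, and train the difference to line up with the embedding of the grasped object while pushing it away from objects that weren't grasped.

    import numpy as np

    def cosine_sim(a, b):
        # cosine similarity between two embedding vectors
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def arithmetic_consistency_loss(phi_pre, phi_post, phi_obj, phi_distractors):
        # "what disappeared from the scene" should match the grasped object
        diff = phi_pre - phi_post
        pos = cosine_sim(diff, phi_obj)                        # want this high
        negs = [cosine_sim(diff, d) for d in phi_distractors]  # want these low
        logits = np.array([pos] + negs)
        # softmax cross-entropy with the grasped object as the correct class
        return -logits[0] + np.log(np.sum(np.exp(logits)))

    # toy check with random 16-d vectors standing in for CNN features
    rng = np.random.default_rng(0)
    obj = rng.normal(size=16)
    pre = obj + rng.normal(size=16)   # scene containing the object plus clutter
    post = pre - obj                  # same scene with the object removed
    negs = [rng.normal(size=16) for _ in range(4)]
    print(arithmetic_consistency_loss(pre, post, obj, negs))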

Unlike typical uses of graph and relational networks, we explicitly wanted to throw away spatial relations between objects, to force the representation to keep track of which objects remain even if the robot rearranges the scene during the grasp. I think a graph network makes sense if one wants to model pairwise relations between objects, though.


This raises the question of creating something like “2vec2vec”, a vectorization model that takes vectorization models as input and embeds them in a vector space that encodes the representational semantics of vectorization models as linear operations in a vector space.

Of course then you just run 2vec2vec through itself and get the vectorized representation of a vectorized representation model of vectorized representations.


This would define an AI that classifies if something has jumped the shark or not.


I'll have what this gentleman is having.


It's a really cool and intuitive idea... have you attempted to link it to any of the literature on sensorimotor system development/learning in young people/animals?


Thanks for coming on, OP. Am wondering if you have any plans to share the code as well?


It's linked from above - putting it here for anyone else looking for videos.

I found more content at https://sites.google.com/site/grasp2vec/

And this video:

https://drive.google.com/file/d/1Z1q7zSQERrm_tgboGGoG8MHewj1...



