Some truly impressive results. I'll pick my usual point here when a fancy new (g...

sendtown_expwy · on Jan 6, 2021

It seems unlikely the model has seen "baby daikon radishes in tutus walking dogs," or cubes made out of porcupine textures, or any other number of examples the post gives.

m3at · on Jan 6, 2021

It might not have seen that specific combination, but finding an anthropomorphized radish sure is easier than I thought: type "大根アニメ" in your search engine and you'll find plenty of results

numpad0 · on Jan 6, 2021

Image search “大根擬人化” do return similar results to the AI-generated pictures, e.g. 3rd from top[0] in my environment, but sparse. “大根アニメ” in text search actually gives me results about an old hobbyist anime production group[1], some TV anime[2] with the word in title...hmm

Then I found these[3][4] in Videos tab. Apparently there’s a 10-20 year old manga/merch/anime franchise of walking and talking daikon radish characters.

So the daikon part is already figured in the dataset. The AI picked up the prior art and combined it with the dog part, which is still tremendous but maybe not “figuring out the daikon walking part on its own” tremendous.

(btw anyone knows how best to refer to anime art style in Japanese? It’s a bit of mystery to me)

0: https://images.app.goo.gl/LPwveUJPWHr6oK8Y8

1: https://ja.wikipedia.org/wiki/DAICON_FILM

2: https://ja.wikipedia.org/wiki/%E7%B7%B4%E9%A6%AC%E5%A4%A7%E6...

3: https://youtube.com/watch?v=J1vvut5DvSY

4: https://youtu.be/1Gzu2lJuVDQ?t=42

tkgally · on Jan 6, 2021

> anyone knows how best to refer to anime art style in Japanese?

The term mangachikku (漫画チック, マンガチック, "manga-tic") is sometimes used to refer to the art style typical of manga and anime; it can also refer to exaggerated, caricatured depictions in general. Perhaps anime fū irasuto (アニメ風イラスト, anime-style illustration), while a less colorful expression, would be closer to what you're looking for.

ronsor · on Jan 6, 2021

At least for certain types of art, sites such as pixiv and danbooru are useful for training ML models: all the images on them are tagged and classified already.

Alex3917 · on Jan 6, 2021

If you type in different plants and animals into GIS, you don’t even get the right species half the time. If GPT-3 has solved this problem, that would be substantially more impressive than drawing the images.

agravier · on Jan 6, 2021

What is GIS? I only know Geographical Information System.

the8472 · on Jan 6, 2021

probably Google Image Search

spyder · on Jan 6, 2021

Yea, with these kind of generative examples, they should always include the closest matches from the training set to see how much it just "copied".

londons_explore · on Jan 8, 2021

It's very hard to define closest...

jonesn11 · on Jan 6, 2021

This is a spot on point. My prediction is that it wouldn't be able to. Given its difficulty to generate correct counts of glasses, it seems as though it still struggles with systematic generalization and compositionality. As a point of reference, cherrypicking aside, it could model obscure but probably well-defined baby daikon radish in tutu walking dog, but couldn't model red on green on blue cubes. Maybe more sequential perception, action, video data or system-2 like paradigm, but it remains to be seen.

adsche · on Jan 6, 2021

Yes, I don't really see impressive language (i.e. GPT3) results here? It seems to morph the images of the nouns in the prompt in an aesthetically-pleasing and almost artifact-free way (very cool!).

But it does not seem 'understand' anything like some other commenters have said. Try '4 glasses on a table' and you will rarely see 4 glasses, even though that is a very well-defined input. I would be more impressed about the language model if it had a working prompt like: "A teapot that does not look like the image prompt."

I think some of these examples trigger some kind of bias, where we think: "Oh wow, that armchair does look like an avocado!" - But morphing an armchair and an avocado will almost always look like both because they have similar shapes. And it does not 'understand' what you called 'object concepts', otherwise it should not produce armchairs where you clearly cannot sit in due to the avocado stone (or stem in the flower-related 'armchairs').

ralfd · on Jan 6, 2021

> I would be slightly more impressed about the language model if it had a working prompt like: "A teapot that does not look like the image prompt."

Slightly? Jesus, you guys are hard to please.

adsche · on Jan 6, 2021

Right, that was unnecessary and I edited it out.

What I meant is that 'not' is in principal an easy keyword to implement 'conservatively'. But yes, having this in a language model has proven to be very hard.

Edit: Can I ask, what do you find impressive about the language model?

dash2 · on Jan 6, 2021

Perhaps the rest of the world is less blasé - rightly or wrongly. I do get reminded of this: https://www.youtube.com/watch?v=oTcAWN5R5-I when I read some comments. I mean... we are telling the computer "draw me a picture of XXX" and it's actually doing it. To me that's utterly incredible.

adsche · on Jan 6, 2021

> "draw me a picture of XXX" and it's actually doing it. To me that's utterly incredible.

Sure, would be, but this is not happening here.

And yes, rest assured, the rest of the world is probably less 'blasé' than I am :) Very evident by the hype around GPT3.

viggity · on Jan 6, 2021

I'm in the open ai beta for GPT-3, and I don't see how to play with DALL-E. Did you actually try "4 glasses on a table"? If so, how? Is there a separate beta? Do you work for open ai?

nicholast · on Jan 6, 2021

In the demonstrations click on the underlined keywords and you can select alternates from dropdown menu.

hanniabu · on Jan 7, 2021

Sounds like the perfect case for a new captcha system. Generate a random phrase to search an image for, show the user those results, ask them to select all images matching that description.