
Yes, I don't really see impressive language (i.e. GPT-3-level) results here. It seems to morph the images of the nouns in the prompt in an aesthetically pleasing and almost artifact-free way (very cool!).

But it does not seem to 'understand' anything the way some other commenters have said. Try '4 glasses on a table' and you will rarely see 4 glasses, even though that is a very well-defined input. I would be more impressed about the language model if it had a working prompt like: "A teapot that does not look like the image prompt."

I think some of these examples trigger some kind of bias, where we think: "Oh wow, that armchair does look like an avocado!" But morphing an armchair with an avocado will almost always look like both, because they have similar shapes. And it does not 'understand' what you called 'object concepts'; otherwise it would not produce armchairs you clearly cannot sit in due to the avocado stone (or stem, in the flower-related 'armchairs').
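To make the '4 glasses' test above concrete: you could score samples automatically with an off-the-shelf detector. This is purely my own sketch, not anything OpenAI does; DALL-E has no public API, so generate_images below is a hypothetical stand-in, and only the torchvision calls are real.

  import torch
  import torchvision
  from torchvision.transforms.functional import to_tensor

  # Pretrained COCO detector; 46 is 'wine glass' in torchvision's
  # 91-class COCO indexing (47 would be 'cup').
  WINE_GLASS = 46
  detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
  detector.eval()

  def count_glasses(pil_image, min_score=0.5):
      """Count detected wine glasses in one generated image."""
      with torch.no_grad():
          preds = detector([to_tensor(pil_image)])[0]
      return sum(1 for label, score in zip(preds["labels"], preds["scores"])
                 if label.item() == WINE_GLASS and score.item() >= min_score)

  # Hypothetical usage (generate_images stands in for the image model):
  # samples = generate_images("4 glasses on a table", n=32)
  # hits = sum(count_glasses(img) == 4 for img in samples)
  # print(f"{hits}/{len(samples)} samples actually contain 4 glasses")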




> I would be slightly more impressed about the language model if it had a working prompt like: "A teapot that does not look like the image prompt."

Slightly? Jesus, you guys are hard to please.


Right, that was unnecessary and I edited it out.

What I meant is that 'not' is in principle an easy keyword to implement 'conservatively'. But yes, getting this to work in a language model has proven to be very hard.
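For what it's worth, here is one thing 'conservatively' could mean: rejection-sample and discard candidates that CLIP (the real model OpenAI released alongside DALL-E) scores as too similar to the negated concept. Again just my own sketch, not how DALL-E works; generate_images is a hypothetical stand-in, and the similarity threshold is a guess, not a tuned value.

  import torch
  import clip  # https://github.com/openai/CLIP

  model, preprocess = clip.load("ViT-B/32", device="cpu")

  def filter_out(images, negated_concept, threshold=0.25):
      """Keep only images whose CLIP similarity to the negated
      concept stays below the threshold. images are PIL images."""
      tokens = clip.tokenize([negated_concept])
      with torch.no_grad():
          text_feat = model.encode_text(tokens)
          text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
          kept = []
          for img in images:
              img_feat = model.encode_image(preprocess(img).unsqueeze(0))
              img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
              if (img_feat @ text_feat.T).item() < threshold:
                  kept.append(img)
      return kept

  # Hypothetical usage:
  # candidates = generate_images("a teapot")
  # results = filter_out(candidates, "an avocado")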

Edit: Can I ask, what do you find impressive about the language model?


Perhaps the rest of the world is less blasé, rightly or wrongly. Some of these comments remind me of this: https://www.youtube.com/watch?v=oTcAWN5R5-I. I mean... we are telling the computer "draw me a picture of XXX" and it's actually doing it. To me that's utterly incredible.


> "draw me a picture of XXX" and it's actually doing it. To me that's utterly incredible.

Sure, it would be, but that is not what's happening here.

And yes, rest assured, the rest of the world is probably less 'blasé' than I am :) That's very evident from the hype around GPT-3.


I'm in the OpenAI beta for GPT-3, and I don't see how to play with DALL-E. Did you actually try "4 glasses on a table"? If so, how? Is there a separate beta? Do you work for OpenAI?


In the demonstrations, click on the underlined keywords and you can select alternates from a dropdown menu.



