The readme.md has some image examples. If I interpret them correctly, the first one is the input image, the second the segmented output, and the rest are example outputs using the prompt text shown above the collage.

The visual quality of the output images is not particularly impressive compared to what we've become used to.

What it attempts to showcase (IMO) is how the segmentation of the input image is used to guide the final image generation. That part is quite impressive: the shapes and "segments" are very well preserved from input to output.

"The human prompt and BLIP2 generated prompt build the text instruction." The examples only show the BLIP2 prompt.
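To make the quoted sentence concrete, here is a rough guess at what "build the text instruction" could mean when both prompts are present. This is only a sketch: `build_text_instruction`, its parameters, and the comma separator are hypothetical names I made up, not anything taken from the repo, and the actual project may combine the prompts differently.

```python
def build_text_instruction(human_prompt: str, blip2_caption: str, sep: str = ", ") -> str:
    """Join the non-empty prompt parts into one text instruction.

    Hypothetical helper for illustration only; not from the project.
    """
    # Drop missing or whitespace-only parts so a lone caption is returned unchanged.
    parts = [p.strip() for p in (human_prompt, blip2_caption) if p and p.strip()]
    return sep.join(parts)


# With no human prompt, the instruction is just the caption, which would
# match examples that only show the BLIP2 prompt.
caption_only = build_text_instruction("", "a dog on a beach")
combined = build_text_instruction("oil painting", "a dog on a beach")
```

Under this assumption, an empty human prompt would explain why the examples show only the BLIP2 text.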