
> if we asked an LLM to produce an image of a "human woman photorealistic" it produces result

Large language models don't do that. You'd want an image model.

Or did you mean "multi-model AI system" rather than "LLM"?




It might be possible for a language model to paint a photorealistic picture though.


It is not.

You are confusing LLMs with generative AI.


No, I'm not confusing it. I realize that LLMs sometimes connect with diffusion models to produce images. I'm talking about language models actually describing pixel data of the image.
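To make "describing pixel data" concrete: one way this could work, purely as an illustration, is for the model to emit an image in a plain-text format such as PPM (the "P3" variant), which other code then decodes. The sample text below is hand-written, not real model output, and `parse_ppm_p3` is a hypothetical helper that skips PPM comment lines for brevity.

```python
def parse_ppm_p3(text: str) -> list[list[tuple[int, ...]]]:
    """Decode a plain-text PPM (P3) image into rows of (r, g, b) tuples.

    Simplified sketch: does not handle '#' comment lines.
    """
    tokens = text.split()
    if len(tokens) < 4 or tokens[0] != "P3":
        raise ValueError("not a plain-text PPM (P3) image")
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:]]
    if len(values) != width * height * 3:
        raise ValueError("pixel count does not match the header")
    if any(v < 0 or v > maxval for v in values):
        raise ValueError("channel value out of range")
    # Group the flat channel list into pixels, then pixels into rows.
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


# Hand-written stand-in for text a language model might emit token by token.
sample_text = "P3\n2 2\n255\n255 0 0  0 255 0\n0 0 255  255 255 255\n"
rows = parse_ppm_p3(sample_text)
```

This says nothing about whether a model could produce a photorealistic image this way, only that pixel data can be expressed as text.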


Can an LLM use tools like humans do? Could it use an image model as a tool to query the image?


No, an LLM is a Large Language Model.

It can language.


You could teach it to emit patterns that (through other code) invoke tools, and loop the results back to the LLM.
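A minimal sketch of that loop, under loud assumptions: `fake_llm` and `fake_image_model` are stand-ins written for this example (no real model or API is called), and the `<tool name="...">` marker is a made-up pattern, not any vendor's format.

```python
import re

# Hypothetical marker the model is taught to emit: <tool name="...">argument</tool>
TOOL_PATTERN = re.compile(r'<tool name="(\w+)">(.*?)</tool>', re.DOTALL)


def fake_image_model(prompt: str) -> str:
    """Stand-in for an image model; returns a description instead of pixels."""
    return f"[image generated for: {prompt}]"


TOOLS = {"image": fake_image_model}


def fake_llm(transcript: str) -> str:
    """Stand-in for a language model: text in, text out, nothing else."""
    if "TOOL RESULT" in transcript:
        return "Done. The tool output is shown above."
    return 'I will call a tool. <tool name="image">human woman, photorealistic</tool>'


def run(prompt: str, max_steps: int = 5) -> str:
    """Ask the model, run any tool it names, and feed the result back in."""
    transcript = prompt
    for _ in range(max_steps):
        reply = fake_llm(transcript)
        transcript += "\n" + reply
        match = TOOL_PATTERN.search(reply)
        if match is None:
            return transcript  # no tool requested, so the model is finished
        name, argument = match.groups()
        tool = TOOLS.get(name)
        result = tool(argument) if tool else f"unknown tool {name!r}"
        # Loop the tool output back to the model as plain text.
        transcript += f"\nTOOL RESULT: {result}"
    return transcript


output = run("Draw a portrait.")
```

The model itself still only produces text; the surrounding code is what recognizes the pattern and does the invoking.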



