if you are talking about latent diffusion, No, the "particle" is in a hyper-dimensional space, like for example a 10k-dimensional space. We are not supposed to interpret the meaning of that vector.
and when that particle has moved to the right location, there is a decoder that converts it into an image. The decoder network knows how to interpret it.
Does this mean that you need this vector state in order to generate the next step? Ie I can't take an image (the pixels) and the prompt and run a few more steps on it?
and when that particle has moved to the right location, there is a decoder that converts it into an image. The decoder network knows how to interpret it.