Isn't it quite a leap to go from a single image to a usable 3DGS model? The editing part seems like a relatively minor step afterwards. I thought that 3DGS typically required multiple viewpoints, like photogrammetry.
It's not "real" 3D -- the model doesn't infer anything about unseen portions of the image. They get 3D-embedded splats out of their pipeline, and then can do cool things with them. But those splats represent a 2D image, without inferring (literally or figuratively) anything about hidden parts of the image.
This is what I initially thought too; however, I have seen working demos of 3DGS from a single viewpoint, armed with additional auxiliary data that is contextually relevant to the subject.
Yeah exactly, this page doesn't explain what's going on at all.
It says it uses a mirror image to do a Gaussian splat. How does that infer any kind of 3D geometry? An image and its mirror are fully explainable by a single flat plane, and that's probably what the splat will converge to if given only those two views.
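To make the degeneracy concrete, here is a toy sketch (my own illustration, not anything from the linked page): a flat textured plane, viewed head-on and from the mirrored side, reproduces an image and its horizontal flip with zero residual. So a splat optimizer given only those two views has no photometric incentive to recover any depth variation.

```python
import numpy as np

# Toy illustration (hypothetical setup): a flat plane at constant depth
# carrying the photo as a texture, rendered orthographically.
rng = np.random.default_rng(0)
image = rng.random((4, 6))  # stand-in for an arbitrary photo

# View 1: camera looks straight at the plane -> sees the texture as-is.
render_view1 = image.copy()

# View 2: mirrored camera (x -> -x) -> sees the horizontally flipped texture.
render_view2 = image[:, ::-1].copy()

# Both "observations" (the image and its mirror) are matched exactly by
# the single flat plane: the reconstruction residual is zero for both.
residual = np.abs(render_view1 - image).sum() + \
           np.abs(render_view2 - np.fliplr(image)).sum()
print("total residual:", residual)  # -> total residual: 0.0
```

Since the flat-plane solution already achieves zero error on both views, nothing in a photometric loss pushes the Gaussians off the plane; any real 3D structure would have to come from priors or auxiliary data, not from the mirrored pair itself.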