When the output looks the same as the original, we would say that the simulation was successful. That is how computer games do it. We're not asking for the exact position of each grain, just the general outline of the pile.
An image of something is likely to be the simplest model of that thing happening, and it holds A LOT less information than a 3D model of arbitrary resolution would.
A simulation is never an "image". It may simulate each grain; I'm just saying it doesn't need to simulate each one precisely, because the law of large numbers kicks in.
This is the basis of, for example, Monte Carlo simulation, which models the real world using random numbers it generates.
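To make the law-of-large-numbers point concrete, here is a rough Python sketch. It is a toy, not a physical sand model: each grain only gets a random horizontal landing position, and the Gaussian spread, the bin count, and the names `pile_profile` and `outline_gap` are all assumptions made up for illustration. It runs the same "pile" twice with different random seeds and measures how far apart the two outlines end up.

```python
import random

def pile_profile(n_grains, seed, n_bins=20, spread=1.0):
    """Drop n_grains at random horizontal positions and return the
    fraction of grains landing in each bin (the pile's rough outline).
    Assumption: landing positions are Gaussian around the drop point."""
    rng = random.Random(seed)
    lo, hi = -4.0 * spread, 4.0 * spread
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for _ in range(n_grains):
        x = rng.gauss(0.0, spread)
        # Clamp rare outliers into the edge bins so every grain is counted.
        idx = min(n_bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1
    return [c / n_grains for c in counts]

def outline_gap(n_grains):
    """Largest per-bin difference between two independent runs."""
    a = pile_profile(n_grains, seed=1)
    b = pile_profile(n_grains, seed=2)
    return max(abs(p - q) for p, q in zip(a, b))

if __name__ == "__main__":
    # No grain lands in the same place in both runs, yet the outlines
    # should get closer as the number of grains grows.
    for n in (100, 10_000, 100_000):
        print(n, outline_gap(n))
```

The individual grains differ completely between the two runs, but the gap between the outlines should shrink as the grain count goes up, which is the sense in which the simulation doesn't need each grain to be precise.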
Every video game engine is a simulation, and many of them use a very simplified model built from images of things happening instead of simulating the actual physics. Even "physics" in these engines is often just rendering an image.