In my mind, it's because you have a more direct route to the underlying physics of the problem than you get from either hand-coding features or trying to learn them from real data (at least when that real data is as poorly managed and controlled as it usually is).
Our ML algorithms are really good at finding correlations, but we don't necessarily know whether the correlations in our data are actually the ones we want our system to learn. When we're using synthetic data, we have many more levers at our disposal to ensure that the correlations present in the data are the ones we want learned, and that the incidental ones are not.
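To make the "levers" idea concrete, here is a toy sketch in Python with NumPy. Everything in it is an assumption for illustration: the names `make_dataset`, `fit_linear`, and `nuisance_corr` are hypothetical, the data is a two-feature toy rather than anything physical, and plain least squares stands in for whatever learner you actually use. The generator exposes one knob that sets how often an irrelevant "nuisance" feature happens to agree with the label.

```python
import numpy as np


def make_dataset(n, nuisance_corr, rng):
    """Draw n samples with one informative feature and one nuisance feature.

    nuisance_corr (hypothetical knob, in [0, 1]) is the probability that the
    nuisance feature simply copies the label; otherwise it is a fair coin flip.
    """
    y = rng.integers(0, 2, size=n) * 2 - 1            # labels in {-1, +1}
    signal = y + rng.normal(0.0, 1.0, size=n)         # noisy, genuinely informative
    copies_label = rng.random(n) < nuisance_corr
    coin = rng.integers(0, 2, size=n) * 2 - 1
    nuisance = np.where(copies_label, y, coin).astype(float)
    X = np.column_stack([signal, nuisance])
    return X, y.astype(float)


def fit_linear(X, y):
    """Least-squares weights (no intercept; every column is zero-mean by design)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


rng = np.random.default_rng(0)

# "Uncontrolled" data: the nuisance feature agrees with the label 90% of the time.
X_biased, y_biased = make_dataset(20_000, 0.9, rng)
# Synthetic data with the lever pulled: the nuisance feature is independent of the label.
X_ctrl, y_ctrl = make_dataset(20_000, 0.0, rng)

w_biased = fit_linear(X_biased, y_biased)
w_ctrl = fit_linear(X_ctrl, y_ctrl)

print("weights [signal, nuisance], biased data:    ", w_biased)
print("weights [signal, nuisance], controlled data:", w_ctrl)
```

On the biased set the fit should lean mostly on the nuisance column, because that shortcut is less noisy than the real signal; on the controlled set the nuisance weight should come out close to zero and the signal weight should rise. With real data you rarely get to set `nuisance_corr` yourself, which is the point.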