Just curious, but what "error" did the original tweeter make? Did anyone really expect the model to accurately reconstruct the original photo from a pixelated mess? That makes no sense to anyone with even a passing knowledge of ML. You're always going to get craploads of bias and variance (where variance means blatant inaccuracy over and above the bias) in such a setting, even starting from "ideal, unbiased" data. The problem domain itself is what's at issue here.
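To make the "problem domain" point concrete, here's a rough toy sketch, not anything taken from the model in question. It assumes pixelation is plain block averaging (the `pixelate` helper is hypothetical) and uses random arrays as stand-ins for photos. It builds two different high-res images that pixelate to the exact same low-res image, so nothing in the pixelated input can tell a model which one was the original.

```python
import numpy as np

def pixelate(img, factor):
    """Hypothetical stand-in for pixelation: average each factor x factor block."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)
factor = 4

# "Original" high-res image (random values stand in for a real photo).
a = rng.random((8, 8))

# Make some high-frequency detail whose mean inside every block is zero,
# so adding it changes the image without changing the pixelated result.
detail = rng.random((8, 8))
block_means = np.kron(pixelate(detail, factor), np.ones((factor, factor)))
detail -= block_means

b = a + detail

same_low_res = np.allclose(pixelate(a, factor), pixelate(b, factor))
different_high_res = not np.allclose(a, b)
print("same pixelated image:", same_low_res)
print("different originals: ", different_high_res)
```

Whatever the model outputs, it's picking one of many equally consistent candidates, and that choice has to come from what it learned in training rather than from the input.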
Yeah, I get your point. But I guess for this model you can kinda have a concept of the "ideal" training set, where all high-frequency features appear at the same rate as in the real world.