For backwards compatibility, when a web page doesn't declare a character encoding, browsers fall back to the predominant pre-Unicode encoding for your region (windows-1252 in most Western locales, for example).
The older the site, the less likely it is to have been updated. It's therefore reasonable to assume that newer sites either already declare UTF-8 or can be modified to declare it, while old sites stay the way they always were, in whatever pre-UTF-8 encoding they were written in.
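Keeping the backwards-compatibility heuristic the same therefore makes sense: switching the default to UTF-8 would break old, unmaintained pages, while newer pages can simply declare their encoding.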
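
To make the fallback concrete, here is a rough Python sketch of the decision order: use the declared encoding if there is one, otherwise guess from the locale. The names `LEGACY_FALLBACK`, `pick_encoding` and `decode_page` are made up for illustration, the locale table is a tiny hypothetical excerpt, and real browsers do more than this (byte-order marks, a proper `<meta>` prescan, and so on), so treat it as a simplification rather than what any browser actually runs.

```python
import re
from typing import Optional

# Hypothetical, heavily abbreviated locale -> legacy encoding table.
# Real browsers ship a much longer list; these entries are illustrative.
LEGACY_FALLBACK = {
    "en": "windows-1252",
    "ru": "windows-1251",
    "ja": "shift_jis",
}

# Crude match for <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset=["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def pick_encoding(body: bytes, http_charset: Optional[str], locale: str) -> str:
    """Return the encoding to decode a page with (simplified)."""
    # 1. An explicit charset in the HTTP Content-Type header wins.
    if http_charset:
        return http_charset.lower()
    # 2. Otherwise look for a declaration near the top of the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Nothing declared: assume the regional pre-Unicode encoding.
    return LEGACY_FALLBACK.get(locale, "windows-1252")


def decode_page(body: bytes, http_charset: Optional[str], locale: str) -> str:
    """Decode page bytes using the encoding chosen by pick_encoding."""
    return body.decode(pick_encoding(body, http_charset, locale), errors="replace")
```

An old page that declares nothing still decodes correctly under the regional guess, and a newer page only has to add a declaration to opt out of the guess. That is why the default itself doesn't need to change.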