For backwards-compatibility's sake, where a web page doesn't specify a character...

scrollaway · on June 20, 2016

Why is this still the case? UTF-8 is dominant now; wouldn't it make more sense to assume UTF-8?

niftich · on June 20, 2016

The older a site is, the less likely it is to have been updated. It's therefore reasonable to assume that newer sites either declare UTF-8 or can be modified to declare it, while old sites stay the way they always were, pre-UTF-8.

Keeping the backwards-compatibility heuristic the same makes sense.

TazeTSchnitzel · on June 21, 2016

Old sites lacked encoding declarations, and old browsers (e.g. early versions of IE) didn't support them.

Sites that want UTF-8 can ask for it.
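
To make the fallback concrete, here is a rough Python sketch of the decision being discussed: use a declared charset if there is one, otherwise fall back to a legacy encoding. This is not how any particular browser implements it; real browsers also look at byte order marks and pick the default based on locale. The names `pick_encoding` and `LEGACY_DEFAULT` are made up for illustration, and windows-1252 as the default is an assumption (it's the usual one for Western locales).

```python
import re

# Assumed legacy default for undeclared pages (typical for Western locales).
LEGACY_DEFAULT = "windows-1252"

def pick_encoding(content_type, body):
    """Return the declared charset, or the legacy default if none is declared."""
    # 1. A charset parameter in the HTTP Content-Type header wins.
    if content_type:
        m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.I)
        if m:
            return m.group(1).lower()
    # 2. Otherwise look for a <meta> charset declaration near the top of the page.
    m = re.search(rb'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', body[:1024], re.I)
    if m:
        return m.group(1).decode("ascii").lower()
    # 3. Nothing declared: assume the legacy encoding rather than UTF-8.
    return LEGACY_DEFAULT

# An old page with no declaration, containing the single byte 0xE9.
old_page = b"<html><body>caf\xe9</body></html>"
print(old_page.decode(pick_encoding(None, old_page)))

# A newer page that asks for UTF-8 explicitly.
new_page = '<meta charset="utf-8"><p>café</p>'.encode("utf-8")
print(new_page.decode(pick_encoding("text/html", new_page)))
```

The old page's lone 0xE9 byte is not valid UTF-8, so assuming UTF-8 by default would garble or reject it, while the new page costs one declaration to get right.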