
> It should be noted that Unicode uses 2 bytes for each character.

But you're programming "with Ubuntu", not Windows. IMHO you could safely assume/recommend UTF-8.

Just a reminder BTW that since version 2.0 (1996), Unicode is not an encoding scheme but a character set (I avoid the confusing “charset” word on purpose). Therefore, Unicode does not use any number of bytes: it only assigns code points to characters.
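
A quick Python sketch to illustrate (standard library only): a character has one code point, but how many bytes it occupies depends entirely on which encoding you choose.

    # One code point, different byte counts under different encodings.
    s = "é"                               # code point U+00E9
    print(hex(ord(s)))                    # 0xe9 — the code point itself, no bytes involved
    print(len(s.encode("utf-8")))         # 2 bytes
    print(len(s.encode("utf-16-le")))     # 2 bytes
    print(len(s.encode("utf-32-le")))     # 4 bytes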

Windows used to use the UCS-2 encoding scheme, which indeed used 2 bytes for each character, but since Windows 2000 it has used UTF-16 instead, which, like UTF-8, uses a variable number of bytes per character.
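
You can see UTF-16's variable width directly with a quick Python check ('𝄞' is U+1D11E MUSICAL SYMBOL G CLEF, outside the Basic Multilingual Plane):

    print(len("A".encode("utf-16-le")))   # 2 bytes: one 16-bit code unit
    print(len("𝄞".encode("utf-16-le")))   # 4 bytes: U+1D11E needs a surrogate pair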


Indeed. "Unicode" is an abstract character set; it doesn't "use" any bytes. A specific encoding does.


Even with UTF-16, that quote is incorrect because of surrogate pairs. It's only correct for UCS-2, and even then only if you take 'characters' to mean 'code points' and 'Unicode' to mean 'a specific Unicode encoding'.
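
To make that concrete, here's a small Python sketch that exposes the two raw UTF-16 code units behind U+1D11E:

    import struct

    data = "𝄞".encode("utf-16-le")        # U+1D11E encoded as 4 bytes
    units = struct.unpack("<2H", data)    # two little-endian 16-bit code units
    print([hex(u) for u in units])        # ['0xd834', '0xdd1e'] — high and low surrogates

One code point, two code units, four bytes — so "2 bytes per character" fails on every count.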
