
You're right, that should read "codepoint boundary" not "character boundary". I can fix that.

I do briefly mention grapheme clusters near the end, but I didn't want to introduce them properly, as this article was more about the encoding mechanism itself. Maybe a future article after more research :)
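To make the distinction concrete, here is a rough sketch (not taken from the article) of why a codepoint boundary is not the same thing as a character boundary. The helper name `is_codepoint_boundary` is made up for illustration, and it assumes the input is already valid UTF-8.

```python
def is_codepoint_boundary(data: bytes, i: int) -> bool:
    """Return True if byte offset i begins a code point (or is the end of data).

    Hypothetical helper for illustration; assumes `data` is valid UTF-8.
    """
    if i < 0 or i > len(data):
        return False
    if i == 0 or i == len(data):
        return True
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx,
    # so any other byte starts a new code point.
    return (data[i] & 0xC0) != 0x80


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points that
# render as one user-perceived character (one grapheme cluster).
text = "e\u0301"
data = text.encode("utf-8")  # b'e\xcc\x81'

boundaries = [i for i in range(len(data) + 1) if is_codepoint_boundary(data, i)]
print(boundaries)  # [0, 1, 3]
```

Offset 1 is a perfectly valid codepoint boundary, yet cutting the string there separates the accent from its base letter. Finding grapheme cluster boundaries needs the Unicode segmentation rules on top of this, which is beyond what the byte patterns alone can tell you.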




Please do. You have the best visualizations of UTF-8 I have seen so far.

Usually people write about just the UTF-8 encoding part and don't mention the rest of Unicode, because it's clearly not as good or as simple.



