
SethMLarson on Feb 8, 2022 | parent | context | favorite | on: How UTF-8 Works

You're right, that should read "codepoint boundary" not "character boundary". I can fix that.

I do briefly mention grapheme clusters near the end, but I didn't want to introduce them in depth, as this article was more about the encoding mechanism itself. Maybe a future article after more research :)
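To make the distinction concrete, here is a minimal Python sketch (not taken from the article; the helper name `codepoint_boundaries` is made up for illustration, and it assumes the input is already valid UTF-8). It finds code point boundaries by skipping continuation bytes, then shows that such a boundary can still fall in the middle of a grapheme cluster:

```python
def codepoint_boundaries(data: bytes) -> list[int]:
    """Return the byte offsets at which a code point starts.

    Assumes `data` is valid UTF-8. Continuation bytes have the bit
    pattern 10xxxxxx; any other byte begins a new code point.
    """
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]


# 'e' followed by U+0301 COMBINING ACUTE ACCENT:
# two code points, but a single grapheme cluster on screen.
text = "e\u0301"
encoded = text.encode("utf-8")  # 3 bytes: 0x65 0xCC 0x81

boundaries = codepoint_boundaries(encoded)

# Offset 1 is a code point boundary, so both halves decode cleanly,
# yet the split separates the accent from its base letter.
head = encoded[:1].decode("utf-8")
tail = encoded[1:].decode("utf-8")

# Offset 2 is NOT a code point boundary: it cuts the two-byte
# sequence for U+0301 in half, so decoding the prefix fails.
try:
    encoded[:2].decode("utf-8")
    split_mid_codepoint_ok = True
except UnicodeDecodeError:
    split_mid_codepoint_ok = False
```

So slicing at a code point boundary keeps the bytes valid UTF-8, but it takes a grapheme-aware segmenter (which the Python standard library does not provide) to avoid splitting what a reader sees as one character.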

nabla9 on Feb 8, 2022 [–]

Please do. You have the best visualizations of UTF-8 I have seen so far.

Usually people write about just the UTF-8 encoding part and don't mention the rest of Unicode, because it's clearly not as good or as simple.
