Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> I would argue that for most programs when you're doing string manipulation you're doing it for internal programming reasons - logs, error messages, etc. In that case you are in nearly full control of the strings and therefore can declare that you're only working with ASCII.

Why would you argue that? In my experience it's about formatting things that are addressed to the user, where the hardest and most annoying localization problems matter a lot. That includes sorting the last name "van den Berg" just after "Bakker", stylizing it as "Berg, van den", and making sure this capitalization is correct and not "Van Den Berg". There is no built in standard library function in any language that does any of that. It's so much larger than ascii and even larger than unicode.

Another user said that the main takeaway is that you can't process strings until you know their language (locale), and that is exactly correct.



I would maintain that your program has more string manipulation for error messages and logging than for generating localised formatted names.

Further I do say that if you're creating text for presenting to the user then the most common operation would be replacement of some field in pre-defined text.

In your case I would design it so that the correctly capitalised first name, surname, and variations of those for sorting would be generated at the data entry point (manually or automatically) and then just used when needed in user facing text generation. Therefore the only string operation needed would be replacement of placeholders like the fmt and standard library provide. This uses more memory and storage but these are cheaper now.


I agree, but the logging formatters don't really do much beyond trivially pasting in placeholders.

And as for data entry... Maybe in an ideal world. In the current world, marred by importing previously mangled datasets, a common solution in the few companies I've worked at is to just not do anything, which leaves ugly edges, yet is "good enough".




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: