But one person edited the word foo. The other edited a word that no longer exists. I get why this feels like a clever combination of the edits, but I'm struggling to see how doing this at the character level makes any sense.
Consider instead that you could do this at the byte level, with equally off results.
At higher levels, this trick sounds useful. But you have to pick the level of abstraction at which conflicts simply go back to the user.
So, people edit the same document, but at different paragraphs? Fine. They edit the same paragraph? Almost certainly a problem. No different than code.
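To make the paragraph-level idea concrete, here is a minimal sketch of a three-way merge that works on whole paragraphs. The function name `merge_paragraphs` and the restriction to versions with the same paragraph count are my own simplifying assumptions, not how any particular tool does it.

```python
def merge_paragraphs(base, ours, theirs):
    """Three-way merge at paragraph granularity (illustrative sketch).

    Assumes all three versions have the same number of paragraphs,
    i.e. nobody inserted or deleted a paragraph.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("sketch only handles versions of equal length")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)      # both sides agree (or neither changed it)
        elif o == b:
            merged.append(t)      # only "theirs" touched this paragraph
        elif t == b:
            merged.append(o)      # only "ours" touched this paragraph
        else:
            merged.append(b)      # both changed it: keep base, flag for a human
            conflicts.append(i)
    return merged, conflicts


base = ["Intro.", "The foo is red.", "Outro."]

# Edits to different paragraphs merge cleanly.
print(merge_paragraphs(base,
                       ["Intro!", "The foo is red.", "Outro."],
                       ["Intro.", "The foo is red.", "Outro?"]))

# Edits to the same paragraph go back to the user.
print(merge_paragraphs(base,
                       ["Intro.", "The bar is red.", "Outro."],
                       ["Intro.", "The foo is blue.", "Outro."]))
```

The second call returns the base text plus a conflict index instead of trying to splice the two edits together, which is the same call a VCS makes for overlapping hunks.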
Different strategies have different tradeoffs and work better for different use cases. If you have multiple people interactively editing the same document (i.e. syncs are happening regularly), it can feel more natural to err on the side of applying each user's edits and letting the humans work it out, rather than flagging a conflict that must be resolved before proceeding. When using Google Docs I've had the occasional awkward "after you, no after YOU" moment while trying to edit the same text, but it's pretty rare. You obviously wouldn't want to use this same approach for asynchronous/offline collaboration, where you need more explicit conflict resolution of the kind a VCS offers.
The place where this kind of character-level approach actually does start to fall apart is when users can make larger structural changes to the document with single actions -- reordering lists, cutting and pasting chunks of text, etc. There are other options for that.
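A toy illustration of the reordering problem, under heavy assumptions: items carry ids, a "move" is modeled as delete plus insert, and inserts use plain list indexes. Real systems use position identifiers rather than indexes, and the helper names here (`apply_ops`, `move_as_ops`) are made up for the example.

```python
def apply_ops(items, ops):
    """Apply delete/insert ops to a list of (id, text) pairs."""
    items = list(items)
    for op in ops:
        if op[0] == "delete":
            _, target_id = op
            # Deleting an id that is already gone is a harmless no-op.
            items = [it for it in items if it[0] != target_id]
        elif op[0] == "insert":
            _, index, new_id, text = op
            items.insert(min(index, len(items)), (new_id, text))
    return items


def move_as_ops(item_id, text, new_index, new_id):
    """Model a move as 'delete the old item, insert a fresh copy'."""
    return [("delete", item_id), ("insert", new_index, new_id, text)]


doc = [("a", "milk"), ("b", "eggs"), ("c", "bread")]

# Two users concurrently move the same item to different places.
alice = move_as_ops("a", "milk", 2, "a-alice")
bob = move_as_ops("a", "milk", 1, "a-bob")

merged = apply_ops(doc, alice + bob)
print([text for _, text in merged])
```

Both deletes collapse into one, but both inserts survive, so the moved item ends up in the list twice. Each user's intent was "one item, new position", and that intent is lost once the move is flattened into low-level operations.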