Keywords are quite easy: just plug your identifiers into the hash table after th...

Keywords are quite easy: just plug your identifiers into the hash table after their boundaries are detected. Again, it is much more expensive than generating a FSM, but the expense is really in the noise when everything else is considered.

As for incremental lexing, you need to tell if your tokens pre-existed as the same kind (not necessarily the same string!) before the edit or not. It would be trivial to add this to a generator, but how would it then feedback the signals needed to take advantage of the memoization (e.g. by providing a persistent token ID that can unlock pre-existing information about the token). There are simply no standards for that.

In most of the professionally written compilers (e.g. scalac) I've worked on, lexer and parser generators aren't even used, and it really isn't that big of a deal to write these in code; you also get the benefit that the same language is being used. This becomes especially true when IDE services are considered, whereas most generators are pretty much limited to batch applications.

> I have also written my own hashtable for many reasons, but I would never sit down with an array literal and manually put the entries in there by hand: even if I don't make any mistakes, it is a pointless waste of my time that is trivially automatable using a computer.

I see your point, but it really depends on the key space you are optimizing for. You might just put the elements in by hand if a generic algorithm isn't really called for.