
If you rarely got to see letters and just saw fragments of words as something like Chinese characters (tokens), could you count the R's in arbitrary words well?

The bigger issue is that LLMs still need way, way more data than humans get to do what they do. But they also have far fewer parameters than the human brain.



> If you rarely got to see letters and just saw fragments of words as something like Chinese characters (tokens), could you count the R's in arbitrary words well?

While this seems correct, I'm sure I tried this when it was novel and observed that it could split the word into separate letters and then still count them wrong, which suggested something weird was happening internally.

I just now tried to repeat this, and it now counts the "r"'s in "strawberry" correctly (presumably enough examples of this specifically on the internet now?), but I did find it making the equivalent mistake with a German word (https://chatgpt.com/share/6859289d-f56c-8011-b253-eccd3cecee...):

  How many "n"'s are in "Brennnessel"?
But even then, having it spell the word out first fixed it: https://chatgpt.com/share/685928bc-be58-8011-9a15-44886bb522...


Counting letters is such a dull test. LLMs generally have a hard time with this question because the text is tokenized before they see it, so they never receive individual letters and have to go through an involved reasoning process to recover them. It's like asking a color-blind person what color the traffic light is, and declaring him unintelligent because he sometimes gets the answer wrong.
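
To make the tokenization point concrete, here's a minimal sketch using OpenAI's tiktoken library (purely as an illustration; the exact split depends on whichever tokenizer a given model actually uses):

  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")  # one common BPE vocabulary
  for word in ["strawberry", "Brennnessel"]:
      ids = enc.encode(word)
      pieces = [enc.decode([i]) for i in ids]
      # The model only ever sees the integer ids, not characters, so counting
      # letters means recalling the spelling hidden inside each token piece.
      print(word, ids, pieces)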


I mean, if you don’t want to include tests that LLMs are, by definition, bad at, why don’t we do the same thing for humans?


Because this is a gotcha test: they are trained on a tokenized transform of the data that makes it hard. As a system, they can output the word as a string and have an interpreter, as part of the system, count the letters. So there's no real meaningful deficiency, except that providers choose not to spend system prompt space telling the model to do that.
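
A minimal sketch of that "let the interpreter count" idea (the function name and wiring are hypothetical; the point is just that exact counting is trivial once it's done in code rather than over tokens):

  def count_letter(word: str, letter: str) -> int:
      # Case-insensitive count of a single letter in a word.
      return word.lower().count(letter.lower())

  print(count_letter("Brennnessel", "n"))  # 3
  print(count_letter("strawberry", "r"))   # 3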


"tons what they" autocorrected from "to do what they do."

"Paucity of the stimulus" is the term for what I'm talking about with the brain needing much less data, but beyond just more parameters we may have innate language processing that isn't there in other animals; Chomsky has been kind of relegated away now after LLMs but he may still have been right if it isn't just parameter count and or the innate thing different from animals isn't something like transformers. If you look at the modern language program in Chomsky's later years, it does have some remarkably similar things to transformers: permutation independent internal representation, and the merge operation being very similar to transformer's soft max. It's kind of describing something very like single head attention.

We know animals have rich innate neural abilities beyond just keeping the heart beating, breathing, etc.: a foal can be blindfolded from birth, and when the blindfold is taken off several days later it can immediately walk and navigate. Further development goes on, but other animals like cats have a visual system that doesn't seem to develop at all if it doesn't get natural stimulus in a critical early period. Something like that may apply to human language; it may be multiple systems missing from other apes and earlier hominids, but whatever it is, we don't think it had many generations to evolve. Researchers have identified circuits in songbird brains that are also present in humans but not in other apes, and something like that may be one piece of it, for tracking sequences.



