
I liked the paper, and I think what they're doing is interesting, so I'm less negative about this than you are. To a certain extent, saying that writing a full sentence with at least one good candidate rhyme in mind isn't "planning" but merely "maintaining multiple candidates" seems like a nearly tautological semantic distinction to me.

That said, your comment made me think of a follow-up experiment that would be interesting: looking at the top 20 or so highest-probability second lines after adjusting the rabbit / green state. It seems to me we'd get more insight into how the model is thinking, and the result would be relatively easy for humans to parse. You could run through a bunch of completions until you've collected 20 different terminal rhyme words, then show candidate lines sorted by the percentage of the time each rhyme word is chosen (see the sketch below), perhaps.
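To make that concrete, here's a rough sketch of the aggregation step. sample_second_line() is a hypothetical stand-in for however you'd draw one second-line completion from the model under a given rabbit / green intervention; nothing here comes from the paper itself:

    from collections import Counter, defaultdict

    def last_word(line):
        """Terminal rhyme word: last word of the line, punctuation stripped."""
        tokens = [t.strip(".,!?;:'\"") for t in line.split()]
        tokens = [t for t in tokens if t]
        return tokens[-1].lower() if tokens else ""

    def rhyme_word_stats(sample_second_line, target_distinct=20, max_samples=2000):
        """Keep sampling second lines until `target_distinct` different terminal
        rhyme words have appeared (or max_samples is hit), then print each rhyme
        word with the percentage of samples that chose it, sorted by that share."""
        counts = Counter()
        examples = defaultdict(list)   # rhyme word -> sampled lines ending in it
        n = 0
        while len(counts) < target_distinct and n < max_samples:
            n += 1
            line = sample_second_line()            # hypothetical sampler
            word = last_word(line)
            if word:
                counts[word] += 1
                examples[word].append(line)
        for word, count in counts.most_common():
            print(f"{100.0 * count / n:5.1f}%  {word:<12}  e.g. {examples[word][0]}")

Sorting by that percentage gives exactly the "percent of the time the rhyme word is chosen" view described above, with one example line per rhyme word for human inspection.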




Like you, I find the paper's findings interesting. I'm not arguing that LLMs lack the ability to "think" (mechanically); rather, I'm expressing concern that the paper's choice of the word "thinking" might lead to LLMs being anthropomorphized in ways they shouldn't be.

I believe this phenomenon occurs because a high-performance LLM's network already encodes probability distributions over future words, which shows up as elevated activations in certain neurons. It's something that happens during the process of predicting probability distributions over the vocabulary for the next, and later, output tokens.
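For what it's worth, here's a minimal sketch of the last step of that process: reading the next-token probability distribution off a causal LM's logits. The model (gpt2 via Hugging Face transformers) and the prompt are my own illustrative choices, not anything from the paper:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Illustrative model and prompt choices, not from the paper.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "He saw a carrot and had to grab it,\n"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits      # shape: [batch, seq_len, vocab_size]

    # Softmax over the final position gives the distribution for the next token.
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=10)
    for prob, token_id in zip(top.values, top.indices):
        print(f"{prob.item():.3f}  {tokenizer.decode(int(token_id))!r}")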



