A good literary production. I would have been proud of it had I thought of it, but it's hard not to observe a strong "whataboutery" element: if we use "stochastic parrot" as shorthand and you dislike the term, now you understand why we dislike the constant use of "infer", "reason" and "hallucinate".
Parrots are self-aware, complex reasoning brains which can solve problems in geometry, tell lies, and act socially or asocially. They also have complex vocal cords and can perform mimicry. Very few aspects of a parrot's behaviour are stochastic, but even that underplays how complex stochastic systems can be in what they produce. If we label LLM products as Stochastic Parrots, it does not mean they like cuttlefish bones or are demonstrably modelled by Markov chains like Mark V Shaney.
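For anyone who hasn't met Mark V Shaney, here is a minimal sketch (in Python, names my own) of the kind of word-level Markov chain text generator it was: it can only emit transitions it has literally seen, which is the narrow sense of "stochastic parrot" that LLMs plainly exceed.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=12):
    """Emit words by sampling only transitions observed in the training text."""
    state = random.choice(list(chain))
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:
            break
        out.append(random.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

corpus = "the parrot repeats what the parrot hears and the parrot repeats it again"
print(generate(build_chain(corpus)))
```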
Well, parrots can make more parrots and LLMs can't make their own GPUs, so parrots win on that count. But LLMs can interpolate and even extrapolate a little. Have you ever heard a parrot do translation, hearing you say something in English and rendering it in Spanish? So no, LLMs are not parrots. Beyond their debatable abilities, they also work with a human in the loop, which means humans push them outside their original distribution. That's not a parroting act; it's being able to do more than pattern matching and reproduction.
I don't like wading into this debate when semantics are so personal and subjective. But to me it seems almost a sleight of hand to add the "stochastic" part when the weight actually falls on the "parrot" part. Parrots are much more concrete, whereas the term LLM could refer to the general architecture.
The question to me seems: If we expand on this architecture (in some direction, compute, size etc.), will we get something much more powerful? Whereas if you give nature more time to iterate on the parrot, you'd probably still end up with a parrot.
There's a giant impedance mismatch here (the difference in time scales being one part of it). Unless people want to think of parrots as a subset of all animals, in which case "stochastic animal" is what they really mean. But then it's really the difference between "stochastic human" and "human", and I don't think people really want to face that particular distinction.
"Expand the architecture" .. "get something much more powerful" .. "more dilithium crystals, captain"
Like I said elsewhere in this overall thread, we've been here before. Yes, you do see improvements from larger datasets and models weighted over more inputs. But I suggest, or I guess I believe (to be more honest), that no amount of "bigger" here will magically produce AGI simply through the scale effect.

There is no theory behind "more", which means there is no constructed sense of why it should work, and the continued absence of abstract inductive reasoning says to me that this stuff isn't making a qualitative leap into emergent anything.

It's just better at being an LLM. Even "show your working" points to complex causal chains, not actual inductive reasoning as I see it.
And that's actually a really honest answer. Whereas someone of the opposite opinion might say that parroting, in the general copying-a-template sense, actually generalizes to all observable behaviours, because templating systems can be Turing-complete or something like that. It's templates all the way down, including complex induction, so long as there is a meta-template that matches on its symptoms and can be chained.
Induction is a hard problem, but humans can skip the need for infinite compute time (I don't think we have any reason to believe humans have infinite compute) and still give valid answers, because there is some (meta-)structure to be exploited.

Whether machines / neural networks can architecturally exploit this same structure is the truer question.
> this stuff isn't making a qualitative leap into emergent anything.
The magical missing ingredient here is search. AlphaZero used search to surpass humans, and the whole Alpha family from DeepMind is surprisingly strong, but narrowly targeted. AlphaProof uses LLMs and Lean to solve hard math problems. The same kind of problem-solving CoT data is now being used to train current reasoning models, and they get much better results. The missing piece was search.
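To make "adding search" concrete, here is a minimal sketch of a best-first search over candidate reasoning chains. The `propose_steps`, `score` and `is_solved` functions are hypothetical stand-ins of my own; in an AlphaProof-style system they would be an LLM proposing next steps and a verifier such as Lean checking and scoring them.

```python
import heapq

def propose_steps(chain):
    """Return candidate next reasoning steps for a partial chain (toy stand-in for an LLM)."""
    return [chain + [f"step{len(chain)}a"], chain + [f"step{len(chain)}b"]]

def score(chain):
    """Heuristic value of a partial chain; higher is more promising (toy stand-in for a verifier)."""
    return len(chain) - 0.1 * sum(s.endswith("b") for s in chain)

def is_solved(chain):
    """Stand-in for a checker accepting the chain as a complete solution."""
    return len(chain) >= 3

def best_first_search(max_expansions=50):
    """Expand the most promising partial chain first instead of sampling one chain greedily."""
    frontier = [(-score([]), [])]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, chain = heapq.heappop(frontier)
        if is_solved(chain):
            return chain
        for nxt in propose_steps(chain):
            heapq.heappush(frontier, (-score(nxt), nxt))
    return None

print(best_first_search())
```

The point of the sketch is only the shape of the loop: generation proposes, a scorer ranks, and search decides which partial solutions get more compute, which is what the pure next-token-sampling picture of an LLM leaves out.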
I'm sure both of you know this, but "stochastic parrot" refers to the title of a research article that contained a particular argument about LLM limitations that had very little to do with parrots.