
To start with, we know what a person is, rudimentary things about how they behave, our senses, how they commonly work, and can do mental comparisons (reality checks).

We know LLMs don’t start with that, because we initialize them with zeroed or random weights.
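The initialization point can be made concrete. A freshly constructed network layer holds nothing but noise, so any output it produces reflects the random draw rather than knowledge of people or anything else. A minimal NumPy sketch (the layer sizes and scaling here are arbitrary, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# A toy "layer": before any training, its weights are pure noise.
d_model, d_ff = 8, 32  # arbitrary illustrative sizes
W = rng.normal(loc=0.0, scale=d_model ** -0.5, size=(d_model, d_ff))
b = np.zeros(d_ff)  # biases are commonly zero-initialized

# Any "output" at this stage is just a function of random weights.
x = rng.normal(size=d_model)
y = x @ W + b
print(y.shape)  # (32,) -- structurally valid output, semantically meaningless
```

Training then spends enormous compute moving those random numbers toward values that reproduce the data, which is the sense in which the model "starts with nothing."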

Then, their training data can be far more fabricated, even works of fiction, whereas the reality most humans observe is almost always real. We could raise a human in VR or something similar, where technically there would be a comparison. But most humans’ observations connect to expectations in a brain that was designed for the reality we operate in.

Finally, the brain has multiple components that each handle different jobs, with different architectures suited to doing those jobs well. Sections include language, numbers, reasoning, tiers of memory, hallucination prevention, mirroring, and even meta-level functions like reward adjustment. We don’t just speculate that they do different things: damage to those regions shuts down those abilities. Tied to the physical realm are the vision, motor, and spatial areas. We can feel objects, even temperature or pressure changes. That we can do a lot of that without supervised learning shows we’re tailor-made by God for success in this world.

LLMs have one architecture that does one job, which we try to get to do other things, like reasoning or ground truth. We pretend it’s something it’s not. The multimodal LLMs are getting closer with specialized components. But even they aren’t all trained in a coherent way using real-world observations from the senses. There’s usually a gap between systems like these and what the brain does in the real world, just in how each gets reliable information about its operating environment.



> To start with, we know what a person is, rudimentary things about how they behave, our senses, how they commonly work, and can do mental comparisons (reality checks).

How much is this a matter of fidelity? LLMs started with text, now text + vision + sound; it's still not the full package relative to what humans sport, but it captures a good chunk of information.

Now, I'm not claiming equivalence in the training process here, but let's remember that we all spend the first year or two of our lives just figuring out the intuitive basics of "what a person is, rudimentary things about how they behave, our senses, how they commonly work", and from there, we spend the next couple of years learning more explicit and complex aspects of the same. We don't start with any of it hardcoded (and what little we do have was bestowed on us by millennia of a much slower gradient-descent process - evolution).

> LLMs have one architecture that does one job, which we try to get to do other things, like reasoning or ground truth.

FWIW, LLMs have one architecture in a similar sense that the brain has one architecture - brains specialize as they grow. We know that parts of the brain are happy to pick up the slack for differently specialized parts that become damaged or unavailable.

LLMs aren't uniform blobs, either. Now, their architecture is still limited - for one, unlike our brains, they don't learn on-line - they get pre-trained and then remain fixed for inference. How much will a model capable of on-line learning differ structurally from current LLMs, or even from the naive approach of bestowing learning ability on LLMs (i.e. doing a little evaluation and training after every conversation)? We don't know yet.
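The "naive approach" above - evaluate, then take a small training step after each conversation - could be sketched roughly like this. Everything here is hypothetical: the scalar "model", `evaluate`, and `fine_tune_step` are toy stand-ins for whatever evaluation and training machinery a real system would use.

```python
# Hypothetical sketch of naive on-line learning: after each conversation,
# score the model on it, then nudge the weights toward doing better.

def evaluate(model, conversation):
    # Stand-in scoring: penalize distance from the conversation's target.
    return -(model["w"] - conversation["target"]) ** 2  # higher is better

def fine_tune_step(model, conversation, lr=0.1):
    # Stand-in training step: move the weight toward the target.
    model["w"] += lr * (conversation["target"] - model["w"])
    return model

model = {"w": 0.0}  # "pre-trained" weights, frozen until this loop runs
for conversation in [{"target": 1.0}] * 20:
    score = evaluate(model, conversation)   # a little evaluation...
    model = fine_tune_step(model, conversation)  # ...and a little training

print(model["w"])  # drifts toward 1.0 across the 20 conversations
```

Whether anything this simple survives contact with a real LLM (catastrophic forgetting, adversarial conversations, evaluation cost) is exactly the open question.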

I'm definitely not arguing LLMs of today are structurally or functionally equivalent to humans. But I am arguing that learning from the sum total of the Internet isn't meaningfully different from how humans learn, at least for anything that we'd consider part of living in a technological society. I.e. LLMs don't get to experience throwing rocks first-hand like we do, but neither of us gets to experience special relativity.

> Even they aren’t all trained in a coherent way using real-world observations from the senses.

Neither them nor us. I think if there's one insight people should've gotten from the past couple of years, it's that "mostly coherent" data is fine (particularly if any given subset is internally coherent, even if there's little coherence between different subsets) - both humans and LLMs can find the larger coherence if you give them enough such data.



