> For one, humans can integrate absolutely gargantuan amounts of information extremely efficiently.
What we can integrate, we seem to integrate efficiently*; but compared to the quantities used to train AI, we humans may as well be literally vegetables.
* though people do argue about exactly how much input we get from vision etc.; personally I doubt vision input is essential to general human intelligence, because if it were, people born blind would have intellectual development difficulties that I've never heard suggested to exist. David Blunkett's success suggests human intelligence isn't just fine-tuning on top of a massive vision-grounded model.
Low-level details like that aren't relevant to this discussion. Most human processing power is at the cellular level: the amount of processing going on in a single finger dwarfs a modern data center, but we can't leverage that to think, only to live.
So it's not a question of "a lot", it's a question of orders of magnitude versus "the quantities used to train AI".
The Library of Congress has what, 39 million books? Tokenize every single one and you're talking terabytes of training data for an LLM. We can toss blog posts etc. onto that pile, but every word ever written by a person isn't 20 orders of magnitude larger or anything.
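As a sanity check on the "terabytes" claim, here's a rough back-of-envelope in Python; the per-book word count, bytes per word, and tokens-per-word ratio are all assumed round figures, not measurements:

```python
# Rough back-of-envelope: "tokenize every book in the Library of Congress".
# Every figure here is an assumed round number, not a measurement.
books = 39e6            # roughly the catalogued book count mentioned above
words_per_book = 1e5    # assumed average length
bytes_per_word = 6      # ~5 characters plus a space, as plain text
tokens_per_word = 1.3   # typical subword-tokenizer ratio (assumed)

plain_text_tb = books * words_per_book * bytes_per_word / 1e12
tokens_trillions = books * words_per_book * tokens_per_word / 1e12

print(f"~{plain_text_tb:.0f} TB of raw text, ~{tokens_trillions:.0f} trillion tokens")
# -> roughly 23 TB and ~5 trillion tokens: terabytes, as claimed, and nowhere
#    near 20 orders of magnitude beyond what a person could read in a lifetime.
```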
>Hearing is also well into the terabytes worth of information per year.
If we assume that the human auditory system is equivalent to uncompressed digital recording, sure. Actual neural coding is much more efficient, so the amount of data that is meaningfully processed after multiple stages of filtering and compression is plausibly on the order of tens of gigabytes per year; the amount actually retained is plausibly in the tens of megabytes.
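For concreteness, a sketch of that comparison under stated assumptions: CD-quality stereo stands in for "uncompressed digital recording", and the ~10 kbit/s effective post-filtering rate is purely an illustrative guess, not a measured property of the auditory system:

```python
# Illustrative comparison only: "uncompressed digital recording" vs an assumed
# effective information rate after neural filtering. The 10 kbit/s figure is a
# guess for the sake of the arithmetic, not a measured property of hearing.
seconds_per_year = 3600 * 24 * 365

# CD-quality stereo: 44.1 kHz sample rate, 16-bit samples, 2 channels.
raw_bytes_per_s = 44_100 * 2 * 2
raw_tb_per_year = raw_bytes_per_s * seconds_per_year / 1e12

# Assumed effective rate after multiple stages of filtering and compression.
effective_bits_per_s = 10_000
effective_gb_per_year = effective_bits_per_s / 8 * seconds_per_year / 1e9

print(f"raw: ~{raw_tb_per_year:.1f} TB/year; filtered: ~{effective_gb_per_year:.0f} GB/year")
# -> ~5.6 TB/year vs ~40 GB/year under these assumptions.
```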
Don't get me wrong, the human brain is hugely impressive, but we're heavily reliant on very lossy sensory mechanisms. A few rounds of Kim's Game will powerfully reveal just how much of what we perceive is instantly discarded, even when we're paying close attention.
The sensory information from the individual hair cells in the ear starts off as a lot more data to process than a simple digital encoding of two audio streams.
Neural encoding isn't particularly efficient from a pure data standpoint, just from an energy standpoint. A given neuron not firing is still information, and those nerve bundles contain a lot of neurons.
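To put a rough shape on "not firing is information": if you model each auditory-nerve fibre as a binary spike/no-spike channel, the silent intervals contribute to the entropy just as much as the spikes do. The fibre count, bin size, and firing probability below are assumed round numbers for illustration:

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a spike/no-spike event with firing probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

fibres = 30_000          # order-of-magnitude count of afferent auditory-nerve fibres
bins_per_second = 1_000  # assumed 1 ms time bins
p_fire = 0.05            # assumed average firing probability per bin

# Treating fibres and bins as independent overstates the real capacity,
# but it shows the point: silence carries information too.
bits_per_second = fibres * bins_per_second * binary_entropy(p_fire)
print(f"~{bits_per_second / 1e6:.0f} Mbit/s upper bound across the bundle")
# An always-firing or never-firing fibre would carry zero information;
# it's the mix of firing and not firing that makes the entropy non-zero.
```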
Is that a positive thing? If anything, I would consider it the reverse: LLMs have the "intelligence of vegetables" because even with literally the whole of human written knowledge, they can at most regurgitate it back to us with no novelty whatsoever, while a 2-year-old with a brain that hasn't even matured yet can learn a human language from orders of magnitude less, and lower-quality, input from just a couple of people.
But any Nobel Prize winner has read significantly less than a basic LLM, and we see no LLM making even a tiny scientific achievement, let alone high-impact ones.
It's perfectly legit to call these models "thick" because they *need* to read such a vast quantity of text that a human would literally spend two thousand lifetimes going through it, even if that was all they did with their days.
It also remains the case that, unlike us, they can go through all of that in a few months.
> with no novelty whatsoever, while a 2-year-old with a brain that hasn't even matured yet can learn a human language from orders of magnitude less, and lower-quality, input from just a couple of people.
You're either grossly underestimating AI or overestimating 2 year olds, possibly both.
I just about remember being a toddler; somewhere between then and 5 was around the age I had the idea that everyone got an invisible extra brain floating next to them for every year they lived. Took me an embarrassingly long time (teens, IIRC) to realise that the witch-duck-weight-comparison scene in Monty Python and the Holy Grail wasn't a documentary, thanks to the part of the film captioned "Famous Historian". One time my dad fell ill, and he was talking to mum about "the tissue being damaged" while I was present, so I gave him a handkerchief (AKA "a tissue"). And while I don't remember this directly, my mum's anecdotes include me saying "fetrol fump", waving a spoon in a jam pan and calling this act "spelling", and, when discovered running around with my pockets inside-out, explaining myself as trying to fly, because I apparently thought that the lining of a pocket was called a "wing".
When it comes to human novelty, I also quite often find there's a lot of remixing going on that just isn't immediately apparent. As Steve Jobs apparently once said, "Good artists copy; great artists steal", except Jobs stole that quote from Picasso.
It's easy to categorise different levels with AI, but which of these counts as "novelty", and how often do humans ever achieve each of these grades? (There's a toy numeric sketch of the grades after the list.)
0. Memorisation of the training set. Think: bunch of pictures, pick best fit.
1. Linear interpolation between any pair of elements in the training set. Think: simple cross-fade between any two pictures, but no tracking or distorting of features during that fade.
2. Let the training set span a vector space, and interpolate freely within the constraints of the examples. Think: if these pictures are faces, it would make any hair colour between the most extreme limits shown, etc.
3. Extrapolate beyond the examples. Think: Even if no black or white hair was visible, so long as several shades of grey were, it could reach the ideas of black or white hair.
4. Invent a new vector. Think: even if it had been trained only on black-and-white images, it could still invent green hair.
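A toy numeric sketch of grades 1–4, treating each training example as a small feature vector; the vectors and the "hair" interpretation are invented purely to illustrate the distinctions, not taken from any real model:

```python
import numpy as np

# Toy "training set": each row is a face described by (hair lightness, hair greenness).
# The numbers are invented purely to illustrate the grades above.
examples = np.array([
    [0.3, 0.0],   # dark grey hair, no green
    [0.7, 0.0],   # light grey hair, no green
])

# Grade 1: linear interpolation between one specific pair of examples.
midpoint = 0.5 * examples[0] + 0.5 * examples[1]            # [0.5, 0.0]

# Grade 2: interpolate freely anywhere within the span of the examples.
alpha = 0.25
blended = alpha * examples[0] + (1 - alpha) * examples[1]   # still within [0.3, 0.7]

# Grade 3: extrapolate beyond the examples along an axis they already vary on.
direction = examples[1] - examples[0]
extrapolated = examples[1] + direction                      # [1.1, 0.0] ~ "white" hair

# Grade 4: vary an axis the training set never varied at all.
invented = np.array([0.5, 1.0])                             # non-zero "greenness": green hair

print(midpoint, blended, extrapolated, invented)
```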
> But any Nobel Prize winner has read significantly less than a basic LLM, and we see no LLM making even a tiny scientific achievement, let alone high-impact ones.
We do see them doing *tiny* scientific achievements, with extra emphasis on "tiny". Just like with using them in software, even the best "only" act like fresh graduates.
When any AI gets to high-impact… the following (fictional) quote comes to mind: "as soon as we started thinking for you, it really became our civilization."
> that a human would literally spend two thousand lifetimes going through it, even if that was all they did with their days.
Well, `cp` would go over that data even faster, but depending on what retention/conclusion is reached from that it may or may not be impressive.
Humans are fundamentally limited by our biology: rotating a tiny sphere, turning pages, and serial processing place certain hard limits on us.
A two-year-old can definitely say stupid stuff, or have wildly incomplete/incorrect models of their reality, but they can most certainly already think and reason, and update their internal models at any point.
> Tiny scientific achievements, only acting as fresh graduates with regards to software
I don't believe they are anywhere close to being as good at software as a fresh graduate. Sure, many people write terrible code, and there are a lot of already-solved problems out there (not even just solved, but solved thousands of times). LLMs are definitely a novel tool when it comes to finding information based on high-ish-level patterns (beyond exact string match or fuzzy match), and they are very good at transforming between different representations of said data, with minimal (and hard-limited) reasoning capabilities, but I have never seen evidence of them going any further than that.
I don't think your grades are "correct": e.g. a random generator can easily create new vectors, but I wouldn't call that intelligence. Meanwhile, that two-year-old can make a novel discovery (from their POV) every couple of days, potentially turning their whole world model around each day. To me, that sounds way "cooler" than a statistically likely token given the previous tokens, and LLMs definitely need some further structure/architecture to beat humans.
--
I do like your last quote though, and definitely agree there!
> Well, `cp` would go over that data even faster, but depending on what retention/conclusion is reached from that it may or may not be impressive.
Sure, but it would be a level zero on that list, right?
I'd say even Google would be #0.
> A two-year-old can definitely say stupid stuff, or have wildly incomplete/incorrect models of their reality, but they can most certainly already think and reason, and update their internal models at any point.
I think that this presumes a certain definition of "think" and "reason". Monsters under the bed? To move from concrete examples to the abstract, from four apples to the idea of four?
Imagine a picture of a moon's orbit around its parent planet and the planet's orbit around a star, first at one time of year, then again 60° later, with the circular orbits of each drawn clearly and the two positions of the moon's orbit aligned at the top of the image; exaggerate the scale for clarity and put it in an astronomy book. My peers at age 6 or 7 thought it was a picture of a mouse.
Imagine teachers and an ambulance crew explaining to the class how blood is donated, showing that they're putting a bag up the teacher's sleeve and explaining how they'll demonstrate this by taking "blood" (fake? No idea at this point) from that bag. Everyone's looking, we see it go up the sleeve. We see the red stuff come out. The kid next to me screams "they're killing her!". Rather than say "we literally saw the bag go up the sleeve", 5-year-old me tried to argue on the basis that killing a teacher in front of us was unlikely; not wrong, per se, but a strange argument, and I wondered even at the time why I made it.
Are these examples of "reason"? Could be. But, while I would say that we get to the "children say funny things" stage *with far fewer examples than the best AI*, it doesn't seem different in kind from what AI does.
> LLMs are definitely a novel tool when it comes to finding information based on high-ish-level patterns (beyond exact string match or fuzzy match), and they are very good at transforming between different representations of said data, with minimal (and hard-limited) reasoning capabilities, but I have never seen evidence of them going any further than that.
Aye. So, where I'm going with #2 and #3: even knowing what the question means well enough to respond by appropriately gluing together a few existing documents requires the AI to have created a vector space of meaning from the words, the sort of thing word2vec did. But:
To be able to translate questions into answers when neither the question nor the answer is itself literally in the training set requires at least #2. (If it were #1, you might see it transition from "Elizabeth II was Queen of the UK" to "Felipe VI is King of Spain" via a midpoint of "Macron is Monarch of France".)
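As an illustration of what a "vector space of meaning" buys you at #2, here's the classic word-vector analogy trick in miniature; the 3-d vectors are hand-written toys standing in for real learned embeddings (real word2vec vectors are learned from text and have hundreds of dimensions):

```python
import numpy as np

# Hand-written toy "meaning" vectors, purely for illustration.
vec = {
    "france": np.array([1.0, 0.0, 0.2]),
    "paris":  np.array([1.0, 1.0, 0.2]),
    "spain":  np.array([0.0, 0.0, 0.9]),
    "madrid": np.array([0.0, 1.0, 0.9]),
}

# In such a space, "capital of" shows up as a roughly constant offset,
# so recombining vectors answers a question that isn't literally in the data.
capital_offset = vec["paris"] - vec["france"]
guess = vec["spain"] + capital_offset

nearest = min(vec, key=lambda w: np.linalg.norm(vec[w] - guess))
print(nearest)  # -> "madrid"
```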
For #3, I've tried a concrete example: getting ChatGPT (the free model, a few months back now) to take the difference between a raccoon and a wolf and apply that difference again on top of a wolf, and… well, their combination of LLM and image generator gave me what looked like a greyhound, so I'm *not* convinced that OpenAI's models demonstrate this in normal use. But I've also seen this kind of thing demonstrated with other models (including Anthropic's, so it's not a limit of the Transformer architecture), and those models seem to do more interesting things.
Possibly sample bias; I'm aware of the risk of being subject to a Clever Hans effect.
For #4, it seems hard to be sure it has happened even when it seems to have happened. I don't mean what word2vec does, which I realise now could be described in similar language, since what word2vec does is kind of a precursor to anything at level #1 or above. Rather, what I mean would, in a human, look like "spotting a black swan before it happens". I think the invention of non-Euclidean geometry might count, but even then I'm not sure.