
Totally agree. An LLM won't be an AGI.

It could be part of an AGI, specifically the human interface part. That's what an LLM is good at. The rest (knowledge oracle, reasoning, etc.) are just things that kinda work as a side effect. Other types of AI models are going to be better at that.

It's just that since the masses found that they can talk to an AI like a human, they think that it's got human capabilities too. But it's more like fake it till you make it :) An LLM is a professional bullshitter.






> It's just that since the masses found that they can talk to an AI like a human

In a way it's worse: Even the "talking to" part is an illusion, and unfortunately a lot of technical people have trouble remembering it too.

In truth, the LLM is an idiot-savant which dreams up "fitting" additions to a given document. Some humans have prepared a document in the form of a theater play or a turn-based chat transcript, with a pre-written character that is often described as a helpful robot. Then the humans launch some code that "acts out" any text that looks like it came from that fictional character, and inserts whatever the real human user types as dialogue for the document's human character.

There's zero reason to believe that the LLM is "recognizing itself" in the story, or that it is choosing to insert itself as one of the characters. It's not having a conversation. It's not interacting with the world. It's just coded to Make Document Bigger Somehow.
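
To make that "acts out" part concrete, here's a toy sketch of the wrapper (complete() is a stand-in for a raw text-completion model, not any real vendor's API):

    def complete(document: str) -> str:
        # Stand-in for the model: pretend this returns a statistically
        # "fitting" continuation of whatever document it is given.
        return " Sure, here is one: print(\"hello world\")\nHuman: thanks!"

    transcript = (
        "The following is a chat between a helpful AI assistant and a human.\n"
        "Assistant: Hello! How can I help you today?\n"
    )

    def chat_turn(user_text: str) -> str:
        global transcript
        # Insert the real user's words as dialogue for the "Human" character...
        transcript += f"Human: {user_text}\nAssistant:"
        # ...then ask the model to make the document bigger.
        continuation = complete(transcript)
        # Whatever precedes the next "Human:" line gets "acted out" as the reply.
        reply = continuation.split("Human:")[0].strip()
        transcript += " " + reply + "\n"
        return reply

    print(chat_turn("Write a hello world in Python."))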

> they think that it's got human capabilities too

Yeah, we easily confuse the character with the author. If I write an obviously-dumb algorithm which slaps together a story, it's still a dumb algorithm no matter how smart the robot in the story is.


Just wanted to point out that the notion of a "document" is also an illusion to the LLM. It's processing a sequence of low dimensional spaces into another sequence of low dimensional spaces. The input spaces preserve aspects of content similarity based on co-occurrence. The model learns to transform these spaces into higher order spaces based on the outcome of training.
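
A cartoon of that, with random matrices standing in for everything the model has learned (purely illustrative, not any real architecture):

    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, d_model = 1000, 16
    embedding = rng.standard_normal((vocab_size, d_model))  # learned lookup table
    W = rng.standard_normal((d_model, d_model))             # stand-in for the whole network
    unembed = rng.standard_normal((d_model, vocab_size))

    token_ids = [17, 404, 9]     # the "document", as the model actually sees it
    x = embedding[token_ids]     # a (3, 16) sequence of low-dimensional vectors, not text
    h = np.tanh(x @ W)           # transformed into another sequence of vectors
    logits = h[-1] @ unembed     # scores over the vocabulary for the next token
    print(int(np.argmax(logits)))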

You couldn't say that the model has a singular sense of self, but it has certainly been trained on data that allows it to mimic one in short spurts, and mimicry is what humans do to learn more complex/abstract tasks. The training goal is not to learn how to "be", but rather to learn how to "do" the parts necessary to continue existing.

"Fake it till you make it" is really all that's required to exist in the world.


For humans, the notion of “self” is also an illusion. We’re just atoms dancing to physics. But that’s not the right level of abstraction from which to do psychology.

Is the notion of an LLM operating on a document useful for building an intuition about how the LLM operates? That’s the more interesting question.


I’d say it is, although it’s not really a document but a stream of tokens. The intuition is that the model is able to build abstractions from that stream. Some of the abstractions are related to goal states of the training metric. However, it’s easy to imagine how some of the goal states pertain to the model’s survival in the “wild”, and part of the abstraction that the model captures for this goal is a reflection of itself.

> In truth, the LLM is an idiot-savant which dreams up "fitting" additions to a given document.

Tbh I'm not too sure that my brain works fundamentally differently. I'm an idiot-savant who responds to stimuli.


Do you know the number of atoms in my coffee cup right now?

If you say "No, I don't" then you're already way ahead of any LLM, even the biggest ones. Remember, I asked about how many are in there right now. The only correct answer is "I don't know", because I haven't even told you whether my mug is full or empty.

Here's Copilot's answer:

    how many hydrogen atoms are there in my coffee cup right now
    Copilot said:

    That's an interesting question! The number of hydrogen atoms in your coffee cup depends on the volume of coffee and its composition.

    Let's assume you have a standard coffee cup that holds about 240 milliliters (8 ounces) of coffee. Coffee is mostly water, so we can approximate the number of hydrogen atoms based on the water content.

    Water (H₂O) has 2 hydrogen atoms per molecule. The molar mass of water is approximately 18 grams per mole, and 1 mole of water contains Avogadro's number of molecules, which is about 6.022 × 10^23 molecules.

    1 milliliter of water weighs about 1 gram, so 240 milliliters of water weighs 240 grams. This is approximately 240 / 18 ≈ 13.33 moles of water.

    Therefore, the number of water molecules in 240 milliliters of water is: 13.33 × 6.022 × 10^23 ≈ 8.02 × 10^24 molecules.

    Since each water molecule has 2 hydrogen atoms, the total number of hydrogen atoms is: 2 × 8.02 × 10^24 ≈ 1.60 × 10^25 hydrogen atoms.

    So, there are approximately 1.60 × 10^25 hydrogen atoms in a standard coffee cup of 240 milliliters. Keep in mind this is a rough estimate, as the actual number can vary based on the exact composition of your coffee.

If I wanted an explanation of how to approximate it, that's what I would have asked for. You could very probably figure that much out.
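
For the record, the arithmetic in that reply does check out (quick sanity check below, assuming a full 240 ml cup of plain water), which is exactly the problem: it confidently computes an estimate rather than ever saying "I don't know":

    # Reproducing Copilot's estimate, assuming a full 240 ml cup of plain water.
    AVOGADRO = 6.022e23        # molecules per mole
    grams = 240.0              # 240 ml of water is roughly 240 g
    moles = grams / 18.0       # molar mass of water is roughly 18 g/mol
    hydrogen_atoms = 2 * moles * AVOGADRO
    print(f"{hydrogen_atoms:.2e}")  # ~1.61e+25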

No, this is the correct answer. This is the answer I'd be looking for if I was asking that question. That you failed to specify the question is a failure on your part.

Even stupid models understand that if I ask them the population of Denmark they only need to give a rough approximation.


No, I expect the system to answer the question I asked, not the question it thinks I wanted to ask. The question is not underspecified, because the point of it is to demonstrate that the LLM will never tell you it doesn't know. Nor will it ever tell you your prompt is underspecified.

I am not sure what you mean by LLM when you say they are professional bullshitters. While that was certainly true for models based on transformers just doing inference, recent models have progressed significantly.

> I am not sure what you mean by LLM when you say they are professional bullshitters.

Not parent-poster, but an LLM is a tool for extending a document by choosing whatever statistically-seems-right based on other documents, and it does so with no consideration of worldly facts and no modeling of logical propositions or contradictions. (Which also relates to math problems.) If it has been fed documents with logic puzzles and prior tests, it may give plausible answers, but tweaking the test to avoid the pattern-matching can still reveal that it was a sham.

The word "bullshit" is appropriate because human bullshitter is someone who picks whatever "seems right" with no particular relation to facts or logical consistency. It just doesn't matter to them. Meanwhile, a "liar" can actually have a harder job, since they must track what is/isn't true and craft a story that is as internally-consistent as possible.

Adding more parts around an LLM won't change that: Even if you add some external sensors, a calculator, a SAT solver, etc. to create a document with facts in it, once you ask the LLM to make the document bigger, it's going to be bullshitting the additions.


I think the problem is the way you are phrasing your argument implies the LLM is always wrong. Consider a simple prompt: "Write a hello world in Python."

Every LLM I've tested gets this correct. In my mind, it can't be both bullshit and correct.

I would argue that the amount of real bullshit returned from an LLM is correlated to the amount of bullshit you give it. Garbage in, garbage out.

In the end, it's irrelevant whether it's a statistical engine or whatever semantics we want to use (glorified autocomplete). If it solved my problem in less time than I think it would have taken me without it, bullshit isn't the word I would use to describe the outputs.

In all fairness though, I do get some bullshit responses.


It only gives you the statistically most likely way a conversation would evolve after one party says "Write a hello world in Python." It just happens to be the correct one.

If I ask a 5yo "42 * 21 equals...?" and the kid replies with a random number, say, "882", and gets it right, it does not mean that the kid knows what multiplication is or how it works.


ChatGPT can use a stateful python environment to do math. It isn’t confabulating the answers, it’s using a calculator.

I mean, that's just confabulating the next token with extra steps... in my experience it does get those wrong sometimes. I imagine there's an extra internal step to validate the syntax there.

I'm not arguing for or against anything specifically, I just want to note that in practice I assume that, to the LLM, it's just a bunch of repeated prompts containing the entire conversation, and after it outputs special 'signifier' tokens, the LLM just suddenly gets a prompt that has the results of the program that was executed in an environment. For all we know, various prompts were involved in setting up that environment too, but I suspect not.
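
Roughly something like this, I'd guess (the generate() stub and the tag format are invented for illustration, not OpenAI's actual protocol):

    import contextlib, io, re

    def generate(prompt: str) -> str:
        # Stand-in for the model; pretend it emitted a code-execution request.
        return "<run_python>print(42 * 21)</run_python>"

    def run_python(code: str) -> str:
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, {})  # the "stateful python environment", minus the state
        return buf.getvalue().strip()

    prompt = "Human: what is 42 * 21?\nAssistant:"
    out = generate(prompt)
    m = re.search(r"<run_python>(.*?)</run_python>", out, re.S)
    if m:
        # From the model's side, this is just another, bigger prompt next time around.
        prompt += out + "\n[tool result: " + run_python(m.group(1)) + "]\nAssistant:"
    print(prompt)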


> In my mind, it can't be both bullshit and correct.

It's easy for bullshitters to say some true things, but it doesn't change the nature of the process that got the results. Ex:

________

Person A: "The ghost of my dead gerbil whispers unto me the secrets of the universe, and I am hearing that the local volcano will not erupt today."

Person B: "Bullshit."

[24 hours later]

Person A: "See? I was correct! I demand an apology for your unjustified comment."


> it does so with no consideration of worldly facts

Why don't you consider its training set (basically the entire internet, in most cases) worldly facts? It's true that the training set can contain contradictory facts, but usually an LLM can recognize these contradictions and provide analysis of the different viewpoints. I don't see how this is much different from what humans can do with documents.

The difference is that humans can do their own experiments and observations in the real world to verify or dismiss things they read. Providing an LLM with tools can, in a limited way, allow an LLM to do the same.

Ultimately its knowledge is limited by its training set and the 'external' observations it can make, but this is true of all agents, no?


LLMs are trained on data which may contain both truthful and false information.

But at inference time it’s not referring to that data at all. Some of the data is aliased and encoded in the model’s weights, but we’re not sure exactly what’s encoded.

It may very well be that vague concepts (like man, woman, animal, unhealthy) are encoded, but not details themselves.

Further, at inference time, there is no kind of “referencing” step. We’ve just seen that they can sometimes repeat text they were trained on, but sometimes they just don’t.

The LLM-based systems you're probably using do some RAG work to insert relevant information into the LLM's context. This context is still not being referred to per se. An LLM might have a document in its context that says the sky is red, but still insist that it's blue (or vice versa).
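
For what it's worth, the RAG step is just more text pasted into the prompt. A toy sketch (embed() and generate() are placeholders, and the retrieval is deliberately naive):

    def embed(text: str) -> set:
        # Stand-in for a real embedding model: just a bag of lowercase words.
        return set(text.lower().split())

    def similarity(a: set, b: set) -> float:
        return len(a & b) / max(len(a | b), 1)

    def generate(prompt: str) -> str:
        return "..."  # stand-in for the LLM's continuation

    docs = ["The sky is red.", "Water boils at 100 C at sea level."]

    def answer(question: str) -> str:
        # Retrieve the "most relevant" document and paste it into the prompt.
        best = max(docs, key=lambda d: similarity(embed(d), embed(question)))
        prompt = f"Context: {best}\nQuestion: {question}\nAnswer:"
        # Nothing forces the model to actually use the context; the retrieved
        # text only shifts which continuations look statistically likely.
        return generate(prompt)

    print(answer("What color is the sky?"))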

So while the info an LLM may have available is limited by its training data and the RAG system around it, none of that is guaranteed at inference time.

There’s always a significant chance for the LLM to make up bullshit.


> The word "bullshit" is appropriate because human bullshitter is someone who picks whatever "seems right" with no particular relation to facts or logical consistency.

Not quite true - this is true of your random bullshitter, but professional bullshitters do, in fact, care about the impression of logical consistency and do have a grip on basic facts (if only so they can handwave them more effectively). As such, LLMs are definitely not yet pros at bullshitting :)


Tell me you haven’t used the latest models, without telling me you haven’t used the latest models?

They do hallucinate at times, but you’re missing a lot of real utility by claiming they are basically bullshit engines.

They can now use tools, and maintain internal consistency over long context windows (with both text and video). They can iterate fully autonomously on software development by building, testing, and bug fixing on real world problems producing usable & functioning code.

There’s a reason Microsoft is putting $80 billion dollars on the line to run LLMs. It’s not because they are full of shit!


Meta put $45 billion into the Metaverse... so how much virtual real estate do you own?

It's true, they're very convincing bullshitters ;)


