As excited as I am by this, I still feel like this is just a small approximation of a small chunk of human reasoning ability at large. o3 (and whatever comes next) feels to me like it will head down the path of being a reasoning coprocessor for various tasks.
My personal 5 cents is that reasoning will be there when an LLM gives you some kind of outcome and then, when questioned about it, can explain every bit of the result it produced.
For example, if we ask an LLM to produce an image of a "human woman, photorealistic", it produces a result. After that you should be able to ask it "tell me about the background" and it should be able to explain: "Since the user didn't specify a background in the query, I randomly decided to draw her standing in front of a fantasy backdrop of Amsterdam's iconic houses. Amsterdam houses are usually 3 stories tall, attached to each other and 10 meters wide. They usually have a crane at the top floor, which helps bring goods up, since the doors are too narrow for any object wider than 1 m. The woman stands approximately 25 meters in front of the houses. She is 1.59 m tall, which gives us the correct perspective. It is 11:16 am on August 22nd, which I used to calculate the correct position of the sun and align all shadows with the projected lighting conditions. The color of her skin is set at RGB:xxxxxx, chosen randomly" etc.
And it is not too much to ask of LLMs. They have access to all the information above, as they have read the entire internet. So there is definitely a description of Amsterdam architecture, of what a human body looks like, and of how to correctly estimate the time of day based on shadows (and vice versa). The only thing missing is the logic that connects all this information and applies it correctly to generate the final image.
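To make the shadow part concrete: given a place, a date and a time, the sun's position (and hence the shadow direction and length) is mechanically derivable, which is exactly the kind of connecting logic being asked for. A small sketch using the pysolar library; the Amsterdam coordinates and the year are my assumptions, since the example above doesn't specify them.

    # Sketch: sun position and shadow length for the scene described above.
    # Coordinates and year are assumed; pysolar needs a timezone-aware datetime.
    from datetime import datetime
    from math import radians, tan
    from zoneinfo import ZoneInfo

    from pysolar.solar import get_altitude, get_azimuth

    lat, lon = 52.37, 4.90                      # central Amsterdam, roughly
    when = datetime(2024, 8, 22, 11, 16, tzinfo=ZoneInfo("Europe/Amsterdam"))

    altitude = get_altitude(lat, lon, when)     # degrees above the horizon
    azimuth = get_azimuth(lat, lon, when)       # degrees clockwise from north

    shadow_length = 1.59 / tan(radians(altitude))   # shadow of a 1.59 m person
    print(f"altitude {altitude:.1f} deg, azimuth {azimuth:.1f} deg, "
          f"shadow ~{shadow_length:.2f} m")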
I like to think of LLMs as fancy, genius compression engines. They took all the information on the internet, compressed it, and are able to cleverly query this information for the end user. That is tremendously valuable, but whether intelligence emerges out of it, I'm not sure. Digital information doesn't necessarily contain everything needed to understand how it was generated and why.
I see two approaches for explaining the outcome:
1. Reasoning back on the result and justifying it.
2. Explainability: somehow justifying the result by looking at which neurons were activated.
The first could lead to lying. E.g. think of a high schooler explaining copied homework.
The second one does indeed access the paths influencing the decision, but it is a hard task due to the way neural networks inherently work.
No, I'm not confusing them. I realize that LLMs sometimes connect to diffusion models to produce images. I'm talking about language models actually describing the pixel data of the image.
I think it's hard to enumerate the unknown, but I'd personally love to see how models like this perform on things like word problems where you introduce red herrings. Right now, LLMs at large tend to struggle mightily to understand when some of the given information is not only irrelevant, but may explicitly serve to distract from the real problem.
That's not an inability to reason, though; that's having a social context.
Humans also don't tend to operate in a rigorously logical mode by default; they understand that math word problems are an exception where the language may be adversarial because they're trained for that special context in school. If you tell the LLM that social context, e.g. that the language may be deceptive, its "mistakes" disappear.
What you're actually measuring is that the LLM defaults to assuming you misspoke while trying to include relevant information, rather than that you were trying to trick it, which is the social context you'd expect when it's trained on general chat interactions.
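A quick way to test that claim is to run the same red-herring word problem with and without the adversarial framing. A rough sketch, assuming the OpenAI Python SDK; the model name and the example problem are placeholders of mine, not anything from the thread.

    from openai import OpenAI

    client = OpenAI()

    # a word problem with a deliberate red herring (the ticket price is irrelevant)
    problem = (
        "A ferry carries 120 passengers per trip and makes 4 trips a day. "
        "Tickets cost 7 euros on weekends. How many passengers does it carry per day?"
    )

    def ask(system_prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": problem},
            ],
        )
        return resp.choices[0].message.content

    # default framing vs. explicitly adversarial framing
    print(ask("You are a helpful assistant."))
    print(ask("This is a test. Some of the given information may be irrelevant "
              "or intended to distract you; ignore anything not needed."))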
LLMs are still bound to a prompting session. They can't form long-term memories, can't ponder them, and can't develop experience. They have no cognitive architecture.
'Agents' (i.e. workflows intermingling code and calls to LLMs) are still a thing (as shown by the fact that there is a post by Anthropic on this subject on the front page right now), and they are very hard to build.
One consequence of that, for instance: it's not possible to have an LLM exhaustively explore a topic.
LLMs don't, but who said AGI should come from LLMs alone? When I ask ChatGPT about something "we" worked on months ago, it "remembers" and can continue the conversation with that history in mind.
I'd say humans are also bound to prompting sessions in that way.
Last time I used ChatGPT's 'memory' feature it got full very quickly. It remembered my name, my dog's name and a couple of tobacco casing recipes it came up with. OpenAI doesn't seem to be using embeddings and a vector database, just text snippets it injects into every conversation. Because RAG is too brittle? The same problem arises when composing LLM calls. Efficient and robust workflows are those whose prompts and/or DAG were obtained via optimization techniques. Hence DSPy.
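For what it's worth, the embeddings-plus-vector-store alternative being contrasted here is only a few lines. A sketch, assuming the OpenAI embeddings endpoint and an in-memory store; the model name and the sample "memories" are placeholders, and a real system would use an actual vector database rather than numpy.

    import numpy as np
    from openai import OpenAI

    client = OpenAI()
    EMBED_MODEL = "text-embedding-3-small"  # assumed model name

    def embed(texts):
        resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
        return np.array([d.embedding for d in resp.data])

    # long-term "memories" accumulated across sessions (made-up examples)
    memories = [
        "User's name is Alice; dog is called Bruno.",
        "User maintains a backyard swimming pool.",
        "User asked for tobacco casing recipes in March.",
    ]
    memory_vecs = embed(memories)

    def recall(query, k=2):
        """Return the k memories most similar to the query (cosine similarity)."""
        q = embed([query])[0]
        sims = memory_vecs @ q / (np.linalg.norm(memory_vecs, axis=1) * np.linalg.norm(q))
        return [memories[i] for i in np.argsort(-sims)[:k]]

    # only the relevant snippets get injected into the prompt, not all of them
    print(recall("How is the pool doing?"))

Whether retrieving only the top-k snippets like this is actually less brittle than injecting everything is, of course, the RAG question raised above.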
Consider the following use case: keeping a swimming pool's water clean. I can have a long-running conversation with an LLM to guide me in getting it right. However, I can't have an LLM handle the problem autonomously. I'd like to have it notify me on its own: "hey, it's been 2 days, any improvement? Do you mind sharing a few pictures of the pool as well as the pH/chlorine test results?". Nothing mind-bogglingly complex. Nothing that couldn't be achieved using current LLMs. But still something I'd have to implement myself, and which turns out to be more complex to achieve than expected. This is the kind of improvement I'd like to see big AI companies going after rather than research-grade ultra-smart AIs.
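The follow-up loop itself really is small; the hard part is running it reliably and wiring up real notifications. A minimal sketch, assuming the OpenAI Python SDK, a placeholder model name, and a stub notify() standing in for email/push; a real version would also feed back the user's photos and test readings.

    import time
    from openai import OpenAI

    client = OpenAI()

    def notify(message: str):
        # stand-in for email / push notification / chat message
        print("[notification]", message)

    def check_in(history: list[dict]):
        history.append({
            "role": "user",
            "content": "Two days have passed. Write a short follow-up asking for "
                       "pool photos and the latest pH/chlorine test results.",
        })
        resp = client.chat.completions.create(model="gpt-4o-mini", messages=history)
        msg = resp.choices[0].message.content
        history.append({"role": "assistant", "content": msg})
        notify(msg)

    history = [{"role": "system",
                "content": "You are helping the user keep their pool water clean."}]
    while True:
        check_in(history)
        time.sleep(2 * 24 * 3600)  # wait two days; a real agent would use a proper scheduler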
kinda interesting: every single CS person (especially phds), when talking about reasoning, is unable to concisely quantify, enumerate, qualify, or define reasoning.
people with (high) intelligence talking about and building (artificial) intelligence, but never able to convincingly explain aspects of intelligence. they just talk ambiguously and circularly around it.
what are we humans getting ourselves into inventing skynet :wink.
it's been an ongoing pet project of mine to tackle reasoning, but i can't answer your question with regard to llms.
>> Kinda interesting, every single CS person (especially phds) when talking about reasoning are unable to concisely quantify, enumerate, qualify, or define reasoning.
Kinda interesting that mathematicians also can't do the same for mathematics.
Mathematicians absolutely can, it's called foundations, and people actively study what mathematics can be expressed in different foundations. Most mathematicians don't care about it though for the same reason most programmers don't care about Haskell.
I don't care about Haskell either, but we know what reasoning is [1]. It's been studied extensively in mathematics, computer science, psychology, cognitive science and AI, and in philosophy going back literally thousands of years to grandpapa Aristotle and his syllogisms. Formal reasoning, informal reasoning, non-monotonic reasoning, etc. Not only do we know what reasoning is, we know how to do it with computers just fine, too [2]. That's basically the first 50 years of AI, which folks like His Nobelist Eminence Geoffrey Hinton will tell you was all a Bad Idea and a total failure.
Still, somehow, the question keeps coming up: "what is reasoning?". I'll be honest and say that I imagine it's mainly folks who skipped CS 101 because they were busy tweaking their neural nets who go around the web like Diogenes with his lantern, howling "Reasoning! I'm looking for a definition of Reasoning! What is Reasoning!".
I have never heard the people at the top echelons of AI and deep learning (LeCun, Schmidhuber, Bengio, Hinton, Ng, Hutter, etc.) say things like that: "what's reasoning?". The reason, I suppose, is that they know exactly what it is, because it was the one thing they could never do with their neural nets that classical AI could do between sips of coffee at breakfast [3]. Those guys know exactly what their systems are missing and, to their credit, have made no bones about it.
_________________
[1] e.g. see my profile for a quick summary.
[2] See all of Russell & Norvig, for instance.
[3] Schmidhuber's doctoral thesis was an implementation of genetic algorithms in Prolog, even.
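For readers wondering what "doing it with computers" looks like in the classical, symbolic sense: a toy forward-chaining sketch in plain Python, applying Aristotle's syllogism. This is just my own illustration, not anything taken from the references above.

    # facts and one Horn-clause rule: "all men are mortal"
    facts = {("man", "socrates")}
    rules = [(("man", "X"), ("mortal", "X"))]  # if (man, x) then (mortal, x)

    def forward_chain(facts, rules):
        """Naive forward chaining: apply rules until no new facts appear."""
        changed = True
        while changed:
            changed = False
            for (premise_pred, _), (concl_pred, _) in rules:
                for (pred, arg) in list(facts):
                    if pred == premise_pred and (concl_pred, arg) not in facts:
                        facts.add((concl_pred, arg))
                        changed = True
        return facts

    print(forward_chain(facts, rules))
    # {('man', 'socrates'), ('mortal', 'socrates')}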
i have a question for you, which i've asked many philosophy professors but none could answer satisfactorily. since you seem to have a penchant for reasoning, perhaps you might have a good answer. (i hope i remember the full extent of the question properly, i might hit you up with some follow-up questions)
it pertains to the source of the inference power of deductive inference. do you think all deductive reasoning originated inductively? like when someone discovers a rule or fact that seemingly has contextual predictive power, obviously that can be confirmed inductively by observations, but did that deductive reflex of the mind coagulate through inductive experiences? maybe not all derived deductive rules, but the original deductive rules.
I'm sorry but I have no idea how to answer your question, which is indeed philosophical. You see, I'm not a philosopher, but a scientist. Science seeks to pose questions, and answer them; philosophy seeks to pose questions, and question them. Me, I like answers more than questions so I don't care about philosophy much.
well yeah, it's partially philosophical. i guess my haphazard use of language like "all" makes it more philosophical than intended.
but i'm getting at a few things.
one of those things is neurological: how do deductive inference constructs manifest in neurons, and is it really, inadvertently, an inductive process that creates deductive neural functions?
the other aspect of the question is, i guess, more philosophical: why does deductive inference work at all? i think clues to a potential answer can be seen in the mechanics of generalization, where generalized antecedents predict (or correlate with) certain generalized consequences consistently. the brain coagulates generalized coinciding concepts through reinforcement, and it recognizes or differentiates instances that are included in or excluded from a generalization by recognition properties that seem to gatekeep identities accordingly. it's hard to explain succinctly what i mean by the latter, but i'm planning on writing an academic paper on that.
To clarify, what neural nets are missing is a capability present in classical, logic-based and symbolic systems. That's the ability that we commonly call "reasoning". No need to prove any negatives. We just point to what classical systems are doing and ask whether a deep net can do that.
well, let's just say i think i can explain reasoning better than anyone i've encountered. i have my own hypothesized theory on what it is and how it manifests in neural networks.
i doubt your mathematician example is equivalent.
some examples that are fresh in my mind that further my point:
i've heard yann lecun baffled by llms' instantiation/emergence of reasoning, along with other ai researchers. eric schmidt thinks agentic reasoning is the current frontier and people should be focusing on that. i was listening to the start of an ai/machine learning interview a week ago where some cs phd was asked to explain reasoning, and the best he could muster was "you know it when you see it"… not to mention the guy responding to the grandparent who gave a cop-out answer (with all respect to him).
>> well lets just say i think i can explain reasoning better than anyone ive encountered. i have my own hypothesized theory on what it is and how it manifests in neural networks.
I'm going to bet you haven't encountered the right people then. Maybe your social circle is limited to folks like the person who presented a slide about A* to a dumb-struck roomful of Deep Learning researchers at the last NeurIPS?
possibly. my university doesn't really do ai research beyond using it as a tool to engineer things. i'm looking to transfer to a different university.
but no, my take on reasoning is really a somewhat generalized reframing of the definition of reasoning (which you might find in the stanford encyclopedia of philosophy), reframed partially in the axiomatic building blocks of neural network components/terminology. i'm not claiming to have discovered reasoning, just to redefine it in a way that's compatible with and sensible for neural networks (ish).
Well, you're free to define and redefine anything as you like, but be aware that every time you move the target closer to your shot you are setting yourself up for some pretty strong confirmation bias.
yeah, that's why i need help from the machine interpretability crowd, to make sure my hypothesized reframing of reasoning has a sufficient empirical basis and isn't adrift in la-la land.
terribly sorry to be such a tease, but i'm looking to publish a paper on it, and still need to delve deeper into machine interpretability to make sure it's properly couched empirically. if you can help with that, perhaps we can continue this convo in private.
I'd like to see this o3 thing play 5D Chess With Multiverse Time Travel or Baba Is You.
The only effect smarter models will have is that intelligent people will have to use less of their brain to do their work. As has always been the case, the medium is the message, and climate change is one of the most difficult and worst problems of our time.
If this gets software people to quit en masse and start working in energy, biology, ecology and preservation? Then it has succeeded.
> climate change is one of the most difficult and worst problems of our time.
Slightly surprised to see this view here.
I can think of half a dozen more serious problems offhand (e.g. population aging, institutional scar tissue, dysgenics, nuclear proliferation, pandemic risks, AI itself) along most axes that come to mind (raw $ cost, QALYs, even X-risk).
You've been grievously misled if you think climate change could plausibly make the world uninhabitable in the next couple of centuries given current trajectories. I advise going to the primary sources and emailing a climate scientist at your local university for some references.
> going to the primary sources and emailing a climate scientist at your local university for some references
I assume you've done this, otherwise you wouldn't be telling me to? Bold of you to assume my ignorance on this subject. You sound like you've fallen for corporate grifters who care more about short-term profits and gains than long-term sustainability (or you are one of said grifters, in which case why are you wasting your time on HN, shouldn't you be out there grinding?!)
Severe weather events are going to get more common and more devastating over the next couple of decades. They'll come for you and people you care about, just as they come for me and people I care about. It doesn't matter what you think you know about it.
I've read some climate papers but haven't done the email thing (I should, but have not).
The IPCC summaries are a good read too.
Do you genuinely think severe weather events are going to be even among the top ten killers this century? If so, I do strongly advise emailing a local uni climate scientist. (What's the worst that can happen? Heck, they might confirm your views!)
(In other circumstances I might go through the whole "what have you observed that has given you this belief?" thing, but in this case there is a simple and reliable check in the form of a 5 minute email)
... actually, I can do so on your behalf... would you like me to? The specific questions I would be asking unless told otherwise would be:
1. Probability of human extinction in the next century due to climate change.
2. Probability of more than 10% of human deaths being due to extreme weather.
3. Places to find good unbiased summaries of the likely effects of climate change.
1. Do you think a tornado has a real probability of forming in north-western Europe, where historically there has never been one before? And what do you think are the chances of it being destructive in ways unseen before? (Think the Netherlands, Belgium, Germany, ...)
2. How are the attractors (chaos theory) changing? Is it correct to say that, no, our weather prediction models are not going to get more accurate, and all we can say is that the weather is going to _change_ in all extremes? More intense storms. Colder winters. Hotter summers. Drier droughts.
3. What institution predicted the floods in Spain? Did anyone? Or was this completely unprecedented and a complete surprise?
I don't think that humans will go extinct from climate change, but it will drastically change where we can comfortably live and will uproot our ability to make meaningful cultural and scientific progress.
In your comment above you mention:
> e.g. population aging, institutional scar tissue, dysgenics, nuclear proliferation, pandemic risks, AI itself
These are all intertwined with each other and with climate change. People are less likely to have kids if they don't think those kids will have a comfortable future. Nuclear war is more likely if countries are competing for dwindling resources as we deplete the planet and need to increase food production. Habitat loss from deforestation leads to animals comingling where they normally wouldn't, leading to increased risk of disease spillover into humans.
You say that somebody calling climate change "one of the most difficult and worst problems of our time" is a take you're surprised to see here on HN, but I'm more surprised that you don't list it among the problems you consider important.
But, still, this is incredibly impressive.