> Whether AI reasoning is “real” reasoning or just a mirage can be an interesting question, but it is primarily a philosophical question. It depends on having a clear definition of what “real” reasoning is, exactly.
It's pretty easy: causal reasoning. Causal, not just statistical correlation as LLMs do, with or without "CoT".
Correct me if I'm wrong, but I'm not sure it's so simple. LLMs are called causal models in the sense that earlier tokens "cause" later tokens, that is, later tokens are causally dependent on what the earlier tokens are.
If you mean deterministic rather than probabilistic, even Pearl-style causal models are probabilistic.
I think the author is circling around the idea that their idea of reasoning is to produce statements in a formal system: to have a set of axioms, a set of production rules, and to generate new strings/sentences/theorems using those rules. This approach is how math is formalized. It allows us to extrapolate - make new "theorems" or constructions that weren't in the "training set".
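A toy sketch of that picture, in case it helps: the axiom and rules below are made up purely for illustration, and a "theorem" is just any string reachable from the axiom by applying the rules.

    // A tiny formal system: an axiom plus rewrite rules (both hypothetical).
    type Rule = { from: string; to: string };

    const axiom = "MI";
    const rules: Rule[] = [
      { from: "I", to: "IU" },
      { from: "M", to: "MM" },
    ];

    // Apply every rule at every matching position, producing the next "theorems".
    function applyRules(theorem: string): string[] {
      const out = new Set<string>();
      for (const { from, to } of rules) {
        for (let i = theorem.indexOf(from); i !== -1; i = theorem.indexOf(from, i + 1)) {
          out.add(theorem.slice(0, i) + to + theorem.slice(i + from.length));
        }
      }
      return [...out];
    }

    // Enumerate theorems up to a given depth: strings appear that were never
    // written down before, yet every one of them follows from the rules.
    function theorems(depth: number): Set<string> {
      let frontier = new Set([axiom]);
      const seen = new Set(frontier);
      for (let d = 0; d < depth; d++) {
        const next = new Set<string>();
        for (const t of frontier) {
          for (const u of applyRules(t)) {
            if (!seen.has(u)) { seen.add(u); next.add(u); }
          }
        }
        frontier = next;
      }
      return seen;
    }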
By this definition a bag of answers is causal reasoning because we previously filled the bag, which caused what we pulled. State causing a result is not causal reasoning.
You need to actually have something that deduces a result from a set of principles that form a logical conclusion, or that recognizes more data is needed to reach one. That is clearly different from finding a likely next token on statistics alone, despite the fact that the statistical answer can be correct.
1) As far as I recall this program of formalizing mathematics fails unless you banish autoregression.
2) It is important to point out that a theorem in this context is not the same as a "Theorem" from mathematics. Production rules generate theorems that comply with the rules and axioms of the formal system, ensuring that they could have meaning in that formal system. The meaning cannot justify the rules, though; fortunately, most of us know to use the rules of logic so that we are not grunting beasts incapable of conveying information.
I think the author wonders why theorems that don't seem to have meanings appear in the output of AI.
But let's say you change your mathematical expression by reducing or expanding it somehow. Unless it's trivial, there are infinitely many ways to do it, and the "cause" here is the answer to the question "why did you do that and not something else?" Brute force excluded, the cause is probably some idea, some model of the problem, or a gut feeling (or desperation...).
Smoking significantly increases the risk of getting cancer. We say smoking causes cancer. Causal reasoning can be probabilistic.
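To make the distinction concrete, here is the standard Pearl-style "seeing vs. doing" notation (a sketch; Z is a stand-in confounder, and the last step assumes Z satisfies the backdoor criterion):

    % Observational: the correlation you get from watching who smokes
    P(\text{cancer} \mid \text{smoking})

    % Interventional: what happens if you force someone to smoke
    P(\text{cancer} \mid do(\text{smoking}))

    % With a confounder Z blocking all backdoor paths, the two are related by
    P(\text{cancer} \mid do(\text{smoking}))
        = \sum_{z} P(\text{cancer} \mid \text{smoking}, Z = z)\, P(Z = z)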
LLMs are not doing causal reasoning because there are no facts, only tokens. For the most part you can't ask an LLM how it came to an answer, because it doesn't know.
Regardless of personal opinions about his style, Marcus has been proven correct on several fronts, including the diminishing returns of scaling laws and the lack of true reasoning (out of distribution generalizability) in LLM-type AI.
These are issues that the industry initially denied, only to (years) later acknowledge them as their "own recent discoveries" as soon as they had something new to sell (chain-of-thought approach, RL-based LLM, tbc.).
Care to explain further? He has made far more claims of the limitations of LLMs that have been proven false.
> diminishing returns of scaling laws
This was so obvious it didn't need mentioning. And what Gary really missed is that all you need are more axes to scale over and you can still make significant improvements. Think of where we are now vs 2023.
> lack of true reasoning (out of distribution generalizability) in LLM-type AI
To my understanding, this is one that he has gotten wrong. LLMs do have internal representations, exactly the kind that he predicted they didn't have.
> These are issues that the industry initially denied, only to (years) later acknowledge them
The industry denies all their limitations for hype. The academic literature has all of them listed plain as day. Gary isn't wrong because he's contradicted the hype of the tech labs, he's wrong because his short-term predictions were proven false in the literature he used to publish in. This was all in his efforts to peddle neurosymbolic architectures which were quickly replaced by tool use.
The hype is coming from startups, big tech press releases, and grifters who have a vested interest in raising a ton of money from VCs and stakeholders, same as blockchain and metaverse. The difference is that there is a large legitimate body of research underneath deep learning that has been there for many years and remains (somewhat) healthy.
I would argue that the claim of "LLMs will never be able to do this" is crazy without solid mathematical proof, and is risky even with significant empirical evidence. Unfortunately, several professionals have resorted to this language.
The AI community requires more independent experts like Marcus to maintain integrity and transparency, ensuring that the field does not succumb to hyperbole as well as shifting standards such as "internally achieved AGI", etc.
Agreed, the hype cycles need vocal critics. The loudest voices talking about LLMs are the ones who financially benefit the most from it. I'm not anti-AI; I think the hype, and the gaslighting of the entire economy into believing this is the sole thing that will render them unemployed, is ridiculous (the economy is rough for a myriad of other reasons, most of which originate from our country's choice in leadership).
Hopefully the innovation slowing down means that all the products I use will move past trying to duct-tape AI on and start working on actual features/bugs again.
I have a tiny tiny podcast with a friend where we try to break down which parts of the hype are bullshit (muck) and which kernels of truth are there, if any. It started partially as a place to scream into the void, partially to help the people who are anxious about AGI or otherwise being harmed by the hype. I think we have a long way to go in terms of presentation (breaking down very technical terms for an audience that is used to vague hype around "AI" is hard), but we cite our sources, so maybe it'll be interesting for you to check out our show notes.
I personally struggle with Gary Marcus critiques because whenever they are about "making AI work" they go into neurosymbolic "AI", which I have technical disagreements with, and I have _other_ arguments for the points he sometimes raises which I think are more rigorous, so it's difficult to be roughly in the same camp - but overall I'm happy someone with reach is calling BS as well.
Hard disagree. The essay is a rehash of Reddit complaints, with no direct results from testing, and is largely about product-launch snafus (a simultaneous launch to 500M+ users, mind you). Please.
I think most hit pieces like this miss what is actually important about the 5 launch - it's the first product launch in the space. We are moving on from model improvements to a concept of what a full product might look like. What matters about 5 is not thinking strength, although it is moderately better than o3 in my tests, which is roughly what the benchmarks say.
What's important is that it's faster, that it's integrated, that it's set up to provide incremental improvements (to, say, multimodal interaction, image generation, and so on) without needing the branding of a new model, and I think the very largest improvement is its ability to retain context and goals over a very long series of tool uses.
Willison mentioned it's his only daily driver now (for a largely coding-based usage setup), and I would say it's significantly better than the prior best (Claude), or the prior best architects (o3-pro or Gemini, depending), at larger/longer coding tasks that need more context. It's also much faster than o3-pro for coding.
Anyway, saying “Reddit users who have formed parasocial relationships with 4o didn’t like this launch -> oAI is doomed” is weak analysis, and pointless.
If ChatGPT 5 lived up to the hype, literally no one would be asking for the old models back. The snafus are minor as far as presentations go, but their existence completely undermines the product OpenAI is selling, which is an expert in your pocket. They showed everyone this "expert" couldn't even help its own creators nail such a high-stakes presentation; OpenAI's embarrassing oversights foretell similar embarrassments for anyone who relies on this product for their own high-stakes presentation or report.
Roughly: Meteor required too much vertical integration on each part of the stack to survive the rapidly changing landscape at the time. On top of that, a lot of the team's focus shifted to Apollo (which, at least from a commercial point of view, seems to have been a good decision).
Tight coupling to MongoDB, a fragmented ecosystem/packages, and React came out soon after and kind of stole its lunch money.
It also had some pretty serious performance bottlenecks, especially when observing large tables for changes that need to be synced to subscribing clients.
I agree though, it was a great framework for its day. Auth bootstrapping in particular was absolutely painless.
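For anyone who never used it, the bottleneck above comes from how the publish/subscribe layer works. Roughly (standard Meteor API calls from memory; the collection name is made up):

    import { Meteor } from 'meteor/meteor';
    import { Mongo } from 'meteor/mongo';

    const Tasks = new Mongo.Collection('tasks');

    if (Meteor.isServer) {
      // The server keeps a live query ("observer") open for every subscribed
      // client and diffs the result set to push changes down. With large
      // collections and many subscribers, that observation work is where the
      // bottleneck showed up.
      Meteor.publish('tasks.all', function () {
        return Tasks.find({});
      });
    }

    if (Meteor.isClient) {
      Meteor.subscribe('tasks.all'); // mirrored into Minimongo on the client
    }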
A non-relational, document-oriented pubsub architecture based on MongoDB, good for not much more than chat apps. For toy apps (in 2012-2016), use Firebase (also for chat apps); for CRUD-spectrum and enterprise apps, use SQL. And then React happened and consumed the entire spectrum of frontend architectures, bringing us to GraphQL, which didn't, but the hype wave left little oxygen remaining for anything else. (Even if it had, Meteor still was not better.)
I'm the de facto maintainer of the Meteor MySQL integration. Since 2015, I've been involved in the design and maintenance of six different Meteor web apps for real-time geospatial applications, built for both B2B and B2C.
Given this, I reject your assertion that Meteor is limited to MongoDB and "toy apps".
Local-First & Sync-Engines are the future. Here's a great filterable datatable overview of the local-first framework landscape:
https://www.localfirst.fm/landscape
My favorite so far is Triplit.dev (which can also be combined with TanStack DB); two more I'd like to explore are PowerSync and NextGraph. Also, the recent LocalFirst Conf has some great videos; I'm currently watching the NextGraph one (https://www.youtube.com/watch?v=gaadDmZWIzE).
How is the database migration support for these tools?
Needing to support clients that don't phone home for an extended period and therefore need to be rolled forward from a really old schema state seems like a major hassle, but maybe I'm missing something. Trying to troubleshoot one-off frontend bugs for a single product user can be a real pain; I'd hate to see what it's like when you have to factor in the state of their schema as well.
I can't speak to the other tools, but we built PowerSync using a schemaless protocol under the hood, specifically for this reason. Most of the time you don't need to implement migrations at all. For example adding a new column just works, as the data is already there when the schema is rolled forward.
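To illustrate the idea (a hypothetical sketch of the concept, not PowerSync's actual API): rows are synced as schemaless documents, and the client schema is just a view over them, so a newly added column reads straight out of data that was already synced.

    // Hypothetical: synced rows are plain schemaless JSON documents.
    type SyncedRow = Record<string, unknown>;

    // The "schema" is only a client-side view definition.
    interface ColumnSpec { name: string; default: unknown }

    function project(row: SyncedRow, columns: ColumnSpec[]) {
      const out: Record<string, unknown> = {};
      for (const col of columns) {
        // Old clients simply ignored this key; new clients pick it up.
        out[col.name] = col.name in row ? row[col.name] : col.default;
      }
      return out;
    }

    // Rolling the schema forward is just adding a column spec; no data
    // migration is needed because the synced documents already carry it.
    const v1: ColumnSpec[] = [{ name: 'title', default: '' }];
    const v2: ColumnSpec[] = [...v1, { name: 'priority', default: 0 }];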
For me it was the lack of confirmation from the backend. When it was the next big thing, it sent changes to the backend without waiting for a response. This made the interface crazy fast, but I just couldn't take the risk of the FE being out of sync with the backend. I hope they grew out of that model, but I never took it seriously for that one reason.
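For comparison, the pattern being described here, optimistic local writes reconciled with the backend later, looks roughly like this (a generic sketch, not any framework's real API), including the rollback step that limits the out-of-sync risk:

    // Apply the write locally first, then reconcile (or roll back) when the
    // backend eventually responds.
    type Todo = { id: string; text: string };

    const local = new Map<string, Todo>();

    async function addTodo(todo: Todo, send: (t: Todo) => Promise<boolean>) {
      local.set(todo.id, todo);               // UI updates instantly
      try {
        const accepted = await send(todo);    // confirmation arrives later...
        if (!accepted) local.delete(todo.id); // ...and may force a rollback
      } catch {
        local.delete(todo.id);                // network failure: undo the write
      }
    }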
Yeah, I built my first startup on Meteor, and the prototype for my second one, but there were so many weird state bugs after it got more complicated that we eventually had to switch back to normal patterns to scale it.
Thank you for this, I'm going to have to check out Triplit. Have you tried InstantDB? It's the one I've been most interested in trying but haven't yet.
Gladly! Automerge on its own is just a library that makes local-first data structures possible.
Ethersync uses this library for a concrete purpose: Collaborating on local text files. We wrote editor plugins and a daemon that runs on your computer, to enable you to type in plaintext files/source code together, from the editors you already know.
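For a feel of what the underlying library does, here is a minimal sketch using the @automerge/automerge 2.x JS API (the document fields are made up):

    import * as Automerge from "@automerge/automerge";

    // Two peers start from the same document.
    let docA = Automerge.from({ title: "notes", lines: [] as string[] });
    let docB = Automerge.clone(docA);

    // Each makes an independent local edit, e.g. while offline.
    docA = Automerge.change(docA, d => { d.lines.push("a line typed on A"); });
    docB = Automerge.change(docB, d => { d.title = "shared notes"; });

    // Merging is deterministic and needs no central server to order the edits.
    const merged = Automerge.merge(docA, docB);
    // merged.title === "shared notes" and merged.lines includes the line from A.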
Perhaps, but I see it more as an endorsement of careful feature selection. Subject matter experts can do this, and once it's done, you can get away with a much smaller model and better price/performance.
Looks great & kudos for making it local-first & open-source, much appreciated!
From a business perspective, and as someone looking also into the open-source model to launch tools, I'd be interested though how you expect revenue to be generated?
Is it solely relying on the audience segment that doesn't know how to hook up the API manually to use the open-source version? How do you calculate this, since, by pushing it via open source/GitHub, you would think that most people exposed to it are technical enough to just run it from source.