Slightly related: Energy-Based Models (EBMs) are better in theory, yet too resource-intensive in practice. I tried to sell my org on using EBMs, but the cost of even a small use case was prohibitive.

I learned it from: https://youtube.com/playlist?list=PLLHTzKZzVU9eaEyErdV26ikyo...

Yann LeCun on one hand, and Michael Bronstein and his colleagues on the other, have similar ambitions in trying to properly "sciencify" Deep Learning.

Yann LeCun's approach, at least for vision, has one core tenet: energy minimization, just like in physics. In his course, he also shows that several current architectures/algorithms are special cases of EBMs.
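A minimal sketch of the setup as I understand it (my own illustration, not code from the course): the model learns a scalar energy E(x, y) that should be low for compatible pairs and high for incompatible ones, and a contrastive loss pushes the energy of observed pairs down while pushing corrupted pairs up:

    import torch
    import torch.nn as nn

    class EnergyModel(nn.Module):
        # Scalar energy E(x, y): low for compatible pairs, high otherwise.
        def __init__(self, dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, x, y):
            return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

    def contrastive_loss(model, x, y_pos, y_neg, margin=1.0):
        # Push observed pairs' energy down, corrupted pairs' energy up.
        return (model(x, y_pos) + torch.relu(margin - model(x, y_neg))).mean()

Inference is then a search, y* = argmin_y E(x, y), e.g. by gradient descent on y, which is also part of why EBMs cost more than a single forward pass.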

Yann believes that understanding the whys of DL algorithms' behavior will be more beneficial in the long term than playing around with hyperparameters.

There is also a case that language is too low-dimensional to lead to AGI even if it is solved. In a recent video, for example, he said that the total amount of data in all digitized books and on the internet is about the same as what a human child takes in during its first 4-5 years. He considers this low.

There are also epistemological arguments that language cannot lead to AGI, but I haven't heard him talk about those.

He also believes that vision is a more important aspect of intelligence, one reason being that it is very high-dimensional. (Edit) Consider an example. Take 4 monochrome pixels, each of which can range from 0 to 255: those 4 pixels can form 256^4 = 2^32 combinations. 4 words can be arranged in 4! = 24 ways. Solving language is easier and therefore lower-stakes. Remember the monkey producing a Shakespeare play by randomly punching typewriter keys? If that takes an astronomically long time, think how obscenely long it would take a monkey to paint the Mona Lisa by randomly assigning pixel values. Left as an exercise for the reader.
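For the record, a quick check of the figures as stated (this only verifies the arithmetic in the comment, not the comparison itself):

    from math import factorial

    assert 256 ** 4 == 2 ** 32 == 4_294_967_296  # 4 pixels, 256 levels each
    assert factorial(4) == 24                    # orderings of 4 fixed words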

Juergen Schmidhuber has gone rather quiet now. But he also said that a world model explicitly included in training and reasoning is better than training on only text or images or whatever. He has a good paper with Lucas Beyer.




WTF. The cardinality of words is 100,000.
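Taking that figure at face value, the arithmetic flips: with a 100,000-word vocabulary, 4 word slots admit vastly more combinations than 4 pixels do, not 24. A quick check:

    vocab = 100_000
    print(vocab ** 4)  # 10**20 four-word sequences
    print(256 ** 4)    # 4294967296 (2**32) pixel combinations, far fewer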


Since this exposes the answer, the new architecture has to be based on world-model building.


The thing is, this has been known since even before the current crop of LLMs. Anyone who considers (only the English) language to be sufficient to model the world understands so little about cognition as to be irrelevant in this conversation.


Thanks. This is interesting. What kind of equation is used to assess an EBM during training? I’m afraid I still don’t get the core concept well enough to have an intuition for it.


Jürgen Schmidhuber has a paper with Lucas Beyer? I'm not aware of it. Which do you mean?


Oh, I made a mistake, unfortunately. I meant hardmaru (David Ha).

I am very sorry.



