
> Starting from a point of outputting random gibberish, the only feedback these models are given during training is whether their next word prediction was right or wrong (i.e. same as next word in the training sample they are being fed). So, calling these models "next word predictors" is technically correct from that point of view - this is their only "goal" and only feedback they are given.

This is true for pretraining - creating a "base model" - but it's not true for instruction tuning. There's a second stage (RLHF, DPO, whatever) where it's trained again with the objective being "take questions and generate answers" and from there "generate correct answers".
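Very roughly, the two stages optimise different losses. A minimal sketch in PyTorch (illustrative only, not any lab's actual training code; the DPO variant shown assumes you already have per-sequence log-probs for a chosen and a rejected answer under both the policy and a frozen reference model):

  import torch
  import torch.nn.functional as F

  # Stage 1: pretraining - predict the next token at every position.
  # logits: (batch, seq_len, vocab), tokens: (batch, seq_len)
  def next_token_loss(logits, tokens):
      return F.cross_entropy(
          logits[:, :-1].reshape(-1, logits.size(-1)),  # predictions for positions 0..n-2
          tokens[:, 1:].reshape(-1),                    # targets: the same tokens shifted by one
      )

  # Stage 2 (DPO-style preference tuning): push the policy toward the preferred
  # answer and away from the rejected one, relative to the frozen reference model.
  def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
      margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
      return -F.logsigmoid(margin).mean()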

I would expect there could be further advancements where we actually program algorithms into transformers (which can be done) and then merge models with proven capabilities together rather than trying to train everything by example. Or emit tool-running tokens which can do unbounded computation.
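The tool-token idea can be sketched as an outer loop: generate until the model emits a special marker, run whatever it asked for, append the result, and keep generating. The token names and the model_generate callable below are made up for illustration:

  import subprocess, sys

  TOOL_START, TOOL_END = "<tool>", "</tool>"  # hypothetical markers, not a real spec

  def run_with_tools(model_generate, prompt, max_rounds=5):
      """Generate; whenever a <tool>...</tool> span appears, execute it and
      feed the output back into the context for the next round."""
      context = prompt
      for _ in range(max_rounds):
          out = model_generate(context)          # assumed callable: str -> str
          context += out
          if TOOL_START not in out:
              return context                     # no tool call, we're done
          code = out.split(TOOL_START, 1)[1].split(TOOL_END, 1)[0]
          result = subprocess.run([sys.executable, "-c", code],
                                  capture_output=True, text=True).stdout
          context += "\n[tool output]\n" + result + "\n"
      return context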

> so what magic is inside them that lets them learn so well?!

Funny thing is, there _are_ known limits to what it can do. In particular, it can't do reverse association on anything it learned in the forward direction. This is called the "reversal curse".

i.e., if you give GPT-4 a line from a song it can tell you what the line after it is, but it's a lot worse at the line before it!



> This is true for pretraining - creating a "base model" - but it's not true for instruction tuning. There's a second stage (RLHF, DPO, whatever) where it's trained again with the objective being "take questions and generate answers" and from there "generate correct answers".

Yes, but those are essentially filters, applied after the base model has already learnt its world model. I think these are more about controlling what the model generates than what it learns, since you don't need much data for this.

> merge models with proven capabilities together rather than trying to train everything by example

Merging specialist LLMs is already a thing. I'm not sure exactly how it works, but it's basically merging weights post-training. Yannic Kilcher mentioned this in one of his recent YouTube videos.
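For the simplest merge methods it's roughly element-wise weight averaging across models that share an architecture ("model soup" style interpolation; whether that's the exact method Kilcher covered I'm not sure). A sketch:

  import torch

  def merge_state_dicts(state_dicts, weights=None):
      """Linear interpolation of parameters. Assumes all models share the
      same architecture (identical keys and tensor shapes)."""
      if weights is None:
          weights = [1.0 / len(state_dicts)] * len(state_dicts)
      return {key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
              for key in state_dicts[0]}

  # e.g. model.load_state_dict(merge_state_dicts([sd_math, sd_code], [0.5, 0.5]))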

> if you give GPT-4 a line from a song it can tell you what the line after it is, but it's a lot worse at the line before it!

I suppose a bidirectional transformer like BERT would handle this better, but generative language models deliberately use only the past to predict the future, so this might be expected. Some short-term memory (an additional "context" persisting across tokens) would presumably help.
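The asymmetry falls out of the attention mask: a BERT-style encoder lets every token attend in both directions, while a GPT-style decoder masks out the future during training. A minimal illustration:

  import torch

  def causal_mask(seq_len):
      # GPT-style: position i may only attend to positions <= i (lower triangle).
      return torch.tril(torch.ones(seq_len, seq_len)).bool()

  def bidirectional_mask(seq_len):
      # BERT-style: every position attends to every other position.
      return torch.ones(seq_len, seq_len).bool()

  print(causal_mask(4).int())   # future positions are zeroed out
  print(bidirectional_mask(4).int())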


Does Quiet-STaR [0] address the association issue, i.e. forward reasoning from past learning?

[0] https://arxiv.org/abs/2403.09629


No; it can reason backwards from things it found in context, just not from things trained into the model. If you have lines A, B, C, there's no association in the model back from C to B. I don't think this can be solved by better reasoning.

A proposed solution I saw recently was to feed every training document in backwards as well as forwards.
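A hedged sketch of that idea: duplicate each training sequence token-reversed, so the model also sees later tokens predicting earlier ones (real proposals differ in the details, e.g. reversing at the entity or sentence level rather than token by token):

  def augment_with_reversed(token_sequences):
      """Yield each training sequence forwards, then token-reversed."""
      for seq in token_sequences:
          yield seq
          yield list(reversed(seq))

  # usage:
  docs = [["the", "cat", "sat"], ["line_A", "line_B", "line_C"]]
  print(list(augment_with_reversed(docs)))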



