
The key ingredient is not the humans but the feedback they carry to the model. Humans are embodied and can test ideas in the real world; LLMs need some kind of special deployment to achieve that. It just so happens that chat rooms are such a deployment.

For example, AlphaZero started from scratch and only had feedback from self-play game outcomes, but that was enough to reach superhuman level. It was the feedback that carried the insights and taught the model.
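Here's a hedged sketch of that idea (not AlphaZero itself, which uses MCTS and deep networks): a tiny agent learns one-pile Nim purely from win/lose feedback at the end of self-play games. The agent is never told the winning rule (leave your opponent a multiple of 4 stones); the outcome signal alone recovers it.

```python
import random

# Toy illustration of outcome-only feedback via self-play.
# Game: one pile of stones, take 1-3 per turn, taking the last stone wins.

N, ACTIONS, EPS = 12, (1, 2, 3), 0.2
Q = {(s, a): 0.0 for s in range(1, N + 1) for a in ACTIONS if a <= s}
visits = {k: 0 for k in Q}

def pick(s, greedy=False):
    """Epsilon-greedy during self-play, greedy at evaluation time."""
    acts = [a for a in ACTIONS if a <= s]
    if not greedy and random.random() < EPS:
        return random.choice(acts)
    return max(acts, key=lambda a: Q[(s, a)])

random.seed(0)
for _ in range(20000):
    s, moves = N, []                      # both players share the same Q
    while s > 0:
        a = pick(s)
        moves.append((s, a))
        s -= a
    # Outcome-only feedback: whoever moved last wins (+1), the other
    # loses (-1). Monte Carlo average of final outcomes per (state, action).
    for i, (st, a) in enumerate(moves):
        r = 1.0 if (len(moves) - 1 - i) % 2 == 0 else -1.0
        visits[(st, a)] += 1
        Q[(st, a)] += (r - Q[(st, a)]) / visits[(st, a)]

# The recovered policy takes s % 4 stones whenever s % 4 != 0.
print([pick(s, greedy=True) for s in (5, 6, 7)])  # [1, 2, 3]
```

The point is that no move was ever labeled good or bad; only the terminal win/lose signal flowed back, and that was enough to extract the rule.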

You can draw a parallel to the scientific method: there are two stages, ideation and validation. Ideation alone is not scientific; validation is what makes or breaks ideas. LLMs without a validation system are like scientists without a lab.

We're not that smart, as the large number of ideas that don't pan out demonstrates. We can churn out ideas fast, but we learn from their outcomes; we can't predict outcomes from the start and skip validation.

Here is an example of LLMs discovering useful ideas through feedback, even ideas far outside their training distribution:

"Evolution through Large Models" https://arxiv.org/abs/2206.08896

This works because the task proposed by the paper is easy to test, so there is plenty of feedback. But the LLM still needs to apply ingenuity to optimize it; you can't brute-force it with evolutionary methods alone.
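The feedback loop in that paper can be sketched like this (heavily simplified, and with the LLM stubbed out): an outer evolutionary search keeps whichever candidates score well on an easy-to-test fitness function. In the paper the mutation operator is an LLM proposing code changes; here a random character tweak stands in for it so the loop runs self-contained. The division of labor is the point: the generator proposes, the test supplies the feedback that drives selection.

```python
import random

# Hill-climbing toward a target string as a stand-in for an
# evolution-through-LLMs loop. mutate() is where the LLM would go.

TARGET = "reward is the feedback channel"
CHARS = "abcdefghijklmnopqrstuvwxyz "

def fitness(s):
    """Easy-to-test objective: number of positions matching the target."""
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s):
    """Stand-in for the LLM mutation operator: tweak one character."""
    i = random.randrange(len(s))
    return s[:i] + random.choice(CHARS) + s[i + 1:]

random.seed(0)
best = "".join(random.choice(CHARS) for _ in TARGET)
for _ in range(20000):
    child = mutate(best)
    if fitness(child) >= fitness(best):   # selection: keep what the test rewards
        best = child

print(best)  # converges to TARGET
```

With a dumb random mutator this only works on trivial objectives like the one above; the paper's contribution is that an LLM mutator makes the same loop productive on hard ones, because its proposals are far better than chance while the test still keeps it honest.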



That _is_ an interesting paper; I'll need to give it a read-through.



