
> With a human, garbage in can lead to something fully legitimate out!

Because we get to see the error messages, fix, and try again. You can try this with ChatGPT: give it a task, run the code (it probably fails), copy the error back, and let it fix its own errors. After a few rounds it reaches a working result with much higher probability than when you allow it a single shot.

A language model can write programs, and we can run those programs to check whether they pass tests. That gives the language model a special signal: execution feedback. If you retrain the model on this new data, it learns to code better and better. That is reinforcement learning, not language modelling.

AlphaGo was able to generate its own data and beat humans at Go by doing exactly this. It's an evolutionary method as well, because you are cultivating populations of problems and solutions through generate + execute + validate.
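The generate + execute + validate loop described above can be sketched in a few lines. Everything here is a toy: the "model" is a canned sequence of candidate programs standing in for a real generation call, and the validator just checks the exit code.

```python
import os
import subprocess
import sys
import tempfile

def run_candidate(code: str) -> tuple[bool, str]:
    """Execute a candidate Python program; return (passed, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, text=True, timeout=10)
        return proc.returncode == 0, proc.stderr
    finally:
        os.unlink(path)

def refine(generate, rounds: int = 4):
    """Feed each round's error back to the generator; stop on success."""
    error = ""
    for _ in range(rounds):
        code = generate(error)          # hypothetical model call
        ok, error = run_candidate(code)
        if ok:
            return code
    return None

# Stand-in "model": first attempt is buggy (NameError), second is fixed.
attempts = iter(["print(x)", "x = 42\nprint(x)"])
result = refine(lambda err: next(attempts))
```

In a real system, the transcripts of (problem, failed attempt, error, fixed attempt) produced by this loop are exactly the retraining data mentioned above.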



> Because we get to see the error messages, fix and try again.

As I noted explicitly, a human will improve even from garbage input, and even without access to a computer. I also noted explicitly that we are able to learn from a single well-reasoned note.

I recommend you seriously examine how you yourself learn if you truly believe that you only learn things through active feedback from external sources of truth via trial runs.



