Hacker News

We should offer a prize for the first person who finds an innocuous input that leads to the model responding with an unintended malicious response.

I think it's funny that 1990s sci-fi movies about AI always showed that two of the most ridiculous things people in the future could do were:

- give your powerful AI access to the Internet

- allow your powerful AI to write and run its own code

And yet here we are. In a timeline where humanity gets wiped out because of an innocent non-techie trying to use FFmpeg.

Somebody is watching us and throwing popcorn at their screen right now!



LLMs don't have intentions, so it would never be an unintended malicious response.



