
I don't think this is part of the model. It's a control layer above the actual LLM that interrupts the response when the LLM mentions any of the banned names. So if you prompt the LLM directly, without that control layer, you still get full responses.
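A minimal sketch of what such a layer could look like, assuming a streaming setup where a filter sits between the model and the client. The banned name, cutoff message, and function are invented for illustration; this is not OpenAI's actual implementation.

    from typing import Iterable, Iterator

    # Hypothetical banned-name list; the real list and matching rules aren't public.
    BANNED_NAMES = {"Some Banned Name"}

    CUTOFF_MESSAGE = "I'm unable to produce a response."

    def guarded_stream(model_tokens: Iterable[str]) -> Iterator[str]:
        """Relay model output to the client, but abort the stream as soon as
        the accumulated text contains a banned name."""
        seen = ""
        for token in model_tokens:
            seen += token
            if any(name.lower() in seen.lower() for name in BANNED_NAMES):
                yield CUTOFF_MESSAGE  # cut the reply off with an error message
                return
            yield token

    # The underlying model happily produces the name; the wrapper cuts the stream off.
    for chunk in guarded_stream(["The person you mean is ", "Some Banned Name", "."]):
        print(chunk, end="")

This also matches the observed behavior where the reply streams normally and then stops abruptly: the model itself has no problem with the name, only the wrapper does.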


Yes, you're right. It's kind of obvious if you say you forgot the name of a person and ask ChatGPT to help: https://imgur.com/a/hCl94B0



