Hacker News

People should be demanding consistency and traceability from model vendors, checked by some independent tool, perhaps like this one. It may tell you when the vendor changed something, but is there otherwise any recourse?


Agreed! FWIW I am attempting to create an open-source wiki/watchdog eval platform -- weval.org -- so we can all keep an eye on LLMs, their biases, and their general competencies without relying on the AI providers marking their own homework. I really believe this needs to exist so we can express our needs and hold model creators to account, especially as model drift and manipulation become real risks.
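The core idea of a drift watchdog can be sketched very simply: re-run a fixed set of probe prompts against the model on a schedule and compare response fingerprints to a stored baseline. A minimal illustration (all names here are hypothetical; a real monitor would call the vendor's API at temperature 0 and handle benign nondeterminism with tolerance, not exact hashes):

```python
import hashlib

def fingerprint(text: str) -> str:
    """Stable short hash of a model response."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

def snapshot(model, probes):
    """Record a fingerprint for each probe prompt."""
    return {p: fingerprint(model(p)) for p in probes}

def drifted(baseline, current):
    """Return the probes whose responses changed since the baseline."""
    return [p for p in baseline if baseline[p] != current.get(p)]

# Stub "models" standing in for two successive vendor snapshots.
model_v1 = lambda p: f"answer to {p}"
model_v2 = lambda p: "2" if p == "1+1?" else f"answer to {p}"

probes = ["1+1?", "capital of France?"]
baseline = snapshot(model_v1, probes)
changed = drifted(baseline, snapshot(model_v2, probes))
print(changed)  # probes whose answers changed between snapshots
```

Exact-match hashing only catches hard changes; a fuller eval platform would score graded rubrics so gradual behavioral drift is visible too.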



