Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Then you need to go over each item with just as much care as you would any probably-irrelevant item pulled from a keyword search, because the LLM is incapable of evaluating it in any way other than correlation.

Also, you don't necessarily have a real dataset to begin with: prior art doesn't need to be patented, it just needs to be published/public/invented sufficiently before the patent. Searching the existing patent database is insufficient.



> Also, you don't necessarily have a real dataset to begin with: prior art doesn't need to be patented, it just needs to be published/public/invented sufficiently before the patent. Searching the existing patent database is insufficient.

I would caution against making assumptions with regards to dataset access and size. I agree effectiveness of the effort I mention would be a function of not only gen AI engineering, but also dataset size and scope.


Going over a better curated list is a significant upgrade and time saver.

Let’s not pretend that “correlation” isn’t very powerful




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: