Hacker News

If current LLMs hit a scaling wall and the game becomes about efficiency, I wonder if there's going to be space in the market for small models focused on specific use cases.

I use Gemini to extract structured data from images and the flash model is great at this. I wonder how much effort it would be to create a smaller model that would run on something like a NUC with an AMD APU that is good enough for that one use case.

Or perhaps you end up with mini external GPU sticks that run use case specific models on them. Might not be much of a market for that, but could be pretty cool.



I was looking for one to use for named entity extraction and found this fine-tune: https://huggingface.co/dslim/bert-base-NER

It's only 108 million params.
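For context on what a fine-tune like that emits: token-level BIO tags, which you merge back into entity spans. A minimal pure-Python sketch of that merge step (the tokens and tags below are made-up example output, not actual output from the model):

```python
# Sketch of the post-processing a BERT NER fine-tune implies: the model
# emits one BIO tag per token, and entities are recovered by merging
# consecutive B-/I- tags of the same type. Example data is made up.

def merge_bio(tokens, tags):
    """Merge BIO-tagged tokens into (entity_text, entity_type) spans."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == etype:
            current.append(tok)
        else:
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

tokens = ["Angela", "Merkel", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(merge_bio(tokens, tags))  # → [('Angela Merkel', 'PER'), ('Paris', 'LOC')]
```

(The transformers pipeline does this for you via its aggregation options; this just shows there's no magic in the last step.)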


That's already the case, and it's called model distillation: you use LLMs to generate labels, then run a dedicated smaller model (usually a small NN) trained on those labels, at roughly 1000x lower inference cost.
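A toy sketch of that pipeline, with a fixed linear "teacher" standing in for the LLM labeller and a tiny logistic "student" trained on its soft labels (all data and names here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2-D points. The "teacher" is a fixed linear scorer standing in
# for a big, expensive labeller.
X = rng.normal(size=(500, 2))
teacher_w = np.array([2.0, -1.5])
teacher_probs = 1.0 / (1.0 + np.exp(-(X @ teacher_w)))  # soft labels

# "Student": a tiny logistic model fit to the teacher's soft labels by
# plain gradient descent; this is the model you'd actually run at inference.
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    grad = X.T @ (p - teacher_probs) / len(X)
    w -= 0.5 * grad

student_probs = 1.0 / (1.0 + np.exp(-(X @ w)))
agreement = np.mean((student_probs > 0.5) == (teacher_probs > 0.5))
print(f"student/teacher agreement: {agreement:.2%}")
```

The real version swaps the toy scorer for LLM-generated labels and the logistic model for whatever small network fits the task, but the shape of the pipeline is the same.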


I think beyond the technical aspect it's a product and packaging problem.

All the effort is in productizing foundational models and apps built on top of them, but as that plateaus distilled models and new approaches will probably get more time in the sun. I'm hopeful that if this is the case we will see more weird stuff come available.


> I wonder if there's going to be space in the market for small models focused on specific use cases.

Just a recent discussion on HN: "Small language models are the future of agentic AI"

https://news.ycombinator.com/item?id=44430311


Throwback to that brief period when people would mine bitcoin (ineffectively) using ASIC sticks in their USB ports.


Yes, and people buying random GPUs for Ether etc. I'm not a huge fan of what crypto has become, but there was something exciting about hacking stuff together at home for it, which is currently missing in AI, IMO.

Maybe it's not really missing and the APIs for LLMs are just too good and cheap to make homebrew stuff exciting.


no, I think you're right—there's definitely something missing right now

but more likely it's going on and we're just not seeing it

in general, though, I think once a certain amount of money is involved, people just start to get rabid and everything becomes a lot less fun


Maybe more accessible tools, I think?

It's possible to run models locally and fiddle with temp etc.

Being able to change other things on the fly, like identifying the weights most activated by a prompt and tweaking just those to see what happens, is much harder.

I've tried both LLMs and image generators locally on my machine, and while it's gotten easier, just setting things up is a long task, especially if you run into driver issues.
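For what it's worth, the "fiddle with temp" knob is just a divisor on the logits before softmax: low temperature sharpens the distribution, high temperature flattens it. A minimal numpy sketch (toy logits, not from any real model):

```python
import numpy as np

# Toy next-token logits, made up for illustration.
logits = np.array([2.0, 1.0, 0.5, 0.1])

def softmax_with_temperature(logits, temp):
    """Scale logits by 1/temp, then softmax them into probabilities."""
    scaled = logits / temp
    scaled -= scaled.max()          # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

for t in (0.2, 1.0, 2.0):
    p = softmax_with_temperature(logits, t)
    print(f"T={t}: {np.round(p, 3)}")
```

The deeper tinkering the comment describes (poking at individual weights) has no equivalently simple knob, which is probably part of why the hobbyist scene feels thinner.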





