
I mean, yes? Managing giant model weight files is a big problem with getting people on-demand access to Docker-based micro-VMs. I don't think we missed that point so much as acknowledged it, and got clear that we weren't going to break up our existing DX just to fix it. If there were lots and lots of people trying to self-host LLMs and running into this problem, it would have been a harder call.
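
For context on why big weight files sit awkwardly in an image-based micro-VM flow: a common workaround (not described in this thread; purely an illustrative sketch) is to keep the weights out of the OCI image entirely and pull them onto a persistent volume on first boot, so the image stays small and later boots skip the download. The mount path, model name, and marker-file scheme below are all assumptions for illustration.

    # Minimal sketch, assuming a persistent volume mounted at /data
    # and weights hosted on the Hugging Face Hub.
    import os
    from huggingface_hub import snapshot_download

    WEIGHTS_DIR = "/data/models/mistral-7b"  # hypothetical path on the attached volume

    def ensure_weights() -> str:
        # Download once; subsequent micro-VM boots find the marker and skip it.
        marker = os.path.join(WEIGHTS_DIR, ".complete")
        if not os.path.exists(marker):
            snapshot_download(
                repo_id="mistralai/Mistral-7B-v0.1",  # hypothetical model choice
                local_dir=WEIGHTS_DIR,
            )
            open(marker, "w").close()  # mark the download as finished
        return WEIGHTS_DIR

    if __name__ == "__main__":
        print("weights at", ensure_weights())

The trade-off the sketch is gesturing at: baking multi-GB weights into the image makes pulls and boots slow, while fetching at boot needs persistent storage and a slow first start, which is part of why this is awkward for on-demand micro-VMs.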


Did you consider other use cases where people need custom models and inference, beyond just open-source LLMs?


Yes. Click through to the L40S post the article links to (the L40Ses aren't going anywhere).

There are people doing GPU-enabled inference stuff on Fly.io. That particular slice of the market seems fine?



