
That is correct; however, I'm already using all of my VRAM, so it would mean degrading my model quality. I decided instead that I'd rather run one solid model and tie all my use cases to it. Using system RAM instead proved problematic for the reasons I mentioned above.

If I had any free VRAM at all, I would fit faster-whisper in before touching any other LLM lol
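The "fit it if there's headroom" decision above can be sketched as a quick check before loading an extra model. This is only an illustration: the `nvidia-smi` query and the 2048 MiB estimate for faster-whisper's small model are assumptions, not figures from this thread.

```python
import subprocess


def fits_in_vram(free_mib: int, required_mib: int, headroom_mib: int = 512) -> bool:
    """True if a model needing `required_mib` fits while leaving some headroom."""
    return free_mib - required_mib >= headroom_mib


def query_free_vram_mib() -> int:
    """Query free VRAM on GPU 0 via nvidia-smi (requires an NVIDIA driver)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,noheader,nounits", "-i", "0"],
        text=True,
    )
    return int(out.strip())


# Rough, assumed footprint for faster-whisper's "small" model in MiB.
WHISPER_SMALL_MIB = 2048

if __name__ == "__main__":
    try:
        free = query_free_vram_mib()
        print("load whisper" if fits_in_vram(free, WHISPER_SMALL_MIB) else "skip")
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("no NVIDIA GPU detected")
```

On a fully loaded card, `fits_in_vram` returns False for any extra model, which is exactly the situation described above.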


