
I am using the 32B distilled model on my local 3090 with Continue in VSCode. It blows everything else out of the water.


How many tokens/s do you get on a 3090? With the extra tokens for the internal monologue, is it still performant enough for smooth VSCode integration?


Any idea how to use a cloud-hosted version with Cursor?



