
llama3.3:70b-instruct-q4_K_M (~43 GB; 2x 3090/4090, or CPU inference with fast memory, e.g. Apple Silicon Macs)

or

qwen2.5-coder:32b-instruct-q5_K_M (~23 GB)

or

gemma2:9b-instruct-q6_K (~7.5 GB)

and

https://github.com/bernardo-bruning/ollama-copilot

or alternatively:

https://github.com/ollama/ollama + https://github.com/olimorris/codecompanion.nvim
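A minimal sketch of the ollama side of either setup, assuming ollama is installed and its server is running on the default port (11434); the model tag matches the qwen2.5-coder option above, and the final curl hits ollama's standard /api/generate endpoint:

```shell
# Pull the quantized model (downloads ~23 GB on first run)
ollama pull qwen2.5-coder:32b-instruct-q5_K_M

# Quick interactive sanity check
ollama run qwen2.5-coder:32b-instruct-q5_K_M "Write a Python function that reverses a string."

# Or query the local HTTP API directly, which is what editor
# integrations like ollama-copilot / codecompanion.nvim build on
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:32b-instruct-q5_K_M",
  "prompt": "Write a Python function that reverses a string.",
  "stream": false
}'
```

From there, ollama-copilot or codecompanion.nvim point at the same local server; check each project's README for its exact configuration.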



Or just get yourself a Cerebras cluster and run the full llama-3.1-405B.



