
llama3.3:70b-instruct-q4_K_M (~43 GB; 2x 3090/4090, or CPU inference with fast memory, e.g. Apple Silicon Macs)

or

qwen2.5-coder:32b-instruct-q5_K_M (~23 GB)

or

gemma2:9b-instruct-q6_K (~7.5 GB)

and

https://github.com/bernardo-bruning/ollama-copilot

or alternatively:

https://github.com/ollama/ollama + https://github.com/olimorris/codecompanion.nvim
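A minimal sketch of the ollama side of either setup, assuming ollama is installed and its server is running on the default port (11434); the model tag matches the qwen2.5-coder option above, and the final curl hits ollama's standard /api/generate endpoint:

```shell
# Pull the quantized model (downloads ~23 GB on first run)
ollama pull qwen2.5-coder:32b-instruct-q5_K_M

# Quick interactive sanity check
ollama run qwen2.5-coder:32b-instruct-q5_K_M "Write a Python function that reverses a string."

# Or query the local HTTP API directly, which is what editor
# integrations like ollama-copilot / codecompanion.nvim build on
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:32b-instruct-q5_K_M",
  "prompt": "Write a Python function that reverses a string.",
  "stream": false
}'
```

From there, ollama-copilot or codecompanion.nvim point at the same local server; check each project's README for its exact configuration.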



Or just get yourself a Cerebras cluster and run the full llama-3.1-405B.



