
> 4-bit quantized Mixtral Instruct running locally, gets it right

This has been one of my favorite things to play around with in real-life applications. Sometimes a smaller, "worse" model will vastly outperform a larger one, seemingly because the larger model overthinks the problem. For a simple task like "extract all the names of people in this block of text," Llama 7B will produce significantly fewer false positives than Llama 70B or GPT-4.
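As a minimal sketch of the kind of local run the parent comment describes, assuming llama-cpp-python with a 4-bit Mixtral Instruct GGUF (the model filename and sample text below are illustrative, not from the original):

    # Name extraction with a 4-bit quantized Mixtral Instruct running
    # locally via llama-cpp-python. The GGUF filename is an assumption;
    # point model_path at whatever quant you actually downloaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # hypothetical path
        n_ctx=4096,      # context window large enough for the text block
        verbose=False,
    )

    text = (
        "Yesterday Alice Zhang met Bob O'Neill in Paris to discuss "
        "the merger with representatives from Acme Corp."
    )

    # Mixtral Instruct uses the [INST] ... [/INST] prompt format.
    prompt = (
        "[INST] Extract all the names of people in this block of text. "
        "Reply with one name per line and nothing else.\n\n"
        f"{text} [/INST]"
    )

    out = llm(prompt, max_tokens=128, temperature=0.0)  # greedy decoding
    print(out["choices"][0]["text"].strip())

Pinning temperature to 0 and forcing a one-name-per-line output format makes false positives easy to count when comparing models on the same input.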



