Train a Mnist VAE with C and CUDA (github.com/ggerganov)
54 points by bssrdf 35 days ago | past | 2 comments

Llama.cpp Now Supports Qwen2-VL (Vision Language Model) (github.com/ggerganov)
155 points by BUFU 42 days ago | past | 50 comments

Llama.vim: Plugin for Neovim (github.com/ggerganov)
2 points by mariuz 3 months ago | past

Llama.vim: Plugin for Neovim (github.com/ggerganov)
2 points by ibobev 3 months ago | past

Attention and final logit soft-capping, update scaling factor to Gemma2 (github.com/ggerganov)
2 points by tosh 6 months ago | past

Distributed LLM Inference with Llama.cpp (github.com/ggerganov)
3 points by tosh 8 months ago | past

New exponent functions that make SiLU and SoftMax 2x faster, at full accuracy (github.com/ggerganov)
382 points by weinzierl 8 months ago | past | 72 comments

ggml: Add Flash Attention (github.com/ggerganov)
2 points by tosh 8 months ago | past

Acoustic Keyboard Eavesdropping (github.com/ggerganov)
1 point by behnamoh 8 months ago | past

llama.cpp bfloat16 support (github.com/ggerganov)
2 points by indigodaddy 9 months ago | past

GGML Flash Attention support merged into llama.cpp (github.com/ggerganov)
3 points by smcleod 9 months ago | past | 1 comment

Llama.cpp Working on Support for Llama3 (github.com/ggerganov)
7 points by theolivenbaum 9 months ago | past

Llama.cpp: Improve CPU prompt eval speed (github.com/ggerganov)
1 point by tosh 9 months ago | past

Llama.cpp: Mac Prebuilds (github.com/ggerganov)
2 points by tosh 10 months ago | past

Grok-1 Support for Llama.cpp (github.com/ggerganov)
11 points by schappim 10 months ago | past | 2 comments

Control Vectors have been added to llama.cpp (github.com/ggerganov)
3 points by Der_Einzige 10 months ago | past

Gemma Is Added to Llama.cpp (github.com/ggerganov)
17 points by behnamoh 11 months ago | past

Llama.cpp supports distributed inference across machines on a local network (github.com/ggerganov)
3 points by behnamoh 12 months ago | past

Llama.cpp incoming backends: Vulkan, Kompute, SYCL (github.com/ggerganov)
2 points by irusensei 12 months ago | past

Llama.cpp: Self-Extend Support (github.com/ggerganov)
2 points by tosh on Jan 9, 2024 | past

Llama.cpp: SOTA 2-bit quants (github.com/ggerganov)
5 points by tosh on Jan 7, 2024 | past

GGUF File Format (github.com/ggerganov)
2 points by warkanlock on Dec 31, 2023 | past

K-Quants (github.com/ggerganov)
2 points by tosh on Dec 29, 2023 | past

CUDA: Faster Mixtral Prompt Processing (github.com/ggerganov)
3 points by tosh on Dec 21, 2023 | past

Performance of llama.cpp on Apple Silicon A-series (github.com/ggerganov)
100 points by mobilio on Dec 19, 2023 | past | 41 comments

Llama.cpp: Support for Phi-2 (github.com/ggerganov)
3 points by tosh on Dec 19, 2023 | past

Wchess (github.com/ggerganov)
4 points by tosh on Dec 14, 2023 | past

QMoE Support for Mixtral (github.com/ggerganov)
3 points by tosh on Dec 14, 2023 | past

Llama: Add Mixtral Support (github.com/ggerganov)
2 points by tosh on Dec 11, 2023 | past

Performance of Llama.cpp on Apple Silicon (github.com/ggerganov)
2 points by tosh on Nov 29, 2023 | past