Hacker News
smcleod | 12 days ago | on: Run DeepSeek R1 Dynamic 1.58-bit
Yep, that's pretty much what I did. Their calculation for the layers was slightly off though: I found I could offload an extra 1-2 layers to the GPUs.
danielhanchen | 11 days ago
Oh yes, I reduced it by 4 just in case :) I found the formula sometimes doesn't work, so I used -4 as a worst-case margin. Glad it ran at least!
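For anyone following along, here is a rough sketch of the kind of estimate being discussed. The thread doesn't spell the formula out, so the proportional rule below (the fraction of the model file that fits in VRAM, times the layer count, minus a safety margin) is an assumption, and `estimate_gpu_layers` and its parameter names are made up for illustration. The numbers in the usage lines are placeholders, not measurements.

```python
import math


def estimate_gpu_layers(vram_gb, model_file_gb, n_layers, safety_margin=4):
    """Estimate how many layers to offload to the GPU(s).

    Assumed rule (not confirmed by the thread): layers are roughly equal
    in size, so the share of the file that fits in VRAM is the share of
    layers that fits. The margin is then subtracted as a buffer.
    """
    if model_file_gb <= 0 or n_layers <= 0:
        raise ValueError("model_file_gb and n_layers must be positive")
    raw = math.floor(vram_gb / model_file_gb * n_layers)
    # Clamp to the valid range: never negative, never more than the model has.
    return max(0, min(n_layers, raw - safety_margin))


# Placeholder values for illustration only.
conservative = estimate_gpu_layers(24, 131, 61)                 # margin of 4
tighter = estimate_gpu_layers(24, 131, 61, safety_margin=2)     # margin of 2
print(conservative, tighter)
```

Shrinking `safety_margin` from 4 to 2 or 3 corresponds to the "extra 1-2 layers" mentioned above. Whether that fits in practice depends on context length, KV cache, and other overhead, so treat the result as a starting point and adjust until it loads.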