Hacker News
How attention sinks keep language models stable (hanlab.mit.edu)
219 points by pr337h4m 29 days ago | past | 36 comments
Radial Attention: O(n log n) Attention for Long Video Generation with 2-4× Speedup (hanlab.mit.edu)
6 points by lmxyy 61 days ago | past | 1 comment
SVDQuant+NVFP4: 4× Smaller, 3× Faster FLUX with 16-bit Quality on Blackwell GPUs (hanlab.mit.edu)
52 points by lmxyy 6 months ago | past | 10 comments
RTX 5090 Workstation Configuration Journey (hanlab.mit.edu)
5 points by lmxyy 6 months ago | past | 1 comment
SVDQuant: 4-Bit Quantization Powers 12B Flux on a 16GB 4090 GPU with 3x Speedup (hanlab.mit.edu)
179 points by lmxyy 10 months ago | past | 65 comments
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer (hanlab.mit.edu)
2 points by lnyan 10 months ago | past
TinyChat: Large Language Model on the Edge (hanlab.mit.edu)
2 points by enduku on Dec 8, 2023 | past | 1 comment