Hacker News
monocasa on April 14, 2022 | on: Apple's M1 Ultra comes with a 32MB TLB bottleneck
Looking into it more, AGX2 (like pretty much every fairly high-perf modern GPU) is heavily SMT, allowing up to 1024 simultaneous threads per core, depending on how many registers each shader invocation needs.
https://rosenzweig.io/blog/asahi-gpu-part-3.html
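To make the register/thread tradeoff concrete, here's a rough sketch of how occupancy usually falls out of register pressure on a GPU like this. Only the 1024-thread cap comes from the comment above. The register file size and the function name are made up for illustration, and real hardware likely allocates registers in coarser granules than this simple division implies.

```python
def max_resident_threads(regs_per_thread: int,
                         register_file_size: int,
                         hw_thread_limit: int = 1024) -> int:
    """Upper bound on threads one core can keep resident at once.

    Simplified model: every thread gets its own slice of a shared
    register file, so more registers per thread means fewer threads.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    return min(hw_thread_limit, register_file_size // regs_per_thread)


# ASSUMPTION: hypothetical register file size, not a documented AGX2 figure.
HYPOTHETICAL_REGISTER_FILE = 64 * 1024

for regs in (32, 64, 128, 256):
    threads = max_resident_threads(regs, HYPOTHETICAL_REGISTER_FILE)
    print(f"{regs} regs/thread -> {threads} threads")
```

Under this model a shader that stays small keeps the core at the full thread limit, while a register-hungry shader cuts the number of threads available to hide memory latency.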