
Vision-language-action (VLA) models seem to be the broad category that the best current approaches fall into: they basically combine a large vision-language model with robotic action output. For example, see https://www.physicalintelligence.company/blog/pi0

