Vision-language-action models seem to be the broad category for the best approac...

		Geee 3 months ago \| parent \| context \| favorite \| on: A brief, incomplete, and mostly wrong history of r... Vision-language-action models seem to be the broad category for the best approach, which basically combines a large vision-language model with robotic actions. For example, see https://www.physicalintelligence.company/blog/pi0