Asking an API to write three paragraphs of text still takes tens of seconds, and it requires a working internet connection and an expensive data center.
Meanwhile, we’re seeing the first of a new generation of on-device inference chips shipping as commodity edge compute.
When the devices you use every day — cars, doorbells, TV remotes, point-of-sale terminals, Roombas — can interpret camera and speech input locally in the time it takes to draw a frame, at power levels that still give you 10 hours between charges, then we’ll be due another round of innovation.
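For a rough sense of what “the time it takes to draw a frame” means, here’s a minimal latency check against a 60 fps budget (~16.7 ms) using ONNX Runtime; the model file and input shape are placeholders, not a reference to any specific chip or product:

```python
import time
import numpy as np
import onnxruntime as ort

# Hypothetical small vision model exported to ONNX; swap in any real one.
session = ort.InferenceSession("tiny_vision_model.onnx")
input_name = session.get_inputs()[0].name

# Assumed input shape: 1x3x224x224 float32, typical for an image classifier.
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Warm-up run so one-time setup cost doesn't skew the measurement.
session.run(None, {input_name: frame})

start = time.perf_counter()
session.run(None, {input_name: frame})
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"inference: {elapsed_ms:.1f} ms (60 fps frame budget: 16.7 ms)")
```

If that number fits inside the frame budget on battery-class hardware, the device can react to camera input as fast as it can render.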
The article points to how few parts of the economy are leveraging the text-only API products currently available. That still feels very Web 1.0 to me.