so I still need to download and host models myself.
I found that to be incredibly impractical. I tried to do it for my project AIMD, but the cost and quality just made absolutely no sense even with the top models.
Well, the market for local inference is already quite large, to say the least. "It didn't pencil out in my business's favor" doesn't seem like a fair criticism, especially for an app clearly focused on the hobbyist-to-SMB market, where compute costs are dwarfed by the costs of wages and increased mental load.
I definitely see your specific point though, and have found the same for high-level use cases. Local models become really useful when you need smaller models for ensemble systems; one class of use case worth trying out is narrow subtasks like proofreading, simple summarization, and tone detection.
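To make the ensemble idea concrete, here's a minimal sketch of what that routing layer can look like. Everything here is hypothetical: the task names and the stub lambdas are placeholders standing in for small local models (e.g. quantized models served via llama.cpp or similar), not a real implementation.

```python
from typing import Callable, Dict

# Hypothetical registry: each narrow subtask gets its own small specialist
# model. The lambdas below are trivial stand-ins for real local models.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "proofread": lambda text: text.replace("teh", "the"),  # stand-in for a small grammar model
    "summarize": lambda text: text.split(".")[0] + ".",    # stand-in for a small summarizer
    "tone": lambda text: "negative" if "bad" in text.lower() else "neutral",
}

def dispatch(task: str, text: str) -> str:
    """Route a subtask to its dedicated small model instead of one big model."""
    try:
        model = SPECIALISTS[task]
    except KeyError:
        raise ValueError(f"no specialist for task: {task!r}")
    return model(text)
```

The point of the pattern is that each subtask is narrow enough for a small, cheap local model to handle well, so you never pay big-model inference costs for simple work.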