Sure if you have a specific need you can specialize some NN with the right archi...

mnahkies · 2025-05-24T17:51:49 1748109109

So one of the use cases we're serving in production is predicting energy consumption for a home. Whilst I've not tried it, I'm very confident that giving an LLM the historical consumption and asking it to predict future consumption will underperform compared to our forecasting model. The compute our model requires is also several orders of magnitude lower than an LLM's.

galangalalgol · 2025-05-24T17:05:52 1748106352

What zero-shot model would you suggest for that task on an RPi? A temporal fusion thing?

antirez · 2025-05-24T17:29:24 1748107764

The small Gemma 3 and Qwen 3 models can do wonders for simple tasks, as a bag of algorithms.

galangalalgol · 2025-05-24T17:41:22 1748108482

Those would use more RAM than most RPis have, wouldn't they? Gemma uses 4GB, right?

nolist_policy · 2025-05-24T18:48:05 1748112485

Gemma 3 4B QAT, int4-quantized, from bartowski should barely fit on a 4GB Raspberry Pi, but without the vision encoder.

However, the brand-new Gemma 3n E2B and E4B models might fit with vision.

antirez · 2025-05-24T19:10:35 1748113835

Yep, the Gemma 3 1B would be 815MB, with enough margin for a longer prompt. Probably more realistic.

antirez · 2025-05-24T18:47:26 1748112446

Nope, Gemma 3 and Qwen 3 come in many sizes, including very small ones that, 4-bit quantized, can run on very small systems. Qwen3-0.6B, 1.7B, ... imagine if you quantize those to 4 bits. But you also need space for the KV cache, unless you want to limit the runs to very small prompts.
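
To put rough numbers on the weights-plus-KV-cache point, here is a back-of-the-envelope sketch. The layer count, KV head count, head dimension and context length below are illustrative placeholders, not values taken from any model card, and real quantized files come out larger than the naive bits-per-weight figure because of quantization scales and tensors kept at higher precision.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Naive size of the weights: parameters times bits, converted to bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_tokens: int, bytes_per_value: int = 2) -> int:
    """KV cache size: one key and one value vector per layer, per KV head, per token.

    bytes_per_value=2 assumes the cache is stored in fp16.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value


# Hypothetical small model: 0.6B parameters at 4 bits per weight.
weights = weight_bytes(0.6e9, 4)

# Hypothetical architecture and prompt length (assumptions for illustration only).
cache = kv_cache_bytes(n_layers=28, n_kv_heads=8, head_dim=128, n_tokens=4096)

MIB = 1024 * 1024
print(f"weights:  {weights / MIB:.0f} MiB")
print(f"KV cache: {cache / MIB:.0f} MiB")
print(f"total:    {(weights + cache) / MIB:.0f} MiB")
```

With these placeholder numbers the cache for a 4096-token context is larger than the 4-bit weights themselves, which is why the prompt length matters as much as the model size on a small board.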