
Very small: can run on the edge, so something like a Raspberry Pi can make basic decisions for your appliance even if disconnected from the internet (see the sketch after this list). Example: given some time series parameters and instructions, decide whether to water the plants or not; or vision models that can watch a camera and describe what they see in a basic way, ...

Small: runs on an average laptop not optimized for LLM inference, like Gemma 3 4B.

Medium: runs on a very high-spec computer that people can buy for less than $5k: 30B or 70B dense models, or larger MoEs.

Large: Models that big LLM providers sell as "mini", "flash", ...

Extra Large / SOTA: Gemini 2.5 Pro, Claude 4 Opus, OpenAI o3, ...
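
As a rough illustration of the "very small" case, here's a minimal sketch of the watering decision in Python, assuming the llama-cpp-python bindings and a 4-bit quantized Gemma GGUF already on disk; the file name, sensor values, and prompt wording are hypothetical, not a tested recipe:

    # pip install llama-cpp-python
    from llama_cpp import Llama

    # Hypothetical 4-bit quantized Gemma file; any small instruct GGUF would do.
    llm = Llama(model_path="gemma-3-1b-it-Q4_K_M.gguf", n_ctx=2048, verbose=False)

    # Toy sensor history; a real setup would read these from the appliance.
    moisture = [41, 38, 35, 33]   # hourly soil moisture, %
    temps = [24, 27, 30, 31]      # hourly temperature, C

    prompt = (
        "You control a garden irrigation valve.\n"
        f"Last 4 hourly soil moisture readings (%): {moisture}\n"
        f"Last 4 hourly temperatures (C): {temps}\n"
        "No rain is forecast. Reply with exactly WATER or SKIP."
    )

    out = llm(prompt, max_tokens=4, temperature=0)
    answer = out["choices"][0]["text"].strip().upper()
    print("watering" if "WATER" in answer else "skipping")

The point is just that the "program" is the prompt: changing the decision logic means editing text, not retraining anything.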



I'm not sure if you're implying that very small language models would be run in your raspberry pi example, but for use cases like the time series one, wouldn't something like an LSTM or TiDE architecture make more sense than a language model?

These are typically small and performant both in compute and accuracy/utility from what I've seen.

I think with all the hype at the moment, AI/ML has sometimes become too synonymous with LLMs.


Sure, if you have a specific need you can specialize some NN with the right architecture: collect the data, run the training several times, test the performance, ... Or: you can download an already-built LLM and write a prompt.


So one of the use cases we're serving in production is predicting energy consumption for a home. Whilst I've not tried it, I'm very confident that giving an LLM the historical consumption and asking it to predict future consumption will underperform compared to our forecasting model. Our model also requires several orders of magnitude less compute than an LLM.


What zero-shot model would you suggest for that task on an RPi? A temporal fusion thing?


The small Gemma 3 and Qwen 3 models can do wonders for simple tasks, used as a bag of algorithms.


Those would use more RAM than most RPis have, wouldn't they? Gemma uses 4GB, right?


Gemma 3 4B QAT int4 quantized from bartowski should barely fit on a 4GB Raspberry Pi, but without the vision encoder.

However the brand-new Gemma 3n E2B and E4B models might fit with vision.


Yep, the Gemma 3 1B would be 815MB, with enough margin for a longer prompt. Probably more realistic.


Nope, Gemma 3 and Qwen 3 come in many sizes, including very small ones that, 4-bit quantized, can run on very small systems: Qwen3-0.6B, 1.7B, ... imagine those quantized to 4 bits. But you also need space for the KV cache, if you don't want to limit runs to very small prompts.
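
Rough back-of-the-envelope numbers for that, as a sketch; the layer count and head sizes below are assumed values roughly in line with a ~1B decoder model, and real GGUF files add some overhead on top:

    # Rough RAM estimate: 4-bit weights plus an fp16 KV cache.
    def weights_gb(params_billion, bits=4):
        return params_billion * 1e9 * bits / 8 / 1e9

    def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_val=2):
        # 2x for keys and values, fp16 entries.
        return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_val / 1e9

    w = weights_gb(1.0)                          # ~0.5 GB for a 1B model at 4 bits
    kv = kv_cache_gb(26, 4, 128, ctx_len=4096)   # ~0.2 GB at a 4k context
    print(f"weights ~{w:.2f} GB, KV cache ~{kv:.2f} GB")

So even on a 1-2 GB board the weights aren't the whole story; the context length you allow drives a good chunk of the footprint.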


He's talking about general purpose zero shot models.


Why in the world do you need such sophistication to know whether to water the plants or not?


When you have a golden hammer everything starts to look like a nail


this


There are places where: a) weather predictions are unreliable, b) water is scarce. Just making the right decision about what hour to water at is a huge monthly saving of water.


None of which need AI hype crap. Some humidity sensors, photosensors, etc. will do the job.


Need is a very strong word. We don't need a lot of what we have today.

But as a hobbyist I would rather prompt an LLM than learn a bunch of algorithms and sensor-reading details. It's also very similar to how I would think about the problem, making it easier to debug.


I think there are two schools of thought. One: the models will get so big that everyone everywhere will use them for everything, and the providers will make lots of money on API calls. Two: inference will become so computationally cheap that running models on the edge will cost nothing, and so an LLM will be in everything. Then every computational device will have one, as long as you pay a license fee to the people who trained them.


Or a farmer


In a greenhouse operation with high-value crops. Automated control technologies in those applications have been around for decades, and AI is competing with today's sophisticated control technology designed, operated, and continually improved by agriculturists with detailed site-specific knowledge of water (quality, availability, etc.), cultivars, markets, disease pressures, and so on. Given poor data quality and availability, an existing, finely tuned, functioning control system, and the vagaries of managing dynamic living systems, the marginal improvements AI can make are…tiny.

The solution for water-constrained operations in the Americas is to move to a location with more water, not AI.

For field crops…in the Americas, land and water are too cheap and crop prices too low for AI optimization to pay off in the present era. The Americas (10% of world population) could meet 70% of world food demand if pressed with today's technologies…40% without breaking a sweat. The Americas are blessed.

Talk to the Saudis, Israel, etc., but even there you will lose more production by interfering with the motivations, engagement levels and cultures of working farmers than can be gained by optimizing with any complex, opaque technological scheme, AI or no. New cultivars, new chemicals, even new machinery…few problems (but see India for counterexamples). Changing millennia of farming practice with expensive, not-locally-maintainable, opaque technology…just no. A great truth learned over the last 70 years of development.


Does it have to be computed at the edge by every person?


Just as in the other comment, "have to" is a very strong phrase. But there are benefits to it: a) adaptability to local weather patterns, b) no WiFi access on large properties.


I see. I guess it all boils down to how low power you can make this.

Keep in mind that there are other wireless communication systems, long-range and low-power, specifically designed to handle this scenario.


In this case, "sophistication" meaning throwing insane amounts of compute power and data at the problem? In older times we'd probably call that "brute forcing".


Today, I asked a colleague to pass me a pen. Was that an egregiously simple task for such a powerful intelligence?


For “very small”, I would add “can be passively cooled” as a criterion.


> Example: given some time series parameters and instructions, decide whether to water the plants or not

How is that a "language model"?


Is "language model" being used here to mean a neural net with transformers and attention that takes in a series of tokens and outputs a prediction as a value?

Working with time series data would work in that case.



> Example: given some time series parameters and instructions, decide whether to water the plants or not; or vision models that can watch a camera and describe what they see in a basic way, ...

This is the problem I have with the general discourse around "AI", even on Hacker News of all places. Nothing you listed is an example of a *language model*.

All of those can be implemented as a simple "if", a decision tree, a decision table, or, for the camera and time series prediction examples, actual ML.

Using an LLM is not just ridiculous here but totally the wrong fit and a waste of resources.
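
For contrast, a minimal sketch of the "simple if" version mentioned above; the thresholds and sensor values are made up:

    # Plain threshold rule: no model, no network, microcontroller-friendly.
    def should_water(soil_moisture_pct: float, rain_forecast_mm: float) -> bool:
        DRY_THRESHOLD = 35.0   # assumed calibration value for this soil
        RAIN_CUTOFF = 2.0      # skip if meaningful rain is expected
        return soil_moisture_pct < DRY_THRESHOLD and rain_forecast_mm < RAIN_CUTOFF

    print(should_water(31.0, 0.0))   # True: dry soil, no rain coming
    print(should_water(31.0, 5.0))   # False: rain expected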


> Using an LLM is not just ridiculous here but totally the wrong fit and a waste of resources.

Time and labor are resources too. There's a whole host of problems where "good enough" is tremendously valuable.


What do we call the models beyond extra large, which are so big they can't be served publicly because their inference cost is too high? Do such models exist?



