Asymptotic improvements are flattening the cost curves so fast that AI regulation might become practically meaningless by the end of the year. If you want unregulated output you'll have tons of offshore models to choose from.
The risk is that the good guys end up being the only ones hampered by it. Hopefully it won't be so large a burden that the bad guys and especially the so-so guys (those with a real chance, e.g. Alibaba) get a massive leg up.
> Asymptotic improvements are flattening the cost curves so fast that AI regulation might become practically meaningless by the end of the year.
Awesome. This will mean actually good open-source models, not just API endpoints from big tech that are unusable because of dataset censorship and alignment bias (SD3, Gemini).
In other words, big tech will actually need to make good stuff to be competitive, not trash protected by a granted monopoly.
Those improvements are definitely real, but we also have pretty solidly established and confirmed scaling laws by now. Until someone utterly breaks those, big players will always have an edge, simply because they can spend more compute on training and inference. The only way to change this is a new architecture that benefits more from intelligent adjustments in a space that cannot be searched efficiently with raw compute. And even then, we are not far from the point where these models could try out those adjustments themselves. So by the time you get to tune your own AGI at home the way you could tutor a human, corporations might have millions of them improving themselves into something you could never achieve on your own.
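To put rough numbers on the compute edge, here is a toy Chinchilla-style scaling-law calculation; the coefficients are the commonly cited Hoffmann et al. (2022) fits and the parameter/token counts are made-up examples, so treat it as an illustration only.

```python
# Toy Chinchilla-style scaling law (Hoffmann et al. 2022). The coefficients are
# the commonly cited published fits; the example model sizes below are invented.
def chinchilla_loss(params: float, tokens: float) -> float:
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / params**alpha + B / tokens**beta

# A lab that can afford ~100x more compute simply sits further down the curve:
home_run = chinchilla_loss(params=1e9,  tokens=2e10)    # ~1B params, 20B tokens
lab_run  = chinchilla_loss(params=7e10, tokens=1.4e12)  # ~70B params, 1.4T tokens
print(f"home: {home_run:.3f} loss, big lab: {lab_run:.3f} loss")
```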
We're still in phase 1, where human-directed improvement has the highest potential. Papers are still getting published and the cells interlinked. (I'm not sure the scaling picture is at all clear, given that papers like this can turn up casually with 15x savings, but let's put that aside for now.)
Phase 2 begins when patents break stealth, unsettling the picture. If some patent impairs research or operations in IP-solid countries, the lower-level stuff might move to local inference, and maybe some minor Pirate Bay-style outfits.
Phase 3 begins when the costly research goes dark (well, darker.) Everyone is Apple now. The research papers are replaced by white papers, then by PR communiqués.
Phase 4 begins when the AI AI researchers take over. The old AI researchers turn into their managers.
Some of the path is compute-bound. Some of it is IP-, luck-, and genius-bound.
You can't really stop it at this point anyway, short of locking down any code that resembles AI at the processor level to only signed and approved models, making it illegal to own hardware from before the lockdown, and destroying any that's seized.
I almost commented the same thing. Framing things as "good/bad/so-so" is kind of moving the target. If we focus on who might use a model, rather than on whether the model accurately represents reality and altruistically aids humans, we will lose sight of the really valuable things in life. The reality is that I don't think people are good/so-so/bad: humans are equipped with extremely complex and diverse adaptive systems with near-limitless capabilities. Sure, I am just re-framing, but from my perspective we should not be reducing humans to "good/bad/so-so."
Not when we're talking asymptotically. The linked paper for instance claims 14- to 118-fold cost reductions. 1-2 GPU generations from now you'll train this model for $0.12.
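For a back-of-the-envelope version of that projection (all the numbers below are placeholder assumptions except the 14-118x range, and the per-generation gain is assumed, not measured):

```python
# Toy cost projection. current_cost and per_gen_gain are placeholder assumptions;
# only the 14-118x reduction range comes from the paper discussed in the thread.
current_cost = 28_000   # placeholder: roughly a PixArt-alpha-scale training budget
reduction    = 118      # upper end of the claimed 14-118x cost reduction
per_gen_gain = 2.0      # assumed perf-per-dollar improvement per GPU generation

for gens in range(3):
    projected = current_cost / (reduction * per_gen_gain**gens)
    print(f"after {gens} GPU generation(s): ~${projected:,.2f}")
```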
People are casually dropping thousands on cloud GPUs to make random fine-tunes over at r/localllama; the threshold will be met far sooner. Plus, datacenters will sell off their collections of A100s, and eventually H100s, once the cards hit EoL by their standards.
This kind of research is great for reducing training costs as well as enabling more people to experiment with training large models. Hopefully in 5-10 years we'll be able to train a model on par with SD 1.5 on consumer GPUs, since that would be great for teaching model development.
I'm pretty sure we are looking at something like 12 months. Not 5-10 years.
PixArt and this paper are good data points. Even just another 50x reduction in cost would easily put it within reach of consumer hardware, and this paper already claims over a 100x reduction.
I hope that somewhere around that period of time we will have AI-based "game" engines working at 30-40fps at 4K (of course with upscaling). I mean it might not be game engines per se, but universal, interactive pipelines for audio-visual content creation/consumption. Because right now I do not see any hope of such engines due to the number of models involved and the latencies this implies.
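To see why the latency concern bites, here is a rough frame-budget sketch; the pipeline stages and their latencies are made-up assumptions, just to show the arithmetic.

```python
# Rough frame-time budget for a hypothetical multi-model neural "game engine".
# Every stage and latency below is an assumption for illustration only.
target_fps = 30
frame_budget_ms = 1000 / target_fps   # ~33 ms per frame at 30 fps

stage_latency_ms = {                  # hypothetical pipeline stages
    "world/state model": 12.0,
    "frame generator":   18.0,
    "audio model":        4.0,
    "4K upscaler":        6.0,
}

total = sum(stage_latency_ms.values())
verdict = "fits" if total <= frame_budget_ms else "blows"
print(f"budget {frame_budget_ms:.1f} ms, pipeline {total:.1f} ms -> {verdict} the 30 fps target")
```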
I'm really hoping for this as well! I'm a big believer that neural rendering pipelines will overtake the traditional push-tris-to-the-GPU approach we've essentially been using since Descent.
Getting parity with SD 1.5 should require a similarly comprehensive data set, which seems a lot harder to source than a consumer GPU. Especially now that we've got the AI equivalent of pre-/post-nuclear steel.
Given how little artistic data humans need, there are probably breakthroughs coming that will reduce the size of the data set needed. Or make it so that a lot of the data required is more generic (like how a human artist needs vast amounts of audio-visual data from walking around every day, but maybe as little as a few megabytes to go from nothing to copying a new style and subject - then we can have a curated open source "highlights of the first 20 years of life" data set that everyone uses for basic training).
> Getting parity with SD 1.5 should require a similarly comprehensive data set, which seems a lot harder to source
Wasn't SD1.5 trained on LAION? So we know what it was and you could recreate it.
Although I thought LAION was why SD 1.5 is kinda ugly at base settings: LAION is just random images of mixed content and quality, not curated, aesthetic, high-quality images.
The LAION datasets don't contain actual images, but URLs pointing to images. Due to link rot and deliberate scraper-blocking it may be difficult to download LAION images to retrain a model to match SD 1.5.
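One way to get a feel for the rot is to sample URLs from a LAION metadata shard and see how many still resolve; the parquet path below is a placeholder, and I'm assuming the shard exposes a "URL" column as the public LAION releases do.

```python
# Rough link-rot check over a sample of LAION URLs. The parquet path is a
# placeholder; LAION metadata shards contain URL + caption columns, not images.
import pandas as pd
import requests

df = pd.read_parquet("laion-metadata-shard-00000.parquet")   # placeholder path
sample = df["URL"].dropna().sample(200, random_state=0)

alive = 0
for url in sample:
    try:
        resp = requests.head(url, timeout=5, allow_redirects=True)
        alive += resp.status_code == 200
    except requests.RequestException:
        pass  # dead host, timeout, blocked scraper, etc.

print(f"{alive}/{len(sample)} sampled URLs still resolve")
```

(Tools like img2dataset are what people actually use for the bulk downloading.)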
Reminds me of PixArt-α, which was also trained on a similarly tiny budget ($28,000). [0] How good is their result, though? Training a toy model is one thing; making something usable (let alone competitive) is another.
Edit: they do have comparisons in the paper, and PixArt-α seems to be... more coherent?
One thing I've wondered about is fine-tuning a large model from multiple LoRAs. If the model doesn't fit in your VRAM, you can train a LoRA, apply it to the model, train another LoRA from the same data, apply it, and so on. Iterative low-rank parameter updates. Would that work?
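A rough sketch of that loop with the Hugging Face peft library is below; train_one_pass and the model name are hypothetical stand-ins for your own training loop and base model, and whether stacked low-rank updates can actually approximate a full fine-tune is exactly the open question.

```python
# Sketch of iterative LoRA fine-tuning: train an adapter, merge it into the base
# weights, then train a fresh adapter on the merged model. train_one_pass and the
# model name are hypothetical placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("some-base-model")  # placeholder

for round_idx in range(3):                      # several low-rank passes over the same data
    config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
    peft_model = get_peft_model(model, config)  # wrap the (already merged) base model
    train_one_pass(peft_model)                  # hypothetical: your training loop goes here
    model = peft_model.merge_and_unload()       # fold this round's update into the weights
```

The accumulated delta can have rank up to k·r after k passes, so in principle it's more expressive than a single LoRA; whether training actually finds that extra capacity is the empirical question.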