Now THIS is the kind of shit I signed up for when AI started to become able to understand images properly: no shitty prompt-based generators that puke out the most generalised version of every motif while draining the life from the whole illustration industry.
It's just good-ass tooling for making cool-ass art. Hell yes! Finally, there is some useful AI tooling that empowers artistic creativity rather than drains it.
Pardon the French; I just think this is too awesome for normal words.
Yep, there's a similar refrain amongst 3D artists, who are begging for AI tools that can actually speed up the tedious parts of their current process, like retopo and UV unwrapping. But all AI researchers keep giving them are tools that take a text prompt or an image and try to automate the entire process from start to finish, with very little control and invariably low-quality results.
There have been some really nice AI tools to generate bump and diffuse maps from photos. So you could photograph a wall and get a detailed, tileable texture with good light scatter and depth.
That's the kind of awesome tech that got me into AI in the first place. But then prompt generators took over everything.
Denoising is another good practical application of AI in 3D: you can save a lot of time without giving up any control by rendering an almost noise-free image and then letting a neural network clean it up. Intel did some good work there with their open-source OIDN library, but then genAI took over, and now all the research focus is on trying to completely replace precise 3D rendering workflows with diffusion slot machines, rather than continuing to develop smarter AI denoisers.
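For anyone curious what that workflow looks like, here's a minimal sketch of feeding a noisy render through OIDN's generic "RT" filter, following the pattern from the library's documented C++ API. The image size is a placeholder and the buffer contents are assumed to come from your own renderer:

```cpp
// Minimal OIDN denoising sketch (CPU device, host buffers).
#include <OpenImageDenoise/oidn.hpp>
#include <iostream>
#include <vector>

int main() {
    const size_t width = 1920, height = 1080;
    std::vector<float> color(width * height * 3);   // noisy beauty pass (RGB floats)
    std::vector<float> output(width * height * 3);  // denoised result

    // ... fill `color` with your low-sample-count render here ...

    oidn::DeviceRef device = oidn::newDevice();      // default (CPU) device
    device.commit();

    oidn::FilterRef filter = device.newFilter("RT"); // generic ray-tracing denoiser
    filter.setImage("color",  color.data(),  oidn::Format::Float3, width, height);
    filter.setImage("output", output.data(), oidn::Format::Float3, width, height);
    filter.set("hdr", true);                         // input is HDR radiance
    filter.commit();
    filter.execute();

    const char* err = nullptr;
    if (device.getError(err) != oidn::Error::None)
        std::cerr << "OIDN error: " << err << std::endl;
}
```

The point is that the renderer stays fully in control: the network only cleans up residual noise in an image you already produced deterministically.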
Because the investors funding development of those AI tools don't want to empower artists and give them more freedom; they want to replace them.
The investors want to make money, and if they make a tool that is usable by more people than just experienced 3D artists who are tired of retopologizing their models, that both empowers many more people and potentially makes them more money.
Aside from that, it's impossible for tools to replace artists. Did cameras replace painting? I'm sure they reduced the demand for paintings, but if you want to create art and paint is your chosen medium, it has never been easier. If you want to create art and 3D models are your chosen medium, the existence of AI tools for generating 3D models from a prompt doesn't stop you. However, if you want to create a game and you need a 3D model of a rock or something, you're not trying to make "art" with that rock; you're trying to make a game, and a 3D model is just something you need to do that.
There's a ton of room for using today's ML techniques to greatly simplify photo editing. The problem is, these are not billion-dollar ideas. You're not gonna raise a lot of money at crazy valuations by proposing to build a tool for relighting scenes or removing unwanted objects from a photo, especially since there is a good chance that Google, Apple, or Adobe will just borrow your idea if it pans out.
On the other hand, you can raise a lot of money if you promise to render an entire industry or an entire class of human labor obsolete.
The end result is that far fewer people are working on ML-based dust or noise removal than on tools that are generating made-up images or videos from scratch.
I share your excitement for this tool that assists artists. However, I don't share the same disdain for prompt generators.
I find it enlightening to view it in the context of coding.
GitHub Copilot assists programmers, while ChatGPT replaces the entire process. There are pros and cons though:
GitHub Copilot is hard to use for non-programmers, but can be used to assist in the creation of complex programs.
ChatGPT is easy to use for non-programmers, but is usually restricted to making simple scripts.
However, this doesn't mean that ChatGPT is useless for professional programmers either; it's handy if you just need to make something simple.
I think a similar dynamic happens in art. Both types of tools are awesome; they're just for different demographics and have different limitations.
For example, using the coding analogy: MidJourney is like ChatGPT. Easy to use, but hard to control. Good for random people. InvokeAI, Generative Fill, and this new tool are like Copilot. Hard to use for non-artists, but easier to control and customise. Good for artists.
However, I do find it frustrating how most of the funding in AI art tools goes towards the easy-to-use side instead of the easy-to-control side (this imbalance doesn't seem to exist in coding, where Copilot is better developed than ChatGPT-style coding). More funding and development for the easy-to-control type would be very welcome indeed!
(Note: ControlNet is probably a good example of easy-to-control. There's a very high skill ceiling in using Stable Diffusion right now.)
Good analogy. Yes, controllability is severely lacking, which is what makes diffusion models a very bad tool for artists. The current tools, even Photoshop's best attempt to implement them as a tool (smart infill), are situational at best. Artists need controllable specialized tools that simplify annoying operations, not prompt generators.
As a programmer, I find Copilot a pretty decent tool, thanks to its good controllability. ChatGPT is less so, but it is decent for finding the right keywords or libraries I can look up later.
Except this is explicitly not AI, nor is it even tangentially related to AI. This is a normal graphics algorithm, the kind you get from really smart people working on render-pipeline maths.
It's not a deep neural network, but it is a machine learning model. In very simple terms, it minimizes a loss by refining an estimated mesh, which makes it about as much machine learning as old-school KNN or SVM.
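To make the "minimizes a loss" point concrete, here's a purely illustrative sketch, not the tool's actual code: plain gradient descent nudging mesh vertex positions toward target positions to reduce a squared-error loss, with no neural network anywhere. The vertices, targets, step size, and iteration count are all made up for the example:

```cpp
// Toy "refine a mesh by minimizing a loss" example: gradient descent on vertex
// positions against a squared-error objective. No deep net involved.
#include <cstdio>
#include <vector>

struct Vec3 { float x, y, z; };

int main() {
    // Hypothetical coarse mesh estimate and the positions we want it to match.
    std::vector<Vec3> verts   = {{0.0f, 0.0f, 0.0f}, {1.0f, 0.0f, 0.0f}, {0.0f, 1.0f, 0.0f}};
    std::vector<Vec3> targets = {{0.1f, 0.0f, 0.2f}, {0.9f, 0.1f, 0.0f}, {0.0f, 1.1f, 0.1f}};

    const float lr = 0.1f; // step size
    for (int iter = 0; iter < 100; ++iter) {
        float loss = 0.0f;
        for (size_t i = 0; i < verts.size(); ++i) {
            // Per-vertex squared error; its gradient w.r.t. the vertex is 2 * (v - t).
            Vec3 d = {verts[i].x - targets[i].x,
                      verts[i].y - targets[i].y,
                      verts[i].z - targets[i].z};
            loss += d.x * d.x + d.y * d.y + d.z * d.z;
            verts[i].x -= lr * 2.0f * d.x;
            verts[i].y -= lr * 2.0f * d.y;
            verts[i].z -= lr * 2.0f * d.z;
        }
        if (iter % 25 == 0) std::printf("iter %d, loss %f\n", iter, loss);
    }
}
```

That's the sense in which optimization-based methods sit in the same bucket as KNN or SVM: a loss, a parameter update rule, and no deep network.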
AI means nothing as a word; it is basically as descriptive as "smart" or "complicated". But yes, it's a very clever algorithm invented by clever people that is finding some nice applications.
Whether you agree with what it means or not, the word AI most definitely has a meaning today, more so than ever, and that meaning is not what we (myself included; I have a master's in AI from the before-times) used to use it for. Today, AI exclusively refers to (extremely) large neural networks.
If that is the definition, then I agree; calling this AI would downplay how clever this algorithm really is.
But most marketing firms disagree. AI has now absorbed the terms "big data" and "algorithm" in many places. The new Ryzen AI processor, Apple Intelligence, NVIDIA AI upscaling, and the HP AI printer all refer to much smaller models or algorithms.
Let's say you want to rotate a cat's head in an existing picture by 5 degrees, as in the most basic example suggested here. No prompt will reliably do that.
A mesh-transform tool and some brush touchups could. Or this tool could. Diffusion models are too uncontrollable, even in the most basic examples, to be meaningfully useful for artists.