AI and the Future of Pixel Art (pixelparmesan.com)
173 points by aleyan on Nov 9, 2022 | 100 comments



As someone who draws, there are some obvious aspects where AI generation just works:

- generating details and texture, which photobashing was already used for; this mostly solves the licensing problem with that practice

- generating random inspiration boards, for which image search was used (again, it mostly solves the licensing problem)

- generating derivative stuff, e.g. typical game portraits or props

In practice, only the third point is generally discussed, because it tremendously lowers the barrier to entry for generating images for people without any skills. It's as if you could take screenshots from other content, clean them up, and be free to use them.

Whereas it essentially does not work for:

- cartoony generation, which relies too much on line consistency, visual clarity and abstraction

- concept design: not the flashy 10-minute speedpaint type, but the kind where you have to combine ideas in meaningful ways, and in particular hard-surface design, which requires good 3D thinking and consistency of the whole

These fundamental flaws are omnipresent, but they can be hidden by certain styles where things are implied by color blobs, concealed by stylized brushstrokes, or simply buried under an overflow of details (something Midjourney is very good at).

All in all, it feels like AI is a danger for people at the bottom of the professional hierarchy, but will elevate people at the top, whose work cannot be replaced. In other words, it threatens the people who are better described as "artisans" than artists, who will end up taking a prompt and simply cleaning it up.

In particular, drawing has something like the 20/80 rule, where all the creative input is in the first 20% and the rest is 'rendering', a very mechanical task which you can mostly do with your brain turned off. As Yumenoley put it, "it was a mistake to let the AI do the interesting part".


Cartoony generation is probably a matter of training on the appropriate source material, for what it’s worth.

https://dreambooth.github.io shows a glimpse of the future. You’ll be able to upload a few drawings that you want to emulate (e.g. Mickey Mouse), and then you can give it a specific prompt (e.g. Mickey Mouse doing a handstand).

You’re probably right about the consistency of 3D concepts, though. On the other hand, I was going to say “If you need a specific table, AI might not be able to help” — but again, dreambooth shows that we might be able to upload a few photos of a certain table, and it’ll take care of the details.

Give it a few years. :)

I think you're spot on that AI will be an incredible tool for artisans. I used it to make some video game music: https://soundcloud.com/theshawwn/sets/ai-generated-videogame... Even though I can't play any instruments very well, I was able to craft each piece uniquely. (My favorite is "Crossing the Channel", which has a strange rhythm: I'm pretty sure the AI made a mistake at the beginning, then decided "actually, this isn't a mistake" and extrapolated a song around it, which turned out to sound cool. A bit like a guitarist doing improv.)


I think the main issue with cartoony generation is that professional animators are not being tasked with "here is famous character Mickey Mouse, omnipresent in your source data, now draw him on the moon wearing a hat". They're being tasked with something where the source material is a couple of concept images, possibly at lower fidelity than the desired end output, and the task might well be much more specific, with art directors considerably more pedantic about the quality and consistency of the lines than the average hacker typing in magic incantations to get something resembling fantasy art for their blog. Of course, if you want to make memes of Mickey Mouse wearing a hat on the moon, or Mickey Mouse at the White House with a hammer and sickle, or an anime-style Mickey Mouse, AI cartoons may well be plenty good enough already.

There's a role for AI in filling in gaps, but illustrators remain essential for creating the basic style to extrapolate from, and even more so in professional-quality work. (And to some extent, other procedural generation techniques could fill gaps before cutting-edge NNs: the wildebeest stampede in the 1994 Lion King, for example, was procedurally generated from a handful of models.)


DreamBooth works with just 4 input images.


Agree that dreambooth is the future. I'm building an app that lets users with no technical background train their own dreambooth models for $2-$4: https://synapticpaint.com/dreambooth/info/ They can also share their trained models for others to use.

Here are some potential use cases:

- for fun (giving yourself a makeover, inserting yourself into famous movies)

- cheaper way to get studio photos (wedding photos, professional headshots for actors/models)

- easy way to create marketing assets. For example, if you own an Etsy store and don't want to engage a marketing studio, you can just create a dreambooth model of your necklace or whatever and generate high-quality product photos

There are probably a bunch of other use cases. I think making this easy (no figuring out how to do a git pull or rent a GPU) plus the community sharing aspects will make this technology a lot more accessible to artists and general users, and then the users will be doing all kinds of cool things with it organically.


> All in all, it feels like AI is a danger for people at the bottom of the professional hierarchy, but will elevate people at the top, whose work cannot be replaced.

The question is whether it instead puts the people in the middle under pressure, as it helps the people at the bottom produce better quality.

I can observe such a shift in translation. DeepL is not perfect, but it allows me to improve my own English texts considerably. Even if I were to give my text to a professional to polish, she or he would have much less to do than before, when I was not assisted by DeepL.


One thing I think needs to be looked at a bit more deeply: AI is not stopping where it is today. It's going to get better, and if it targets the lowest part of the market now, what's it going to do in 10 years?


Agreed. AI art will continue to improve.

This reminds me of the chess computers from 30 years ago. Experts were convinced computers would never beat grandmaster-level humans, based on the state of the art of the time. They never took into account all the future advancements in hardware and software.


The chess experts were wrong qualitatively. They were saying that best play was about intuition and feel, which computational power would never replicate regardless of its scale. (Data on Star Trek always lost chess matches, implausibly.) It turned out that brute-force calculation thirty moves deep does in fact outperform anything a human can foresee.

The same thing happened with both Go and StarCraft in the following decades: the experts said computers couldn't replicate enough spatial feel, and then they did. And now it's happening with AI art. Enough computational power and a sufficiently well-trained neural network can indeed exceed anything a human can do.

AI has been roughly doubling in performance every year or two, for quite some time. We just never noticed when it went from 0.0001% to 0.0002% of human capability. This is the year that it doubles from 10% to 20% and everybody notices. And there's not a lot of doublings left until it shoots past 100%.


I eagerly await a fully ML generated ballet choreography with a dozen participants. Maybe 15 years from now.

The super hard problem is driving a robotic body, vs rendering an animation of the above.


> All in all, it feels like AI is a danger for people at the bottom of the professional hierarchy, but will elevate people at the top, whose work cannot be replaced. In other words, it threatens the people who are better described as "artisans" than artists, who will end up taking a prompt and simply cleaning it up.

I am an artist who is in a place where she gets paid decently to draw whatever the heck she feels like, with little regard for commercial viability.

The jobs you're dismissing as mere "artisans" are the ones that let me get to a place where I could spend most of my waking hours honing my drawing skills and still pay my rent. Every time you draw a thing, you get a little better at drawing that thing, and a little better at drawing in general. This is how you master your craft. If you're part of a studio, even better: there are older artists above you, and tons of opportunities for them to critique your work and open your eyes to the major flaws in it you can't see yet. If AI art fills all those niches for expanding on someone else's prompts, budding pros will find it a lot harder to get to the point where this virtuous circle of being paid to practice their craft gets started.

> In particular, drawing has something like the 20/80 rule, where all the creative input is in the first 20% and the rest is 'rendering', a very mechanical task which you can mostly do with your brain turned off.

This really depends on your process. A big part of becoming a pro, in my experience, was finding ways to make the boring parts happen a lot faster, and giving myself more opportunities to do the fun parts at every stage of the piece. That said there's also a pleasure to be found in putting on some good music, turning your brain off, and rendering the heck out of something! It's that much-coveted "flow state" people love to talk about.


I think all in all AI is a danger for everyone in the industry and there's no reason to sugar coat it.

Pure economics makes any hoped-for advantage from the AI-generated medium irrelevant for anyone with a livelihood in the field. Being the creative genius may allow you to retain your employment, but rest assured, your income growth will decline and the next generation coming after you will be making less. The value of graphics will decline, and so the money earned by the people making them will also decline. You're making truly creative work? Great, a company can just take your rendering and use it as the basis for an AI-generated image to remove any licensing requirement. Good luck trying to prove original ownership. And who needs creativity anyway? How often are graphics used in commercial/retail/media enterprises really breaking the mold? Whatever shortcomings have been mentioned are already being mitigated by add-on AI systems (Tencent's ARC, for instance), so there should be an expectation of rapid improvement in the coming years.

People employed in the field will likely adapt away from non-lucrative image generating work eventually, but the transition will be pretty painful and the overall effect will likely constrain incomes for creative work in general.


There are some fundamental pieces missing in AI art generation that, when solved, will completely change the game. Forever. The most fundamental, I think, is the capability for AI to have some kind of memory, or, more technically, a way for the AI to do "character style transfer" effectively. I think this is probably possible; it's akin to making deepfakes, but for drawn/photographic art.

With this tool, suddenly the effort of making a coherent series of compositions will drop dramatically, which will change the game, especially in the video game industry. I think there are a lot of programmers out there capable of making great games who might lack the resources to fully complete their visions because of the assets their games need.

It seems to me like the approach stable diffusion has taken has dramatically increased interest and utility of these tools. So I'm hoping they follow similar lines for other types of AI. Every week I'm reading of a new novel use for these generators that I hadn't really considered before.


It sounds like you're literally describing Dreambooth. You can easily and quickly train models for certain styles like Disney characters, certain illustrators, specific artists and even tailor the model to a specific person so you can produce countless pictures of them in different poses, locations and clothes. This is already available.


Indeed: https://dreambooth.github.io

Note that training is extremely expensive, and is beyond the capabilities of most end users. Here are the details of their training method:

> Given ~3-5 images of a subject we fine tune a text-to-image diffusion in two steps: (a) fine tuning the low-resolution text-to-image model with the input images paired with a text prompt containing a unique identifier and the name of the class the subject belongs to (e.g., "A photo of a [T] dog”), in parallel, we apply a class-specific prior preservation loss, which leverages the semantic prior that the model has on the class and encourages it to generate diverse instances belong to the subject's class by injecting the class name in the text prompt (e.g., "A photo of a dog”). (b) fine-tuning the super resolution components with pairs of low-resolution and high-resolution images taken from our input images set, which enables us to maintain high-fidelity to small details of the subject.

Each fine-tuned model is a copy of the original model. So if the model is 10GB, the fine tuned version will be a separate 10GB file. That might not sound like a lot, but it quickly adds up.
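For intuition, here is roughly what step (a) of the quoted recipe boils down to. This is a toy sketch in PyTorch (a stand-in denoiser instead of a real UNet, no noise scheduler or timesteps), not the paper's actual training code:

```python
import torch
import torch.nn.functional as F

# Toy stand-in for a text-conditioned denoiser; the real thing is the
# large UNet inside a text-to-image diffusion model.
class ToyDenoiser(torch.nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = torch.nn.Linear(dim * 2, dim)

    def forward(self, noisy_latent, text_emb):
        return self.net(torch.cat([noisy_latent, text_emb], dim=-1))

def dreambooth_step(model, subject_latents, subject_prompt_emb,
                    class_latents, class_prompt_emb, prior_weight=1.0):
    """One step of fine-tuning with class-specific prior preservation.

    subject_*: the ~3-5 images of the subject ("a photo of [T] dog")
    class_*:   images the frozen model generated for the bare class
               prompt ("a photo of a dog") before fine-tuning began
    """
    def denoise_loss(latents, prompt_emb):
        noise = torch.randn_like(latents)
        noisy = latents + noise  # toy forward process; real code adds
                                 # scheduled noise at a sampled timestep
        return F.mse_loss(model(noisy, prompt_emb), noise)

    # Reconstruction loss on the subject images...
    loss = denoise_loss(subject_latents, subject_prompt_emb)
    # ...plus the prior-preservation term, which keeps the model able
    # to generate diverse, ordinary members of the class.
    return loss + prior_weight * denoise_loss(class_latents, class_prompt_emb)

dim = 64
model = ToyDenoiser(dim)
loss = dreambooth_step(model,
                       torch.randn(4, dim), torch.randn(4, dim),
                       torch.randn(4, dim), torch.randn(4, dim))
loss.backward()
```

The second term is what stops "a photo of a dog" from collapsing into photos of your dog.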

In this case, end users are artists. One could imagine a cloud-based art program which will fine tune on demand. That certainly seems like a good startup idea.


The Dreambooth extension for automatic1111 just came out. It can even be run on a CPU. I haven't tried that extension myself yet, but I followed a YouTube video last week and trained it using a Colab in a few minutes (of training; getting the hang of the whole process took maybe 20 or 30 minutes, including watching the video). I think dreambooth is already perfectly within reach of regular users, and is already being used for... ahem... questionable purposes that took thousands of images and days and days of tweaking and training for "traditional deepfakes" just 6 months ago. The pace of advancement is breathtaking; it's literally impossible to keep up even if you spend 100% of your time on this.


Thanks for the tip about automatic1111 with dreambooth!

Do you happen to have a link to that YouTube video you followed?


It was actually two, now that I look in my bookmarks: https://www.youtube.com/watch?v=w6PTviOCYQY and https://www.youtube.com/watch?v=FaLTztGGueQ . The first one is about a month old and was already about running Dreambooth locally. So to be perfectly honest, I'm not really sure any more which info I got from which video. Probably most of it from the second one. It's worth it for the thumbnail alone; although I haven't been able to get results at that level yet, holy smokes is it impressive.


Not OP, but this is the fastest tutorial I’ve found for getting dreambooth up and running in automatic1111:

https://youtu.be/_GmGnMO8aGs


"Extremely expensive" - you can do dreambooth on a free colab with recent optimizations.

This area is moving really fast, one-click solutions are already being created.

People are also averaging the weights of multiple dreambooth models to create new models, and it works surprisingly well.
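The averaging itself is almost trivially simple; a minimal sketch, assuming two checkpoints of the same architecture saved as plain PyTorch state dicts (real SD checkpoints may nest the weights under a "state_dict" key):

```python
import torch

def merge_checkpoints(path_a, path_b, alpha=0.5, out_path="merged.ckpt"):
    """Linearly interpolate the weights of two same-architecture models.

    alpha=0.5 is a plain average; other values blend the two models'
    styles in different proportions.
    """
    a = torch.load(path_a, map_location="cpu")
    b = torch.load(path_b, map_location="cpu")
    merged = {k: alpha * a[k] + (1.0 - alpha) * b[k] for k in a}
    torch.save(merged, out_path)

# e.g. merge_checkpoints("style_a.ckpt", "style_b.ckpt", alpha=0.3)
```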

The stable diffusion reddit is a good place to see how all of this is developing.


Stable Diffusion is probably far from the state of the art, but good $DEITY did it open the floodgates! Just watching the open source scene evolving as a bystander is interesting.


What kind of GPU would you need to run this locally? (the training I mean)


Using offload to CPU (with DeepSpeed), one can get away with an 8GB GPU. I haven't tried training myself yet, but there are reports of it working with 8GB of VRAM (here, for example: https://www.reddit.com/r/StableDiffusion/comments/xwdj79/dre...)

I have an RTX 2070 with 8GB and it has been working quite well for me. However, there are always models that will not fit. For those, running on the CPU with potential NVMe offload is not that bad. For example, a single inference on BLOOM-7B (30GB of RAM required just for the weights) on a 32GB RAM machine takes about 30s (it has to offload a few GB to NVMe). This is on a Zen 3 Ryzen with no GPU use. I can't wait to try CPUs that support AVX-512.
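For reference, both tricks are close to one-liners with the Hugging Face stack these days; a hedged sketch (model names are just examples):

```python
import torch
from diffusers import StableDiffusionPipeline
from transformers import AutoModelForCausalLM

# Stable Diffusion on ~8GB of VRAM: sequential CPU offload moves weights
# onto the GPU submodule by submodule as they are needed (slower, but fits).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
image = pipe("pixel art of a castle at sunset").images[0]

# A large language model like BLOOM: device_map="auto" spreads the layers
# across GPU, CPU RAM and, when even that runs out, the given disk folder.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", device_map="auto", offload_folder="offload")
```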


> Note that training is extremely expensive, and is beyond the capabilities of most end users.

You can't run it on consumer hardware, but you can just rent a GPU (or use a free Colab notebook) for a few hours to generate the model. Then you download and reuse it locally at will. Yes, you need storage, and if you train often it can get expensive, but it is by no means beyond the capabilities of end users, at least professional ones. And of course there are growing libraries of freely available pretrained models.

> In this case, end users are artists. One could imagine a cloud-based art program which will fine tune on demand. That certainly seems like a good startup idea.

Very much agree about this. At least for a while, I strongly believe that AI will just be another tool for artists willing to embrace it, far from replacing them.


I'm building an app that lets users with no technical background train their own dreambooth models for $2-$4: https://synapticpaint.com/dreambooth/info/ They can also share their trained models for others to use. I think making this easy (no figuring out how to do a git pull or rent a GPU) plus the community sharing aspects will make this technology a lot more accessible to artists and general users.


I wonder where this is going. I was thinking of my favorite concept artists and what it means for them. They can probably churn out 10x the work, and the role becomes more one of curation: scanning and judging the output of SD.

They're probably still going to adapt the work to fit their needs, so companies will probably still want actual artists for entertainment and artistic purposes.

Small companies (and individuals) can probably use it to circumvent costly stock images altogether, so the lower-tier photographers/artists there are going to have a problem.


I think most actual work will be done via img2img, outpainting and inpainting, where a real artist can actually make an important contribution.


At the end of the day you still want something that sticks out, and artists are trained at that.


Aren't the fundamental pieces already there?

1 Generate a bunch of characters or objects based on a prompt.

2 Pick the one you like

3 Tell the AI to extract the character's traits from that one picture

4 Miracle happens (I don't know. What does a "character definition file" look like?)

5 Make a new prompt, but add the character definition from step 4, so you get the same traits, only in a different setting or position, etc


This style transfer is possible through both textual inversion and dreambooth models, though they take a while to train.

https://textual-inversion.github.io/

https://dreambooth.github.io/
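For the "character definition file" in the grandparent's step 4: in textual inversion it is literally just a learned embedding vector for a new pseudo-token, optimized while the whole model stays frozen. A toy illustration of that optimization structure (a stand-in linear model instead of a real diffusion UNet):

```python
import torch
import torch.nn.functional as F

dim = 64
# Frozen stand-in for the diffusion model; only the new token embedding
# below receives gradients.
frozen_model = torch.nn.Linear(dim * 2, dim).requires_grad_(False)

new_token_emb = torch.nn.Parameter(torch.randn(dim) * 0.01)
optimizer = torch.optim.Adam([new_token_emb], lr=5e-3)

target_latents = torch.randn(4, dim)  # stand-in for the reference images

for step in range(100):
    noise = torch.randn_like(target_latents)
    noisy = target_latents + noise              # toy forward process
    prompt_emb = new_token_emb.expand(4, dim)   # "a picture of <new-token>"
    pred = frozen_model(torch.cat([noisy, prompt_emb], dim=-1))
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Afterwards, new_token_emb can be dropped into any prompt to put the
# learned character or style into a new scene.
```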


Training can be done in under an hour[1], which really isn't that long. And yes, what OP is describing is already possible, which seems to be par for the course in this "new" AI space, as it's moving so fast.

[1] https://colab.research.google.com/github/TheLastBen/fast-sta...


It is possible, but right now it still takes quite a bit of time and effort to get it right.

The main challenge is finding the right balance between "make something that looks exactly like this" and "put it in a completely different context". Better similarity equals less flexibility.

For now, the most effective combination will be artist + AI, although it does feel a bit like those who incorporate it into their workflow are helping to dig their own grave.


Do you happen to have any screenshots of what you mean? I’m really curious to see dreambooth’s capabilities in the field, and it sounds like you’ve had experience with some of its pitfalls.


Basically OP is saying that overfitting is a common pitfall: you don't want to overtrain the model, because then everything will look like your training data, and vice versa with too few training steps. So it's a bit of a balance. If you search for "dreambooth" on the SD subreddit, you will see a lot of examples of dreambooth results, including some that show overfitted and underfitted results. https://www.reddit.com/r/StableDiffusion/search?q=dreambooth...


"AI" is only copy-pasting elements from a database of existing images. It's effectively just a fancy image search engine.

You can use it to make procedurally-generated art, but it's still obvious what the source material was.


It's pretty impressive that Stable Diffusion can compress over 200TB of already previously compressed images from LAION 5B down to a few gigabytes! And that it can search that database so quickly to copy and paste things from the right images! /s
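The arithmetic behind the sarcasm is worth spelling out, using the rough figures above:

```python
# Back-of-the-envelope, with ballpark numbers from the parent comment:
images = 5_000_000_000           # LAION-5B: ~5 billion image-text pairs
dataset_bytes = 200e12           # ~200 TB of already-compressed images
model_bytes = 4e9                # a ~4 GB Stable Diffusion checkpoint

print(model_bytes / images)          # ~0.8 bytes of weights per image
print(dataset_bytes / model_bytes)   # ~50,000x "compression"
```

Less than one byte of weights per training image: whatever the model stores, it cannot be the images themselves.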

There's a lot of good discussion to be had around the ethics of AI art, training on copyrighted materials, etc. But it is unequivocally not just copy-pasting from a database of images.


Yes, neural network compression is, indeed, very impressive.

> But it is equivocally not just copy pasting from a database of images.

Just because you put a lossy compression step in the middle doesn't mean it's not copy-paste anymore.


This isn’t true, but it’s 4am, and I regret that I can’t type out a full rebuttal on my phone.

One obvious counterexample is stylegan interpolations. If you interpolate between two images, the midpoint is usually unique — it’s often not obvious what the source material was. (E.g. Gwern’s anime interpolations; sure, they’re anime faces, but from where? “All of danbooru” might as well be “all styles of anime ever created.”)
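The mechanics are simple enough to sketch. Interpolation is usually done with slerp over latent vectors; a minimal sketch, where G stands for a hypothetical trained generator such as a StyleGAN:

```python
import torch

def slerp(z1, z2, t):
    """Spherical interpolation between two latent vectors, the usual
    choice for GAN latents since they concentrate near a hypersphere."""
    z1n, z2n = z1 / z1.norm(), z2 / z2.norm()
    omega = torch.acos((z1n * z2n).sum().clamp(-1.0, 1.0))
    return (torch.sin((1 - t) * omega) * z1 +
            torch.sin(t * omega) * z2) / torch.sin(omega)

z_a, z_b = torch.randn(512), torch.randn(512)
midpoint = slerp(z_a, z_b, 0.5)
# image = G(midpoint)  # typically a novel face, not a blend of two
#                      # recognizable training images
```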

Maybe someone else can argue the point further.


Go to your favorite image search engine and try to replicate anything AI has generated. It's impossible. The results you get from image search are nowhere near as specific as what the AI produces. It's not even close. The only time image search can compete is when you are highly specific, e.g. "painting of the Mona Lisa": both AI and image search will produce very similar results. But for a generic prompt, image search will come up with nothing that gets even close to the query, while AI can produce a highly specific image.

DreamBooth really should have destroyed all doubt about this point, as the AI can generate highly specific images of subjects that aren't even in the original training set.


> as the AI can generate highly specific images of subjects that aren't even in the original training set

I haven't seen any concrete examples of this yet.


As a hobby game programmer, I find stable diffusion exciting; the idea of quickly generating some assets for a game jam is very appealing. But as a hobby musician, it strikes me hard. I want to write, compose and play my own music, so I would understand someone feeling the same way about hand-drawn art. This whole stable diffusion thing is exciting on one side, but on the other it makes me wonder what we will be left with once this technology reaches higher capabilities.


Initially I had similar misgivings (I make visual art professionally and music as a hobby), but then I realised AI will not prevent me from creating my own stuff; it might just give me more options to use. A big part of my motivation is enjoying the artistic process itself, and I still get to choose my process. In terms of a productivity boost, it can give me the same boost as everyone else. Possibly even more, as I have decades of experience. The biggest negative effect I anticipate is an endless stream of low-effort content, but TBH I think we were already there before the AI arrived. I feel this is more a matter of mental hygiene and culture and how we use technology, rather than the technology itself.


The silver lining is I guess that the low-effort content is of much higher technical quality than it was before, for better or for worse.


I can't speak for the visual arts, but the music market is actually appreciating a step back to a more organic/human sound. Although I must say it's actually a mixture of musicianship, recording techniques, better and fairer use of electronic sounds and so on... So it actually might be like you said: technology could empower the artistic process. But if we're already in an endless stream of low-effort content (and I agree), how would you get yourself noticed in an ocean of good-enough crap once AI hits the public?


> appreciating a step back to a more organic/human sound. Although I must say it's actually a mixture of musicianship, recording techniques, better and fairer use of electronic sounds and so on...

It's funny; people have been saying this every few years since the 1980s.


If it's "good enough", is it really crap?


I consider "good enough" stuff to be decently suitable for the general public, not for excellence.


> In terms of a productivity boost, it can give me the same boost as everyone else.

Be careful with that: you may end up being more productive, but also be expected to do more with fewer human resources or teammates. This will definitely work for some, but I can't see this not becoming a race to the bottom for the creators. There will be tons of benefits too, but just like with automation, it could be a double-edged sword.


Yes.

I used to work in the animation industry. My friends who are still in it tell me that there is constant pressure from the studios to combine multiple roles into one, without any increase in pay. They've got a strong union so there's a lot of pressure to keep union shops as places with a lot of ways for you to bounce around and spend a year really delving into the fine details of one particular part of the craft, while still being able to afford to live in LA.


I have been thinking a lot about low effort content. Because low effort is quick, dirty and might be akin to junk food. Will people gobble it up or go meh after the initial honeymoon period?


There's gonna be a lot of 'meh', mostly because of lack of intentionality, but also through oversaturation. Basically people will gobble it up until they choke, and then will feel sick and blame it on all art, not just AI art.

I'm in an odd position on the subject myself: I've long wanted to get facile and slick in my art creation, able to do any genre or style, but it backfires. The only thing that keeps me going is pursuing really eccentric projects. Now I think that's a blessing, because the ability to be facile is seriously devalued now.

If you're a human artist very derivative of Greg Rutkowski right now, you're more screwed than I could possibly imagine. And yet, the reason someone would do that is because the style is popular on an extremely basic level: a far cry from trying to be a Basquiat from scratch.

I think a very real concern is, can AI adopt and popularize a trending style so fast that it obscures the initial artist from which the AI is trained? If novelty is what's needed, how small of a seed is enough to spawn 10000 AI derivatives, which may themselves be popularizations of the original concept?

Maybe the future of AI is to proliferate any new innovation so profusely that it inevitably chokes out the innovator and hybridizes with 'what's commonly popular' which is the guts of the neural network that makes AI what it is.

Maybe in some fields this has already begun to happen with a little assistance from AI-guiding humans. Take it up a level: what about AI prompt crafting? Can you define, not just what will be commonly accepted as popular, but what will be innovative and trendy, in a neural network?


Yeah. A bunch of interesting points. You also seem to be describing a sort of future trendwatcher.

In Star Trek there still seem to be writers and creators of holodeck programs, even though you could probably have the computer serve up a blend of different stories. Perhaps "a craft" will still be seen as admirable.


I forget where I saw it, but there was a musician who made a good point: he said, I didn't do all the hard work of learning to play my instrument in order to impress you with my talent, I did it because I have all these songs in me that I desperately wanted to be able to express.

I can understand that. In the visual domain, especially: how often haven't I thought about a story, my own or someone else's, and wished I could show people. I can also understand the dismay of some artists, who did all that hard work, and suddenly find that it's even harder to reach people with those songs, pictures and stories bubbling in them, because now so many others are able to express theirs too.


> I didn't do all the hard work of learning to play my instrument in order to impress you with my talent, I did it because I have all these songs in me that I desperately wanted to be able to express

Markets for many kinds of art have already been driven to near zero, but I wonder if AI will drive them negative: if you want a human audience to give you their much-contended attention, you'll have to start paying them.


Vanity publishing is already a thing, it might become bigger. But I don't think the price as such will become negative, because paying people to listen to you sort of defeats the point. You want to believe that the things you have to say are worthwhile, after all.


Vanity Publishing is basically the entire economic force behind Social Media, and even traditional media.

Inventor: I made a thing! I want people to know about it so they will want it, and pay me for my cool thing.

Public: We can't notice you over the million other things to pay attention to.

Inventor: Hey Mr. Zuckerberg, here's $100. MAKE THEM NOTICE ME.


It looks bad for career artists. Maybe it's an important step in terms of human expression and communication though, that visual and musical art won't require years of skill to convey your ideas or emotions effectively.

I don't have much of a solarpunk outlook on this, I don't think it's going to be that beneficial. It'd be nice to be wrong though.


I don't know. I've put out an album largely driven by modular synth, meaning that I've got many percussion elements driven literally by a machine, defeating the need for me as a human to play drums in perfect time (which I'm not good at).

But in doing that, I've spent years learning how the micro-timing of musical grooves works (some of it is very obvious, like how a heavy snare backbeat will lag slightly behind the perfect time, almost to the point of being a 'flam').

As a result, I was able to make an album that conveyed certain kinds of groove not immediately accessible to novice musicians (or drummers)… but given the same information, any schmoe could push a button and have a preset in his DAW produce the same effect.

I think to some extent if the person doesn't really understand the purpose or need for such an effect, their grasp of how to implement and craft it will be pretty loose. If you don't know why you're doing the effortless thing you're not going to guide it very well and your results will be kind of generic.

Rather than thinking of years of skill, maybe call it years of focus, or years of purpose? To some extent, we collectively respond to creations with a profound sense of purpose. If that purpose comes out of AI it will have to be something from the AI, and not simply a blind reflection of us and our crudest drives.


That's actually my point: without the requirement of years of skill-building, what's left for us to do? Also, IMO, what makes art so human is also failure; it's part of the game. E.g. a successful musician releasing a bad record, or a no-longer-relevant musician who, after years of experimentation, reinvents himself and publishes a great record. I don't have a clear outlook on this either, only questions about the future.


People play chess even if they don't have a chance at winning against an AI. And in Art there is no winning anyway.

People will continue to paint, like they did after the invention of the photograph and photoshop. Same for music, films, etc.


Yes, at a hobby level this is true. The problem is that fewer will embark on the journey if low effort and the click of a button can compete with their years of hard work. Yes, they can use AI too, but that's not what they signed up for in the arts; there was a benefit to being in a completely different space, away from computers and engineering. Anyway, the cat's out of the bag. We shall see how it goes, and I hope I am wrong.


Maybe it also feels a bit like the mid-90s, when the Internet and Linux came up. Before that, I had just QBasic and DOS/Windows. But with Linux you got so many compilers, and with the Internet (and CDs on magazines) you got sooo many programs. For almost everything you would like to do, somebody had already written a program.


Do the extended creative process for yourself,

just like a hipster/enthusiast with a more manual photography and development process.

Some people might appreciate it; the market likely won't, but that was already true before AI.


With the image generators coming out, I've been scrambling to understand their place, and I've been looking to the chess world as a model of the future.

In chess, people would still rather watch people play than a robot. The top chess players in the world started with a brute-force obsession with the game. They went on to teach the next generation. The advent of computers allowed for historically statistically advantaged moves. ML came along and disrupted things even further.

Now many of the top chess players consult the ML chess oracle.

I see the same thing happening in a lot of areas: grammar, image generation, text replies.

I see a world where humans are celebrated for their humanness while machines assist.


This is vastly different. Chess is a battle of minds between people, while you don't necessarily need context to enjoy art, and that's where AI art is likely to take over.

Yes, considering the story behind art pieces and the artist does significantly impact the way we interact with art, but you can still just like a painting or a piece of pixel art without knowing anything else about it. While chess is about the players, art produces works that exist on their own.


> That's where AI art is likely to take over

I guess that depends on what the definition of art is. If it's a digitally rendered illustration then maybe.

If it's a physical object crafted with imperfect perfection, then no. AI can probably produce some alternative version of Guernica, but that's just a facsimile, in a digital space, of something that's already been made.

I can see an artist using these tools as a way to produce ground-breaking work in the future but I would wager the artist that does that could make good art without any of these tools. You still need a craftsman to master the tools and without knowing the basics you're left with images of Elon Musk as a Disney princess on repeat.


But chess has a clear goal. It makes sense to consult a machine that is better at reaching that goal than a human. What is the goal of art and what does "better" mean for art?

(I admit that I intentionally misunderstood your post a bit. I suppose you are talking about images and text for general use, not only in an artistic context)


Sort of like the Paralympics?


I honestly thought pixel art style games were made by creating 3D assets and then rendering them in low resolutions as sprite sheets these days. Or using the 3D models in the game with a 'pixelate' shader. The idea that people still hand draw sprites slightly blows my mind.


Dead Cells does the "2D sprites from 3D models" thing.

It requires some specific models and techniques that wouldn't look good in 3D, though; it's not just cel shading.

But it lets them have a wild variety of player skins and enemy animations without having to redraw everything by hand.


Why? It's much easier to draw a pixel sprite than to create a 3D model and then do tricks to render it as 2D.

And hand-drawn images look better if the artist is skilled.


The King of Fighters XII sprites, which are regarded as being high quality, were made from 3D models. They first animated 3D models, selected specific frames, then traced sprites over those.

https://kofaniv.snk-corp.co.jp/english/info/15th_anniv/2d_do...


> then traced sprites over those.

In the past this was called rotoscoping, and it was used to capture the outlines and movement of real objects for animation. The end result is still a hand-drawn and shaded object rather than just a screenshot of a posed 3D model.


Ah yes, now that you say it, I recall they did that for Terminator 2.

https://www.youtube.com/watch?v=D0xp74uIZO4


> Why? It's much easier to draw a pixel sprite than to create a 3D model and then do tricks to render it as 2D.

It's never just one sprite, though. Even back on the Super Nintendo, game sprites had 9 angles * lots of actions * lots of frames for every animation. Multiply that by different lighting in modern games, and using the power of a 3D engine starts looking like an obvious choice. I know lots of games do this for environments; I assumed they did it for everything.
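To put illustrative (made-up) numbers on that multiplication:

```python
# Rough sprite-count arithmetic for one hand-drawn character:
angles = 9    # facing directions
actions = 12  # walk, run, attack, hurt, die, ...
frames = 8    # frames per animation cycle
print(angles * actions * frames)  # 864 hand-drawn sprites per character
```

And that is before any lighting variants; rendering from one rigged 3D model amortizes all of it.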


During the 80s, pixelized sprites were drawn to be as realistic as possible given the inherent limitations, but when you make a retro game now, you probably want to target the de-facto aesthetic itself, and my guess is you don't want to go too far in the realism direction. Clever use of a few animation frames and angles conveys this better than blindly rendering and scaling down using some optimal resolution-reduction algo.


3D doesn’t transfer well to stylized 2D. That’s one of the primary reasons why people are still doing it manually. Pixel art also has a much lower barrier to create.


It's a very high barrier, because you can't interpolate animation, or move bones around, or slap on a different texture if you don't like something.

Every motion in every direction has to be animated. It's akin to creating animation by drawing every change on a separate sheet of paper, photographing it, and then stitching it all together.

The barrier to entry is indeed low: it doesn't take much to start sketching some pixel images. Making a coherent animated whole out of it? Very labor-intensive and hard.


Only if you keep it really low resolution and low framerate. If you want something higher resolution, with multiple camera angles, or fluidly animated, doing it in 2D quickly becomes impractical.

Guilty Gear, for example, started out with 2D sprites and then went 3D with Guilty Gear Xrd[1][2], but with a lot of trickery to keep it looking 2D-ish, though the series always was more anime-style than pixel art.

[1] https://www.youtube.com/watch?v=CTUvSsOtPKA

[2] https://www.youtube.com/watch?v=yhGjCzxJV3E


Factorio's graphics are 3D-rendered via a pipeline to 2D sprites: multiple angles for the character and vehicle models, but only one for buildings, since the camera angle is fixed. I mean, that's not really pixel art anymore (it's fairly high resolution), but it's still a 3D-to-2D pipeline.


I think (but I am not sure) the sprites in Fallout (1997) and Fallout 2 (1998) were created as retouched 3D renderings.

(I am, however, sure they revealed that the characters in the video animations were retouched recordings of physical puppets they created.)

Many techniques may be part of the creative process and of the production.


Despite what others have replied, you're right. Both of those techniques are used a lot. Still plenty of people doing it by hand, but you can achieve some really interesting results by "faking it".


If you can draw well (like, if you've spent a few hundred hours trying to carefully and accurately copy stuff), it's easy to outline the major proportions of a posed character or object quickly by looking at a reference and modifying it a bit. And if you have a palette set up as well, the rendering can go really fast once you know how you're going to group up the pixels (which is a major artistic decision, since clusters of same-color pixels tend to read better as shapes and textures). So the payoff for 3D tends to manifest only when you have a lot of frames to render; each frame, on its own, tends to be fast to paint.

Basically, if you look at where CG is used in contemporary anime, that's where the line is drawn. Many shows will CG their vehicle shots and parts of their action scenes since they call for a lot of perspective drawing with swooping camera movements. They may use and trace over CG dolls to do establishing shots or get static posed characters in a scene, but once detailed acting is called for they tend to revert to drawing in keyframes. Again, generally relying on reference footage to get the acting down, but modifying it to fit the character designs.


There are some games that do as you say, "fake" some pixel art using a shader on 3D models. A Short Hike and Fez come to mind as examples.
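The basic "pixelate" pass is only a few lines; a minimal sketch with Pillow (nearest-neighbor downscale plus palette quantization), purely illustrative rather than what those games actually ship:

```python
from PIL import Image

def pixelate(path, target_w=64, colors=16):
    """Fake pixel art: downscale with nearest-neighbor sampling, then
    clamp to a small palette, then scale back up with chunky pixels."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    small = img.resize((target_w, target_w * h // w), Image.NEAREST)
    small = small.quantize(colors)  # reduce to a limited palette
    return small.resize((w, h), Image.NEAREST)

# pixelate("render.png").save("sprite.png")
```

Of course, this gives you a pixelated filter rather than deliberately placed pixel clusters, which is exactly the distinction other commenters draw below.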


A Short Hike isn't pixel art though, is it? It's too high-res.


And what blows my mind is that you actually think people are doing this. It sounds like the hardest and most convoluted way to do it. Scaling a high-def image down to low resolution doesn't automatically create pixel art. It creates an ugly mess that would require as much work to look good as if you had just drawn it at that resolution directly.


This guy uses the technique a lot with fantastic results: https://youtu.be/KPoeNZZ6H4s?t=861


While it looks great it feels very different from pixel art. More like a pixelated filter.

I think this is a better video from him showing how it looks in practice: https://www.youtube.com/watch?v=1FrIBkuq0ZI&t=420s


> 200x200 is relatively large for pixel art, but if a single pixel makes this much of a difference it should probably be larger.

Huh? Half the point of pixel art is that single pixels make a difference. That art is way above the threshold where one pixel can make too much of a difference.


For the last time: the issue with "AI" is not that it exists as a tool. The issue anyone should have is with how the data used to fit it is obtained. No one would have any issue if you took the time to draw thousands of images, then trained on them and lived off of that. Just don't steal data from others, hand-wave it away as "fair use", and ride off into the sunset on data you didn't generate.

The tools behind AI are fine and have honestly existed for decades. If anyone is against AI because it isn't authentic or something, that's a fool's errand really, because people, artists and developers alike, will find them useful. The problem is and always will be the data you fit on and how you obtained it.


Even the issue you mention, training without consent, won't stop this.

Chances are very low that this will become illegal and enforceable. It would require some very draconian laws, while copyright legislation is a low priority in government circles, even more so in these times.

Even if this were somehow outlawed in the US, nobody would care internationally. Right now, on Amazon you can buy knockoffs of millions of products from China that violate IP/copyright. Nobody cares. Do you think they will care about something as worthless as a digital image? A digital image that can't even be reliably detected as being AI-generated?

And there's yet another workaround: scrape images that don't require consent, or make consent part of the terms and conditions. Google made Google Photos free for about a decade, and trained for free on all your stuff.



Using these generators as tools to find inspiration seems like the best-case scenario. I think people are assigning too little probability to the potential future scenario in which current systems are about as good as we're going to get with current ML methods, even if refinements improve them marginally. Without a major breakthrough, I don't see any AI systems replacing professional artists on video games, e.g., unless it's a very low-effort, low-quality game.


There are still major breakthroughs and improvements every few weeks, so I really wouldn't worry about having already hit the ceiling.

But even ignoring that, we have barely even started exploring what we can do with the technology as it is. A lot of it is still just experiments living in a git repository, or needs more GPU than the average person has. Give it a few more months or years, and you'll have it integrated into every major photo and video editing program and optimized to run on normal consumer hardware. That simple improvement in accessibility will have very wide-reaching consequences just by itself, even without improving the underlying AI drastically.

And no, you won't replace professional artists anytime soon; after all, somebody still needs to have the final say on what goes into the game. But it will drastically transform how those artists work and the amount of content they'll be able to produce.

> e.g., unless it's a very low-effort, low-quality game.

The output of Midjourney and Co. already looks spectacular, easily better than a lot of games out there. I could easily see that replacing or enhancing a lot of art in 2D RPGs, point&click adventures or visual novels.

Everything that needs animation or 3D meshes will take a while longer, but for 2D games it's already more than good enough. It's really more an issue of artists and game developers still needing to catch up on all the rapid developments of the last few months.


In 2014 we considered it almost impossible for computers to understand the contents of images. It took a few years for that to become possible. Now ML models are creating amazing images themselves. I don't see what your skepticism is based on. https://xkcd.com/1425/


Looking forward to the day when I can generate 4-direction sprite sheets with walk cycles, attacks and equipment.


I've already lost track of all the different apps that let you play with Stable Diffusion et al.

Do you have any recommendations for (web)apps that allow you to generate images from prompts in good resolutions? How about ones for img2img?



My wife is an artist and has been using some of these systems to inspire her own work with clients - she's said it's both given her creative new ideas, and improved her efficiency by about 30%.


Creative director is one of the many hats I wear these days. I am using components of AI in nearly all of my visual assets. The future is already here.





