Also, you could verify by writing unit tests with OpenCV to look for similar sources. Since it's all headshots it will find matches for sure, but it would also find matches with human faces.
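A rough sketch of that kind of check (the folder names and the match threshold here are just placeholders, and ORB feature matching is only one of several things OpenCV could do for this):

    import glob
    import cv2

    orb = cv2.ORB_create(nfeatures=500)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def descriptors(path):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des = orb.detectAndCompute(img, None)
        return des

    train = {p: descriptors(p) for p in glob.glob("training/*.jpg")}

    for gen_path in glob.glob("generated/*.jpg"):
        gen_des = descriptors(gen_path)
        if gen_des is None:
            continue
        best_path, best_score = None, 0
        for train_path, train_des in train.items():
            if train_des is None:
                continue
            matches = matcher.match(gen_des, train_des)
            # Count low-distance matches; a large count suggests a near-duplicate.
            score = sum(1 for m in matches if m.distance < 40)
            if score > best_score:
                best_path, best_score = train_path, score
        print(gen_path, "closest to", best_path, "score", best_score)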
The neural network starts from noise, but that's not the only input: it was trained on [0], and I think it's arguable that the NN is "reproducing" the images from its training dataset in some sense.
I think it's a very interesting question: how can we measure when a neural network is being creative? Creativity is not obvious at all; it's sort of an ill-posed question if you think about it. How can you verify that a network is generating things that are not like what it was trained on, yet are... like what it was trained on?
Are neural networks* forever relegated to the role of copying and interpolation? Do the neural network weights form a kind of database?
* (I don't think this only applies to neural networks, but models in general)
There was one recent work trying to address this [1] but I'm not 100% convinced and I think a lot more work is warranted in this area. A difficulty is that it's not a purely technical problem, but also one of semantics and interpretation. It's one that the "automatic musical accompaniment" community and other digital arts communities have struggled with for decades, and it's not resolved.
How do you know when a machine is being creative? It's not far from the moving goalposts problem of general artificial intelligence. How do you know when a machine is being intelligent, if you can always explain it away by examining the black box?
The best one can hope for from a NN is that it discerns a model within the training data. There is a way to more-or-less objectively measure how well it has done this, if at all: the model should require less information than the data it explains, i.e. fewer bytes. So "compression algorithms" are a rudimentary model of data; we'd like to do much better than that.
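A toy illustration of that "fewer bytes" criterion (synthetic data, with zlib standing in for the rudimentary compressor; only the direction of the numbers matters):

    import zlib
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 20 * np.pi, 10_000)
    model = 127 + 100 * np.sin(t)                      # the "explanation": a few parameters
    signal = (model + rng.normal(0, 2, t.size)).astype(np.uint8)

    raw = signal.tobytes()
    baseline = zlib.compress(raw, level=9)             # generic compressor as a crude model
    residuals = (signal.astype(np.int16) - model.astype(np.int16)).astype(np.int8)
    with_model = zlib.compress(residuals.tobytes(), level=9)

    print("raw bytes:        ", len(raw))
    print("zlib alone:       ", len(baseline))
    print("model + residuals:", len(with_model) + 24)  # ~24 bytes for the sine parameters

If the model is any good, its parameters plus the compressed residuals take fewer bytes than the data alone.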
However, NNs tend not to be very space-efficient, and they also don't usually "explain" the data (in the sense of reproducing it), so this test is hard to apply to them.
BTW: human creativity has much to do with expectation: how obvious it was to you already. So, people with different levels of exposure to some art discipline have different opinions on creativity... and as new styles become known, those opinions change.
Human beings also draw on other fields and experiences, not available in training data. Especially striking, to humans, is inspiration from common experiences that are not recognised as common, as in art that reveals ourselves to us; observational humour. For a computer to use this information, it seems it would need to have human experiences, a body, social interaction etc. Of course, this is a very parochial concept... pure creativity need not be so anthropocentric.
And likewise, how do you know when a human is being creative? Isn't all art derivative of our training and influences? I believe something like that was an argument by one of the random paint splatter artists: that randomness was the only thing truly creative.
Yep, this is a current area of research for content generation.
I think most current approaches build some transform to a latent space and then compare generated images with their nearest neighbors in the training set. If they're identical then your network just learned to reproduce the dataset.
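Something like this, as a sketch (any pretrained feature extractor can stand in for the latent space; the folders and ResNet-18 here are assumptions, and the weights argument needs torchvision >= 0.13):

    import glob
    import torch
    import torchvision
    from torchvision import transforms
    from PIL import Image

    model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
    model.fc = torch.nn.Identity()  # keep the 512-d features, drop the classifier
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    def embed(paths):
        with torch.no_grad():
            batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
            return torch.nn.functional.normalize(model(batch), dim=1)

    train_paths = glob.glob("training/*.jpg")
    train_emb = embed(train_paths)

    for path, emb in zip(glob.glob("generated/*.jpg"), embed(glob.glob("generated/*.jpg"))):
        sims = train_emb @ emb               # cosine similarity, since embeddings are unit-norm
        idx = int(sims.argmax())
        # Similarity near 1.0 suggests the generator memorized that training image.
        print(path, "->", train_paths[idx], float(sims[idx]))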
Yeah, I would have at least run some kind of similarity search on the output. Without that check it's impossible to know whether this is actually doing anything.
I think the sound is generated locally. Check out the page source, which contains the JS I think is responsible for generation (I haven't checked in detail).
Wait, it's actually generating new cat faces, as in cats that don't exist? Some of those images looked like they had backgrounds in the corners; was that also generated???
It's a combination of GPU RAM, slowdowns (remember it's squared in dimensions), and stability (larger is more unstable end-to-end). Arguably, state of the art in image synthesis is DeepMind's PixelCNN: "Parallel Multiscale Autoregressive Density Estimation" https://arxiv.org/abs/1703.03664 , Reed et al 2017: generating 512px photorealistic images & video with PixelCNNs rather than GANs. Also good is StackGAN which does ~200x200ish but there's no reason it couldn't go up to 500x500 (just pop in a third upscaling stage).
There's far more work on GANs than PixelCNNs (see the https://github.com/hindupuravinash/the-gan-zoo ) but at least thus far, I haven't seen any GANs which appear visually competitive with Reed et al 2017's PixelCNN samples. Downside - code has not been released by DeepMind[], and you can't do CycleGAN or other stupid GAN tricks with PixelCNN AFAIK. CycleGAN is absolutely hilarious, if you haven't seen all the uses of it yet, much more interesting than generating cat faces.
[] I asked way back when and Reed said he'd try but nothing yet.
That's not it. It's easy to scale a larger image down to 224x224 and feed it into a checkpoint. And a lot of these GANs don't use such content losses in the first place, because they add complexity and make the whole thing harder to use (you have to get one of those pretrained models in the first place).
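For reference, the kind of content loss being talked about is roughly this (VGG16 is an arbitrary choice of pretrained checkpoint here, and the layer cut-off is made up):

    import torch
    import torch.nn.functional as F
    import torchvision

    vgg = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.DEFAULT).features[:16].eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    def content_loss(generated, target):
        # Inputs can be any resolution, e.g. 512x512; downscale to the 224x224 the checkpoint expects.
        g = F.interpolate(generated, size=(224, 224), mode="bilinear", align_corners=False)
        t = F.interpolate(target, size=(224, 224), mode="bilinear", align_corners=False)
        return F.mse_loss(vgg(g), vgg(t))

    # Random tensors standing in for image batches (N, 3, H, W), values in [0, 1].
    print(content_loss(torch.rand(1, 3, 512, 512), torch.rand(1, 3, 512, 512)).item())

So downscaling isn't the obstacle; the point is that this drags in a pretrained model and extra plumbing that many GAN setups simply skip.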
I would assume that the name was chosen because I already have an old project online called "cat generator" [0], which used the same underlying technique (GANs), dataset, and landmark-based normalization. Reusing the name would have resulted in confusion.
I gave it a try on anime images with 64px/128px WGANs for about a month back in March. No, it's not really feasible yet. GANs need restricted datasets; anime girl or cat faces, yes, anime girls in general, no, it never learns effectively. They need either more supervision (I thought StackGAN could probably handle it if you could feed in Danbooru tags) or better algorithms (PixelCNN? see my other comment, but the Reed et al 2017 samples are great despite tremendous diversity of images). Plus more GPUs.
Ignoring the joke, this is actually an interesting question to ask. I mean, yeah, there are some pretty scary, uncanny images of cats here, but some cats look almost… fine? So if these cat images are "creative" enough, this is almost a success.
But if you take a pencil and try to draw a cat yourself (assuming you are not a good artist), you have a much higher chance of actually drawing something "cute" than if you tried to draw a woman. Human faces look much more familiar to us, and there's something much trickier and more intimate about what we recognize as "cute" or "beautiful" in a human than in a cat.
So, I'm pretty sure this NN would fail at that, but it's interesting to think about what's actually required for it not to fail.
No, it's because the cost it inflicts on the commons exceeds the benefit of an adolescent sex joke. If we get lots of the latter, it will turn off and drive away some of the people we want here. Then more good users may start to leave as more good users start to leave: a vicious circle which it's basically our #1 job to (try to) make sure HN doesn't get caught in. So we're a bit hypervigilant about this.
If that's what you're interested in, a quick Google search will show you plenty of startups/companies applying machine learning/deep learning to medicine.