Flexible diffusion modeling of long videos (ubc.ca)
65 points by thorum on May 27, 2022 | hide | past | favorite | 16 comments



I was really hoping this was a "diffusion model" in the same sense that these guys built a reaction-diffusion based self-healing system:

https://distill.pub/2020/growing-ca


Yeah, the neural network "diffusion models" are not very well named. If you have a background in the natural sciences, you would understand diffusion to mean, well, diffusion. Whereas these generative neural networks are about (1) blurring data with Gaussian noise, (2) teaching a NN to denoise the noised data, and finally (3) feeding Gaussian noise as input and letting the denoiser NN generate new data. So it's not so much about diffusion as about reversing the diffusion. And it's not really (smooth) diffusion, but Gaussian noise.
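Those three steps can be sketched in a few lines of NumPy. This is a toy DDPM-style sampler, not the method from the linked paper; `denoiser` stands in for a trained noise-prediction network, and the schedule names (`betas`, `alpha_bars`) are just illustrative:

```python
import numpy as np

def forward_noise(x0, alpha_bar):
    """Step (1): blur data toward Gaussian noise.
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
    """
    eps = np.random.randn(*x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1 - alpha_bar) * eps, eps

def sample(denoiser, shape, betas):
    """Step (3): start from pure Gaussian noise and iteratively denoise.
    `denoiser(x, t)` is step (2)'s trained network, predicting the noise in x.
    """
    x = np.random.randn(*shape)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    for t in reversed(range(len(betas))):
        eps_hat = denoiser(x, t)
        a, ab = alphas[t], alpha_bars[t]
        # Remove the predicted noise for this step...
        x = (x - (1.0 - a) / np.sqrt(1.0 - ab) * eps_hat) / np.sqrt(a)
        # ...and re-inject a little fresh noise, except at the last step.
        if t > 0:
            x += np.sqrt(betas[t]) * np.random.randn(*shape)
    return x
```

With an untrained (here, zero-output) denoiser this just produces noise, but the shapes of the forward and reverse processes are the whole "reversed diffusion" idea in miniature.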

"Denoising autoencoder" is already used for processes that reconstruct partially corrupted input. So what name to suggest for a process that reconstructs data from nothing but noise?


The reaction diffusion / NCA models have been applied to videos recently — check it out:

https://wandb.ai/johnowhitaker/nca/reports/Fun-with-Neural-C...


That was a fascinating article with a great mix of readability and interactive demos.


There has been so much fun made of the infinite monkey theorem [1] over the years, but look where we are now!

[1] https://en.wikipedia.org/wiki/Infinite_monkey_theorem


Now this just needs to be integrated with DALL-E/Imagen and GPT-3 (plus a text to speech engine), to create an offline version of YouTube.


It’s definitely fun contemplating a future where there is nothing special on video.


Finally a way to restrict the library of babel to (seemingly) meaningful books!


Great idea! It’ll need someone to train a censorship engine to complete the illusion.


You'll be relieved to hear that those models already have censorship engines built in, to try to prevent them from generating problematic content.


It still doesn’t work because even the latest AI tech is unable to understand the complex rules of what problematic content is.

How can you identify socially acceptable bias? For example, you'd expect it to be biased toward four-wheeled cars over rare three-wheeled ones, but how does it know that that bias is OK while a bias toward male lawyers isn't?

And then the billion other similar situations.


It doesn't have to discover all the rules with no help. We don't need AI to invent ethics, only to enforce the ones we have arbitrarily chosen.


It's an almost impossible task. ML reflects the world and the data in it. If you ask for an anime-style man, it will pretty much universally generate a white man, because the dataset of anime characters almost universally contains white characters. The model isn't wrong; it's generating exactly what already exists in the world. And there are an infinite number of scenarios and biases that it reflects which you will never be able to manually flag.

It reminds me a lot of the early self driving car debate where there were endless surveys asking if the car should run over the 2 old ladies or the one child studying medicine. And in the end we decided it was an unreasonable burden and just accepted that ML doesn't need to make impossible moral judgements.


> the dataset of anime characters almost universally contains white characters

Japanese viewers and their creators see the majority of anime characters as Japanese, not white. That you see them as white says more about you.


Exactly! I’m being downvoted, but I’m sincerely suggesting that we train a full spectrum of ideologically biased censorship engines and then let people pick which ones they want to use.

It’s no different from parental filters.


hugged to death



