1) That depends entirely on the model in question (size, complexity) and the amount of training material.
Using a standard dataset like CelebA [1] and an "HQ" model (512x512) like StyleGAN2, training requires at least one GPU with 12 GiB of VRAM and roughly a week of training on a single V100.
Depending on your provider of choice, this will cost roughly $514 (AWS) or $420 (Google), down to about $210 at Lambda Labs (on an RTX 6000, which should be in the same performance ballpark); rough per-hour arithmetic below.
If your training process is interruptible and can be resumed at any time (most training scripts support this), costs will drop dramatically for AWS and Google (think $50 to $200).
2) Yes. A used ~$200 Tesla K80 will do. Alternatively, any NVIDIA card with at least 8 GiB of VRAM can do the job, but expect lower batch sizes and longer training times. With a dedicated machine running an RTX 3060, or a brand-new A4000 if you're willing to pay the premium, you can get close to the one-week training time.
3) Yes*
*Your work will be freely available to everyone, and your training process is limited to roughly 12 hours per day.
All in all, I wouldn't recommend training a StyleGAN model from scratch anyway. Finetuning a pretrained model using your own dataset can be done much more quickly (think hours to a day or two) and on consumer-level hardware (I train my models on an old desktop with a GTX 1070).
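For context, those cost figures line up with roughly a week at typical on-demand hourly rates. The rates below are my assumptions from list prices at the time, not something stated above, so treat this as a sketch and check current pricing:

```python
# Back-of-the-envelope cost check; hourly rates are assumptions, verify before budgeting.
hours = 7 * 24  # roughly one week of training

rates = {
    "AWS p3.2xlarge (1x V100)": 3.06,   # assumed $/h
    "GCP V100":                 2.48,   # assumed $/h
    "Lambda Labs RTX 6000":     1.25,   # assumed $/h
}

for name, rate in rates.items():
    print(f"{name}: ~${hours * rate:.0f}")  # -> ~$514, ~$417, ~$210
```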
> Finetuning a pretrained model using your own dataset can be done much more quickly (think hours to a day or two) and on consumer-level hardware (I train my models on an old desktop with a GTX 1070).
This is interesting! Do you have some links about doing that?
My desktop computer has a GTX 1060 with 6 GB of VRAM. But hopefully I can use it for something like this.
I've only used Google Colab in the past, and only tried stuff with prompting existing models.
Would love to experiment a bit with fine-tuning models on my own datasets to get some kind of unique stuff.
It's the same transfer learning you would do with any model. The StyleGAN (and StyleGAN2-ADA) repos provide pretrained weights for you; just start from those and train on the new dataset. The ADA GitHub repo even includes tooling to format your dataset correctly.
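Roughly, the workflow against NVlabs/stylegan2-ada-pytorch looks like the sketch below. The paths are placeholders and the flag values are only illustrative; flag names follow that repo but may differ between versions, so check its README:

```python
# Sketch of fine-tuning with NVlabs/stylegan2-ada-pytorch, run from a clone of the repo.
# Paths are placeholders; flags follow that repo but may differ between versions.
import subprocess

# 1) Pack your images into the zip layout the training script expects.
subprocess.run([
    "python", "dataset_tool.py",
    "--source=./my_images",              # folder with your training images
    "--dest=./datasets/my_dataset.zip",
    "--width=512", "--height=512",       # match the resolution of the pretrained model
], check=True)

# 2) Resume from pretrained weights instead of training from scratch.
subprocess.run([
    "python", "train.py",
    "--outdir=./training-runs",
    "--data=./datasets/my_dataset.zip",
    "--gpus=1",
    "--resume=ffhq512",                  # start from the pretrained FFHQ 512x512 checkpoint
    "--kimg=1000",                       # a short run is usually enough when fine-tuning
    "--snap=10",                         # checkpoint/sample frequency
    # "--batch=16",                      # lower this if your card runs out of VRAM
], check=True)
```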
> If your training process is interruptible and can be resumed at any time, costs will drop dramatically for AWS and Google (think $50 to $200)
Noob question here: how does that work? Do you run the scripts in the hosting provider's downtime, with something like a nighttime rate? Or what magic is this?
It's basically a special rate for instances that can be shut down by the provider at any time.
Google charges about a third of the usual hourly rate for such instances, and AWS has a marketplace where you can bid for them (you name your maximum price beforehand); whenever an instance with your selected specs becomes available at that rate, you get it.
Hyperscalers like Google and AWS basically have two types of machines/VMs for rent: instances reserved for long-term commitments (think months to years) and on-demand ones. Naturally there are peak-demand periods in each region (usually during business hours) followed by periods of low demand, so utilization fluctuates heavily.
Instead of having their machines sit idle while still costing money, they offer such idle resources at a heavily discounted rate, with the catch that as soon as regular demand rises again, your VM is shut down so the capacity can be offered at the normal hourly rate (you get notified, so your script has some time to save its state).
It's similar to what hotels do - discounted rates are available most of the time, but whenever there's a convention in town or a national holiday, you get kicked out and prices quadruple. With hyperscalers, though, prices stay the same and you simply lose your cheap resource.
The actual time at which VMs become available is pretty random. If you're OK with using multiple regions, there are pretty much always instances available, though usually not for hours on end.
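On the "save its state" point: the resumable-training pattern is just frequent checkpointing plus loading the latest checkpoint on startup. Here's a minimal generic sketch (PyTorch-style, not the actual StyleGAN2 training loop; all names are made up for illustration):

```python
# Minimal resumable-training sketch; not the real StyleGAN2 loop, names are illustrative.
import glob
import os

import torch

CKPT_DIR = "./checkpoints"  # keep this on persistent storage that outlives the VM
os.makedirs(CKPT_DIR, exist_ok=True)

def latest_checkpoint():
    ckpts = sorted(glob.glob(os.path.join(CKPT_DIR, "step_*.pt")))
    return ckpts[-1] if ckpts else None

def train(model, optimizer, total_steps, save_every=500):
    start = 0
    ckpt = latest_checkpoint()
    if ckpt:  # a preempted run picks up where it left off
        state = torch.load(ckpt)
        model.load_state_dict(state["model"])
        optimizer.load_state_dict(state["optimizer"])
        start = state["step"] + 1

    for step in range(start, total_steps):
        ...  # one training iteration goes here
        if step % save_every == 0:  # save often enough that a shutdown loses little work
            torch.save(
                {"model": model.state_dict(),
                 "optimizer": optimizer.state_dict(),
                 "step": step},
                os.path.join(CKPT_DIR, f"step_{step:08d}.pt"),
            )
```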
[1] http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html