In 2017, global IP traffic was about 1.5 ZB per year, or 1.5 billion terabytes. So, assuming you were using 10 TB drives, you'd need 150 million hard drives per year, or about 37% of global production.
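Spelling that arithmetic out - the ~400 million drives/year figure for global HDD production is my assumption, back-solved from the 37% number:

```python
# Back-of-envelope: drives needed to store one year of global IP traffic.
TRAFFIC_TB_PER_YEAR = 1.5e9        # 1.5 ZB = 1.5 billion TB
DRIVE_TB = 10                      # 10 TB drives
HDD_PRODUCTION_PER_YEAR = 400e6    # assumed, implied by the ~37% figure

drives = TRAFFIC_TB_PER_YEAR / DRIVE_TB
share = drives / HDD_PRODUCTION_PER_YEAR
print(f"{drives / 1e6:.0f} million drives/year, {share:.1%} of production")
# -> 150 million drives/year, 37.5% of production
```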
You can dump any traffic that originates from a bulk content provider like Netflix, YouTube, Prime, Xbox Live downloads, etc - collecting metadata alone would be sufficient if you were interested in that traffic at all. This source suggests such content makes up about 33% of global IP traffic, with unspecified media providers (probably porn) making up another 15% or so, so on the whole you can probably round bulk traffic up to between 40 and 50%.
From there the numbers get a little squishy depending on your estimates of various categories of traffic and how conservative you want to be about discarding content.
In theory you can discard anything you can collect from another source - e.g. Gmail, which you could get from Google directly, no need to capture it off the wire. If you are not interested in retaining un-encrypted content, you could dump a bunch more: only about 50% of traffic is encrypted, although the encrypted share is probably weighted towards non-bulk content.
If you can dump, say, 75% of all non-bulk content, then you'd retain 25% of the roughly 50% that is non-bulk - about 12.5% of total IP traffic, or 187.5 million terabytes per year. That would require 18.75 million 10 TB drives per year, or about 4.69% of global production.
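As a sketch of that estimate - the 50% non-bulk share, the 75% dump rate, and the ~400M drives/year production figure are all the assumptions from above:

```python
# Back-of-envelope for the retained fraction, using the estimates above.
TRAFFIC_TB_PER_YEAR = 1.5e9        # 1.5 ZB/year
NON_BULK_SHARE = 0.50              # ~50% of traffic is non-bulk (estimate)
DUMP_RATE = 0.75                   # fraction of non-bulk content discarded
DRIVE_TB = 10
HDD_PRODUCTION_PER_YEAR = 400e6    # assumed global HDD production

retained_fraction = NON_BULK_SHARE * (1 - DUMP_RATE)    # 12.5% of total
retained_tb = TRAFFIC_TB_PER_YEAR * retained_fraction   # 187.5M TB/year
drives = retained_tb / DRIVE_TB                         # 18.75M drives/year
print(f"{retained_fraction:.1%} of traffic retained, "
      f"{drives / 1e6:.2f}M drives/year, "
      f"{drives / HDD_PRODUCTION_PER_YEAR:.2%} of production")
# -> 12.5% of traffic retained, 18.75M drives/year, 4.69% of production
```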
You could, of course, blow it all down to tape, filed according to cipher, and read it back in once you've broken that cipher - no need to keep everything online while it's still unbroken. LTO-8 tapes are 12 TB each (encrypted data will not compress), so the numbers work out similarly to 10 TB drives. LTO tape production is a lot smaller, though, at about 20 million cartridges per year, so you'd be buying most of the world's tape supply, unless you built a private factory to produce your own tapes. At that scale, though, it would probably be more affordable than drives.
Some people speculate that Amazon Glacier is actually a library of BDXL discs and that they are purchasing big batches from factories. That's only about 125 GB per disc, but bought in bulk they are probably cheaper per terabyte than drives.
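Comparing the media discussed above at the 187.5M TB/year retained figure (BDXL taken at 125 GB = 0.125 TB):

```python
# Units needed per year to hold 187.5M TB on each medium mentioned above.
RETAINED_TB = 187.5e6

media_tb = {
    "10 TB HDD": 10,
    "LTO-8 tape (12 TB)": 12,
    "BDXL disc (125 GB)": 0.125,
}

for name, capacity_tb in media_tb.items():
    units = RETAINED_TB / capacity_tb
    print(f"{name}: {units / 1e6:,.1f} million units/year")
```

Note the tape number (~15.6 million) against the ~20 million LTO cartridges sold per year: feasible on paper, but you'd be most of the market.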
FWIW Snowden's docs suggested that they were only retaining data for a month (iirc) and then dumping it, but it's possible they could be selectively retaining encrypted data for longer. Presumably the Utah datacenter was built for a reason. I would assume that at this point they have "high-risk" selectors that automatically get pulled out but that they are probably not collecting everything everything and keeping it forever.
Also, a footnote here: this would be a logistically significant operation. You would either need your own parallel data links with a significant fraction of the capacity of the primary backbone (far beyond what SIPRNET/NIPRNET could likely support), or you would need to be moving shipping containers of drives/tapes back to Utah, like Amazon Snowmobile. You'd also need people regularly going into those tap rooms to swap out drives and so on. It would be a high-maintenance thing to attempt.
I suspect that even if they don't retain content long term, they would probably retain metadata. What sites you visited and who you talked to is very revealing, especially on a timeline measured in decades.
Save it on what? Can someone do a back-of-the-envelope calculation on what it takes to back up "all of it" for an indeterminate amount of time?