That's not the same as piracy though. He wasn't downloading millions of scientif...

snickerdoodle12 · 2025-07-22T13:56:38 1753192598

The exact same charges could apply to the AI scrapers illegitimately accessing random websites.

dragonwriter · 2025-07-22T16:50:39 1753203039

No, they couldn't, since the then-novel and untested strained interpretation of the CFAA that the prosecutor was relying on has since been tested in the courts and soundly rejected.

kube-system · 2025-07-22T15:02:37 1753196557

I haven’t seen any accusations that they’ve done that, though. Usually people get pirated material from sources that intentionally share pirated material.

snickerdoodle12 · 2025-07-22T16:53:14 1753203194

They're not just training on pirated content, they've also scraped literally the entire internet and used that too.

kube-system · 2025-07-22T19:36:59 1753213019

Scraping the public internet is also not a CFAA violation.

snickerdoodle12 · 2025-07-22T20:49:40 1753217380

CFAA bans accessing a protected computer without authorization. Hitting URLs denied by robots.txt has been argued to be just that.

dragonwriter · 2025-07-22T21:06:00 1753218360

> Hitting URLs denied by robots.txt has been argued to be just that.

"Has been argued" -- sure, but never successfully. In fact, in hiQ v. LinkedIn, the 9th Circuit ruled twice (once before, and again on remand after and applying, the Supreme Court's ruling in Van Buren v. US) that continuing to access data on a public website despite a cease-and-desist letter on top of robots.txt does not constitute access "without authorization" under the CFAA.

snickerdoodle12 · 2025-07-22T21:43:16 1753220596

Now do every other jurisdiction

gruez · 2025-07-22T22:49:56 1753224596

CFAA was mentioned specifically, which means only US jurisdiction is relevant here.

gruez · 2025-07-22T13:59:14 1753192754

Part of the accusation comes from the fact that Swartz accessed the downloads through an MIT network closet, which AI companies weren't doing. The equivalent would be if OpenAI broke into a wiring closet at Disneyland to download Disney movies.

snickerdoodle12 · 2025-07-22T14:01:15 1753192875

The CFAA is vague enough to punish unauthorized access to a computer system. I don't have an example case in mind, but people have gotten in trouble before for scraping websites while ignoring e.g. robots.txt.

gruez · 2025-07-22T14:51:47 1753195907

The CFAA might be vague, but the case law on scraping has largely been resolved to "it's pretty much legal except in very limited circumstances". It's regrettable that less-resourced defendants were harassed before large corporations were able to secure such rulings, but the rulings that allowed scraping came before the AI companies did their scraping, so it's unclear why AI companies in particular should be getting flak here.