Yep -- our story here: https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse... (quoted in the OP) -- everyone I know who is running large internet infrastructure has a similar story -- this post does a great job of rounding a bunch of them up in one place.
I called it when I wrote it: they are just burning their goodwill to the ground.
I will note that one of the main startups in the space worked with us directly, refunded our costs, and fixed the bug in their crawler. Facebook never replied to our emails, the link in their User Agent led to a 404 -- an engineer at the company saw our post and reached out, giving me the right email -- which I then emailed 3x and never got a reply.
AI firms seem to be leading from a position that goodwill is irrelevant: a $100bn pile of capital, like an 800lb gorilla, does what it wants. AI will be incorporated into all products whether you like it or not; it will absorb all data whether you like it or not.
Yep. And it is much more far reaching than that. Look at the primary economic claim offered by AI companies: to end the need for a substantial portion of all jobs on the planet. The entire vision is to remake the entire world into one where the owners of these companies own everything and are completely unconstrained. All intellectual property belongs to them. All labor belongs to them. Why would they need good will when they own everything?
"Why should we care about open source maintainers" is just a microcosm of the much larger "why should we care about literally anybody" mindset.
> Look at the primary economic claim offered by AI companies: to end the need for a substantial portion of all jobs on the planet.
And this is why AI training is not "fair use". The AI companies seek to train models in order to compete with the authors of the content used to train the models.
A possible eventual downfall of AI is that the risk of losing a copyright infringement lawsuit is not going away. If a court determines that the AI output you've used is close enough to be considered a derivative work, it's infringement.
I've pointed this out to a few people in this space. They tend to suggest that the value in AI is so great this means we should get rid of copyright law entirely.
That value is only great if it's shared equitably with the rest of the planet.
If it's owned by a few, as it is right now, it's an existential threat to the life, liberty, and pursuit of a happiness of everyone else on the planet.
We should be seriously considering what we're going to do in response to that threat if something doesn't change soon.
Yep. The "wouldn't it be great if we had robots do all the labor you are currently doing" argument only works if there is some plan to make sure that my rent gets paid other than me performing labor.
It depends if you're the only one out of a job. If it really is everyone then the answer will likely be some variant of metaphorically or literally killing your landlord in favor of a different resource allocation scheme. I put these kinds of things in a "in that world I would have bigger problems" bucket.
And that's the ultimate fail of capitalist ethics - the notion that we must all work just so we can survive. Look at how many shitty and utterly useless jobs exist just so people can be employed on them to survive.
This has to change somehow.
"Machines will do everything and we'll just reap the profits" is a vision that techno-millenialists are repeating since the beginnings of the Industrial Revolution, but we haven't seen that happening anywhere.
For some strange reason, technological progress always seems to be accompanied by an increase in human labor. We're already past the 8-hour, 5-day norm and things are only getting worse.
> And that's the ultimate fail of capitalist ethics - the notion that we must all work just so we can survive. Look at how many shitty and utterly useless jobs exist just so people can be employed on them to survive.
This isn't a consequence of capitalism. The notion of having to work to survive - assuming you aren't a fan of slavery - is baked into things at a much more fundamental level. And lots of people don't work, and are paid by a welfare state funded by capitalism-generated taxes.
> "Machines will do everything and we'll just reap the profits" is a vision that techno-millenialists are repeating since the beginnings of the Industrial Revolution, but we haven't seen that happening anywhere.
They were wrong, but the work is still there to do. You haven't come up with the utopian plan you're comparing this to.
> For some strange reason, technological progress always seems to be accompanied by an increase in human labor.
No, it doesn't. What happens is that not enough people are needed to do a job any more, so they go find another job. No one was opening barista-staffed coffee shops on every corner back when 30% of the world was doing agricultural labour.
Yes, it is. The fact we have welfare isn't a refutation of that, it's proof. The welfare is a bandaid over the fundamental flaws of capitalism. A purely capitalist system is so evil, it is unthinkable. Those people currently on welfare should, in a free labor market, die and rot in the street. We, collectively, decided that's not a good idea and went against that.
That's why the labor market, and truly all our markets, are not free. Free markets suck major ass. We all know it. Six year olds have no business being in coal mines, no matter how much the invisible hand demands it.
You have a very different definition of free than I do. Free to me means that people enter into agreements voluntarily. It's hard to claim a market is free when its participants have no other choice...
You are correct, but the real problem is that copyright needs complete reform.
Let's not forget the basis:
> [The Congress shall have Power . . . ] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.
Is our current implementation of copyright promoting the progress of science and useful arts?
Or will science and the useful arts be accelerated by culling back the current cruft of copyright laws?
For example, imagine if copyright were non-transferable and did not permit exclusive licensing agreements.
AI is going to implode within 2 years. Once it starts ingesting its own output as training data it is going to be at best capped at its current capability and at worst even more hallucinatory and worthless.
The mistake you make here is to forget that the training data of the original models was also _full_ of errors and biases — and yet they still produced coherent and useful output. LLM training seems to be incredibly resilient to noise in the training set.
That's a talking point for bros looking to exploit it as their ticket.
"The upside of my gambit is so great for the world, that I should be able to consume everyone else's resources for free. I promise to be a benevolent ruler."
That's not how conservatism works. AI oligarchs are part of the "in" group in the "there are laws that protect but do not bind the in group, and laws that bind but do not protect the out group" summary. Anyone with a net worth less than FOTUS is part of the "out" group.
AI is worthless without training data. If all content becomes AI generated because AI outcompetes original content then there will be no data left to train on.
When Google first came out in 1998, it was amazing, spooky how good it was. Then people figured out how to game pagerank and Google's accuracy cratered.
AI is now in a similar bubble period. Throwing out all of copyright law just for the benefit of a few oligarchs would be utter foolishness. Given who is in power right now I'm sure that prospect will find a few friends, but I think the odds of it actually happening before the bubble bursts are pretty small.
Are we not past critical mass though? The velocity at which these things can outcompete human labor is astonishing; any future human creation or original content will already have lost the battle the moment it goes online and gets cloned by AI.
OK. To be clear, that wasn't about the OP, but rather the alleged people promoting the abolition of copyright... which would significantly hurt open source.
The people agitating for such things are usually leeches who want everything free and do, in fact, hold an infantile worldview that doesn't consider how necessary remuneration is to whatever it is they want so badly (media pirates being another example).
Not that I haven't "pirated" media, but this is usually the result of it not being available for purchase or my already having purchased it.
I'm curious what will happen when someone modifies a single byte (or a "sufficient" number of bytes) of AI output, thereby creating a derivative work, and then claims copyright on that modified work.
> The AI companies seek to train models in order to compete with the authors of the content used to train the models.
When I read someone else’s essay I may intend to write essays like that author. When I read someone else’s code I may intend to write code like that author.
AI training is no different from any other training.
> If a court determines that the AI output you've used is close enough to be considered a derivative work, it's infringement.
Do you mean the output of the AI training process (the model), or the output of the AI model? If the former then yes: if a model actually contains within it copies of data, it's a copy of that work.
But we should all be very wary of any argument that the mere ability to create a new work identical to a previous work makes something derivative. A painter may be able to copy van Gogh, but neither the painter's brain nor his non-copy paintings (even those in the style of van Gogh) are copies of van Gogh's work.
If you as an individual recognizably regurgitate the essay you read, then you have infringed. If an AI model recognizably regurgitates the essay it trained on then it has infringed. The AI argument that passing original content through an algorithm insulates the output from claims of infringement because of "fair use" is pigwash.
> If an AI model recognizably regurgitates the essay it trained on then it has infringed.
I completely agree — that’s why I explicitly wrote ‘non-copy paintings’ in my example.
> The AI argument that passing original content through an algorithm insulates the output from claims of infringement because of "fair use" is pigwash.
Sure, but the argument that training an AI on content is necessarily infringement is equally pigwash. So long as the resulting model does not contain copies, it is not infringement; and so long as it does not produce a copy, it is not infringement.
> So long as the resulting model does not contain copies, it is not infringement
That's not true.
The article specifically deals with training by scraping sites. That does necessarily involve producing a copy from the server to the machine(s) doing the scraping & training. If the TOS of the site incorporates robots.txt or otherwise denies a license for such activity, it is arguably infringement. Sourcehut's TOS for example specifically denies the use of automated tools to obtain information for profit.
I'm curious how this can be applied with the inevitable combinatorial exhaustion that will happen with musical aspects such as melody, chord progression, and rhythm.
Will it mean longer and longer clips are "fair use", or will we just stop making new content because it can't avoid copying patterns of the past?
> I'm curious how this can be applied with the inevitable combinatorial exhaustion that will happen with musical aspects such as melody, chord progression, and rhythm.
They did this in 2020. The article points out that "Whether this tactic actually works in court remains to be seen" and I haven't been following along with the story, so I don't know the current status.
More germane is that there will be a smoking gun for every infringement case: whether or not the model was trained on the original. There will be no pretending that the model never heard the piece it copied.
> AI training is no different from any other training.
Yes, it is. One is done by a computer program, and one is done by a human.
I believe in the rights and liberties of human beings. I have no reason to believe in rights for silicon. You, and every other AI apologist, are never able to produce anything to back up what is largely seen as an outrageous world view.
You cannot simply jump the gun and compare AI training to human training like it's a foregone conclusion. No, it doesn't work that way. Explain why AI should have rights. Explain if AI should be considered persons. Explain what I, personally, will gain from extending rights to AI. And explain what we, collectively, will gain from it.
I have this line of thought as well but then I wonder, if we are all out of jobs and out of substantial capital to spend, how do these owners make money ultimately? It's a genuine question and I'm probably missing something obvious. I can see a benevolent/post-scarcity spin to this but the non-benevolent one seems self-defeating.
"Making money" is only a relevant goal when you need money to persuade humans to do things for you.
Once you have an army of robot slaves ... you've rendered the whole concept of money irrelevant. Your skynet just barters rare earth metals with other skynets and your robot slaves furnish your desired lifestyle as best they can given the amount of rare earth metals your skynet can get its hands on. Or maybe a better skynet / slave army kills your skynet / slave army, but tough tits, sucks to be you and rules to be whoever's skynet killed yours.
That's part of the "rare earth metals" synecdoche - hydroelectric dams, thorium mines, great lakes heat sinks - they're all things for skynets to kill or barter for as expedient
I don’t think you’re missing anything, I think the plan really is to burn it all down and rule over the ashes. The old saw “if you’re so smart, why aren’t you rich?” works in reverse too. This is a foolish, shortsighted thing to do, and they’re doing it anyway. Not really thinking about where value actually comes from or what their grandchildren’s lives would be like in such a world.
Capitalism is an unthinking, unfeeling force. The writing is on the wall that AI is coming, and being altruistic about it doesn’t do jack to keep others from the land grab. Their thinking is, might as well join the rush and hope they’re one of the winners. Every one of us sitting on the sidelines will be impacted in some way or the other. So who’re the smart ones, the ones who grab shovels and start digging, or the ones who watch as the others dig their graves and do nothing?
Sure, maybe in 50 years. At the moment, it's a productivity tool. Strangely, by the look of the down votes, the HN community doesn't quite understand this.
The job market is formed by the presence of needs and the ability to satisfy them. AI does not reduce the ability to satisfy needs, so the only situations where you won't be able to compete are either that the socialists seize power and ban competition, or that all the needs are met in some other way. In any other situation there will be a job market and people will compete in it.
> there will be a job market and people will compete in it
Maybe there will be. I'm sure there's also a market for Walkmans somewhere, it's just exceedingly small.
The proclaimed goal is to displace workers on a grand scale. This is basically the vision of any AI company and literally the only way you could even remotely justify their valuations given the heavy losses they incur right now.
> The job market is formed by the presence of needs and the ability to satisfy them
The needs of a job market are largely shaped by the overall economy. Many industrial nations are largely service based economies with a lot of white collar jobs in particular. These white collar jobs are generally easier to replace with AI than blue collar jobs because you don't have to deal with pesky things like the real, physical world. The problem is: if white collar workers are kicked out of their jobs en masse, it also negatively affects the "value" of the remaining people with employment (exhibit A: tech job market right now).
> are either that the socialists seize power and ban competition,
I am really having a hard time understanding where this obsession with mythical socialism comes from. The reality we live in is largely capitalistic and a striving towards a monopoly - i.e. a lack of competition - is basically the entire purpose of a corporation, which is only kept in check by government regulations.
>The proclaimed goal is to displace workers on a grand scale.
It doesn't matter. What you need to understand is that the job market is rooted in needs, the ability to meet those needs, and the ability to exchange those abilities with one another. None of those are hindered by AI.
>Many industrial nations are largely service based economies with a lot of white collar jobs in particular.
Again: at the end of the day it doesn't change anything. At the end of the day you need a cooked dinner, a built house and everything else. So someone must build a house and exchange it for cooked dinners. That's what is happening (white collar workers and international trade balances included) and that's what the job market is. AI doesn't change the nature of those relationships. Maybe it replaces white collar workers, maybe even almost all of them; that only means they will go satisfy other unsatisfied needs of other people in exchange for satisfying their own. The job market won't go anywhere, and if anything the amount of satisfied needs will go up, not down.
>if white collar workers are kicked out of their jobs en masse, it also negatively affects the "value" of the remaining people with employment
No, it doesn't. I mean it would if they were simply kicked out, but that's not the case; they would be replaced by AI. So society gets all the benefits they were creating plus an additional labor force to satisfy previously unsatisfied needs.
>exhibit A: tech job market right now
I don't have the stats at hand, but aren't blue collar workers doing better now than ever before?
>I am really having a hard time understanding where this obsession with mythical socialism comes from
From the history of the 20th century? I mean it's not an obsession, but we are discussing scenarios of the disappearance (or significant shrinking) of the job market, and the socialists are the most (if not the only) realistic cause of that at the moment.
>The reality we live in is largely capitalistic and a striving towards a monopoly
Yes, and this monopoly, the monopoly, is called "socialism".
>corporation, which is only kept in check by government regulations.
Generally a corporation is kept in check by the economic freedom of other economic agents, and it's government regulation that protects monopolies from the free market. I mean, why would government regulate in the other direction? A small number of big corporations is way easier for a government to control and extract personal benefits from.
> At the end of the day you need a cooked dinner, a built house and everything else. So someone must build a house and exchange it for cooked dinners.
You should read some history. This view is so naive and overconfident.
My views on this issue are shaped by history. Starting with crop production and plowing and ending with book printing, conveyor belts and microelectronics, creating tools that increase productivity has always led to increased availability of goods, and the only thing that has ever led to decreased availability is whatever hindered the ability to create and exchange goods.
I started a borderline smug response here pointing out how bullshit white collar and service jobs* were in deep shit but folks who actually work for a living would be fine. I scrapped it halfway through when it occurred to me that if everyone's broke then by definition nobody's spending money on stuff like contractors, mechanics, and other hardcore blue collar trades. Toss in AI's force multiplication of power demands in the face of all of the current issues around global warming and it starts to feel like pursuing this tech is fractally stupid and the best evidence to date I've seen that a neo-luddite movement might actually be a thing the world could benefit from. That last part is a pretty wild thought coming from a retired developer who spent the bulk of his adult life in IT, but here we are.
Neo-Luddism is less stupid when you remember that the Luddites weren't angry that looms existed. Smashing looms was their tactic, not their goal.
Parliament had made a law phasing in the introduction of automated looms; specifically so that existing weavers were first on the list to get one. Britain's oligarchy completely ignored this and bought or built looms anyway; and because Parliament is part of that oligarchy, the law effectively turned into "weavers get looms last". That's why they were smashing looms - to bring the oligarchy back to the negotiating table.
The oligarchy responded the way all violent thugs do: killing their detractors and lying about their motives.
>if everyone's broke
>nobody's spending money on stuff like contractors, mechanics, and other hardcore blue collar trades.
Why would this happen? Money is simply a medium for exchanging the value that these contractors, mechanics, and other hardcore blue collar trades are creating. How can they be broke if AI doesn't disturb their ability to create value and exchange it?
Customers that have funds available to purchase the services you offer and who are willing to actually spend that money are a hard requirement to maintain any business. If white collar and service industries are significantly disrupted by AI this necessarily reduces the number of potential customers. Thing is you don't have to lay off that many people to bankrupt half of the contractors in the country, a decent 3-5 year recession is all it takes. Folks stop spending on renovations and maintenance work when they're worried about their next paycheck.
Money means nothing. It is simply a medium of exchange. The question is: is there anything to exchange? And the answer is yes, and the position of white collar workers doesn't affect the availability of things to exchange. There's no reason for a recession; there is nothing that can hinder the ability of blue collar workers to create goods and services, all the things that combined are called "wealth".
Don't think in the meaningless category of "what set of digits will be printed on the piece of paper called a paycheck?". Think in the terms that are actually implied: "What goods and services can't blue collar workers afford for themselves?". Then it becomes clear that the set of goods and services unaffordable to blue collar workers will shrink with the replacement of white collar workers by AI, because it does not hinder their ability to create those goods and services.
You think so? Give me the contents of your checking, savings, and retirement accounts and then get back to me on that.
> the position of white collar workers doesn't affect the availability of things to exchange.
You appear to be confused about the concept of consumers, let me help. Consumers are the people who buy things. When there are fewer consumers in a market, demand for products and services declines. This means less sales. So no, you don't get to unemploy big chunks of the population and expect business to continue thriving.
>When there are fewer consumers in a market, demand for products and services declines.
No, demand is unlimited and defined by the amount of production.
>You don't get to unemploy big chunks of the population and expect business to continue thriving.
I mean, generally, replacing workers with tools is the main way for a business (and society) to thrive. In other words, what goods and services will become less affordable to blue collar workers?
When white collar workers [researchers, programmers, managers, salespeople, translators, illustrators, ...] lose their income/jobs to AIs, lose their ability to buy products/services, and at the same time try to shift en masse to some kind of manual work, do you think that would not affect the incomes of those who are the current blue collar class?
I mean yes, the value of consumed goods will decrease, so blue collar workers will be able to consume more. That's exactly what's called an increase in income.
My gut is telling me you're being intentionally obtuse but I'm going to give you the benefit of the doubt. To reiterate in detail:
AI is poised to disrupt large swaths of the workforce. If large swaths of the workforce are disrupted this necessarily means a bunch of people will see their income negatively impacted (job got replaced by AI). Broke people by definition don't have money to spend on things, and will prioritize tier one of Maslow's Hierarchy out of necessity. Since shit like pergolas and oil changes are not directly on tier 1 they will be deprioritized. This in turn cuts business to blue collar service providers. Net result: everyone who isn't running an AI company or controlling some currently undefined minimum amount of capital is fucked.
If you're trying to suggest that any notional increases in productivity created by AI will in any way benefit working class individuals either individually or as a group you are off the edge of the map economically speaking. Historical precedents and observed executive tier depravity both suggest any increase in productivity will be used as an excuse to cut labor costs.
>This in turn cuts business to blue collar service providers.
No, it doesn't. Where does that come from?
I mean, look at the situation from the perspective of blue collar service providers: what exactly are the goods and services that they were able to afford for themselves but that AI will make unaffordable for them? Pretty obviously, there are about none. So, in the big picture, the whole process you described doesn't lead to any disadvantage for blue collar workers.
I literally described the mechanism to you twice and you're still acting confused. I'm not sure if we have a language barrier here or what but go check out a Khan Academy course on economics or maybe try running a lemonade stand for an afternoon if you still don't get it.
I think the obvious thing you are missing is just b2b. It doesn’t actually matter if people have any money.
Similar to how advertising and legal services are required for everything but have ambiguous ROI at best, AI is set to become a major "cost of doing business" tax everywhere. Large corporations welcome this even if it's useless, because it drags down smaller competitors and digs a deeper moat.
Executives large and small mostly have one thing in common though.. they have nothing but contempt for both their customers and their employees, and would much rather play the mergers and acquisitions type of games than do any real work in their industry (which is how we end up in a world where the doors are flying off airplanes mid flight). Either they consolidate power by getting bigger or they get a cushy exit, so.. who cares about any other kind of collateral damage?
Money is a proxy for control. Eventually humans will become mostly redundant and slated for elimination except for the chosenites of the managerial classes and a small number of technicians. Either through biological agents, famines, carefully engineered (civil?) wars and conflicts designed to only exterminate the non-managerial classes, or engineered Calhounian behavioral sinks to tank fertility rates below replacement.
Why should we care if they make money? Owning things isn't a contribution to society.
Building things IS a contribution to society, but the people who build things typically aren't the ultimate owners. And even in cases where the builders and owners are the same, entitling the builders and all of their future heirs to rent seek for the rest of eternity is an inordinate reward.
You don't. It's like Minecraft. You can do almost everything in Minecraft alone and everything exists in infinite quantity, so why trade in the first place?
This goes both ways. Let's say there is something you want but you're having trouble obtaining it. You'd need to give something in exchange.
But the seller of what you want doesn't need the things you can easily acquire, because they can get those things just as easily themselves.
The economy collapses back into self sufficiency. That's why most Minecraft economy servers start stagnating and die.
What people say is not the same as what people do.. in other words, what is spoken in public repeatedly is not representational of actual decision flows
Money is only a bookkeeping tool for complex societies. The aim of the owner class in a worker-less world would be accumulation of important resources to improve their lives and to trade with other owners (money would likely still be used for bookkeeping here). A wealthy resource-owner might strive to maintain a large zone of land, defended by AI weaponry, that contains various industrial/agricultural facilities producing goods and services via AI.
They would use some of the goods/services produced themselves, and also trade with other owners to live happy lives with everything they need, no workers involved.
Non-owners may let the jobless working class inhabit unwanted land, until they change their minds.
With what and against what? There will be spy satellites and drones and automated turrets that will turn you to pulp if you come within, say, 50KM of their compound borders.
The non-benevolent future is not self-defeating; we have historical examples of depressingly stable economies with highly concentrated ownership. The entirety of the European dark ages was the end result of (western[0]) Rome's elites tearing the planks out of the hull of the ship they were sailing. The consequence of such a system is economic stagnation, but that's not a consequence that the elites have to deal with. After all, they're going to be living in the lap of luxury, who cares if the economy stagnates?
This economic relationship can be collectively[1] described as "feudalism". This is a system in which:
- The vast majority of people are obligated to perform menial labor, i.e. peasant farmers.
- Class mobility is forbidden by law and ownership predominantly stays within families.
- The vast majority of wealth in the economy is in the form of rents paid to owners.
We often use the word "capitalist" to describe all businesses, but that's a modern simplification. Businesses can absolutely engage in feudalist economies just as well, or better, than they can engage in capitalist ones. The key difference is that, under capitalism, businesses have to provide goods or services that people are willing to pay for. Feudalism makes no such demand; your business is just renting out a thing you own.
Assuming AI does what it says on the tin (which isn't at all obvious), the endgame of AI automation is an economy of roughly fifty elite oligarchs who own the software to make the robots that do all work. They will be in a constant state of cold war, having to pay their competitors for access to the work they need done, with periodic wars (kinetic, cyber, legal, whatever) being fought whenever a company intrudes upon another's labor-enclave.
The question of "well, who pays for the robots" misunderstands what money is ultimately for. Money is a token that tracks tax payments for coercive states. It is minted specifically to fund wars of conquest; you pay your soldiers in tax tokens so the people they conquer will have to barter for money to pay the tax collector with[2]. But this logic assumes your soldiers are engaging in a voluntary exchange. If your 'soldiers' are killer robots that won't say no and only demand payment in energy and ammunition, then you don't need money. You just need to seize critical energy and mineral reserves that can be harvested to make more robots.
So far, AI companies have been talking of first-order effects like mass unemployment and hand-waving about UBI to fix it. On a surface level, UBI sounds a lot like the law necessary to make all this AI nonsense palatable. Sam Altman even paid to have a study done on UBI, and the results were... not great. Everyone who got money saw real declines in their net worth. Capital-c Conservative types will get a big stiffy from the finding that UBI did lead people to work less, but that's only part of the story. UBI as promoted by AI companies is bribing the peasants. In the world where the AI companies win, what is the economic or political restraining bolt stopping the AI companies from just dialing the UBI back and keeping more of the resources for themselves once traditional employment is scaled back? Like, at that point, they already own all the resources and the means of production. What makes them share?
[0] Depending on your definition of institutional continuity - i.e. whether or not Istanbul is still Constantinople - you could argue the Roman Empire survived until WWI.
[1] Inasmuch as the complicated and idiosyncratic economic relationships of medieval Europe could even be summed up in one word.
[2] Ransomware vendors accidentally did this, establishing Bitcoin (and a few other cryptos) as money by demanding it as payment for a data ransom.
And how could they possibly base their actions on good when their technology is more important than fire? History is depending on them to do everything possible to increase their market cap.
> The entire vision is to remake the entire world into one where the owners of these companies own everything and are completely unconstrained.
I agree with you in the case of AI companies, but the desire to own everything and be completely unconstrained is the dream of every large corporation.
In the past, you had to give some of your spoils to those who did the conquering for you, and to laborers after that. If you can automate and replace all work, including maintaining the robots that do it and training them, you no longer need to share anything.
In my view it's the same thing, same trajectory -- with more power in the hands of fewer people further along the trajectory.
It can be better or worse depending on what those with power choose to do. Probably worse. There has been conquest and domination for a long time, but ordinary people have also lived in relative peace gathering and growing food in large parts of the world in the past, some for entire generations. But now the world is rapidly becoming unable to support much of that as abundance and carrying capacity are deleted through human activity. And eventually the robot armies controlled by a few people will probably extract and hoard everything that's left. Hopefully in some corners some people and animals can survive, probably by being seen as useful to the owners.
On the bright side, armies of robot slaves give us an off-ramp from the unsustainable pyramid scheme of population growth.
Be fruitful, and multiply, so that you may enjoy a comfortable middle age and senescence exploiting the shit out of numerous naive 25-year-olds! If it's robots, we can ramp down the population of both humans and robots until the planet can once again easily provide abundance.
Sure, the problem though is it won't be "we" deciding what the robots do, it will most likely be a few powerful people of dubious character and motivations since those are the sort of people who pursue power and end up powerful.
That's why even though technology could theoretically be used to save us from many of our problems, it isn't primarily used that way.
But presumably petty tyrants with armies of slave robots are less interested than consensus in a long-term vision for humanity that involves feeding and housing a population of 10 billion.
So after whatever horrific holocaust follows the AI wars the way is clear for a hundred thousand humans to live in the lap of luxury with minimal impact on the planet. Even if there are a few intervening millennia of like 200 humans living in the lap of luxury and 99,800 living in sex slavery.
The thing is that this will be their destruction as well. If workers don't have any money (because they don't have jobs), nobody can afford what the owners have to sell?
They are also gutting the profession of software engineering. It's a clever scam actually: to develop software a company will need to pay utility fees to A"I" companies, and since their products are error prone, voila, use more A"I" tools to correct the errors of the other tools. Meanwhile software knowledge will atrophy and soon, à la WALL-E, we'll have software "developers" with 'soft bones' floating around on conveyed seats slurping 'sugar water', getting fat, and not knowing even how to tie their software shoelaces.
Yes, like the Pixel camera app, which mangles photos with AI processing, and users complain that it won't let people take pics.
One issue was a pic with text in it, like a store sign. Users were complaining that it kept asking for better focus on the text in the background, before allowing a photo. Alpha quality junk.
That's pretty much what our future would look like -- you are irrelevant. Well I mean we are already pretty much irrelevant nowadays, but the more so in the "progressive" future of AI.
Rules and laws are for other people. A lot of people reading this comment having mistaken "fake it til you make it" or "better to not ask permission" for good life advice are responsible for perpetrating these attitudes, which are fundamentally narcissistic.
I think the logic is more like “we have to do everything we can to win or we will disappear”. Capitalism is ruthless and the big techs finally have some serious competition, namely: each other as well as new entrants.
Like why else can we just spam these AI endpoints and pay $0.07 at the end of the month? There is some incredible competition going on. And so far everyone except big tech is the winner so that’s nice.
> One crawler downloaded 73 TB of zipped HTML files in May 2024 [...] This cost us over $5,000 in bandwidth charges
I had to do a double take here. I run (mostly using dedicated servers) infrastructure that handles a few hundred TB of traffic per month, and my traffic costs are on the order of $0.50 to $3 per TB (mostly depending on the geographical location). AWS egress costs are just nuts.
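To put numbers on it: $5,000 for 73 TB works out to roughly $68 per TB, which is well over an order of magnitude above that dedicated-server range.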
I think the uncontrolled price of cloud traffic is a real fraud and a way bigger problem than some AI companies ignoring robots.txt. One time we went over the limit on Netlify or something, and they charged over a thousand dollars for a couple of TB.
> I think the uncontrolled price of cloud traffic is a real fraud
Yes, it is.
> and a way bigger problem than some AI companies ignoring robots.txt.
No, it absolutely is not. I think you underestimate just how hard these AI companies hammer services - it is bringing down systems that have weathered significant past traffic spikes with no issues, and the traffic volumes are at the level where literally any other kind of company would've been banned by their upstream for "carrying out DDoS attacks" months ago.
>I think you underestimate just how hard these AI companies hammer services
Yes, I completely don't understand this, and I don't understand comparing it with DDoS attacks. How is it different from what search engines are doing, and in what way is it worse? It's simply scraping data; what significant problems can it cause? Cache pollution? And that's it? I mean, even when we're talking about ignoring robots.txt (which search engines often do too) and calling costly endpoints, what's the problem with adding some captcha or rate limiters to those endpoints?
Yeah, you have a point. Hmm, I wish there were a way to generate that garbage with minimal bandwidth. Something like: I send you a very compressed 256 bytes of data which expands to something like 1 megabyte.
There is -- but instead of garbage-expanding data, add several delays within the response so that the data takes extraordinarily long to arrive.
Depending on the number of simultaneous requesting connections, you may be able to do this without a significant change to your infrastructure. There are ways to do it that don't exhaust your number of (IP, port) available too, if that is an issue.
Then the hard part is deciding which connections to slow, but you can start with a proportional delay based on the number of bytes per source IP block or do it based on certain user agents. Might turn into a small arms race but it's a start.
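A very rough sketch of the proportional-delay idea, assuming an Express-style Node server (the /24 keying and the thresholds are made up for illustration, not a tested policy):

    // Hypothetical tarpit middleware: delay each response in proportion to how
    // many bytes a /24 block has already pulled. Numbers are illustrative only.
    const bytesByBlock = new Map();

    function tarpit(req, res, next) {
      const block = req.ip.split('.').slice(0, 3).join('.'); // crude IPv4 /24 key
      const used = bytesByBlock.get(block) || 0;

      // 1 ms of delay per 100 KB already served to this block, capped at 30 s.
      const delayMs = Math.min(Math.floor(used / 100_000), 30_000);

      res.on('finish', () => {
        const len = Number(res.getHeader('Content-Length')) || 0;
        bytesByBlock.set(block, (bytesByBlock.get(block) || 0) + len);
      });

      setTimeout(next, delayMs);
    }

A real version would need entry eviction and IPv6 handling, but the point stands: normal visitors see no delay while heavy scrapers wait longer and longer.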
It does not even have to be dynamically generated. Just pre-generate a few thousand static pages of AI slop and serve that. Probably cheaper than dynamic generation.
I kind of suspect some of these companies probably have more horsepower and bandwidth in one crawler than a lot of these projects have in their entire infrastructure.
Thanks for writing about this. Is it clear that this is from crawlers, as opposed to dynamic requests triggered by LLM tools, like Claude Code fetching docs on the fly?
Along with having block lists, perhaps you could add poison to your results: randomly generated bad code that will not work, visible only to bots (display: none when rendered). The bots will use it, but a human never would.
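Something like this, as a sketch (the user-agent list and the decoy snippet are placeholders, not a vetted bot signature set):

    // Hypothetical Express middleware: append an invisible block of broken
    // "code" to HTML pages for suspected crawlers. Browsers hide it via
    // display:none; scrapers that strip markup ingest it as text.
    const SUSPECT_UA = /GPTBot|CCBot|ClaudeBot|Bytespider/i; // illustrative list

    function poison(req, res, next) {
      if (!SUSPECT_UA.test(req.get('User-Agent') || '')) return next();
      const send = res.send.bind(res);
      res.send = (body) => {
        if (typeof body === 'string' && body.includes('</body>')) {
          const decoy = '<div style="display:none"><pre>' +
            'def fibonacci(n): return n * "banana"  # plausible-looking nonsense' +
            '</pre></div>';
          body = body.replace('</body>', decoy + '</body>');
        }
        return send(body);
      };
      next();
    }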
Just a callout that Fastly provides free bot detection, CDN, and other security services for FOSS projects, and has been for 10+ years https://www.fastly.com/fast-forward (disclaimer, I work for Fastly and help with this program)
Without going into too much detail, this tracks with the trends in inquiries we're getting from new programs and existing members. A few years ago, the requests were almost exclusively related to performance, uptime, implementing OWASP rules in a WAF, or more generic volumetric impact. Now, AI scraping is increasingly something that FOSS orgs come to us for help with.
I've been running into bot detection on at least five different websites in the past two months (not even including captcha walls)
Not sure what to tell you but I surely feel quite human
Three of the pages told me to contact customer support and the other two were a hard and useless block wall. Only from Codeberg did I get a useful response, the other two customer supports were the typical "have you tried clearing your cookies" and restart the router advice — which is counterproductive because cookie tracking is often what lets one pass. Support is not prepared to deal with this, which means I can't shop at the stores that have blocking algorithms erroneously going off. I also don't think any normal person would ever contact support, I only do it to help them realise there's a problem and they're blocking legitimate people from using the internet normally
It's not like they say, but it's at least three different implementations and I don't think any were cloudflare because I've been running into those pages for years and they've got captchas (functional or not). One of them was Akamai I think indeed
Yeah, I definitely don't want to pivot this thread into a product pitch, as the important thing is helping the open-source projects, but we can work with the maintainers to tune the systems to be as strict/lax as preferred. I'm sure the other services can too, to be fair.
The underlying issue is that many sites aren't going to get feedback from the real people they've blocked, so their operators won't actually know that tuning is required (also, the more strict the system, the higher percentage of requests will be marked as bots, which might lead an operator to want things to be even more strict...)
I will say -- a higher-end bot detection service should provide paper trails on the block actions they take (this may not be available for freemium tiers, depending on the vendor).
But to your point, the real kicker is the "many sites aren't going to get feedback from the real people they've blocked" since those tools inherently decided that the traffic was not human. You start getting into Westworld "doesn't look like anything to me" territory.
I'm not into westworld so can't speak to the latter paragraph, but as for "high-end" vendors' paper trail: how do log files help uncover false blocks? Any vendor will be able to look up these request IDs printed on the blocking page, but how does it help?
You don't know if each entry in the log is a real customer until they buy products proportional to some fraction of their page load rate, or real people until they submit useful content or whatever your site is about. Many people just read information without contributing to the site itself and that's okay, too. A list of blocked systems won't help; I run a server myself, I see the legit-looking user agent strings doing hundreds of thousands of requests, crawling past every page in sequence, but if there wasn't this inhuman request pattern and I just saw this user agent and IP address and other metadata among a list of blocked access attempts, I'd have no clue if the ban is legit or not
With these protection services, you can't know how much frustration is hiding in that paper trail, so I'm not blocking anyone from my sites; I'm making the system stand up to crawling. You have to do that regardless for search engines and traffic spikes like from HN
Oh my, a Dutch film that actually sounds good?! I get to watch a movie that's originally in my native language for perhaps the second time in my life, thanks for linking this :D
Edit: and it's on YouTube in full! Was wondering which streaming service I'd have to buy for this niche genre of Dutch sci-fi but that makes life easy: https://www.youtube.com/watch?v=4VrLQXR7mKU
Final update: well, that was certainly special. Favorite moment was 10:26–10:36 ^^. Don't think that comes fully across in the baked-in subtitles in English though. Overall it could have been an episode of Black Mirror, just shorter. Thanks again for the tip :)
I have to assume the Dutch movie industry just isn't too big.
I guess it's a side effect of America's media, but when I went to Europe including the Netherlands almost everyone spoke English at an almost native level.
It almost felt like playing a video game where there is an immersive mode you can just turn off if it gets too difficult ( subtitles in English at all public facilities).
It's really surreal to see my project in the preview image like this. That's wild! If you want to try it: https://github.com/TecharoHQ/anubis. So far I've noticed that it seems to actually work. I just deployed it to xeiaso.net as a way to see how it fails in prod for my blog.
One piece of feedback: Could you add some explanation (for humans) what we're supposed to do and what is happening when met by that page?
I know there is a loading animation widget thingy, but the first time I saw that page (some weeks ago at the Gnome issue tracker), it was proof-of-work'ing for like 20 seconds, and I wasn't sure what was going on, I initially thought I got blocked or that the captcha failed to load.
Of course, now I understand what it is, but I'm not sure it's 100% clear when you just see the "checking if you're a bot" page in isolation.
also if you're using JShelter, which blocks Worker by default, there is no indication that it's never going to work, and the spinner just goes on forever doing nothing
Maybe one of those (slightly misleading) progressbars that have a dynamic speed that gets slower and slower the closer to the finish it gets? Just to indicate that it's working towards something
It'll be somewhat involved, but based on the difficulty vs the clients hashing speed you could say something probabilistic like "90% of the time, this window will be gone in xyz seconds from now"?
I really like this. I don't mind Internet acting like the Wild Wild West but I do mind there's no accountability. This is a nice way to pass the economic burden to the crawlers for sites who still want to stay freely available. You want the data, spend money on your side to get it. Even though the downside is your site could be delisted from search engines, there's no reason why you cannot register your service in a global or p2p indexer.
Integrate a way to calculate micro-amounts of the shitcoin of your choice and we might have another actually legitimately useful application of cryptocurrencies on our hands..!
Anubis is only going to work as long as it doesn't get famous; if that happens, crawlers will start using GPUs / ASICs for the proof of work and it's game over.
Actually, that is not a bad idea. @xena maybe Anubis v2 could make the client participate in some sort of SETI@HOME project, creating the biggest distributed cluster ever created :-D
I love that I seem to stumble upon something by you randomly every so often. I'd just like to say that I enjoy your approach to explanations in blog form and will look further into Anubis!
Maybe I'm missing something, but doesn't this mean the work has to be done by the client AND the server every time a challenge is issued? I think ideally you'd want work that was easy for the server and difficult for the client. And what is to stop you from being DDoS'd by clients that are challenged but neglect to perform the challenge?
Regardless, I think something like this is the way forward if one doesn't want to throw privacy entirely out the window.
We usually write it out in hex form, but that's literally what the bytes in RAM look like. In a proof of work validation system, you take some base value (the "challenge") and a rapidly incrementing number (the "nonce"), so the thing you end up hashing is this:
await sha256(`${challenge}${nonce}`);
The "difficulty" is how many leading zeroes the generated hash needs to have. When a client requests to pass the challenge, they include the nonce they used. The server then only has to do one sha256 operation: the one that confirms that the challenge (generated from request metadata) and the nonce (provided by the client) match the difficulty number of leading zeroes.
The other trick is that presenting the challenge page is super cheap. I wrote that page with templ (https://templ.guide) so it compiles to native Go. This makes it as optimized as Go is modulo things like variable replacement. If this becomes a problem I plan to prerender things as much as possible. Rendering the challenge page from binary code or ram is always always always going to be so much cheaper than your webapp ever will be.
I'm planning on adding things like changing out the hash in use, but right now sha256 is the best option because most CPUs in active deployment have instructions to accelerate sha256 hashing. This combined with webcrypto jumping to heavily optimized C++ and the JIT in JS being shockingly good means that this super naïve approach is probably the most efficient way to do things right now.
I'm shocked that this all works so well and I'm so glad to see it take off like it has.
I am sorry if this question is dumb, but how does proof of work deter bots/scrapers from accessing a website?
I imagine it costs more resources to access the protected website, but would this stop the bots? Wouldn't they be able to pass the challenge and scrape the data after? Or do normal scraper bots usually time out after a small amount of time/resources is used?
There are a few ways in which bots can fail to get past such challenges, but the most durable one (ie. the one that you cannot work around by changing the scraper code) is that it simply makes it much more expensive to make a request.
Like spam, this kind of mass-scraping only works because the cost of sending/requesting is virtually zero. Any cost is going to be a massive increase compared to 'virtually zero', at the kind of scale they operate at, even if it would be small to a normal user.
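Back-of-the-envelope, with made-up numbers rather than Anubis's defaults: if the difficulty requires four leading zero hex digits, a client needs about 16^4 ≈ 65,000 hash attempts on average per challenge. That's a fraction of a second for one human visitor, but multiplied across millions of scraped pages it becomes real compute the scraper has to pay for.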
> I think ideally you'd want work that was easy for the server and difficult for the client.
That's exactly how it works (easy for server, hard for client). Once the client has completed the Proof-of-Work challenge, the server doesn't need to complete the same challenge, it only needs to validate that the result checks out.
Similar to how in Proof-of-Work blockchains where coming up with the block hashes is difficult, but validating them isn't nearly as compute-intensive.
This asymmetric computation requirement is probably the most fundamental property of Proof-of-Work, Wikipedia has more details if you're curious: https://en.wikipedia.org/wiki/Proof_of_work
Fun fact: it seems Proof-of-Work was used as a DoS preventing technique before it was used in Bitcoin/blockchains, so seems we've gone full circle :)
I think going full circle would be something like bitcoin being created on top of DoS prevention software and then eventually DoS prevention starting to use bitcoin. A tool being used for something, then something else, then the first something again is just... nothing? Happens all the time?
I'm commissioning an artist to make better assets. These are the placeholders that I used with the original rageware implementation. I never thought it would take off like this!
At this rate, it's more than FOSS infrastructure -- although that's a canary in the coalmine I especially sympathize with -- it's anonymous Internet access altogether.
Because you can put your site behind an auth wall, but these new bots can solve the captchas and imitate real users like never before. Particularly if they're hitting you from residential IPs and with fake user agents like the ones in the article -- or even real user agents because they're wired up to something like Playwright.
What's left except for sites to start requiring credit cards, Worldcoin, or some equally depressing equivalent.
We're half way there already. It always hits me whenever I am doing some mapping for OpenStreetMap and I'm looking up local businesses without their own internet presence. They use Facebook, Instagram, X, etc. for their digital calling card. I normally don't use Facebook (or Instagram, and gave up on X) and have no account there, and every time I follow one of those links, you get some info, and then you get a dialogue screen telling you to make an account or get lost, or you just get some obscure error.
I don't mind registering an account for private communities, but for stuff which people put up thinking it is just going to be publicly visible it's really annoying.
> ... but for stuff which people put up thinking it is just going to be publicly visible ...
I don't think these business owners really understand. Most normies just think everyone has a Facebook/Instagram account and can't even imagine a world where that is not the case.
I agree with you that it is extremely frustrating.
>Most normies just think everyone has a Facebook/Instagram account and can't even imagine a world where that is not the case.
The people without a basic internet presence aren't likely to be customers anyway so it's not a huge loss. It's trivial to setup a basic account for any site that doesn't contain any personal data you want to keep hidden, if you aren't willing to do that, you're in a tiny minority.
> It's trivial to setup a basic account for any site that doesn't contain any personal data you want to keep hidden
It's equally trivial for a restaurant to set up a custom domain with their own 2-page website (overview and menu) on any of a hundred platforms that provide this service.
Most of these services are not free like FB, but any business that can afford a landline phone can afford a real website.
>It's equally trivial for a restaurant to set up a custom domain with their own 2-page website (overview and menu) on any of a hundred platforms that provide this service.
Sure but they don't want to. If you want to see the menu they have online you need to follow their rules, not your own.
I don't have a Facebook or Instagram account, but I definitely eat tacos, and I was put off when I couldn't see a new taco place's opening hours without an Instagram account.
I'm not sure why you think people who don't have a Facebook account wouldn't eat at restaurants.
You're in a tiny enough minority that it doesn't matter to them. It's like Amish complaining that they can't use a drive-thru window or something. Except it'd take you 30 seconds, one time, to solve your problem forever.
> Except it'd take you 30 seconds, one time, to solve your problem forever.
And those 30 seconds are a harrowing pit of despair out of which comes the rest of your life filled with advertisements, tracking, second-guessing, and accusations of being a hypocrite.
To be fair, they've been shown to still track unauthenticated users via fingerprinting and mapping it to known data from your friends who do have Facebook and upload this data (phone numbers, first, last name, etc). NOT having an account doesn't mean you aren't being tracked.
Not that it means you should just make an account to make their tracking easier...
I haven't had any of that just from having an FB and IG account. I honestly forgot I had an IG account for a long time until someone shared something with me and I realized I had one.
There's no "basic Internet presence" for an individual. If you think it through, every attempt to make that a thing has wound up being MySpace, Facebook, or the next dumpster.
> What's left except for sites to start requiring credit cards, Worldcoin, or some equally depressing equivalent.
Just to say the quiet part out loud here: one of the biggest reasons this is depressing is that it's not only vandalism, but vandalism with huge compounding benefits for the assholes involved -- and grabbing the content is just the beginning. If they take down the site forever due to increasing costs? Great, because people have to use AI for all documentation. If we retreat from captchas and force people to put in credit cards or telephone numbers? Great, because the internet is that much less anonymous. Data exfiltration leads to extra fraud? Great, you're gonna need AI to combat that. It's all pretty win-win for the bad actors.
People have discussed things like the balkanization of the internet for a long time now. One might think that the threat of that and/or the fact that it's such an unmitigated dumpster fire already might lead to some caution about making it worse! But pushing the bounds of harassment and friction that people are willing to deal with is moot anyway, because of course they have no real choice in the matter.
I dunno. I run a small browser-game, and while my server has been periodically getting absolutely pulverized by LLM scrapers, I have yet to see a single new account that looks remotely like it was created by a bot. (Also, the rate of new signups hasn't changed notably.) This is true for both the game and its Wiki—which is where most of the scraping traffic has been. (And which I will almost certainly have to set to be almost-completely authwalled if the scraping doesn't let up.)
You don't need to put your stuff behind an auth wall; you could just as easily require an anonymous micropayment for each request.
That we live in an internet where getting too many visitors is an existential crisis for websites should tell you that our internet is not one that can survive long.
Back when search engines caused this, the industry made an agreement and designed the robots.txt spec precisely to avoid legal frameworks being made to stop them. Because of that, no legal frameworks were made.
Now there's a new generation of hungry hungry hippo indexers that didn't agree to that and who feel intense pressure from competition to scoop up as much data as they can, who just ignore it.
Legislation should have been made anyway, and those that ignore robots.txt blocked / fined / throttled / etc.
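For context, robots.txt is nothing more than a plain text file of voluntary rules served at the site root, which is exactly why it has no teeth without goodwill or legislation behind it. Something like this (the disallowed paths are just illustrative; GPTBot is OpenAI's published crawler name, and Crawl-delay is a non-standard extension that only some crawlers honour):

  # robots.txt -- purely advisory; compliant crawlers read it, abusive ones ignore it
  User-agent: GPTBot
  Disallow: /

  # hypothetical expensive endpoints we'd rather nobody crawl
  User-agent: *
  Disallow: /blame/
  Disallow: /commit/
  Crawl-delay: 10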
Unethical behaviour always has a huge advantage over ethical behaviour, that's nothing new and pretty much by definition. The only way to prevent a race to the bottom is to make the unethical behaviour illegal or unprofitable.
I don't know if "robots.txt" appearing in congressional record really counts. Do any of the decision makers appear to have a command of what the file does? Or do they typically relegate to industry professionals, as they often do?
How would legislation in the US or EU stop traffic from China or Thailand or Russia? At best you'd be fragmenting the internet, which isn't really a "best", that's a terrible idea.
This is the key point, but if US laws are being violated and AI is considered part of national security, that could be used by the US government in international negotiations, and as justification for sanctions, etc. It would be a good deterrent.
I was also under attack recently [0]. The little Forgejo instance where I host my code (of several open source packages so it needs to be open) was run into the ground and the disk was filled with generated zip archives. I'm not the only one who has suffered the same fate. For me, the attacks subsided (for now) when I banned Alibaba Cloud's IP range.
If you are hosting a Forgejo instance, I strongly recommend setting DISABLE_DOWNLOAD_SOURCE_ARCHIVES to true. The crawlers will still peg your CPU but at least your disk won't be filled with zip files.
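If it helps, this is roughly what that looks like in app.ini -- assuming the setting still sits under the [repository] section as it does in Gitea's config; check your instance's docs:

  ; app.ini -- disable on-demand source archive generation
  [repository]
  DISABLE_DOWNLOAD_SOURCE_ARCHIVES = true

Then restart the service for it to take effect.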
Hm, so it's a cache then? Requesting the same tarball 100 times shouldn't create 100 zip files if they're cached, and if they aren't cached they shouldn't fill up the disk.
They are a cache, but you can generate them for each branch, tag, and commit in at least three different formats... Now imagine you have a repo with several thousand commits.
Yeah, fair point. Although I think the only real uniqueness here is commits; tarballs generated from branches and tags are ultimately the same as the ones generated from the commits they reference. But I still agree with your overall point.
Or perhaps switch to well-engineered software actually properly designed to be served on the public Internet.
Clearly generating zip files, writing them fully to disk and then sending them to the client all at once is a completely awful and unusable design, compared to the proper design of incrementally generating and transmitting them to the client with minimal memory consumption and no disk usage at all.
The fact that such an absurd design is present is a sign that most likely the developers completely disregarded efficiency when making the software, and it's thus probably full of similar catastrophic issues.
For example, from a cursory look at the Forgejo source code, it appears that it spawns "git" processes to perform all git operations rather than using a dedicated library and while I haven't checked, I wouldn't be surprised if those operations were extremely far from the most efficient way of performing a given operation.
It's not surprising that the CPU is pegged at 100% load and the server is unavailable when running such extremely poor software.
Just noting that the archives are written to disk on purpose, as they are cached for 24 hours (by default). But when you have a several thousand commit repository, and the bots tend to generate all the archive formats for every commit…
But Forgejo is not the only piece of software that can have CPU intensive endpoints. If I can't fence those off with robots.txt, should I just not be allowed to have them in the open? And if I forced people to have an account to view my packages, then surely I'd have close to 0 users for them.
Well then such a cache obviously needs a limit on the disk space it uses and some sort of cache replacement policy. If one can generate a zip file for each tag, the total disk space of the cache is O(n^2) where n is the disk usage of the git repositories (imagine a single repository where each commit is tagged and adds a new file of constant size). So unless your total disk space is a million or a billion times larger than the space used by the repositories themselves, such a cache is guaranteed to fill the disk without a limit.
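To make the point concrete, a size-bounded cache with LRU eviction is a few dozen lines. A rough Python sketch of the policy being argued for (not Forgejo's actual code, which is Go and structured differently):

  import os
  from collections import OrderedDict

  class BoundedArchiveCache:
      """Keep generated archives under a total byte budget, evicting least-recently-used first."""

      def __init__(self, directory, max_bytes=10 * 1024**3):  # e.g. a 10 GiB budget
          self.directory = directory          # must already exist
          self.max_bytes = max_bytes
          self.entries = OrderedDict()        # filename -> size in bytes
          self.total = 0

      def add(self, filename, data):
          path = os.path.join(self.directory, filename)
          with open(path, "wb") as f:
              f.write(data)
          self.entries[filename] = len(data)
          self.entries.move_to_end(filename)
          self.total += len(data)
          self._evict()

      def get(self, filename):
          if filename in self.entries:
              self.entries.move_to_end(filename)  # mark as recently used
              return os.path.join(self.directory, filename)
          return None

      def _evict(self):
          # Drop the oldest archives until we're back under budget.
          while self.total > self.max_bytes and self.entries:
              victim, size = self.entries.popitem(last=False)
              os.remove(os.path.join(self.directory, victim))
              self.total -= size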
The big takeaway here is that Google's (and advertisement in general) dominance over the web is going away.
This is because the only way to stop the bots is with a captcha, and this also stops search indexers from indexing your site. This will result in search engines not indexing sites, and hence providing no value anymore.
There's probably going to be a small lag as the knowledge in the current LLMs dries up, because no one can scrape the web in an automated fashion anymore.
I actually envision Lyapunov stability, like wolf and rabbit populations. In this scenario, we're the rabbits. Human content will increase when AI populations decrease, thus providing more food for AI, which will then increase. That drowns out human expression, and the humans grow quieter. This provides less fodder for the AI, and they decrease. That means less noise, and the humans grow louder. The cycle repeats ad nauseam.
I've thought along similar lines for art: what ecological niches are there where AI can't participate, where training data is harder or uneconomical to pull, and where humans can flourish?
Agreed, it seems inevitable. Unfortunately I think it will also result in further centralization & consolidation into a handful of "trusted" megacorps.
If you thought browser fingerprinting for ad tracking was creepy, just wait until they're using your actual fingerprint.
Google is already scraping your site and presenting answers directly in search results. If I cared about traffic (hence selling ad space), why would I want my site indexed by Google at all anymore? Lots of advertising-supported sites are going to go dark because only bots will visit them.
It will entrench established search engines even more if they have to move to auth-based crawling, so that the only crawlers will be those you invite. Most people will do this for Google, Bing, and maybe one or two others if there is a simple tool to do so.
This couldn't be further from the truth. The ad business is not going anywhere. It will grow even bigger.
OpenAI is going through the initial cycle of enshittification. Google is too big right now. Once they establish dominance, you will have to see 5 unskippable ads between prompts, even on a paid plan.
I solved this problem for myself. Most of my web projects use client-side processing. I moved to GitHub Pages, so clients can use my projects with no downtime. The pages use SQLite as the data source: the browser first downloads the SQLite database, then uses it to display data client-side.
The stated problem was about indexing, accessing content and advertising in that context.
> I solved this problem for myself. Most of my web projects use client-side processing. I moved to GitHub Pages, so clients can use my projects with no downtime. The pages use SQLite as the data source: the browser first downloads the SQLite database, then uses it to display data client-side.
That is not really a solution. Since typical indexing still works for the masses, your approach is currently unusual. But in the end, bots will be able to read anything on a web page that a human can read, and we're back to the original problem of trying to tell bots apart from humans. That's the only way.
What about the next generation of AI that could sign up autonomously? Even if we implemented auth walls everywhere right now, what's stopping the companies from hiring some really cheap labor to create accounts on websites and use them to scrape the content?
Is it going to become another race like the adblocker -> detect adblocker -> bypass adblocker detector and so on...?
Can we not just have a whitelist of allowed crawlers and ban the rest by default? Then places like DuckDuckGo and Google can provide a list of IP addresses that their crawlers will come from. Then simply don't include major LLM providers like OpenAI.
How do you distinguish crawlers from regular visitors using a whitelist? As stated in the article, the crawlers show up with seemingly unique IP addresses and seemingly real user agents. It's a cat and mouse game.
Only if you operate on the scale of Cloudflare, etc. you can see which IP addresses are hitting a large number of servers in a short time span.
(I am pretty sure the next step is that they will hand out N free LLM requests per month in exchange for user machines doing the scraping, if blocking gets more successful.)
I fear the only solution in the end are CDNs, making visits expensive using challenges, or requiring users to log in.
How are the crawlers identifying themselves? If it's user agent strings then they can be faked. If it's cryptographically secured then you create a situation where newcomers can't get into the market.
This sort of positive security model with behavioural analysis is the future. We need to get it built into Apache, Nginx, Caddy, etc. The trick is telling crawlers apart from users. It can be done, though.
Or an open, regularly updated list of IPs identified as belonging to AI companies, which firewalls could easily pull in? (Same idea as open source AV.)
> Or an open, regularly updated list of IPs identified as belonging to AI companies, which firewalls could easily pull in? (Same idea as open source AV.)
I don't really know about this proposal; the majority of bots are going to be coming from residential IPs the minute you do this.[1]
[1] The AI SaaS will simply run a background worker on the client to do their search indexing.
AI is good at solving captchas. But even if everyone added a captcha, search engines would continue indexing, because it is easy to add authentication that lets search engines skip the captcha; Google would just need to publish a public key.
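In fact, something along these lines already exists for crawl verification: the big honest crawlers can be checked with forward-confirmed reverse DNS, no key publishing needed. A rough Python sketch -- the hostname suffixes are from memory, so double-check them against the vendors' docs before relying on this:

  import socket

  TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")  # verify against vendor docs

  def is_verified_crawler(ip):
      """Forward-confirmed reverse DNS: the PTR record must match a trusted suffix,
      and that hostname must resolve back to the same IP."""
      try:
          host, _, _ = socket.gethostbyaddr(ip)           # reverse lookup
      except OSError:
          return False
      if not host.endswith(TRUSTED_SUFFIXES):
          return False
      try:
          forward_ips = socket.gethostbyname_ex(host)[2]  # forward confirmation
      except OSError:
          return False
      return ip in forward_ips

  # print(is_verified_crawler("66.249.66.1"))  # an IP in a published Googlebot range, for illustration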
This is fine, as Google's utility as a search engine has turned into a hot pile of garbage, at least for my cases. Where a decade ago I could put in a few keywords and get relevant results, I now have to guide it with several "quoted phrases" and -exclusions to get the result I'm looking for on the second or third result page. It has crumbled under its own weight, and seems to suggest irrelevant trash to me first and foremost because it's the website of some big player or content farm. Either their algorithm is tuned for mass manipulation or they lost the arms race with SEO cretins (or both).
Granted, I'm not looking forward to some LLM condensing all the garbage and handing me a Definitive Answer (TM) based on the information it deems relevant for inclusion.
In case anyone is interested in a tiny bit of sabotage, I am under the impression I managed to 'drown' true information on my microblog by generating contradicting posts with LLaMa (tens of them for each real post) and invisibly linking them, so a human would not click through.
You know, flood the zone with s***, Bannon-style ...
This is an approach I've seen used and I'm not sure what success it has had. But logically it seems sound: explicitly reference paths that no human would actually see; traffic hitting those paths is bot traffic. They can't help themselves.
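A minimal sketch of the idea with Flask (the path name is made up, and a real deployment would push the bans out to the firewall or fail2ban rather than keep them in the app):

  from flask import Flask, abort, request

  app = Flask(__name__)
  banned_ips = set()  # in practice you'd feed these to fail2ban/nftables instead

  @app.before_request
  def reject_banned():
      if request.remote_addr in banned_ips:
          abort(403)

  # Linked invisibly from pages (e.g. display:none) and disallowed in robots.txt,
  # so no human and no polite crawler should ever request it.
  @app.route("/totally-not-a-trap/<path:anything>")
  def honeypot(anything):
      banned_ips.add(request.remote_addr)
      abort(404)  # look boring to the bot

  @app.route("/")
  def index():
      return "hello"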
Temporary solution, and it only works if only some of us are doing it. What if these bots have a "manager" LLM agent that decides which pages to scrape?
When I read this yesterday, I was contemplating whether one possible way to mitigate this at a larger scale would be for websites to create random virtual paths/endpoints that drive the bot into a locally served Library of Babel[0] that poisons the spiders with lots of useless text.
It won't work for well-structured sites where the bots know the exact endpoint they want to scrape, but might slow down the more exploratory spider threads.
Even though I agree with what you're doing in principle, I feel it's necessary to remind and warn everyone here that sabotaging bots could be viewed as a violation of laws such as the US's Computer Fraud and Abuse Act[1]. I mean, unless the Second Amendment is suddenly interpreted to include cyberweapons.
Insane. I wonder if we eventually end up with a non-search-engine-indexed version of the web that's more like browsing in the 90s, where websites just had to link to one another to get noticed...
I love that the solution to LLM scraping is to serve the browser a proof of work before allowing access. I wonder if new sites will start to do this... It would mean they won't be indexed by search engines, but it would help protect the IP.
Or, alternatively, just embrace the anime-porn end of the content spectrum. I mean, just compare platforms that are free of it with ones that are chock full of it, and see which ones die and which ones grow.
Sure, if you're going to deploy it on your company site, but I think if you're running a personal website and want to throttle LLM crawlers without falsely advertising that you're a furry, you could just go and modify this piece of MIT-licensed software.
Does the PoW make money via crypto mining? Or is it just to waste the caller's CPU cycles? If you could monetize the PoW then you could re-challenge at an interval tuned so that the caller pays for their usage.
It's to waste CPU cycles. I don't want to touch cryptocurrency with a 20 foot pole. I realize I'm leaving money on the table by doing this, but I don't want to alienate the kinds of communities I want to protect.
By doing PoW as a side effect of something you need to do anyway for other reasons, you actually make mining less profitable for other miners, which helps eliminate waste.
This is an aspect that a lot of PoW haters miss. While PoW is a waste, there are long-term economic incentives to minimize it, either by making it a side effect of something actually useful or by using energy that would go to waste anyway, so its overall effect gravitates toward neutral.
Unfortunately such second-order effects are hard to explain to most people.
I always felt like crypto is nothing but speculating on value with no other good uses, but there is a kind of motivation here.
Say a hash challenge gets widely adopted, and scraping becomes more costly, maybe even requires GPUs. This is great, you can declare victory.
But what if after a while the scraping companies, with more resources than real users, are better able to solve the hash?
Crypto appeals here because you could make the scrapers cover the cost of serving their page.
Ofc if you’re leery of crypto you could try to find something else for bots to do. xkcd 810 for example. Or folding at home or something. But something to make the bot traffic productive, because if it’s just a hardware capability check maybe the scrapers get better hardware than the real users. Or not, no clue idk
I never thought about it until now, but it's insane that the companies who offer both LLM products and cloud compute services are double dipping: they get the LLM product to sell, as well as the elevated-load egress (and compute, etc.) money. When you look at it that way, where's the incentive to even care about inefficient LLM scraping? Leaving it terrible makes you money from your other empire, via cloud egress charges.
We need a project in the spirit of Spamhaus to actively maintain a list of perpetrating IPs. If they're cycling through IPs and IP blocks I don't know how sustainable a CAPTCHA-like solution is.
Just block all of AWS, Alibaba, GCP and Azure, or throttle them aggressively. If you have clients/customers that need more requests per second then have them provide you with their IPs.
The problem is that these companies are fairly well funded and renting infrastructure isn't an issue.
Exactly. They're renting infrastructure on well-known clouds, not cycling through consumer IPs like yesterday's botnets. Block all web traffic from well-known cloud IPs, and you can keep 99% of the LLM bots away. Alibaba seems to be the most common source of bot traffic on my infrastructure lately, and I also see Huawei Cloud from time to time. Not much AWS, probably because of their high IPv4 pricing.
You can allow API access from cloud IPs, as long as you don't do anything expensive before you've authenticated the client.
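A rough sketch of the range check with Python's ipaddress module -- the CIDRs below are placeholders only; the real ones come from each provider's published feeds (e.g. AWS's ip-ranges.json):

  import ipaddress

  # Placeholder ranges -- load the real ones from each provider's published feed.
  CLOUD_RANGES = [ipaddress.ip_network(net) for net in (
      "47.74.0.0/15",   # example Alibaba Cloud range
      "3.0.0.0/9",      # example AWS range
  )]

  def is_cloud_ip(addr):
      ip = ipaddress.ip_address(addr)
      return any(ip in net for net in CLOUD_RANGES)

  # In the request path, something like:
  # if is_cloud_ip(request.remote_addr) and not request.path.startswith("/api/"):
  #     return "Too many requests", 429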
“…they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses - mostly residential, in unrelated subnets, each one making no more than one HTTP request over any time period we tried to measure - actively and maliciously adapting and blending in with end-user traffic and avoiding attempts to characterize their behavior or block their traffic.”
So it looks like much of the traffic, particularly from China, is indeed using consumer IPs to disguise itself. That's why they blocked based on browser type (MS Edge, in this case).
This matches exactly with what I'm seeing on my own sites too and it's from all over the world, not just China.
(I described my bot woes a few weeks ago at https://news.ycombinator.com/item?id=43208623. The "just block bots!" replies were well-intentioned but naive -- I've still found no signal that works reliably well to distinguish bots from real traffic.)
I saw a fair amount of that kind of behavior, too, mostly around the summer of last year. At some point it dropped off sharply. Over the last few months, at least for the servers I keep an eye on, most of the trouble has been from Chinese cloud IPs.
Either the LLM devs got more funding, or maybe the authorities took down the botnet they were using.
Because while this is clearly related to spam, it's not the same thing, and presumably if Spamhaus themselves felt it was within their wheelhouse, they'd already be doing it.
This sounds backwards to me. If you maintain a list of IPs but they are constantly cycling them, it'll get out of date quickly, whereas a captcha-like system will (hopefully) always stop bot traffic.
While some of the residential IPs are from malware, a lot of it is from residential IP proxies, where people are paid to run proxy software from their home. If it starts getting around that people who run this software quickly become blocked by the majority of the internet that will lessen that part of the problem.
Only if your CAPTCHA-like is hurled at every client indiscriminately. Otherwise you'll end up right back where Spamhaus started: maintaining your own list of good and bad actors.
The advantage of a third party service is that you're sharing intel of bad actors.
> According to Drew, LLM crawlers don't respect robots.txt requirements and include expensive endpoints like git blame, every page of every git log, and every commit in your repository. They do so using random User-Agents from tens of thousands of IP addresses, each one making no more than one HTTP request, trying to blend in with user traffic.
How do they know that these are LLM crawlers and not anything else?
As someone who is also affected by this: we see a manifold increase in requests since this LLM crap started. Many of these IPs come from companies that obviously work with LLM technology, but the problem is that it's 100s of IPs doing 1 request, not 1 IP doing 100s of requests. It's just extremely unlikely that anyone else is responsible for this.
> IPs come from companies that obviously work with LLM technology
Like from their own ASNs you're saying? Or how are you connecting the IPs with the company?
> is that it's 100s of IPs doing 1 request
Are all of those IPs within the same ranges or scattered?
Thanks a lot for taking the time to talk about your experience btw, as someone who hasn't been hit by this it's interesting to have more details about it before it eventually happens.
> How do they know that these are LLM crawlers and not anything else?
I can tell you what it looks like in the case of a git web interface like cgit: you get a burst of one or two isolated requests from a large number of IPs, each for very obscure (but different) URLs, like file contents at a specific commit ID, with the user agent suggesting it's coming from an iPhone or Android.
It's a situation where it's difficult to tell for individual requests at request handling time, but easy to see when you look at the total request volume.
I had a similar issue with a Gitea instance that has some public repos. Gitea doesn't follow best practices for a web application and the action to create an archive of a repo is an HTTP GET request instead of an HTTP POST. AI crawlers were hitting that link over and over again causing the server to repeatedly run itself out of disk space.
God... I literally just coded up three different honeypots for this exact problem on my site, https://golfcourse.wiki, because LLM scrapers are a constant problem. I added a stupid reCAPTCHA to the sign-up form after literally 10,000 fake users were created by bots, averaging about 50 per day, and I have to say, reCAPTCHA was surprisingly cumbersome to set up.
It's awful and it was costing me non-trivial amounts of money just from the constant pinging at all hours, for thousands of pages that absolutely do not need to be scraped. Which is just insane, because I actively design robots.txt to direct the robots to the correct pages to scrape.
So far so good with the honeypots, but I'll probably be creating more and clamping down harder on robots.txt to simply whitelist instead of blacklist. I'm thinking of even throwing in a robots honeypot directly in sitemap.xml that should bait robots to visit when they're not following the robots.txt.
> They do so using random User-Agents from tens of thousands of IP addresses,
"tens of thousands" ? I think not:
% sudo fail2ban-client status gitbots | more
Status for the jail: gitbots
|- Filter
| |- Currently failed: 0
| |- Total failed: 573555
| `- File list: /var/log/nginx/gitea_access.log
`- Actions
|- Currently banned: 78671
|- Total banned: 573074
I see this going the way of email, with larger, well-known, and more well-behaved crawlers being allowed to index websites for free, and smaller, unknown crawlers suffering brownouts, getting banned, or having to pay for access. It will be harder to self-host a website or run your own crawler.
Yes this does remind me of the old spam wars of early 2000s. Back then collaborative block lists were useful to reject senders at IP level before using a Bayesian system on the message itself.
Even though these bots are using different IPs with each request, that IP may be reused for a different website, and donating those IPs to a central system could help identify entire subnets to block.
Another trick was “tar-pitting” suspect senders (browser agent for example) to slow their message down and delay their process.
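The same trick translates to HTTP: trickle the response out to suspect clients so the connection ties up the bot instead of your backend. A toy sketch with Flask's streaming responses (the "suspicious" check is a placeholder for whatever signal you trust):

  import time
  from flask import Flask, Response, request

  app = Flask(__name__)

  def looks_suspicious(req):
      # Placeholder heuristic -- in reality this would come from your own signals.
      return "curl" in req.headers.get("User-Agent", "").lower()

  @app.route("/article")
  def article():
      body = b"<html><body>" + b"Some page content. " * 200 + b"</body></html>"
      if not looks_suspicious(request):
          return body

      def trickle():
          # Send the page a few bytes at a time with long pauses: a tarpit.
          for i in range(0, len(body), 64):
              yield body[i:i + 64]
              time.sleep(2)

      return Response(trickle(), mimetype="text/html")

The obvious caveat: with synchronous workers this also ties up one of your own threads per connection, so it only really makes sense behind an async server.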
I think the solution to this problem is the same as with scammers and is an analog one.
Bust the kneecaps of all the people responsible for those crawlers. Publicly. And all of them: from the person entering the command to the CEO of the company going through all the middle management. You did not go against this policy? Intact kneecaps are a privilege which just got revoked in your case.
Thanks for sharing. If I understood correctly, you have rate-limited specific URLs (those with commit IDs) that are infrequently requested by users but frequently by bots. Which means, provided the bots keep requesting them, any user request will most likely end up being denied. In this case a simpler solution might be to just block such URLs outright. The only advantage of your more complex solution that I can see is that if the bots stop requesting these URLs, they become accessible to normal users again. Or am I missing something?
My guess after reading the same -- the bot traffic comes in bursts and targets a specific commit hash for a while. Users are unlikely to need that specific commit, and even less likely to need it at the same time a bot is bursting requests for it. There's probably a small risk of denying a real user, but there's a large reduction in traffic from the bots making it to git; a worthwhile trade.
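Something like a per-resource burst counter, in other words. A toy Python sketch of that idea (thresholds invented for illustration):

  import time
  from collections import defaultdict, deque

  WINDOW_SECONDS = 60
  BURST_THRESHOLD = 30     # invented: >30 hits on one commit URL per minute smells like a bot swarm
  COOLDOWN_SECONDS = 600

  hits = defaultdict(deque)   # url -> timestamps of recent requests
  blocked_until = {}          # url -> unix time when it becomes servable again

  def should_serve(url):
      now = time.time()
      if blocked_until.get(url, 0) > now:
          return False
      window = hits[url]
      window.append(now)
      while window and window[0] < now - WINDOW_SECONDS:
          window.popleft()
      if len(window) > BURST_THRESHOLD:
          blocked_until[url] = now + COOLDOWN_SECONDS
          return False
      return True

Which matches the trade-off above: the occasional real user asking for that exact commit during a burst gets refused, but the expensive backend stays up.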
From reading Drew DeVault's angry post from earlier this week, my take is that it's not only poorly implemented crawlers; it's also that it's cheaper to scrape than to keep copies on hand. Effectively these companies are outsourcing the storage of "their" training data to everyone on the internet.
Ideally a site would get scraped once, and then the scraper would check if content has changed, e.g. etag, while also learning how frequently content changes. So rather than just hammer some poor personal git repo over and over, it would learn that Monday is a good time to check if something changed and then back off for a week.
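Conditional requests make that nearly free for both sides. A sketch with the Python requests library (the URL is a placeholder):

  import requests

  url = "https://example.org/some/page"   # placeholder

  first = requests.get(url)
  etag = first.headers.get("ETag")
  last_modified = first.headers.get("Last-Modified")

  # On the next visit, ask "has this changed?" instead of re-downloading.
  headers = {}
  if etag:
      headers["If-None-Match"] = etag
  if last_modified:
      headers["If-Modified-Since"] = last_modified

  second = requests.get(url, headers=headers)
  if second.status_code == 304:
      print("unchanged -- nothing to fetch, nearly free for the server")
  else:
      print("changed -- re-process the new content")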
That seems crazy - millions of $ on GPUs but they can't afford some cheap storage? And direct network scraping seems super high latency. Although I guess a massive pretraining run might cycle through the corpus very slowly. Dunno, sounds fishy.
Or, could it be, just possibly be (gasp), that some of the devs at these "hotshot" AI companies are just ignorant or lazy or pressured enough, so as to not do such normal checks? Wouldn't be surprised if so.
You think they do cache the data but don't use it?
For what it's worth, mj12bot.com is even worse. They pull down every wheel every two or three days, even though something like chemfp-3.4-cp35-cp35m-manylinux1_x86_64.whl hasn't changed in years - it's for Python 3.5, after all.
>You think they do cache the data but don't use it?
That's not what I meant.
And it is not "they", it is "it".
I.e. the web server, not the bots or devs on the other end of the connection, is what tells you the needed info. All you have to do is check it and act accordingly, i.e. download the changed resource or skip the unchanged one.
It's that not doing so means they can increase their profit numbers just a skosh more.
And at least as long as they haven't IPOed, that number's the only thing that matters. Everything getting in the way of increasing it is just an obstacle to be removed.
You are correct it's poor and sloppy, but it's not "just" that. It's a lack of concern over the effects of their poor/sloppy crawler implementation.
The poor implementation is not really relevant; it's companies deciding they own the internet and can take whatever they want, and letting everyone else deal with the consequences. The companies do not care what the impact of their AI nonsense is.
It’s people that don’t care if they ruin things for everyone else.
Crawlers have existed forever in search engine space and mostly behave.
This sort of no rate limit, fake user agent, 100s of IPs approach used by AI teams is obviously intentionally not caring who it fucks over. More malicious than sloppy implementation
It is an ecosystem of social roles, not just "people". Casting the decision into individual choices is not the right filter for understanding this situation.
I'm not sure I'm following what you mean by 'social roles'. Which roles are you referring to here?
I'll disagree that it's not at least individual malicious choice, though. Someone decided that they needed to fake/change user agents (as one example), and implemented it. Most likely it was more than one person- some manager(s)/teams probably also either suggested or agreed to this choice.
I would like to think at some point in this decision making process, someone would have considered 'is it ethical to change user agents to get around bans? Is it ethical to ignore robots.txt?' and decided not to proceed, but apparently that's not happening here...
Yet in these cases mentioned in the article, if they had a static HTTP cache of each page, refreshed by git hooks, the bots' load on their services would be negligible. That is assuming the bots use HTTP on 80/443 instead of the git protocol on 9418.
Sounds like it to me. Why build a crawler that fetches one HTML page per commit in a repository instead of doing a bare clone and then just getting the data from there? Surely that would contain even more data, compared to the HTML pages.
And poor, sloppy website implementation. If your professional website can't handle 20k hits, it's... well, poor. Because my website, hosted on my desktop PC over my home connection, tanked 20k hits from the Alibaba bot (among a few thousand more of normal traffic) yesterday without missing a beat.
It is literally the point of public websites to answer HTTP requests. If yours can't you're doing something wrong.
This. Not only mangle the content: flood the bot with tailored misinformation, and with things that are illegal in their jurisdiction but not yours.
They will never respect you, but the second they notice this hurts their business more than it gains them, they will stop.
Yeah, I've had to mitigate this problem too, on a tiny forum serving a very niche audience. It was brought to its knees by a handful of LLM bots that completely ignored robots.txt.
Thankfully, these bots were easy enough to block at the firewall level, but that may not work forever.
This is probably a dumb question, but have they tried sending abuse reports to hosting providers? Or even lawsuits? Most hosting providers take it seriously when their client is sending a DoS attack, because if they don't, they can get kicked off the internet by their provider.
To be clear, this is not an attack in the deliberate sense, and has nothing to do with AI except in that AI companies want to crawl the internet. This is more "FOSS sites damaged by extreme incompetence and unaccountability." The crawlers could just as well be search engine startups.
As a matter of fact, yes. As a matter of cause, no. Being an AI company doesn't make these companies especially incompetent; rather, this is the normal tech company level of incompetence, and being an AI company causes it to externalize via crawling.
Bing used to do the same thing. (It might still do it, I just haven't heard about it in a while.)
It’s not incompetence. These large AI companies have (or can hire) the competency to engineer proper crawlers. This is deliberate due to lack of accountability and “who’s gonna stop them?”
In the same way a drunk driver isn't deliberately trying to run over pedestrians, I suppose. I think gross negligence is in many ways worse than malice. A malicious actor can at least be somewhat reasoned with.
Correct, they could -- but they are not. This is about the unaccountability, and if I were charitable (which I'm not) I'd also add the incompetence, of the techbros leading the AI giants. Are we still expecting anything like "ethics"? I hope the few engineers reading HN still have some, but the higher you go, the more foreign the concept gets.
Sure, I just think focusing it on AI companies misses the reason this happens. It's not an "AI company problem", it's a "tech company problem". It just happens that AI companies are the tech companies that externalize their incompetence with crawlers at this point in time.
You're getting it the wrong way around. It's: any crawler that's not well engineered, that doesn't follow robots.txt, that fakes its User Agent, that doesn't allow you to contact them, that fetches content an indiscriminate number of times, repeatedly, all day long... can do this to your infrastructure unless you're a giant.
What these crawlers are doing is akin to DDoS attacks.
Please do explain how you'd engineer a site to deal with barrage of poorly written scrapers descending upon it. After you've done geo-ip routing, implemented various levels of caching, separated read/write traffic and bought an ever increasing amount of bandwidth, what is there left to do?
You could also get Cloudflare, or some other CDN, but depending on your size that might not be within your budget. I don't get why the rest of the internet should subsidize these AI companies. They're not profitable, live off venture capital, and increase the operating costs of everyone else.
And you just know they'll gladly bill you for egress charges for their own bot traffic, too.
EDIT: Actually, this is an excellent question. By default, these bots would likely appear to come from "the internet" and thus be subject to egress charges for data transfers. Since all three major cloud providers also have significant interests in AI, wouldn't this be a sort of "silent" price increase, or a form of exploitive revenue pumping? There's nothing stopping Google, Microsoft/OpenAI, or Amazon from sending an army of bots against your sites, scraping the data, and then stiffing you with the charges for their own bots' traffic. Would be curious if anyone has read the T&Cs of their own rate cards closely enough to see if that's the case, or has proof in their billing metrics.
---
Original post continues below:
One topic of conversation I think worth having in light of this is why we still agree to charge for bandwidth consumed instead of bandwidth available, just as a general industry practice. Bits are cheap in the grand scheme of things, even free, since all the associated costs are for the actual hardware infrastructure and human labor involved in setup and maintenance -- the actual cost per bit transmitted is so infinitesimally small that it's impractical to bill for.
It seems to me a better solution is to go back to charging for capacity instead of consumption, at least in an effort to reduce consumption charges for projects hosted. In the meantime, I'm 100% behind blocking entire ASNs and IP blocks from accessing websites or services in an effort to reduce abuse. I know a prior post about blocking the entirety of AWS ingress traffic got a high degree of skepticism and flack from the HN community about its utility, but now more than ever it seems highly relevant to those of us managing infrastructure.
Also, as an aside: all the more reason not to deploy SRV records for home-hosted services. I suspect these bots are just querying standard HTTP/S ports, and so my gut (but NOT data - I purposely don't collect analytics, even at home, so I have NO HARD EVIDENCE FOR THIS CLAIM) suggests that having nothing directly available on 80/443 will greatly limit potential scrapers.
So the idea is each request would require the client to pay a toll of some amount of work towards mining a cryptocurrency? That's actually brilliant. I'd take this over ads any day. But I do see a few problems...
1. Using the web would become much more compute/energy intensive and old devices would quickly lose access to the modern web.
2. Some hosts would inevitably double-dip by implementing this and ads or by "overcharging" the amount of work. There would have to be some kind of limit on how much work can be required by hosts - or at least some way to monitor and hold hosts accountable for the amount of work they charge.
3. There would need to be a cheap and reliable way to prove the client's work was correct and accurate. Otherwise people will inevitably find a way to spoof the work in order to reduce their compute/energy cost.
Interesting. Basically moving the proof-of-work off the user's phone and to a dedicated mine. Websites could just have a lightning wallet or something and auto-charge the user 1e-7 bitcoin to access the page.
There are off-chain solutions to handle most of the payment, and only put a summary on-chain. I think there are already micropayments in Brave or something.
Proof-of-work crypto is interesting here because it is fungible with computation, so these solutions that charge computation to users are literally equivalent to crypto.
It's a solution that already has adoption, does not require everyone to sign up with a centralized service, and does not require everyone to pay money (they can pay with small amounts of computation instead) so it remains accessible to ~everyone.
Yes, sites could use a bot protection service that runs captcha breaking AIs on the viewer's browser. Said bot protection service could then break captchas for forum spammers to make real money.
I was thinking this would involve farming the mining (the energy intensive part) out to clients. That basically just means they have to do sha hashes at some difficulty. The good thing is if you do 10 hashes at difficulty 5 you'd expect one to also pattern match difficulty 6, so I expect even low-difficulty hashing will eventually result in a block mine.
Of course it isn't very secure, because if the client sees a mined block they might have the technical savvy to keep it. But you'd be forcing big web scrapers to run a horribly inefficient mining operation, and they'd hate it. Plus you can run a blacklist of hated clients and double the difficulty for them, which is very low-cost for false positives and very high-cost for real scrapers -- that isn't a result of using Bitcoin, but it'd be funny.
The days of an open web are long gone. Every server will eventually have to require authentication for access, and to get an account you will have to provide some form of payment or social proof.
Honestly, I don't see it necessarily as a bad thing.
If that was piggy-backed on ActivityPub, Matrix, Solid, or something else decentralised, and if I could say "this bot is acting as my agent, if it misbehaves then I personally get blocked" then there could be something in this. I don't see how to get around artificial identity farms though. That's also not something that payment or social proof fixes. If payment isn't trivial then you exclude genuine people; if it's the act of interacting with a payment processor that's being taken as proof-of-existence then it's outsourcing the ability to interact with anything in the modern world to Visa and Mastercard. That's bad. Social proof is also problematic because if your business is to run an identity farm, then having all your identities interact in legible ways isn't hard, so the social proof needs to be grounded in something global, and there are approximately no good choices.
It doesn't have a global solution and it doesn't need to be implemented only on a specific technology-based system.
I mean, at Communick I offer Matrix, Mastodon, Funkwhale and Lemmy accounts only to paying customers. As such, I have implemented payments via Stripe for convenience, but that didn't stop me from getting customers who wanted to pay directly via crypto, SEPA and even cash. It also didn't stop me from bypassing the whole system and giving my friends and family accounts directly.
Right, so the gap I'm seeing is that anyone who wants to identity farm just does what you've done. The problem isn't the assurance that you receive, it's the assurance you give to anyone else that any of your customers are flesh-and-blood.
I'm talking about social proof as in "You are a student of the city university, so you get an account at the library", "Julie from the book reading group wanted an account at our Bookwyrm server, so I made an account for her" or even "Unnamed customer who signed up for Cingular Wireless and was given an authorization code to access Level 2 support directly".
This is being naive about the kinds of gatekeeping and social proof occurring today. I fully believe you didn't intend to mention social proof to be racist, but with people like Zuck and Elon removing DEI, being racist is social proof you belong in their elite club.
> This is being naive about the kinds of gatekeeping and social proof occurring today
You are taking one thing I said (service providers will require some form of payment or social proof to give credentials to people who want to access the service), assumed the worst possible interpretation (people will only implement the worst possible forms of social proofing), and to top it off you added something else (gatekeeping) entirely on your own.
I can not dictate how you interpret my comment, but maybe could you be a bit more charitable and assume positive intent when talking with people you never met?
AI as it currently exists is a massive blight on society. The best things being done with AI are cute side projects, whereas the most common usages are low-quality thefts of human copyrighted work without attribution.
20 years ago the fear of AI was that it would take over the world and try to kill us. Today we can clearly see that the threat of AI is the amoral humans that control it.
I was also thinking about VPNs but the static copy still has to serve a lot of traffic so I don't know if that's an economically viable solution. Furthermore it creates a market for VPN credentials, but that's another issue. At least I expect that a bot with sold or stolen credential will be easier to discover.
Anyway, why not git clone the project and parse it locally instead of scraping the web pages? I understand that scraping works on every kind of content, but given the scale, a git clone plus periodic git fetch could save money even for the scrapers.
Finally, all of this reminds me of Peter Watts's Maelstrom, where viruses infested the Internet so much (at about this point in history) that nobody was using it anymore [1]
Can IPFS or torrent and large local databases decentralised by people be a solution to this? I personally have the resources to share and host TBs of data but didn't find a good use to it.
For that to work, a website has to push a mirror into that alternate system, and the scraper has to know the associated mirror exists.
That's two big "ifs" for something I'm not aware of a standardized way of announcing. And the entire thing crumbles as soon as someone who wants every drop of data possible says "crawl their sites anyway to make sure they didn't forget to publish anything into the 2nd system."
I doubt it, as the article mentions scraping the same resource after just 6 hours. AI companies want to make sure they have fresh data, while it would be hard to keep such a database updated.
I was inspired by https://en.wikipedia.org/wiki/Hashcash, which was proof of work for email to disincentivize spam. To my horror, it worked sufficiently for my git server so I released it as open source. It's now its own project and protects big sites like GNOME's GitLab.
That's cool! What if instead of SHA-256 you used one of those memory-hard functions like scrypt? Or is SHA needed because it has a native implementation in browsers?
Right now I'm using SHA-256 because this project was originally written as a vibe sesh rage against the machine. The second reason is that the combination of Chrome/Firefox/Safari's JIT and webcrypto being native C++ is probably faster than what I could write myself. Amusingly, supporting this means it works on very old/anemic PCs like PowerMac G5 (which doesn't support WebAssembly because it's big-endian).
I'm gonna do experiments with xeiaso.net as the main testing ground.
Interesting idea. Seems to me it might be possible to use with a Monero mining challenge instead, for those low real traffic applications where most of the requests are sure to be bots.
I'm curious if the PoW component is really necessary, AIUI untargeted crawlers are usually curl wrappers which don't run Javascript, so requiring even a trivial amount of JS would defeat them. Unless AI companies are so flush with cash that they can afford to just use headless Chrome for everything, efficiency be damned.
Sadly, in testing the proof of work is needed. The scrapers run JS because if you don't run JS the modern web is broken. Anubis is tactically designed to make them use modern versions of Firefox/Chrome at least.
They really do use headless chrome for everything. My testing has shown a lot of them are on Digital Ocean. I have a list of IP addresses in case someone from there is reading this and can have a come to jesus conversation with those AI companies.
Use judo techniques. Use their own computing power against them: fake links leading to randomly generated Markov bullshit, until their cache gets poisoned past the point of no return; the LLMs begin to either forget their own stuff or hallucinate once their input is basically fed from other LLMs (or themselves).
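The Markov part is genuinely cheap to produce. A toy sketch in Python (the seed corpus file is hypothetical, and whether poisoning actually hurts training at scale is very much unproven):

  import random
  from collections import defaultdict

  def build_chain(text):
      """Map each word to the words observed to follow it."""
      chain = defaultdict(list)
      words = text.split()
      for a, b in zip(words, words[1:]):
          chain[a].append(b)
      return chain

  def babble(chain, length=80):
      word = random.choice(list(chain))
      out = [word]
      for _ in range(length - 1):
          followers = chain.get(word)
          word = random.choice(followers) if followers else random.choice(list(chain))
          out.append(word)
      return " ".join(out)

  seed = open("real_posts.txt").read()   # hypothetical corpus of your real writing
  chain = build_chain(seed)
  print(babble(chain))   # fluent-looking nonsense to hide behind invisible links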
What would those 5 lines of code look like? The basis of this solution is that it asks the client to solve a computationally intensive problem whose solution, once provided, isn't computationally intensive to check. How would those 5 lines of code change this?
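For reference, the whole scheme really is small. A minimal hashcash-style sketch of the asymmetry (not Anubis's actual code; parameters invented):

  import hashlib
  import secrets

  DIFFICULTY_BITS = 20  # invented: ~a million hashes expected for the client, one hash for the server

  def meets_difficulty(digest, bits):
      value = int.from_bytes(digest, "big")
      return value >> (256 - bits) == 0   # require `bits` leading zero bits

  def solve(challenge, bits=DIFFICULTY_BITS):
      """Client side: expensive brute force."""
      counter = 0
      while True:
          digest = hashlib.sha256(challenge + str(counter).encode()).digest()
          if meets_difficulty(digest, bits):
              return counter
          counter += 1

  def verify(challenge, counter, bits=DIFFICULTY_BITS):
      """Server side: a single hash."""
      digest = hashlib.sha256(challenge + str(counter).encode()).digest()
      return meets_difficulty(digest, bits)

  challenge = secrets.token_bytes(16)   # server issues a fresh random challenge
  answer = solve(challenge)             # client burns CPU
  assert verify(challenge, answer)      # server checks it in microseconds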
I see this as a temporary problem. A human brain can be trained on way less than the entire corpus of everything humans have ever written. Ultimately, this will apply to LLMs (or whatever succeeds them) too.
There's another aspect to this too: China and DeepSeek. While this was released by a private company, I think there's a not-insignificant chance that it reflects Chinese government policy to "commoditize your complements" [1]. Companies like OpenAI want to hide their secret sauce so it can't be reproduced. Training an LLM is expensive. If there are high-quality LLMs out there for free that you can just download, then this moat completely evaporates.
Traditionally, holders of IP ranges that attack the internet at large get kicked off the internet by having those ranges blacklisted everywhere. This can also get them in serious trouble with the places they got their IP ranges (I assume AWS has them directly from ARIN, so maybe not) and their upstream bandwidth providers and so on, as well as making them less attractive hosts because they are blocked everywhere.
That's actually an argument in favour of kicking AWS off the Internet. We rely too much on their services, to the point we're afraid of banning their IPs if they do something bad. Better stop this now than being worse off later. The best moment would have been ten years ago, the second best moment is today.
Or even worse, lots of them are using barely legal residential proxies so the requests are coming from everywhere. In Drew DeVault's article linked in this post he complained precisely about the residential-looking source IP addresses [0]. And I think I remember something about a Chinese company, some months ago, very aggressively scraping using that method.
Companies like DataImpulse [1] or ScraperAPI [2] will happily advertise their services for exactly that purpose.
Across my sites -- mostly open data sites -- the top 10 referrers are all bots. That doesn't include the long tail of randomized user agents that we get from the Alibaba netblocks.
At this point, I think we're well under 1% actual users on a good day.
Can someone with more experience developing AI tools explain what these bots are mostly doing? Are they collecting data for training, or are they for the more recent search functionality? Or are they enhancing responses with links?
AI expert here. It's probably for collecting training data and the crawlers are probably very unsupervised. I'd guess that they're literally the most simplistic crawler code you can imagine combined with parallelism across machines.
The good news is that it's easy to disrupt these crawlers with some easy hacks. Tactical reverse slowloris is probably gonna make a comeback.
If it's for training data, why are they straining FOSS so much? Are there thousands of actors repeatedly building training data all the time? I thought it was a sort of one-off thing with the big tech players.
Git forges are some of the worst case for this. The scrapers click on every link on every page. If you do this to a git forge, it gets very O(scary) very fast because you have to look at data that is not frequently looked at and will NOT be cached. Most of how git forges are fast is through caching.
The thing about AI scrapers is that they don't just do this once. They do this every day in case every file in a glibc commit from 15 years ago changed. It's absolutely maddening and I don't know why AI companies do this, but if this is not handled then the git forge falls over and nobody can use it.
Anubis is a solution that should not have to exist, but the problem it solves is catastrophically bad, so it needs to exist.
That's very strange to me that they do it everyday. I thought training runs took months. Do they throw away the vast majority of their training attempts (e.g. one had suboptimal hyperparameters, etc)?
It's going to get to the point where everything will be put behind a login to prevent LLM scrapers scanning a site. Annoying but the only option I can think of. If they use an account for scraping you just ban the account.
Logins are more easily banned, and highly complex captchas at signup need a human to sign up and solve. As long as it's easier to get banned than it is to sign up, it will at least be a deterrent.
Copyright the content and sue those who use it for AI training. I believe there is a lot of low-hanging fruit for lawyers here. I would be surprised if they weren't preparing to hit OpenAI and the like. Very badly. Google got away with its deep-linking issues because publishers, after all, had some interest in being linked from the search engine; here publishers see zero value.
Whatever happened to terms of use and EULAs? These big companies use them against individuals all the time, so why can't small sites state in their Terms of Use, or put up a EULA stating, that no crawling/copying is allowed? Shouldn't that open up avenues to sue?
Even if the lawyer services are provided pro-bono, it is still a large time commitment and added stress for the non-lawyer people involved for cases that aren't guaranteed to win.
Because reckless and greedy AI operators not only endanger FOSS projects, they threaten to collapse the freely accessible internet as a whole. Sooner or later, we will need to fight for our freedom and our rights as individual humans against rogue AI and the overwhelming power of the mega-corporations, just as we need to fight against the concentration of content behind corporate gates today.
And I don't see any other way than going the legal route against these operators. They don't give a sh*t about the little humans, nor about copyright and other legal regulations.
I do wonder if it's a customer on Alibaba Cloud that's doing all this, or if it's entirely Alibaba's own doing that's been wreaking havoc on everyone's sites (mine included).
I don't really like blocking an entire ASN, especially since I don't mind (responsible) crawling to begin with, but I was left with no choice
Right, then the claim of this being caused by "AI companies" feels a lot more dubious, because at least this time the perpetrator has just been this one customer on Alibaba Cloud, and we don't actually know what they're up to.
Niccolò here, I'm really sorry about that -- I'm using a weird tooling system to handle articles, which currently has issues with links. I'm working to fix that asap.
I wonder if there's a way for a human-controlled browser to generate some kind of token that proves it's being used by a human?
Obviously there's still ways to pay people to run the browser but it would be nice for this activity to cost the AI company something without blocking actual people.
LLM bots are doing a great job of stress testing infra, so if you are running abominations like Gitlab or any terribly coded site and you are exposing it to the internet, you are just asking for trouble. If anything, Gitlab should stop pumping bloat and focus on some performance, because it's really bad.
I would hope FOSS projects would stick to something like Forgejo, although I am not sure about the state of its CI/CD. My guess is that they are 85% of the way there with 1/10 of GitLab's resources.
On the other side are of course badly coded bots that are aggressively trying to download everything. This was happening before LLMs and it just increased significantly because of them. I think we will reach a tipping point soon and then we will just assume those bots are just another malicious actor (like regular DDOS), and we will start actively taking them down, even with help of law enforcement.
The last thing I wanna see is 3-second bot challenges on every single site I visit; cookie banners are more than enough of a nightmare already.
Would it be possible to use security flaws in their crawler to make them do very expensive things? In the case of China, illegal (anti CCP) things? I don't think many of those companies are too rigorous about their cybersecurity, and cybersecurity is expensive per se anyway.
To me it sounds like these people are operating websites that don't work. My website, hosted on my home internet connection (80 down, 5 up), handled 20k+ hits from the Alibaba AI crawler yesterday without missing a beat. And many thousands more from GPTBot, etc.
I'll grant it can be a problem for super-heavy "application" websites where every GET is a serious computation. So I'm not surprised gitlab is having problems. They've literally the most bloated and heaviest website I've ever seen. Maybe applications shouldn't be websites.
But this spreading social hysteria, this belief that all non-humans are dangerous and must be blocked is a nerd snipe. It really doesn't apply to most situations. Just running a website as we've always run them, public, and letting all user-agents access, is much less resource intensive than these various "mitigations" people are implementing. Mitigations which end up being worse than the problem itself in terms of preventing actual humans from reading text.
There's source repository browsers (git/svn) way, way leaner than GitLab that have the same issues. Any repo browser offering a blame view for files can be brought down by those bots' traffic patterns. I have been hosting such repository browsers for 10+ years and it was never an issue until the arrival of these bots.
Indeed. It's really exposing a major downside to running applications in browser context. It never really made sense. These applications really don't want public traffic like actual websites do. They should remain applications and stay off the web. But more likely is that the web will be destroyed to fit the requirements of the applications. Like what cloudflare, etc, and all this anti-bot social hysteria is doing.
Alternatively, be OK with the fact that anything you put in public can be used for anything and by anyone, regardless of licensing and laws. A pirate's dream basically.
Personally, when I first got connected to the internet around 1999, that was the approach I adopted, and I've followed it since: I don't share things I'm not OK with others using for whatever they want.
I wonder how long until everyone gets a cryptographic public key and has to sign every HTTP request to not be blocked. Or every site requires login to use. And social media and things like bug reporting all requiring real ID.
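For what it's worth, the signing half of that is already cheap to do today. A rough sketch of what per-request signing could look like with an Ed25519 keypair and Python's cryptography package; the header scheme and the signed string format here are made up for illustration, not any existing standard:

    # Sketch: sign an HTTP request with Ed25519 (hypothetical header scheme).
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    private_key = Ed25519PrivateKey.generate()   # the client keeps this
    public_key = private_key.public_key()        # the server learns this at signup

    def sign_request(method: str, path: str, date: str) -> bytes:
        # Sign a canonical string covering the parts the server will check.
        return private_key.sign(f"{method} {path} {date}".encode())

    def verify_request(method: str, path: str, date: str, signature: bytes) -> bool:
        try:
            public_key.verify(signature, f"{method} {path} {date}".encode())
            return True
        except InvalidSignature:
            return False

    sig = sign_request("GET", "/repo/blame/main.c", "Tue, 01 Apr 2025 12:00:00 GMT")
    assert verify_request("GET", "/repo/blame/main.c", "Tue, 01 Apr 2025 12:00:00 GMT", sig)

The hard part isn't the crypto, it's who hands out and revokes the keys.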
I wonder how free hosting services like netlify and vercel handle this? If a free tier user gets spammed do they just pass the cost on? I can't imagine that is the case so they must have some protections built in.
What's to stop someone putting some Terms of Service clause on their site, or creating a license which guarantees the site owners ownership of any content generated by scraping their site?
With RL agents being the new frontier, I wouldn't be surprised if some post training runs now include multiple simultaneous calls to the web. That could be part of the flood.
Might be time for a class-action lawsuit. Something like that could work out really well for the little guy, as it would probably make a dent in the LLM companies' pockets and access to data.
From another angle, it's actually good in the long run: AI-generated content is not copyrightable, which means they won't really own the models, as they can and will be distilled by other AIs, making them a public good. So maybe instead of complaining and trying to fight it, just accept the new reality that anything that can be accessed will be accessed by whatever, and instead we should just have APIs for everything, with rate limits and different tiers.
These are DDoS attacks and should be treated in law as such. (Although I do realise that in many countries we no longer have any effective "rule of law".)
At some point it's easier to geoblock a whole country at the firewall level and loginwall the rest of the world, rather than trying to explain that in your jurisdiction, which is not their jurisdiction, what they are doing is a crime — which they don't give a single fuck about.
lol. Tell that to Nicolae and Elena Ceaușescu, Saddam Hussein, Muammar Gaddafi, and all those other tinpot nobjockeys who thought their money and influence would save them from "nice" people.
Don't immediately assume that OpenAI and Anthropic are being better citizens just because they label some of their bots. The American companies are more likely to be extremely aggressive, because they will feel no consequences for their actions.
In all likelihood all of these assholes are paying some unscrupulous suppliers for the data, so the terabytes of traffic aren't immediately attributable to them.
> there's one user reporting one minute delay, and another - from his phone
> out of those only 3% passed Anubi's proof of work, hinting at 97% of the traffic being bots
This doesn't follow. If I open a link from my phone and it shows a spinner and gets hot, I'm closing it long before it gets to one minute and maybe looking for a way to contact the site's maintainer to tell them how annoying it was.
I wonder if the future is for honest crawlers to do something like DKIM to provide a cheap cryptographically verifiable identity, where reputation can be staked on good behavior, and to treat the rest of the traffic like it's a full fledged chrome instance that had better be capable of solving hashcash challenges when traffic gets too hot.
It's a shitty solution, but as it stands the status quo is quite untenable and will eventually have cloudflare as a spooky MITM for all the web's traffic.
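For the hashcash half of that, the mechanics really are simple. A toy sketch below; the difficulty, the challenge format, and the nonce encoding are arbitrary choices, and the DKIM-style identity part is omitted entirely:

    # Sketch of a hashcash-style challenge: the client must find a nonce whose
    # SHA-256 hash of (challenge + nonce) starts with enough zero bits.
    import hashlib, os

    DIFFICULTY_BITS = 20  # tune per client reputation / current load

    def leading_zero_bits(digest: bytes) -> int:
        bits = 0
        for byte in digest:
            if byte == 0:
                bits += 8
                continue
            bits += 8 - byte.bit_length()
            break
        return bits

    def solve(challenge: bytes) -> int:
        nonce = 0
        while True:
            digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if leading_zero_bits(digest) >= DIFFICULTY_BITS:
                return nonce
            nonce += 1

    def verify(challenge: bytes, nonce: int) -> bool:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return leading_zero_bits(digest) >= DIFFICULTY_BITS

    challenge = os.urandom(16)       # server issues this per request
    nonce = solve(challenge)         # client burns CPU here
    assert verify(challenge, nonce)  # server-side check is a single hash

The asymmetry is the point: verification is one hash, solving is millions, and you can raise the difficulty only for traffic you don't recognise.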
There's one thing worse than a block page being shown to humans because an algorithm decided they're a bot: purposefully false information being shown to humans
It's also just wasting more of the planet's resources as compared to blocking
And it takes more effort, with the only upside being that it's not immediately obvious to the bot that it is being blocked, so it'll suck in more of your pages
I understand that people are exploring options but I wouldn't label this as a solution to anything as of today's state of the art
It'll still trickle down to people using the system and waste people's time (from development, to the people working to produce everything these companies use and mopping up the impact, to eventually the users), but that would definitely resolve one of my concerns
Honestly reminds me of the early email spam wars - clever attempts at blocking, endless whack-a-mole with spoofed addresses. Problem is, now it's hitting open source projects we love. Maybe the AI gold rush is accidentally strangling the very communities that made its own existence possible.
Adaptation - adapt or die. Find a business model that can sustain itself, without the naivety that people will pay for what they can take without consequence.
I am self-hosted (email + web), I quit DNS (registrars are now mostly hostile to noscript/basic (x)html browsers anyway), and I thought it would give me some relief...
Nope.
You don't have only the AI crawlers; you also have scans and hack attempts (which look like script-kiddy stuff), all the time. Some smell of AI strapped to JavaScript web engines (or click farms with real humans???).
Smart: IP ranges from all over the world, and "clouds" make that even worse, since the pwned systems and bad actors (the guys who scan the whole IPv4 internet for its own good AND MANY SELL THE F* SCAN DATA: onyphe, stretchoid.com, etc.) are "moving". In other words, clouds are protecting those guys and weaponizing hackers with their massive network resources, wrecking small hosting. No cloud is spared: AWS, Microsoft, Google, OVH, ucloud.cn, etc.
I send good vibes to the brave small hosts of open source software (as long as they stay noscript/basic (x)html compatible, of course).
Many fixed-IPv4 pwned systems have been referenced by security communities, often for months, sometimes years, and the people with the right leverage don't seem to do a damn thing about it.
Currently I wonder if I should not just block all DigitalOcean IP ranges... and I was about to do the same with the ucloud.cn IP ranges.
The second you host anything on the net, it WILL take a significant amount of your time. Presume you will be pwned; that's why security communities reference each other too.
Then I am thinking of going towards two types of "hosting". First, private IPv6+port ("randomized" for each client, maybe transient in time depending on the service) thanks to those /64 prefixes (maybe /92 prefixes are a thing for mobile internet?). Yes, this is complicated and convoluted. Second, a 'standard' permanent IP, but with services implemented in a _HARDCORE_ simple way, if possible near 100% static. I am thinking of going even further: assembly on bare metal, a custom kernel based on hand compilation of Linux code (RISC-V hardware of course, FPGA for bigger hosting?).
I don't think anything will improve unless carrier scale network operators start to show their teeth.
Outside of disruptive measures like requiring accounts, captchas, or payment, one possible solution would be to use AI against itself by training machine learning models to monitor, flag, and issue challenges to web requests exhibiting "bot-like" behavior. This way, not all web traffic would be disrupted with challenges until the machine learning models have reason to believe the traffic is coming from a bot.
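Something like this doesn't even need a big model to start with. A toy sketch below, with hand-picked features and weights standing in for a real trained classifier; everything here is illustrative, not production logic:

    # Sketch: score a client's recent behaviour and only challenge likely bots.
    # Features and weights are hand-tuned placeholders, not a trained model.
    import math

    def bot_score(requests_per_min: float, robots_txt_fetched: bool,
                  distinct_expensive_paths: int, reuses_cookies: bool) -> float:
        """Return a 0..1 'probably a bot' score via a hand-tuned logistic."""
        z = (0.15 * requests_per_min
             + 0.30 * distinct_expensive_paths   # blame/diff/log views, etc.
             - 2.0 * robots_txt_fetched          # polite crawlers fetch it
             - 1.5 * reuses_cookies              # real browsers keep sessions
             - 3.0)                              # bias toward "human"
        return 1.0 / (1.0 + math.exp(-z))

    def should_challenge(score: float, threshold: float = 0.8) -> bool:
        return score >= threshold

    print(bot_score(requests_per_min=300, robots_txt_fetched=False,
                    distinct_expensive_paths=40, reuses_cookies=False))

In practice you would fit the weights on labelled traffic, but the shape of the solution is the same: only the suspicious tail of traffic ever sees a challenge.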
Almost no one pays attention to 429 for general Web pages. E.g., on an RSS file, Googlebot speeds up repolling. Amazon's RSS feed puller does pay some attention to 429.
503 is at least apparently understood by more crawlers/bots, but they still like to blame the victim: YouTube sends me a condescending (and inaccurate) email when it gets a 503 for ignoring cache headers and other basics it seems...
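For contrast, honouring those codes on the crawler side is a handful of lines. A rough sketch using the third-party requests package; the user agent, delays, and retry counts are placeholders:

    # Sketch: a crawler that actually honours 429/503 and Retry-After.
    import time
    import requests

    def polite_get(url, max_tries=5):
        delay = 5.0
        for _ in range(max_tries):
            resp = requests.get(url, headers={"User-Agent": "ExampleBot/1.0"})
            if resp.status_code not in (429, 503):
                return resp
            # Honour Retry-After if present, otherwise back off exponentially.
            # (Retry-After can also be an HTTP date; this sketch assumes seconds.)
            retry_after = resp.headers.get("Retry-After")
            time.sleep(float(retry_after) if retry_after else delay)
            delay *= 2
        return None  # give up instead of hammering the server

That the big operators don't bother with even this much is the whole complaint.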
As I already noted and got downvoted for it: the incentives to go open source or support it (unless the AI bros start releasing the troves of training data they have already scraped) are really diminishing.
Since the scrapers apparently have no intent of doing so (releasing it), expect commercial open source to achieve de facto protocol status very soon. And the rest may not exist in such a centralised and free manner anymore.
The only thing that really annoys me about this is that the resources get scraped more than once. Sure, update a delta every month or whatever, but can't the AI bros get their sh*t together and share scrapes? It's embarrassing.
The special sauce is in parsing, tokenizing, enriching etc. There is no value in re-scraping, and massive cost, right?
I don't know about text models, but most of the big stock photo companies are offering image generators trained on their own libraries rather than scraped images. Adobe, Getty, Shutterstock, iStock, possibly more. They're positioning themselves as the safe option while the legality of scraping is still up in the air.
I don't think it's possible to develop a frontier model without mass scraping. The economics simply don't add up. You need at least 10 trillion tokens to make an 8 billion parameter model, and at roughly 4 bytes of text per token, 10 trillion tokens is something like 40 terabytes.
You simply can't get 40 terabytes of text without mass scraping.
Can we just have something that replaces the page with test datasets and Disney IP to poison the training data? Or maybe just embed it into the page itself, but hidden?
And Usenet, and IRC with a registered user prereq to join.
Also, set up AI tarpits as fake links with recursive calls. Make them mad with non-curated bullshit made from Markov chain generators until their cache begins to rot forever.
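A Markov tarpit really is about this much code. A toy sketch; the corpus file and URL scheme are placeholders, and you'd want to rate-limit it and keep it out of any path a real user could stumble into:

    # Sketch of a Markov-chain tarpit page: endless word salad plus
    # recursive fake links so the crawl never terminates.
    import random

    corpus = open("any_text_dump.txt").read().split()   # placeholder corpus
    chain: dict[str, list[str]] = {}
    for a, b in zip(corpus, corpus[1:]):
        chain.setdefault(a, []).append(b)

    def babble(n_words: int = 300) -> str:
        word = random.choice(corpus)
        out = [word]
        for _ in range(n_words):
            word = random.choice(chain.get(word) or corpus)
            out.append(word)
        return " ".join(out)

    def tarpit_page(depth: int) -> str:
        # Every generated page links to five more generated pages.
        links = "".join(f'<a href="/tarpit/{depth + 1}/{i}">more</a> '
                        for i in range(5))
        return f"<html><body><p>{babble()}</p>{links}</body></html>"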
This problem will likely only get worse, so I'd be interested to see how people adapt. I was thinking about sending data through the mail like the old days. Maybe we go back to the original Tim Berners-Lee Xanadu setup charging users small amounts for access but working out ISP or VPN deals to give subscribers enough credit to browse without issues.
Also I would argue that not having capitalist incentives baked directly into the network is what made the web work, for good or bad. Xanadu would never have gotten off the ground if people had to pay their ISP then pay for every website, or every packet, or every clicked link or whatever.
Reading the Xanadu page on Wikipedia tells me "Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies ("transclusions") of all or part of the document."
Oops, you're right! They claimed that Tim Berners-Lee stole their idea.
I agree that the lack of monetization was important to the development and that it would have been chaos as proposed, but will the current setup be sustainable forever in the world of AI?
We have projects like Ethereum that are specifically intended to merge payments and computing, and I wouldn't be surprised if at some point in the future, some kind of small access fee negotiated in the background without direct user involvement becomes a component of access. I wouldn't expect people to pay ISPs but rather some kind of token exchange to occur that would benefit both the network operators and the web hosts by verifying classes of users. Non-fungible token exchanges could be used as a kind of CAPTCHA replacement by cryptographically verifying users anonymously with a third-party token holder as the intermediary.
For example, let's say Mullvad or some other VPN company purchased a small amount of verification tokens for its subscribers who pay them anonymously for an account. On the other side, let's say a government requires people to register through their ISP, and the ISP purchases the same tokens on behalf of the user, and then exchanges the tokens on behalf of the user. In either case, the person can stand behind a third party who both sends them the data they requested and exchanges the verification tokens, which the site operator could then exchange for reimbursement of their services to their hosting provider.
This is just a high-level idea of how we might get around the challenges of a web dominated by bots and AI, but I'm sure the reality of it will be more interesting.
I hate AI as much as any reasonable person should, but I don't think money is a viable filter when governments and corporations will just throw as much money, legislation, and infrastructure at it as needed to render it irrelevant. They can just budget it in, or pass laws requiring privileged access.
Meanwhile, as profit motives begin to dominate (as they inevitably would), access to information and resources becomes more and more of a privilege than a right, and everything becomes more commercialized, faster.
I won't claim to have a better idea, though. The best solutions in my mind are simply not publishing anything to the web and letting AI choke on its own vomit, or poisoning anything you do publish, somehow.
Usenet, as far as I remember, used to be a fucking hell to maintain right. With each server having to basically mirror everything, it was a hog on bandwidth and storage, and most server software in its heyday was a hog on the filesystems of its day (you had to make sure you had plenty of inodes to spare).
The other day, I logged into Usenet via Eternal September, and found that it consisted of 95% zombies sending spam you could recognize from the start of the millennium. On one hand, it made me feel pretty nostalgic. Yay, 9/11 conspiracy theories! Yay, more all-caps deranged Illuminati conspiracies! Yay, Nigerian princes! Yay, dick pills! And an occasional on-topic message which strangely felt out of place.
On the other hand, I felt like I was in a half-dark mall bereft of most of its tenants, where the only places left are an 85-year-old watch repair shop and a photocopy service on the other end of the floor. On still another hand, it turns out I haven't missed much by not being on Usenet, as all-caps deranged conspiracy shit abounds on Facebook anyway.
I would welcome a modern replacement for Usenet, but I feel like it would need a thorough redesign based on modern connectivity patterns and computing realities.
Culturally, the modern replacement for Usenet is probably Reddit. Architecturally, probably something built on top of a federated protocol like ActivityPub (Mastodon) or Nostr (Lemmy).
But I guess realistically you can't fight entropy forever. Even Hacker News, aggressively moderated as it is, is slowly but irrevocably degrading over time.
Usenet wasn't that bad if you didn't take the binary groups.
> and found that it consisted of 95% zombies sending spam you could recognize from the start of the millennium
I like to imagine a forgotten server, running since the mid-90s, its owners long since imprisoned for tax fraud, still pumping out its daily quota of penis enlargement spam.
The distributed nature of git is fine until you want to serve it to the world - then, you're back to bad actors. They're looking for commits because it's nicely chunked, I'm taking a guess.
> They're looking for commits because it's nicely chunked, I'm taking a guess.
They're not looking for anything specifically from what I can tell. If that was the case, they would be just cloning the git repository, as it would be the easiest way to ingest such information. Instead, they just want to guzzle every single URL they can get hold of. And a web frontend for git generates thousands of those. Every file in a repository results in dozens, if not hundreds of unique links for file revisions, blame, etc. and many of those are expensive to serve. Which is why they are often put in robots.txt, so everything was fine until the LLM crawlers came along and ignored robots.txt.
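For comparison, the sane ingestion path costs the server one clone and then never touches it again. A rough sketch; the repository URL and paths are placeholders:

    # Sketch: ingest a repo with one clone plus local traversal,
    # instead of millions of hits on the web UI's per-file/per-revision pages.
    import subprocess

    def ingest(repo_url: str, workdir: str = "mirror.git") -> list[str]:
        # One network operation: a bare mirror clone.
        subprocess.run(["git", "clone", "--mirror", repo_url, workdir], check=True)
        # Everything else is local: list every commit...
        commits = subprocess.run(
            ["git", "-C", workdir, "rev-list", "--all"],
            capture_output=True, text=True, check=True,
        ).stdout.split()
        # ...and read any revision of any file without touching the server again,
        # e.g. git -C mirror.git show <commit>:README.md
        return commits

The fact that they scrape blame pages instead tells you nobody on their side is even looking at what the crawler does.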
The distributed nature of git lets me be independent of some central instance (you may decide that the master copy resides on Github, but with the advent of mesh VPNs like the ones Zerotier and Tailscale offer, you could also sidestep it and push/pull from your colleagues directly as well). It also lets me dictate who gets to access it.
What the article describes, though, is possibly the worst way a machine can access a git repository, which is using a web UI and scraping that, instead of cloning it and adding all the commits to its training set. I feel like they simply don't give a shit. They got such a huge capital injection that they feel they can afford to not give a shit about their own cost efficiency and that they go using the scorched earth tactics. After all, even their own LLMs can produce a naive scraper that wreaks havoc on the internet infrastructure, and they just let it loose. Got mine, fuck you all the way!
But then they will release some DeepSeek R(xyz), and yay, all the hackernews who were roasting them for such methods, will be applauding them for a new version of an "open source" stochastic parrot. Yay indeed.
Unpopular opinion - this isn't about LLMs, but how web development has devolved from the declarative serving of lightweight media files to the imperative generation of bloated and brittle SPAs that we never get free from babysitting.
Where we could have once wrapped our mostly static websites in Varnish or a scalable P2P cache like Coral CDN, now we must fiddle and twiddle with robots.txt and appeal to the goodwill of megacorps who never cared about being good netizens before, even when they weren't profiting from scraping to such a degree.
This is yet another chance for me to scream into the void that we're still doing this all wrong. Our sites should work more like htmx, with full static functionality, adding dynamic embellishment when available. Business logic should happen deterministically in one place on the backend or "serverless" with some kind of distributed consensus protocol like Raft/Paxos or a CRDT, then propagate to the frontend through a RESTful protocol, similarly to how Firebase or Ruby Hotwire/Laravel Livewire work. The way that we mostly all do form validation wrong in 2 places with 2 languages is almost hilariously tragic in how predictably it happens.
But the real tragedy is that the wealthiest and most powerful companies that could have fixed web development decades ago don't care about you. Amazon, Google and Microsoft would rather double down on byzantine cloud infrastructure than devote even a fraction of their profit to pure research into actually fixing all of this.
Meanwhile the rest of us sit and spin, sacrificing the hours and days and years of our lives building out other people's ideas to make rent. Many of us know exactly how to fix things, but with infinite backlogs and never truly exiting burnout, we're too tired at the end of the day to contribute to FOSS projects and get real work done. Our valiant quest to win the internet lottery has become a death march through a seemingly inescapable tragedy of the commons.
Instead of fixing the web at a foundational level from first principles, we'll do the wrong thing like we always do and lock everything down behind login walls and endless are-you-human/2FA challenges. Then the LLMs will evolve past us and wrap our cryptic languages and frameworks in human language to a level where even pair programming won't be enough for us to decipher the code or maintain it ourselves.
If I were the developer tasked with hardening a website against LLMs, the first thing I would do is separate the static and dynamic content. I'd fix most of the responses to respect standard HTTP cache headers. Then I'd put that behind the first Cloudflare competitor I could find that promises to never have a human challenge screen. Then I'd wrap every backend API endpoint in Russian doll caching middleware. Then I'd shard the database by user ID as a last resort, avoiding that at all costs by caching queries and/or using modern techniques like materialized views to put the burden of scaling on the database and scale vertically, or gradually migrate the heaviest queries to a document or column-oriented store. Better yet, move to a stronger store that's already solved all of these problems, like CouchDB/PouchDB.
Then I'd build a time machine to convince everyone to do things right the first time instead of building a tech industry upon unforced errors. Oh wait, former me already tried sounding the alarm and nobody cared anyway. How can I even care anymore, when honestly I don't see any way to get out of this mess on any practical timescale? I guess the irony is that only LLMs can save us now.
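If it helps anyone, the cache-header step above is usually the cheapest win. A minimal sketch of the idea, using Flask only as a stand-in framework; the paths, max-age values, and the cookie heuristic are guesses you would tune per site:

    # Sketch: make anonymous responses cacheable so a CDN absorbs crawler traffic.
    from flask import Flask, request

    app = Flask(__name__)

    @app.after_request
    def add_cache_headers(resp):
        if request.path.startswith(("/static/", "/raw/", "/blob/")):
            # Rarely-changing content: let the CDN and browsers keep it for a day.
            resp.headers["Cache-Control"] = "public, max-age=86400"
        elif request.method == "GET" and not request.cookies:
            # Anonymous GETs (which is what crawlers send) can be cached briefly.
            resp.headers["Cache-Control"] = "public, max-age=300"
        else:
            resp.headers["Cache-Control"] = "private, no-store"
        return resp

Once the anonymous traffic is cacheable, the crawlers mostly hit the CDN instead of your origin, whether or not they behave.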
robots.txt should allow excluding all AI crawlers, and AI crawlers should be forced to add "AI" to their crawler User-Agent headers and also respect a robots.txt that says they can't crawl the website
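Much of this exists already for the crawlers that publish a user-agent token. A sketch of both sides; GPTBot and CCBot are published token names as far as I know (other vendors vary), and example.org is a placeholder:

    # The opt-out side is a few lines of robots.txt, e.g.:
    #
    #   User-agent: GPTBot
    #   Disallow: /
    #
    #   User-agent: CCBot
    #   Disallow: /
    #
    # The compliance side needs only the Python standard library:
    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://example.org/robots.txt")
    rp.read()

    if rp.can_fetch("GPTBot", "https://example.org/expensive/blame/view"):
        pass  # fetch it
    else:
        pass  # skip it -- this is the entire cost of being a good citizen

The problem isn't the mechanism; it's the crawlers that don't identify themselves at all.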
They do respect robots.txt, at least the major ones like Meta, Claude (Anthropic), Google, and OpenAI. Based on my infra observations, robots.txt is enough in 90% of cases; the other 10% is just banning IP ranges for a couple of days, but those are not AI companies.
We blocked all non-EU IP ranges. Our customers are all from the EU, so we have no interest in the US or Asia. With the new US administration I assume nothing good from them, and Asia is a notorious bad player.
They are "no longer strong enough" to the extent that GitHub, backed by Microsoft, has had to indemnify their GitHub Copilot customers against copyright claims, and to provide a feature to explicitly prevent open source code from being regurgitated into your codebase.
That's not "no longer strong enough". That's a very strong system applying leverage to a powerful actor.
If we instead adopt the view of free software (https://www.gnu.org/philosophy/open-source-misses-the-point....), the fact that OpenAI and other large corporations train their large-language models behind closed doors - with no disclosure of their training corpus - effectively represents the biggest attack on GPL-licensed code to date.
No evidence suggests that OpenAI and others exclude GPL-licensed repositories from their training sets. And nothing prevents the incorporation of GPL-licensed code into proprietary codebases. Note that a few papers have documented the regurgitation of literal text snippets by large language models (one example: https://arxiv.org/pdf/2409.12367v2).
To me, this seems like the LLM-version of using coin-mixing to obscure the trail of Bitcoin transactions in the blockchain. The current situation also reminds me of how the generalization of the SaaS model led to the creation of the Affero GPL license (https://www.gnu.org/licenses/why-affero-gpl.html).
LLMs enable the circumvention of the spirit of free software licenses, as well as of the legal mechanisms to enforce them.
I absolutely agree with you that the current big LLMs enable an attack on all FOSS licenses and especially copyleft ones. That doesn't mean that one couldn't create LLM code generators in a respectful way. Do license analysis on the input code and then train separate models on the different license buckets, with the outputs from each model considered derivative works of the input corpus.
Also, I don't think a restriction on the FSF's freedom 1, "the freedom to study how the program works," based on what tools you use and how you use them fits with FSF philosophy, nor do I think it is appropriate. You should be able to run whatever analysis tools you have available to study the program. Being able to ingest a program into a local LLM model and then ask questions about the codebase before you understand it yourself is valuable. Or, if you aren't a programmer or aren't familiar with the language used, a local LLM could help you make the changes needed to add a new feature. In that situation LLMs can enable practical software freedom for those who can't afford to pay/convince a programmer to make the changes they want.
In addition, OpenAI clearly do not respect copyrights and licenses in general, so would ignore any anti-AI clauses, which would make them ineffective and thus pointless. So, I think we should tackle the LLM problem through the law, and not through licenses. That is already happening with various caselaw in software, writing, artwork etc.
BTW, LLMs could also in theory be used to licensewash proprietary software; see "Does free software benefit from ML models being derived works of training data?" by Matthew Garrett:
I see what you are saying and don't completely disagree. I however feel that the spirit of free software is to set all software free. From that it follows, that if we are going to follow the current route of complete disregard for authorship and licenses, then the free software movement should continue fighting to liberate all software in existence. In other words, those LLM's that you mention that are to enable software freedom for users who cannot code themselves, in a fair world, they would be trained with both free and proprietary software. After all, a derivative work from a proprietary software should also be subject to fair use. The output produced by the LLM wouldn't necessarily be a literal copy-paste of any particular proprietary software... as the models would just be "learning" from them. The company could just continue doing business as usual, build on their brand and yada yada yada.
Regarding the licensing, I'll restate my point that the Affero license was created precisely in a moment where the existing licenses could no longer uphold the freedoms that the Free Software Foundation set out to defend. A change of license was the right solution at that particular point in time and, if it worked then, I think we can all agree that there is at least a precedent that such a course of action might work and should at the very least be considered as a possible solution for today's problems.
That said, my own personal view is more aligned with demanding that nation states pressure big corporations so that currently closed-source software becomes at least open source (either by law, or simply by ceasing to use it and investing their budget in free alternatives instead). Note I said open source and not free. I just would like to read their code and feed it to my LLMs :)
On setting all software free: indeed, that's the point made in the post by mjg59. None of the AI companies train on their own proprietary software though, which is telling.
On Affero, that was indeed definitely needed, although some folks on HN seem to think that privately modifying code is allowed by copyright, even if the modified version is outputting a public website, thus what the license says is irrelevant. That seems bogus to me, but seems a loophole if it is legit. Anyway, personally I think that people should simply just never use SaaS, nor web apps. It also doesn't help with data portability.
I'd go further and advocate for legally mandated source code escrow for copyright validity, and GPL like rights to the code once public, which would happen if the software is off the market for N years.
> I'd go further and advocate for legally mandated source code escrow for copyright validity, and GPL like rights to the code once public, which would happen if the software is off the market for N years.
Huh? FOSS licenses work exactly as designed! I'm literally using MIT because I don't give a fuck what people do with the code I publish; limiting it to "humans" or restricting the usage makes it very much not FOSS.
Sure, if you want to try to prevent AI training by licensing, do that, but it's no longer FOSS, so please don't call it that.
I haven't seen any LLMs being able to reproduce full copies or even "substantial portions" of any existing software, unless we're talking "famous" functions like those from Quake and such.
You have any examples of that happening? I might have missed it
I know a lot of FOSS people are hostile to AI in general, and this is an immediate problem, but I feel like a better solution for everyone would be for there to be some sort of central repo of this information that AI companies can pull from without externalizing their costs like this.
Are you suggesting that everyone move their projects to a single code forge (GitHub)?
Also, isn't this basically just extortion? "I know you're minding your own business, FOSS maintainer, but move your code to our recommended forge instead so we can stop DDoSing you?"
Isn't this still similar to extortion? Maintainers aren't creating the problem. They are minding their own business until scrapers come along and make too many unnecessary requests. Seems like the burden is clearly on the scrapers. They could easily be hitting the pages much less often for a start.
Doesn't your suggestion shift the responsibility to likely under-sponsored FOSS maintainers rather than companies? Also, how do people agree to switch to some centralized repository and how long does that take? Even if people move over, would that solve the issue? How would a scraper know not to crawl a maintainer's site? Scrapers already ignore robots.txt, so they'd probably still crawl even if you verified you've uploaded the latest content.
Scrapers still have an economic incentive to do what is easiest. Providing an alternative that is easier than fighting sysadmin blocks would likely cause them to take the easier route and make it less of a cat and mouse game for sysadmins.
So I'll just float an idea again that always gets rejected here. This is yet another problem that could be solved completely by... Eliminating anonymity by default on the internet.
To be clear, you could still have anonymous spaces like Reddit where arbitrary user IDs are used and real identities are discarded. People could opt-in to those spaces. But for most people most of the time, things get better when you can verify sources. Everything from DDOS to spam, to malware infections to personal attacks and threats will be reduced when anonymity is removed.
Yes there are downsides to this idea but I'd like people to have real conversations around those rather than throw the baby out with the bath water.
>Yes there are downsides to this idea but I'd like people to have real conversations around those rather than throw the baby out with the bath water.
It's hard to have a serious conversation when you present a couple of upsides but completely understate/not mention the downsides.
Eliminating anonymity comes with real danger. What about whistleblowers and marginalized groups? The increased likelihood of targeted harassment, stalking, and chilling effects on free speech? The increase in surveillance? The reduction in content creation and legitimate criticism of companies/products/etc? The power imbalance granted to whoever validates identities?
pjc50 brings up some other great points, which got me thinking even more:
Removing anonymity creates a greater incentive to steal identities, has a slew of logistical issues (who/how are IDs verified, what IDs are accepted, what are the enforcement mechanisms and who enforces them, etc.), creates issues with shared accounts and corporate/brand accounts, would require cooperation across every country with internet access (good luck!) otherwise it doesn't really work, and probably a million other things if I keep thinking about it.
So in this scenario, whose real user ID would be used for the scrapers?
Doesn't this just create an even worse market for identity theft and botnets?
How does this apply to countries without a national ID system like the United States?
What do you do with an ID traced to a different country, anyway?
> personal attacks and threats will be reduced when anonymity is removed
People are happy to make death threats under their real name, newspaper byline, blue tick, or on the presidential letterhead if they're doing so from a position of power.
I mean, this has already happened, it just happened in a more sinister way than "FreedomNet now requires logins from all users!" Ad companies and social media track everything you do and can tie it together with various forms of packaged/bought/sold identity that follow you wherever you go. Even with aggressive ad blocking, I get ads on Instagram for things I looked up in a browser that I have never used to log into Insta with. We're constantly deanonymized, it just happens below the surface. And all of this is hoovered up by the US dragnet surveillance programs.
So do I support a fully authenticated internet? Fuck no. If we can get good at bot detection, zip bomb the fuckers. In the meantime, work as hard as we can to dismantle the hellscape that the internet has become. I'm all for decentralized, sovereign identity systems that aren't owned by some profiteering corpo cretins or some proto-fascist state, but I don't want it to be a requirement to look at photos of dogs or plan my next trip.
Such as living under constant logging. Which, you know (you know?), some people will radically refuse, with several crucial justifications. One of them is that privacy is a condition for Dignity. Another is Prudence. Another one is a warning millennia old, about who you should choose as a confidant. And more.
A few minutes of spitballing what implementations might look like create a number of problems that appear to make the idea a nonstarter. You should have a real proposal that explores the possibility space, say what the key requirements are, and assuage (or confirm) people's objections. That way more people might be willing to engage with the idea seriously.
I've been thinking about this even before AI. The internet has become society itself. Complete anonymity on the internet removes any sort of social pressure to act like a civilized person.
I dare say the inconceivable: you shouldn't have free plans, even for the community. This will also push FOSS projects to seek some money to pay for their infrastructure, which probably leads to better pay for their maintainers.
Nothing should be $$-free unless you already paid for it with your taxes. Same principle -> if HN starts to charge every account, I'm happy to pay a small amount per month. This token amount of pay per account will also reduce the number of bots.
Are you saying that Gnome shouldn't offer access to their VCS for free, and all Gnome developers should pay a small sum to be able to access it?
FOSS is generally built on the idea that anyone can use the code for anything, if you start to add a price for that, not only do you effectively gate your project from "poor people", but it also kind of erodes some of the core principles behind FOSS.
Offering read-only mirrors via git+http:// might be a solution then, at least to shed the load if anything. It does remind me a bit of companies complaining about being scraped and trying to prevent it, instead of offering an API so no one would have to scrape them.
We do precisely this ... and we're still dealing with the load issues. Currently I have fail2ban doing a 10 day block on any IP addr that hits our read only http-git endpoint twice in 30 mins. The problem with this is that the default implementation of iptables doesn't scale well to 100k blocked addresses.
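The usual workaround is to move the blocklist into an ipset, so a single iptables rule does an O(1) set lookup instead of walking 100k individual rules; fail2ban also ships ipset-based ban actions, if I remember correctly. A rough sketch of the manual version; the set name and input file are placeholders:

    # Sketch: push blocked IPs into an ipset behind one iptables DROP rule.
    import subprocess

    def setup(set_name: str = "crawler_block") -> None:
        # Run once: create the set and the single matching rule.
        subprocess.run(["ipset", "create", set_name, "hash:ip", "-exist"], check=True)
        subprocess.run(["iptables", "-I", "INPUT",
                        "-m", "set", "--match-set", set_name, "src",
                        "-j", "DROP"], check=True)

    def block(ip: str, set_name: str = "crawler_block") -> None:
        subprocess.run(["ipset", "add", set_name, ip, "-exist"], check=True)

    setup()
    for ip in open("banned_ips.txt"):   # e.g. the addresses fail2ban collected
        block(ip.strip())

Lookups stay fast no matter how many addresses you feed it, which is exactly what a per-address iptables chain can't give you.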
There is nothing that says you can't charge money for FOSS software. FOSS is more about having the ability to inspect and freely change your software to your use-cases.
> There is nothing that says you can't charge money for FOSS software
Well, yes and no. If you had a cost to access the source code, I'm pretty sure I'd stop calling that FOSS. If you only have a price for downloading binaries, sure, still FOSS, since we're talking source code licensing.
> Nothing should be $$ free
I took this statement at face value, and assumed parent argued for basically eliminating FOSS.
Actually I'm anti-capitalistic. That solution was supposed to undermine corporations who take other people's work for free. Maybe it shouldn't cover the whole FOSS, but I do think it fits OP's use case.
Well, looking at the SourceHut code, it's written in Python and handles git by spawning a "git" process.
In other words, it was written with no consideration for performance at all.
A competent engineer would use Rust or C++ with an in-process git library, perhaps rewrite part of the git library or git storage system if necessary for high performance, and would design a fast storage system with SSDs, and rate-limit slow storage access if there has to be slow storage.
That's the actual problem, LLMs are seemingly just adding a bit of load that is exposing the extremely amateurish design of their software, unsuitable for being exposed on the public Internet.
Anyway, they can work around the problem by restricting their systems to logged-in users (and restricting registration if necessary), and by mirroring their content to well-implemented external services like GitHub or GitLab and redirecting users there.
> A competent engineer would use Rust or C++ with an in-process git library,
The issue is, there aren't any fully featured ones of these yet. Sure, they do exist, but you run into issues. Spawning a git process isn't about not considering performance, it's about correctness. You simply won't be able to support a lot of people if you don't just spawn a git process.
>In other words, it was written with no consideration for performance at all.
This is a bold assumption to make on such little data other than "your opinion".
Developing in Python is not a negative and, depending on the team, the scope of the product, and the intended use, is completely acceptable. The balance of "it does what it needs to do within an acceptable performance window while providing x, y, z benefits" is almost certainly a discussion the company and its developers have had.
What it never tried to solve was scaling to LLM and crawler abuse. Claiming that they have made no performance considerations because they can't scale to handle a use case they never supported is just idiotic.
>That's the actual problem, LLMs are seemingly just adding a bit of load that is exposing the extremely amateurish design of their software.
"Just adding a bit of load" != 75%+ of calls. You can't be discussing this in good faith and make simplistic reductions like this. Either you are trolling or naively blaming the victims without any rational thought or knowledge.