Amazon's AI crawler is making my Git server unstable (xeiaso.net)
33 points by bitbasher 3 months ago | 10 comments



Have you verified that they are actually Amazon crawlers as outlined here:

https://developer.amazon.com/amazonbot
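For anyone wanting to script that check, here is a minimal sketch of a forward-confirmed reverse DNS lookup, which is the usual way to verify a crawler's claimed identity. The hostname suffix below is an assumption written from memory, so confirm it against the page linked above before relying on it. The function name `is_verified_crawler` and its injectable `reverse`/`forward` parameters are hypothetical, added only so the logic can be exercised without network access. Note that `socket.gethostbyname_ex` resolves IPv4 only.

```python
import socket

# Assumption: this suffix is not taken from the thread; check it against
# the crawler documentation linked above.
ASSUMED_SUFFIX = ".crawl.amazonbot.amazon"


def is_verified_crawler(ip, suffix=ASSUMED_SUFFIX,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a host under suffix
    and that host resolves back to the same ip."""
    try:
        # Step 1: reverse lookup (PTR record) gives the claimed hostname.
        hostname = reverse(ip)[0]
    except OSError:
        return False

    # Step 2: the claimed hostname must sit under the expected domain.
    if not hostname.lower().rstrip(".").endswith(suffix):
        return False

    try:
        # Step 3: forward lookup must include the original address,
        # otherwise the PTR record could simply be forged.
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

A request whose user agent claims to be the crawler but fails this check is probably an impostor, and blocking by user agent alone would not catch it.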


Yes.


Can you share the list of offending IPs, here or on your website, so we can use them in block lists?

Also, there is an email to contact:

amazonbot@amazon.com


The website that you were pummeling is in the article: git.xeserv.us. I sent an email earlier today and have gotten no response.

Right now your crawler bots are getting the Bee Movie script, so you may want to delete all the data that's being scraped from that domain. Unless you like jazz, that is.

It'd be a gesture of good faith to remunerate me for the egress fees your bot incurred, but I'm not gonna die on that hill.


Apologies, I’m not affiliated with Amazon in any way.

I meant the Amazon IP addresses that are causing you trouble, so I can block them preemptively.
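If it helps anyone build such a list from their own logs, here is a rough sketch that counts requests per client IP for lines mentioning a given user-agent string. It assumes an access log in the common or combined format, where the client address is the first space-separated field; the function name `offending_ips` and the sample addresses in the usage note are made up for illustration. User-agent strings can be spoofed, so treat the result as a starting point rather than proof.

```python
from collections import Counter


def offending_ips(lines, needle="Amazonbot"):
    """Count requests per client IP for log lines containing needle.

    Assumes the client IP is the first space-separated field of each
    line, as in the common/combined access log formats.
    """
    counts = Counter()
    wanted = needle.lower()
    for line in lines:
        if wanted not in line.lower():
            continue
        ip = line.split(" ", 1)[0].strip()
        if ip:
            counts[ip] += 1
    return counts
```

Calling `offending_ips(open("access.log"))` on a real log (the filename is a placeholder) returns a `Counter`, so `.most_common(20)` gives the heaviest hitters first.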


How much of a problem is it?


3 TiB of egress and climbing. I'm in the hole financially, and it's making the personal infra that relies on it unstable.


Damn, that's rude. At least you appear to be using Vultr; imagine if it were running on one of those newfangled cloud providers that mark bandwidth up by a few orders of magnitude...


It's actually slightly worse. That Vultr node is a reverse proxy over WireGuard to my homelab.


Remove the Gitea instance for now? Or respond to all git.* traffic with a 420 until it's sorted out.
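As a sketch of the second idea, assuming the traffic passes through a Python WSGI application somewhere (which may not match the actual setup, where a reverse proxy rule would do the same job), a small middleware can short-circuit any request whose Host header starts with `git.`. The name `block_git_hosts` is hypothetical. Note that 420 is not a registered HTTP status code; 429 or 503 would be the standard alternatives, and the status is a parameter here for that reason.

```python
def block_git_hosts(app, status="420 Enhance Your Calm"):
    """Wrap a WSGI app so requests for git.* hosts get a fixed response."""
    body = b"Temporarily unavailable.\n"

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host.startswith("git."):
            # Answer immediately without touching the wrapped app,
            # so the expensive backend is never reached.
            start_response(status, [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through unchanged.
        return app(environ, start_response)

    return middleware
```

The response body is tiny and constant, so even sustained crawling costs very little egress while the block is in place.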



