
1. Create a fake URL endpoint, then request that endpoint through the adversary's website. When your server gets the request, flag the IP. Do this nonstop with a script. (A minimal sketch follows at the end of this comment.)

2. Create fake HTML elements with unique strings inside. You can then search for those strings in search engines to find similar fake sites on other domains.

3. Create a fake HTML element containing all the request details in encrypted form. Visit the adversary's website, look for that element, and flag the IP OR flag the headers.

4. Buy proxy databases, and when any user requests your webpage, check whether it's a proxy.

5. Instead of banning them, return fake content (fake titles, fake images, etc.) if a proxy is detected OR the IP is flagged.

6. Don't ban the flagged IPs; they'll just find another one. Make them and their users angry so they give up on you.

7. Maybe write some bad words in random places in the HTML when you detect a flagged IP :D The users will leave the site, which will hurt the adversary's SEO and get them downranked.

8. Enable image hotlinking protection. Increase the cost of proxying for them.

9. Use a CSS @document rule to hide everything when the URL is different.

10. Send an abuse report to the hosting provider.

11. Send an abuse report to the domain registrar.

12. Look at the flagged IPs and try to find the proxy provider. If you find one, send mail to them too.

Edit: More ideas sparked in my mind while I was on the toilet:

1. Create big fake CSS files (10 MB etc.) and repeatedly download them through the adversary's website. This should cost them a lot of money on proxies.

2. When you detect the proxy, return huge fake HTML files (10 GB etc.). That could crash their server if they load the whole HTML into memory when parsing.
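Here's a minimal sketch of idea #1, assuming a Flask app; the endpoint path, the mirror URL, and the flagged_ips store are all made-up names:

    import time

    import requests
    from flask import Flask, request

    app = Flask(__name__)
    flagged_ips = set()  # use a real store in practice

    # Honeypot: this path is linked nowhere, so the only hits it gets
    # are the ones we trigger through the mirror below.
    @app.route("/totally-real-endpoint-8f3a")
    def honeypot():
        flagged_ips.add(request.remote_addr)  # the proxy's egress IP
        return "ok"

    def poke_mirror(mirror_base="https://MIRROR.example"):
        # Requesting the honeypot path through the adversary's mirror
        # forces their proxy to fetch it from our server.
        while True:
            requests.get(mirror_base + "/totally-real-endpoint-8f3a",
                         timeout=10)
            time.sleep(60)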




I like how you think. These are all great ideas!

Reminds me of a time some real estate website hotlinked a ton of images from my website. After I asked them to stop and they ignored me, I added an nginx rewrite rule to send them a bunch of pictures of houses that were on fire.

For some reason they stopped using my website as their image host after that.


What's the primary motivator to do this?

I'm curious if they are stealing anything else, e.g. are they selling ads/tracking, do they replace order forms with their own...


Because I asked them to stop doing it, and they didn't. Technically, they were stealing my bandwidth.

Also to teach them an important lesson about the internet.


haha, they're just lucky you didn't introduce them to Goatse


well actually...

There was another time a site hotlinked to a JS file. After asking them to stop, I found that they had a contact form with a homebrew captcha which created the letters image like http://evilsite.com/cgi-bin/captcha.jpg?q=ansr

A little while later, their captcha form had a hidden input appended with the correct answer value, and the word to solve was changed to a new 4 letter word from a dictionary of interesting 4 letter words. The form still worked because of the hidden input. I might have changed the name on the "real" input also.


Signal boosting suggestion #1 here. Great idea.

Additionally, if they decide to blackhole the fake/honeypot URL: since you mentioned they pass along the user agent, you could mix a token into the randomized user-agent string your scraper uses, so you can duck-type the request on your end and know when to capture the egress IP.
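A rough sketch of that, assuming a Flask backend (the token format and UA string are made up):

    import secrets

    from flask import Flask, request

    app = Flask(__name__)
    flagged_ips = set()
    UA_TOKEN = secrets.token_hex(8)

    # Scraper side: embed the token in an otherwise ordinary UA and
    # fetch any page through the mirror, which passes the UA along:
    #   requests.get("https://MIRROR.example/", headers={
    #       "User-Agent": f"Mozilla/5.0 (X11; rv:{UA_TOKEN})"})

    @app.before_request
    def duck_type_mirror():
        # Any request carrying our token came through the mirror,
        # so its source address is the proxy's egress IP.
        if UA_TOKEN in request.headers.get("User-Agent", ""):
            flagged_ips.add(request.remote_addr)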


#5 and #6 are key. Don't try to block them directly, just get them delisted. When you've worked out a way to identify which requests belong to the scammer, feed them content that the search engines and their ad partners will penalize them for.


Bummed that I can upvote this only once. Excellent work.


LOL! Thank you for the laugh. This is great.


What a sure-fire way to toast them! Kudos!


In my search for this I found @document isn't well supported [0], so I'd suggest something like:

    a[href*= "sukuns.us.to"] {
     display:none; 
    }
Then use SRI to enforce that CSS.

[0]: https://caniuse.com/mdn-css_at-rules_document


How about something like...

    body[href*= "<OFFENDING URL>"] {
        background-image: url("http://goatse..."); 
    }
À la: http://ascii.textfiles.com/archives/1011


Or just make the whole page rotate

    body[href*= "<OFFENDING URL>"] {
      animation: rotation 20s infinite linear;
    }

    @keyframes rotation {
      from {
        transform: rotate(0deg);
      }
      to {
        transform: rotate(359deg);
      }
    }


We're trying to punish the people running the proxy mirror, not the users who stumble upon them just trying to use the site.


You could look at it as trying to get them blocked by search engines. Can you detect when they're proxying a search bot as opposed to a user? As for punishment, you don't have to make it eye-bleach; just enough to make it firmly NSFW, so nobody can get any business value from it or even use it safely at work.

A little soft NSFW would also greatly accelerate them being added to a block list, especially if you were to submit their site to the blocklists as soon as you started including it. You can include literally anything that won't get you arrested: terrorist manifestos, The Anarchist Cookbook, insane hentai porn... Use all those block categories - gore/extreme, terrorist, adult, etc.


In that case, write some JS that wanders around the Hubble site, randomly downloading full-res TIFF images for the background, or that randomly displays Disney images.


Seems like it would be fairly easy to use this attribute selector and apply it to every element on the page, making everything show up as empty to the user.


You could add a data attribute to the html tag of the document with the current URL, e.g.

  <html data-path="https://www.saashub.com/about">
then hide the full page with:

  html { display: none; }
  html[data-path*="saashub.com"] { display: block; }


This seems quite elegant and easy. Obviously in addition to other measures, but I like it.


Honestly, this is my favorite HN post in a while. I've had a lot of fun thinking over this challenge.


I'm with you, too!


I know this is just a game that never ends, but if they're already rewriting the HTTP requests, what's stopping them from rewriting the page contents in the response?

SRI is for the situation where a CDN has been poisoned, not this.


It might not explicitly be what SRI is meant for, but it'll narrow the proxy's options to:

A. Blank page

B. Let their find-and-replace update the CSS and generate new hashes in the HTML.

C. Find someone new to pick on.

B is time-consuming and potentially computationally expensive, so it makes C a better option.


A won't happen, because nothing prevents the attacker from regexing out the hash altogether and changing the domain name in the tags to their own.


If they're rewriting HTML, I guess sanitizing CSS won't be beyond them.


Shadow nefarious techniques are the best. Don't give them clear indications that there is a problem.

For example, I had an app developer start stealing API content, so once I worked out which request details to key on, instead of blocking them I simply randomized the API content returned to their users' apps.

Hey, the API calls look good, the app looks like it's working, no problem, right? Well, the users of the app were pissed and the negative reviews rolled in. It was glorious.
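A rough sketch of the approach, assuming a Flask JSON API; every name here (load_listings, the flagged set) is hypothetical:

    import random

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    flagged_ips = {"203.0.113.7"}  # IPs keyed to the thieving app

    def load_listings():
        # stand-in for the real data source
        return [{"title": "Widget", "price": 19,
                 "photos": ["a.jpg", "b.jpg"]}]

    @app.route("/api/listings")
    def listings():
        items = load_listings()
        if request.remote_addr in flagged_ips:
            # Same shape, garbage details: the response still parses,
            # so the stolen app "works" while showing nonsense.
            for item in items:
                item["price"] = random.randint(1, 10**6)
                random.shuffle(item["photos"])
        return jsonify(items)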


Serious question: is there a way to defend against this "stealing the API" thing? E.g., building in authentication of some sort and then shipping a key with your app?


Of course HN doesn't like anything that's reminiscent of DRM, but Apple's App Attest and Google's Play Integrity API can help dispense online services to valid clients only.


These are the best ideas, especially SEO poisoning and alternate images. If their point is to steal content and rankings then poisoning the well should discourage this in the future. I suspect their actual goal is to have a low-effort high SEO site to abuse as a watering hole for phishing attacks.

As a side note, their domain is linked in this thread so they are seeing HN in their access logs and probably reading this. It should make for an interesting arms race. Or red/blue team event.


They said the attacker was passing through the client's user agent. If they get a user agent claiming to be Googlebot, they could check whether the requesting IP actually belongs to Google (there is a published list of IPs). If the IP is not Google's, they could return a blank page, causing Google to index nothing through the mirrored site.
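A sketch of the reverse-then-forward DNS check Google documents for verifying Googlebot:

    import socket

    def is_real_googlebot(ip):
        # Reverse-DNS the IP; genuine Googlebot hosts end in
        # googlebot.com or google.com.
        try:
            host = socket.gethostbyaddr(ip)[0]
        except OSError:
            return False
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the claimed host must resolve back to the IP.
        try:
            return ip in {ai[4][0] for ai in socket.getaddrinfo(host, None)}
        except OSError:
            return False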


This is a good idea, though it may be short-lived since the attackers are likely reading this due to the referrers in the logs. They may add an ACL to counter this, but it might be interesting to see how long that works.


Seems like a good use case for a zip bomb. Return some tiny gzipped content that expands to 1 GB.


Yeah. Their proxy is parsing the HTML and stripping it / modifying it, so they're obviously unzipping the responses on their servers. Create the honeypot endpoint, and if you get a request from that endpoint, reply with a zip bomb.

Then, write a little script that repeatedly hits that honeypot URL. I quite like this idea.
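A minimal sketch, assuming Flask (the route is a made-up honeypot path): pre-compress a big run of zeros once, then serve it with Content-Encoding: gzip so the ~1 MB you send inflates to ~1 GiB on their side.

    import gzip
    import io

    from flask import Flask, Response

    app = Flask(__name__)

    def build_gzip_bomb(gib=1):
        # 1 GiB of zeros gzips down to roughly 1 MB.
        buf = io.BytesIO()
        with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
            chunk = b"\0" * (1024 * 1024)
            for _ in range(gib * 1024):
                gz.write(chunk)
        return buf.getvalue()

    BOMB = build_gzip_bomb()

    @app.route("/honeypot-page")
    def bomb():
        # They inflate responses in order to rewrite them, so this
        # costs them ~1 GiB of memory or disk per request.
        return Response(BOMB, headers={"Content-Encoding": "gzip",
                                       "Content-Type": "text/html"})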


Awesome, do post a follow-up on HN, I want to hear how this war with the proxy asshats plays out.


> 5. Instead of banning them, return fake content (fake titles, fake images, etc.) if a proxy is detected OR the IP is flagged.

> 6. Don't ban the flagged IPs; they'll just find another one. Make them and their users angry so they give up on you.

There's a popular blog that no longer gets linked on HN.

The author didn't like the discussions HN had around his writing, so any visitors with HN as the referer are shown goatse, a notoriously upsetting image, instead of the blog content.


Goatse? I assume you're referring to jwz - that blog shows a testicle in an egg cup if it sees an HN referrer.


Yeah, jwz. Looks like I got mixed up - goatse has been a popular choice for this kind of thing, but jwz went with a different image.

Fortunately, there are many upsetting images for the OP to choose from!


Out of curiosity, which blog are you talking about?



Does anyone not have their referer header suppressed or faked?


I generally strip the referrer via https://wiki.mozilla.org/Security/Referrer. Unfortunately it breaks a small number of sites very badly, such as web.archive.org and a few others, some of them claiming it was done to combat scraping.


Breaking is only part of the problem. The pages that rely on the referer header take it for granted and do not implement any meaningful error handling. They just die a horrible death, instead of responding with an error message stating that they need a referer.

One bad example relies on the referer only for log-out; everything else works. That site also runs massive JS on log-out, as if it really needs an explicit log-out rather than the user just disappearing.


I have never considered faking or suppressing my referer header. I don't know why I would care. I suspect I'm in the company of well over 99% of all internet users.


Why return big files when you can return small files at excruciatingly slow speeds? Modems are hot again!


That's probably the best advice. Instead of denying the proxy, just make it shitty for the end user.


> Maybe write some bad words in random places in the HTML

> Create big fake CSS files (10 MB etc.) and repeatedly download them through the adversary's website. This should cost them a lot of money on proxies.

Be careful when doing things like this, including the shock image option mentioned in other comments, as it could become an arsehole race with them trying to DoS your site in retribution. Then again, going through more official channels could also get the same reaction, so…

> When you detect the proxy, return huge fake HTML files (10 GB etc.). That could crash their server if they load the whole HTML into memory when parsing.

Make sure you are set up to always compress outgoing content, so that you can send GBs of mostly single-token content with MBs of bandwidth.
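A quick illustration of the ratio:

    import gzip

    # ~100 MB of one repeated token compresses to roughly 100 kB,
    # so you can serve GBs of page while paying for MBs.
    data = b"<td>spam</td>" * (8 * 1024 * 1024)
    print(len(data), "->", len(gzip.compress(data)))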


> Create big fake CSS files (10 MB etc.) and repeatedly download them through the adversary's website. This should cost them a lot of money on proxies.

Doesn't that also cost you an equal amount? You'll be serving everything they proxy to the end user.

It's not even necessarily a cost for them; you're assuming that the host is owned and paid for by the abuser. If it's simply been hijacked (quite possible), you're just racking up costs for another victim.


I remember years ago there was a way to DoS a server by opening a connection and sending data REALLY slowly, like 1 byte a second. I wonder if there is a way to do the opposite of that, where every request is handed off to a worker which responds just slowly enough to keep the connection alive. I doubt this scales well, but just a thought.



The “opposite” thing you’re describing sounds like a tarpit: https://en.m.wikipedia.org/wiki/Tarpit_(networking)
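A minimal sketch of a tarpit response, assuming Flask (the byte rate and cap are arbitrary):

    import time

    from flask import Flask, Response

    app = Flask(__name__)

    @app.route("/tarpit")
    def tarpit():
        def drip():
            # Dribble out one byte every few seconds to hold the
            # proxy's connection (and a worker on their end) open.
            for _ in range(10_000):
                yield b" "
                time.sleep(5)
        return Response(drip(), mimetype="text/html")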


You can have some fun with nginx if you can identify on your backend whether the request is coming from a malicious source, e.g. with X-Accel-Limit-Rate.
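For instance, a backend behind nginx's proxy module can set the X-Accel-Limit-Rate response header (bytes per second) to throttle a single response; a sketch assuming Flask, with a made-up flagged set:

    from flask import Flask, request

    app = Flask(__name__)
    flagged_ips = {"203.0.113.7"}  # hypothetical

    @app.after_request
    def throttle_flagged(resp):
        # nginx honors X-Accel-Limit-Rate from the upstream response,
        # so flagged clients get a dial-up experience.
        if request.remote_addr in flagged_ips:
            resp.headers["X-Accel-Limit-Rate"] = "512"
        return resp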


I once read a suggestion to serve gzipped responses which, compressed, are tiny, but decompressed are enormous. Like GBs of zeros.

Not sure how you'd actually do it, or if it serves your purpose, but it sounded neat.


It's called a "zip bomb" (popularized by Silicon Valley [1]), and there is a good guide (and pre-generated 42kB .zip file to blow up most web clients) at https://www.bamsoftware.com/hacks/zipbomb/

[1] https://www.youtube.com/watch?v=jnDk8BcqoR0


Any recommendations on proxy database providers?


http://iplists.firehol.org/ looks free and very comprehensive. It has a whole bunch of sub-lists of IPs that are likely to be sources of abuse, including datacenters and VPNs, and it gets updated frequently. GitHub: https://github.com/firehol/firehol
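A sketch of checking a client IP against one of those lists; the netset filename here is an assumption, so pick whichever sub-list fits:

    import ipaddress
    import urllib.request

    NETSET_URL = "https://iplists.firehol.org/files/firehol_proxies.netset"

    def load_netset(url=NETSET_URL):
        lines = urllib.request.urlopen(url).read().decode().splitlines()
        return [ipaddress.ip_network(line.strip(), strict=False)
                for line in lines
                if line.strip() and not line.startswith("#")]

    NETS = load_netset()

    def is_listed(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in NETS)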


> 1. Create big fake CSS files (10 MB etc.) and repeatedly download them through the adversary's website. This should cost them a lot of money on proxies.

Nope; anybody doing this with at least minimal intelligence is using residential botnets as proxies.


Going DEFCON 3 on proxies:

You can also write some obfuscated inline JavaScript that checks the current hostname against the expected one and redirects when they don't match.


They are stripping all JS.


Passive Aggressive FTW. These are all fantastic ideas.


I really like #9; this seems like a simple way to make your site unusable except via the methods you desire.


Oh, I love these. I will use some of them. Many thanks!


Could the fake 10 GB HTML be a zip bomb?


Point no. 1 will do. That's the solution.



