That's what it was. The NSA was reverse proxying Google.
The legit explanation (given the domain name) is probably they wanted to use reCAPTCHA, but block all non-NSA hosts with a firewall or something.
This is not great, because the NSA expanded its attack surface to all of google.com.
The more conspiratorial explanation is that this is actually a phishing page setup that, due to a misconfiguration, is exposed under captcha.nsa.gov, but Occam's Razor should apply here.
I'm guessing that the NSA website uses recaptcha, which is served by Google. Perhaps in order to comply with strict origin policy, they want everything on nsa.gov to be served from their domain. They seem to have a reverse proxy that proxies requests to google.com.
That's one plausible explanation, but in any case, even if my explanation is wrong, I doubt the explanation is interesting.
If that's the case, they are being sloppy, considering that everything under www.google.com is proxied through their servers, not just specific reCAPTCHA assets.
They're inheriting a considerable part of Google's attack surface. For example, Google's open redirects could be used to bypass origin checks as part of an attack on nsa.gov, or to phish NSA employees.
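To make the scope point concrete, here is a minimal sketch of a path allowlist that a reverse proxy could apply before forwarding anything upstream. The `/recaptcha/` prefix and the name `should_forward` are assumptions for illustration; nothing here reflects the actual nsa.gov configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allowlist: only reCAPTCHA paths would be forwarded upstream.
ALLOWED_PREFIXES = ("/recaptcha/",)


def should_forward(url):
    """Return True only if the request path falls under an allowed prefix."""
    path = urlsplit(url).path
    # Collapse "." and ".." segments so "/recaptcha/../search" cannot pass.
    # Percent-encoded dot segments are NOT decoded here; a real proxy
    # would need to handle those as well.
    normalized = posixpath.normpath(path)
    # normpath drops a trailing slash; restore it so "/recaptcha/" still matches.
    if path.endswith("/") and normalized != "/":
        normalized += "/"
    return normalized.startswith(ALLOWED_PREFIXES)
```

With a check like this, a request for the search page or the homepage would be refused at the proxy instead of being passed through to google.com.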
They appear to have changed something in the past few minutes. When I first opened this HN thread it showed me Google's homepage. Now I'm also seeing that redirect.
Can someone explain what's going on? Is this a domain hack to get Google's captcha working under an nsa.gov hostname, presumably so that it's usable on whitelist firewalls? I'm surprised Google serves a homepage to the domain, and that it doesn't only respond to requests to google.com (etc.)
My guess: a custom version of Google that allows NSA analysts to do "Google dorking" - searching for vulnerable hosts with Google - without triggering a captcha. Somebody on twitter mentioned they could not get a captcha with strings that usually reliably cause one.
Maybe this is just a fake front page that calls to the Google search API and pretends to be Google proper. Either it is for agents in the field to inconspicuously use google or they misconfigured it to be public?
You can do that? I would expect Google to flag connections to the search page that don't terminate on a residential/commercial IP as suspicious and show you the near "unsolvable" captcha.
At least that is my experience with proxying google services (e.g. silly setup for accessing them from China). Datacenter IPs or SSL "MitM" connections reliably trigger it.
Anecdotal, and I'm guessing it's because I was logged in (to my long standing personal Google account) - but I didn't have any issues when I was VPN'd through a Vultr vps of mine when I was in my dorm.
Again I'm guessing it's because I was logged in, from google chrome.
Depends very much on which datacenter you're using. I'd imagine google doesn't get much (any) bot traffic from Akamai, so I'm not surprised that their ranges aren't flagged yet.
But all it takes is a few dozen queries in fast succession and google will start showing a captcha. At least, that is how it seemed to be a few years ago.
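For illustration only, the "few dozen queries in fast succession" behaviour can be modelled as a sliding-window counter. This is a sketch of the general technique, not a description of how Google actually decides to show a captcha; the class name and the default thresholds are assumptions.

```python
from collections import deque


class SlidingWindowCounter:
    """Flags a client once it exceeds max_queries within window_seconds."""

    def __init__(self, max_queries=30, window_seconds=60.0):
        # Both defaults are made-up values for the sketch.
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def should_challenge(self, now):
        """Record a query at time `now` (seconds) and report whether to challenge."""
        # Forget queries that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        self._timestamps.append(now)
        return len(self._timestamps) > self.max_queries
```

A burst of queries trips the counter, while the same number spread over a longer period does not.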
I've seen this on Twitter all day. My guess is that they wanted recaptcha, but serving the resources themselves. The easiest route was probably to reverse proxy google.com, which is what recaptcha is hosted on:
How has no one used this for ads yet? You could make any third party site appear as a first party site, and blockers usually aren’t set up to block first party ads.
Can anyone just do that to any domain? My website is hosted at GitHub Pages and requires a CNAME file in the repo root as well as the DNS entry at Cloudflare.
Agreed. The copyright holder / trademark owner must be the party that wants to limit distribution, not the government or some unrelated third party.
i.e. if I see you producing fake Coca Cola drinks, I can't sue you for infringing on The Coca Cola Company's trademark. They would have to sue you. Same applies for the government.
And of course, if NSA does have an agreement with Google to reverse proxy https://google.com/, them doing exactly that would be perfectly legal. I presume they have SOME sort of agreement, and aren't just doing this behind Google's back, as the website is on HN's first page in the first 5 places for an hour already, and Google hasn't banned access.
Try getting even 50 Google queries with a reverse proxy, and you will see what I mean -- they will show you a progressively more difficult ReCAPTCHA until a certain threshold, after which the CAPTCHA is unsolvable and is there only to waste your time. This hasn't happened to HN readers [yet].
Meanwhile I presume they misconfigured a service meant for doing captcha checks using Google. What's more likely? Why are you so aggressively.. eh.. okay, not going to write that.
I don’t think it’s unreasonable to point out that lots of the speculation here about NSA hosting phishing pages or secret captcha-free google for analysts under nsa.gov falls firmly into the chemtrail category of crazy conspiracy theories.
Just like with “chemtrails” there exists a very reasonable explanation for what happened here, but people are choosing to ignore that in order to push weird conspiracy theories.
you can do it to any domain that isn't checking the Host header. Most sites check that the Host header matches the site's actual domain (as is specified in the CNAME file on github pages)
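A minimal sketch of that kind of check, assuming a server that only answers for a fixed set of hostnames. `ALLOWED_HOSTS` and `host_is_allowed` are hypothetical names, and IPv6 literals in the header are not handled.

```python
# Hypothetical set of hostnames this server is willing to answer for.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def host_is_allowed(headers):
    """Return True if the request's Host header names one of our own domains."""
    host = ""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "host":
            host = value
            break
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host.strip().lower().partition(":")[0]
    return hostname in ALLOWED_HOSTS
```

A request that arrives via someone else's DNS record carries that other hostname in the header, so a server doing this check would reject it.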
that's definitely not what's happening here though, most obviously because it has a valid SSL certificate. If it were just being CNAMEd over to google, the certificate would be invalid for that hostname. NSA has to be catching the request to terminate the SSL, and then proxying it back to google.
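The certificate argument can be sketched as a hostname match against the names a certificate covers. The pattern list below is an assumed example, not the contents of any real certificate, and the matching is simplified: a wildcard is only accepted as the whole leftmost label and matches exactly one label.

```python
def cert_matches(hostname, san_patterns):
    """Return True if hostname is covered by one of the certificate's name patterns."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for pattern in san_patterns:
        pat_labels = pattern.lower().split(".")
        # A wildcard stands for exactly one label, so label counts must agree.
        if len(pat_labels) != len(host_labels):
            continue
        if pat_labels[0] == "*":
            if pat_labels[1:] == host_labels[1:]:
                return True
        elif pat_labels == host_labels:
            return True
    return False


# Assumed example patterns for a certificate issued for Google's own names.
GOOGLE_LIKE_PATTERNS = ["google.com", "*.google.com"]
```

Under these assumptions `cert_matches("captcha.nsa.gov", GOOGLE_LIKE_PATTERNS)` is false, which is why a plain CNAME would produce a browser warning and why something must be terminating TLS with a certificate valid for the nsa.gov name.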
So you can't search for `traceroute` or `tracert` directly but you can search for a misspelling like `tracerout` and the results page just ends up showing the search results for `traceroute`, so it's not exactly a very sophisticated filter.
Well the purpose of the filter is almost certainly to prevent running the command on the server in case of an attack, not to prevent it from being searched on Google. You'd have to spell it correctly to get the server to execute it.
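A rough sketch of the kind of exact-token blocklist described above. The token list only contains the two strings mentioned in this thread, and `is_blocked` is a hypothetical name; this is not the actual rule set of any WAF product.

```python
import re

# Only the two strings reported in the thread; a real list would be longer.
BLOCKED_TOKENS = {"traceroute", "tracert"}


def is_blocked(query):
    """Return True if any whole word in the query is on the blocklist."""
    tokens = re.findall(r"[a-z0-9_]+", query.lower())
    return any(token in BLOCKED_TOKENS for token in tokens)
```

Because the match is on exact command names, `tracerout` passes the filter, which is consistent with the goal being to stop a command from executing rather than to stop a search.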
>If, on or after the date that is 180 days after the date of the enactment of this section, an agency creates a website that is intended for use by the public or conducts a redesign of an existing legacy website that is intended for use by the public, the agency shall ensure to the greatest extent practicable that the website is mobile friendly.
That's actually a really viable theory, especially given the "can't search for traceroute" thing - that spits out what seems to be a time-based error string.
It’s not, that’s just standard akamai WAF behaviour.
E: sorry, HN is throttling me and I can’t reply below. This is just a silly web application firewall that blocks a list of “suspicious strings”. There’s not much else to be said about it.
$ host captcha.nsa.gov
captcha.nsa.gov is an alias for www.nsa.gov.edgekey.net.
www.nsa.gov.edgekey.net is an alias for e6655.dscna.akamaiedge.net.
e6655.dscna.akamaiedge.net has address 104.75.125.118
e6655.dscna.akamaiedge.net has IPv6 address 2600:1406:5800:7b5::19ff
e6655.dscna.akamaiedge.net has IPv6 address 2600:1406:5800:792::19ff
edgekey.net is an akamai thingy, all of nsa.gov seems to go through it
$ host www.nsa.gov
www.nsa.gov is an alias for nsa.gov.edgekey.net.
nsa.gov.edgekey.net is an alias for e16248.dscb.akamaiedge.net.
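The alias chain in that output can be followed mechanically. Below is a small sketch that parses text in the format printed by `host` and walks the "is an alias for" lines; the function name `alias_chain` is made up for the example.

```python
import re

# Matches lines such as "a.example is an alias for b.example."
ALIAS_RE = re.compile(r"^(\S+) is an alias for (\S+?)\.?$")


def alias_chain(host_output, start):
    """Follow 'is an alias for' lines from `start` and return the chain of names."""
    aliases = {}
    for line in host_output.splitlines():
        match = ALIAS_RE.match(line.strip())
        if match:
            aliases[match.group(1).rstrip(".")] = match.group(2)
    chain = [start]
    seen = {start}
    while chain[-1] in aliases:
        target = aliases[chain[-1]]
        if target in seen:  # guard against alias loops
            break
        chain.append(target)
        seen.add(target)
    return chain
```

Fed the captcha.nsa.gov output above, the chain ends at an akamaiedge.net name, which is the basis for saying the hostname is fronted by Akamai.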
The creepiest thing to me is that this post is 7 hours old, and the comment states it's disabled. It was fixed within 2 hours. Ergo, the NSA is actively monitoring Hacker News and taking quick action when needed.
I wonder what other sites the nsa has active alerting on?
Why assume that what was served on the link, and how it was served, is working as intended?
It could have been part of a phishing setup that got accidentally pushed out with obfuscation components still missing.
It's not like everybody working at NSA is a flawless human being, mistakes happen everywhere, sometimes even rather big ones.
Also kinda weird how everybody seems to be giving the NSA the benefit of the doubt of this having some kind of supposedly totally benign purpose, completely ignoring the NSA's history and purpose.
What's odd is that it came up in English at first, but now it's Portuguese for me. Another comment here mentioned it's the Brazilian version of Google's search page.
depends on where the traffic exits the Akamai network... they are likely using it to proxy Recaptcha, so they likely said "we don't care where it exits" and Akamai picks whatever is most convenient for them... in that case, Brazil.
Among other things, it's weird that it shows up with a different GeoIP location for different users. Someone commented here about seeing this in Portuguese. I'm seeing this in Japanese. Does anyone know what's going on?
Edit: This seems to have been online since 2018, see https://web.archive.org/web/20181206224407/http://captcha.ns....