How long until the NSA/GCHQ backdoors this site, if they haven't already? Would the thought police jump to conclusions if one uses this to search for classified documents? FBI files are probably of no use to me, so I'm not interested in those. But I hate it if people (in particular journalists) are considered to be some kind of al-shabab just for exercising democracy (i.e. holding their governments accountable under the exact letter as well as spirit of applicable legislation).
The reason I decided to implement this as a hidden service and not expose it to the regular internet (except through things like onion.to) is that my server sees every request as coming from localhost. All requests are processed through Tor, so no matter what you search for, as long as you aren't coming in through a mirror like onion.to, nobody can tell who you are.
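For reference, this is roughly what a hidden service setup looks like on the server side: a couple of lines in torrc, after which Tor delivers every incoming connection to a local port. The paths and port numbers below are illustrative, not TorSearch's actual configuration:

```
# /etc/tor/torrc — minimal hidden service config (illustrative values)
HiddenServiceDir /var/lib/tor/torsearch/
HiddenServicePort 80 127.0.0.1:8080
```

Because Tor hands connections to the web server at 127.0.0.1:8080, the application log only ever shows localhost as the client address, which is the property described above.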
He was saying that even if they breached his box, they still wouldn't be able to identify people. But they can if they have both his box (the exit) and the entry node.
And they're slow! Google has stated that page speed is a ranking factor which means that hidden services will be hit hard (if they were actually crawled) and hidden service proxies would get that and then some.
If they just pick up the .onion domain, they will try to crawl it and conclude that the URL is invalid, since .onion domains do not exist in the standard DNS stack. The only way for them to crawl Tor is if they went out of their way to do so (in which case they may or may not use domains found on the standard web), or if they crawl domains like .onion.to, which behave like normal sites.
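The .onion TLD is formally reserved as a special-use domain (RFC 7686), so a crawler can recognize and skip these names before ever attempting a DNS lookup. A minimal sketch of that filter (the hostnames below are made up for illustration):

```python
def is_onion(hostname: str) -> bool:
    """True if hostname falls under the reserved .onion TLD (RFC 7686)."""
    labels = hostname.lower().rstrip(".").split(".")
    return labels[-1] == "onion"

# A crawler would skip these rather than attempt a DNS lookup:
print(is_onion("example3bqxc7kxq.onion"))  # True  — only reachable via Tor
print(is_onion("example.onion.to"))        # False — Tor2web proxies resolve normally
print(is_onion("example.com"))             # False
```

Note that the .onion.to case falls through: those names resolve in ordinary DNS, which is exactly why crawlers can index hidden services through such proxies.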
.onion crawling was a 20% time project in 2009/2010 (IIRC). I'm sure you could find the announcement if you searched for it. Google does crawl and index them.
It doesn't have a lot of .onion websites indexed. The two I tried to find are not in its database (for one of them I got a result, though, because it is mentioned in the hidden wiki).
I don't see any way to submit a site for crawling.
The problem I see is that you can't really trust a search engine providing links to hidden services. Since hidden services don't use "understandable" domain names, it's very easy to duplicate a website.
What's to say the owner(s) of the search engine aren't targeting journalists, and that the link to the newspaper I found isn't a dupe of the 'real' site?
This is by the way a problem in general with hidden services.
The larger internet has the same problem - duplicate content farms, making the "original" of a piece of content hard to discern. It's certainly not a solved problem but by tracking which copy is seen first, using links to imply trust, etc, a reasonable search can still be provided. You'll never have 100% accurate attribution, but that's kinda the point. And when it comes to trusting the search provider, well, how do you choose now beyond seeing consistent result quality?
Agreed trust is a problem with hidden services in general, but I think it's one we'll solve by reframing what it means to 'trust' a site in the first place.
Could you fingerprint the browser to determine how many users are using the Tor Browser? Obviously this will miss the people who just use a normal browser through Tor. Or ask onion.to to log how many requests they are forwarding to you, though that would miss any other proxying service that happens to be running.
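A crude version of the first idea, assuming the service keeps an access log: Tor Browser deliberately reports one uniform Firefox User-Agent per release (to make users look alike), so you could count requests matching that string. The UA value and log format below are illustrative assumptions, not TorSearch's actual data, and as noted above this misses anyone using a normal browser over Tor:

```python
# Illustrative: the exact UA string changes with each Tor Browser release,
# so in practice you would match against the current release's value.
TB_UA = "Mozilla/5.0 (Windows NT 10.0; rv:102.0) Gecko/20100101 Firefox/102.0"

def count_tor_browser(requests):
    """Count log entries whose User-Agent matches Tor Browser's uniform UA.

    requests: iterable of dicts with a 'user_agent' key (hypothetical log shape).
    """
    return sum(1 for r in requests if r.get("user_agent") == TB_UA)

log = [
    {"user_agent": TB_UA},
    {"user_agent": "curl/8.0.1"},  # a non-Tor-Browser client over Tor: missed
    {"user_agent": TB_UA},
]
print(count_tor_browser(log))  # 2
```

The same uniformity that protects users here works in the operator's favor: the count is a lower bound on Tor Browser traffic, not an identifier for any individual.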
To clarify, TorSearch isn't accessible from the clearnet (to my understanding), so he can't monitor and correlate an individual's activity, such as a nefarious search.
This is exactly true; however, if a user is coming in through the onion.to forwarder as provided in the article, remember that they can track anything that you do!
I haven't looked into it but I don't think so. Additionally, you should be accessing it via Tor which wipes its memory of what you type in any fields anyway so no autocomplete! You weren't searching for things you shouldn't be, were you? ;-)
There are already multiple search engines for Tor, including DuckDuckGo.
The problem with hidden services/Tor isn't search. It's the fact that there aren't enough legitimate/trusted websites that appeal to users outside the hardcore privacy/security crowd.
DuckDuckGo isn't a search engine for Tor, it just provides a hidden service interface to its regular search engine. If you want to make a hidden service to help improve the overall quality, let me know and I can help you get it all setup.
I thought you could limit searches to .onion domains. Oh, well. Thanks for the offer to help. There are a few ideas for sites that I would want to see on the Tor network, but I don't think they would draw enough traffic to get people running Tor on the regular.
Though writing this gave me an idea: is there a .onion Torrent index?
You can download the entire PirateBay database of magnet files, in under 100 MB. If you want to provide a robust place to grab torrents, a hidden service is a good idea.
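If you did build such an index, the magnet links themselves are tiny and easy to work with: the infohash in the xt field is all you need as a lookup key. A sketch using only the standard library (the example link is made up):

```python
from urllib.parse import urlsplit, parse_qs

def magnet_infohash(magnet: str) -> str:
    """Extract the BitTorrent infohash from a magnet URI's xt field."""
    qs = parse_qs(urlsplit(magnet).query)
    for xt in qs.get("xt", []):
        if xt.startswith("urn:btih:"):
            return xt[len("urn:btih:"):].lower()
    raise ValueError("no btih infohash in magnet link")

link = "magnet:?xt=urn:btih:C12FE1C06BBA254A9DC9F519B335AA7C1367A88A&dn=example"
print(magnet_infohash(link))  # c12fe1c06bba254a9dc9f519b335aa7c1367a88a
```

Keying a database on the 20-byte infohash rather than the full URI is what keeps a dump of millions of torrents down to the size mentioned above.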
Beyond that, I think dedicated Tor users will shy away from using BitTorrent, since it's a huge privacy risk. They're more likely to be using Freenet or I2P for anonymous P2P file sharing.
Not sure if they promptly rebranded, but the search is actually called Torch. Here's a screenshot of an example search [1]. Note the query time - it's quite slow.
Torch is an entirely different search engine at a different domain; I run TorSearch and built it from scratch partially because of how terrible my experience using Torch was.