Hacker News
TorSearch launches to be the Google of the hidden internet (venturebeat.com)
94 points by IceyEC on Oct 11, 2013 | 49 comments


Easier to enumerate Tor hidden services than to spider: http://freehaven.net/anonbib/cache/oakland2013-trawling.pdf


How long until the NSA/GCHQ backdoors this site, if they haven't already? Would the thought police jump to conclusions if one uses this to search for classified documents? FBI files are probably of no use to me, so I'm not interested in those. But I hate it if people (in particular journalists) are considered to be some kind of al-shabab just for exercising democracy (i.e. holding their governments accountable under the exact letter as well as spirit of applicable legislation).


the reason I decided to implement this as a hidden service and not expose it to the regular internet (except through things like onion.to) is that my server sees every request as coming from localhost; all requests are processed through Tor so no matter what you search, as long as you aren't coming in through a Tor mirror, nobody can tell who you are


That's not true; it's specifically the attacks where they have many nodes, and if they can get on the entry and exit, they've identified that person.


There is no exit to a hidden service...


He was saying if they breached his box, they still wouldn't be able to identify persons. But they can if they have his box (the exit) and the entry node.


You know, I'm somewhat surprised Google isn't actually crawling Tor. Or maybe they are, who knows?


There are a few cases I've seen of Google actually picking up onion.to links but they tend to rank terribly, maybe for their name


Yeah, there are lots of onion.to sites indexed. Maybe the ranking is to do with the fact they experience so much downtime?


And they're slow! Google has stated that page speed is a ranking factor which means that hidden services will be hit hard (if they were actually crawled) and hidden service proxies would get that and then some.


I recall the engine once actually picked up a genuine .onion domain (it was the Hidden Wiki, IIRC).

It's safe to assume they do crawl the Tor space.


If they just pick up the .onion domain, they will try to crawl it and determine that the URL is incorrect, as .onion domains do not exist in the standard DNS stack. The only way for them to crawl Tor is if they went out of their way to crawl it (in which case they may or may not use domains found on the standard web), or if they crawl domains like .onion.to, which behave like normal sites.
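To make the distinction concrete, here is a rough sketch of how a crawler could spot a .onion hostname and rewrite it to a gateway URL it can actually resolve. It assumes the gateway simply appends ".to" to the hostname (the onion.to form mentioned in this thread); the function names and the example address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the gateway serves <name>.onion at <name>.onion.to
GATEWAY_SUFFIX = ".to"


def is_onion_host(host: str) -> bool:
    """Return True for hostnames under .onion, which public DNS cannot resolve."""
    return host.lower().rstrip(".").endswith(".onion")


def to_gateway_url(url: str) -> str:
    """Rewrite a .onion URL to its gateway form; leave any other URL unchanged."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not is_onion_host(host):
        return url
    netloc = host.rstrip(".") + GATEWAY_SUFFIX
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical address, not a real hidden service
print(to_gateway_url("http://exampleonionaddr.onion/search?q=test"))
```

Note that a page fetched this way is the gateway's copy, so the crawler never talks to the hidden service directly.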


onion.to is not "crawling tor sites". It's crawling someone's public web proxy's copies.


.onion crawling was a 20% time project in 2009/2010 (IIRC). I'm sure you could find the announcement if you searched for it. Google does crawl and index.


https://encrypted.google.com/search?hl=en&q=site%3A.onion

No results here. If they are crawling, the results are not shown.


I notice the Telecomix logos in the image. Did they have anything to do with it? No mention of them on the page.


They didn't, I'm surprised the reporter used their logo for it


It looks like he used this image (http://www.flickr.com/photos/xp0s3/7851153390/in/photostream...), potentially just searching on Flickr for something eye catching and due to the "We're watching you" concept?


It doesn't have a lot of .onion websites indexed. The two I tried to find are not in their database (for one of them I got a result though, because it is mentioned in the hidden wiki).

I don't see any way to submit a site for crawling.



Thanks.

By the way, I can't get the https version to work when not using .onion.to (it's not important then, but still I prefer to report this to you).


When not using the onion.to (going over Tor), Tor handles point to point encryption between the user's Tor client and the server's Tor client


I know, that's why I said "it's not important then", but just in case it was supposed to, I preferred to tell you :-).


The problem I see is that you can't really trust a search engine providing links to hidden services. Since hidden services don't really use "understandable" domain names, it's very easy to duplicate a website.

What is to say the owner(s) of the search engine aren't targeting journalists and the link to the newspaper I found is a dupe of the 'real' site?

This is by the way a problem in general with hidden services.


The larger internet has the same problem - duplicate content farms, making the "original" of a piece of content hard to discern. It's certainly not a solved problem but by tracking which copy is seen first, using links to imply trust, etc, a reasonable search can still be provided. You'll never have 100% accurate attribution, but that's kinda the point. And when it comes to trusting the search provider, well, how do you choose now beyond seeing consistent result quality?

Agreed trust is a problem with hidden services in general, but I think it's one we'll solve by reframing what it means to 'trust' a site in the first place.


Interesting. They should monitor non-Tor access to the site for a bit for kicks. I have a feeling that would provide some interesting results :)


Unfortunately, traffic coming from Tor and traffic coming from the onion.to forwarder all looks the same to me :(


Could you fingerprint the browser to determine how many users are using the Tor Browser? Obviously this will miss the people who just use a normal browser through Tor. Or ask onion.to to log how many requests they are forwarding to you. Although, this will miss any other proxying service that happens to be running.


the easiest way is for me to just watch the referrer, it's all coming from VentureBeat :)

edit: Well, most of it, some is from feed readers
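For anyone curious what "watching the referrer" might look like in practice, a minimal sketch is below. It assumes the Referer values have already been pulled out of the server logs into a list; the function name and the sample data are invented for the example, not taken from TorSearch.

```python
from collections import Counter
from urllib.parse import urlsplit


def referrer_hosts(referrers):
    """Count requests per referring hostname; missing referrers go under '(none)'."""
    counts = Counter()
    for ref in referrers:
        host = urlsplit(ref).hostname if ref else None
        counts[host or "(none)"] += 1
    return counts


# Made-up sample: two visits from an article, two with no referrer (e.g. feed readers)
sample = ["http://venturebeat.com/a", "http://venturebeat.com/b", "", None]
print(referrer_hosts(sample).most_common())
```

This only tells you where visitors clicked from, not whether they arrived over Tor or through a forwarder.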


To clarify, TorSearch isn't accessible from clearnet (from my understanding), so he can't monitor and correlate an individual's activity, such as a nefarious search.


This is exactly true; however, if a user is coming in through the onion.to forwarder as provided in the article, remember that they can track anything that you do!


If you are interested in tracking from within Tor vs in via tor2web, look for the header "X-tor2web: encrypted" on HTTP requests.
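A rough sketch of that check, assuming the request headers are available as a plain dict of name to value. The header name comes from the comment above; the function name is made up, and since any client can forge or omit a header this is only a hint, not proof.

```python
def came_via_tor2web(headers) -> bool:
    """Return True if the request carries an X-tor2web header (any value).

    Header names are compared case-insensitively, as HTTP requires.
    """
    return any(name.lower() == "x-tor2web" for name in headers)


# Example requests (header values here are illustrative)
via_gateway = {"Host": "example.onion", "X-tor2web": "encrypted"}
direct = {"Host": "example.onion", "User-Agent": "Mozilla/5.0"}

print(came_via_tor2web(via_gateway), came_via_tor2web(direct))
```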


[deleted]


What does this even mean? Which freedom, compromised how, and what makes it unretrievable?


Awesome! The first thing I saw in autocomplete was my book. I assume people are searching for discussion groups, not bootleg copies.


It doesn't have autocomplete... I bet your browser autocompleted on the field named q


nice potential information leak, is such autofilled text available to the website via javascript?


I haven't looked into it but I don't think so. Additionally, you should be accessing it via Tor which wipes its memory of what you type in any fields anyway so no autocomplete! You weren't searching for things you shouldn't be, were you? ;-)


Your friendly search solution to find the stuff your momma (and Uncle Sam) don't want you to see.


There are already multiple search engines for Tor, including DuckDuckGo.

The problem with hidden services/Tor isn't search. It's the fact there aren't enough legitimate/trusted websites that appeal to users outside the hardcore privacy/security crowd.

edit: fixed Tor capitalization.


DuckDuckGo isn't a search engine for Tor, it just provides a hidden service interface to its regular search engine. If you want to make a hidden service to help improve the overall quality, let me know and I can help you get it all setup.


I thought you could limit searches to .onion domains. Oh, well. Thanks for the offer to help. There are a few ideas for sites that I would want to see on the Tor network, but I don't think they would draw enough traffic to get people running Tor on the regular.

Though writing this gave me an idea: is there a .onion Torrent index?

edit: fixed Tor capitalization


You can download the entire PirateBay database of magnet files, in under 100 MB. If you want to provide a robust place to grab torrents, a hidden service is a good idea.

Beyond that, I think that dedicated TOR users will shy away from using BitTorrent since it's a huge privacy risk. They're more likely to be using FreeNet or I2P for anonymous P2P filesharing.


Do you mean like a tracker that is a hidden service or something else?


I meant a torrent index, such as The Pirate Bay


http://jntlesnev5o7zysa.onion/

It is The Pirate Bay


DDG doesn't search Tor, it's just a .onion version of their regular site.

Also, it's Tor not TOR. https://www.torproject.org/docs/faq.html.en#WhyCalledTor


That will be useful for the NSA and FBI.


Not sure if they promptly rebranded, but the search is called Torch actually. Here's a screen of an example search [1]. Note the query time - it's quite slow.

[1] http://i.imgur.com/4tx4nzt.jpg

Quick guide to access Torch on the Tor network:

1. Install Tor Browser (customized Firefox) https://www.torproject.org/projects/torbrowser.html.en

2. Extract and run "Start Tor Browser"

3. Go to Torch at http://xmh57jrzrnw6insl.onion


Torch is an entirely different search engine at a different domain; I run TorSearch and built it from scratch partially because of how terrible my experience using Torch was.



