
Regarding the flagged sibling comment about the Tor network providing a hiding place for illegal activities and terrorism:

While it's non-trivial to inspect many aspects of Tor traffic, an often-used study metric has been the (determinable) percentage of connections going to hidden services, which are usually assumed to be disproportionately malicious.

This hovers around ~5% across most studies; the most recent one I can find shows similar results [0].

Results are limited by the inability to account for lawful use of hidden services, as well as for the share of malicious use outside of them.

[0] https://www.pnas.org/doi/full/10.1073/pnas.2011893117



It's easier to account for the quantity of malicious use of Tor against ordinary (non-hidden) websites:

> Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious.

https://blog.cloudflare.com/the-trouble-with-tor/


> Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious.

I'm skeptical about their metrics. If I try to access a Cloudflare-protected site and never manage to finish the infinite captcha loop, or just change my mind after seeing the captcha, do they count that as a successfully blocked malicious request? Even if we assume they have some magic way of actually knowing whether every request is malicious:

> That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers. A large percentage of the comment spam, vulnerability scanning, ad click fraud, content scraping, and login scanning comes via the Tor network.

Personally, I don't think content scraping and vulnerability scanning are "malicious per se"; by that metric, even Google (and every other search engine) would be malicious.


For comparison, I'm curious what the percentage is excluding Tor.


Criticism of this work[0] which may be relevant:

"I could not find any reference to the toolset used to identify the websites being visited by the scanned users. I think this is important for external verification of the validity of the data. If the software is public and known, could the authors reference it? If the code is new, could the authors deposit it along the rest of the code in the OSF repository? This should not be a problem, since the authors do not have any conflict of interest."

[0] https://pubpeer.com/publications/3CE766FE19680525B332FA0004A...


> which are usually assumed to be disproportionately malicious.

Why is this assumed?



