For heavy users (like me) I suggest using one of the many "implementations" of an auto-scihub bookmark, for example this one: https://github.com/nfahlgren/scihub_bookmark
It is a great addition to an otherwise clean tor-browser...
This is not mine - but looking at the code you will notice fairly quickly that it is fairly trivial. Note also that you may need to change the Sci-Hub URL from time to time.
Lastly it should be said that sci-hub does indeed need donations - it is essentially a one-woman project and she can always use some lawyer-funds...
I think the donations are mostly for running the infrastructure - I don't think she's doing much in terms of hiring lawyers. Rather, she's just ignoring the outcomes of US court cases (being based in Kazakhstan or Russia).
I prefer the custom search approach over bookmarks: I have Sci-Hub set up as a custom search with 'sh' as the keyword. On a paywalled page I go to the address bar (Ctrl+L or similar) and type {left}sh{space}{enter}, which prepends the keyword to the current URL and sends it straight to Sci-Hub.
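For anyone who wants to replicate this: in Firefox it's just a keyword bookmark (the domain below is only an example - use whichever Sci-Hub mirror currently works, e.g. the .cc one mentioned elsewhere in this thread):

    Name:     Sci-Hub
    Location: https://sci-hub.cc/%s
    Keyword:  sh

Firefox substitutes whatever you type after "sh " for the %s, so "sh <paywalled URL or DOI>" lands you on the Sci-Hub page for that paper.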
My usual workflow is to check citations for a paper via google scholar or getting a link from a peer which leads me directly to the paywall. From there it is a bookmark click away. In other words I always need sci-hub when I already found what I am looking for (but can't reach it). Does using sci-hub as a search engine work well?
This reminds me of Aaron Swartz's fight. People have made great sacrifices for this. Information that can lead to or enhance knowledge and learning shouldn't be behind closed doors.
Aaron Swartz's death brings tears to my eyes to this day, even though I never met him or talked to him - or, really, had ever heard of him until he took his own life.
A terrible and tragic loss. And not the first or last victim of zealous and overwrought prosecution. This kind of overkill-prosecution is a familiar and common tactic used to oppress and silence activists.
Jonathan James, though not an activist, was another hacker prosecuted by the same team that went after Swartz. He also committed suicide, at the age of 24. When he was 17 he spent six months in a federal corrections facility for hacking. Later, in 2007, he was connected to the TJX hack. He denied involvement, but he was friendly with some of the hackers who were involved, and his past gave him little credibility. From his suicide note:
"The feds of course would see me as much more appealing target than Chris - if they could tie me to this case I'd be like Mitnick times 10 to them. I honestly, honestly had nothing to do with TJX. Unfortunately I don't picture the feds caring all too much. Read Agent Steal's guide to getting busted. The feds play dirty. Chris called me the other day. He was in jail and they let him out. That can only mean that he too is trying to pin this on me. So despite the fact that he and Albert are the most destructive, dangerous hackers the feds ever caught, they'll let them off easy because I'm a juicier target that would please the public more than two random fucks. C'est la vie.
I have no faith in the ‘justice’ system. Perhaps my actions today, and this letter, will send a stronger message to the public. Either way, I have lost control over this situation, and this is my only way to regain control. Remember, it's not whether you win or lose, it's whether I win or lose, and sitting in jail for 20, 10, or even 5 years for a crime I didn't commit is not me winning. I die free."
I agree. He was a great guy. However, I think it's worth pointing out that JSTOR didn't want him prosecuted, and they came to a viable agreement. Sci-Hub, on the other hand, is plain theft, and it's not being backed by donations alone - it has to be better funded than that, judging by its uptime and storage.
The general estimate in the community is around $1.6 million a year to run. That's not cheap.
Sci-Hub has a founder. A founder who may sometimes be a benevolent dictator, and sometimes not. For example, very recently this very founder got offended by Russian social media, where some academics said something unflattering about her, and by a biologist who apparently named a newly discovered species of parasitic wasp after her. So she blocked Sci-Hub for all Russian IPs. For about a week. And lifted the ban after a number of humble petitions in Russian social media asking her to change her mind.
To represent everything that is good about the internet, Sci-Hub should have been impersonal. Like archive.org perhaps, or pubmed, or thepiratebay. Unfortunately, it is not.
But it's definitely a good thing we have it, anyway.
All websites have founders and personality. Since you mention the PirateBay, they were a team with a rather snarky sense of humour. Archive.org is run by a foundation. There are always people and there's always legal responsibility.
In any case, Sci-Hub is really "everything that is good about the internet" because it's a way to make scientific knowledge accessible anywhere to anyone, in a way that was never possible before. The benefit to scholars and students in poor countries is enormous.
And considering just how disgustingly parasitic the scientific publishers have become, Elbakyan is probably the closest to a XXIst century Robin Hood, much more so than the Piratebay founders, actually.
There is something wrong with a service when access to it is conditional on the opinion about its founder voiced by some part of the population of a country. I can’t but feel that it goes against the spirit of freedom and empowerment that _is_ everything that’s good about the internet.
(I do not dispute the usefulness of SciHub, or the good that it’s doing for the netizens, or maybe for society in general. I’m just saying that it has this one thing against it, and that, for being a representative of everything that’s good, it’s pretty important.)
> To represent everything that is good about the internet, Sci-Hub should have been impersonal.
Despite the fact that blocking people is "wrong", the good thing about the internet is that it's _allowed_! The internet isn't built on some moral/ethical code - it shouldn't have one. It should remain something that anyone can get on, where anyone can do anything, and where no one has the power to dictate behaviour.
As far as I can tell the storage is quite centralized. At least I remember them allegedly moving all their porcelain to Canada when that elephant entered the room...
Most items on Archive.org have torrent files, so there's some degree of (potential) decentralization. They run their own tracker, but in principle the torrents could live on via DHT. Most are poorly seeded, however.
They also have multiple backups in other locations.
ELI5: I'm not from academia so please help me understand. Are researchers who publish these [firewalled/copyrighted] papers likely to lose out on their earnings if this is 'open-sourced'? I'm honestly asking as I try to think through the whole 'open-source' paradigm and how it fits in a (mostly) capitalist, libertarian world. Thanks!
1. You produce novel research.
2. You submit your novel research to the best journal you think will publish it.
3. The editor (the only person involved actually paid by the journal) invites other researchers to volunteer their time to peer review your research.
4. Assuming your research bears up under scrutiny and the peer reviewers are not actually competitors who decide to filibuster (this happens), your paper passes peer review.
5. Congratulations! The journal agrees to publish your novel research. For this, you must pay the journal. They do not pay you. You pay them. Color figures in the paper version nobody reads cost thousands extra.
6. Having been paid by you to publish your work, the journal sells your paper at exorbitant prices to university libraries (unless you paid them lots extra to just make it available for free).
Taxes pay researchers to pay journals, and taxes pay for university libraries to pay journals. Why on Earth do intelligent researchers (or taxpayers) put up with this crap? Being published in a good journal boosts your perceived impact and helps you win more grant money. If high-impact-factor journals go out of business because of piracy, others will just take their place.
In short, pirates screwing over journals doesn't hurt researchers in the least. Shaking up the parasitic journal biz is actually long overdue. Journals put in only a tiny percentage of the labor involved in putting a paper through peer-review, but they soak everyone involved for massive amounts of time and money. It's time they died.
Not all fields suffer from this to the same degree. For example, I am a fledgling computing scientist. I mostly publish in ACM-related conferences and workshops, where ACM is an organisation run by and for academics and professionals in the computing field, not a for-profit publisher. My submission experience is:
1. You produce novel research.
2. You submit novel research to the best conference or workshop you think will publish it.
3. The editor (a volunteer) invites other researchers to volunteer their time to peer review your research.
4. Assuming your research bears up under scrutiny and the peer reviewers are not actually competitors who decide to reject (this supposedly happens, but is rare), your paper passes peer review.
5. Congratulations! The conference or workshop agrees to publish your novel research. For this, you generally must show up to the conference or workshop to present your work in person. This costs money (conference registration), and also travel cost and such. However, if you were going to the conference anyway, there is no added expenditure.
6. Having been paid by you to publish your work, ACM sells your paper at somewhat exorbitant prices to university libraries (unless you paid them lots extra to just make it available for free). However, at no cost to you, you are also explicitly allowed to post a "pre-print", identical to the published paper, on your personal/university website, where it will be promptly found by Google. However, the hosting is your own concern. (You can also upload the preprint to arXiv.)
It's a decent procedure. ACM also publishes journals, which do not require you to present anything in person. I do not know whether any author-borne costs are involved if you go that route.
To be clear to everyone else, the parent and the grandparent have just described respectively conference papers and journal papers. Journal papers are usually a bit longer and represent more work, conference papers are usually shorter and take a bit less time. This of course is a generalization.
For example, a PhD thesis might be the full written-up version of 3-4 conference papers you'd have published in the course of your PhD, and then you might condense the important part of your thesis into a single journal paper afterwards, but this is only a vague example and there are many ways of skinning this cat. There are many examples of extremely short conference papers that have had a major impact in their field, and there are _millions_ of completely forgettable journal papers.
The above broadly holds for any of the scientific research disciplines with which I'm familiar.
Yes! As an author of many CS conference papers, a few CS journal papers, and a few papers in other related fields, piracy affects me. It has been entirely beneficial, and I completely support sci-hub.
Collaborators, other authors, and other researchers have been able to read and find my work more cheaply and easily than they otherwise would have. Some would not have paid to buy my papers sight unseen. Some can't afford to buy them. A larger audience is always good for me.
I have also never directly earned even a single cent from any published work of mine. No royalties, no copyright payments, no publisher payments, etc. And I've paid thousands to publishers over the years to publish my own work.
I'm in a very similar position as the OP (Computer Science PhD student). Piracy doesn't affect me at all. I'm very happy if someone wants to read my papers, however, you don't have to use sci-hub etc to access them - I've published preprints of all of them on arxiv.org. Usually I do this right after I submit to a conference and update it later to reflect any changes made before publication.
This also allows me to timestamp my results - say my paper gets rejected because the reviewers think it's a bad fit for the conference, or they didn't understand my point because my writing was bad, etc. Now I have to improve it and submit it to another conference. If someone else publishes similar results in the meantime, then I can always point to the preprint and say that I did it first ;)
As a scientist, I am unambiguously in favor of Sci-Hub and similar sites.
We want and need to disseminate our results as widely as possible, and the paywalls are getting more and more in our way. You won't find many scientists who are against Sci-Hub - maybe a few at unusually rich universities in very rich countries, the ones that can pay $500 for color figures and the $2,500-$3,000 per-article fee that makes the publication "open access".
Poorer countries cannot afford the journal subscriptions either, because these can be exorbitantly high - so high, that even top universities have to limit access. I stumble almost daily over a journal to which my institute doesn't have access. Before the advent of Sci-Hub, we often had to ask colleagues abroad on mailing lists for (technically illegal!) copies of articles to be able to conduct our research.
I might add the actual legal way at our university in case anyone wonders. The official way to get paywalled papers we do not have bulk-orders for is this:
1. Go to the university library in person.
2. Submit an order-and-copy request for that paper.
3. Return 2 days later unless there is a weekend in-between.
4. Pick up a printed, non-searchable, non-zoomable copy (extremely annoying in CG, where I work), because getting the actual PDF is forbidden due to legal issues.
Now combine this with the usual way to do related work research for your own paper:
1. Search relevant papers for your topic.
2. Check who else cited these papers and read those papers.
3. Complete this breadth-first-search until time runs out or the papers become too off-topic.
In my field this yields a stack of around 120 papers I would have to order that 2-day print copy for, of which perhaps 100 will turn out to be irrelevant and hence never appear in my paper's related-work section. Note also that I would have to order each "layer" of the breadth-first search separately - each round taking 2 days. Compare that to a single day of related-work research with Sci-Hub.
Quite obviously I have never met a researcher who took the legal/official road ;)
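To make the comparison concrete: the related-work crawl above is just a breadth-first search over the citation graph. A rough sketch, assuming a hypothetical cited_by() helper that returns the DOIs of papers citing a given DOI (backed by Google Scholar, Semantic Scholar, or whatever index you have access to):

    from collections import deque

    def related_work(seed_dois, cited_by, max_depth=2):
        """Breadth-first search over citations.

        seed_dois: DOIs of the initially relevant papers
        cited_by:  hypothetical helper, DOI -> list of citing DOIs
        max_depth: how many citation "layers" to expand
        """
        seen = set(seed_dois)
        queue = deque((doi, 0) for doi in seed_dois)
        while queue:
            doi, depth = queue.popleft()
            if depth >= max_depth:
                continue
            for citing in cited_by(doi):
                if citing not in seen:
                    seen.add(citing)
                    queue.append((citing, depth + 1))
        return seen  # the ~120-paper stack you then have to skim

With Sci-Hub, each DOI in that set is one paste away; with the order-and-copy route, every layer of the search costs another 2-day round trip.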
AAA (American Anthropological Association) sent out a reminder to its members recently. When someone publishes their own paper on ResearchGate or academia.edu, they are in violation of copyright (which they've signed over to the lovely scientific publisher). It's a mess.
You are incorrect, at least for my branch (biology), and the traditional publishers like Elsevier, and well-known journals like Science, Nature, or PNAS:
Publishing in such a journal does not cost money. It also doesn't pay the authors.
Only the readers pay for the journal subscription.
Then about 15 years ago, a movement started to invert this process. This is called Open Access. Under this model, the author pays to have their paper reviewed, and it is published with a copyleft license, free to access by the public.
In recent years, traditional journals have started similar offerings, sometimes graduated (i.e. you pay only a little, and the article is embargoed for a year).
Sci-Hub is attacking this problem with technology. That is the second-best solution only. It would be much better if the scientific community could throw their weight behind legal Open Access publishing.
That's because there simply are costs involved in the review process, even if the reviewers as well as the authors are not paid. Copyediting, infrastructure, coordination, public relations, data retention, and other work that goes into publishing is done by full-time employees. As one point of reference: PLOS ONE, the largest Open Access publisher, charges around $600 for publishing a manuscript. That's a non-profit, and it represents the best current effort to manage this process.
> Under this model, the author pays to have their paper reviewed
Note: it's not necessarily the case that the author pays. For example, the Open Library of Humanities [1] is funded by contributions from academic libraries, and is thus both free to read and free to publish in. Something that's also referred to as Open Access (the "green" variant - I wrote about the different flavours at [2]) is when authors post a version of their article elsewhere, e.g. at their institution or on arXiv.
The confusion is common though: there are a lot of terms, and publishers have been exacerbating it by co-opting the terms (which I wrote about at [3]).
Thanks, that makes it clearer for me (non-academic) to understand. It does astound me how much control the 'journals' have over the academic research world!
They basically act as a "neutral" referee and communication medium between competing groups and meta-groups (universities), and produce a neutral evaluation of the goods these rivalling organisations produce.
The reason they are not replaced is the same reason stock-trading venues cannot simply be "replaced" - you can't just open up a competing Wall Street.
>The reason they are not replaced is the same reason stock-trading venues cannot simply be "replaced" - you can't just open up a competing Wall Street.
> create a neutral evaluation of the goods these rivaling organisations produce
Well... They mostly facilitate the evaluation of the goods. The evaluation itself is still done by scientists and scholars that are not employed by the publishers.
If they routinely abused the power bestowed upon them, they would threaten the journals. It's in their own best interest not to abuse their review powers, or their reputation could be ruined by the very same marketplace.
You can't just open another marketplace at the same location, because that puts you at an automatic disadvantage. You could, however, ruin the marketplace's reputation and try to shift customers to yours. You can also wrap the marketplace - that is, provide a vital layer it needs to function - and thus turn it into an organelle of a bigger marketplace.
If academics cannot get together to fix a model this absurdly broken, their problem-solving capabilities come into question.
They had access to the Internet before the general public, so they would have realized the ridiculousness of the journal model decades ago. You can't always be a victim.
Even though I'm not a researcher (yet; still doing my undergrad), this infuriates me to no end. I'd be glad to help any movement that attempts to break this by volunteering code and time.
> When you cite a source, you are not actually claiming that you have read it. What you are actually doing is staking your professional reputation on that source containing the information that you claim that it contains.
I don't recall a paper ever being retracted because it cited a background paper which doesn't exist.
I can't tell you how many times I've spent hours tracking down the source of a quote, only to find that either the "quote" is really just a paraphrase of someone's work that someone else mistook for a quote, or that the "quote" from source A is really in source B, but is actually source B quoting source C.
A friend of mine just published a book with a technology press. The royalty on the book is about US$1, and the book is listed on Amazon at US$50 in paperback.
Heinlein's "Have Spacesuit, Will Travel" (1958), p14 in my copy:
"I remember once, when the money baskey was empty, Dad told Mother that "a royalty was due." I hung around that day because I had never seen a king (I was eight) and when a visitor showed up I was disappointed because he didn't wear a crown. There was money in the basket the next day so I decided that he had been incognito (I was reading The Little Lame Prince) and had tossed Dad a purse of gold-it was at least a year before I found out that a "royalty" could be money from a patent or a book or business stock, and some of the glamour went out of life."
The researchers will not lose anything, only the publishers. Researchers are paid by (usually) universities, and often post preprints, i.e. the paper without the publisher's formatting, since wider dissemination is usually better.
That is right, researchers are paid by the universities. BUT, to be hired, researchers need a strong list of publications in peer-reviewed journals. This is usually the most influential factor for ranking. For this reason some journals charge fees for publishing your paper, never mind paying you.
Researchers get paid salaries by their institution. They are not paid for their publications directly, nor for reviewing or editing. All the money from journal subscriptions, book sales and per-article download fees go to the publishers of the journals.
No need to be paid directly when the indirect dollar-value of publication in the right journal can be six figures or more. With such incentive, it's no wonder many researchers skew their work towards whatever increases their chance of publication; dreaming of a Nobel Prize (apparently many in academia spend a lot of time fantasizing about it).
I don't agree. I think it's just tradition from back when academic journals had a much smaller circulation - and were physical things that needed to be printed and shipped - which justified this state of affairs.
The vast majority of scientists never receive any of these prizes. A lot work in fields where no such prizes exist.
There can be some minuscule royalties, but by and large academic publishers are reviled for predatory practices. They get close to free content. They get free services from peer reviewers (the essential basis of what they're selling). They charge vast sums to educational libraries for access. They charge ludicrous per-article fees for any member of the public who has the temerity to wish to read about any of the research their taxes fund.
It's in just about everyone's interest for for-profit academic publishers to collapse. If Sci-Hub can help bring this about, it will be a historic victory for the open internet.
Hah - keeping some parasite's wine cellar very well-stocked no doubt.
Not only does this rip off universities (which in most countries also means the taxpayer), it seriously limits access to information for citizens. Many more ordinary people access academic journal articles than I suspect research institutions fully realise. This is particularly true for legal and medical journals, access to which does have real (sometimes life-saving) effects.
How, and why, is such a business model still working? The only scenario I can think of is when an independent researcher wants to publish his/her research to disseminate to a wider audience and invite critiques. Or, is that also possible without these publishers?
1. Reputation. Journals are a provider, mainly, of a note saying "this paper is worth reading". This reputation is calculated largely by how much the papers they publish get cited later on. Due to the speed of iterations of science (how long it takes for a published paper to be used in more work, and then that new work to be published) it takes a long time to get past this.
2. Organised peer review. Despite the reviewers working for free, it's still not a free process. There's a great Twitter thread about this from someone at PeerJ (the founder?), but I can't find the link now.
Getting more and more common is to pay the publisher and have the work made openly available, or at least to deposit the work in an institutional repository.
[disclaimer, I work in the science & publishing area for Digital Science, but on analysis of data not the publishing itself]
Each individual researcher wants their work published in as high-impact a venue as possible. Individually fighting against the very publisher you want to publish your work with is harmful to your career and will accomplish exactly nothing.
It takes a critical mass of researchers, acting in a coordinated way, for the attempt to change things not to be suicidal.
This is really the point. It's the tragedy of the commons. It needs coordination - sometimes it happens. Don Knuth managed to wrangle the Journal of Algorithms from Elsevier, but not all fields have such an engaged Nestor that puts in the work to pull this off.
I'm not really easily angered, but Elsevier (and, to a lesser extent, the other scientific publishers) are really despicable, earning above-market returns on equity off the labour of (publicly funded) scientists, then spending some of that money to make it harder to access the (publicly funded) research.
Ceterum censeo Elsevier(um) esse delendum. (Furthermore, I consider that Elsevier must be destroyed.)
Read the longest section of the Wikipedia entry on Elsevier ("Criticism"), and weep.
But researchers always want to disseminate their work to a wide audience (of other scientists that will cite their work and thus help them get a long-term job).
ELI5: Simplistically, academics don't get paid to write papers. The most important thing they get from writing papers is peer prestige, which they need to be employed in an academic role. The academics are paid by the institutions which hire them, from research grants, and so on. The only link between financial remuneration and the papers is that the better an academic's reputation, the better their financial situation.
Publishers charge universities expensive subscriptions so that their academics can read the published papers.
So actually it goes the other way around: academics have to pay to read papers, and pay to publish them.
Sci-Hub does not steal a dime from any original contributor. It actually saves them money.
It is extremely unlikely that authors lose anything from open publication of their research (in fact some provide final versions themselves as preprints on their websites).
Authors generally get no money for publications. In today's weird setup, authors get prestige, career advancement, and help getting hired out of publications, and other scientists get curation / peer-review benefits.
If the current publication system went away completely and quickly, it would be a minor shock to the system, but IMO easy to survive, as a replacement curation system would emerge (from conferences, arXiv-style archives, etc.).
Almost all top journals are in the hands of just a few global players such as DeGruyter, Elsevier, Springer, Sage, and Oxford Journals. These companies have lured independent journals into their traps by offering free Editorial Management Systems and other help to them or buying them. They have been doing that for many decades, with the result that there are only very few independent journals run by universities themselves left.
We are trying to fight this by making our own open source journals, but it will take decades to replace the old, well-established journals, if it happens at all. Until then, if you don't publish in a sizable number of firewalled/copyrighted top journals, your career will end at the postdoc level.
Academic journals tend to require exclusivity. That means, if you want to publish in, say Nature, you will be asked to not publish the same article anywhere else.
However, the researchers (at least in STEM) often publish these "in-progress" drafts that are very close to the final articles on their website or wherever.
In computer science we have more conference papers than journal papers. This is a good thing, since you do not have to pay a shitload of money to get your article in. Also, I have not yet encountered a license that forbade me to publish it on my own.
Arguably open source has a strong tradition of support within pro-capitalist libertarianism (and an equally strong tradition of opposition), starting from the 19th-century debates between Benjamin Tucker and Lysander Spooner, through CatB author Eric Raymond, and beyond.
We researchers don't get paid for our papers. We give the publishers our papers for free so that they can make money by putting them behind a paywall. We sometimes even need to pay the publishers to publish our papers. The paywall can sometimes be removed by paying the publisher even more to compensate for their loss of income.
For those who use Telegram, there is a Telegram bot, where you send it the DOI or the URL and it sends you the PDF -- so much easier than legally accessing the file.
The site returns direct pdf links only for paywalled journals. If you put in a DOI for an article that's hosted on a free-access journal, the site redirects you to the article's page on the journal, so you can download the pdf directly from them. This behavior might be the cause?
It's not so rare for older but well-written articles to show up multiple times on the HN frontpage.
It's somewhat rarer for a 'product' such as Sci-hub to show up multiple times, but considering that the HN crowd is pretty proud about Sci-hub's existence and will readily strike up a conversation about the Evil Publisher Empire, it's not so strange to see it on top.
The only data we have are the 2015-16 download logs posted by Elbakyan and a reporter for 'Science.' At the time they had 28 million+ downloads over 6 months. I truly hope no more logs are released (at least not with that level of detail).
A piece I'd written a year or so back on why Sci-Hub is such a compelling option for academic and independent researchers. It's been picked up by a number of OA sites in the past few months.
As for Internet research gems, I'd also like to note a self-created resource that could use love, the Online Etymology Dictionary, produced by Douglas Harper. Unlike Sci-Hub, this is original work largely created to support Harper's own etymological explorations. I've found it tremendously useful, and very much in the spirit of the original web (of which it is very much a part).
Over at Scholastica (https://scholasticahq.com) we've been taking on this problem for the last few years.
We allow journal editors to create, manage peer-review, and publish OA journals all in one place.
Sir Tim Gowers, the Fields Medal winner, uses our platform for his journal Discrete Analysis (http://discreteanalysisjournal.com/)
The journal Internet Mathematics recently came over to the platform after being on Taylor & Francis for years (https://blog.scholasticahq.com/post/internet-mathematics-pub...).
We think journals make a lot of sense and that the problem is that journals don't control the toolchain.
I notice there's a lot of interest on HN around this subject from time to time. If you work with a journal and want to get in touch or have questions feel free to write me at rwalsh [at] scholasticahq.com
In my understanding, some people have donated their academic login credentials, effectively giving Sci-hub access to their institution's subscriptions.
When somebody requests an article, Sci-hub will try and see if it already has it. If not, it will attempt to use some of the donated credentials to gain access, download it and store it for further download requests.
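In other words, it works like a caching proxy in front of the publishers. All names below are hypothetical - this is just my reading of the flow described above, not Sci-Hub's actual code:

    _archive = {}          # DOI -> PDF bytes; stands in for Sci-Hub's storage
    _credentials = []      # pool of donated institutional logins (hypothetical)

    def get_paper(doi, fetch_via_publisher):
        """fetch_via_publisher(doi, creds) is a hypothetical downloader
        that returns PDF bytes on success and None on failure."""
        if doi in _archive:                 # 1. serve from the archive if cached
            return _archive[doi]
        for creds in _credentials:          # 2. otherwise try donated credentials
            pdf = fetch_via_publisher(doi, creds)
            if pdf is not None:
                _archive[doi] = pdf         # 3. keep it for future requests
                return pdf
        return None                         # not reachable with any credentials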
I don't think you'll have to. Both hidden and conspicuous watermarks are common in trade book PDF downloads. RIAA members poisoned the BitTorrent well.
I'm neutral on the matter, but there are some obvious paths of recourse for publishers who don't want people sharing their credentials.
(Disclaimer: I work for Crossref, which is an organisation in the scholarly publishing space.)
In any case, some researchers have claimed that Sci-Hub obtained their credentials through phishing. I don't know if it's true, but in any case, such claims provide plausible deniability against that kind of watermarking proof, don't they?
I'm asking because a friend of mine who's enrolled in 2 colleges, just downloaded the same scholarly PDF (one that's also available through sci-hub) with 2 different college credentials and they had the same sha256sum.
Then when he tried it through sci-hub, he also got the same sha256sum.
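For reference, the check he ran amounts to something like this (the file names are made up; identical digests mean byte-identical files, i.e. no per-user watermark):

    import hashlib

    def sha256_of(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    # True if the two downloads (and the Sci-Hub copy) are byte-identical
    print(sha256_of("paper_college_a.pdf") == sha256_of("paper_college_b.pdf"))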
Crossref doesn't do anything at all in relation to SciHub or watermarking PDFs. We don't host content. (I'm happy to talk about what we do do, but it's off-topic)
Sorry if my disclaimer wasn't clear. I was just pointing out affiliation as standard on HN.
Nah, that doesn't work for academic papers. You usually access them anonymously anyway through your institution (which has a subscription). So even if they watermark (and I don't think they do), they can only watermark the institution.
I'm still curious as to how this system works. Don't institutions have logs that would show someone researching topics wildly outside of their field of study?
We do keep logs for a period of time, but the library administration's policy is to only investigate them when a publisher contacts us with a claim of abuse (we do not proactively monitor for unusual activity, although we do take steps like limiting the number of concurrent sessions per user and blacklisting IP addresses/ranges with a history of suspicious activity).
The publisher generally supplies examples of timestamps and URLs that were part of the alleged abuse. We use that information to identify the "abusing" user in the log.
Usually there is pretty clear evidence that the user is not conducting legitimate personal research (e.g. the user is a freshman early childhood education major at the local rural branch campus, but they're downloading thousands of chemistry papers from an IP address in China or Russia). Typically the user does not seem like an information freedom warrior, or even to have a clue what is going on, so it seems most likely the credentials were phished.
These cases may or may not be phishing. When corporations are hacked for their user credentials, those databases sometimes end up in dark web markets. It would be easy to extract email addresses with .edu domains ... so if a student used their university address for some service and reused the password, there's your login.
Moral of the story: Encourage students to use a password manager and 2FA.
Academic librarians, who negotiate terms with publishers, are obsessed with privacy and academic freedom. Bless 'em. In theory, publishers don't know who's downloading what. The librarians I've talked to say they delete their logs daily, if not several times a day.
As for geographic location, academics tend to travel a lot. Last I heard (from reading the court docs in Elsevier's lawsuit against Sci-Hub), they stopped using proxy connections a few years ago. They just log in using stored credentials, grab an auth token that lasts X minutes/hours, and download articles from whatever IP is convenient.
Usually, institutions don't subscribe to everything a publisher has on offer, but rather a subset that is interesting for them. Therefore, if a download works with a set of credentials, it should not look suspicious.
Storing the documents on Sci-hub controlled machines also makes sure that repeated requests don't actually hit the publishers, which significantly lowers what would otherwise be suspiciously high traffic.
Sci-Hub's an important player in the transition to Open Access. Note, however, that many papers are already available elsewhere (e.g. ArXiv), legally. You can use the OAButton [1] or Unpaywall [2] extensions if you hit a paywall to find a free version. They're not perfect, and solve a problem that shouldn't exist, but it's nice that they're there.
Yes, many papers are available ... somewhere ... if you know precisely where to look for them and find them.
The great thing about Sci-Hub, and a not-inconsequential element of its success, is that it is very nearly universal. Content is simply available in the archive in a tremendous number of cases. And is directly referenceable by DOI (or, when it's working, direct search).
The size of an archive matters as it reduces search costs across archives (this is a reason why archives tend toward "a single max-size dump" dynamics). If you look at lists of the world's largest libraries, for example, it's pretty much the U.S. Library of Congress, and ... everything else. Even at the university library level, the largest collections tend to be fairly uniform at about 15-20 million volumes (Harvard, University of California, etc.).
The fact that a scholar can go to one such institution and have access to their entire archive is a compelling advantage. Shoe-leather adds up when you're crossing provincial or national borders.
Similar dynamics drove the adoption of single scholarly languages -- Greek, Latin, Arabic, Latin (again), French, German, and (in a battle with German and Russian) English following the 2nd world war, given that works had to be translated only to one language (English) rather than multiple.
Similar arguments, compounded by the insane costs of scribe and codex formation, applied in the pre-print era.
Yes, and it's great that people can use Sci-Hub for that. That said, for papers that are legally available, OAButton and Unpaywall exactly aim to solve the problem of having to know where to look for them and find them.
They're still only limited to openly available articles ("green" open access), but might be a solution for those not willing to use a solution of dubious legality, or for whom perhaps their institution blocks it or something.
But yeah, I'm not trying to trivialise what a great resource Sci-Hub can be for those who need it.
I wish we as a scientific community could make some secure, anonymous, shared hosting work for this, with Sci-Hub as the authority and everyone else able to contribute as much of their local disk space as they like.
edit: Before someone suggests IPFS: IPFS is not suitable. a) It is not anonymous, and b) it duplicates everything in its block storage, so you need twice the space...
After I graduated, I passed the credentials for my university's online library on to my sister, who studied at another university which didn't have the proper literature, even though its profile (agricultural engineering) was much closer to her field than mine (computer science/electrical engineering).
She should have had access to that literature without all this.
That worked great for your sister. But what about researchers living in, say, Brazil, where universities cannot afford the huge subscription prices? The only alternative is Sci-Hub...
The user experience really is just so good. Put in a paywalled link to a paper and out pops the PDF. I think the ease of use really contributes to its popularity a lot.
The point of sci-hub is that it fetches specific articles for you, not to be a search engine. Find an article you want - for example, the PolderCast paper[0] - then enter a link or the DOI for that article, and it'll find the paper most of the time.
Unfortunately not, I tend to find something close to what I want via Google and then find papers that it references, and papers that reference it. Review articles are useful, as are benchmark comparisons that you find in some papers.
In addition to other suggestions here, you can also simply add ".sci-hub.cc" to the end of the domain part of the URL when confronted with a paywalled paper.
I find this the easiest interface. It allows me to search using standard search engines (PubMed, Google Scholar, etc.) and then duck into Sci-Hub as required.
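If you want to script that little rewrite instead of editing the address bar by hand, here is a minimal sketch (the function name is mine, and the .sci-hub.cc suffix should be whatever mirror is current):

    from urllib.parse import urlsplit, urlunsplit

    def scihubify(url, suffix=".sci-hub.cc"):
        """Append the Sci-Hub suffix to the domain part of a paywalled URL."""
        parts = urlsplit(url)
        return urlunsplit(parts._replace(netloc=parts.netloc + suffix))

    # e.g. https://www.example-publisher.com/article/123
    #   -> https://www.example-publisher.com.sci-hub.cc/article/123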
Yes, that's the one! Those numbers refer to the number of papers in each torrent, so each one contains 100,000 papers giving a current total of 66+ million.
The torrents of 100,000 are broken into 1000-paper zip archives that can be downloaded individually, so it's pretty manageable if you want to just check out a random sampling of the papers.
I would love to see somebody do some kind of massive-scale analysis of the papers, but just extracting plain text from all those PDFs is a pretty herculean task, considering that many would need to be OCRed and others end up pretty garbled / misformatted with pdftotext and the like.
I thought about mirroring it; the repository DB is 200MB and simple in structure, but then you need quite a lot of HDD space on your side (20TB, 200TB, maybe more - I can't recall).
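A quick sanity check on those numbers, assuming an average PDF size of a couple of megabytes (the ~66 million paper count comes from the torrent discussion above):

    papers = 66_000_000    # approximate number of papers in the dump
    avg_mb = 2             # assumed average PDF size in megabytes
    print(f"~{papers * avg_mb / 1_000_000:.0f} TB")   # ~132 TB; more if PDFs average larger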
Well, you might want to worry about citing sources that your readers might not be able to read. Otherwise, it's mostly the same risks you get when using e.g. the Pirate Bay. That's if you use the site, by the way, regardless of if you use them for your thesis.
No, you don't have to worry about it, as long as you confirm the authenticity of the documents. Cross-check with the preview of the original publisher to confirm that it's not a preprint or otherwise non-citable.
It scares me actually, when trying a search I got this:
поиск временно недоступен. пожалуйста, используйте DOI или прямые ссылки
search temporarily unavailable, please use DOI or direct links
If you're using Google chrome, you can install Sci-Hub extension to use search.
To do this:
Download the extension and unpack it. You get the "Sci-Hub" folder with code.
Open Chrome and navigate to chrome://extensions, or just open the menu -> settings -> extensions.
Check the developer mode box in the upper right. [...]
It's the message that shows up when you try searching. It says "search temporarily unavailable, please use DOI or direct links" and then, under that, "If you're using Google chrome, you can install Sci-Hub extension to use search", with instructions. I've installed it in the past and it does work, but I was also wary about enabling developer mode, and Chrome would nag me on startup each time that it was unsafe, so I ended up disabling it. Here's the full text: https://pastebin.com/6RnRJYUa.
You probably just paste DOI codes there like most of us. Originally, Sci-Hub supported searching for keywords/topics (by using Google Scholar), but then their IPs got banned from Google Scholar, and the workaround is to install this extension. It modifies Google Scholar pages so that all links to paywalled articles are replaced with links to Sci-Hub PDFs of the same articles. Google would never approve this extension, so you need to install it in dev mode.
But again, for most people's usage, it's completely unnecessary. Paste DOI -> get pdf works without any 'extensions' necessary.
I regard Sci-Hub as a useful tool for accessing research papers when I need them. The founder's political stance is irrelevant. I am a bit disappointed by her snap decision to block access to Sci-Hub from Russian IPs for the brief while that she felt offended by someone from Russian academia, but that disappointment is only about the fact that she allowed her personal grievances to influence the service. That hurts the reliability of this instrument somewhat.
Please don't respond to a bad comment with an even worse one. That's exactly the wrong direction to take.
Charges of shillage against other users are not allowed here unless (in some extremely rare twist of fate) you have actual evidence thereof. Treating such accusations as a routine internet tactic is poison, so please don't do it again.
It wasn't a real accusation, and I definitely don't use it routinely; I was just pointing out the ridiculousness of politically charging a conversation about a very useful scientific tool of which I am a great fan. One can always find some angle to accuse someone of being someone else's agent.