An Indian research university has assembled 73 million journal articles (without permission) and is offering the archive for unfettered scientific text-mining



Scholarly publishers face the same issue that some players in the entertainment industry face: the value they added, and their ability to get paid for that value, were tied up in the fact that distribution was hard. They added value by helping to solve that problem, and controlling distribution gave them the opportunity to generate revenue.

Now that distribution is easy, it’s not clear what (if any) value they add. Their ability to get paid has been destroyed, and they’re left flailing around yelling about copyright.


Also good.


(without permission)

I give them my permission. For what it’s worth…




It’s possible that grad students are even more naive about owning the rights to the result of their work than are musicians.


I like your style.




Thanks for your permission.



Except the entertainment industry scouts talent and occasionally fosters, markets and promotes bands. Or at least it did historically. Research entities/labs have to pay to publish. Yes, there’s a cachet that comes with getting into a sexy journal, but it seems like journals are generally less useful to “the talent” than the entertainment industry is.


I haven’t yet heard a valid reason for continuing the current expensive private journal publication scheme. What would be lost if everyone published their research through free journals supported by the government or by donations?

The last few hundred years have proven repeatedly, daily, that when information is freely shared, science (and society) benefits.


Theoretically these journals provide peer review support to justify their fees. They also provide a first-pass filter to reduce the amount of garbage you need to sift through to find the gold. The existence of arXiv does call into question the amount of value they are adding for their fees.


Tenure approvals, mostly.


I wonder how Aaron Swartz would feel about this.

Peer review (and editing for that matter) is done by outside scientists who do not get paid for their work. So even theoretically this isn’t justifiable.


Here’s a recent pro- and con-sci-hub debate:


Luckily, we’re doing away with that, in favor of adjuncts across the board… /s


There are some misconceptions and questions here that may be worth clearing up. Credential-wise, I am editor at 3 journals at a range of publishers including a society journal and big Evil Elsevier. The editors at my Elsevier journal, including me, have recently pushed them to make our journal open access, though not without some reservations, which I’ll get to in some of the items below.

  1. Editors usually do get paid. It can range from a nominal amount to quite substantial ($60k/year plus) if you are the Editor in Chief at a high-profile journal, especially if it’s run by a scientific society. (Mine is more on the nominal level, so that’s not a motivation for my post, btw.)

  2. There are actually fairly substantial costs to running a good-quality journal. Our editor group discussed this with the editors at several of our peer/competitor journals when we were making the decision about whether to make the jump. There’s keeping the servers running, both to host the articles and for the online submission and review management system. There’s the IT team to keep that running and secure from the umpteen attacks aimed at it every day. If you also do a print version, there are substantial costs for that, especially if there are color figures. “Editors” might better be referred to as “referees” or “umpires”; there is typically a professional staff or subcontractor that does the copyediting, checks that references are correct and up to date, etc. These are nontrivial tasks. I am incredibly persnickety (just ask my students, who have been sweating over .01 millimeter differences in their figure alignments today…) but they still usually manage to find 5-6 things in my papers; most of the papers I see are much, much worse, especially if they are from non-native speakers. Somebody has to pay for the license for the plagiarism detection software, and somebody has to pay for the (often international) legal team to deal with it when an author sues you because you reported them to their department and funding agency for said plagiarism. Even for small journals, there is typically at least one full-time administrator assigned to each journal, plus a publisher’s rep (executive-level salary) to help with publicity, get the journal registered for its impact factor, make sure that it is hitting its target numbers in terms of distribution and readership/impact, get reports to the various agencies and groups that rank the journals or want to keep track of funding and other compliance issues (including making sure that the papers published have met ethical requirements), and so on.
Elsevier is no angel, but to their credit they do also host workshops at conferences for beginning authors and reviewers to help them advance in the field, fund travel for junior scientists through various awards, have a free online “academy” for people to learn professional skills, and so on.

  3. As others have pointed out, there are the various preprint services such as arXiv. And here’s the thing: You can send your paper through the peer review process at the journal, and once it has received the benefit of peer review, etc., you still completely own your original copy of the paper (i.e., the Word or LaTeX or whatever document). The publisher’s copyright is only for their typeset, proofread/copyedited, etc. version. Further, at least in my field, even Elsevier (probably the publisher with the worst reputation) either sends authors a link they can give to people for limited-time access to the ‘pretty’ version or, for newer journals, makes the entire issue open-access for several months. Per mjionsda’s comment, at this point the “added value” the publishers are providing is really convenience and ‘niceness’…you get the papers collected together, easily searchable, copyedited, and looking nice. Are many of the publishers making this ‘convenience fee’ too high? Arguably, but that’s a question of the amount, not of whether there’s any reason for a fee.

  4. All that stuff in #2 isn’t free. Somebody’s going to pay for it. If it’s not the receiving institution paying the ‘convenience fee’, it’s going to be the authors and their funding agencies. We talked to one of the other top journals in our field that is all open-access, and their real costs for publishing an article are around $6k, because they also have to factor in the costs around all the articles that get submitted and have to be processed but don’t get accepted in the end. (Typically author fees are only charged after the paper has been accepted; you don’t pay if you submit but are rejected.) That money could otherwise send people in your lab to a conference where they could talk with others about their work and make connections that could lead to collaborations or jobs, buy that piece of equipment, etc. Or, if you are not funded sufficiently, it decides whether you publish at all or end up paying out of your own bank account. From the citizen’s perspective, your tax dollars are now going to the publisher, not the science.

There are hardship allowances for certain circumstances, developing countries, etc., but it still means that some people are not going to be able to publish their work, or at least not in the journal they think would be the best fit. This will disproportionately affect junior investigators and those at smaller schools (less likely to get grants), or countries/regions. At the journal about to make the switch, we are fighting to keep our costs relatively low and get subsidies, but are still anticipating a 50% drop in submissions. Is it really better for papers not to get published at all, or at least not in the places where they are likely to reach their best-fitting audiences, than to have their copyedited version behind a paywall? Currently at least the authors could say ‘here’s the author copy of my paper recently accepted in <>’.

Quite honestly, if you take a look, it’s mostly the big, already well-funded and prominent labs that are advocating for this. It’s going to be a ‘rich get richer’ phenomenon, unless the real curation and distribution (i.e., getting it to the right audiences, not just getting it out there) problems get solved by the preprint servers better than is currently the case.

None of the above is meant to say that the publishers don’t charge too much for what they provide, or that they haven’t done some really dumb and potentially unethical things in trying to play hardball with the different countries and institutions giving them pushback. However, especially since people actually DO continue to own the rights to their original version, just not the typeset one, I’m not sure that many of the people advocating for all open-access (which will almost certainly mean author pay unless governments want to pitch in - and how do you do that for international journals?) aren’t going to regret it.

(And before you pick on me, yes, there are a couple typos in here. However, this is a post in a comments section on a website, not a journal submission!)


Long live Rameshwari Photocopy Service :sweat_smile: