Fake scientific papers are alarmingly common and becoming more so

When neuropsychologist Bernhard Sabel put his new fake-paper detector to work, he was “shocked” by what it found. After screening some 5000 papers, he estimates up to 34% of neuroscience papers published in 2020 were likely made up or plagiarized; in medicine, the figure was 24%. Both numbers, which he and colleagues report in a medRxiv preprint posted on 8 May, are well above levels they calculated for 2010—and far larger than the 2% baseline estimated in a 2022 publishers’ group report.


Journals are awash in a rising tide of scientific manuscripts from paper mills—secretive businesses that allow researchers to pad their publication records by paying for fake papers or undeserved authorship. “Paper mills have made a fortune by basically attacking a system that has had no idea how to cope with this stuff,” says Dorothy Bishop, a University of Oxford psychologist who studies fraudulent publishing practices. A 2 May announcement from the publisher Hindawi underlined the threat: It shut down four of its journals it found were “heavily compromised” by articles from paper mills.

Sabel’s tool relies on just two indicators—authors who use private, noninstitutional email addresses, and those who list an affiliation with a hospital. It isn’t a perfect solution, because of a high false-positive rate. Other developers of fake-paper detectors, who often reveal little about how their tools work, contend with similar issues.
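As a rough illustration of how a two-indicator screen of this kind might look, here is a minimal Python sketch. It is not Sabel's implementation: the webmail domain list, the keyword match on "hospital", and all names (`Author`, `indicator_count`) are hypothetical, and the article does not say how the two indicators are combined, so the sketch only counts how many of them fire and leaves the threshold to the caller.

```python
from dataclasses import dataclass

# Hypothetical list of free webmail domains; illustrative only, not from the preprint.
PRIVATE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "163.com", "qq.com"}

# Hypothetical keyword used to guess at a hospital affiliation (a crude match).
HOSPITAL_KEYWORDS = ("hospital",)


@dataclass
class Author:
    email: str
    affiliation: str


def uses_private_email(author: Author) -> bool:
    """True if the email domain is on the (assumed) webmail list."""
    domain = author.email.rsplit("@", 1)[-1].strip().lower()
    return domain in PRIVATE_EMAIL_DOMAINS


def lists_hospital(author: Author) -> bool:
    """True if the affiliation string mentions a hospital keyword."""
    text = author.affiliation.lower()
    return any(keyword in text for keyword in HOSPITAL_KEYWORDS)


def indicator_count(authors: list[Author]) -> int:
    """Number of the two indicators (0 to 2) that fire for at least one author."""
    private = any(uses_private_email(a) for a in authors)
    hospital = any(lists_hospital(a) for a in authors)
    return int(private) + int(hospital)


if __name__ == "__main__":
    paper = [Author("a.b@gmail.com", "Department of Neurology, City Hospital")]
    print(indicator_count(paper))
```

Even this toy version shows why false positives are hard to avoid: many genuine clinicians work at hospitals and use personal email addresses.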


To fight back, the International Association of Scientific, Technical, and Medical Publishers (STM), representing 120 publishers, is leading an effort called the Integrity Hub to develop new tools. STM is not revealing much about the detection methods, to avoid tipping off paper mills. “There is a bit of an arms race,” says Joris van Rossum, the Integrity Hub’s product director. He did say one reliable sign of a fake is referencing many retracted papers; another involves manuscripts and reviews emailed from internet addresses crafted to look like those of legitimate institutions.

Twenty publishers—including the largest, such as Elsevier, Springer Nature, and Wiley—are helping develop the Integrity Hub tools, and 10 of the publishers are expected to use a paper mill detector the group unveiled in April. STM also expects to pilot a separate tool this year that detects manuscripts simultaneously sent to more than one journal, a practice considered unethical and a sign they may have come from paper mills.


STM hasn’t yet generated figures on accuracy or false-positive rates because the project is too new. But catching as many fakes as possible typically produces more false positives. Sabel’s tool correctly flagged nearly 90% of fraudulent or retracted papers in a test sample. However, it marked up to 44% of genuine papers as fake, so results still need to be confirmed by skilled reviewers.
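To see why confirmation by reviewers matters, it helps to work out what share of flagged papers would actually be fake. The sketch below applies Bayes' rule to the two figures quoted above (a detection rate of roughly 90% and a false-positive rate of up to 44%). The prevalence values are assumptions chosen for illustration: 2% is the publishers' baseline mentioned earlier, and 24% is the tool-derived estimate for medicine, which is not an independently measured rate. The function name is hypothetical.

```python
def share_of_flags_that_are_fake(
    prevalence: float,
    detection_rate: float = 0.90,       # "nearly 90%" in the article
    false_positive_rate: float = 0.44,  # "up to 44%" in the article
) -> float:
    """Fraction of flagged papers that are truly fake (positive predictive value)."""
    true_flags = detection_rate * prevalence
    false_flags = false_positive_rate * (1.0 - prevalence)
    return true_flags / (true_flags + false_flags)


if __name__ == "__main__":
    # Prevalence values are illustrative assumptions, not measured rates.
    for prevalence in (0.02, 0.24):
        share = share_of_flags_that_are_fake(prevalence)
        print(f"prevalence {prevalence:.0%}: {share:.0%} of flagged papers are fake")
    # prints:
    # prevalence 2%: 4% of flagged papers are fake
    # prevalence 24%: 39% of flagged papers are fake
```

Under these assumptions, most flagged papers would be genuine in both scenarios, which is consistent with the article's point that flags are a starting point for human review and not a verdict.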


Publishers embracing gold open access—under which journals collect a fee from authors to make their papers immediately free to read when published—have a financial incentive to publish more, not fewer, papers. They have “a huge conflict of interest” regarding paper mills, says Jennifer Byrne of the University of Sydney, who has studied how paper mills have doctored cancer genetics data.

The “publish or perish” pressure that institutions put on scientists is also an obstacle. “We want to think about engaging with institutions on how to take away perhaps some of the [professional] incentives which can have these detrimental effects,” van Rossum says. Such pressures can push clinicians without research experience to turn to paper mills, Sabel adds, which is why hospital affiliations can be a red flag.


Source: Fake scientific papers are alarmingly common | Science | AAAS

A closed approach to building a detection tool is an incredibly bad idea: no one outside can really know what it is doing, and certain types of research may well be flagged every time. A tool like this especially needs to be accountable to, and changeable by, the peers who have to review the papers it flags as suspect. Only if the tool is open can it be improved by third parties who also have a vested interest in better fake-detection rates (e.g. universities, which you would think have quite a few smart people). Keeping it closed also lends a false sense of security, especially if the detection methods have already leaked and paper mills are already circumventing them. Security by obscurity is never a good idea.

