Wednesday, January 5, 2022

How Academic Paper Overload Makes Innovative (And Substantial) Research Harder To Recognize

Physics Today editor Charles Day makes valid points in his recent editorial, 'It's All Too Much' (Vol. 74, No. 12, December 2021, p. 8), when he cites the research of Johan Chu of Northwestern University and James Evans of the University of Chicago, published in October:

J. S. G. Chu, J. A. Evans, Proc. Natl. Acad. Sci. USA 118, e2021636118 (2021). https://doi.org/10.1073/pnas.2021636118

Their analysis of bibliometric data drawn from 10 broad fields in science, medicine, engineering, and mathematics featured a sobering main conclusion: as the number of papers in a field increases, researchers find it harder to recognize innovative work. Progress seems to be slowing.

I don't know why this should be surprising, given that I had earlier cited a 2007 study showing that half of academic papers are read only by their authors and journal editors. Even more sobering, the authors found that 90 percent of published papers are never even cited by other researchers. Again, this isn't surprising, and let's also bear in mind that papers are of unequal quality - many of them of lesser quality due to the "publish or perish" pressure.
That pressure means professors will churn out papers merely to meet the quota mandates of their institutions, and may not have anything genuinely novel to report. Hence the low quality, and the evident lack of recognition of innovative work. There are only so many "new" things science - even physics - can excavate and process in a given year. So when an estimated 1.9 million research papers are published each year, one must consider the reality of redundancy.
This overproduction has led Chu and Evans to a plausible hypothesis: rather than evaluate new papers on their individual merits, researchers increasingly resort to comparing them with existing paradigms. Further, they note that when new papers are published at a high rate, truly novel ideas struggle to prevail - let alone be noticed - over the flood of competitors. It is the old problem of noise overwhelming the signal, and let's be frank: of the estimated 1.9 million research papers published yearly (in 25,000 journals), probably 75% are noise, not signal - where by "signal" I mean a significant novel discovery or finding.
One strength of Chu and Evans's paper - as noted by Day - is that they used their hypothesis to make six predictions, all of which are statistically testable, i.e., by looking for correlations. Their predictions are, to quote the paper:
1) New citations will be more likely to cite the most-cited papers rather than less-cited papers; 
2) The list of most-cited papers will change little year to year—the canon ossifies; 
3) The probability a new paper eventually becomes canon will drop; 
4) New papers that do rise into the ranks of those most cited will not do so through gradual, cumulative processes of diffusion; 
5) The proportion of newly published papers developing existing scientific ideas will increase and the proportion disrupting existing ideas will decrease; and 
6) The probability of a new paper becoming highly disruptive will decline.
As Prof. Day points out, citations are straightforward to count, so it makes sense to use them in the testing. To characterize a paper's "canonicity, disruptiveness, and diffusibility", Chu and Evans also developed statistical measures. In all, they examined 1.8 billion citations of 90 million papers published from 1960 to 2014. Each of their six predictions was affirmed by a significant correlation. So we have excess publication, way too much, as the editorial header makes clear.
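To make the flavor of that kind of correlation test concrete, here is a minimal sketch, in Python, of how one might probe prediction 1. This is my own toy illustration, not Chu and Evans's actual method: it assumes a hypothetical flat export of (citing_year, cited_paper_id) pairs from some citation database, and it tracks what share of each year's new citations goes to papers already among the most cited. A share that rises as yearly output grows would be consistent with the prediction.

from collections import Counter

def top_share_by_year(citations, top_frac=0.01):
    """For each year, the fraction of new citations going to papers
    already in the top `top_frac` most-cited as of the prior year.

    `citations` is an iterable of (citing_year, cited_paper_id)
    pairs - a hypothetical flat export of a citation database.
    """
    by_year = {}
    for year, cited in citations:
        by_year.setdefault(year, []).append(cited)

    cumulative = Counter()   # citation counts through the prior year
    shares = {}
    for year in sorted(by_year):
        if cumulative:
            ranked = [p for p, _ in cumulative.most_common()]
            k = max(1, int(len(ranked) * top_frac))
            canon = set(ranked[:k])   # prior years' most-cited papers
            new = by_year[year]
            shares[year] = sum(c in canon for c in new) / len(new)
        cumulative.update(by_year[year])
    return shares

# Toy data: citations increasingly pile up on paper "A" over time.
toy = [(2000, "A"), (2000, "B"), (2001, "A"), (2001, "A"),
       (2002, "A"), (2002, "C"), (2003, "A"), (2003, "A")]
print(top_share_by_year(toy, top_frac=0.5))

On real data one would then correlate those yearly shares with the yearly volume of newly published papers in each field, which is roughly the sort of statistical test the paper reports.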
What to do about it?
Day writes: "Telling physicists to publish less, publishers to stop launching new journals, and editors to reject more papers are all illiberal restrictions on freedom of expression."
But I disagree. If a tidal wave of research papers is swamping the ability of a given scientific community to take the time to process the work, then where is the benefit? If most papers are read only by their immediate authors and editors, what's the point? It appears more an exercise in academic vanity and in meeting a mandate to publish in order to remain viable at a given institution. If 25 authors need to contribute to one paper (as is happening more and more) to get credit for research at an institution, do they all get equal credit for a citation?
Don't take my word alone that there is excess, even in premier journals. Check out the 10-day cycles of contributions to The Astrophysical Journal issue in the link below. Then ask yourself just how many will read even one of the papers before the next batch becomes available (in the ensuing issue) ten days later (go to the top and click on 'next issue'):
If anyone tells me that a particular astrophysicist - even in that specialty area - is reading all of the relevant papers, I'd say they are inhabiting fantasy land. Even Day admits most papers are not being judged on individual merit, and that's just in physics. What about mathematics, biochemistry, anthropology, geology, psychology, etc.?
Even if you are unable to grasp the content of the specific papers from the link above, it is instructive to peruse the span of topics in this single issue. I encourage readers to continue looking at subsequent issues, asking how many of those papers might actually be read and understood well enough to warrant citation.
I do agree with Day that scientists need a better way to evaluate a paper's novelty. As he writes: "Whoever develops one would earn scientists' gratitude. They might even become rich."
Maybe, but I doubt it.
