The Science Publishing Complex – 1% publish 41% of all papers

Erik Stokstad in Science (AAAS) writes: Publishing is one of the most ballyhooed metrics of scientific careers, and every researcher hates to have a gap in that part of his or her CV. Here’s some consolation: A new study finds that very few scientists—fewer than 1%—manage to publish a paper every year.

But these 150,608 scientists dominate the research journals, having their names on 41% of all papers. Among the most highly cited work, this elite group can be found among the co-authors of 87% of papers.

The new research, published on 9 July in PLOS ONE, was led by epidemiologist John Ioannidis of Stanford University in Palo Alto, California, with analysis of Elsevier’s Scopus database by colleagues Kevin Boyack and Richard Klavans at SciTech Strategies. They looked at papers published between 1996 and 2011 by 15 million scientists worldwide in many disciplines.

“I decided to study this question because I had seen in my life a large number of talented people who just did not survive in the current system and with the current limited resources,” Ioannidis wrote to ScienceInsider in an e-mail. He suspected that only a few scientists are able to publish papers year in, year out. But the finding that less than 1% do so surprised him, he says.

h/t to Dennis Wingo

===================================================

It seems to me that a similar rule holds true in climate science, with “big names” like Mann, Jones, Trenberth, Hansen, and others appearing on more papers than the mean, but only an analysis will answer that question for certain.

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0101698

The paper:

Estimates of the Continuously Publishing Core in the Scientific Workforce

John P. A. Ioannidis, Kevin W. Boyack, Richard Klavans
Published: July 09, 2014 DOI: 10.1371/journal.pone.0101698

Abstract

Background

The ability of a scientist to maintain a continuous stream of publication may be important, because research requires continuity of effort. However, there is no data on what proportion of scientists manages to publish each and every year over long periods of time.

Methodology/Principal Findings

Using the entire Scopus database, we estimated that there are 15,153,100 publishing scientists (distinct author identifiers) in the period 1996–2011. However, only 150,608 (<1%) of them have published something in each and every year in this 16-year period (uninterrupted, continuous presence [UCP] in the literature). This small core of scientists with UCP are far more cited than others, and they account for 41.7% of all papers in the same period and 87.1% of all papers with >1000 citations in the same period. Skipping even a single year substantially affected the average citation impact. We also studied the birth and death dynamics of membership in this influential UCP core, by imputing and estimating UCP-births and UCP-deaths. We estimated that 16,877 scientists would qualify for UCP-birth in 1997 (no publication in 1996, UCP in 1997–2012) and 9,673 scientists had their UCP-death in 2010. The relative representation of authors with UCP was enriched in Medical Research, in the academic sector and in Europe/North America, while the relative representation of authors without UCP was enriched in the Social Sciences and Humanities, in industry, and in other continents.

Conclusions

The proportion of the scientific workforce that maintains a continuous uninterrupted stream of publications each and every year over many years is very limited, but it accounts for the lion’s share of researchers with high citation impact. This finding may have implications for the structure, stability and vulnerability of the scientific workforce.
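The UCP definition in the abstract (at least one publication in each and every year of 1996–2011) is simple enough to sketch in code. What follows is a minimal illustration on invented toy data, not the authors' Scopus analysis: the function names `ucp_authors` and `share_with_core`, the author IDs and the paper IDs are all hypothetical. Only the two headline counts (150,608 and 15,153,100) come from the abstract.

```python
from collections import defaultdict

FIRST_YEAR, LAST_YEAR = 1996, 2011  # the 16-year window named in the abstract


def ucp_authors(records, first=FIRST_YEAR, last=LAST_YEAR):
    """Return the author IDs that appear in every year of the window.

    `records` is an iterable of (author_id, year) pairs, one per authorship.
    (Hypothetical helper, for illustration only.)
    """
    years_by_author = defaultdict(set)
    for author_id, year in records:
        if first <= year <= last:
            years_by_author[author_id].add(year)
    required = set(range(first, last + 1))
    # An author qualifies only if no year in the window is missing.
    return {a for a, years in years_by_author.items() if years >= required}


def share_with_core(paper_authors, core):
    """Fraction of papers that have at least one author in `core`."""
    hits = sum(1 for authors in paper_authors.values() if authors & core)
    return hits / len(paper_authors)


# Toy data (invented): A publishes every year, B skips 2003, C publishes once.
records = (
    [("A", y) for y in range(1996, 2012)]
    + [("B", y) for y in range(1996, 2012) if y != 2003]
    + [("C", 2000)]
)
core = ucp_authors(records)

# Toy paper-to-authors map (invented) to show how a small core can sit on
# a large share of papers.
papers = {"p1": {"A", "B"}, "p2": {"C"}, "p3": {"B"}, "p4": {"A"}}
core_share = share_with_core(papers, core)

# The only real numbers here: the counts reported in the abstract.
headline_fraction = 150_608 / 15_153_100

print(core, core_share, f"{headline_fraction:.2%}")
```

In the toy data a single skipped year is enough to drop author B out of the core, which is the strictness the abstract describes. The last line simply checks the arithmetic behind the "<1%" figure.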

 


36 thoughts on “The Science Publishing Complex – 1% publish 41% of all papers”

  1. Boasting on my wife’s behalf: Joan joined academia in middle age. In 23 years as a biostatistician at a medical school, she authored or co-authored well over 100 papers. Aside from working very hard, she had the opportunity because medical journals generally require that a paper involving statistical analysis must have a qualified statistician as one of the co-authors.

    I wish climate journals would adopt the same rule.

  2. And of those published papers, most are never replicated or tested. Until their results are replicated, it’s not science but merely one person’s opinion.

  3. I think this is true across most academic disciplines, and indeed most walks of life. In my own field (social sciences) I would expect similar statistics. Publishing in double blind peer reviewed journals is tough. You have to be very skilled, very driven, and a little lucky to continue to publish in top tier outlets. It’s also true that the more you publish, the better you get at dealing with reviewers, and the more successful you become. Also true that the more you publish, the better your access to resources (e.g. research grants), and the better your chances get.

  4. Jim Steele is right in that replication is the hallmark of good science. Put differently, hypotheses can only be considered well verified when they survive repeated attempts at statistical falsification. So a one-off result should be viewed with extreme skepticism. A result that is verified again and again in multiple studies is obviously far more robust.

  5. Interesting point, David, regarding “because medical journals generally require that a paper involving statistical analysis must have a qualified statistician as one of the co-authors.” Perhaps if certain skeptical climate bloggers were qualified climate scientists (meaning they hold a degree in a relevant field, and are actively out doing research), their research would be published too.

  6. Articles have to be about ‘original’ work. Now, how many times a year do you have a Nobel Prize-winning idea, or access to original raw data? Most of what is published is rehash from rehash.

  7. What you find if you actually look at the papers published by MANY of that core 1% is that they publish multiple papers on the SAME research with slightly different twists, and they put them in different journals, and add different co-authors so they can all get as many publications to their collective credits as possible. I do a lot of literature research as part of my professional (non-academic) life and continually come up against virtually the same paper by different authors – except, each of the many authors are listed in different orders so the citation in the SAME paper you are reviewing appears to be MANY people corroborating the finding, but in fact is the SAME group who just shuffles the findings and author name orders a little bit.

    What a game.

  8. I remember when I started reading ‘climate’ related papers, how the same group of people all seemed to collaborate on the papers, just trading off the lead name, and using each others’ papers as references. Seemed like a closed loop.

  9. 1. Time available to do real work decreases as time spent reporting on same increases.
    2. Equilibrium is reached when 100% of time available is spent reporting on the 0% real work being done.
    3. Academia is the only place where equilibrium is routinely attained.

    plagiarized from somewhere….

  10. I suspect that most papers involving tenured people represent work done mainly by graduate students and postdocs, so there is a natural trend to a relatively small fraction of names on many papers. Early success leads to tenure and money for students and postdocs –> more success –> more papers, etc.

    Also, many academics get caught up in teaching and admin (and of course advocacy in climate science), so effectively cease publishing when they get tenure.

  11. kenw says: July 15, 2014 at 9:02 am
    1. Time available to do real work decreases as time spent reporting on same increases.

    A lot of researchers do the daily grind work, and simply don’t have time to publish.

  12. The first paper to be published is the most difficult. As you publish more and more papers, you get better and more efficient at it. I’ve always thought that it would be a big help if there was some sort of template (maybe there is), to give you a format and structure, into which you may type your work.

  13. In most fields that 1% forms a clique based upon competence and creativity, whereas in climate “science” it’s based on being the best scammers. The star system works fine when reports of fraud are rapidly punished instead of rewarded, as is indeed the case in nearly all other fields. Most Ivy League scientists attract both massive funding and the very best students so they end up with a group of twenty or even forty full time workaholics at the top of their creative game, so several papers a year is really the output of dozens of scientists in training. I know also that hard physical scientists and engineers tend to not even care about the climate scam since they don’t see existing climatology as anything more than yet another soft science like sociology or highly political anthropology. The one thing top schools do get away with in science is hyping results up and publishing a lot of breakthrough proof of concept inventions that don’t always actually work well in terms of yield and reliability and there’s pressure to leave optimization to lesser schools and industry. That’s not fraud but it’s a bit silly since it then takes a long time for invented tools to become robustly useful. The funny thing is how climatology now attracts the worst students of all, the activists who didn’t have much passion for the hard sciences where the big mysteries and big inventions come from.

    -=NikFromNYC=-, Ph.D. in chemistry (Columbia/Harvard)

  14. The new research, published on 9 July in PLOS ONE, was led by epidemiologist John Ioannidis

    epidemiologist [-dē′mē·ol′əjist] – a physician or medical scientist who studies the incidence, prevalence, spread, prevention, and control of disease in a community or a specific group of individuals. In a hospital a physician may be assigned as a staff epidemiologist with responsibility for directing infection control programs within the facility.

    http://medical-dictionary.thefreedictionary.com/epidemiologist

    So it took someone who looks at a spreading disease in a specific group to check what the results looked like.

  15. Johan says:
    July 15, 2014 at 9:20 am
    ……
    exactly. they’re doing real work. Not talking about it….

  16. In 1993, the United States Supreme Court in Daubert v. Merrell Dow (509 US 579) redetermined what constituted scientific knowledge for federal courts. It was well-supported by many so-called experts in the field. It adopted five guidelines, four “shoulds” and one “should not”. They are, in case order:

    1. The knowledge should be testable and preferably tested, or falsifiable. (Popper (1963), p. 7.)

    2. The knowledge should have been subjected to peer review and publication. (An individual cannot be objective, but must be challenged. Popper (1945) I-213, I-225-6; Popper (1966) pp. 1-2 of 31.)

    3. A court should consider “the known or potential rate of error” of any scientific technique. (Truth can never be established, only falsity. Popper (1945), p. II-343 fn. 7(2); Popper (1934/1959), p. 275.)

    4. Results should have been accepted by a consensus in the scientific community. (Reason is a social phenomenon. Popper (1945) p. II-205.)

    5. Conclusions should not depend on the application, but instead be based solely on scientific principles and methodology. (This “Should-not” is the negation of Popper’s urging that science is a moral endeavor, in which a model is not accepted because its predictions are valid, but rather on its social consequences. Contradicting Popper (1945), p. 220.)

    The Court didn’t discover the attributions to Popper.

    #1 is based on Popper’s misunderstanding that scientific propositions were Universal Generalizations (famously “All crows are black”). He confused scientific propositions, which are modus ponens statements, with definitions. Popper famously said, “definitions do not matter”. He was against definitions.

    #2 is the darling of the academics, translated as “publish or perish”.

    #3 is a misunderstanding of decision theory, which involves Errors of the First Kind (Type I) and Errors of the Second Kind (Type II). Popper’s falsification miscalculation (#1) required zero Type II errors (perfect modeling) and he gave no credit to Type I errors.

    #4 is the notion that science is about voting instead of models with predictive power. It is why we have professional publications of dogma. Popper’s reduction included intentional removal of all traces of Cause & Effect, Bacon’s addition in 1620 that created Modern Science.

    #5 is Popper’s intersubjective testing, which is relativism blended with political correctness. Popper’s scientific conclusions had to reflect the consequences, social or other, of scientific propositions. The Supremes knew THAT much was wrong: it preempted the role of the triers of fact. The Court could tell when ITS ox was being gored.

    The entire construct is Post Modern Science, conveniently documented in Daubert. The intersection of PMS and MS, which is the only science practiced in industry, is the empty set. PMS (+ $Gov) is why AGW climatologists exist.

  17. “It seems to me that a similar rule holds true in climate science, with “big names” like Mann, Jones, Trenberth, Hansen, and others being on more papers than is the mean, but only an analysis will answer that question for certain.”

    The Wegman Report comes to mind.

  18. 15 million (published) scientists! How come we don’t already know everything there is to know about everything, forever, already?

    How ironic, that in that softest of sciences, literature, the best publish the least (think Tolstoy)

  19. Looks like a special case of the “Pareto Distribution”

    “Pareto stated in his book [1] that there is a simple law which governs the distribution of income in all countries and at all times. Briefly, if N represents the number of people with wealth larger than a certain income limit x, and A and a are constants, then N = A / x^a, and therefore
    log(N) = log(A) - a log(x).”

    With different a factors, the formula can be used to model letter distribution in an alphabet, number of cities in a country with populations greater than N, and number, or tonnage of ships sunk by U-Boat commanders during WWII.
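The log form quoted above is a straight line in log(x) with slope -a, so the constants can be read off a log-log fit. The sketch below assumes nothing beyond that formula. The values of A and a, the thresholds and the helper name `fit_pareto_tail` are made up for illustration and are not taken from the paper or from any real data set.

```python
import math


def fit_pareto_tail(xs, ns):
    """Ordinary least-squares fit of log(N) = log(A) - a*log(x).

    Returns (A, a). Hypothetical helper, for illustration only.
    """
    log_x = [math.log(x) for x in xs]
    log_n = [math.log(n) for n in ns]
    mean_x = sum(log_x) / len(log_x)
    mean_n = sum(log_n) / len(log_n)
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_n) for u, v in zip(log_x, log_n))
    slope = sxy / sxx                      # equals -a for a Pareto tail
    intercept = mean_n - slope * mean_x    # equals log(A)
    return math.exp(intercept), -slope


# Synthetic counts generated exactly from N = A / x^a with made-up constants.
TRUE_A, TRUE_EXPONENT = 1000.0, 1.5
thresholds = [1, 2, 4, 8, 16, 32]
counts = [TRUE_A / x ** TRUE_EXPONENT for x in thresholds]

A_hat, a_hat = fit_pareto_tail(thresholds, counts)
print(A_hat, a_hat)
```

Because the synthetic counts follow the formula exactly, the fit recovers the made-up constants up to floating-point error. On real, noisy counts a log-log least-squares fit is only a rough first look, and maximum-likelihood estimators are generally preferred for heavy-tailed data.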

  20. Ioannidis is very famous in the medical world, as a sceptic attacking statistical sleight of hand in medical papers. Probably the equivalent of Steve McIntyre for climate…

  21. steveta_uk says:
    July 15, 2014 at 8:59 am
    How much is there of “if you add me to your paper, I’ll add you to mine”?

    I don’t know how often that (hit swapping) goes on in the natural sciences. But in economics and finance it is quite common. There are many people who form groups of three or four and automatically put each others’ names on their papers, even if their friends had little or no input into them. That way, instead of 2 publications per year, they get 6 or 8 and they look like geniuses.

  22. I’m not that sure that ‘publishing’ is a guide to a scientist’s quality or usefulness.

    I base this partly on experience, when I was at university, the ones who were keenest on publishing tended to be the ones that were the most out of touch with both students and society, the ones that treated students the worst and as just commodities for their dirty work, the ones that were the most inflexible, the ones that gamed the system, the ones that were the most dishonest, the ones that were the most socially unaware, the ones that never did any field work, whilst the others who didn’t publish just got on with the job teaching and instructing. These interacted with the students and treated them like human beings, whilst the publishing scientists tended to see students as simply as a resource to be exploited, students were only useful so far as they fit into their rigid research agendas. Personalities and lifestyles, regardless of the legality, had to make way for research requirements, rather than the other way around. Researchers could pick and chose which students they liked and which students fit their agenda, rather than allowing students to express themselves and determine their own futures. Some of the most prolific publishers didn’t even teach any courses, and used their higher status to get away with it.

    There was most definitely a corrupting influence in the ‘publish or perish’ culture. This wasn’t the case amongst many who didn’t ‘publish’ at all.

  23. Further to my comments above, which have relevance to the above article on research statistics, university administrators have stated to me that part of the problem is that if you are going to have a publishing culture, there needs to also be adequate supervision of what researchers are up to, which isn’t always available. They themselves have sometimes complained about lack of resources to properly supervise what researchers were doing.

    So you have the paradox here of scientists’ need for funding and resources, and the reality that there also need to be adequate resources to ensure that researchers, and university research departments, are adequately supervised and conduct research in a proper manner. In other words, someone needs to make sure that researchers follow the rules.

  24. Wunderbar!, I’m in the top 1% finally, but where is my 1% income? Well, I don’t work in medicine or climate science, and I’m definitely not a rocket scientist, so bugger me.

    Still, I do not find this surprising. I remember a study from long ago that the average PhD published less than 1 paper per lifetime. Some people are driven to do research and publish it and some aren’t. Some of the latter are happy enough to have convinced themselves of the validity of a particular hypothesis – one of my mentors was like that, he knew an incredible amount about his field, but only published when he was sure no one could rubbish him. I’m not a perfectionist and I do feel a debt to the tax payers who have funded my research, so I send out whatever I think will float in the hopes that someone else will find it useful. Besides, it’s too easy to convince yourself you are right – you need those jealous colleagues to keep you honest. Worked for me in terms of reputation, and there has been no falsification or regression to the mean (yet), but it has meant not having much of a real life outside of research and teaching. Well, better than a lifetime spent playing computer games (including models).

  25. In my career I have found papers tend to be like buses. You can easily have a year with none then in the next year three come all at once. This is particularly the case with complicated research projects which take time to set up and do.

    I also wonder how many of the 1% are the sort of academic leader that insists on their name being added to everything their team produces.

  26. Agree with some commenters like Tom G(oligist) above. Back when I was struggling with biochemistry and physiology, I remember doing a literature search on a particular metabolic pathway found in some marine organism. A husband and wife research team would publish five or six papers a year, by dint of publishing every time they had completed a fresh experiment – the results of prior experiments were regurgitated in each paper, with each experiment acting as a new variation, adding a tiny, or in some instances, nonexistent, increment of knowledge (“the results confirm our conclusions from earlier experiments etc.”) It was not bad science, but certainly was not earthshaking.

    Historically, in earlier fisheries science, some individual scientists were remarkably prolific in terms of both quantity and variety of research. More recently, the big producers have a much narrower focus – but this must be said for those in environmental science.

    It must be acknowledged that a lot of important work by scientists in environmental fields (e.g. fisheries, entomology etc.) ends up in grey literature (not published in science journals, but printed and circulated in-house) – serving as a database for other research. The material is important, but does not offer new theoretical insights and hence is not suitable for science publication.

  27. mrjohnmcnab says:
    July 16, 2014 at 3:05 am

    In my career I have found papers tend to be like buses. You can easily have a year with none then in the next year three come all at once.
    *****************************
    I find elevators just as accommodating. About the same time frame too.

  28. Sometimes this is simply due to a negotiation over ‘usage rights’.

    There are some biologists who made a set of monoclonal antibodies and every paper using them coming out of that lab for the next 15 years has their name on it, even though they left the lab 10 years previously. I am strongly opposed to this, because it pumps up certain people’s ‘productivity’ when they didn’t actually do any of the work. Fine for 2 or 3 years after they leave the lab, but no more. At that point, the reagent is in general use and nothing new has been added.

    You find similar situations when a lab head shares reagents with other labs – of course, the first few papers should acknowledge the work of creating the reagents, but how long for??

    More discerning people have two lists: first authorship + senior authorship (i.e. last); and all the rest. If you are first, it probably means you did a majority of the work. If you are last, you are probably the lab head who raised the funds, supervised the work etc. In between, who knows what you did…..

    A lab head who can’t publish every year either doesn’t have a very big lab or they do very, very speculative research. If you have 3 PhD students and two postdocs, you’d expect to have at least 2 – 3 papers a year of original research and you might well be authoring a review article, a technical monograph not to mention editing a book.

    For a PhD student or a postdoc, the key is to ‘fit into an experimental system that is already up and running’. If the lab already has this slick, you can easily get enough data in 6 months to publish. If you have to develop something from scratch, it might take you 18 months before results start coming. Rarely is that spade work acknowledged by employers.

    Let’s ask this: if you saw Fred Sanger’s publication record over 40 years, he published under 10 papers. Pretty pathetic eh??

    Well, not really. He invented the method of sequencing the amino acid composition for proteins, which won him a Nobel Prize. He invented the method of using dideoxy sequencing of DNA, which underpins the entire genetics revolution. That won him a second Nobel. He also invented a method for sequencing RNA, which didn’t win him a third Nobel, probably because there’s not really a major application for sequencing RNA. He was also at the forefront of sequencing entire genomes for small bacteriophages, which was, in its day, as big an achievement as the automated sequencing of a plant genome today.

    It takes insight to judge what publication records really mean. If you want ‘dull but worthy’, look for lots of papers which didn’t change fields, but added knowledge. If you want ‘genius’, you may need to back them for 6 or 7 years without publishing anything, then you might make £100m as an institution on the back of a groundbreaking set of patents, which then get published academically once protection is secure. Of course, you might end up with nothing too. Don’t risk it if you can’t stand the possibility of failure…….

  29. DD More says:
    July 15, 2014 at 10:17 am

    The new research, published on 9 July in PLOS ONE, was led by epidemiologist John Ioannidis
    ,,,
    So it took someone who looks at a spreading disease in a specific group to check what the results looked like

    He has a lot more going on, and going for him, than that. He has spearheaded a challenge to medical publishing that is literally world-shaking. Try this 2010 article, first: http://www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/308269/
