The code of Nature: making authors part with their programs

Guest post by Shub Niggurath

However, as it is now often practiced, one can make a good case

that computing is the last refuge of the scientific scoundrel.

—Randall LeVeque

Nature - show me your code if you want to

Some backstory first

A very interesting editorial has appeared recently in Nature magazine. What is striking is that the editorial picks up the same strands of argument that were considered in this blog – of data availability in climate science and genomics.

Arising from this post at Bishop Hill, cryptic climate blogger Eli Rabett and encyclopedia activist WM Connolley claimed that the Nature magazine of yore (c. 1990) required only crystallography and nucleic acid sequence data to be submitted as a condition for publication (which implied that all other kinds of data were exempt).

We showed this to be wrong (here and here). Nature, in those days, placed no conditions on publication, but instead expected scientists to adhere to a gentleman’s code of scientific conduct. Post-1996, it decided, like most other scientific journals, to make full data availability a formal requirement for publication.

The present data policy at Nature reads:

… a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers without undue qualifications in material transfer agreements.

Did the above mean that everything was to be painfully worked out just to be gifted away, to be audited and dissected? Eli Rabett pursued his own inquiries at Nature. Writing to editor Philip Campbell, the blogger wondered: when Nature says ‘make protocols promptly available’, does it mean ‘hand over everything’, as in the case of the software code used?

I am also interested in whether Nature considers algorithmic descriptions of protocols sufficient, or, as in the case of software, a complete delivery.

Interestingly, Campbell’s answer addressed something else:

As for software, the principle we adopt is that the code need only be supplied when a new program is the kernel of the paper’s advance, and otherwise we require the algorithm to be made available.

This caused Eli Rabett to become distracted and forget his original question altogether. “A-ha! See, you don’t have to give code” (something he’d assumed was to be given).

At least something doesn’t have to be given.

A question of code

The Nature editorial carried the same idea about authors of scientific papers making their code available:

Nature does not require authors to make code available, but we do expect a description detailed enough to allow others to write their own code to do a similar analysis.

The above example serves to illustrate how partisan advocacy positions can cause long-term damage in science. In some quarters, work proceeds tirelessly to obscure and befuddle simple issues. The editorial raises a number of unsettling questions that such efforts seek to bury. Journals try to frame policy to accommodate the requirements and developments of science, but apologists and obscurantists hide behind journal policy to avoid providing data.

So, is the Nature position sustainable as its journal policy?

Alchemy - making things without knowing

A popularly held notion mistakes publication for science; in other words – it is science in alchemy mode. ‘I am a scientist and I synthesized A from B. I don’t need to describe how, in detail. If you can see that A could have been synthesized from B without needing explanations, that would prove you are a scientist. If you are not a scientist, why would you need to see my methods anyway?’

It is easy to see why such parochialism and closed-mindedness was jettisoned. Good science does not waste time describing every known step, or in pedantry. Poor science tries to hide its flaws in stunted description, masquerading as the terseness of scholarly parlance. Curiously, it is often the more spectacular results that are accompanied by this technique. As a result, rationalizations for not providing data or method take on the same form – ‘my descriptions may be sketchy, but you cannot replicate my experiment because you are just not good enough to understand the science or follow the same trail’.

If we revisit the case of Duke University genomics researcher Anil Potti, this effect is clearly visible (a brief introduction is here). Biostatisticians Baggerly and Coombes could not replicate Potti et al’s findings from the microarray experiments reported in their Nature Medicine paper. Potti et al’s response, predictably, contained the defense: ‘You did not do what we did’.

Unfortunately, they have not followed our methods in several crucial contexts and have made unjustified conclusions in others, and as a result their interpretation of our process is flawed.

Because Coombes et al did not follow these methods precisely and excluded cell lines and experiments with truncated -log concentrations, they have made assumptions inconsistent with our procedures.

Behind the scenes, web pages changed, data files changed versions and errors were acknowledged. Eventually, the Nature Medicine paper was retracted.

The same thing repeated itself, with greater vehemence, with another paper. Dressman et al published results from microarray research on cancer in the Journal of Clinical Oncology; Anil Potti and Joseph Nevins were co-authors. The paper claimed to have developed a method of finding out which cancer patients would not respond to certain drugs. Baggerly et al reported that Dressman et al’s results arose from ‘run batch effects’ – i.e., results that varied solely because parts of the experiment were done on different occasions.

This time the response was severe. Dressman, with Potti and Nevins wrote in their reply in the Journal of Clinical Oncology:

To “reproduce” means to repeat, following the methods outlined in an original report. In their correspondence, Baggerly et al conclude that they are unable to reproduce the results reported in our study […]. This is an erroneous claim since in fact they did not repeat our methods.

Beyond the specific issues addressed above, we believe it is incumbent on those who question the accuracy and reproducibility of scientific studies, and thus the value of these studies, to base their accusations with the same level of rigor that they claim to address.

To reproduce means to repeat, using the same methods of analysis as reported. It does not mean to attempt to achieve the same goal of the study but with different methods. …

Despite the source code for our method of analysis being made publicly available, Baggerly et al did not repeat our methods and thus cannot comment on the reproducibility of our work.

Is this a correct understanding of scientific experiment? If a method claims to have uncovered a fundamental facet of reality, should that facet not be robust enough to be revealed by other methods as well – methods that follow the same principle but differ slightly? Obviously, Potti and colleagues are wandering off into the deep end here. The points raised are unprecedented and go well beyond the specifics of their particular case: not only do the authors say ‘you did not do what we did and therefore you are wrong’, they go on to say ‘you have to do exactly what we did to be right’. In addition, they attempt to shift the burden of proof from a paper’s authors to those who critique it.

Victoria Stodden

The Dressman et al authors faced round criticism from statisticians Vincent Carey and Victoria Stodden for this approach. Carey and Stodden note that a significant portion of the Dressman et al results were nonreconstructible – i.e., they could not be replicated even with the original data and methods, because of flaws in the data. This was exposed only when attempts were made to repeat the experiments, and it undercuts the authors’ comments about the rigor of their critics’ accusations. Carey and Stodden take issue with the claim that only the precise original methods can produce true results:

The rhetoric – that an investigation of reproducibility just employ “the precise methods used in the study being criticized” – is strong and introduces important obligations for primary authors. Specifically, if checks on reproducibility are to be scientifically feasible, authors must make it possible for independent scientists to somehow execute “the precise methods used” to generate the primary conclusions.

Arising from their own analysis, they agree firmly with Baggerly et al’s observations of ‘batch effects’ confounding the results. They conclude, making crucial distinctions between experiment reconstruction and reproduction:

The distinction between nonreconstructible and nonreproducible findings is worth making. Reconstructibility of an analysis is a condition that can be checked computationally, concerning data resources and availability of algorithms, tuning parameter settings, random number generator states, and suitable computing environments. Reproducibility of an analysis is a more complex and scientifically more compelling condition that is only met when scientific assertions derived from the analysis are found to be at least approximately correct when checked under independently established conditions.

Seen in this light, it is clear that an issue of ‘we cannot do what you say you did’ will morph rapidly into ‘do your own methods do what you say they do?’ Intractable disputes arise even when both author and critic are expert, and when much of the data is openly available. Full availability of data, algorithm and computer code is perhaps the only way to address both questions.
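To make the distinction concrete, here is a minimal sketch in R – my own illustration on synthetic data, not code from any of the studies discussed above. Reconstruction means rerunning the archived analysis and getting the same numbers; reproduction means probing the same scientific claim with an independently chosen method and asking only for approximate agreement.

```r
## A hypothetical sketch of the Carey-Stodden distinction; synthetic data
## stand in for an archived dataset.
set.seed(20110226)                      # the recorded random-number state
d <- data.frame(x = rnorm(200))
d$y <- 0.5 * d$x + rnorm(200)

## Reconstruction: rerun the archived analysis verbatim and check that the
## numbers match those printed in the paper.
archived_fit <- lm(y ~ x, data = d)
print(coef(summary(archived_fit))["x", ])

## Reproduction: test the same claim (a positive association) with a
## different method entirely, and see whether the conclusion still holds.
print(cor.test(d$x, d$y, method = "spearman"))
```

The first check can fail for purely computational reasons – missing code, unrecorded settings, a different software version – while the second can fail even when the first succeeds.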

Therefore, Nature magazine’s approach of not asking for software code as a matter of routine, while requiring everything else, becomes difficult to sustain.

Software dependence

The results of experiments can hinge on software alone, just as they can on the other components of scientific research. The editorial recounts one more instance of this: bioinformatics findings that depended on the version number of the commercially available software employed by the authors.

The most bizarre example of software-dependence of results, however, comes from Hothorn and Leisch’s recent paper ‘Case studies in reproducibility’ in the journal Briefings in Bioinformatics. The authors recount the example of Pollet and Nettle (2009) reaching the mind-boggling conclusion that wealthy men give women more orgasms. The results remained fully reproducible – in the usual sense:

Pollet and Nettle very carefully describe the data and the methods applied and their analysis meets the state-of-the-art for statistical analyzes of such a survey. Since the data are publicly[sic] available, it should be easy to fit the model and derive the same conclusions on your own computer. It is, in fact, possible to do so using the same software that was used by the authors. So, in this sense, this article is fully reproducible.

What then was the problem? It turned out that the results were software-specific.

However, one fails performing the same analysis in R Core Development Team. It turns out that Pollet and Nettle were tricked by a rather unfortunate and subtle default option when computing AICs for their proportional odds model in SPSS.
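For readers who want to see what such a cross-check involves, here is an illustrative sketch in R, using simulated data rather than Pollet and Nettle’s survey: fit a proportional odds model and compare candidate models by AIC, the quantity whose SPSS default reportedly misled the authors.

```r
## Illustrative only -- simulated data, not the Pollet & Nettle survey.
library(MASS)   # for polr(), the standard proportional odds fit in R

set.seed(2009)
d <- data.frame(income = rnorm(300), age = rnorm(300, 30, 5))
latent <- 0.4 * d$income + rnorm(300)
d$freq <- cut(latent, breaks = quantile(latent, seq(0, 1, 0.25)),
              include.lowest = TRUE, ordered_result = TRUE, labels = 1:4)

m_full <- polr(freq ~ income + age, data = d, Hess = TRUE)
m_null <- polr(freq ~ age,          data = d, Hess = TRUE)

## R reports AIC = -2*logLik + 2*df; recomputing it by hand, or in a second
## package, is the kind of check that exposes software-specific defaults.
AIC(m_full, m_null)
```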

Certainly this type of problem is not confined to one branch of science. Many a time, the description of a method conveys one thing but the underlying code does something else (of which even the authors are unaware); the results in turn seem to substantiate emerging, untested hypotheses, and as a result the blind spot goes unchecked. Veering to climate science and the touchstone of code-related issues in scientific reproducibility – the McIntyre and McKitrick papers – Hothorn and Leisch draw the obvious conclusions:

While a scientific debate on the relationship of men’s wealth and women’s orgasm frequency might be interesting only for a smaller group of specialists there is no doubt that the scientific evidence of global warming has enormous political, social and economic implications. In both cases, there would have been no hope for other, independent, researchers of detecting (potential) problems in the statistical analyzes and, therefore, conclusions, without access to the data.

Acknowledging the many subtle choices that have to be made and that never appear in a ‘Methods’ section in papers, McIntyre and McKitrick go as far as printing the main steps of their analysis in the paper (as R code).

Certainly, when science becomes data- and computing-intensive, the question of how to reproduce an experiment’s results is inextricably linked with its repeatability or reconstructibility. Papers may fall into any combination of repeatability and reproducibility, with varying degrees of both, and yet be wrong. As Hothorn and Leisch write:

So, in principle, the same issues as discussed above arise here: (i) Data need to be publically[sic] available for reinspection and (ii) the complete source code of the analysis is the only valid reference when it comes to replication of a specific analysis
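In practice, condition (ii) amounts to publishing a script that is itself the reference for the analysis. A minimal, hypothetical sketch in R, with synthetic data standing in for a publicly archived file, might look like this:

```r
## A hypothetical example of "the complete source code of the analysis":
## one self-contained script that regenerates the published numbers and
## records the software environment that produced them.
set.seed(123)
d <- data.frame(year = 1891:2010)
d$temp <- 0.005 * (d$year - 1890) + cumsum(rnorm(nrow(d), 0, 0.1))  # synthetic series

trend <- lm(temp ~ year, data = d)      # the analysis itself, not a prose summary of it
print(coef(summary(trend))["year", ])   # slope, standard error, t and p

print(sessionInfo())                    # R version and loaded packages, for the record
```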

Why the reluctance?

(Image © Jan Hein van Dierendonck)

What reasons can there be for scientists being unwilling to share their software code? As always, the answers turn out to be far less exotic. In 2009, Nature magazine devoted an entire issue to the question of data sharing; post-Climategate, it briefly addressed issues of code. Computer engineer Nick Barnes opined in a Nature column on the software angle and why scientists are generally reluctant. He sympathized with scientists – they feel that their code is very “raw” and “awkward”, and therefore hold “misplaced concerns about quality”. Other, more routine excuses for not releasing code, we are informed, are that it is ‘not common practice’, will ‘result in requests for technical support’, is ‘intellectual property’, and that ‘it is too much work’.

In another piece, journalist Zeeya Merali took a less patronizing look at the problem. Professional computer programmers were less sanguine about what was revealed in the Climategate code.

As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists. At best, poorly written programs cause researchers such as Harry to waste valuable time and energy. But the coding problems can sometimes cause substantial harm, and have forced some scientists to retract papers.

While Climategate and HARRY_READ_ME focused attention on the problem, this was by no means unknown before. Merali reported results from an online survey by computer scientist Greg Wilson conducted in 2008.  Wilson noted that most scientists taught themselves to code and had no idea ‘how bad’ their own work was.

As a result, codes may be riddled with tiny errors that do not cause the program to break down, but may drastically change the scientific results that it spits out. One such error tripped up a structural-biology group led by Geoffrey Chang of the Scripps Research Institute in La Jolla, California. In 2006, the team realized that a computer program supplied by another lab had flipped a minus sign, which in turn reversed two columns of input data, causing protein crystal structures that the group had derived to be inverted.

Geoffrey Chang’s story was widely reported in 2006. By the time the code error was detected, his Science paper on a protein structure had accumulated more than 300 citations, influenced grant applications, caused contradictory papers to be rejected, and fed into drug development work. Chang, Science magazine reported scientist Douglas Rees as saying, was a hard-working scientist with good data, but the “faulty software threw everything off”. Chang’s group retracted five papers in prominent science journals.
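As a toy illustration – in R, and in no way the Scripps group’s actual pipeline – of how a single flipped sign can silently invert a derived structure:

```r
## Toy illustration only. A stray minus sign applied to one coordinate column
## mirrors a 3-D structure; everything downstream still runs, but the model's
## handedness has been silently inverted.
set.seed(42)
coords <- matrix(rnorm(15), ncol = 3)            # five points of a toy "structure"

flip_x <- function(m) { m[, 1] <- -m[, 1]; m }   # the kind of inherited bug at issue

## Chirality check: the signed volume of the tetrahedron on points 1-4
## changes sign under a mirror reflection.
signed_volume <- function(m)
  det(rbind(m[2, ] - m[1, ], m[3, ] - m[1, ], m[4, ] - m[1, ]))

signed_volume(coords)           # original handedness
signed_volume(flip_x(coords))   # same magnitude, opposite sign: an inverted model
```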

Interestingly enough, Victoria Stodden reports in her blog that she and Mark Gerstein wrote a letter to Nature responding to the Nick Barnes and Zeeya Merali articles, voicing some disagreements and suggestions. They felt that journals could help take up the slack:

However, we disagree with an implicit assertion, that the computer codes are a component separate from the actual publication of scientific findings, often neglected in preference to the manuscript text in the race to publish. More and more, the key research results in papers are not fully contained within the small amount of manuscript text allotted to them. That is, the crucial aspects of many Nature papers are often sophisticated computer codes, and these cannot be separated from the prose narrative communicating the results of computational science. If the computer code associated with a manuscript were laid out according to accepted software standards, made openly available, and looked over as thoroughly by the journal as the text in the figure legends, many of the issues alluded to in the two pieces would simply disappear overnight.

We propose that high-quality journals such as Nature not only have editors and reviewers that focus on the prose of a manuscript but also “computational editors” that look over computer codes and verify results.

Nature decided not to publish it. It is now obvious why.

Code battleground

Small sparks about scientific code can set off major rows. In a more recent example, the Antarctic researcher Eric Steig wrote in a comment to Nick Barnes that he faced problems with the code of Ryan O’Donnell and colleagues’ Journal of Climate paper. Irked, O’Donnell wrote back that he was surprised Steig hadn’t taken the time to run their R code as a reviewer of their paper – a fact which had remained unknown up to that point. The ensuing conflagration is now well known.

In the end, software code is undoubtedly an area where errors, inadvertent or systemic, can lurk and significantly affect results, as even the few examples above show again and again. In his 2006 paper on reproducible research, published in the Proceedings of the International Congress of Mathematicians, Randall LeVeque wrote:

Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory. However, as it is now often practiced, one can make a good case that computing is the last refuge of the scientific scoundrel. Of course not all computational scientists are scoundrels, any more than all patriots are, but those inclined to be sloppy in their work currently find themselves too much at home in the computational sciences.

However, LeVeque was perhaps a bit naïve in expecting only disciplines with significant computing to attempt getting away with poor description:

Where else in science can one get away with publishing observations that are claimed to prove a theory or illustrate the success of a technique without having to give a careful description of the methods used, in sufficient detail that others can attempt to repeat the experiment? In other branches of science it is not only expected that publications contain such details, it is also standard practice for other labs to attempt to repeat important experiments soon after they are published.

In an ideal world, authors would make their methods, including software code, available along with their data. But that doesn’t happen in the real world. ‘Sharing data and code’ for the benefit of ‘scientific progress’ may be driving data-repository efforts (such as DataONE), but hypothesis-driven research generates data and code specific to the question being asked, and only the primary researchers possess such data to begin with. As the “Rome” meeting of researchers, journal editors and attorneys wrote in their Nature article laying out their recommendations (Post-publication sharing of data and tools):

A strong message from Rome was that funding organizations, journals and researchers need to develop coordinated policies and actions on sharing issues.

When it comes to compliance, journals and funding agencies have the most important role in enforcement and should clearly state their distribution and data-deposition policies, the consequences of non-compliance, and consistently enforce their policy.

Modern-day science is mired in rules and investigative committees. What should be done naturally in science – showing others what you did – becomes a chore under a regime. However, rightly or otherwise, scientific journals have been drafted into making authors comply. Consequently, it is inevitable that journal policies become battlegrounds where such issues are fought out.


71 Comments
February 26, 2011 10:20 pm

Shub Niggurath,
I enjoyed your circumspect approach to the topic of disclosure of scientific documentation.
It is my intention to advise my representatives in the US Federal Government as follows:

NOTE: Of course I would wordsmith the following a lot before sending to my government reps. : )

1. All, and I mean ALL, documentation in support of climate research performed with government funds must be supplied to citizens on request. ALL means all, which means the code too. Also, associated emails.
2. To the extent that government funds were provided to a scientist who subsequently submits a paper to a scientific journal using the result of government funding, and that paper is published, then the author(s) of the paper must show all documentation to any citizen . . . . ALL as in including code.
3. In the modern era, clicking a button or several on a computer is all that is required to submit all documentation. Electronic storage is absurdly cheap. So, arguments that it is too costly or time consuming or manpower intensive are insufficient arguments.
John

February 26, 2011 10:29 pm

Or, perhaps more succinctly: Computer Code is Method.

Pingo
February 26, 2011 10:59 pm

Why should I show you my work when you only want to find something wrong with it? A great read, thanks.

Steve R
February 26, 2011 11:28 pm

This is a subject of significant interest to me. As a licensed engineer I’m bound by law to be liable for the accuracy and appropriateness of all solutions, regardless of whether I’ve used pencil and paper for a simple analytical solution, or computer code for a complex numerical solution. I would rank computer code in order of my increasing burden of proof that the results are appropriate:
1. Computer code which is in the public domain and has been verified by others in the literature to properly reproduce solutions of known analytical test cases.
2. Proprietary, but commercially available computer code which has been demonstrated to reproduce known analytical solutions.
3. Proprietary code which is not commercially available and has no previous evaluation or review in the literature.
If I were reviewing a paper and the results were presented using a computer code such as 1) above, I would be ok as long as the author made the raw input files to the code available and I was provided assurance that solution parameters were chosen that did not cause any sort of numerical instability in the solution or otherwise exceed the capabilities of the solution techniques used in the code. Of course whether their solution makes any sense and leads to the conclusion described by the author is a separate matter, concerning proper calibration and selection of proper input parameters.
If I were reviewing a paper using unverifiable proprietary code such as 3) above, I would require a complete description of the code, the numerical solutions used, and a rigorous verification of the code by comparing its solutions to a wide range of known analytical solutions. I would expect the bulk of the paper to simply address the code, and I would still be skeptical of the conclusions drawn from use of the code.

Brian H
February 26, 2011 11:45 pm

First, Edit Notes:
’does your own methods do what you say they do?’ == ‘do your own methods …’
a bit naivé, == a bit naïve
_______
Second, it is a strange “truth of nature” which can only be observed by following a specific study protocol. (I am reminded of Skinner’s “Superstition and the Pigeon”, in which he induced hungry pigeons to perform bizarre dances and rituals by letting them have food pellets at random intervals.) Surely, any coherent hypothesis/conclusion should be demonstrable in a large number of procedurally unrelated ways. A “scientific truth” specific to one and only one method of discovery is of little value or significance.

February 27, 2011 12:00 am

Thank you for this.
The reality is that data and code need to be disclosed so that errors – such as the inversion you mention – can be corrected.
And, just as importantly, before a computationally dense paper can be analyzed, its own basis should be replicated. This should not be made difficult. Instead, the researchers should make a serious effort to put their “workings” before their peers and the occasional itinerant auditor who happens upon their material.
After all, there is nothing to lose scientifically. To be found out in an error is not shameful, it is useful. (A fact Steig seems to have trouble with, but, oh well.)
Science advances by getting closer to the truth. Providing the data and the code which got you closer to the truth confirms your claim. Unless it doesn’t in which case you go back to your work and see where you went wrong. No harm in that. Two steps forward, one step back.

Hoser
February 27, 2011 12:03 am

Software is intellectual property. It can be a trade secret, or patented. If it is published without protection, potential value is lost. That means there is a significant extra step that may be required prior to publishing. Of course, that is true of any process, frequently the scientific work that is the subject of the paper.
Perhaps authors should think about publishing the software aspect separately, and only demonstrating the process acting on representative data. Another possibility would be to deposit compiled versions of software without source code, but with descriptions of algorithms. These compiled versions might be downloaded and installed on compatible systems for anyone to use. Or, there might be a web interface to a server that could process data input from another laboratory.
It is important to provide the software name and version in methods sections. If a widely-available commercial product is used, obviously, the authors should not be required to deposit the software or source code in an archive prior to publication. Software changes rapidly, because people want to do new things with a useful tool. Of course, there are always bugs to discover and fix. Keeping track of every version could be a terrible burden and eventually a waste of resources.
I don’t think there is a perfect solution regarding software. People have too many different systems and preferred programming languages. Software can be combined in a larger process in nearly infinite ways. I believe it should be sufficient to report the approach and the tools used. A good scientist would try to make software available to colleagues. Universities are almost as concerned with intellectual property as industry. Academia is not much of a refuge.
Bad science gets discovered eventually. We can’t eliminate all of it through a prefilter. We find it when results just never seem to be repeated by anyone else. To me, the worst evil in climatology was multiple researchers apparently conspiring to hide cherry-picked data and the fraudulent nature of their conclusions. They may not be the only ones. I believe the only explanation for their conspiracy is the grant money involved. Large amounts of cash seem to breed corruption.
At this point, I’ve concluded corruption at some level does seem to be a necessary component of AGW. It seems to be more cult religion than science.

dp
February 27, 2011 12:17 am

I presume Potti is in a prison somewhere now, fed beans with a slingshot on odd numbered days that are also primes, and in months with a blue moon.

Steve Garcia
February 27, 2011 12:23 am

This is an excellent post, Shub.
Code must be likened to a lab book, in any research that includes processing of data, and maybe nowhere more so than in climate science compilations with its reams of data, necessary adjustments and homogenization.
Providing textual descriptions of what the code is tasked with doing is directly analogous to an abstract of a scientific paper – a summary of what is described in greater detail in the body of the paper. But would any scientist think it adequate to provide only an abstract? Of course not. No one would accept only an abstract.
Just so, no one should accept descriptions of what code is intended to do, not without being able to see the code itself to see if some single typo or some mis-programming might have occurred.
I found it very impressive that over 300 papers had citations to Chang’s work, and that the process and his integrity brought about the retraction of his papers. I am sure it is something he lost sleep over, but it had to be done.
Let us not peer-review here. It DOES need to be pointed out that Chang’s papers were peer-reviewed prior to publication, thus showing that peer-review is imperfect and that any appeals to peer-review as proof that work is unimpeachable are to be viewed skeptically. No matter how small a percentage of reviews turn out to have missed flaws, the fact that they do should mean that the “assumption of error” approach of those attempting to replicate work is not “us vs them,” but merely scientists doing due diligence. It simply cannot suffice for others to start replication from an assumption of correctness.
Correctness has to earn its way to respect. The original work is step #1, review is step #2, publication is step #3, and replication is step #4. Only after step #4 is it established science.
Note that consensus was not one of the four steps.

Steve Garcia
February 27, 2011 12:32 am

Steve R, Feb 26, 2011 at 11:28 pm:

If I were reviewing a paper using unverifiable proprietary code such as 3) above, I would require a complete description of the code, the numerical solutions used, and a rigourous verification of the code by comparing it’s solutions to a wide range of known analytical solutions.

Yes, thanks for pointing out the last.
Why anyone would assert that new code is to be accepted without this proof of reliability and applicability beggars belief. I see replication as part of this step in making a paper become established “scientific fact,” because replication will ideally include testing this “wider range.”

Richard111
February 27, 2011 12:36 am

So some scientists are not so good at writing original computer code?
Who’d ‘ave thunk it?

February 27, 2011 12:44 am

Has Lonnie Thompson archived *any* of his data yet?

Mike G
February 27, 2011 1:46 am

You raise a vital topic and make an excellent case, but don’t go far enough!
The impression is given that coding bugs are a risk but perhaps not extensive, and that testing code is straightforward. Both are wrong. Coding bugs are endemic and difficult to find. Any code that has not been rigorously tested will contain bugs.
Any software-based work should include not just how the code works, but how it was tested and verified. Submitted code should include a software test harness and test data.
On the issue of intellectual property, why don’t publications require that code is released under some type of formal public licence?
Mike

February 27, 2011 1:52 am

I would suggest that the version problem can be minimized by requiring that the final published results be run on a (clean, minimal) virtual machine which is then archived at the journal’s SI site.
A Windows 7 VM (which is, in my experience, the most space-consuming) requires about 10 GB of OS before you stick the relevant scientific packages etc. on it. However, that 10 GB is compressible, so the total archive size is rather less than that. A Linux VM typically requires 1-2 GB for a basic OS plus GUI. ISTR that I built a Linux VM with R on an 8 GB HD and never used more than about a third of the disk space, even with multiple R libraries added.
There are a huge number of benefits from requiring the clean build. These range from being completely certain about the code version and any dependencies to the ability to document precisely the steps taken (install a specific OS from ISO or VMware image, install the following versions of tools etc.). Overall, it helps enforce a good attitude regarding version control and other related software-engineering best practices, and, by providing the VM, it allows anyone who wishes to reproduce the results, given the input data. This then allows those who are interested to audit the process by analysing code quality or verifying robustness by porting the code to a different OS/language/library version etc.

stupidboy
February 27, 2011 2:05 am

Excellent article. The dangers of acceptance of insufficient testing are highlighted by the failure of the drug Thalidomide. The United States was very fortunate to have Frances Kelsey review the drug for the FDA. She decided that Thalidomide had not been sufficiently tested, so FDA approval was not given.
In countries which had accepted the test results and approved the drug, around 12000 babies were born with phocomelia (birth deformities) due to the use of Thalidomide during pregnancy. However only seventeen were born in the US.
The original testing had been on rats, which had apparently shown no ill-effects. When Thalidomide was found to be teratogenic (cause malformation of a fetus) further tests on rabbits and primates resulted in phocomelia in these animals.
Thousands of people in the USA, around the age of fifty, were born healthy because Frances Kelsey refused to accept, on faith, the research results presented to her.

February 27, 2011 2:13 am

One of the factors probably resulting in most resistance to publishing ones code is that a lot of scientists write ugly code. I worked as a scientific programmer in the 1980’s and there was considerable pressure to get the code working and get on with the real work of the lab which was electrophysiology. Once the code seemed to work, then it was used in experiments and I’d often be frantically programming in the middle of an experiment to correct bugs which had just showed up and many of these quick fixes were undocumented and so different versions of the program were used for different sets of data. Fortunately, the data acquisition and storage portion of the code was well debugged initially in the project but even the final version of my code still has bugs.
I found this out the hard way when I agreed to port what I thought was a well tested assembly language PDP-11 program I’d written . This was a bleeding edge program for the time pushing the PDP-11 to its limits for data acquisition/realtime analysis and I naively assumed that all I’d have to do was recompile the program in my colleague’s lab and use a lower A/D sampling rate for his slower PDP-11 model. What I didn’t know was that some instructions were missing from his machine, specifically SOB (subtract one and branch) and I just replaced that instruction with a decrement and comparison, but then the program didn’t work. It took over a day of almost continuous poring over my code to find a very subtle error which resulted in the program working just fine with the ordering of the instructions I had for my PDP 11/34 but failing on the lower end PDP-11. I’m sure there are still other bugs waiting to be found in that piece of code.
This is why I like open source code which has been described as peer-reviewed code. It’s impossible to adequately document scientific code as it’s constantly changing. When we needed programming help in the lab we chose engineers who got code written and working fast whereas computer science students coded far too slowly producing excessively commented pretty code instead of working programs. The other thing that has to be specified is what compiler was used for the code, what libraries were linked into the code and, what fixes were made to the libraries as well as the CPU version used. I didn’t realize how much I patched certain programs until I attempted to unsuccessfully run some of my own code 20 years later on a non-customized PDP-11 system.
When I first started at UBC, we stored experimental results in analog form on a multichannel tape recorder which means the worst case scenario is that one just has to re-analyze the original data if the analysis program has serious bugs. The last work I was doing involved total digital recording and there were numerous complaints about my being overly obsessive with meticulously documenting and testing the data acquisition system (my job also included digital logic design) and writing what I was told was far too detailed calibration code. That was the one section of the project which I had great confidence in but there were some major errors along the way in the data display and analysis sections which almost always involved using + instead of – or dividing when I should have multiplied. I hate to think what type of errors are present in some of the insanely complex satellite temperature measurement programs where the displayed data involves multiple processing steps.
One of the problems we had was that development of the programs was considered ancillary and the actual results which involved measuring the transfer functions of guinea pig trigeminal ganglion neurons and how they were affected by general anesthetics were the only things that counted for publication credits which were needed to renew grants. I estimate that 95% of the work on this project involved programming and digital electronics to perform a novel type of analysis which was pushing the limits of 1980’s hardware. I guess I got something right as people have now duplicated the results we got 25 years ago just using the general description of how we did the experiments. I tried to use some of my data acquisition routines recently and found it simpler to rewrite them than try to figure out my 30 year old FORTRAN code where a 2 character variable name was used only when absolutely necessary and, not only did I use GOTO’s in profusion, but I hacked the FORTRAN compiler so that I could create jump tables and other techniques which are considered to be too dangerous to use in code now.
I should also note that the full extent of my formal computer science education consisted of one evening FORTRAN programming course in 1969.

Richard
February 27, 2011 2:22 am

Hoser says:
February 27, 2011 at 12:03 am
“Software is intellectual property. It can be a trade secret, or patented. If it is published without protection, potential value is lost.”
This is fortunately not true. Software is automatically copyrighted upon creation (Berne convention and laws in each jurisdiction). It has the same protections as e.g. a novel that has been published. The mere fact of publication does not remove the copyright protections.
I repeat, this means that the software source code can be made available without loss of protection. Perhaps if (make that when) code is required to be made available, then the scientists creating the code will have it checked by competent programmers before they use it to create dubious results. I am certain that following the “open source” model, which does not necessarily involve the “free/libre open source” model, then major improvements will be made in the code and in the reliability of the results.

Patrick Davis
February 27, 2011 2:23 am

I am not a (Paid) scientist, BUT I have studied physics and chemistry since before the AGW-CO2 scare began (Long years. And I wish people would go read a real book and learn unbiased views, libraries are, STILL, free, rather than Wiki/Google “filtered” links. I can hope, right?). I started my “study” when “scientists” stated “an ice age was approaching”, due to emissions of CO2. Hummmm! There may have been genuine scientific concern, unfortunately, Thatcher sorted that out for us all. Her vision for the UK was “services”, NOT “making stuff”, as “making stuff” relied on COAL (Energy) at that time. Coal was bad in her eyes (Miners), AS WELL AS, oil AND nuclear. So, crush “industry” (The making stuff bit), and “expand services”. Result? Stuff all your eggs in one basket…=FAIL!

Mike Jowsey
February 27, 2011 2:30 am

Thanks for a good, informative and revealing article. It seems to this layman that the computer age has taken the scientific community by surprise with respect to:
1. software documentation and version control
2. the blog review process
3. “freedom of information” (e.g. climategate)

Another Gareth
February 27, 2011 2:44 am

Shub said: Certainly this type of problem is not confined to one branch of science. Many a time, description of method conveys something, but the underlying code does something else (of which even the authors are unaware), the results in turn seem to substantiate emerging, untested hypotheses and as a result, the blind spot goes unchecked.
The thrust of your piece is about how not making codes available is allowing bad science to be sustained. More might be achieved in persuading journals if you reverse the argument – good science *may* be being abandoned due to duff results caused by computer programs.
Rather than laying it on thick that it should be easier to show where individual scientists have gone wrong, releasing computer code as a matter of routine would be a massive aid to science in general getting things right. Perhaps even release code long before you finish your paper, so that you don’t slog away for months or years only to be undone by bad computer code.
Nature said: A strong message from Rome was that funding organizations, journals and researchers need to develop coordinated policies and actions on sharing issues.
Nonsense. What is it with this institutional craven attitude, bordering on mania, that no group should dare take the lead on something? They wish to erase the effect of peer pressure, to make ‘advances’ that the most illustrious journals can take credit for and further cement themselves at the top of the tree.
This is an opportunity for other journals to upset the applecart.

Brian H
February 27, 2011 2:47 am

Hoser, that’s nonsense. “Intellectual property” is a term relevant to a marketable product or process, not to scientific reports to the research community. Get this straight: the purpose of research, in each specific instance and in general, is to bring forward ideas and evidence which contribute to understanding. PERIOD. Especially when it comes to climate. Unless you have a climate control device you’re trying to sell?
Any competing interests are destructive to that understanding. Are they what you are defending?

Dave
February 27, 2011 3:20 am

Speaking as a programmer, I’d say there are two necessary practices to adopt. The first is something that might better be termed code auditing than code review: the code used to write a paper needs to be checked by an expert programmer to make sure it does what it is supposed to do. The choice of programming methodology would be more suitable for reviewers to comment on – so in some cases a reviewer will need to be an expert programmer. The other practice I would suggest involves the creation of some kind of standard pseudo-code with which one can accurately describe the programmatic steps taken, without providing actual code to run a program.
All that said, the adoption of proper programming practices would significantly improve matters. Scientists in general don’t seem to appreciate that programming is a profession just as much as practising law or medicine, with similar levels of knowledge and experience required to do the job well. People who wouldn’t dream of attempting to diagnose and treat their own illnesses, or representing themselves in court, will think themselves capable of effectively writing code that they are equally unqualified to write.

Dave
February 27, 2011 3:22 am

Oh, one more point: adequate unit-testing would all-but prevent the programmatic ‘bugs’ from affecting results. Every section of code should be checked piece by piece to ensure correct function at every stage.

Scottish Sceptic
February 27, 2011 3:24 am

I wonder if there is not a case for a sub-division of science: Basic-science and “scientific interpretation”. The aim of basic science would be to provide a repository of scientific fact – facts that are undisputed. As such basic-science would publish all code, data etc. etc.
Then perhaps for those who wish to be more secretive in their techniques, there should be lesser standards for “interpretative” science, whereby the rules for publishing details of code and methodology would be a lot less rigorous.
The problem with climate “science” is that it seems to want to have its cake and eat it. It wants to be seen as providing the raw data, the climate “facts”, but it doesn’t want to be subject to public scrutiny.

R2
February 27, 2011 3:54 am

Journals / papers that do not publish computer code = grey literature
