Nature endorses open source software in journal submissions

Ron Dean writes in Tips and Notes:

Interesting article about a Nature editorial endorsing open source software for journal submissions. No mention of climate models, but it certainly seems to play into their insistence on reproducibility. The money quote:

“Reproducibility becomes more difficult when results rely on software. The authors of the editorial argue that, unless research code is open sourced, reproducing results on different software/hardware configurations is impossible. The lack of access to the code also keeps independent researchers from checking minor portions of programs (such as sets of equations) against their own work.”

We certainly saw problems like this when GISS released their GISTEMP code in 2008.

Written in an old version of FORTRAN, the code could not be made to work by anyone at first. If I recall correctly, after several weeks of trying, Steven Mosher got most, but not all, of it to run.

There’s something called the Data Quality Act (DQA), which defines how government agencies must make their data open, accessible, and of high quality.

Perhaps we could also do with a Code Quality Act, which would require that any software produced for publicly funded research be able to be re-run elsewhere to replicate the results. Of course, for some very large code projects, not everyone has a spare supercomputer lying around on which to reproduce complex model runs. Obviously there are practical limits, but that also raises the question: who can make sure the coding work done by researchers with unique supercomputers, purchased for a specific task, is accurate and reproducible?

It’s rather a sticky wicket.

Full story here:

http://arstechnica.com/science/news/2012/02/science-code-should-be-open-source-according-to-editorial.ars?utm_source=rss&utm_medium=rss&utm_campaign=rss


38 Comments
polistra
March 1, 2012 4:23 am

Simple answer: If it “requires” a supercomputer, it’s invalid.
Any result that’s worth changing the world or spending lots of money to implement is going to appear clearly in the direct results of a carefully designed observation or experiment.
If you need to run a model with thousands of variables, you have ALREADY ADMITTED that you are not objectively analyzing data. You are attempting to generate a desired result.

1DandyTroll
March 1, 2012 4:38 am

The CDC6600 was rated at 1 megaflop, 1964
The CRAY-1 was rated at 80 megaflops, 1975
The CRAY-X-MP was rated at 800 Megaflops, 1982
The first multicore CRAY-2 was rated at 1.9 gigaflops, 1985
The garden variety desktop CPU Intel Core I7 920 has been clocked at about 70 gigaflops.
So, essentially, pretty much everyone has access to a super computer. 🙂

Sandy
March 1, 2012 4:45 am

SETI and CERN do home number-crunching.
I wouldn’t be surprised if readers of this blog, using a similar mechanism, could get up to supercomputer power.

Pull My Finger
March 1, 2012 5:02 am

They are rushing headlong into the 1990s!

ToneD
March 1, 2012 5:03 am

Massive computing power is readily available at relatively low prices if you know how to use it properly. Just speak to the good people at Amazon AWS. There is no excuse for not validating software or algorithms that are being used to justify far reaching changes to the world economy.
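For illustration only, a minimal sketch of what renting on-demand compute through the boto3 library can look like; the region, machine image, key pair and instance type below are placeholders, not recommendations, and this is not any particular group's actual workflow:

# Launch a single on-demand EC2 instance for a batch of model runs,
# then terminate it when the work is done. All identifiers here are
# placeholders to be replaced with real values.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder machine image
    InstanceType="c5.4xlarge",         # placeholder compute-optimised type
    KeyName="my-keypair",              # placeholder SSH key pair
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print("launched", instance_id)

# ... copy code and data up, run the job, fetch results ...

ec2.terminate_instances(InstanceIds=[instance_id])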

March 1, 2012 5:03 am

The Association for Computing Machinery (ACM, the SIGGRAPH people) publishes volumes of ‘Collected Algorithms’ that are validated and available in various languages. Maybe GISS should join and avail themselves of those building blocks.
I would suspect that the other associations, like math or statistical societies, would have their own libraries available, too. Makes no sense for GISS to be reinventing the wheel in addition to the climate.

Bloke down the pub
March 1, 2012 5:05 am

If it’s not reproducible, it’s not science.

Nick Barnes
March 1, 2012 5:26 am

A few minor corrections, and some more information:
GISTEMP code was released in September 2007, not in 2008. A lot of people made a lot of noise about how convoluted it was, and how impossible it was to run it, but in fact various people, including myself, got the whole thing to run, without much difficulty (a day or two of effort). David Jones and I then rewrote the whole thing in clear, documented Python in the open-source project
http://ccc-gistemp.googlecode.com/. On the way we found a few minor bugs, which had tiny effects on the end result, and which we reported to GISS, and which were swiftly fixed in their original code.
The experience subsequently led to my writing the Science Code Manifesto http://sciencecodemanifesto.org/ last October, which makes basically the same arguments as the Ince/Hatton/JGC paper.
Finally, it’s not an editorial. It’s a Perspective piece: a peer-reviewed article which “may be opinionated but should remain balanced and are intended to stimulate discussion and new experimental approaches.” It doesn’t necessarily reflect Nature’s editorial position (although the NPG has been broadly supportive of moves towards increased code publication).

Charles.U.Farley
March 1, 2012 5:46 am

ToneD says:
March 1, 2012 at 5:03 am
Massive computing power is readily available at relatively low prices if you know how to use it properly. Just speak to the good people at Amazon AWS. There is no excuse for not validating software or algorithms that are being used to justify far reaching changes to the world economy.
————————————————————————————-
I’m not giving it to you; you just want to find something wrong with it. 🙂

Bengt A
March 1, 2012 5:55 am

Sorry for going OT, but Svensmark has published a new paper (heavy stuff!) beating CERN and CLOUD to the link between aerosols and CCNs.
http://arxiv.org/pdf/1202.5156.pdf
[Moderator’s Note: When something is OT, the best place for it is in Tips & Notes. Please. -REP]

March 1, 2012 6:10 am

Unless a complex model actually incorporates real scientific relationships, rather than derived equations that simply mimic patterns, I would have little interest in running it. Once the programmers opted for methods that merely reproduce trends, just as any curve can be fitted by a polynomial, the models became effectively worthless, as there is no way to realistically enter real-world inputs and values.

March 1, 2012 6:15 am

Probably half the reason the AGW modelers and founts of wisdom like J Hansen are unwilling to release their software is that, judging from their results, there is little reason to believe that they ever used the software or that it has anything to do with the results. It’s only because the rest of the world thinks they probably needed software to do their work that they have some lying around for window dressing. Of course, the software may not run—it was never meant to or they never attempted to run it, so they did not know it would not run.
[REPLY: Last time we looked, J Hansen had some minor affiliation with GISS and their code was released and replicated. Read the article again and Nick Barnes comment here. -REP]

Kasuha
March 1, 2012 6:42 am

polistra says:
March 1, 2012 at 4:23 am
Simple answer: If it “requires” a supercomputer, it’s invalid.
______________
There’s quite a lot of supercomputer simulations which were already proven to be accurate and reliable. For instance today’s airplane motors contain quite a bunch of parts which were calculated on supercomputers. And even weather forecasts, even though they sometimes go completely wrong, can be considered rather reliable. You can’t do either within reasonable timeframe without a supercomputer.
The main difference here is, these simulations were already proven to be accurate and reliable.

tarpon
March 1, 2012 6:43 am

Science is reproducible … The days of hiding behind a curtain of computer mystery are gone.
If the data and code are not clear, then there is something devious going on, as we have now found.

More Soylent Green!
March 1, 2012 7:10 am

Kasuha says:
March 1, 2012 at 6:42 am
polistra says:
March 1, 2012 at 4:23 am
Simple answer: If it “requires” a supercomputer, it’s invalid.
______________
There’s quite a lot of supercomputer simulations which were already proven to be accurate and reliable. For instance today’s airplane motors contain quite a bunch of parts which were calculated on supercomputers. And even weather forecasts, even though they sometimes go completely wrong, can be considered rather reliable. You can’t do either within reasonable timeframe without a supercomputer.
The main difference here is, these simulations were already proven to be accurate and reliable.

Implying the climate models are simulators is an insult to simulators. It implies the models accurately reflect how the climate works. As you noted, the climate models come nowhere close to this.

Frank K.
March 1, 2012 7:21 am

Nick Barnes says:
March 1, 2012 at 5:26 am
Nick’s efforts are laudable, and I encourage people to visit his CCC webpage.
However, I couldn’t seem to find any documentation of the GISTEMP algorithm there. The original Hansen paper is confusing and poorly written, and the terse explanation on the GISS webpage is, predictably, inadequate. And this is one of the problems with open source codes – if people are unwilling to adequately document the algorithms, then it takes some effort (maybe a LOT of effort) to ultimately determine what a given piece of code is doing, particularly when you start getting into a climate model which is solving numerous coupled differential equations.

Rob Crawford
March 1, 2012 7:22 am

This is just the minimum:
1) The code should be published.
2) The code should be under version control. Ideally, kill two birds with one stone by using something like github, so you can simultaneously publish and version control your code.
3) You should identify the version used for the results. You can “tag” a release in the version control that identifies the exact version used to produce the results in your published paper.
Further than that, LEARN HOW TO CODE. Learn about structure and object-orientation and encapsulation. Unit test your code. If you’re really using a piece of software as a tool in your research, then you should build that code with the same care you’d want in the rest of your equipment.
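As a concrete illustration of the unit-testing point, here is a minimal Python sketch; the monthly_anomaly routine is a made-up stand-in, not code from any real model:

# test_anomaly.py -- illustrative unit test for a small analysis routine.
# The function under test is hypothetical; the point is that its expected
# behaviour is pinned down by explicit, repeatable checks.
import unittest


def monthly_anomaly(temps, baseline):
    """Return each temperature minus the mean of the baseline period (hypothetical helper)."""
    base_mean = sum(baseline) / len(baseline)
    return [t - base_mean for t in temps]


class TestMonthlyAnomaly(unittest.TestCase):
    def test_zero_anomaly_against_own_baseline(self):
        temps = [10.0, 10.0, 10.0]
        self.assertEqual(monthly_anomaly(temps, temps), [0.0, 0.0, 0.0])

    def test_known_offset(self):
        # Baseline mean is 10.0, so anomalies are simply the inputs minus 10.0.
        self.assertEqual(monthly_anomaly([12.5, 9.5], [9.0, 10.0, 11.0]),
                         [2.5, -0.5])


if __name__ == "__main__":
    unittest.main()

Tests like these live alongside the published, version-controlled code, so anyone re-running a tagged release can first confirm the building blocks behave as documented.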

Urederra
March 1, 2012 7:56 am

Bloke down the pub says:
March 1, 2012 at 5:05 am
If it’s not reproducible, it’s not science.

You beat me to it.
Although in this case I would say “If it is not open source, it is not science”
The fact that Nature has to explain it speaks volumes about the state of current science.

Owen in GA
March 1, 2012 8:37 am

We use super-computer simulations for all sorts of things, but I am not aware of anyone promising a product on the back of just the simulation results (except climate science). There is usually an experimental prototype made to make sure the model assumptions were correct before major company-breaking promises are made.
For instance, simulations of candidate drugs are run to make sure the structure doesn’t run afoul of any known biological pathways before it proceeds to the more costly animal testing protocols. It saves a lot of money on testing poor candidates, and prevents some of the more egregious errors of the past. However the company then does the animal and human trial testing before (and as part of) seeking market approval. As more biologic pathways are understood and modeled, they are added to the simulation, but we will always have something else to discover about the way life works so the simulations will never be perfect. The various drug approval agencies will always require the animal and human trials for safety because it is recognized that the simulations will never be perfect.
We have had simulated designs of aircraft engines that looked great in the computer, but had to be reworked in the prototype due to some instabilities the computer didn’t pick up on. This is one of the main reasons that wind tunnels will probably never go out of business. We have had others that worked almost exactly as designed. We still tested the prototype.
Computers have their uses, and they are great tools for finding errors in known processes, but the reliance on computer models in climate theory seems all wrong. The computers should be used to provide outputs based on some assumption, then measure the real world to determine if the assumption is even justified. Climate science seems to go in assuming they have perfect understanding of all variables and then treat the computer output as though it came from the hand of God. The output of the computer is only useful if it tells you what to measure in the real world to confirm or deny your input assumptions!

Luther Wu
March 1, 2012 8:59 am

1DandyTroll says:
March 1, 2012 at 4:38 am
The CDC6600 was rated at 1 megaflop, 1964
The CRAY-1 was rated at 80 megaflops, 1975
The CRAY-X-MP was rated at 800 Megaflops, 1982
The first multicore CRAY-2 was rated at 1.9 gigaflops, 1985
The garden variety desktop CPU Intel Core I7 920 has been clocked at about 70 gigaflops.
So, essentially, pretty much everyone has access to a super computer. 🙂
______________________________
I have a smallish 3.2 TFLOPS machine that anyone could build for less than $1000. Actually, this thing is several years old, and a far more powerful machine could be built for the same.
I am not so arrogant as to assume that I could even get close to writing a program which would mimic anything near the complexities and unfathomable operands of our planet and its climate.
However, I’m willing to try. For a mere 1% of federal climate research expenditures (thus freeing up significant monies for support of all the various social justices), I will provide an omnibus overview of the climate threats we face which would answer (and, ahem, support) all known risks, such as sea-level rise, etc., and would even come up with some new, untried, and previously untested scary scenarios and rationalizations. I could have the project complete within 365 days.

More Soylent Green!
March 1, 2012 9:21 am

This is a good step in the right direction, however, it doesn’t go far enough.
1) Release the raw data
2) Release the adjusted data actually used in the model run(s)
3) Document and justify the adjustments
4) Release the source code
5) Release the documentation used to create the source code
6) Release the unit testing code, including the data used to run the unit tests
Everything should be reproducible.
However, this introduces a fundamental problem — Computer model runs are not experiments. If anybody followed these steps above, we would expect everyone to always get the same results. Same input, same code should always produce the same output.*
BTW: Perhaps if others were allowed to review and critique the sloppy code used in these models, the modelers might be inspired or shamed into writing better quality code. Can’t they find some Software Engineering graduate students to write or review the code?
* The discipline of software testing is based on this principle, so please, let’s not get into an OT discussion of the extremely rare run-time conditions under which the results might not always be the same.
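One way to make the same-input-same-code-same-output property checkable in practice is to publish a hash of the model output alongside the paper, so a rerun anywhere else can be compared digest against digest. A toy Python sketch, with a made-up stand-in for the real model:

# Toy reproducibility check: fix the random seed, run the "model",
# and compare a hash of the output against a published reference value.
import hashlib
import json
import random


def run_model(seed: int, steps: int) -> list:
    """Stand-in for a deterministic model run (purely illustrative)."""
    rng = random.Random(seed)          # seeded RNG, so the run is repeatable
    state = 0.0
    out = []
    for _ in range(steps):
        state = 0.9 * state + rng.gauss(0.0, 1.0)
        out.append(round(state, 12))   # round recorded values to limit float noise
    return out


def output_digest(output: list) -> str:
    """Canonical JSON serialisation of the output, then SHA-256."""
    blob = json.dumps(output, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    first = output_digest(run_model(seed=42, steps=1000))
    second = output_digest(run_model(seed=42, steps=1000))
    assert first == second, "same code + same input should give the same output"
    print("run digest:", first)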

AFPhys
March 1, 2012 10:31 am

[Moderator’s Note: When something is OT, the best place for it is in Tips & Notes. Please. -REP]
Thanks for posting that! It ought to be a prominent request somewhere near the top of each article.

Septic Matthew/Matthew R Marler
March 1, 2012 10:31 am

Written in an old version of FORTRAN, the code could not be made to work by anyone at first. If I recall correctly, after several weeks of trying, Steven Mosher got most, but not all, of it to run.
How much would be required to be “open sourced”? The compiler? The operating system? Some very good statistical software is not “open sourced”, namely SAS and SPSS. R is open sourced but not (I may be wrong about this) the compilers used. I think focusing on the open source is focusing on the wrong problem.
Instead what is required is that the programs be well-commented and as modular as possible (i.e., none of the ancient spaghetti code), that all the variables be declared, and that nifty machine-dependent features be avoided so as to facilitate portability.
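For what that might look like in practice, a small, deliberately plain Python sketch: a self-contained, documented routine with no machine-dependent tricks, chosen only as an illustration:

def linear_trend(times, values):
    """Ordinary least-squares slope and intercept of values against times.

    Pure Python, no machine-dependent features; inputs are plain sequences
    of numbers of equal length, so the routine runs the same anywhere.
    """
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Classic OLS formulas: slope = cov(t, v) / var(t).
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    var = sum((t - mean_t) ** 2 for t in times)
    if var == 0.0:
        raise ValueError("times must not all be identical")
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Example: a perfectly linear series recovers slope 0.5 and intercept 1.0.
print(linear_trend([0, 1, 2, 3], [1.0, 1.5, 2.0, 2.5]))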
More Soylent Green! recommends:
1) Release the raw data
2) Release the adjusted data actually used in the model run(s)
3) Document and justify the adjustments
4) Release the source code
5) Release the documentation used to create the source code
6) Release the unit testing code, including the data used to run the unit tests

If scientists use open source software but do not do Green!’s steps, then the use of open source software accomplishes very little or nothing.
Nick Barnes says:
March 1, 2012 at 5:26 am
Thank you for that.

1DandyTroll
March 1, 2012 11:01 am

Luther Wu says:
March 1, 2012 at 8:59 am
“I have a smallish 3.2 TFLOPS machine that anyone could build for less than $1000. Actually, this thing is several years old, and a far more powerful machine could be built for the same.
I am not so arrogant as to assume that I could even get close to writing a program which would mimic anything near the complexities and unfathomable operands of our planet and its climate.”
Neither can GISS/Hansen, apparently. According to James Hansen the climate models are stripped-down versions of weather models; I guess that’s why they don’t want to show ’em in public. :p
Here’s Control Data’s 1604 in action, delivered in 1963 to the US Navy.
http://archive.computerhistory.org/resources/text/CDC/CDC.WeatherbyComputer.1963.102641261.pdf
A whopping 48-bit computer capable of calculating a 48-hour projection in six minutes.
So, essentially, since then, apparently, nothing much has happened, other than Climate offices, er, Met offices, around the world wasting taxpayers’ hard-earned cash on super duper computing power to do the same in about the same time, which means the quality of code has degraded to the point of utter parallel chaos, hence chaos computing in a parallel universe. :p

AFPhys
March 1, 2012 11:32 am

Two comments:
First —
Those who think that (super)computer models are worthless need to educate themselves.
Examples: most integrated circuits now are completely computer-designed, and in fact could not be designed without computer modeling of everything from circuit layout to electronic noise. A company would not dare set up a mass production line and distribution based solely on the results of their models and simulations, but creation of the designs by hand is not really possible. Likewise, aircraft are now built and “flown” with computer models, and modified many times, before scale models are built for wind tunnel tests, before full-scale prototypes are built, before test aircraft are built, before commercial production. Modeling is essential, and it can work well. However, it cannot be totally relied on, or companies would skip the testing and go directly to mass production.
I have experience with a computer model that worked well compared to the physical item. It was changed in a minor manner that “worked” beautifully in the simulation, but when the physical item was modified in that manner (for a mere $10,000,000) the result failed. A large team of people tinkering with the new equipment for three months, and working on the model for over six months, failed to explain the disparity. The machine was returned to its original state, where the model works just fine in explaining many other types of modifications. Obviously, we can’t tinker with the climate or the climate models at the same “down to the thousandth of an inch”, “thousandth of a volt” level. So, the only way to do the modeling is to test it against reality, where it seems to me the climate models clearly fall short, for theoretical reasons and with respect to data, among others.
These are the kinds of reasons why I don’t trust the climate models and believe it likely they will never be reliable. Nevertheless, to say computer models are worthless in general gives the CAGW believers too easy a target to shoot down, and I strongly discourage that attitude because it is simply wrong.
Second point –
As far as publishing the code so everyone can use it, that is fine. However, I have to caution that there is a danger in everyone simply running the same code. It is important that several separate teams of people, working as independently as possible, do the same type of simulation. Subtle errors can sneak in and stay hidden no matter how many people are looking at the same code (or proofreading the same paragraph, for example). When it comes to code that can be compared to reality, at least those errors usually come to light. In something like this climate modeling project, where that comparison will only become possible perhaps a century after the code is written, it is not useful.
I am reminded of the Space Shuttle computers as a great way to do this – see http://en.wikipedia.org/wiki/Space_Shuttle – do a search on the page for IBM – and pay special attention to how four computers checked each other for hardware failures, and the “Backup Flight System (BFS) was separately developed software running on the fifth computer”.
Given that the majority of the world’s politicians seem to be interested in spending trillions of dollars over the next century, redundant software programs are essential.
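A toy Python illustration of the cross-checking idea above: two independently written routines for the same quantity are compared within a tolerance, so a bug in either one shows up as a disagreement (the integrand is arbitrary):

# Cross-check two independent implementations of the same calculation.
# If they disagree beyond tolerance, at least one of them has a bug.
import math


def integrate_trapezoid(f, a, b, n=10000):
    """Trapezoidal rule, written one way."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def integrate_midpoint(f, a, b, n=10000):
    """Midpoint rule, written independently."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    a, b = 0.0, math.pi
    t = integrate_trapezoid(math.sin, a, b)
    m = integrate_midpoint(math.sin, a, b)
    # Both should agree with each other (and with the exact value, 2.0).
    assert abs(t - m) < 1e-6, (t, m)
    print(t, m)

The agreement only builds confidence because the two routines were written separately; running one implementation twice would tell you nothing.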
