The Journal Science – Free the code

In my opinion, this is a testament to Steve McIntyre’s tenacity.

Via the GWPF: At Last, The Right Lesson From Climategate Fiasco

Monday, 16 April 2012 11:21 PhysOrg

A diverse group of academic research scientists from across the U.S. has written a policy paper, published in the journal Science, suggesting that the time has come for all science journals to require that computer source code be made available as a condition of publication. Currently, they say, only three of the top twenty journals do so.

The group argues that because computer programs are now an integral part of research in almost every scientific field, it has become critical that researchers provide the source code for custom-written applications so that work can be peer reviewed or duplicated by other researchers attempting to verify results.

Not providing source code, they say, is now akin to withholding part of the procedure, resulting in a “black box” approach to science that would not be tolerated in virtually any other area of research in which results are published. It’s difficult to imagine any other realm of scientific research getting such a pass, and the fact that code is not published openly detracts from the credibility of any study based on it. Articles based on computer simulations, for example, such as many written about astrophysics or environmental predictions, tend to become meaningless when they are offered without the source code of the simulations on which they are based.

The team acknowledges that many researchers are clearly reticent to reveal code that they feel is amateurish due to computer programming not being their profession, and that some code may have commercial value, but suggest that such reasons should no longer be considered sufficient for withholding such code. They suggest that requiring researchers to reveal their code would likely result in cleaner, more portable code, and that open-source licensing could be made available for proprietary code.

They also point out that many researchers use public funds to conduct their research and suggest that entities that provide such funds should require that software created as part of any research effort be made public, as is the case with other resource materials.

The group also points out that the use of computer code, both off-the-shelf and custom-written, will likely become ever more prevalent in research endeavors; thus, as time passes, it becomes ever more crucial that such code be made available when results are published. Otherwise, the very nature of peer review and reproducibility will cease to have meaning in the scientific context.

More information: Shining Light into Black Boxes, Science 13 April 2012: Vol. 336 no. 6078 pp. 159-160 DOI: 10.1126/science.1218263

Abstract
The publication and open exchange of knowledge and material form the backbone of scientific progress and reproducibility and are obligatory for publicly funded research. Despite increasing reliance on computing in every domain of scientific endeavor, the computer source code critical to understanding and evaluating computer programs is commonly withheld, effectively rendering these programs “black boxes” in the research work flow. Exempting from basic publication and disclosure standards such a ubiquitous category of research tool carries substantial negative consequences. Eliminating this disparity will require concerted policy action by funding agencies and journal publishers, as well as changes in the way research institutions receiving public funds manage their intellectual property (IP).

=========================================


248 thoughts on “The Journal Science – Free the code”

  1. This basic common sense rule is long overdue. It falls under the oft neglected science rule of: “If it’s not reproducible, it’s not science.”

  2. That’s great news! I’ve wondered why this wasn’t demanded from the beginning, especially with regard to the climate studies, which were mostly based on computer models. Had this been in place, the “decline” could not have been hidden.

    Of course the corollary to this is that the source data must be available as well, otherwise with what would the computer program be fed?

    Does it look like this might actually have the legs to change standard practice across the research world?

  3. The need to examine programs and code has been compelling for at least the last decade. How can you replicate/check anything without it?

    Pointman

  4. That’s great news! I had always wondered why this wasn’t a requirement all along. Of course, the corollary to this is that the source data must be made available as well. Had this been the practice there would have been no “hidden decline”.

    Does this have the legs to actually become standard practice?

  5. This is right on in spades too. The philosophical underpinnings of science simply demand complete and total openness when it comes to data, procedures and methods. Anything less is not science. Somehow we scientists must learn to overcome our egos and state clearly and publicly in our papers, books and presentations not just what we did, but exactly what assumptions were applied and, more importantly, precisely what it is we simply do not know.

  6. “… many researchers are clearly reticent to reveal code that they feel is amateurish due to computer programming not being their profession”. An excellent reason FOR publishing the code, since a program that has been written ‘amateurishly’ is potentially erroneous, as well as ugly.

  7. Wonderful news. This should have been government (federal, state, local, universities, non-profits, anyone receiving taxpayer funding or tax breaks or support of any kind) policy all along. Anyone receiving public funds for any purpose should be 100% transparent in everything. The only exception would be military and other legitimate national security information. And, I believe that even this information should be safeguarded and archived for release at an appropriate time.

    Because governments of all kinds take our taxes by force, we should know completely and without limitations where and how that money was spent. And, those responsible for spending any tax money should be held accountable, fully accountable. Kind of an opt-out policy for national security; all else is public property and public information. This should extend to all fruit of publicly funded activities.

    Happy day today and three cheers for Science Magazine.

  8. The comments at Phys Org are interesting. But none addresses what I suspect is the central point of keeping the code secret: not letting on that it is filled with bias, fudge factors, and errors. (I use the term “suspect” because NO one has any way to evaluate the quality of the codes.)

    Furthermore, if the code is released, it becomes impossible to argue that the data feeding it can or should be kept secret.

    As a reminder, it was requests for the codes and data that started the CAGW smear campaign to begin with. I expect the scientists making this call will be attacked in short order.

  9. Even baby steps add up. Of course, so much of science is “citing the previous guys” and all that code will still be exempt as “prior art”.

  10. Len, FYI: As a general rule, any code developed by a DoD contractor with any contract money is the property of the US government.

  11. A great initiative, and not just for climate science. Of all the fields of scientific endeavor, medical research has shown the highest incidence of academic fraud. Requiring the release of source code along with the publication would have caught many of these frauds long before they saw the light of day.

  12. Won’t matter much, since the military has adopted Carbon Sequestration as its own main mission. This will exempt all warmist code on national security grounds.

  13. This should be required for any academic work. It is a part of the research and the source information. No academician should be able to hide numbers behind an unpublished code. No peer review committee should just accept numbers without that source provided.

  14. As Steven Mosher says – cool.

    With regard to this bit:

    many researchers are clearly reticent to reveal code that they feel is amateurish due to computer programming not being their profession

    If they’re going to draw conclusions based on the results produced by that code, they’re going to have to get used to the code flapping about in the wind with everybody staring at it. Nobody who matters will laugh, I promise. The other thing to remember is that if the best criticism that can be levelled is “your code is amateurish”, then clearly said critics can’t find issue with the way the “amateurish” code works, or the models’ outputs.

    And with regard to this:

    some code may have commercial value

    If the code has commercial value, and can be shown beyond doubt to have been developed with strictly private funds only, then it gets tricky. Validation of the code by an NDA-bound third party would be a good start, as would submittal of all raw data input.

  15. This is something that should be clear AND easy for everyone to understand. IF the science is to be trusted, shouldn’t everything be open for review?

  16. How much of the stuff that’s out there is amateurish and therefore should be retracted until its code is improved?

    I wonder also why many wouldn’t approach physics or a lab in an amateurish fashion but think nothing of writing rubbish computer code.

  17. The team acknowledges that many researchers are clearly reticent to reveal code that they feel is amateurish due to computer programming not being their profession and that some code may have commercial value
    =============================
    And how many people have fallen for this….for how long

    The first one is obvious……the second one is….well…..commercial ops have better programmers, and if it had any commercial application…they would already have one and it would be better….

    Translated: We produce a really crappy below standard product….but someone with far superior sources might steal it

  18. While laudable [and with my full support] there are problems. Some research does not require specific custom-written code, but can be adequately done with interactive tools. A trivial example being Excel. So, there is no code to publish. Publishing the spreadsheet might solve the problem for Excel, but not for more sophisticated [or proprietary] tools. Another problem is that the code might actually be quite small, but use extensive libraries and procedures developed by others [and shared by a larger community]. Some of that may be difficult to publish and near impossible for outsiders to use. In my own research I have often tried to use other people’s code, but mostly given up and written my own because of the steep learning curve involved in using other people’s stuff.

    • Leif – Maybe the solution is to require all code be programmed in something portable. FORTRAN used to be the universal programming language for science in the early days, now I’d guess that C++ might be more common…hard to say.

  19. Anthony Watts says:
    April 17, 2012 at 11:56 am
    Leif – Maybe the solution is to require all code be programmed in something portable. FORTRAN used to be the universal programming language for science in the early days, now I’d guess that C++ might be more common…hard to say.
    My choice would be R. But that is probably asking for too much. Now, don’t laugh, but a lot of my own coding is done in a dead [but standard] language COBOL [because of its easy handling of large amounts of textual data and databases]. My programs are completely portable and will run on ANY computer with a COBOL compiler.

    • Leif, no laughing. I myself programmed FORTRAN on punch cards and paper tape in ASR33 teletype terminals…talk about dead.

  20. Louise on April 17, 2012 at 10:42 am said:
    Does this site require this of all of its posters (e.g. Dr Spencer) ?

    ??…..Unbelievable! How could any honest person be against this proposal? And why would “this site” NOT require this? Why on earth not! Talk about a made up mind, damn…

    • Just ignore Louise, she’s just a well meaning Earth momma out to get me to save the Earth…she obviously has issues.

  21. It’s one thing to dash off some hastily assembled code for an immediate task, with the programmer relying on short-term memory for documentation, but entirely another to produce code that is fully documented and maintainable, ready to be passed on to the world.

    From my decades of programming experience, the latter takes about ten times the effort of the former. People budgeting for grants will have to up their software costs, labor time, and scheduling delays. It would be wise for a research facility to have a software validation department to handle all this new code promulgation.

    I’m not saying we shouldn’t do this, but good things are never free and everything has a downside. For example, it will be mandatory for science education to include as many software courses as math courses.

    Also, supercomputer code would fill a phone book, and is so particular to one parallel machine (and one particular month, most likely) that it can never be replicated. How are we supposed to vet all those huge climate models? When the models someday are truly enormous enough to reliably forecast weather and climate, they will be beyond any human ability to even inspect.

    Finally, what code language are we talking about here?
    Fortran? (whose version?)
    Matlab?
    Lab View?
    Every one of these has a costly learning curve for users and software specialists alike, but there’s no way every scientist in the world is going to use the same language. Can we make them only use approved languages from a short list? (whose list?)

    I wouldn’t bother writing to Science, but posting here may get these ideas to the right people.

  22. I think it’s 50 years too late to insist on a common programming language for science. Many people will just not touch some languages: personally, I would do C# but not C++. Publishing code is a good idea. No-one wants to look a fool for writing code with obvious errors, so expect much better checking in future if this happens.

  23. This is a damn good start.

    Of course when I run a computer program using the same data, I expect to get the same results each time. But if we have the source code, we can verify the methods being used and how the results were calculated.

    Two quick points:

    1 – Releasing the code and the data still doesn’t necessarily mean the results are correct, either.

    2 – Computer model runs are not experiments. The output of a computer model may be an hypothesis that can be validated against real-world observations.
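    The determinism point can be made concrete: any stochastic step (a Monte Carlo draw, a bootstrap resample) needs its seed published alongside the code, or two runs of the “same” program will disagree. A minimal sketch in Python; the function and seed value are invented purely for illustration:

```python
import random

def monte_carlo_mean(n_samples, seed):
    """Estimate the mean of a uniform(0, 1) draw.

    Publishing the seed alongside the code makes the run
    bit-for-bit repeatable by any reviewer.
    """
    rng = random.Random(seed)  # isolated, seeded generator
    samples = [rng.random() for _ in range(n_samples)]
    return sum(samples) / n_samples

# Two runs with the same seed reproduce exactly;
# an unseeded run generally will not.
run_a = monte_carlo_mean(1000, seed=42)
run_b = monte_carlo_mean(1000, seed=42)
assert run_a == run_b
```

    With the seed in the open, a reviewer can reproduce the run exactly; without it, at best statistically.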

  24. Well, it’s about time…

    For everything I do, every single character in my software gets scrutinized….no exaggeration.
    Furthermore, I have to even do tolerance analysis, StN analysis etc…..

    This has been a bee in my bonnet for years… I bet 1/2 of the code they use is uncontrolled crap.

  25. Anthony Watts says:
    April 17, 2012 at 12:36 pm
    Leif, no laughing. I myself programmed FORTRAN on punch cards and paper tape in ASR33 teletype terminals…talk about dead.
    I still do it…This is from my program to process data from the ACE satellite:
    IDENTIFICATION DIVISION.
    PROGRAM-ID. GETACE.

    AUTHOR. LEIF SVALGAARD.
    DATE-WRITTEN. 08/05/18.
    *REVISED: 12/01/03.

    ENVIRONMENT DIVISION.

    CONFIGURATION SECTION.
    SOURCE-COMPUTER. PORTABLE.
    OBJECT-COMPUTER. PORTABLE.

    DATA DIVISION.

    WORKING-STORAGE SECTION.

  26. “But they only want it so that they can pick holes in it and find errors!!”

    Well – yes, that is the idea; it’s how science works. Oh, I didn’t realize you are a climatologist – I do apologize, I am more used to dealing with the hard sciences like sociology. /sarc

  27. By the way… openness is one thing…. revision control is quite another…I have never met a graduate student who was concerned about revision tracking…. and guess who is writing all that crap code…. it isn’t Price Waterhouse.

  28. You’re right Anthony, I will.
    Why did I bother? It just amazes me how a warped mind can turn this sensible, honest and perfectly logical (although difficult, as smart guys above state) proposal around and use it to try to smear someone…but in a very weirdo way…it’s like asking Realclimate if they require their own posters to declare that they do not take money from big oil…it doesn’t make ANY sense at all! But you’re right, zen zen…

  29. What a great concept transparency is!

    I would be curious as to how many custom formulas in Excel it takes to create a climate model?

    Perhaps a Googol of them? :)

  30. Anthony Watts says:

    Just ignore Louise, she’s just a well meaning Earth momma out to get me to save the Earth…she obviously has issues.

    She is asking a legitimate question. If this site is so big on code being made publicly available, why not use your influence as a friend and colleague to have Spencer and Christy release the code to do the UAH satellite temperature analysis? There are a lot of people on this site who complain to high heaven that Mann et al. haven’t released every last bit of code (although they have now released pretty close to that) but seem to give Spencer and Christy a free pass. Why is that?

    REPLY: As far as I know, they’ve made it available for inspection, and it was doing so that enabled others to spot the orbit decay issue which introduced a bias, long since corrected, though people like yourself who bash Christy and Spencer never let anyone forget about it. BTW where’s the code on that paper you wrote a couple years ago Joel? – Anthony

  31. A word of warning here.
    Most coders regard even their own work as amateurish, because they recognise that with infinite resources it could be done a lot better. A lot tighter.
    Most code can be written amateurishly or perfectly and come up with the same results.
    Code can also be written so it is not consistent, uses different data sources without warning, or uses inconsistent formatting (for example), and this is shoddy.
    It is not amateurishness that is dangerous, but shoddiness.

    In my experience the shoddiness comes from the fellow specifying the task, rather than the executor, the developer (although, in reality, they may be the same person)

  32. As a programmer I have to totally agree. Without access to the code any results are worthless. Less than worthless. In my entire career (30 years now) I have never found a program that behaved as its documentation claimed.

    Furthermore if scientists are worried about unprofessional code then write professional code. Have someone else look at it. Code reviews are a key element of good software creation.

    Just like reviews are a part of good science. Or should be.

  33. Louise,

    You have my encouragement to request whatever code you feel is pertinent from Dr. Spencer. Should he refuse, you have the option of pursuing a request under state or federal freedom of information laws. You also have the option of publicly announcing any refusal on this and other blogs.

    If scientific journals take to heart the recommendations of Morin, et al., then you also will have the option of requesting or obtaining the code from these journals. I share in your joy that the scientific establishment now recognizes that the need for access to obscure computer code extends to all who pursue excellence in science, not just those who have established work relationships with select researchers.

    Please let us know how your project proceeds.

    REPLY: Agreed. I think she’s in for a shock, the real fun will be to watch what she does with it once she has it. – Anthony

  34. Dr. Svalgaard,

    My programs are completely portable and will run on ANY computer with a COBOL compiler.

    Shouldn’t that read BOTH computers, not ANY computer?

    ;-)

    Cheers,
    Earle

  35. EternalOptimist says:
    April 17, 2012 at 12:57 pm
    Most coders regard even their own work as amateurish, because they recognise that with infinite resources it could be done a lot better. A lot tighter.
    My experience with this [going back to the 1960s and continuing today] is that it is easier to write good, self-documenting, correct code than the amateurish hash you may refer to. The way to do this can be taught.
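    What “self-documenting” means in practice is easy to sketch. A hypothetical before-and-after in Python (the Magnus-formula function below is invented for illustration, not taken from anyone’s research code):

```python
import math

# The "amateurish hash": it works, but a reviewer cannot tell
# what it computes or where the constants come from.
def f(t):
    return 6.112 * math.exp(17.62 * t / (243.12 + t))

# The same computation, written to be read:
def saturation_vapor_pressure_hpa(temp_celsius):
    """Saturation vapor pressure over water, in hPa.

    Magnus formula with the commonly cited coefficients
    (6.112, 17.62, 243.12), stated here so a reviewer can
    check them against the literature instead of guessing.
    """
    return 6.112 * math.exp(17.62 * temp_celsius / (243.12 + temp_celsius))

# Identical results; only the readability differs.
assert f(20.0) == saturation_vapor_pressure_hpa(20.0)
```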

  36. “they feel is amateurish due to computer programming not being their profession and that some code may have commercial value, but suggest that such reasons should no longer be considered sufficient for withholding such code.” As someone with a fair amount of numerical methods (modeling) training plus statistics, I say the code must be available for review, as it is too easy to make mistakes in these areas; programming errors, of course, are notoriously frequent. I wonder how they test these programs, including regression testing after making any and all changes.
    Dave W

  37. Leif, I think you will find that I was not talking about what is easiest, and I was not talking about what can be taught.
    I was talking about how it is, right now, amongst the people who work in the field.

  38. Coincidentally this week’s Economist has a leading article calling for open access to all research funded by government and charities. Apparently the UK government is going to mandate this.
    Another straw in the wind….

  39. Earle Williams says:
    April 17, 2012 at 1:15 pm
    “My programs are completely portable and will run on ANY computer with a COBOL compiler.”
    Shouldn’t that read BOTH computers, not ANY computer?

    Joking apart, I meant ANY [modern or ancient] computer.
    AGU’s EOS has a piece about using the R language, which is appropriate for the topic:

    http://www.leif.org/EOS/2012EO160003-R.pdf

  40. WHY WOULDN’T THEY USE STANDARD, UNIVERSALLY RECOGNIZED CODE COMMENTING METHODS?

    Well, if they are going to produce the source code and data, then there should be some standards set on those submissions, much like the process and protocol you follow when you submit the paper (citing, bibliography, etc). It would not be an issue to specify that the code be annotated in certain ways to identify algorithms. If you use external libraries (regardless of the language), you cite them. (Heck, we used to put Easter eggs in the code and put all kinds of funny stuff in the comments for future programmers to see; you can write a friggin’ book based on just the comments in code. And with all the code and comment management tools out there….FREE….this makes no sense at all…they can just load it all up into CVS, SourceForge, Subversion…come on, there are tons of these. Plus there are tons of tools that walk code and pull out the comments, as well as map the code, etc, etc.)

    Yes, some of these folks are really just almost hobbyists on some of the code; but, since the computer, source code, and data are now part of the professional tool kit, they better darn well follow some semblance of standard programming format, much of which has been around for close to 60 years now.

  41. There is no excuse today for producing undocumented, bug-ridden code that has no error traps or a set-up guide, unless it is for personal use. I had this *principle* drummed in to me at uni – not about code but as an engineer producing unlabeled drawings and maths without annotation. The reason given was: if you work for an employer they pay for your time, so if you are hit by a bus someone else MUST be able to pick it up.

    After having spent 20 years since then in IT consultancy, I can say my tutors at uni were correct. Any employee of mine who does not do this gets a dressing down (it is rarely needed) and any contractor gets the heave-ho.

    Welcome O great scientific minds to the 21st century.

    @ Interstellar Bill, Anthony and Leif,

    The language is not at issue because, like anything else, programs are always machine/OS/version specific if they are to run error free. One answer is to archive a virtual machine (our company uses data centres, but cloud or a journal repository would be better in this case) with the full code and OS, and version-controlled and documented source code. There is now a slew of commercially available software and hardware to enable exactly this sort of archiving.

  42. I agree with Leif’s concerns about software dependencies, but the main point is to make sure that the custom code associated with the paper is available. The dependencies required to get it to work, while potentially annoying and time-consuming, do not stop someone from evaluating the code required for the paper to see if there are any issues with it.

    As for a standard language, I agree with the poster above that it is too late for that. Indeed, one of the reasons we have multiple programming languages is because they are for different purposes. FORTRAN, for example, was the language of choice for mathematical manipulation, but other languages were better for other purposes (e.g. text manipulation). Someone expressed a preference for C#, but one disadvantage is that it’s not available on other platforms such as Linux. The Mono project is overcoming that limitation, but can it guarantee 100% compatibility? As an example, while OpenOffice is largely compatible with MS-Office, there are some areas where the two have subtle differences. I suspect there will be similar cases with Mono that may or may not affect the results for a program written in C#.

    As a computer programmer, I believe that seeing the code has benefits, even if the environment to run it can’t be easily reproduced. As an example, I’ve personally found bugs in open source code by simply reading the code – bugs that I then reproduced by setting up the conditions that the source code led me to believe would cause a problem. I wouldn’t have found those bugs if I hadn’t been able to read the code.
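    A toy illustration of the kind of bug that reading alone can catch (the functions are invented for this discussion, not drawn from any published code): an off-by-one loop bound that silently drops the last observation.

```python
def mean_buggy(values):
    """Mean of a series, with a bug a reader can spot."""
    total = 0.0
    for i in range(len(values) - 1):  # <-- stops one short: the
        total += values[i]            #     last value never counts
    return total / len(values)

def mean_fixed(values):
    """The corrected version: every value enters the sum."""
    total = 0.0
    for i in range(len(values)):
        total += values[i]
    return total / len(values)

data = [1.0, 2.0, 3.0, 4.0]
# The buggy version reports 1.5 instead of 2.5, an error a
# careful reader can predict from the source before running it.
assert mean_buggy(data) == 1.5
assert mean_fixed(data) == 2.5
```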

  43. Leif Svalgaard: Some research does not require specific custom-written code, but can be adequately done with interactive tools. A trivial example being Excel. So, there is no code to publish.

    Does not Excel create a log showing the code that is executed during the interactive session?

  44. so if you think your code is that bad – then you should do the following

    you *do* write top down code don’t you; you are not that mad that you just write it out without planning ?

    So – release your FLOW CHART and/or your pseudo code

    and for heaven’s sake COMMENT the thing; if it doesn’t have comments explaining what it does it is GIGO code.

    And – surely if you write code as part of your research project (publicly funded) then ALL of that code belongs to the taxpayer; NOT to you.

    Of course if you work for a private institute then the code belongs to your employer.

    Either way it is NOT your code. Unless you are a gentleman scientist ?

    Note that by releasing your flow chart and pseudo code your ‘code’ is portable to an extent. And if one of your assumptions is wrong; why; the little box in the flow chart where you employ that assumption is available for nit picking.
    Just think; if we had the flow charts for the GCM we would be able to see the so few variables they use; and all the Crook’s Constants and Fiddle’s Factors that have to be applied to make them work.

  45. Leif Svalgaard: My experience with this [going back to the 1960s and continuing today] is that it is easier to write good, self-documenting, correct code than the amateurish hash you may refer to. The way to do this can be taught.

    I agree on both counts: (1) it can be taught and learned, and (2) over the course of a project, it is easier to write good code than amateurish junk; it’s just that on some days you feel like you only have to solve a simple problem for that day, and you feel justified in cutting corners, and you need to learn early on to fight that temptation. My experience goes back to the 60s and card punches, and I am not good enough to call myself proficient. But it is always best to write programs from the start with the knowledge that you may have to explain how they do what they do two years from now, when you have forgotten it.

  46. Bottom line on the coding thing from me: who cares if you write crappy code, or even what computer language you write it in? Commenting can follow universal standards (Flower Box for one); it’s not the computer language, it’s specifically describing what it’s doing. (Comment tools can even take all your comments and put them into a nice little book for you.) E.g., this sets Java up to work with javadoc: http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html

    And…shall we count the comment methods and styles?: http://en.wikipedia.org/wiki/Comment_(computer_programming)
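    The “pull the comments out into a book” idea has a counterpart in nearly every modern language. In Python, for instance, docstrings written in the source are retrievable programmatically, which is what documentation generators build on. A minimal sketch, with a hypothetical function:

```python
import inspect

def detrend(series, slope):
    """Remove a linear trend of the given slope from a series.

    The docstring lives with the code, so documentation tools
    can extract it automatically, the same idea as javadoc.
    """
    return [y - slope * i for i, y in enumerate(series)]

# Documentation generators read the same attribute:
doc = inspect.getdoc(detrend)
print(doc.splitlines()[0])
# Prints: Remove a linear trend of the given slope from a series.
```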

  47. “The team acknowledges that many researchers are clearly reticent to reveal code that they feel is amateurish due to computer programming not being their profession and that some code may have commercial value”

    Well, then, maybe they should spend some cash on developers instead of conferences?

    Then maybe they’ll hear about things like automated testing and acceptance tests and…

  48. I’ll happily give them a pass on amateur code as long as the statistics it implements are not.

    That’s really all we care about: what exactly did you do to the numbers prior to the pretty plots?

    It’s so simple. Just show all your work.

    Doing anything else is like a proof or derivation with a few missing lines.
    Lunacy… and it’s about bloody time.

  49. “As a computer programmer, I believe that seeing the code has benefits, even if the environment to run it can’t be easily reproduced.”

    Are there still languages that do not have things like Java’s “maven” and ruby’s “gems”?

    It’s been six or seven years since I’ve had to worry about having all the right libraries and tools installed; the tools manage that for me.

  50. Louise~ Since the WUWT team appears to have grown since Anthony created the site, I was thinking they could use someone who is good at going over code. You sound very interested in seeing Dr. Spencer’s, perhaps you could petition to join the team and help them out?

  51. peter_dtm says:
    April 17, 2012 at 1:48 pm
    so if you think your code is that bad – then you should do the following

    you *do* write top down code don’t you; you are not that mad that you just write it out without planning ?

    So – release your FLOW CHART and/or your pseudo code

    and for heaven’s sake COMMENT the thing; if it doesn’t have comments explaining what it does it is GIGO code.

    And – surely if you write code as part of your research project (publicly funded) then ALL of that code belongs to the taxpayer; NOT to you.

    Of course if you work for a private institute then the code belongs to your employer.

    Either way it is NOT your code. Unless you are a gentleman scientist ?

    Note that by releasing your flow chart and pseudo code your ‘code’ is portable to an extent. And if one of your assumptions is wrong; why; the little box in the flow chart where you employ that assumption is available for nit picking.
    Just think; if we had the flow charts for the GCM we would be able to see the so few variables they use; and all the Crook’s Constants and Fiddle’s Factors that have to be applied to make them work.

    Flow chart? Pseudo code? Top down code? In which century did you learn to program? 8>)

    I learned programming in the 80’s (OK, that’s a decade, not a century, but it was a decade in the last century) and that’s what we used. These were replaced with UML diagrams, class diagrams, object-oriented design, use cases and user stories years ago.

  52. As programming is not their main task, they are afraid of releasing the code. There cannot be a better reason for releasing it. As scientific studies now rely on programs, the code must be made available to allow for the identification of coding and logic errors.

  53. Opening the code is an excellent method of having it improved—free. It shouldn’t be scary.

    Look at the Free Software Foundation (at gnu.org) and Linux, and all the Linux distributions available. None of the authors are scared of others seeing their code.

    The Internet runs on open sourced computer applications and operating systems.
    Without it, we wouldn’t have an Internet.

    How many scientists use R, the free, open-source competitor to Matlab (which is neither free nor open)?

    As one open source advocate famously put it: “given enough eyeballs, all bugs are shallow.”

  54. Anthony Watts says:

    REPLY: As far as I know, they’ve made it available for inspection, and it was the doing so that enabled others to spot the orbit decay issue which introduced a bias, long since corrected, though few people like yourself ever let others bashing Christy and Spencer forget about that.

    Really… It is available? Where? I can give you links to Michael Mann’s code or the code to do GISS Temp or the code for the GISS climate model, but nobody ever seems to be able to tell me where I can go to get Spencer and Christy’s code, just vague unsubstantiated claims that they think it is available. Actually, as I recall hearing, it took the RSS group some effort just to get Spencer and Christy to release the relevant section of their code to them.

    BTW where’s the code on that paper you wrote a couple years ago Joel? – Anthony

    Our paper commenting on Gerlich & Tscheuschner? I don’t think there was anything in there that requires computer code to calculate. We purposely made our examples simple enough to work out easily with pencil and paper.

    I will also note that I am not the one who is loudly proclaiming that I believe scientists must always release their computer code. I only note the issue with Spencer and Christy to point out the inconsistency of those who do.

  55. Matthew R Marler says:
    April 17, 2012 at 1:42 pm
    Does not Excel create a log showing the code that is executed during the interactive session?
    slogging through that is no fun and not very illuminating…

  56. Leave Louise alone.

    Anyone who likes wine, supports nuclear power and reportedly likes to relax in high heeled boots, scant under garments, with a whip to hand is someone to be held in high regard in my book.

    Provided they are from the female side of the population. (Sorry Anthony.)

  57. It is worth repeating again and again: Steve McIntyre (together with a handful of others) deserves major credit for triggering this essential improvement in the peer review publication process.

  58. so the fate of the world is all based on computer programs…
    …that are based on amateurish code, so bad it would be embarrassing

    nice

  59. FORTRAN is not dead… and it’s certainly a very quick language for throwing around big matrices and applying complex formulas. Not so pretty on the user interface, though!
    We use Intel Fortran 95 for our Engineering stuff, mainly because there is so much useful old code that you can link into without having to re-write it.

  60. Should this happen I expect the publication rate of many individuals in ‘climate modeling’ to collapse.

  61. Many researchers are *reluctant* to reveal their source code, but *reticent* they are not.

  62. Anthony Watts says:
    April 17, 2012 at 12:36 pm

    Leif, no laughing. I myself programmed FORTRAN on punch cards and paper tape in ASR33 teletype terminals…talk about dead.

    I wrote my first computer program in 1963. As a broad generalization, for me, there’s two kinds of programmers. Those that used punch cards, and those that didn’t.

    Those that used punch cards are generally skeptical of all computer results. Those that didn’t, often not so much …

    I publish my own data and code, but it’s not only not user-friendly, it would best be described as “user-aggressive” … I write in R, and the joy and the bane of R is that you can run all or part of a program. So to produce a graph, I may run some lines from one part of the program to get the data, and some lines from another part to graph it … bad Willis, no cookies.

    w.
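    Willis’s habit of running scattered lines interactively is exactly the reproducibility trap: nobody else can replay that session. One cure is to wrap each step in a function so a single top-to-bottom run rebuilds the result. A hypothetical sketch of the idea (in Python rather than R, and not Willis’s actual code):

```python
# Structure an analysis so that one run reproduces the output, instead of
# hand-executing scattered lines in an interactive session. The data and
# the "analysis" here are placeholders.

def load_data():
    # placeholder for reading the raw observations
    return [1.0, 2.0, 3.0, 4.0]

def analyze(data):
    # placeholder analysis step: a simple mean
    return sum(data) / len(data)

def report(result):
    # placeholder for the plotting/printing step
    return f"mean = {result:.2f}"

def main():
    # the whole pipeline, runnable end-to-end by any reviewer
    return report(analyze(load_data()))

if __name__ == "__main__":
    print(main())
```

    The point is not elegance; it is that the published file, run once, produces the published figure.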

  63. Excellent idea. Regarding the problem with amateurish, undocumented code – why, we will just stipulate that every program be accompanied by its own Harry.readme file /sarc

  64. Lord, lord, when I was doing scientific programming about 45 years ago, I wrote in Fortran. But it was an extensively tweaked Fortran, which let me mix assembly language in with the Fortran. It probably only ran on that one particular Control Data 3100. “Portable code” is a much larger problem than it looks to be. Flowcharts and pseudocode are the easiest way out.

  65. Graeme W says:
    April 17, 2012 at 1:41 pm

    I agree with Leif’s concerns about software dependencies, but the main point is to make sure that the custom code associated with the paper is available. The dependencies required to get it to work, while potentially annoying and time-consuming, do not stop someone from evaluating the code required for the paper to see if there are any issues with it.

    #############################################################

    This is surely the nub of the thread. The reason why source code and data should be an automatic provision is so that somebody who wishes to can follow the process and attempt to spot any errors which would disprove whatever was claimed. It doesn’t matter if it is amateurish, long-winded or disorganised – it either works or it doesn’t.
    If somebody claimed to be able to prove that a cold fusion process worked and provided a proof in a new, private mathematical symbology, who would believe them? It is the same with these climatologists who claim things but won’t show their workings – who believes them? Not me.

  66. Really excellent news, except for those scientists who care more about avoiding their own errors coming to light than the truth and scientific progress.

    I don’t see that what language the code is written in is likely to be a major problem in the vast majority of cases, nor that it would be reasonable to seek to restrict what language was used, providing it and any code libraries used were reasonably accessible.

    Likewise, while good programming practice should certainly be encouraged (as should employment of a widely used language), I’m very doubtful that there should be any prescriptive requirements – IMO that would be unduly onerous on scientists, who cannot and should not be expected to be professional-standard programmers. The important objectives are surely to ensure replicability (including testing the effects of various changes) and to enable discovery of errors. Discovery of errors should be the key aim, at least in most cases, and publication of code can be expected to make scientists take more care to avoid and to correct program errors.

    The main thing from the point of replicability and finding any errors is surely that others should both be able to follow exactly what the code does and to run it, both as is and with changes to test the robustness of the results of the study employing the code. IMO (speaking as a pretty poor programmer), the key requirement for doing so is frequent, detailed comments in the code setting out what operation is being performed, along with extensive explanatory comments at the start of each code module setting out what it does, including a description of the variables and functions involved. An overall description of what the program does, with any relevant flowcharts, arguably belongs in the published paper, or in textual supplementary information to it, rather than within the code itself.
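    The commenting regime described above can be illustrated with a toy example (entirely hypothetical, in Python): a module-level description of the inputs, outputs and variables, plus step-by-step comments inside the routine.

```python
"""Toy illustration of the commenting style described above (hypothetical).

trend(years, temps) -- ordinary least-squares slope of temperature
anomalies against time.

Variables:
    years -- list of observation years
    temps -- list of anomalies (deg C), same length as years
Returns:
    slope in deg C per year
"""

def trend(years, temps):
    n = len(years)
    # Step 1: means of both variables
    mean_y = sum(years) / n
    mean_t = sum(temps) / n
    # Step 2: OLS slope = cov(years, temps) / var(years)
    cov = sum((y - mean_y) * (t - mean_t) for y, t in zip(years, temps))
    var = sum((y - mean_y) ** 2 for y in years)
    return cov / var
```

    Nothing here is clever, which is the point: a reader who has never seen the language can still follow what was done.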

  67. Let me get this straight:

    Louise and joeldshore have elevated this site to the equivalence of a major scientific journal.

    Such high praise. Congrats Anthony.

  68. I applaud the policy paper which has been published in the journal Science. It is heartening to see the scientific community make a serious effort at self-correcting. I am sure there are some stubborn pragmatic obstacles to overcome in implementing the policy, but there are no credible or serious show-stoppers to achieving what the policy paper is promoting.

    We have seen the disturbing shortcomings of some high-profile climate science papers in the areas of QC and openness of code, methodology and data. The policy advocated in the journal Science should also be taken up by bodies issuing grant application specifications: grant specifications should include similar requirements on code, methods and data, so that there can be complete and timely independent verification by other scientists.

    Personal Note: I remember programming FORTRAN on punch cards at university in 1969 for my required computer science class for engineering majors. The computer was a CDC 6400 . . . . ahh the incredible fun we all had in the computer terminal building until the weeee hours of the morning. : )

    John

  69. Shock News: Scientific journal suggests that scientists use the Scientific Method. Film at 11.

  70. “and that some code may have commercial value”
    What a joke! Just look at the wealth of extraordinary open source software in almost every area that is now available: Blender, GIMP, Open Office, Linux, Audacity, etc, etc

  71. Anthony
    Just after I’d enjoyably taken algebra in 1960 I was given access to a little IBM 1620. FORTRAN was the most wonderful magic, and I loved filling in the code sheet with algebra-looking formulae, then seeing my lines turn into punch cards, the smell of which I liked as much as that of book-bindings.
    My very first program (on its twentieth try) transformed geocentric stellar coordinates into galactic xyz and displayed star maps from locations within 100 light years. When they started discovering exoplanets I went back to find I had listed their host stars in my little catalog of nearby sunlike stars, four decades earlier.

    I still haven’t heard anyone comment about supercomputer climate models and what you do with a code listing. Instead of comments, the various lines of code would need links to briefings of the numerous meetings that made the decisions embodied in the code. But that sounds to me like something so useful that of course no one would do it.

    You can bet that climate-model documentation, if any, is scattered, disorganized, incomplete, and intricately sequestered from public view. If the Team is ever forced to comply they will dump mountains of digital trash requiring hundreds of volunteers to spend years combing through the wretched mess. Then they’ll trumpet their transparency and openness.

  72. It seems to me that the concern about “amateur” code gives all the more reason to insist on releasing the code – so that more experienced coders can see if the code has “amateur” mistakes!

    Even as a professional programmer, I go through a code review process on a regular basis. It’s good practice to get other eyes on your code to see things you’re often just too close to see clearly.

  73. cw00p says:
    April 17, 2012 at 1:54 pm

    Thank you.

    Leif Svalgaard: slogging through that is no fun and not very illuminating…

    It makes publication possible, which addresses your point about Excel.

  74. pbittle says
    That’s great news! I had always wondered why this wasn’t a requirement all along. Of course, the corollary to this is that the source data must be made available as well. Had this been the practice there would have been no “hidden decline”.
    —————–
    Just because it is not a journal requirement does not mean the code is not available. Sometimes the code is available on the Internet, sometimes it’s available on request, sometimes it’s been lost, sometimes it is deliberately kept back for whatever reason.

    I tend to regard this particular debating point as bogus. There is a lot of code available and it can be inspected. When code or data does become available, when previously it was not, I see climate skeptic land lose interest. The general impression is that the lack of code or data is just a debating point and you are not sincere in your complaints.

    REPLY: Riiiiight, Mr. Sincerity himself speaks. When we requested Hansen’s code, nobody could run it due to lack of hardware and ancient compilers, because Hansen’s stuff is spaghetti code, which was a learning experience all by itself. After weeks of effort and hair pulling, Steve Mosher finally succeeded. You wouldn’t be able to work on those Macs you program edu-apps for, I bet. But go ahead and diss the idea; it is quite suited to your demeanor.

    Bottom line though is that I support code release across the board, whether you like it or not. – Anthony

  75. Let’s see how much substance there is in all this, instead of hot air.

    How many of you have actually downloaded climate-related code and checked it for bugs? Just a simple code inspection would suffice.

    How many of you have actually found bugs in such code and reported the difference it makes to the outcome? My general impression is that only one or two instances of this have actually occurred.

    The same applies to data. I have been led to believe that the Climategate raw data has been released. If true, an analysis of it would prove or disprove the repeated claims of fraudulent data manipulation in that data set. But I have seen nothing in climate skeptic land about this.

  76. Willis Eschenbach says:
    April 17, 2012 at 3:15 pm

    I wrote my first computer program in 1963. As a broad generalization, for me, there’s two kinds of programmers. Those that used punch cards, and those that didn’t.

    Remember the stigmata of the 60’s programmer–the rubber band (from and/or for the card deck) around the wrist?! A neat Trivia question, maybe.

  77. Regarding the problem of library dependencies and such, maybe these are some possible workarounds:
    1. The journals could stipulate that the code should compile and run on a reference platform. This could be for example FreeBSD or some stable and resource-rich Linux flavor like Debian. These two have a wealth of pre-packaged programs and libraries available right out of the box, especially in the realm of scientific computing, and are very well integrated and stable.
    The journals could then administer such reference platforms and make user accounts available to authors and readers alike; the authors would set up the software, and the readers could run their tests.
    2. In case the authors are unable to release code that conforms with the reference platform, they could be requested to make user accounts available for reviewers and readers on their own machines. The users would be able to run their own inputs, and at least the reviewers would also have to be given the credentials to verify that they are indeed running the released code; for example, they would have to be allowed to compile the code, take an MD5 digest of the result and then compare it to the active executable.
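    The digest comparison in point 2 is straightforward to script. A hedged sketch (file names hypothetical) using Python's standard hashlib:

```python
# Sketch of the verification step described above: hash a freshly
# compiled artifact and compare it with the executable actually in use.
# Paths are placeholders; any algorithm hashlib supports would work.
import hashlib

def file_digest(path, algo="md5"):
    """Return the hex digest of a file, read in chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def same_build(built_path, deployed_path):
    """True if the two files are byte-identical, judged by their digests."""
    return file_digest(built_path) == file_digest(deployed_path)
```

    A reviewer would compile the released source, digest the result, and call `same_build` against the executable the authors are actually running.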

  78. joeldshore says:
    April 17, 2012 at 2:33 pm

    … I can give you links to Michael Mann’s code …

    COOL! Where’s the link to his code for a) his Hockeystick paper, b) his 1999 paper, c) his 2003 paper with Jones (see below)? (After publication of his 2008 paper, and after complaints from Steve McIntyre among others, Mann archived his 2008 code.)

    Regarding the code for his 2003 paper, Mann said in the Climategate emails:

    I’ve attached a cleaned-up and commented version of the matlab code that I wrote for doing the Mann and Jones (2003) composites. I did this knowing that Phil and I are likely to have to respond to more crap criticisms from the idiots in the near future, so best to clean up the code and provide to some of my close colleagues in case they want to test it, etc. Please feel free to use this code for your own internal purposes, but don’t pass it along where it may get into the hands of the wrong people.

    Joel, you talk as though Mann has voluntarily made his code public, rather than claiming it was his personal property, and hiding it every chance he has gotten, as above. This is the guy who famously was quoted in the WSJ article:

    Mr. McIntyre thinks there are more errors but says his audit is limited because he still doesn’t know the exact computer code Dr. Mann used to generate the graph. Dr. Mann refuses to release it. “Giving them the algorithm would be giving in to the intimidation tactics that these people are engaged in,” he says.

    I just want you to remember exactly what kind of a serial liar and expert in scientific malfeasance you are holding up as an example of available code, and to remind you that when you lie down with dogs, you get up with fleas.

    For you to hold Mann’s code out as an example of transparency is a sick joke. I had expected better from you, Joel, much better. In fact you go on to say:

    I will also note that I am not the one who is loudly proclaiming that I believe scientists must always release their computer code.

    So … are you saying you are against the thrust of the Science article that says the opposite of that? You don’t believe that as a rule scientists should release their code when they publish the results of that code?

    w.

  79. I always like the box in the flow chart that reads: “And then a miracle happened.”

    Of course, with the AGW crowd it would read: “And then a catastrophe happened.”

  80. John W. says:
    April 17, 2012 at 11:07 am
    Len, FYI: As a general rule, any code developed by a DoD contractor with any contract money is the property of the US government.
    ———————-
    In that case it seems a small step to require it of all government spending, including grants and everything else.

  81. Willis Eschenbach says:

    So … are you saying you are against the thrust of the Science article that says the opposite of that? You don’t believe that as a rule scientists should release their code when they publish the results of that code?

    I would not say that I am against it, but I think there are some serious issues that will need to be addressed, including:

    (1) What if the code is proprietary to your company? When I worked for Kodak, I published papers based on code that I and others at Kodak had written but that Kodak would never allow us to release; in fact, releasing it could be grounds for severe disciplinary action up to and including dismissal. Will this provide further discouragement for scientists in industrial environments to publish their work?

    (2) What if you use code that is proprietary to another company? There are lots of papers out there using, say, TracePro raytracing software or proprietary quantum chemistry software where the scientists writing the paper don’t have access to the source code themselves.

    (3) Will the requirement that scientists have to release code that has taken a significant amount of work to write mean that some scientists will shy away from writing papers in a timely manner, preferring instead to get all the use that they can out of their code before revealing it to the competing scientists and, if this occurs, how large a negative impact might it have? There are, after all, good reasons for allowing people to have certain intellectual property rights even given the need for openness and transparency in science.

    I am not saying that these issues can’t be overcome, but there are some real issues that need to be thought about. It is not as cut-and-dried as people seem to think.

  82. Leif Svalgaard says:
    April 17, 2012 at 11:53 am
    While laudable [and with my full support] there are problems. Some research does not require specific custom-written code, but can be adequately done with interactive tools. A trivial example being Excel. So, there is no code to publish. Publishing the spreadsheet might solve the problem for Excel, but not for more sophisticated [or proprietary] tools. Another problem is that the code might actually be quite small, but use extensive libraries and procedures developed by others [and shared by a larger community]. Some of that may be difficult to publish and near impossible for outsiders to use. In my own research I have often tried to use other people’s code, but mostly given up and written my own because of the steep learning curve involved in using other people’s stuff.
    ———————————————–
    If a paper to be published requires peer review, how can you say it was peer reviewed if the data and code were not checked? Do you just trust your pal’s numbers and graphs?
    I used to believe that if a paper was peer reviewed it was pretty darn sure to be the truth, and that the reviewer agreed with the paper (with comments). Otherwise why even bother having a peer review it?
    If you’re not looking at the paper’s accuracy, then why lie about the “peer review”?
    It sounds like maybe climate science is loaded with pal review and not so much peer review.

  83. Willis Eschenbach says:

    COOL! Where’s the link to his code for a) his Hockeystick paper, b) his 1999 paper, c) his 2003 paper with Jones (see below)? (After publication of his 2008 paper, and after complaints from Steve McIntyre among others, Mann archived his 2008 code.)

    I am not sure why people are so fascinated by code over a decade old that has been superseded by more recent work. As you noted, he has archived his 2008 code.

    He also has archived his data for the earlier papers, see e.g., here: http://www.meteo.psu.edu/~mann/shared/research/old/mbh99.html

    http://www.meteo.psu.edu/~mann/shared/research/old/mbh98.html

    I had thought the code was there too now but I can’t seem to find it at the moment.

    So, are you going to return the favor and provide me with the link to the UAH code? I don’t need to see all the earlier versions. The latest version would be just fine.

  84. Would WUWT be willing to initiate a policy whereby it will only publish articles on science where the code/data et al is available at the time of posting?

    If there is strong feeling about this, that would seem like the righteous thing to do, both to act according to principle, and to encourage scientists to make their data available.

    This could not operate retrospectively, of course, and it does make for some interesting choices regarding semi-regular contributors (eg, Roy Spencer, who I don’t think has published the UAH code/algorithms?).

    I would be interested in seeing discussion here of such a policy, pros and cons, because I think it would be revealing as to true attitudes on ‘releasing the code’.

  85. barry,

    The difference is that my tax dollars do not go into paying for WUWT. Where my tax dollars subsidized the code, then that code should be available to the public that paid for it.

  86. Nobody expects scientists in general to be wonderful programmers. But if they use computers as tools for analysis, and write their own programs as part of that, then they must provide source code. It becomes part and parcel of the “experimental” process and essential for others to independently try to replicate the results.

    There are so many traps in programming which aren’t even obvious to professional programmers. One has to understand the limits of digital representation of data and its consequences in numerical analysis. This is definitely an issue when one is dealing with finite element analysis, which encompasses circulation and other climate models, and also numerical statistics. “The Computer” won’t take care of these things automagically. Theoretical analysis techniques must be programmed with due regard to the limits of data representation, which often means different techniques from the ones one would use when working through the data by hand.

    At the very least, one must be aware that “random noise” from the numerical analysis may swamp the actual results.
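    That numerical “noise” is easy to demonstrate. A small illustrative sketch (not taken from any climate model) in Python, contrasting naive accumulation with compensated summation:

```python
# Naive accumulation of floating-point values drifts; compensated (Kahan)
# summation carries the lost low-order bits along, and math.fsum computes
# a correctly rounded sum.
import math

def naive_sum(xs):
    total = 0.0
    for x in xs:
        total += x
    return total

def kahan_sum(xs):
    total, c = 0.0, 0.0          # c holds the running compensation
    for x in xs:
        y = x - c
        t = total + y
        c = (t - total) - y      # recover the low-order bits lost in t
        total = t
    return total

values = [0.1] * 10
naive = naive_sum(values)        # 0.9999999999999999: rounding accumulates
exact = math.fsum(values)        # 1.0: correctly rounded
```

    Ten additions already show the drift; in a simulation the same effect compounds over millions of operations, which is why the order and method of accumulation belong in the published code, not just in the prose.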

  87. joeldshore ,

    I think you must have missed this, because about ten people have already made the point in the comments – judging from the “points” you listed…

    If someone writes a science paper, the results are supposed to be replicable by anyone in that field, just as any patent filed in the patent office is supposed to be reproducible by anyone skilled in the field in which it is filed.

    The material advantage of publishing in pure science is rather slight. If you do not like what pure science is, then perhaps publishing – and likewise pure science and research – is not for you.

    The call for source code might not be clear, but it’s not a call for all source code per se.

    If someone merely uses a third-party application, there are probably cases where its source code is not relevant to the research. For instance, if someone writes a research paper on dinosaur eggs, I doubt the source code for Excel would help if he merely used it as a tool.

    So in essence, your points 1 and 2 are really irrelevant here. The source code being asked for is the code the author of the paper actually writes himself. Perhaps, in the rare case, an author working at a private firm writes proprietary software and tries to publish a paper based on it; but in that case he could not really write the paper at all, because a scientific paper has to describe the process so that anyone could duplicate it, and if the code is proprietary, describing how to write it is just as bad. That defeats the purpose; otherwise you aren’t writing a real scientific paper, you are just having people guess.

    And if people are guessing, it’s not real science. That is the entire point being raised here: if your code and method are not easily duplicated, then you need to produce all of it so that others can attempt to do so. Any code should be provided along with the data, because otherwise what are we being asked to accept? The scientist’s word? I am sorry, that does not cut it. In the heart of pure science we accept only blunt scepticism as our tool of trade.

    And your third point… Going back to what pure scientific research is, it’s about taking lots of time and effort for little or no gain for oneself.

    I tried to explain this at the start, but to put it plainly: if someone is not willing to share everything up to the current point in what they have, they have no business being in science anyway. If someone wants to make a profit, then start a business, because that is not science. In that case, you can act like Dr. Mann and hide all the code you want.

    But saying that there might be a concern about sharing code in future research just makes it more difficult for others to replicate the work, when it is hard enough already to get other people’s programs to run.

  88. Providing computer code, and where necessary, input data, seems unquestionably the least ambiguous way to finally document what we have done. Without the code we are saying that here is the graph (or numerical results, etc.), and here is the method (or equations, etc.) that we affirm were used to produce these results. Here is our manuscript.

    Such descriptions in words are often incomplete, imprecise, or ambiguous, and even mathematical formulas may be ambiguous and/or subject to typos. But I can’t see how computer code, provided “cut-and-paste” from our working programs as a supplement to our text, can be either incomplete or ambiguous. Computers WON’T tolerate either.

    If a methodology is unclear to a reader from the text and related offerings, program code, even in a language with which the reader may not be familiar (or capable of writing), is nonetheless often fairly easily interpreted to yield the missing elements, leading to a fuller, more exact understanding. Aiding replication of results is important, but first of all the code can aid understanding of published papers.

    Authors who are not proud of their code because it is not elegant have likely, correspondingly, written simpler and more straightforward coding, code which will be MORE useful to more readers to show what was done and that it was done correctly. Researchers following up are of course free to write better (faster, more compact – but more obscure?) code for their own daily use.

  89. Smokey,

    For quite a few commenters here the principle is that you can’t trust the science if you don’t have the code. Do you think it’s ok for WUWT to promote science that doesn’t include code (and other data)?

    Comments by Joel and Louise touch on the issue of principle. Why do skeptics berate Michael Mann but not Spencer and Christy for withholding code? (Both are ‘publicly funded’.) It appears that skeptics think the ‘warmists’ are trying to keep their code secret, and that revealing it will overturn their conclusions. But this could work both ways.

    My query is about principle. I want to discover if the general agreement here on releasing code is about improving the science generally, or if it is about a belief that releasing the code will weaken the ‘warmist’ science. I’m seeing a little of both right now, from different contributors. Anthony has declared code should be released ‘across the board’. Your reply, Smokey, makes me think that you are mainly interested in how such a policy might crush the ‘warmists’. (You’re free to correct me on that)

    So, to ask once more of people who seem to feel strongly about this – would you advocate for WUWT only to post articles on science where all the data is available?

    I think the answer would be “no”. So my next question would be the reasoning behind such a reply. And how should we as readers approach such articles? BAU?

    Is code release so important that advocates would like to see some principled action here? Or is it not meaningful enough to make some kind of stand on it?

    Money/mouth and all that.

  90. joeldshore
    April 17, 2012 at 7:12 pm

    makes some excellent points regarding intellectual property.

    I still think that making the release of code the norm is the right thing to do. People will then have to explicitly declare whether or not they reserve intellectual property and, as such, prevent others from verifying their software. It will then be up to the reviewers and readers to decide whether or not a paper that is shackled and restricted in this way still has enough substance and credibility to merit publication. If that leads to fewer papers, it’s not the end of the world; in fact, it happens quite commonly in medicinal chemistry.

  91. joeldshore says:
    April 17, 2012 at 7:12 pm

    Willis Eschenbach says:

    So … are you saying you are against the thrust of the Science article that says the opposite of that? You don’t believe that as a rule scientists should release their code when they publish the results of that code?

    I would not say that I am against it, but I think there are some serious issues that will need to be addressed, including:

    (1) What if the code is proprietary to your company? When I worked for Kodak, I published papers based on code that I and others at Kodak had written but that Kodak would never allow us to release; in fact, releasing it could be grounds for severe disciplinary action up to and including dismissal. Will this provide further discouragement for scientists in industrial environments to publish their work?

    Thanks as always for your reply, Joel. Certainly, if the code is proprietary then it cannot be published in a scientific journal … but then if the code used in a study is proprietary, the study can’t be replicated, and thus it shouldn’t be published in a scientific journal.

    (2) What if you use code that is proprietary to another company? There are lots of papers out there using, say, TracePro raytracing software or proprietary quantum chemistry software where the scientists writing the paper don’t have access to the source code themselves.

    That seems like a non-issue to me, for the same reason. If someone can’t explain what their procedure is doing for any reason, be it that they used proprietary quantum chemistry software or some other reason, then why are they publishing in a scientific journal?

    (3) Will the requirement that scientists have to release code that has taken a significant amount of work to write mean that some scientists will shy away from writing papers in a timely manner, preferring instead to get all the use that they can out of their code before revealing it to the competing scientists and, if this occurs, how large a negative impact might it have? There are, after all, good reasons for allowing people to have certain intellectual property rights even given the need for openness and transparency in science.

    At some point, the researchers are going to have to man up and decide if they are businessmen or scientists. If they are businessmen looking for a competitive advantage, fine, keep it all secret. If they are scientists, reveal it. I have no problem either way. But you want to have it both ways. You want them to have the imprimatur and the prestige of scientists, but not show their work because they work for Kodak … sorry, my friend. Make up your mind, one or the other.

    Finally, I don’t see how the ruling could have a “negative impact” compared to what’s happening now, because the problem is that people aren’t revealing their code now, but they are publishing. I’d much prefer that if they don’t want to reveal their code that they don’t publish, because that is commercialism masquerading as science … and them not publishing would be greatly preferable to them pretending to be scientists and publishing without backing up their claims in the normal scientific manner.

    I am not saying that these issues can’t be overcome but I am just saying that there are some real issues that need to be thought about. It is not as cut-and-dried as people seem to think.

    I don’t see the “real issues”. It seems extremely cut and dried to me. If you want to be a businessman, you get to keep all the secrets you want. But if you want to be a scientist, you have to show your work. Where’s the issue?

    Transparency is at the core of science, because science is built on replicability. My high school chemistry teacher, Mrs. Henniger, would laugh in your face for claiming otherwise, Joel, for saying that business concerns should allow scientists to not show their work. She would say that’s not science, that’s just business … and she’d be right.

    w.

    PS—I’m still waiting for the link to the code for Mann’s various papers, you said you had them … although at this point Mann is probably claiming that they are business secrets because he was working for Kodak or something …

  92. Chuck Nolan says:
    April 17, 2012 at 7:18 pm
    If a paper to be published requires peer review, how can you say it was peer reviewed if the data and code were not checked?
    This is often not easy, but a lot can be done by inspection of the method and the analysis and generally making sure the authors are honest. As an example of peer review I offer my own review of a solar prediction paper [the prediction eventually failed] done without access to data and code:

    http://www.leif.org/research/Dikpati%20Referee%20Report.pdf

  93. barry says:
    April 17, 2012 at 8:14 pm

    “Your reply, Smokey, makes me think that you are mainly interested in how such a policy might crush the ‘warmists’.”

    Barry – we don’t wish to “crush the warmists”. We just want them to go away, stop bothering us, and use their OWN money to conduct their “science”.

    And to be honest, if NASA GISS Model E is any indication of the code quality for climate models, I really DON’T want to see the source code…

  94. Let’s not forget that adjustments would also be within the code, thereby outing any fudge factors for what they are. Such adjustments thus would need explaining – either at the time or when replication is attempted. Someone would spot some dodgy adjustments. The big ‘adjusters’, if they think they are powerful enough, would lean on journals to not go along with this.

    Any bets on which journals those would be?

  95. To the person asking if Dr. Christy makes his code available: the last time I asked him, he said no, but they were working with NOAA to get it available to the public.

    This was 2 years ago and I’ve since lost interest in chasing after scientists trying to verify their work (can’t be done, IMHO). But if someone wanted to follow up they could shoot Dr. Christy an email.

    I have the history of my conversations with him as well as the name of the NOAA contact on my old blog http://magicjava.blogspot.com/ You’ll need to go through the posts to find the right ones, but they’re there.

  96. The thing I like about this site is that there are so many links provided. I can follow it along, checking out source documentation or sidetrack into deeper research.

    By contrast, the AGW mob wave around pretty graphs and yell that the science is settled. I always get the feeling they are talking down to me, as though I’m some kid who just has to accept what they say.

    Anthony never does that. None of the contributors here do that. Their line is, “Here’s what we’ve found, this is what we base our conclusions on, check it out for yourself.”

    The truth is not a religion. There have been too many errors and not a single prediction correct from the warmist camp. That’s thirty years’ worth of nonsense from them. Now let’s cut the crap: it’s time to toss out CO2 as a cause of trouble, it’s time to stop treating human beings as the enemy, and it’s time to stop wasting billions trying to control what we are too puny to control – nature.

    Let’s get back to living and growing and creating funds enough to handle any adaptation we may need in the face of the cooler years heading our way.

  97. magicjava,

    searching your old website, I see Christy replied to you:

    We are in a program with NOAA to transfer the code to a certified system that will be mounted on a government site and where almost anyone should be able to run it. We actually tried this several years ago, but our code was so complicated that the transfer was eventually given up after six months.

    http://magicjava.blogspot.com.au/2010/02/dr-john-christy-on-uah-source-code.html

    You contacted NOAA, and received a reply confirming

    the existence of the project and indicated that it is in its very early stages with no ETA at this time.

    http://magicjava.blogspot.com.au/2010/02/update-on-my-attempts-to-get-airs-and.html

    Both posts are dated February 2010. If we take several years to mean 3, then it has not been possible for Christy et al to get their code into the public domain for 5 years. His responses and the time-frame may give some perspective on other campaigns to ‘release the code’. Perhaps accusations of malfeasance have been ill-advised.

  98. joeldshore says:
    April 17, 2012 at 7:23 pm

    Willis Eschenbach says:

    COOL! Where’s the link to his code for a) his Hockeystick paper, b) his 1999 paper, c) his 2003 paper with Jones (see below)? (After publication of his 2008 paper, and after complaints from Steve McIntyre among others, Mann archived his 2008 code.)

    I am not sure why people are so fascinated by code over a decade old that has been superseded by more recent work. As you noted, he has archived his 2008 code.

    He also has archived his data for the earlier papers, see e.g., here: http://www.meteo.psu.edu/~mann/shared/research/old/mbh99.html

    http://www.meteo.psu.edu/~mann/shared/research/old/mbh98.html

    I had thought the code was there too now but I can’t seem to find it at the moment.

    You can’t seem to find it at the moment … OK, well, get back to us when you do find it. At this point your claim, that you could provide us links to Mann’s code, is looking real shaky. So if you want to make your word good, I wouldn’t delay too long.

    So, are you going to return the favor and provide me with the link to the UAH code? I don’t need to see all the earlier versions. The latest version would be just fine.

    Hey, you’re the one saying “I can give you links to” someone’s code, not me. I’ve said nothing of the sort, particularly about UAH. I have no clue if their code is available or not, never given it a moment’s thought until now. So why are you bugging me about it? Go ask John Christy for his code if you want it, come back and report the result.

    w.

  99. Not only should the code and data be provided, but a master script should be provided that runs all of the program sequences etc that create the final outputs.

    This way would allow anyone to recreate the final charts and tables that researchers drew their conclusions from.

    Otherwise you may get a climategate-style data dump, which would require verifiers to try many different sequences of compilations trying to achieve the original output. (With the originators saying “well, I have given you all the data, can’t you make sense of it?”)
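    In Python, such a master script could be as simple as the sketch below. The three stage script names are hypothetical placeholders of my own, standing in for whatever pipeline a real study would use:

```python
import subprocess

# Hypothetical pipeline: each entry is one stage, run in order.
# The script names are placeholders, not from any actual study.
STEPS = [
    ["python", "01_clean_raw_data.py"],   # raw data -> cleaned CSV
    ["python", "02_run_analysis.py"],     # cleaned CSV -> results tables
    ["python", "03_make_figures.py"],     # results tables -> final charts
]

def run_all(steps):
    """Run each step in sequence; return True only if every step succeeds."""
    for cmd in steps:
        print("running:", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            print("step failed:", " ".join(cmd))
            return False
    return True
```

    A verifier would then need only one command to regenerate every chart and table, instead of guessing at the order in which the pieces were meant to be run.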

  100. Let’s get real here. It’s the warmists who insist we are “deniers”, it’s the warmists who object to our objections. It’s the warmists who want to bring in laws to silence the criticism and who want to “treat” us or ban us or lock us away. There’s even hints of death for dissention in the future – just like the good old day, hey what?

    If AGW is such a certainty and the AGW crowd has real evidence of it, wouldn’t it be a whole lot easier to shut us all up by SHOWING THE CODES and SHARING THE DATA?

    Surely Mann and the rest aren’t putting PERSONAL INTEREST ahead of SAVING THE WORLD!

    /sarc off (and sorry for shouting).

  101. As a professional programmer, I do more than provide source code and data (inputs and outputs). I also use version/revision control, in case I wipe out something important. Lately, I’ve also been documenting my coding process, in hopes of being able to automate my thinking – thus being able to think at a higher level. I do this so I can discover what is the quickest way to go from point A to point B.

  102. Frank K. says:
    April 17, 2012 at 6:14 pm
    This IS good news. But it means NOTHING if the codes are not properly documented.
    Better than trying to document code, one should describe how the code was developed. I’ll give an example of the development of a non-trivial code for calculation of spherical harmonics of the sun’s magnetic field as we measure it at the Wilcox solar Observatory: http://www.leif.org/research/Calculation%20of%20Spherical%20Harmonics.pdf

  103. thomasl3125 says:
    April 18, 2012 at 2:48 am
    Lately, I’ve also been documenting my coding process,
    This is what I meant in my previous comment. The important word is ‘process’. That is what should be documented, not the code itself. Some would say that any tricks and obscure points should be well-documented. I would say that one should not use tricks and obscure code in the first place.

  104. There is really no substitute for good will and a real desire to communicate how the work was done. If that isn’t present, “process” type rules — like saying you have to publish the code and its documentation — won’t be as helpful as people here seem to think. No large, long-lived program is perfect or free of undiscovered bugs — why do you think Microsoft and Apple keep issuing patches to their operating systems? The real question is whether the remaining bugs are important and significantly affect the results you’re interested in. (Come to think of it, computer languages like Fortran and C++ are themselves implemented by long, complicated programs with undiscovered bugs.)

    What you really want is access to the stuff that is not meant for public consumption, like the emails leaked in climategate. Hey, that’s an idea! Why not require all scientific emails be disclosed? Wait, I know why that won’t work! The official emails will be unhelpful (because everyone will know they’re destined for release) while the real conspiracy occurs off the record. I suspect that requiring all the code to be released will produce lots of similarly unhelpful pro-forma information, with the computational dirty work buried in lots of “magic” data files that cannot be easily examined or understood.

    A better idea would be to require a scientific “audit” of what was done, or better yet, just require that others be able to reproduce the disputed work — and isn’t this in fact already the gold standard for whether a scientific discovery is valid? If a group of scientists aren’t of their own free will going to help you — a fellow member of their discipline — replicate what they did, that already tells you everything you need to know about the quality of what was done. If they respond to criticism by launching a publicity campaign against you in the popular media, that confirms it.

  105. Publishing the codes is a bad idea. Any simulation requires three components: a physical model, a mathematical model, and a numerical code. Publishing the first two components, i.e. the physical and mathematical models, provides complete disclosure of the information needed to reproduce the simulation. Disclosing the numerical procedure isn’t necessary. A numerical code can (and usually does) contain proprietary methods and, if enforced, this rule will preclude publication of a large number of good works. At this point I wonder why it is that all you smart people allow a few bad scientists to control you by making you reconsider the established publication procedures that are suitable for a free society. What’s next… will the journals require a purity-of-heart certificate for publication?

  106. Code is a good start but I suspect more important problems in climate research that influence results: selection and expectation biases being the dominant ones. Both can in part be subconscious and these biases are not apparent in the code. Also, the best written code in the world cannot prevent using an improper statistical method.

  107. Walt the Physicist,

    Transparency is an absolute requirement of the scientific method. If an experiment or a paper based on a hypothesis cannot be replicated, it is simply a conjecture; an opinion.

    Public policy costing $Trillions is now predicated on opinions in which the basic assumptions are not transparent. Willis is correct: if it’s business, by all means withhold the codes. But if public policy depends upon the codes, they must be published in full. No exceptions. No excuses.

  108. Not releasing the source code and data is the same as not releasing the procedures necessary to replicate an experiment. Without the source code, we just have some scientists saying “trust us.” That’s not science.

    BTW: I don’t want the source code released because I want to run the model myself. I want it released because it’s necessary for the process. We need to pay attention to the man (or Mann) behind the curtain.

    Releasing flow charts and pseudo code without the source code won’t cut it, either. We have to be able to verify the program works as documented and the source code and the data is the only way to verify that.
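    As a sketch of what that verification could look like in practice, the snippet below compares a regenerated output file against the archived original by hash. The byte-for-byte comparison is an assumption on my part; floating-point outputs may instead need a tolerance-based comparison:

```python
import hashlib

def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks to handle large outputs."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def outputs_match(archived, regenerated):
    """True if the regenerated output is byte-for-byte identical to the archived one."""
    return sha256_of(archived) == sha256_of(regenerated)
```

    If the rerun output hashes differently from the published one, either the code, the data, or the description of the procedure is incomplete — which is exactly what this check is meant to expose.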

  109. Back in school, if you filled in your maths homework with just the answers, you would get no marks, and a comment: ‘please show your working out’. It seemed that teachers used to be concerned that not showing ‘your working out’ was akin to wild guesswork or maybe even cheating.
    They were right of course.

    Wonder if Joel ever ‘showed his working out’ ????

  110. Ian W (April 17, 2012 at 12:42 pm) wrote:
    “Oh, I didn’t realize you are a climatologist – I do apologize I am more used to dealing with the hard sciences like sociology. /sarc”

    An insightful comment.

  111. I feel that they are using the term “amateurish” as a fig leaf for those out there who have models with a predetermined output. It is a much less confrontational term than “dishonest”.

    This entire situation is mind boggling to me, as I was taught from grade school through college that showing your work was as important as what your answer was. Initially it was to allow the teacher to see where I had messed up in my long division homework, but eventually it came to the point where the steps being shown allowed my professors to follow my line of thought in addressing a project. No steps shown, no credit. How did this environment allow the situation we are in to develop?

  112. Anthony Watts says:
    April 17, 2012 at 12:36 pm
    Leif, no laughing. I myself programmed FORTRAN on punch cards and paper tape in ASR33 teletype terminals…talk about dead.

    I can, with minimal effort, lay my hands on the deck of punch cards for the simple distillation program I wrote (in FORTRAN) for a class now 35 years in the past. I can see them becoming collector’s items in another 35 :-} (should I live so long).

  113. D. J. Hawkins says:
    April 18, 2012 at 9:48 am
    I can, with minimal effort, lay my hands on the deck of punch cards for the simple distillation program I wrote (in FORTRAN) for a class now 35 years in the past.
    Most likely that program would still run today.

  114. I run a company that writes complex scientific code for our own commercial products. The work is private, self-funded. The resulting code is proprietary, but it’s not the code I want to hide specifically, it’s the underlying technology, so I simply don’t publish anything about it. If I were an academic and wanted to publish a novel method, then the code and implementation should be as much a part of the paper as the mathematical derivation. If the work is privately funded, then don’t publish. If it’s publicly funded, then I get VERY ticked off by academia claiming commercial advantage when they don’t have to provide their own money – a huge advantage over small companies like mine that have to self-fund.

    A good example of how academia can still write fantastic code, make it freely available and yet still demand (modest) licence fees for commercial use is the FFTW code from MIT. Developed and owned by MIT, it performs very fast FFTs, including prime sizes and transforms in higher dimensions. It’s amazing code and you can download it for free. If you use it internally, even in a commercial company, there is no licence fee (or for academic purposes, of course), but if you sell the resulting product you have to buy a licence. The price is very reasonable – a single payment of about $2,500 – $5,000 depending on the version. THAT’s the right commercial model for these things – everyone wins from that kind of arrangement.

    On the topic of code and portability, the journal Computers & Geosciences has been publishing algorithms and accompanying code for many years, sometimes on very technical and sophisticated techniques. No-one makes rules about what language etc.; the authors just need to give possible users a guide as to what the basic compiler/OS might be. There are clever people out there – look at how Steve McIntyre has reverse engineered even some of the most obfuscated results from Mann and others. It’s not hard for others to replicate the work on another OS etc. if they have the basic algorithm written as code. Very often just the key functions are provided, with the calling arguments, and it’s enough. Take a look at Numerical Recipes in [C, Java, FORTRAN - take your pick] to see how useful such routines are to a wider scientific community. You don’t need all of it, although for scripted languages like R it helps – Steve McIntyre is great in this regard.

    I am a technical bod, not a programmer, but I can still write reasonable programs. I learnt a long time ago the difference between commercial code and technical/scientific code. I write code to prove a concept or test an idea (be it something simple like using an Excel spreadsheet, or maybe Java or C). What I write is simple, practical, even amateurish code, but I am not ashamed of that. I am not a commercial programmer. It’s the commercial programmer’s job to turn it into robust code that has regression tests and bounds checking etc., but that doesn’t invalidate my original analysis – unless I made a mistake. In which case I want to know about it and PUT IT RIGHT. Don’t those publishing analyses/algorithms concerned with AGW want to know if they got it wrong and, if so, put it right? Or is that too much to expect?

  115. ThinkingScientist:

    Thank you for your superb post at April 18, 2012 at 10:21 am.

    I think it is by far the best post in the thread and I commend everybody to read all of it: each paragraph contains some meat.

    Again, thank you.

    Richard

  116. @ thinking scientist:
    Great post, a very insightful, professional and completely logical argument.

    As to your question

    Don’t those publishing analyses/algorithms concerned with AGW want to know if they got it wrong and, if so, put it right?

    When speaking of the Team, I think I know the answer to that one.

    Question for everybody here: if this were to be implemented by the journals, would this then mean that earlier published work also has to make available the code used, else be retracted?

  117. One potential issue with sharing the source code is that it may drive the models toward greater uniformity. The code just shows us how it works: what assumptions are made, the fudge factors, the errors, etc. The code still doesn’t show us the results are correct, just how the results were made.

    I can even foresee open source climate models, officially endorsed and stamped with approval and the claim that since the code is good, then the results are as well.

  118. joeldshore: It is not as cut-and-dried as people seem to think.

    Actually, yes it is.

    As to points 1 and 3, the scientists involved can choose to protect their intellectual property rights (possibly temporarily) or to publish and gain the scientific prestige. All the complications can be considered by them individually on a case-by-case basis, but the choice is simple: publication in scientific journals requires disclosure. Publishing a paper without disclosing original code is equivalent to publishing a description of the procedure that omitted critical steps, or omitted a key ingredient.

    As to point 2, when using proprietary code by someone else (the examples cited above include Matlab, Excel, and statistical software), cite the name and version number — all information needed for someone to replicate the procedure exactly. That’s what scientists do when they use commercial reagents.

  119. About effing time that some scientists have stated the obvious… Now let’s see how long it takes to get the journals to enforce the idea.

    When they do, computer code should also come with instructions or documentation on how it was built: host platform, specific compiler versions, notes on platform dependencies, notes on underlying support code such as libraries and runtime environments. Obviously, if some commercial library is used which cannot be legally shared as an object file, then version numbers for that software should be provided, with a caveat that if the precise code is no longer sold due to obsolescence, the actual libraries used in the experiment can be retrieved. (Scientific and academic study is one of the Fair Use provisions that allow for copying in U.S. copyright law.)
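    A minimal sketch of capturing that information automatically, assuming a Python-based study; the manifest fields and the suggested environment.json file name are my own invention, not any journal’s standard:

```python
import platform
import sys

def environment_manifest(packages=()):
    """Collect the platform and interpreter details a replicator would need.

    `packages` is an optional list of already-importable module names whose
    __version__ attribute should be recorded; which ones matter is up to
    the study being published.
    """
    manifest = {
        "os": platform.platform(),        # OS name, release, build
        "machine": platform.machine(),    # e.g. x86_64
        "python": sys.version,            # interpreter version and build info
        "packages": {},
    }
    for name in packages:
        module = sys.modules.get(name) or __import__(name)
        manifest["packages"][name] = getattr(module, "__version__", "unknown")
    return manifest

# Archived alongside the code and data, e.g.:
#   import json
#   json.dump(environment_manifest(["numpy"]),
#             open("environment.json", "w"), indent=2)
```

    A small manifest like this costs the author almost nothing, yet answers most of the “what platform, which compiler, which library versions” questions before they are ever asked.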

  120. Wijnand:

    At April 18, 2012 at 11:21 am you say;
    “Question: if this were to be implemented by the journals, would this then mean that earlier published work also has to make available the code used, else be retracted?”

    Well, I can only answer for myself and not “for everybody here”, but I think it would be very unreasonable to demand “that earlier published work also has to make available the code used, else be retracted”.

    A paper is published according to the rules stipulated by the journal at the time of the paper’s submission. Changing the publication rules after a paper has been published should not affect the published paper in any way.

    Your suggestion would require authors to guess how other publication rules may change in future. This would require all authors to each own a crystal ball (sarc on/ or a GCM sarc off/).

    Richard

  121. Willis Eschenbach says:

    Hey, you’re the one saying “I can give you links to” someone’s code, not me. I’ve said nothing of the sort, particularly about UAH. I have no clue if their code is available or not, never given it a moment’s thought until now. So why are you bugging me about it? Go ask John Christy for his code if you want it, come back and report the result.

    That’s quite an admission. You are so concerned about the fact that Mann might not have publicly available code over a decade old that has since been superseded by newer codes of his (for his 2008 paper) that are, by your own admission, publicly available. And, yet, at the same time, you have no clue whether Spencer and Christy have made ANY version of their codes publicly available?!?

    One thing that I have admired about you is how you are willing to step up and complain about censorship of comments whether the “censor” is RealClimate or tallbloke. Similarly, don’t you think it is important for you to be sure that scientists who you admire are complying with the standards that you want to impose on other scientists in regards to making code freely available?

  122. Willis Eschenbach says:

    Thanks as always for your reply, Joel. Certainly, if the code is proprietary then it cannot be published in a scientific journal … but then if the code used in a study is proprietary, the study can’t be replicated, and thus it shouldn’t be published in a scientific journal.

    So, basically, you are saying that a huge swath of science currently published in scientific journals shouldn’t be there. Practically entire fields of research, like research on organic light emitting diodes (OLEDs) would disappear under your new standards.

    As has been explained many times, “replication” has traditionally not meant what McIntyre and you and many other skeptics have defined it to mean. “Replication” has traditionally meant using the methods described in the paper to reproduce the basic results and conclusions of the paper. That has rarely involved using the author’s computer code.

    That seems like a non-issue to me, for the same reason. If someone can’t explain what their procedure is doing for any reason, be it that they used proprietary quantum chemistry software or some other reason, then why are they publishing in a scientific journal?

    There is a difference between explaining what it is doing and having the line-by-line code. I think there is universal agreement that papers need to adequately explain their methods. That has not traditionally been taken to mean that they have to provide the actual computer code.

    I don’t see the “real issues”. It seems extremely cut and dried to me. If you want to be a businessman, you get to keep all the secrets you want. But if you want to be a scientist, you have to show your work. Where’s the issue?

    The issue is with your very black-and-white, two-valued orientation: You are either a scientist or a businessman and if you are a scientist you have absolutely no intellectual property rights.

    That is not traditionally how things have been done…and there are some good reasons why things haven’t been done that way.

  123. Smokey says:

    But if public policy depends upon the codes, they must be published in full. No exceptions. No excuses.

    Great…So, where are the UAH codes? I am not asking for versions going back more than a decade. I would be perfectly happy just to see the most recent version …Or the version from Spencer and Christy’s most recent publication on the satellite temperature record.

  124. Similarly, don’t you think it is important for you to be sure that scientists who you admire are complying with the standards that you want to impose on other scientists in regards to making code freely available?

    The day I see someone demanding the code from scientists on their ‘side’ is the day I might believe I’m looking at a real live skeptic instead of a propagandist. Of course, a real live skeptic would eschew the notion of taking a side to begin with.

    Joel, Willis, is this the description of, and the actual fortran coding for MBH98?

    http://www.meteo.psu.edu/~mann/shared/research/MANNETAL98/METHODS/

  125. joeldshore: The issue is with your very black-and-white, two-valued orientation: You are either a scientist or a businessman and if you are a scientist you have absolutely no intellectual property rights.

    No, it just means that the scientist-businessman has to decide which intellectual property to keep secret for commercial reasons, and which to sacrifice for academic/professional advancement. Same as with reagents and other physical assets: keep the ingredients secret (as with Coke and Kentucky Fried Chicken) and sell the product, or publish the ingredients (Taq, etc.) in the peer-reviewed literature for the academic and scientific credit.

    Great…So, where are the UAH codes?

    A palpable hit, in my opinion. There has to be a clear and unbiased standard. If you have asked for their code and they have not released it, then they have a problem.

  126. joel shore,

    I am an equal opportunity skeptic. Everyone who is government subsidized should provide their code, methods, metadata, etc. upon request. I would gladly support a law requiring that. The details could be worked out, but one requirement would be that anyone refusing to comply would be ineligible for any future gov’t money, as would their employer. Yeah, let’s make transparency the law. For everyone.

    Ball’s back in your court: would you support such a law? Do you think Michael Mann would?

  127. joeldshore: “Replication” has traditionally meant using the methods described in the paper to reproduce the basic results and conclusions of the paper.

    That “tradition” has outlived its usefulness. Now it is recognized that without the code that supported a reported experimental result, it cannot be determined whether successors repeated the “same” procedure with adequate fidelity; that applies whether or not the successors seem to have replicated the original result.

  128. More Soylent Green! I can even foresee open source climate models, officially endorsed and stamped with approval and the claim that since the code is good, then the results are as well.

    A more likely outcome, in my opinion, will be lots of professionals volunteering time to test various sections of the code, and compiling a list of tests that the code sections have passed and (sometimes) failed. What happens now with commercial software like SAS is that many people test the new releases and report back to the SAS Institute when (it’s always “when”, not “if”) they discover problems. People in Big Pharma routinely test new releases against results from previous releases to ensure that the new releases are as reliable as the old releases. So do people in other industries.

  129. barry says:

    Joel, Willis, is this the description of, and the actual fortran coding for MBH98?

    http://www.meteo.psu.edu/~mann/shared/research/MANNETAL98/METHODS/

    Thanks, barry. I think that is indeed what I had seen before and was trying to find again!

    Smokey says:

    I am an equal opportunity skeptic. Everyone who is government subsidized should provide their code, methods, metadata, etc. upon request.

    I would hardly call you “an equal opportunity skeptic”. You have repeatedly made statements that Mann is a fraud, hiding things, etc. just because McIntyre says there is some obscure piece of code from 14 years ago that Mann has supposedly not yet released. And yet, I have never seen you use any sort of bad language about Spencer and Christy who, to my knowledge, have never publicly made available ANY version of their code.

    Your claiming to be “an equal opportunity skeptic” is like a sheriff claiming he believes that everyone should obey the speed limit and then throwing a little old lady in jail for a week for going 27 in a 25 mph zone while failing even to pull over a political friend who whizzes by at 60 in a 30 zone.

  130. joel shore,

    That’s a lot of pointless chatter in place of answering my questions: would you support such a law? Do you think Michael Mann would?

  131. Joel Shore,
    You are still habitually evading questions I see.
    BTW, you should try reading Andrew Montford’s book, The Hockey Stick Illusion, available at modest cost from Amazon. You make yourself look very silly making such a naïve statement on that topic.

  133. joeldshore says:
    April 18, 2012 at 6:24 pm

    Willis Eschenbach says:

    Thanks as always for your reply, Joel. Certainly, if the code is proprietary then it cannot be published in a scientific journal … but then if the code used in a study is proprietary, the study can’t be replicated, and thus it shouldn’t be published in a scientific journal.

    So, basically, you are saying that a huge swath of science currently published in scientific journals shouldn’t be there. Practically entire fields of research, like research on organic light-emitting diodes (OLEDs), would disappear under your new standards.

    So you are claiming the scientific journals should publish results that cannot be replicated … you’ll have to justify that one to me. Because it seems clear to me that if the research doesn’t contain enough information to replicate it, it shouldn’t be in a scientific journal. Feel free to argue the opposite side, that irreproducible results belong in scientific journals … I can hardly wait.

    As has been explained many times, “replication” has traditionally not meant what McIntyre and you and many other skeptics have defined it to mean. “Replication” has traditionally meant using the methods described in the paper to reproduce the basic results and conclusions of the paper. That has rarely involved using the author’s computer code.

    Oh, please, Joel, come up with a new argument. That has been explained to you many times. Why do you think Science magazine recommends requiring that the code be provided? Perhaps you can explain why Science magazine would do that when Joel Shore says that it is not necessary in any sense.

    And of course, as you say, replication “traditionally” hasn’t involved the author’s computer code, because traditionally authors didn’t use computers, or they didn’t build the entire study around computers … but as the Science article clearly indicates, that was then, and this is now. Sorry to drag you kicking and screaming into the 21st century, but that’s the modern reality.

    That seems like a non-issue to me, for the same reason. If someone can’t explain what their procedure is doing for any reason, be it that they used proprietary quantum chemistry software or some other reason, then why are they publishing in a scientific journal?

    There is a difference between explaining what it is doing and having the line-by-line code. I think there is universal agreement that papers need to adequately explain their methods. That has not traditionally been taken to mean that they have to provide the actual computer code.

    Traditionally in science there were neither computers nor code, so I’m totally unclear what you mean by “traditionally”. In any case, explaining your computer code, even explaining it perfectly, doesn’t mean that your code actually does what you claim … are you sure you’re familiar with computer programs? You do know that they often contain what are called “bugs”, and that without the code you can’t establish whether any of those “bugs” are present?

    I don’t see the “real issues”. It seems extremely cut and dried to me. If you want to be a businessman, you get to keep all the secrets you want. But if you want to be a scientist, you have to show your work. Where’s the issue?

    The issue is with your very black-and-white, two-valued orientation: You are either a scientist or a businessman and if you are a scientist you have absolutely no intellectual property rights.

    Where did I say scientists have no property rights? That’s a straw man. What I said was very different. I said that if you want to claim the mantle of science for your results, you have to be transparent so people can check your work. Otherwise, it’s not science. Why is that so hard to understand?

    That is not traditionally how things have been done…and there are some good reasons why things haven’t been done that way.

    Again with the tradition … I don’t know if you noticed, but times have changed, Joel. Traditionally scientists wrote their reports in longhand. So should we continue the tradition?

    Look, it appears that you think that Science magazine and a host of scientists are 100% wrong in wanting to require computer code. To most of us who have used computers extensively, requiring code makes perfect sense, because we know that computer programs are

    a) very difficult to describe accurately in English, and

    b) often contain bugs, and

    c) can easily conceal a foolish error like say using degrees instead of radians, and

    d) are quite fond of doing things that their programmers never dreamed of.
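
    Point (c) is easy to demonstrate with a toy sketch (hypothetical code, not from any paper under discussion). A methods section would describe both routines below identically, as “the cosine of the annual phase”, yet they disagree badly, because one feeds degrees to a function that expects radians:

```python
import math

def seasonal_cycle_buggy(day):
    # Bug: math.cos expects radians, but this phase is in degrees.
    return math.cos(day * 360.0 / 365.0)

def seasonal_cycle_correct(day):
    # Correct: convert the annual phase from degrees to radians first.
    return math.cos(math.radians(day * 360.0 / 365.0))

# On day 1 the annual cycle should sit near its maximum, cos ≈ 1.0;
# the buggy version returns roughly 0.55 instead.
print(seasonal_cycle_correct(1), seasonal_cycle_buggy(1))
```

    An English description of the method, however careful, cannot expose the difference; only the source line containing the conversion (or its absence) can.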

    When you can explain to me how to discover those problems with a scientist’s work WITHOUT having access to the code, I’ll believe you have a point. Until then, you’re just defending the actions of scoundrels. I see above that you want to claim that this kind of investigation, of trying to do exactly what the author did with the data and code, is not a part of “replication”, whatever replication means to you.

    Perhaps so, but I’ll leave the semantic hairsplitting to you. Me, I don’t care what you call it, but checking the accuracy and validity of someone’s data and code is a necessary and critical part of the investigation and replication of anyone’s claims. The issue with Michael Mann’s “Hockeystick” code is a perfect example.

    Mann made a newbie mistake: he used un-centered PC analysis, and he didn’t even realize it. The only way it was discovered was that he left some of his code on an open server. Without that good fortune, we would still not know about his error—it simply could not have been discovered without access to the code.
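
    The centering issue can be seen in a small sketch. This uses made-up random data, not the MBH98 proxies, and ordinary PCA via an SVD; the only point is that a paper saying “we computed the principal components” does not pin down which convention the code actually used:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical stand-in for a proxy network: 50 "years" x 10 "series"
# of pure noise sitting on a common offset.
X = rng.normal(loc=5.0, scale=1.0, size=(50, 10))

def leading_pc(data, center):
    # Conventional PCA centers each series on its mean before the SVD;
    # the "un-centered" variant skips that step.
    M = data - data.mean(axis=0) if center else data
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    return M @ vt[0]

pc_centered = leading_pc(X, center=True)
pc_uncentered = leading_pc(X, center=False)

# Un-centered PCA lets the common offset dominate the leading component,
# so the two "PC1"s of the very same input look nothing alike.
```

    With centering, PC1 of this noise has mean zero; without it, PC1 is dominated by the shared offset. Two codes can both truthfully claim to report “the first principal component” and still disagree completely.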

    That is the kind of bullshit that your argument is supporting, Joel—you are speaking out in favor of the concealment of the kind of crappy, error-containing code that Michael Mann wanted so desperately to keep secret. Are you sure that’s the side you want to be on?

    I was under the impression you were a scientist … so why are you so opposed to transparency? Why do you want Mann and others to continue to be able to hide their errors? Science magazine understands the issue with computer code. We all understand that issue. So why don’t you? That’s the part I don’t get …

    w.

    PS—Are you going to have the courage to acknowledge that you were … well, let me call it “overly optimistic” when you said “I can give you links to Michael Mann’s code”? Because all of this flailing strikes me in part as a vain attempt to distract us from your failure to make your word good.

  134. joeldshore says:
    April 18, 2012 at 6:13 pm

    Willis Eschenbach says:

    Hey, you’re the one saying “I can give you links to” someone’s code, not me. I’ve said nothing of the sort, particularly about UAH. I have no clue if their code is available or not, never given it a moment’s thought until now. So why are you bugging me about it? Go ask John Christy for his code if you want it, come back and report the result.

    That’s quite an admission. You are so concerned about the fact that Mann might not have publicly available code over a decade old that has since been superseded by newer codes of his (for his 2008 paper) that are, by your own admission, publicly available. And, yet, at the same time, you have no clue whether Spencer and Christy have made ANY version of their codes publicly available?!?

    I’m deeply sorry for my lack of omniscience, Joel, but I truly had never considered the question. So sue me …

    I suspect I had not considered it because I knew Spencer and Christy had given their code to RSS to pick apart. RSS is their competitor, and one of the few groups with the expertise to find errors in the UAH code. I’ve often advocated doing just that, giving the code to your worst enemy. I’ve said in the past that if your worst enemy can’t find errors in the code, then things are looking good.

    Since they had done exactly that, I didn’t even give it a second thought.

    Now you want to bust me for that? Get real. As posters above have pointed out, Spencer and Christy have been working with NOAA to make their code public, so you are bitching me out about a total non-issue.

    w.

    PS—If Mann had done exactly what Spencer and Christy did, if he had given the code to Steve McIntyre, it wouldn’t be an issue. He didn’t. Your attempt to equate the two is a joke.

    You need to get out more, you are the Pollyanna optimist that claimed that you could give us links to Mann’s code … now that you’ve noticed you can’t do that, you’d love to change the subject …

  135. joeldshore says:
    April 18, 2012 at 7:50 pm

    barry says:

    Joel, Willis, is this the description of, and the actual fortran coding for MBH98?

    http://www.meteo.psu.edu/~mann/shared/research/MANNETAL98/METHODS/

    Thanks, barry. I think that is indeed what I had seen before and was trying to find again!

    Nice try, but no. See, Joel, some of us were actually following this story when Mann finally released what he claimed was the code in 2005. As Steve McIntyre commented at the time (emphasis mine):

    The newly-archived source code and the data archive made public in June-July 2004 (mirrored at Nature) ftp://holocene.evsc.virginia.edu/pub/MANNETAL98/ are clearly connected with MBH98, but do not match. At present, it is impossible to do a run-through and get results.

    Read Steve’s article for more details, and also see here—the bottom line is, no, that is not what barry calls “the actual fortran coding for MBH98”.

    w.

  136. The overall discussion triggered two points in my mind, and I’m interested to hear reactions.

    1. It takes time to prepare code and model results for publication. When proposing for funding, scientists, including myself [I am primarily NASA-funded], would naturally rather propose to do more research than propose to spend resources, say, cleaning up code and readying model runs for public consumption. The reason for this is pretty obvious: There is currently hardly any stigma attached to not publishing code, and proposals that contain additional research [vs. those proposing to archive code and model runs] are usually viewed by review panels as more cost-effective [more science per public dollar] and thus more likely to be funded. You may view the resulting science as inferior, but that usually doesn’t matter unless you happen to be serving on a review panel. No one ever said scientists were saints, and when the economic incentives slant heavily against publishing code and model runs, it shouldn’t be surprising that these items aren’t published. My opinion is that probably the best way to make publishing these items more commonplace is to have government funding agencies require it, but even if that were in place, how do you get the independently wealthy or privately funded scientists [and there are lots of the latter in the medical disciplines] who are not beholden to government funding to change their current habits and buy in?

    2. Further, there is the additional cost of archiving. These costs are not trivial when one considers that model runs, especially those involving 3D time-evolving numerical simulations, can take up terabytes of storage space, and this number is growing rapidly. [I'm assuming that to live up to the ideal being espoused here, one would require the model results and the associated code to process them, in addition to the computer code that generated them, to be available as well.] Any suggestions for who should archive these data, and who should pay for it? This is not a trivial question to answer. Such data would ideally be stored in such a way as to allow easy access by everyone for long periods of time, or basically forever. However, it turns out the main priority of for-profit publishers who house the journals in which the majority of peer-reviewed scientific research is published [the three big ones are Wiley, Springer, and Elsevier] is to make money, not back up terabytes of data or keep it online and readily accessible. Government centers or funding agencies are subject to myriad forces [political, financial, bureaucratic, etc.] and may not be reliable in the long term. Maybe something like cloud storage or bittorrent could work?

  137. Competent parties need only:
    1. data.
    2. concise outline of methods. (Brighter parties won’t need #2.)

    The incessant whining for code spoonfeeding gives the impression of a quantitatively weak community.

    Red tape:
    a) builds in delays.
    b) deflects resources from research to admin.

    In the aggregate, the delays and resource waste burden an already over-taxed society.

  138. Seems like common sense to me. How much grief and misery would’ve been avoided if scientists, publishers and academics had gone down this route 15 years ago? Science is about testing hypotheses. If the hypothesis is based on computer code, it’s not testable unless the code is in the hands of people seeking to find fault with it. Climate models might be well nigh perfect by now … if only the modellers had opened their code to scrutiny. What we need now is retrospective enforcement of what has always been the rule. We’ll soon learn who’s been doing good research, and who’s been cooking the data to produce a grant … once we see the actual calculations. There probably aren’t all that many who’ve loaded the dice for profit. We need to check, though. It’s the only way to restore respect to climate science.

  139. Paul Vaughan says:
    April 19, 2012 at 1:33 am
    Competent parties need only:
    1. data.
    2. concise outline of methods. (Brighter parties won’t need #2.)

    The incessant whining for code spoonfeeding gives the impression of a quantitatively weak community.

    Red tape:
    a) builds in delays.
    b) deflects resources from research to admin.

    In the aggregate, the delays and resource waste burden an already over-taxed society.

    1) Computer models, computer simulations do not output data. Model results are neither facts nor data.

    2) Without the code, there is no way to tell if what was actually done matches the claimed methods. Perhaps there is a bug in the code, so that no matter what data is entered, the results are essentially the same.

    ~More Soylent Green!

  140. Willis Eschenbach: I suspect I had not considered it because I knew Spencer and Christy had given their code to RSS to pick apart. RSS is their competitor, and one of the few groups with the expertise to find errors in the UAH code.

    Ah, that is news to me. Thanks. That’s a good step. Did they make their code publicly available, as called for in the Science article, when they published results?

  141. More Soylent Green! (April 19, 2012 at 6:48 am) wrote:
    “Computer models, computer simulations do not output data. Model results are neither facts nor data.”

    What you address here is art & culture based on fantasy assumptions – and it isn’t good enough to attract sensible attention, let alone retain it.


    More Soylent Green! (April 19, 2012 at 6:48 am) wrote:
    “Without the code, there is no way to tell if what was actually done matches the claimed methods.”

    Simply not true. Red herrings & straw men – if allowed – will deliberately have armies tied up unproductively at committee in a protracted resource drain. A war of attrition is a tactical political exercise that will do nothing to advance our understanding of natural climate variability.


    I suggest redirecting focus to climate exploration without assuming climatology’s already at the level of a science. Simply explore the terrain and informally share raw findings without succumbing to procedural harassment and corrupt cultural pressure formally demanding editorial cosmetic distortion.

  142. Paul Vaughan says:
    April 19, 2012 at 8:30 am
    More Soylent Green! (April 19, 2012 at 6:48 am) wrote:
    “Computer models, computer simulations do not output data. Model results are neither facts nor data.”

    What you address here is art & culture based on fantasy assumptions – and it isn’t good enough to attract sensible attention, let alone retain it.


    More Soylent Green! (April 19, 2012 at 6:48 am) wrote:
    “Without the code, there is no way to tell if what was actually done matches the claimed methods.”

    Simply not true. Red herrings & straw men – if allowed – will deliberately have armies tied up unproductively at committee in a protracted resource drain. A war of attrition is a tactical political exercise that will do nothing to advance our understanding of natural climate variability.


    I suggest redirecting focus to climate exploration without assuming climatology’s already at the level of a science. Simply explore the terrain and informally share raw findings without succumbing to procedural harassment and corrupt cultural pressure formally demanding editorial cosmetic distortion.

    Can you explain the “simply not true” remark? Can you explain your entire post?

  143. Willis Eschenbach says:

    I suspect I had not considered it because I knew Spencer and Christy had given their code to RSS to pick apart. RSS is their competitor, and one of the few groups with the expertise to find errors in the UAH code.

    You know this how? What I have heard (admittedly only second-hand) is that it took RSS quite a bit of effort to get what they wanted from Spencer and Christy. And note that they, unlike McIntyre, were not asking for everything and anything but only for one particular small section of the code that related to their main point of contention.

    Spencer and Christy have been working with NOAA to make their code public, so you are bitching me out about a total non-issue.

    What magicjava said was:

    To the person asking if Dr Christy makes his code available, the last time I asked him, he said no, but they were working with NOAA to get it available to the public.

    This was 2 years ago and I’ve since lost interest in chasing after scientists trying to verify their work (can’t be done, IMHO).

    If Mann had said that he was working with his employer (or former employer) to get the code made available to the public and then 2 years went by, I doubt that you guys would be representing this as “Mann has been working with his employer to make his code public.” No, I think you would be representing it as something more along the lines of “Mann is continuing to stonewall about releasing his code.”

  144. Willis Eschenbach says:

    Read Steve’s article for more details, and also see here—the bottom line is, no, that is not what barry calls “the actual fortran coding for MBH98”.

    The simple fact remains that if I want to see the code for Mann’s ***MOST CURRENT*** work on the proxy temperature record, it is all publicly available. If I want to see ***ANY*** code for Spencer and Christy’s UAH temperature analysis based on the satellite data, I appear to be completely out of luck.

  145. Willis Eschenbach says:
    I said that if you want to claim the mantle of science for your results, you have to be transparent so people can check your work.

    So even if you fully publish your methods, your work still isn’t reproducible?

    My group works on simulations of materials. We publish the equations we solve, the numerical methods we use to solve them, and all parameters and inputs that go into the simulations. The results are reproducible: but you’ll have to actually do some work to reproduce them.

    For us, our code is like a piece of lab equipment. We’re happy to explain what we do and how our methods work; that’s part of science and reproducibility. But we’re not just going to let you come in and use our lab equipment (even if it didn’t cost us anything), because that equipment cost us time and money to set up.

    To most of us who have used computers extensively, requiring code makes perfect sense…
    When you can explain to me how to discover those problems with a scientist’s work WITHOUT having access to the code, I’ll believe you have a point.

    Why not just write your own code?

    Sure, if it produces different results, you may not be able to explain exactly *why* the results are different, but that’s quite normal in science – some % of the time, we never figure out why so-and-so’s results were wrong.

  146. Willis Eschenbach says:

    So you are claiming the scientific journals should publish results that cannot be replicated … you’ll have to justify that one to me.

    I am not saying that. I am saying that you are defining the meaning of replication quite differently than it has traditionally been defined.

    And of course, as you say, replication “traditionally” that hasn’t involved the author’s computer code, because traditionally authors didn’t use computers, or they didn’t build the entire study around computers

    When I say “traditionally”, I mean including the last several decades over which computer code has been a very important part of scientific research.

    Why do you think Science magazine recommends requiring that the code be provided?

    Look, it appears that you think that Science magazine and a host of scientists are 100% wrong in wanting to require computer code.

    Where has Science magazine made this recommendation? I hope you are not misinterpreting / misrepresenting the fact that Science published this article in their “Policy Forum” to mean that Science endorses all of its conclusions.

  147. Windchaser says:
    April 19, 2012 at 11:53 am

    Willis Eschenbach says:

    I said that if you want to claim the mantle of science for your results, you have to be transparent so people can check your work.

    So even if you fully publish your methods, your work still isn’t reproducible?

    I’m sorry, Windchaser, but I don’t understand that. Why would your work not be reproducible?

    My group works on simulations of materials. We publish the equations we solve, the numerical methods we use to solve them, and all parameters and inputs that go into the simulations. The results are reproducible: but you’ll have to actually do some work to reproduce them.

    The issue is whether the results are reproducible. For many studies, they are not reproducible without the computer code.

    For us, our code is like a piece of lab equipment. We’re happy to explain what we do and how our methods work; that’s part of science and reproducibility. But we’re not just going to let you come in and use our lab equipment (even if it didn’t cost us anything), because that equipment cost us time and money to set up.

    Windchaser, this has all come up and come to a head because for many things, being “happy to explain” what you do and how you do it isn’t enough to reproduce what you have done, for the reasons I spelled out above.

    To most of us who have used computers extensively, requiring code makes perfect sense…
    When you can explain to me how to discover those problems with a scientist’s work WITHOUT having access to the code, I’ll believe you have a point.

    Why not just write your own code?

    Sure, if it produces different results, you may not be able to explain exactly *why* the results are different, but that’s quite normal in science – some % of the time, we never figure out why so-and-so’s results were wrong.

    The problem is that your method just leads to dueling claims, where nothing gets settled. For example, people tried to reproduce what Michael Mann did in the “Hockeystick” paper. They couldn’t replicate it. But Mann continued to insist that everything was right and proper, and his results continued to be cited even though they were clearly wrong. What do you do then?

    Without Mann leaving some of his code on an open server, that issue would never have been settled, because we’d never have found out just where Mann’s paper went off the rails. The problem was that, unbeknownst to Mann, he was making a newbie math error, so the code wasn’t doing what he said it did. He, like you, was “happy to explain what we do and how our methods work” … but the code didn’t work the way he explained. What then?

    I see no way to avoid those problems and to settle those issues without access to the code. It’s like when Ross McKitrick erroneously used degrees instead of radians … but unlike Mann, Ross revealed his code, and so the error was found and remedied quite quickly.

    Under your proposed method, where hiding the code is perfectly fine, we’d never have found either Mann’s or McKitrick’s errors. Not only that, but in Mann’s case, the hours involved in trying to replicate it were huge. That’s a great waste of human resources, and that’s part of why Science magazine says, free the code.

    So if you want to keep your code secret, that’s your call … just don’t expect me to believe a single word of the results from your code. Why should I, when you won’t reveal exactly how your code does it? The code may well contain errors that you don’t know about. It may be that you are shading the truth about what it does. It may be that you have just made up the answers. It may be that, for some unknown reason, you are deliberately producing slightly wrong answers.

    And without your code, we’ll never know the difference. You want us to trust that your code actually does what you claim, but science isn’t built on trust.

    w.

  148. donkeygod says:

    Climate models might be well nigh perfect by now … if only the modellers had opened their code to scrutiny.

    This is just pure silliness. The legitimate debates and uncertainties in climate science have nothing to do with the nitty-gritty details of various groups’ computer codes. It has to do with real issues involving things like clouds, aerosols, etc., all of which are openly discussed in the scientific literature, at conferences, and by e-mail every day. I doubt you’d be able to find any serious scientist in the field who would list lack of access to other groups’ computer codes as among the top issues in the field.

    And, some of the models, such as GISS Model E, have been publicly available. I doubt that as a result of this it is considered any more perfect than any other model.

  149. Windchaser
    My group works on simulations of materials. We publish the equations we solve, the numerical methods we use to solve them, and all parameters and inputs that go into the simulations. The results are reproducible: but you’ll have to actually do some work to reproduce them. For us, our code is like a piece of lab equipment. We’re happy to explain what we do and how our methods work; that’s part of science and reproducibility. But we’re not just going to let you come in and use our lab equipment (even if it didn’t cost us anything), because that equipment cost us time and money to set up.

    So, let’s assume that I take the information you provide and come up with a totally different result. And MY results are just as consistent as yours. Obviously, at least one of us has NOT implemented the simulation correctly – but how is anyone to know?

    In your example, it’s like you built your OWN piece of lab ware, but didn’t provide all of the details about how it was built. Maybe you gave the measurements, but failed to specify the materials, and yours was made of aluminum while mine was made of plastic. Or maybe you gave all the details, but made a mistake in the construction. How is anyone to know if you don’t let anyone SEE the piece of equipment you built? There is no way someone can compare what you SAY it’s supposed to do with what it ACTUALLY does, if nobody can examine it.

  150. joeldshore says:
    April 19, 2012 at 11:44 am

    Willis Eschenbach says:

    Read Steve’s article for more details, and also see here—the bottom line is, no, that is not what barry calls “the actual fortran coding for MBH98”.

    The simple fact remains that if I want to see the code for Mann’s ***MOST CURRENT*** work on the proxy temperature record, it is all publicly available. If I want to see ***ANY*** code for Spencer and Christy’s UAH temperature analysis based on the satellite data, I appear to be completely out of luck.

    The fact remains, your mouth made a bet that your resources can’t pay. You said you could give us links to Mann’s code. You can’t. Quit trying to wriggle out of it, it’s unseemly.

    w.

    PS—Please don’t try to claim you were talking about Mann’s most current code all along, your own words show that’s not the case.

  151. joeldshore says:
    April 19, 2012 at 12:29 pm

    Why do you think Science magazine recommends requiring that the code be provided?
    Look, it appears that you think that Science magazine and a host of scientists are 100% wrong in wanting to require computer code.

    Where has Science magazine made this recommendation? I hope you are not misinterpreting / misrepresenting the fact that Science published this article in their “Policy Forum” to mean that Science endorses all of its conclusions.

    So your claim is that in a Science magazine special issue, whose theme is “Computational Biology”, and whose introduction to the special issue says:

    Underlying all of this analysis is computer code, and in a Policy Forum, Morin et al. (p. 159) call for this code to be made widely available and suggest how this might be implemented.

    … your claim is that the Editors allowed a group of authors to put out a very strong call for disclosure that the Editors plan to ignore, or don’t approve of?

    The article says (emphasis mine):

    As reliance on scientist-created software grows across scientific fields, the common practice of source code withholding carries significant costs, yields few benefits, and is inconsistent with accepted norms in other scientific domains. Changing this practice will require concrete and unambiguous policy action (see the table). Less definitive disclosure policies are unlikely to achieve desired results. For example, a recent article (31) makes a persuasive case for the necessity of source code release in reproducing scientific results, but fails to lay out efficacious policy recommendations likely to achieve significant and timely change in withholding practices.

    and

    Scientific journal publishers must enact editorial policies requiring, as a condition of publication, that researchers make available new computer source code generated in the course of the research and necessary to reproduce the published research findings. Policies in place at journals already meeting this requirement (16-18, 36) could provide guidance for wider implementation.

    Reference 16 is by Brooks Hanson, Andrew Sugden, and Bruce Alberts, who are respectively two Deputy Editors and the Editor-in-Chief of Science magazine … the other three are to the code requirements of the Journal of Biological Sciences, PLoS, and the Proceedings of the National Academy of Sciences … and you are trying to get us to believe that Science magazine is not endorsing this approach? That they are just putting it up for discussion, but they’re not going to follow the ideas because they don’t believe in them??

    Get real, Joel, your gyrations to try to establish your claims are getting embarrassing.

    w.

    PS—You seem to think that the appearance of the article in the “Policy Forum” section means the magazine editors don’t back the ideas, but actually the opposite is true. Not only are they backed by the editors, but according to Science guidelines for authors, items in the “Policy Forum” are mostly “commissioned by the editors”

  152. I am “Computer Challenged.” However, the one point I see in all this is that if I do a scientific experiment, I try to document exactly what I did, because I want someone to be able to reproduce it, whether tomorrow or a hundred years from now. The Millikan Oil Drop Experiment is a classic example.

    If ALL the information is not published, then the results are completely worthless to scientists in the future because they are nothing more than someone’s opinion. All the papers so far published without code and data attached will be useless a hundred years from now unlike Robert Millikan’s 1909 experiment.

    THAT is the critical issue in my opinion.

  153. joeldshore says:
    April 19, 2012 at 12:29 pm

    Willis Eschenbach says:

    Why do you think Science magazine recommends requiring that the code be provided?

    Look, it appears that you think that Science magazine and a host of scientists are 100% wrong in wanting to require computer code.

    Where has Science magazine made this recommendation? I hope you are not misinterpreting / misrepresenting the fact that Science published this article in their “Policy Forum” to mean that Science endorses all of its conclusions.

    Upon further research, I find that Science magazine made the recommendation in the cited reference 16 above, available here (paywalled), viz (emphasis mine):

    To address the growing complexity of data and analyses, Science is extending our data access requirement listed above to include computer codes involved in the creation or analysis of data.

    w.

  154. Eschenbach says:
    Windchaser, this has all come up and come to a head because for many things, being “happy to explain” what you do and how you do it isn’t enough to reproduce what you have done, for the reasons I spelled out above.

    The issue is whether the results are reproducible. For many studies, they are not reproducible without the computer code.

    Then either:
    1) The original work is incorrect, or
    2) the methodology is not clearly explained.

    The problem is that your method just leads to dueling claims, where nothing gets settled.

    Yes. Dueling claims happen all the time in science. But that’s hardly the same as “nothing gets settled”, because usually other people are willing to jump on board and run the experiments/simulations themselves. Controversy is what drives the scientific method, after all.

    The ‘currency’ of the publishing world is citations. Citations are a measure of your influence, success, and fame in this world, and are a part of what gets you more research money, a bigger lab, etc.

    A big controversy generates lots of papers and lots of citations, and getting in on the action by publishing is an easy way to build your academic career (provided your data is solid). Take a look at the recent faster-than-light neutrino controversy, for instance, where something like 150 papers were published in a few months after the controversial results came out, and see how many citations those initial papers received. Or look at the cold fusion story from the late ’80s and the flurry of publication that followed.

    Basically – it’d be great for you if you can show that some important piece of work is not reproducible (well, provided the errors are in their code and not yours). And there’s plenty of motivation for other people to come along and help resolve the controversy.

    And without your code, we’ll never know the difference. You want us to trust that your code actually does what you claim, but science isn’t built on trust.

    Of course it’s not. Which is why you write your own code, test it yourself, and run the simulations yourself.

    TonyG says:
    In your example, it’s like you built your OWN piece of lab ware, but didn’t provide all of the details about how it was built. Maybe you gave the measurements, but failed to specify the materials, and yours was made of aluminum while mine was made of plastic. Or maybe you gave all the details, but made a mistake in the construction.

    Nah. All the important details about how it’s built are given, so that other people can build one themselves.

    If you follow the exact instructions and get different results from your equipment, then one of us built it badly, right? Maybe I scratched the lens on the microscope I was building, or didn’t measure the focal length properly, or whatever. But even if I didn’t correctly follow my own methodology, it doesn’t mean that you get to come into my lab and inspect my equipment.

    Most of the potential problems you guys have raised, like bugs in the code, have nothing to do with publishing enough information about your methods, and everything to do with implementing those methods correctly.

  155. Willis,

    as far as I am aware, that is the base code for MBH98. You have cited a preliminary post McIntyre made in 2005, following its release. Since then McIntyre has made use of the code to check MBH98/99.

    A commenter at Science, under the Black Box abstract, says:

    much of research software is a brittle patchwork of barely working, very complex parts. Only the author, and maybe a few more people in the world (if they are willing to invest their time) can understand what’s going on. If you make it open-source, you’re committing to supporting that software forever. Imagine the horror of debugging this software through e-mail because some grad student doesn’t know how to get it to work.

    McIntyre has found it very difficult to work with the code Mann posted. This stuff (apparently) is not like commercial software; it’s buggy, purpose-built, and not easily adaptable. It’s not written for end users.

    The commenter raises an interesting point – if code is not user-friendly, will it be enough to provide it and sit back? If it works for the makers, but peer-reviewers can’t get it to function, what then? If journals will not publish unless user-friendly code is provided, then that is going to slow down research.

    Results that are reproduced using different methods are more robust than those obtained merely by repeating the same steps. This call for scientists to write and publish code that others can easily use for verification is a call from auditors, not researchers. I can understand serving the wants of the gatekeepers, but I’m not sure we’d get better science or scientists. We might get better programmers (for end users).

    BTW, I could only get the summary of the Science editorial you cited. That was about freeing data, not code. Is the rest of the article specifically asking for code?

  156. barry says:
    April 19, 2012 at 4:03 pm

    Willis,

    as far as I am aware, that is the base code for MBH98. You have cited a preliminary post McIntyre made in 2005, following its release. Since then McIntyre has made use of the code to check MBH98/99.

    Cite?

    And what do you mean by “base code”? Either that is the code that was used to do the Hockeystick calculations, or it isn’t. As far as I know, it isn’t. Calling it the “base code” is just a verbal trick, that statement means nothing.

    … BTW, I could only get the summary of the Science editorial you cited. That was about freeing data, not code. Is the rest of the article specifically asking for code?

    Most of the article is about data. The part I quoted was about how Science magazine is changing its editorial policies to require code.

    w.

  157. >>
    Windchaser says:
    April 19, 2012 at 11:53 am

    Why not just write your own code?
    <<

    I’ve read some naive statements on this thread, but this one takes the cake.

    Let’s say, for grins, that I wrote a program that thinks–it passes the Turing test with flying colors. You want to see MY code? Hey, why not just write your own?

    Besides, you just want my code so you can prove me wrong.
    /sarcasm off

    Jim

  158. Windchaser says:
    April 19, 2012 at 3:19 pm

    Eschenbach says:
    Windchaser, this has all come up and come to a head because for many things, being “happy to explain” what you do and how you do it isn’t enough to reproduce what you have done, for the reasons I spelled out above.

    The issue is whether the results are reproducible. For many studies, they are not reproducible without the computer code.

    Then either:
    1) The original work is incorrect, or
    2) the methodology is not clearly explained.

    Fallacy of the Excluded Middle. You left out … or

    3) the code contains bugs, or

    4) the code contains errors in logic unknown to the authors (see Michael Mann and the Hockeystick), or

    5) the authors are flat-out lying about what they did, or

    6) the authors described what they thought they did, but in fact they did something else, or

    7) Harry of the read-me file has “tuned up” the code without the knowledge of the authors, or

    8) the authors are shading the truth for business reasons, or

    9) the code has accidentally become corrupted in the process of copying, or

    10) the users of the code are not fully conversant with the limitations and oddities of the code, or

    11) bad input has led to Garbage In, Garbage Out, or

    12) the code is machine-specific, and produces different answers on different machines, or …
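    Point 12 is easy to demonstrate: floating-point addition is not associative, so a compiler or machine that reorders a sum can change the answer. A minimal Python sketch (my own illustration, not from any of the papers under discussion):

```python
# Floating-point addition is not associative: the same three numbers
# summed in a different order give measurably different results.
# Compilers and hardware are free to reorder operations, so "the same"
# calculation can legitimately differ from machine to machine.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

print(left_first == right_first)  # False
print(left_first - right_first)   # tiny, on the order of 1e-16
```

    In a long iterative calculation, discrepancies like this can compound, which is why bit-for-bit reproduction across machines can never simply be assumed.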

    You’re back on to “trust me” … but science is not about trust. It is about verification. The only way to verify that the authors have done what they think they have done is to reveal the code.

    Let me repeat what the article said, and remind you that the authors are not unknown like yourself. Unlike you, they signed their names to the study, and they are from the following institutions:

    Harvard Medical School, Boston
    School of Law, University of California, Berkeley
    Lawrence Berkeley National Laboratory, Berkeley
    Argonne National Laboratory and University of Chicago
    University of California, San Francisco, San Francisco
    University of Washington and Howard Hughes Medical Institute

    Here’s what they said again (emphasis mine):

    As reliance on scientist-created software grows across scientific fields, the common practice of source code withholding carries significant costs, yields few benefits, and is inconsistent with accepted norms in other scientific domains.

    You see that part about “… carries significant costs, yields few benefits, and is inconsistent with accepted norms …”?

    Now you are here, a random anonymous internet popup, to claim there are no costs, that there are lots of benefits, and that it is in fact consistent with scientific norms … sorry, but you’ll have to do better than that.

    Science mag sees the issues with code withholding. The six authors see the issues. Science magazine, PLoS, PNAS, and the Journal of Biological Sciences have changed their policies because of the issues. Most of the folks posting here see the issues.

    When I was a kid on the cattle ranch, they used to say “If one man calls you a horse, laugh it off. If two men call you a horse, think it over. If three men call you a horse … buy a saddle.” True to your alias, you’re chasing the wind on this question, my friend, you are clinging to a rapidly disappearing past, a time when scientists could be trusted and procedures were simple enough to describe in English. That time is gone forever.

    w.

  159. Willis Eschenbach says:

    PS—You seem to think that the appearance of the article in the “Policy Forum” section means the magazine editors don’t back the ideas, but actually the opposite is true. Not only are they backed by the editors, but according to Science guidelines for authors, items in the “Policy Forum” are mostly “commissioned by the editors” …

    I think that the editors of Science, like those of most journals, are wrestling with what to require authors to submit as supplementary materials along with their papers, and thus feel it is important to have an active and vigorous discussion about where to draw the line. I don’t think that the editors necessarily agree that the line should be drawn where the authors of that particular piece say it should be drawn. I would need a clearer endorsement of this from the editors before I believed it represented their views.

    Upon further research, I find that Science magazine made the recommendation in the cited reference 16 above, available here (paywalled), viz (emphasis mine)

    That editorial is rather strange to me in that they spend most of it talking about the issue of data and then throw in this one sentence that, if interpreted strictly, might seem to imply that essentially all code must be submitted. I would be rather surprised, however, to hear that Science is actually enforcing the strictest interpretation of that vague sentence, as it would raise all sorts of interesting questions about what papers they could publish and what they couldn’t. Basically, anything using any sort of proprietary code would be out if one adopted the strictest interpretation of that one vague sentence … and I imagine one could find all sorts of articles actually published in Science since that editorial was written that don’t comply with a very strict interpretation of that sentence. (And, as I recall, is it not true that you or others have complained that Science is not adhering to their own policy because they are in fact not enforcing that requirement in the strictest sense that you would like to see it enforced? Or am I remembering something involving a different issue or journal?)

  160. Willis Eschenbach says:

    The fact remains, your mouth made a bet that your resources can’t pay. You said you could give us links to Mann’s code. You can’t. Quit trying to wriggle out of it, it’s unseemly.

    barry gave you the link that I was thinking of. You and McIntyre seem to have some vague complaints about why this isn’t good enough … which are great at distracting everybody from the fact that Mann has released more code than any reasonable person could possibly ask for, and almost as much as a not-so-reasonable person like McIntyre could ask for (so much, in fact, that even McIntyre has to go back to papers over a decade old to find anything to complain about!). In the meantime, we have not a shred of code provided by Spencer and Christy. And, despite this, Mann is pilloried and Spencer and Christy are considered minor deities in the community! Go figure.

  161. Jim Masterson thinks that Windchaser is being “naive” for suggesting that you should write your own code, but actually that is what a proper scientific test requires.

    Think experiments : you want to check the Michelson-Morley experiment – do you say it is impossible because you do not have Michelson’s interferometer? No, you build a better one.

    A true check of an “experiment” which depends on a computer program does not require that you run the same calculations with the same program (and on the same machine/hardware/operating system/system libraries etc etc). Frankly that is a very limited check. A true check would insist on using an independent program with the same functionality to see if it gave the same answers.

    I am asked to referee papers in a branch of physical chemistry that often uses computer-generated results. Sometimes I see results which do not make sense. The authors claim they have used program X. Do I check them using program X? No, of course not; that would prove nothing. I use program Y instead. If the two then agree, I conclude that my intuition that the results were wrong was mistaken. If they disagree, I suggest to the authors that they reconsider the methodology. That is the proper way to do a check. Insisting that they produce the code is pointless, because who is going to look through a million lines of C++ or whatever? You test programs by comparing the results with another independent program.

  162. joeldshore says:
    April 19, 2012 at 5:25 pm

    … Mann has released more code than any reasonable person could possibly ask for and almost as much as a not-so-reasonable person like McIntyre could ask for (so much, in fact, that even McIntyre has to go back to papers over a decade old to find anything to complain about!

    That’s freakin’ classic. For years and years, Mann refuses time after time to provide his code. Finally, after almost a decade, he releases part of it … and then you whine that McIntyre is discussing a paper almost a decade old?

    Dude, if Mann had released it at the time, we wouldn’t have had to go back almost a decade. We only had to go back because he said that to ask him for his code was “intimidation”, and refused for almost a decade to release it.

    Talk about blaming the victim, man, that takes the cake. He hides his code for a decade, and then when it’s finally dragged from him kicking and screaming, you bitch about people discussing code a decade old?

    That’s one for the books, Joel, that’s hilarious. I still don’t get why you are defending a serial liar, a guy who destroys evidence, but by gosh, when you do you sure provide great entertainment. Keep them excuses coming, Joel, they just get better and better.

    w.

  163. Willis,

    I used the term “base code” because having read through a number of posts at climateaudit, McIntyre had to get supplementary information to make it work (apparently). Mann et al didn’t write the stuff to be user-friendly. But I’m very sure it is the source code for MBH98, as I think I can demonstrate, citing exclusively from climateaudit.

    Here’s a 2008 post where Steve makes use of the code and links to it. You’ll recognize the web address to the code.

    http://climateaudit.org/2008/04/05/squared-weights-in-mbh98/

    And other posts where he specifically mentions using the ‘source code’ for MBH98.

    “The rescaling step is definitely observable in the MBH98 source code archived in summer 2005.” (link)

    “Once again, Mann source code shows that they carried out operations which yield…” (link)

    Here is a post where a contributor at climateaudit transliterates the MBH98 fortran code to Matlab.

    “In the MBH source code, they apply steps…” (link)

    You get the picture.

    Weirdly, there are few (13) posts at climateaudit on Mannian source code, and interest seems to terminate mid-2008. Steve handily provides a category in the drop-down sidebar titled “MBH98″, subheading “Source Code.” You can check for yourself.

    Inconvenient to our discussion, Steve never says “oh, by the way, I’ve ascertained that the code is the actual one after all”, so I guess you could dance on the head of that pin if you really wanted, but I think reasonable readers will discern that McIntyre is using the code because he believes it’s the actual one, not “as if” it’s the code.

    IOW, the source code for MBH98 has been on the web since 2005; the public release occurred about 3 years after first requested by Steve McIntyre (his original request was for his own use, not public release).

  164. jimmi_the_dalek says:
    April 19, 2012 at 5:32 pm

    Jim Masterson thinks that Windchaser is being “naive” for suggesting that you should write your own code, but actually that is what a proper scientific test requires.

    Think experiments : you want to check the Michelson-Morley experiment – do you say it is impossible because you do not have Michelson’s interferometer? No, you build a better one.

    A true check of an “experiment” which depends on a computer program does not require that you run the same calculations with the same program (and on the same machine/hardware/operating system/system libraries etc etc). Frankly that is a very limited check. A true check would insist on using an independent program with the same functionality to see if it gave the same answers.

    Thanks, Jimmy. There are a couple of difficulties with your example. The main one is that in climate science we are almost never doing “experiments” in the traditional sense. Instead, we are taking some dataset X and subjecting it to an often extremely complex set of mathematical transformations.

    As a result, the usual ideas about “experiments” don’t hold. The first thing we need to do is to check and see if the authors actually did what they claim they did. And the only way to do that is to have access to their code and their data. Otherwise, for the 12 reasons listed above, we’re just spinning our wheels and wasting huge amounts of time. Michael Mann refused to reveal either his code or his data. Lonnie Thompson has made a living out of hiding his data. Bear in mind that those are the guys whose actions you are so vigorously defending …

    Ross McKitrick’s paper on UHI is a great example of the benefits of showing your work (just like you had to in high school, just like you have to in all true science). At one point Ross made an error in the code: he used degrees where radians were required. Such an error is impossible to dig out without having the code. But because he had made the code available, the error was discovered quickly.
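    To illustrate (a hypothetical sketch of my own, not Ross’s actual code, and the function names are invented): a degrees-versus-radians slip produces numbers that look entirely plausible, which is why it is so hard to catch from the published description alone:

```python
import math

def gridcell_weight_buggy(lat_degrees):
    # Bug: math.cos expects radians, but the latitude arrives in degrees.
    return math.cos(lat_degrees)

def gridcell_weight_fixed(lat_degrees):
    # Correct: convert degrees to radians before taking the cosine.
    return math.cos(math.radians(lat_degrees))

lat = 45.0
print(gridcell_weight_buggy(lat))  # about 0.525 -- plausible-looking, but wrong
print(gridcell_weight_fixed(lat))  # about 0.707
```

    Both outputs fall in the expected range, so nothing in the published results would flag the mistake; only a reader with the code can spot the missing conversion.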

    You are correct that doing that is a “limited check”, but it is a crucial check that should never be skipped. First off, before we see if we can do what they say they have done, we need to find out if they can do what they say they have done. If we find they have made some error such as the one Ross made, we can stop there and deal with that.

    Or if we find they’ve actually done what they described, then and only then can we go on to do further tests or repeat the analysis in some other manner, as an independent check.

    You seem to think that we should not check to see if they have made errors … I fear I don’t understand that idea at all. To use your example, suppose Michelson and Morley made some foolish error in their experimental setup … wouldn’t you want to know that before you set out to replicate their experiment? In the old days, you could look at M&M’s description of their experiment to determine that. Now, we need to look at the code to determine that … and you say it’s OK to hide the code. But if M&M had hidden their actual description of their experiment, where would you be? And that is exactly what you are advocating …

    In closing let me repeat, because it bears repeating, the judgement of the authors of the article, who said:

    As reliance on scientist-created software grows across scientific fields, the common practice of source code withholding carries significant costs, yields few benefits, and is inconsistent with accepted norms in other scientific domains.

    You and Joel are arguing against that. But neither of you have shown that hiding the code doesn’t carry significant costs. Neither of you have shown that hiding the code yields benefits. Neither of you have shown that hiding the code is consistent with scientific practice in other fields.

    And until you can do that, I’m going with the authors’ claims, because they seem self-evidently true.

    w.

  165. >>
    jimmi_the_dalek says:
    April 19, 2012 at 5:32 pm

    Think experiments : you want to check the Michelson-Morley experiment – do you say it is impossible because you do not have Michelson’s interferometer? No, you build a better one.
    <<

    So we should build a better one. How would we decide if a design is better–or worse? How would we know without comparing our design with the original? How does that work exactly?

    >>
    A true check of an “experiment” which depends on a computer program does not require that you run the same calculations with the same program (and on the same machine/hardware/operating system/system libraries etc etc). Frankly that is a very limited check. A true check would insist on using an independent program with the same functionality to see if it gave the same answers.
    <<

    As I said, there are many naïve comments on this thread. This is another one. You can do the same thing “functionally” in different ways. Some routines are approximate. Others are more exact–but most are approximate and will give slightly different answers.

    For example, if I say I’m using a function that computes the square root of a number, what EXACTLY does that mean? How many parameters are there to the routine? One? Two? Three? More? Do I return the answer as a function return, or in one of the parameters? Do I use integer arithmetic, floating point arithmetic, (or what they called fixed-point in PL/I)? (Did you know that standard floating point numbers are really a sparse representation on the number line?) Do I use only real positive numbers, real positive and negative numbers, or complex numbers? When I convert the internal module representation to logarithms (if I use logarithms) do I use complex numbers with degrees or radians? How do I handle any exceptions? Do I ignore them, stop processing, return zero, return one, return pi or e, or something else? Do I use single precision or double precision?

    I could probably go on for another few paragraphs, and that’s only one function.

    A program between 100 and 1,000 lines of code (a small program) would be impossible to duplicate without the original code. We haven’t even touched on the problems with chaotic systems.
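    As a concrete illustration of this point (a toy sketch of my own, not any published code): even for something as simple as a square root, two legitimate implementations need not agree bit for bit, because iteration counts and precision are implementation choices:

```python
import math
import struct

def sqrt_newton(x, iterations=5):
    # Newton's method for the square root; the final digits depend on the
    # (arbitrary) iteration count the implementer happened to choose.
    guess = x
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess

def to_float32(x):
    # Round a 64-bit result to single precision, as code built around
    # 32-bit floats would produce.
    return struct.unpack('f', struct.pack('f', x))[0]

x = 2.0
print(math.sqrt(x))              # full double-precision library value
print(sqrt_newton(x))            # agrees closely, but not guaranteed bit-identical
print(to_float32(math.sqrt(x)))  # visibly different in the trailing digits
```

    Every one of these is a defensible “square root”, yet a downstream calculation fed by one of them need not exactly reproduce a calculation fed by another.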

    >>
    I am asked to referee papers in a branch of physical chemistry which often using computer generated results. Sometimes I see results which do not make sense. The authors claim they have used program X. Do I check them using program X? No of course not, that would prove nothing. I use program Y instead. If they then agree I conclude that my intuition that the results were wrong was lacking. If they disagree, I suggest to the authors that they may re-consider the methodology. That is the proper way to do a check. Insisting that they produce the code is pointless because who is going to look through a million lines of C++ or whatever. You test programs by comparing the results with another independent program.
    <<

    So it took ten-plus years to write program X (a million lines of C++ code). You just happen to have program Y in your back pocket? I really doubt it.

    Jim

  166. “So it took ten-plus years to write program X (a million lines of C++ code). You just happen to have program Y in your back pocket? I really doubt it.”

    In my case I do – and several more programs as well. The area I work in has at least 5 or 6 major program packages that could be used to check results. The general point I am making is quite simple – demanding to see the code is probably the least efficient way to check the results. And don’t give me stuff about finite precision arithmetic – you don’t need to reproduce a result to 15 decimal places to know whether it is right or not.
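    The style of check described here can be sketched in a few lines (a hypothetical example of mine, not from any real package): compare two independent routes to the same quantity within a stated tolerance, rather than demanding 15 identical decimal places:

```python
import math

def results_agree(a, b, rel_tol=1e-6):
    # Cross-check two independent computations to a stated tolerance,
    # not to bit-for-bit identity.
    return math.isclose(a, b, rel_tol=rel_tol)

# Two independent routes to pi: a slowly converging series vs. the library value.
from_series = 4 * sum((-1) ** k / (2 * k + 1) for k in range(1_000_000))
from_library = math.pi

print(results_agree(from_series, from_library))                 # True: agree to ~1e-6
print(results_agree(from_series, from_library, rel_tol=1e-15))  # False: not identical
```

    The tolerance encodes a judgement about what counts as “the same answer”, which is exactly the judgement a referee running program Y against program X has to make.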

    What the BEST people did was much closer to the correct general approach. Yes, I know some people here don’t like their conclusions, but their approach of starting the analysis from scratch with an independent code was correct.

    And Willis, I am not arguing for hiding the code – I am saying believing that demanding its release will produce benefits is over optimistic. Nor should you lump me in with Joel – if he wants to reproduce Spencer’s results he should write a program (or persuade someone else to do so) to analyse the original data streams.

  167. barry says:
    April 19, 2012 at 6:58 pm

    Willis,

    just making sure you didn’t miss confirmation of MBH source code above. It would be good to resolve this bit of the conversation.

    http://wattsupwiththat.com/2012/04/17/the-journal-science-free-the-code/#comment-961342

    Thanks, barry. As an indication of the incompleteness of the code that Mann provided, consider this quote from the link you provided (emphasis mine)

    MBH98 did not mention any weighting of proxies anywhere in the description of methodology. Scott Rutherford sent me a list of 112 weights in the original email, so I’ve been aware of the use of weights for the proxies right from the start. Weights are shown in the proxy lists in the Corrigendum SI (see for example here for AD1400) and these match the weights provided by Rutherford in this period. While weights are indicated in these proxy lists, the Corrigendum itself did not mention the use of weights nor is their use mentioned in any methodological description in the Corrigendum SI.

    In one place, Wahl and Ammann 2007 say that the weights don’t “matter”, but this is contradicted elsewhere. For example, all parties recognize that different results occur depending on whether 2 or 5 PCs from the NOAMER network are used together with the other 20 proxies in the AD1400 network (22 or 25 total series in the regression network). Leaving aside the issue of whether one choice or another is “right”, we note for now that both alternatives can be represented merely through the use of weights of (1,1,0,0,0) in the one case and (1,1,1,1,1) in the other case – if the proxies were weighted uniformly. If the PC proxies were weighted according to their eigenvalue proportion – a plausible alternative, then the weight on the 4th PC in a centered calculation would decline, assuming that the weight for the total network were held constant – again a plausible alternative.

    But before evaluating these issues, one needs to examine exactly how weights in MBH are assigned. Again Wahl and Ammann are no help as they ignore the entire matter. At this point, I don’t know how the original weights were assigned.

    So in 2008, ten years after the publication of the Hockeystick, and three years after a US Congressional Committee finally forced Mann to archive the code, we still don’t know how Mann assigned the weights to the proxies … and that is central to the analysis, because all that the Mannian method does is assign final weights to the proxies.
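    For readers unfamiliar with the issue, here is a toy sketch (entirely hypothetical numbers, not Mann’s actual method) of why undisclosed weights matter: the same proxy series combined under different weight vectors give different reconstructions, so without the weights the result cannot be reproduced:

```python
# Five hypothetical proxy series (think of them as PCs), each with
# three time steps. The numbers are invented purely for illustration.
proxies = [
    [0.1, 0.3, 0.2],   # PC1
    [0.0, 0.1, 0.4],   # PC2
    [0.5, -0.2, 0.1],  # PC3
    [0.2, 0.2, -0.3],  # PC4
    [-0.1, 0.0, 0.2],  # PC5
]

def reconstruct(weights):
    # The "reconstruction" at each time step is simply the weighted sum
    # of the proxy values at that step.
    return [sum(w * series[t] for w, series in zip(weights, proxies))
            for t in range(len(proxies[0]))]

print(reconstruct([1, 1, 0, 0, 0]))  # only PC1 and PC2 contribute
print(reconstruct([1, 1, 1, 1, 1]))  # all five PCs contribute
```

    The two weight vectors correspond to the 2-PC versus 5-PC choices mentioned in the quote, and they yield different series; knowing the proxies without knowing the weights leaves the result underdetermined.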

    So once again I have to say, as far as I know we still don’t have all of the code necessary to see what Mann has done. Sure, there are a bunch of things we can learn and have learned from the code that Mann was finally forced to release. And Steve has done yeoman work to learn those things.

    But we still don’t have it all.

    My thanks for your perseverance,

    w.

  168. jimmi_the_dalek says:
    April 19, 2012 at 7:46 pm

    “So it took ten-plus years to write program X (a million lines of C++ code). You just happen to have program Y in your back pocket? I really doubt it.”

    In my case I do – and several more programs as well. The area I work in has at least 5 or 6 major program packages that could be used to check results. The general point I am making is quite simple – demanding to see the code is probably the least efficient way to check the results.

    You’re still not getting it. Suppose you check someone’s results by using another program that you have in your back pocket. Suppose your program gets different results. Both of you, of course, insist that you have the right answer.

    How on earth can you decide which one is correct without examining the code?

    This is particularly true in climate science, where in general there are exactly zero “major program packages that could be used to check results”. What “major program package” would you use to check Ross McKitrick’s UHI results, for example?

    This is a serious question, Jimmi. If you want to get some traction for your claims, you could start by telling us what “major program package” you’d use to check Ross’s work …

    If you can do that, then perhaps you could tell us what would happen if you got a different answer, and Ross refused to show his code. That’s exactly what has happened more than once in climate science. I see no way to resolve it without revealing the code.

    And Willis, I am not arguing for hiding the code – I am saying believing that demanding its release will produce benefits is over optimistic. Nor should you lump me in with Joel – if he wants to reproduce Spencer’s results he should write a program (or persuade someone else to do so) to analyse the original data streams.

    Jimmi, Steve McIntyre and I and a host of others spent a long time trying to show that the Hockeystick contained fatal flaws. It took years to do so, because Mann (like you) didn’t see any benefit in releasing the code.

    But if he had released it at the time, we would have been spared years of being beaten over the head with the Hockeystick by people claiming it was solid science.

    So I know for a fact, from my own experience and my own wasted time, that there are large benefits from releasing the code—at a bare minimum, ignoring all of the other benefits people have discussed above of saving time and allowing the discovery of errors, IT DISCOURAGES PEOPLE LIKE MANN FROM LYING ABOUT WHAT THEY DID.

    And if you don’t see that as a large benefit, you’re not paying attention. Perhaps whatever field you work in is not over-run with liars and crooks like climate science, perhaps in your field a host of the scientific leaders and main players are not self-confessed liars, cheats, and manipulators as they are in climate science.

    Here, we need such rules just to keep the animals in line.

    w.

  169. “Jimmi, Steve McIntyre and I and a host of others spent a long time trying to show that the Hockeystick contained fatal flaws. It took years to do so, because Mann (like you) didn’t see any benefit in releasing the code.

    But if he had released it at the time, we would have been spared years of being beaten over the head with the Hockeystick by people claiming it was solid science.”

    So why, ten years ago, did you and other climate sceptics not have a whip-round, hire a researcher for a year to recover the original data from the various weather stations (as the BEST people eventually did last year) so that there was no doubt about the data, and then write an analysis program? Wouldn’t that have been better than spending 10 years being beaten over the head?

    ” Perhaps whatever field you work in is not over-run with liars and crooks like climate science, perhaps in your field a host of the scientific leaders and main players are not self-confessed liars, cheats, and manipulators as they are in climate science.”

    Well yes, there is that – in my area most people appear to be reasonably honest. Perhaps the medics also have a problem, because when you look at that ‘diverse group of scientists’ that wrote this discussion paper, it turns out that the majority are in medical science/gene sequencing/protein structure research.

  170. Jim Masterson says:
    I’ve read some naive statements on this thread, but this one takes the cake.

    Let’s say, for grins, that I wrote a program that thinks–it passes the Turing test with flying colors. You want to see MY code? Hey, why not just write your own?

    ..Well, if you wrote down for me clearly and logically how you produced the code, then yeah, it should be reproducible. If it’s not, then you either didn’t provide enough detail, or you have bugs in your code.

    Call me naive if you want, but I deal with this in the real world: writing code to reproduce other people’s work, and having other people write code to reproduce mine. Our group’s code is about 5,000 lines.

    W Eschenbach says:
    Fallacy of the Excluded Middle. You left out … or

    3) the code contains bugs, or

    4) the code contains errors in logic unknown to the authors (see Michael Mann and the Hockeystick), or…

    Yeah.. those are all “the original work is incorrect”. If the bugs are causing you to produce bad results, then the original work is incorrect. Same for if the authors are lying, or the code contains other errors, or the authors are shading the truth, etc etc. Or if your code is machine-specific, and so is the other person’s, then one of yours must be wrong (well, at least one).

    You see that part about “… carries significant costs, yields few benefits, and is inconsistent with accepted norms …”?

    Now you are here, a random anonymous internet popup, to claim there are no costs, that there are lots of benefits, and that it is in fact consistent with scientific norms … sorry, but you’ll have to do better than that.

    Eh? You’re putting words in my mouth. I never said there were no costs. Of course there are costs. But I believe they’re outweighed by the benefits.
    As for it being consistent with scientific norms – how many major journals require you to publish your code right now? Maybe one or two, out of dozens or hundreds?

    The issue here is what economists would call “incentives”. Right now, many groups are successful because they developed something new and innovative. Of course they share their methods with other groups; that’s part of the scientific method. But being the first to develop a new code gives a group an advantage, and lets them publish new data and push forth the frontiers of science. That’s why they write the new code: to be the first to publish. It’s not for charity.

    Now, say that the first time you publish anything with your new code, you have to release the code. What will you do? You might wait as long as you think you can, and publish a bunch of papers all at the same time. Or you might just not write up new code at all, if it’s not worth your time.
    As with other kinds of ‘markets’, if you remove incentives for people to innovate, you’ll get less innovation.

  171. jimmi_the_dalek says:
    April 19, 2012 at 8:36 pm

    “Jimmi, Steve McIntyre and I and a host of others spent a long time trying to show that the Hockeystick contained fatal flaws. It took years to do so, because Mann (like you) didn’t see any benefit in releasing the code.

    But if he had released it at the time, we would have been spared years of being beaten over the head with the Hockeystick by people claiming it was solid science.”

    So why, ten years ago, did you and other climate sceptics not have a whip-round, hire a researcher for a year to recover the original data from the various weather stations (as the BEST people eventually did last year) so that there was no doubt about the data, and then write an analysis program? Wouldn’t that have been better than spending 10 years being beaten over the head?

    I guess my writing isn’t clear. We wanted to determine if the Hockeystick was calculated correctly. We didn’t want to make yet another reconstruction; we had no interest at all in doing that. That would be just one more thing to argue about. We wanted to see if the Hockeystick was true or false, not make Anotherstick.

    Because if we had taken your path, then we’d be back to dueling stories—Mann would claim he was correct, we’d claim we’d done it correctly, he’d say “you didn’t show I was wrong” … and he’d be correct. We would not have shown he was wrong.

    Let me try another way to explain it.

    Suppose I take dataset W and claim that I’ve subjected it to transformation X. If someone wants to falsify my claims, your suggestion is that someone who doesn’t believe me should take dataset Y and subject it to transformation Z.

    Sorry, Jimmi, but that makes no sense. How would doing a different analysis of a different dataset show anything at all about the Hockeystick? Let me say it real slowly so maybe it can get across:

    Science proceeds by falsification.

    Now, suppose we did as you suggest, we “recover the original data from the various weather stations (as the BEST people eventually did last year) so that there was no doubt about the data, and then write an analysis program”.

    How on earth would that falsify Mann’s work? You don’t seem to understand what we are about. We are engaged in falsification, which is at the core of science, and you want us to do original research instead. Certainly original research is important, but THAT’S NOT WHAT WE’RE DOING!

    w.

  172. Windchaser says:
    April 19, 2012 at 9:33 pm

    Jim Masterson says:

    I’ve read some naive statements on this thread, but this one takes the cake.
    Let’s say, for grins, that I wrote a program that thinks–it passes the Turing test with flying colors. You want to see MY code? Hey, why not just write your own?

    ..Well, if you wrote down for me clearly and logically how you produced the code, then yeah, it should be reproducible. If it’s not, then you either didn’t provide enough detail, or you have bugs in your code.

    Or the person provided plenty of detail, but he is lying.

    Or he provided enough detail but he didn’t understand what the code was doing.

    Or he described the code correctly and in adequate detail but he used the wrong input.

    Or he described the code in detail but didn’t realize there were incorrect constants.

    So … you think the person should describe the code so well it can be exactly reproduced … but you think the person shouldn’t have to reveal the code?

    What’s the difference, other than the huge chance that the English description and the computer language don’t correspond exactly?

    You know, Windchaser, there’s a reason why we don’t program computers in the English we use every day … it’s too vague. If you want to know what the program is doing, you need the program itself.

    w.

  173. >>
    Windchaser says:
    April 19, 2012 at 9:33 pm

    ..Well, if you wrote down for me clearly and logically how you produced the code, then yeah, it should be reproducible.
    <<

    I’ve never seen a specification so complete, clear, and logical that you could exactly reproduce a program. However, if your structured English/pseudo code reproduced all the features of your program, then you’re basically rewriting the code into wordier, structured English/pseudo code. Wouldn’t it be simpler to just provide the code?

    >>
    If it’s not, then you either didn’t provide enough detail, or you have bugs in your code.
    <<

    Or the original code has bugs not specified (there’s no such thing as bug-free code), details are missing, different assumptions are made, two programmers may read the same specification statement and program it differently, and so on.

    >>
    Call me naive if you want, but I deal with this in the real world: writing code to reproduce other people’s work, and having other people write code to reproduce mine. Our group’s code is about 5,000 lines.
    <<

    It’s interesting that your shop reverse-engineers each other’s code. Some people sure have unusual programming jobs.

    >>
    As with other kinds of ‘markets’, if you remove incentives for people to innovate, you’ll get less innovation.
    <<

    Unless you specifically put the code into the public domain, the code’s protected by copyright. If a company/university provided the funds to create it, then they hold the copyright. This is a false concern.

    Jim

  174. Windchaser says:
    April 19, 2012 at 9:33 pm

    Eh? You’re putting words in my mouth. I never said there were no costs. Of course there are costs. But I believe they’re outweighed by the benefits.
    As for it being consistent with scientific norms – how many major journals require you to publish your code right now? Maybe one or two, out of dozens or hundreds?

    You’re not paying attention. The paper itself listed four journals as examples, and I listed them again above. Why are you claiming there are only “one or two”?

    And the problems you claim (below) will happen? Perhaps you could provide examples, since at a minimum four journals already have the policy. Have those journals had the problems you claim below, or is that just your claim with no evidence?

    The issue here is what economists would call “incentives”. Right now, many groups are successful because they developed something new and innovative. Of course they share their methods with other groups; that’s part of the scientific method. But being the first to develop a new code gives a group an advantage, and lets them publish new data and push forth the frontiers of science. That’s why they write the new code: to be the first to publish. It’s not for charity.

    Now, say that the first time you publish anything with your new code, you have to release the code. What will you do? You might wait as long as you think you can, and publish a bunch of papers all at the same time. Or you might just not write up new code at all, if it’s not worth your time.

    As with other kinds of ‘markets’, if you remove incentives for people to innovate, you’ll get less innovation.

    I’m sorry, but in climate science I can’t even envision that happening, where someone has code so unique that they can re-use it over and over, and thus gain a publication advantage over other researchers. I don’t even see how that would work. If you’d just give us an example of that happening in climate science, someone keeping their code secret so they can re-use it, then I would understand what it is you are talking about, and we could discuss it.

    And if you can’t come up with an example of that, we can ignore it.

    In any case, my friend, you’re swimming upstream. Requiring the code is going to happen across the board, no matter how hard you might cling to your outdated ideas about science and your misunderstandings about falsification, so you might as well get used to it.

    w.

    Willis, you ask “How on earth would that falsify Mann’s work?” to my suggestion that what you should have been doing was getting your own codes. Well, to me it is very simple – you could have published it, and if Mann or anyone else objected, you would have been in a very strong position because you could have ‘shown your working’! At that point Mann et al would either have had to put up or shut up, short-circuiting 10 years of arguments. Science progresses by falsification, you say; well, that is partly true, but there is more than one way to do that – instead of a direct proof that a result is wrong, you can instead produce a result that is unambiguously correct, which is preferable as it is a win-win situation. As well as getting rid of the incorrect result, you produce the correct one.

    “You know, Jimmi, there’s a reason why we don’t program computers in the English we use every day. . it’s too vague. If you want to know what the program is doing, you need the program itself.”

    I know, that’s why people use maths instead. I prefer to publish the maths.

  176. @More Soylent Green! (April 19, 2012 at 9:27 am)

    Simulation trash code bait is poison. It’ll make folks sick and the resultant mess won’t make an attractive story. …But it appears many folks are all excited about this, so rather than rain further on the parade, I’ll just step aside. It’s possible that a few good stories will come out of all the slogging. All the best.

  177. Willis,

    I think you’re repeating what I’ve said, but with different emphasis. Is this the source code? Yes. Does it include proxy data, weighting methods and other details? No. McIntyre is not able to perfectly replicate MBH98 because he does not have every tittle of an old paper. But researchers have not archived every tittle of their work because the research community does not replicate to advance knowledge. What we have here is a clash of cultures: tax-style auditing born of a political interest (MBH98 would not be an issue if it hadn’t been in the TAR), and research science, where archiving old code is much less important (because the next researcher will write their own). These details bear on the main article.

    The implication I’m getting from you is that Mann et al have denied the auditors full disclosure despite having posted much of their material, whereas Spencer and co are making a genuine effort to make their code accessible, despite having produced nothing and saying it was too difficult. Now, I’ve put another slant on it here, but only to emphasise a point. There’s a way of describing this neutrally.

    You said:

    Where’s the link to his code for a) his Hockeystick paper, b) his 1999 paper

    You’ve been given the link. I got it from climateaudit, where Steve describes it as MBH98 source code. He even wrote a 2005 paper on MBH, citing that as the MBH98 source code, so you seem to be at odds with Steve on that. (Yes, more information is needed)

    (BTW, the 1999 paper IS the ‘hockey stick’ paper – the iconic graph in TAR)

    There’s quite a bit of meta behind our talk here, but that’s not unexpected – Anthony himself credited McIntyre for the emergence of the Black Box paper in the OP (I have no idea if McIntyre is actually mentioned in that paper). Speaking for myself, I think you’re a little strident on what is and isn’t so regarding MBH source code, which would be worth dropping the conversation for, except that the larger point about requiring code to publish is worth pursuing, and the difficulties Steve has had with what has been made available bears on that.

  178. Paul Vaughan says:
    April 20, 2012 at 12:40 am
    @More Soylent Green! (April 19, 2012 at 9:27 am)

    Simulation trash code bait is poison. It’ll make folks sick and the resultant mess won’t make an attractive story. …But it appears many folks are all excited about this, so rather than rain further on the parade, I’ll just step aside. It’s possible that a few good stories will come out of all the slogging. All the best.

    I am having trouble following you here, but I think you’re saying the code is trash and this concern with the code is unproductive.

  179. Or the person provided plenty of detail, but he is lying.

    Or he provided enough detail but he didn’t understand what the code was doing.

    Ok, fine, you got me there. The methodology could be incorrect.

    And how is this different from experiment? Experimentalists can misinterpret their results (really, it happens all the time) or lie about their methodology. You say that you just want ‘standard science’ applied, but you’re not asking to go in and inspect lab equipment, watch the scientists perform experiments, etc.

    So … you think the person should describe the code so well it can be exactly reproduced … but you think the person shouldn’t have to reveal the code?

    Aye, that’s it.

    What’s the difference, other than the huge chance that the English description and the computer language don’t correspond exactly?

    What’s the difference between a design for an engine and having the actual engine? A lot of work; that’s what. And to build an engine, you actually have to understand the design, whereas taking one apart requires a lot less work or understanding.

    In other words, before you can write your own program, you have to really understand the methodology. And that’s the kind of testing that the scientific method uses – for rigorous testing, you need other people who understand the method/concept as deeply as the first group. We don’t want everyone to just use the code without having to understand it; we want at least a few people to rigorously re-check it … and one of the best ways to ensure this is that they recreate the code, entirely from scratch.

    The other ‘best way’ is to have a *very* open, well-maintained and well-documented set of open code. But that would also take a lot of time to produce.

    You know, Windchaser, there’s a reason why we don’t program computers in the English we use every day … it’s too vague. If you want to know what the program is doing, you need the program itself.

    This is really not so.

    Jim Masterson says:

    I’ve never seen a specification so complete, clear, and logical, that you could exactly reproduce a program. However, if your structured English/pseudo code reproduced all the features of your program, then you’re basically rewriting the code into wordier, structured English/pseudo code. Wouldn’t it be simpler to just provide the code?

    Code is much longer than English or math.

    Here’s a rough example. Let’s say I was working on semiconductor physics in a transistor, and I said:
    “I used an implicit finite-difference method to solve the electron and hole concentrations, using Gummel’s method, over a 3-point centered difference grid”.

    This sentence, combined with the inputs (boundary conditions, system size, materials constants, timestep, etc), would be enough to *exactly* reproduce this work. Why? Because each of those phrases (implicit finite difference, Gummel’s Method, etc.) contains quite a large amount of information about what we do and how we do it. They’re well-established and well-defined in the literature. But the code would still take you a few weeks to write and test.

    I can tell you in a single line how to take a discrete Fourier transform. But how many lines of code would it take to program it?
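
    To make that point concrete: the DFT is defined in one line of math, X_k = Σₙ xₙ·exp(−2πi·kn/N), yet even the most naive transcription of that definition into code takes a function (and a production-quality FFT takes vastly more). A minimal sketch:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform, transcribed directly from the
    one-line definition X_k = sum_n x_n * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

# A constant signal transforms to a single spike at k = 0.
X = dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(v), 6) for v in X])   # [4.0, 0.0, 0.0, 0.0]
```

    One sentence of math, a dozen-plus lines of code even in the trivial case – which cuts both ways in this argument: the math is the compact specification, but only the code contains every detail of what was actually computed.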

    It’s interesting that your shop reverse-engineers each other’s code. Some people sure have unusual programing jobs.

    Not usually in-group replication of work, but replication of other groups’ work.

  180. Willis Eschenbach says:

    You’re not paying attention. The paper itself listed four journals as examples, and I listed them again above. Why are you claiming there are only “one or two”?

    You’re right, I’d missed that. But the paper lists only three journals out of the top twenty: Science, the Journal of Biological Chemistry, and the Proceedings of the National Academy of Sciences. As the authors note, 3/20 is pretty far from being ‘standard practices’.

    I’m sorry, but in climate science I can’t even envision that happening, where someone has code so unique that they can re-use it over and over, and thus gain a publication advantage over other researchers. I don’t even see how that would work. If you’d just give us an example of that happening in climate science, someone keeping their code secret so they can re-use it, then I would understand what it is you are talking about, and we could discuss it.

    And if you can’t come up with an example of that, we can ignore it.

    I’m not familiar enough with the climate model literature to say if examples exist, but I’m happy to give you examples of it from other fields. But generally, scientific simulations are dependent on inputs, so if you can produce results about one system, you can produce results about another similar system. So, that code that models one material could model another, or the code that folds one protein could fold a different one, etc. A big piece of code isn’t worth writing up if it’s only going to get you one publication.

    Really, though, we’re just talking basic economics here. You’re saying that we can remove incentives for work and it won’t have any effect. Has that ever happened? I’m sorry, the burden of proof for this one is on you. History is littered with stories of societies that tried to live by “hey, let’s all just share”.

    If we were only talking about climate modelling, I could see extra requirements for disclosure, as the public has a particular interest in verifying any science that turns into public policy. Basically, we need at least one of these:

    1) The climatologists fully reveal their code, or
    2) They stop pushing for policy changes, or
    3) Their work is reproduced independently before we enact any policy changes.

    In any case, my friend, you’re swimming upstream. Requiring the code is going to happen across the board, no matter how hard you might cling to your outdated ideas about science and your misunderstandings about falsification, so you might as well get used to it.

    Sure, it will happen, eventually. But it’s going to face fierce resistance unless someone comes up with a way of repaying groups for the hard work of coming up with new code. If you fixed that, I’d happily get on board, but as it stands now, your interests and mine are not aligned.

    Have you talked to actual, working scientists about this? I think you’d be surprised at the level of resistance. We had an invited speaker (from NIST, I think) come talk about open source a few months ago, and the questions afterward were mostly from people asking how this could possibly work, because they sure weren’t going to give their hard work over to their competitors.

    PS, regarding re-use of code for multiple publications:
    I have my reasons for anonymity, unrelated to this thread. But if you like, I can email you privately and share my real name, and you can see how my group has done with having our own code. I’m rather pleased: as an almost-finished grad student, I’m a co-author on papers in Nature Materials, Nature Physics, and Science. Having your own code can definitely be an advantage.

  181. More Soylent Green! (April 20, 2012 at 6:08 am) wrote:
    “I am having trouble following you here, but I think you’re saying the code is trash and this concern with the code is unproductive.”

    Might as well run stories on vomit entrails. This fascination is just plain sick. It will make WUWT look bad to sensible people.

    The zealousness to rip through a heaping pile of steaming trash is particularly creepy. Interest in code is fine, but mouth watering over trash? We expect that of creatures that scurry around in the dark at night. Rodents, bugs, …

    The community looks to be going off the rails here. I put in my 2 cents in the hopes of helping with sober checks & balances, but in the end if the community goes through with this ugliness, I’ll just step aside.

    I’ve NEVER run into a scenario where I needed to reproduce results and could not. Even when others have made mistakes, I’ve always been able to reproduce their results – by figuring out EXACTLY what mistake they made. In fact, I always used that ability to design efficient marking keys when I taught stats & marked thousands upon thousands of exams. I would design marking keys that anticipated (based on intuition & experience) every possible mistake students would make, showed what final result they would arrive at, and had a fair mark ready. I didn’t require my students to show their work; quite the contrary, I required them to BE CONCISE.

    I might VOLUNTEER code UNDER FAVORABLE CIRCUMSTANCES, but no one here has any leverage over me, so demands will meet (friendly) defiance. Favorable circumstances are far from existing at present. I don’t have enough time & resources to come even remotely close to meeting MY OWN standards for formal presentation. I can whip off informal research notes relatively fast and then handle precise technical questions informally, but it will be years – maybe decades, maybe never – before I have time to cosmetically & editorially engineer formal aesthetic structure tailored for a general audience. The ONLY reason to meet formal demands is a huge paycheck and pension.

    The advantage few are recognizing here is that informal communications are ruthlessly EFFICIENT, whereas formality will consume 99% of a budget IF ONE ALLOWS SUCH INEFFICIENCY. Beware the sink of inefficiency. I would suggest that checks & balances on budget expenditures are not only sensible but also NECESSARY. Waste creates trash. Consuming trash is unhealthy. Suggestion: Be efficient, friends.

    Operating on the corollary of the Pareto Principle is a SEVERE drain that gives ALMOST NO RETURNS (aside from fluffy cosmetics). In sharp contrast, operating lean & mean on the Pareto Principle (rather than its corollary) one can accomplish A LOT WITH SURPRISINGLY LITTLE.

    Is there no one here concerned with engineering efficient operations? Do we have access to some pile of infinite resources that no one is sharing with me? Is that why no one appears to be concerned with conserving precious time & resources by operating tactically & strategically – rather than squandering luxuriously?

    Good fun watching this code comedy at least. We all enjoy a good laugh. Nature’s beauty is simple & compact. Her secrets aren’t wastefully encoded in a heaping pile of stinking CAGW code. Why waste time looking for something in a place where it’s guaranteed to not be? Paparazzi-style politics at its worst, perhaps.

    But have your sickening fun if that’s all you aspire to do with your freedom. Maybe we’ll one day lose our freedom for the simple reason that we wasted it.

    All the Best.

  182. jimmi_the_dalek says:
    April 19, 2012 at 11:38 pm

    Willis, you ask “How on earth would that falsify Mann’s work?” to my suggestion that what you should have been doing was getting your own codes. Well to me it is very simple – you could have published it, and if Mann or anyone else objected, you would have been in a very strong position because you could have ‘shown your working’ ! At that point Mann et al would either have had to put up or shut up, short circuiting 10 years of arguments.

    Thanks, Jimmi. It seems that you agree that publishing yet another reconstruction would NOT falsify Mann’s paper, which is what I was saying.

    You say that if someone posted another paper, Mann would have to “put up or shut up” … why on earth do you think that would put even the slightest pressure on Mann? He was in the catbird seat, he had finagled his “Hockeystick” into the IPCC report and it had become the icon of the movement … why would he say anything?

    The part you seem to be overlooking is that Mann is a fraud. He knew before he published the Hockeystick paper that the data didn’t support it, particularly the 15th century data. He knew about (and lied to Congress about) the abysmal failure of his work to pass simple statistical tests.

    As a result, there was no way that simply publishing yet another reconstruction would have forced him to do a damn thing. He was not about to reveal that he was a fraud.

    And this is the problem with your proposal that code not be required. You assume that the scientists are honest, which marks you as pretty clueless about climate science. Part of the reason I want to see the code required, as I said above in capital letters, is to prevent people like Michael Mann from making a mockery of the scientific process by lying and cheating.

    So your idea, that someone else publishing yet another reconstruction would somehow force Mann to admit that he is a liar and a fraud, is ridiculous. For example, my friend and co-author Craig Loehle published another reconstruction, and a good one, called A 2000-Year Global Temperature Reconstruction Based on Non-Treering Proxies.

    You can read what Mann said about Craig’s reconstruction here … and if after you read that you still think that someone publishing another reconstruction will do anything other than lead Mann to slime the author’s name in any sleazy way he can, you’re not following the story …

    w.

  183. Windchaser says:
    April 20, 2012 at 11:47 am

    … Code is much longer than English or math.

    Here’s a rough example. Let’s say I was working on semiconductor physics in a transistor, and I said :
    “I used an implicit finite-difference method to solve the electron and hole concentrations, using Gummel’s method, over a 3-point centered difference grid”.

    Thanks, Windchaser. The problem in climate science is twofold.

    First, many of the top people in the field are not statisticians, but they are writing papers full of statistics … and as a result they make errors.

    So Michael Mann says quite clearly that he is using principal components analysis.

    Unfortunately, he’s made a newbie mistake in the analysis … he didn’t center the matrix first. No way to determine that from the outside.

    You are right, the code for this is much, much longer than saying “I’m using principal components analysis” … which is why we program in code rather than English. The code contains all the details, including the errors of which the author is totally unaware.
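To make the centering point concrete, here’s a toy sketch (my own illustrative Python on made-up data, nothing whatsoever to do with Mann’s actual code) showing how skipping the centering step turns the leading principal component into an artifact of the column means:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 "proxy" series of length 200, each with a large mean offset.
n_obs, n_series = 200, 50
offset = rng.normal(5.0, 1.0, size=n_series)          # per-series mean levels
X = offset + rng.normal(0.0, 1.0, size=(n_obs, n_series))

# Correct PCA: subtract each column's mean before the SVD.
Xc = X - X.mean(axis=0)
_, _, Vc = np.linalg.svd(Xc, full_matrices=False)

# "Uncentered" PCA: SVD straight on the raw matrix.
_, _, Vu = np.linalg.svd(X, full_matrices=False)

# The uncentered leading component mostly just points at the column means,
# not at any real pattern of covariation among the series.
mean_dir = X.mean(axis=0) / np.linalg.norm(X.mean(axis=0))
print(abs(Vu[0] @ mean_dir))   # near 1: PC1 is just the means, an artifact
print(abs(Vc[0] @ mean_dir))   # much smaller: centering removed the artifact
```

Both runs call themselves “principal components analysis”, and nothing in an English-language methods section would distinguish them.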

    Second, many of the top people in the field are serial liars who are victims of Noble Cause Corruption. They think nothing of bending the truth to and past the breaking point, because they are Saving The World!!! and are glad to inform you of that.

    So Michael Mann says quite clearly that he never calculated the R^2 for his reconstruction … and that’s a total lie.
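For the record, the statistic at issue is not exotic. A hypothetical sketch (purely illustrative data, not MBH’s code or numbers) of a verification R²:

```python
import numpy as np

def verification_r2(recon, instrumental):
    """Squared Pearson correlation over the verification period."""
    r = np.corrcoef(recon, instrumental)[0, 1]
    return r ** 2

# Hypothetical example: a reconstruction that tracks its target only loosely.
rng = np.random.default_rng(1)
target = np.cumsum(rng.normal(size=100))
recon = target + rng.normal(scale=5.0, size=100)
print(round(verification_r2(recon, target), 3))
```

A statistic that takes four lines to compute is not one you “never calculated”.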

    That’s why we need the code. Because finding the errors is part of science, and in many cases you simply cannot find the errors without the code.

    In response, you say:

    You say that you just want ‘standard science’ applied, but you’re not asking to go in and inspect lab equipment, watch the scientists perform experiments, etc.

    The issue is replication. With an experiment with beakers and glassware, we can duplicate the experiment.

    But with an “experiment” which is nothing but a pile of data and some rules for mathematical transformation of the data, to duplicate it we need two things:

    1. The data, and

    2. The transformations. Not some English-language claim about the transformations, that’s orders of magnitude too vague to be of use.

    And that is why people like Michael Mann and Lonnie Thompson and far too many of the un-indicted co-conspirators hide their data and their code … and you keep claiming that hiding those is perfectly fine.

    I don’t get it. I don’t see how on earth you could determine if say the Hockeystick is valid or not without having access to the data and the code. You can’t do it by trying it yourself using your data and your code, that doesn’t verify or falsify one damn thing about the Hockeystick.

    So you tell me, Windchaser, or Jimmi, or anyone—how would you falsify the Hockeystick without access to the data and the code and when the authors don’t answer questions or emails about their work?

    I await your method of falsification of the Hockeystick …

    w.

    PS—Here’s another way to explain it that might make sense. You would admit that a scientific experiment is invalid without a detailed description of the experimental setup, detailed enough to allow the setup to be duplicated.

    But in climate science and some other fields, all we have is data and code, and the code IS the experimental setup.

    Now, Michael Mann described his code … but not in anywhere near enough detail to allow it to be duplicated. In fact, for a computer program of any complexity, it is hugely difficult to describe it in enough detail that someone can duplicate it. You can say, for example, as you do above:

    I used an implicit finite-difference method to solve the electron and hole concentrations, using Gummel’s method, over a 3-point centered difference grid

    But that may or may not correspond to the actual code, and more to the point, you may be totally unaware that it doesn’t correspond to the actual code. Your code may have gotten slightly corrupted along the line.

    So I come along, I follow your description exactly … and I get a slightly different answer. Not a lot different, this may be one of the cases where the error is trivially small.

    So I post my results, and you say “no, you’re wrong, we got a different answer … it’s probably a rounding error.”

    So we go away happy, thinking the scientific method is working, where in fact your code is flawed. Then in the next instance, the next time it comes up, your code and mine give widely differing answers …

    … then what? We’ve already agreed that my code and your code were both good, but now they’re not giving the same answer.

    Please tell me how that can be resolved without an examination of your code and my code, to see where they differ.
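To make the point concrete, here’s a toy sketch (entirely hypothetical, nothing to do with Windchaser’s actual semiconductor code) of two faithful readings of the same one-line English spec, “second derivative by 3-point centered differences”, that agree in the interior but quietly diverge in any whole-grid summary:

```python
import numpy as np

def d2_drop_endpoints(y, h):
    # Reading A: apply the centered stencil only where it fits;
    # drop the two endpoints from the output.
    return (y[2:] - 2 * y[1:-1] + y[:-2]) / h**2

def d2_copy_endpoints(y, h):
    # Reading B: same interior stencil, but pad the ends by copying
    # the nearest interior value so the output keeps the full length.
    interior = (y[2:] - 2 * y[1:-1] + y[:-2]) / h**2
    return np.concatenate([[interior[0]], interior, [interior[-1]]])

x = np.linspace(0.0, np.pi, 101)
h = x[1] - x[0]
y = np.sin(x)

a = d2_drop_endpoints(y, h)
b = d2_copy_endpoints(y, h)

assert np.allclose(a, b[1:-1])   # the interiors agree exactly ...
print(a.mean(), b.mean())        # ... but whole-grid summaries differ
```

Both implementations honestly match the verbal description; only a line-by-line comparison of the code reveals where they part company.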

    The problem is that in far too many cases, unless you are willing to show your code, your work simply cannot be falsified, and thus it is not science. That’s how Mann got away with the Hockeystick fraud—because he hid the data and the code, nobody could falsify it.

    That is the path you are recommending, and that, my friend, is not science …

  184. Paul Vaughan says:
    April 20, 2012 at 12:23 pm

    … I’ve NEVER run into a scenario where I needed to reproduce results and could not. Even when others have made mistakes, I’ve always been able to reproduce their results – by figuring out EXACTLY what mistake they made.

    Hey, that’s great Paul. In that case, I’m sure you can figure out the method that Michael Mann used to set the weights for the proxies in his reconstruction. No one else has been able to do so, but you sound like just the man for the job. The puzzle is spelled out here, the issue is:

    But before evaluating these issues, one needs to examine exactly how weights in MBH are assigned. Again Wahl and Ammann are no help as they ignore the entire matter. At this point, I don’t know how the original weights were assigned. There appears to be some effort to downweight nearby and related series. For example, in the AD1400 list, the three nearby Southeast US precipitation reconstructions are assigned weights of 0.33, while Tornetrask and Polar Urals are assigned weights of 1. Each of 4 series from Quelccaya are assigned weights of 0.5 while a Greenland dO18 series is assigned a weight of 0.5. The largest weight is assigned to Yakutia.

    So how were the weights assigned, Paul? You implicitly claim you can reproduce these results of Mann’s … time to put your money where your mouth is. I await your report. Me, I think you’re just a braggart, hugely impressed with yourself because you can outguess most of your students, but I’m more than happy to have you prove me wrong …
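Just to illustrate why the weights matter, here’s a toy sketch (random stand-in data; the first weight vector merely echoes the numbers quoted above and is NOT the actual MBH weighting) showing that two plausible weightings of the very same proxies give visibly different reconstructions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Five hypothetical proxy series (rows: years, columns: proxies).
proxies = rng.normal(size=(500, 5))

# One weighting echoing the quoted values (0.33 for a precip series,
# 1.0 for a tree-ring series, 0.5 for an ice core) vs. equal weights.
quoted_like = np.array([0.33, 0.33, 0.33, 1.0, 0.5])
equal = np.ones(5)

def weighted_recon(proxies, w):
    # Normalize the weights and form the weighted-average reconstruction.
    return proxies @ (w / w.sum())

r1 = weighted_recon(proxies, quoted_like)
r2 = weighted_recon(proxies, equal)

# Same data, different weights, different reconstruction:
print(np.corrcoef(r1, r2)[0, 1])   # clearly below 1
```

Without knowing how the weights were assigned, anyone “replicating” the result is just guessing at this step.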

    w.

  185. Windchaser says:
    April 20, 2012 at 12:11 pm

    Have you talked to actual, working scientists about this? I think you’d be surprised at the level of resistance.

    Sure. I come from a family of actual, working scientists. My cousin (until his retirement) was a high-temperature physicist. My older brother is a gifted scientist, who was one of Discover Magazine’s scientists of the year, and who used to head up one of the two Hewlett Packard research labs until he was headhunted by Trimble.

    My brother is a good example. He had a fistful of patents, both in the name of the company, and in his own name. He also was responsible for a number of discoveries that weren’t patented … because to patent them you have to disclose them, and either he or the company was unwilling to do that. That’s the nature of disclosure, it’s a choice, and it’s one that many “actual, working scientists” don’t want to make for very sound reasons. More power to them.

    The same was true about the journals. My brother published very little in the journals, for the same reason—he didn’t want to disclose the nature of what he had discovered and done. And as an “actual, working scientist” he did not have the “publish or perish” requirements of college professors.

    So I fear that I have little sympathy for your argument that requiring code will somehow cripple scientists. If there are advantages to keeping a discovery secret, those advantages exist whether or not the journals require code.

    w.

  186. But in climate science and some other fields, all we have is data and code, and the code IS the experimental setup.

    Right. Good, we agree on that.

    Now, Michael Mann described his code … but not in anywhere near enough detail to allow it to be duplicated. In fact, for a computer program of any complexity, it is hugely difficult to describe it in enough detail that someone can duplicate it.

    Not usually true. As I mentioned before, we describe parts of a code like parts of a machine or an experiment: Over here we have part A, which does ___ and works by principles A1, A2, A3, and over here we have this other part, part B, which does..
    You get the idea.

    But that may or may not correspond to the actual code, and more to the point, you may be totally unaware that it doesn’t correspond to the actual code. Your code may have gotten slightly corrupted along the line.

    Exactly! If the code is the experimental setup, and the code goes wrong somewhere along the line, then it’s like an experimental setup going wrong. Something like mislabeling your 1M hydrochloric acid as 10M, or maybe the daily humidity affects your experiment, or maybe you misweighed your reagents or mismeasured the temperature.

    In which case, your experiment may not be repeatable even if you provide your methodology, right?

    But if I understand you correctly, you’re not asking experimentalists to open up their labs. And understandably so; it’s even more of an imposition. But don’t say that what you’re asking is SOP for the scientific method.

    So we go away happy, thinking the scientific method is working, where in fact your code is flawed. Then in the next instance, the next time it comes up, your code and mine give widely differing answers …

    … then what? We’ve already agreed that my code and your code were both good, but now they’re not giving the same answer.

    Please tell me how that can be resolved without an examination of your code and my code, to see where they differ.

    Well, say you look at your code, and he looks at his, and you both say that yours is right. As I see it, you have two options:
    1) You publish your code and let others inspect it. If no major flaws are found in your code, you win.
    2) You try to get others to come along and make their own code, and you compare the results. This is the “normal scientific method” and comparable to a normal experimental reproduction: when you get a big result, other groups will often attempt to independently reproduce your work. Why? Two reasons:
    a) They want to “get in” on the new findings. The first paper never covers all the interesting or important details of the find.
    b) If you’re wrong, they get to be the first to publish it.

    Inviting others into your lab and letting them use your equipment wouldn’t help much. If you still had an error in your equipment, they’d “reproduce” the results, but they’d still be reproducing the wrong results. Maybe they’d find the error, but then again, maybe they wouldn’t..
    This is why independent verification is the standard.

    The problem is that in far too many cases, unless you are willing to show your code, your work simply cannot be falsified, and thus it is not science.

    This is completely wrong, I’m afraid. Say Pons and Fleischmann had never let anyone into their cold fusion lab, but they completely and fully described their setup to others. If one, then two, then dozens of other groups tried to reproduce their results, and couldn’t? Those results are gone, baby. Dead. They’re falsified.

    In fact, Pons and Fleischmann did eventually let people into their lab, because they wanted to uphold their cold fusion claims. And yes, in their lab you could reproduce some of their results, but.. not very well, and the experimental setup was somewhat sloppy. So, although we never really found out what was wrong with the experimental setup, the lack of reproducibility by independent groups means that cold fusion was a no go.

    That’s how Mann got away with the Hockeystick fraud—because he hid the data and the code, nobody could falsify it.

    Note that withholding input data and withholding code are two very different things; input data is a necessary part of ‘methods’, just like a full description of the analysis performed on the data. But as you could write a code to perform the analysis yourself, they don’t have to give you their code.

    Basically, the authors are obliged to make sure that all the information necessary to reproduce their work is either included in the paper or in the references. If that fails, if the information they provided is not enough, then their results are in doubt.
    If you manage to find errors in an important paper, or that the results are significantly wrong, write it up and publish it. It’ll be good for your career.. well, assuming you’re the one who’s right.

  187. So I fear that I have little sympathy for your argument that requiring code will somehow cripple scientists. If there are advantages to keeping a discovery secret, those advantages exist whether or not the journals require code.

    It won’t “cripple” all scientists, just some.

    Say you have one guy who spends 2 years developing and testing some amazing new code, and you have another guy who spends 2 years developing and testing an amazing new microscope. One of these guys will publish, publish, publish, keeping his advantage until someone else comes along and spends the time and money to reproduce the (now publicly available) microscope designs. The other one will publish once, and immediately others will start using his code to do all the research he didn’t have time to do. The first mover advantage becomes vastly different for these two scientists.

    Yes, science as a whole would be helped by the code being available. But right now, the scientific funding model doesn’t recognize those scientists’ contributions as being significant. You get no kudos or grant money for writing code that others use; you get grant money for publishing.

    Since there’s no money for writing code without publishing, there will soon be no professors who do this. If they try, they’ll lose their funding, and cease being professors.

    Like “survival of the fittest”, it’s a tautology that there will be few professors who repeatedly do things that are unhealthy for their career. And research careers depend on money, which comes from publishing. There’s a reason we say “publish or perish”.

    If you change the system to reward publication of code, this won’t be a problem. But don’t think that just requiring code publication will be enough: you need to actually reward it, in some way that gives continued funding to those scientists who are willing to develop and publish it.

    And as an “actual, working scientist” he did not have the “publish or perish” requirements of college professors.

    Yep. That makes your comparison rather moot, I think. Sorry.

  188. Joel Shore up above said something that I’ve been mulling over. He said:

    As has been explained many times, “replication” has traditionally not meant what McIntyre and you and many other skeptics have defined it to mean.

    I said I didn’t care what it was called. It is important to be able to follow someone’s steps exactly.

    I think that the misunderstanding revolves around the idea of falsification. Here’s the problem.

    If a claim cannot be falsified, it is not a scientific claim.

    Take something like Mann’s Hockeystick. As long as he was successful in hiding his code from public examination, it was impossible to falsify his claim.

    You can’t falsify his claim in the traditional manner, by trying to replicate his calculations with the same data. If your results are different from his, he can just say “you didn’t do it right”. That’s what happened with Craig Loehle, for example.

    You see, Mann’s problem (inter alia) was bad math. As long as his code was hidden, it was a guessing game. Nobody could prove that he had done anything wrong. His work was not possible to falsify.

    Once his code came to light, however, it was quickly shown to contain a fatal math error that made it mine hockeystick shapes out of even a random dataset … and there was no way to tell that without access to his code.
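For anyone who wants to see the mechanism, here’s a toy sketch (my own illustrative Python, not Mann’s code; the 80-year “calibration” window is just a stand-in) of what short-segment centering does to pure red noise:

```python
import numpy as np

rng = np.random.default_rng(3)

# 70 red-noise "proxies" of 580 "years" -- no climate signal whatsoever.
n_years, n_series, phi = 580, 70, 0.9
X = np.zeros((n_years, n_series))
for t in range(1, n_years):
    X[t] = phi * X[t - 1] + rng.normal(size=n_series)

# Conventional PCA centers each series on its full-record mean ...
full = X - X.mean(axis=0)
# ... whereas "short" centering uses only the final 80-year
# calibration segment -- the error at issue.
short = X - X[-80:].mean(axis=0)

_, _, Vf = np.linalg.svd(full, full_matrices=False)
_, _, Vs = np.linalg.svd(short, full_matrices=False)

def step(pc):
    # Offset between the 500-year "shaft" and the 80-year "blade",
    # in units of the component's own standard deviation.
    return abs(pc[:500].mean() - pc[-80:].mean()) / pc.std()

# Short centering rewards series that happen to drift in the final
# segment, so its PC1 typically carries a large spurious shaft-to-blade
# step even though the input is pure noise.
print(step(full @ Vf[0]), step(short @ Vs[0]))
```

The two centering choices also pick genuinely different leading components from the same data, which is the whole point: you cannot tell from the paper which one was used.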

    So let me be clear why it should be a requirement that computer code be published along with the study—without it, your work cannot be falsified.

    If someone wants to milk their code for all it is worth, that’s fine. Yes, there could be a cost in that, they may delay publication.

    But a late publication that is falsifiable is science. An early publication without the code is not science, because it is not falsifiable. Easy choice.

    I understand the issues and the questions that you guys raise. Yes, there will be implementation issues. I do not deny that there will be costs for some people. If you’ve been publishing unverifiable stuff, sure, you’re gonna scream about being brought to account. On the other hand, if you want to work with your code until you get every possible advantage, you’ll just have to wait to publish until you’ve done that. That’s a cost/benefit equation you’ll have to work out for yourself.

    The issue that has forced Science magazine to take action is that far too often, unfalsifiable claims are being published these days … and the cost of that pseudo-science, to the public as well as to science itself, is huge.

    Here’s the short version. Scientists who publish but won’t reveal their code are peddling unfalsifiable non-science … and some of you folks, who claim to be scientists, are defending that …

    ??

    w.

  189. @Willis Eschenbach (April 20, 2012 at 2:40 pm)

    Necessity’s the mother of invention. (The key word: “need”.) You’d have to set me up with an absolutely permanently guaranteed $100+k/year with sweet pension to get me to comb through that ugliness. Even then it would be a waste of time & effort that could be tolerated only to maintain financial security. I’ve seen nothing that would lead me to believe nature’s secrets are encoded where you direct me. I’ll volunteer this much: You would NOT need code to figure that out. You would only need DEEP, SOUND conceptual foundations. Suggestion: Less luxurious attention to distracting sports and more strategic, tactical focus on the war of survival. I acknowledge that your interests may lie more in the area of climate politics whereas mine are firmly rooted in the area of nature exploration. All the best.

  190. Willis,

    As has been pointed out, a claim can be falsified without having access to all the code. Is it harder than if the author gives you access to all the code? Sure, but there are also advantages to having other scientists investigating the claims write their own code rather than just “auditing” the code. This concept of auditing other people’s code is not really what the whole replication notion was meant to entail.

    I am not dead-set against requiring the release of code, but I think there are lots of significant questions that need to be addressed both regarding the necessity of such a requirement and how it can work the way science is currently practiced. From my work at Kodak, I can say that there was little enough internal incentive to publish externally as it was; it would become a lot worse if scientists had to convince their managers to allow them to release all of the computer code related to the publication!

    Here’s the short version. Scientists who publish but won’t reveal their code are peddling unfalsifiable non-science

    I think that is a harsh thing to say about Spencer and Christy, but if you feel that way, then I guess you need to take it up with them.

  191. Paul Vaughan says:
    April 21, 2012 at 10:57 am

    @Willis Eschenbach (April 20, 2012 at 2:40 pm)

    Necessity’s the mother of invention. (The key word: “need”.) You’d have to set me up with an absolutely permanently guaranteed $100+k/year with sweet pension to get me to comb through that ugliness. Even then it would be a waste of time & effort that could be tolerated only to maintain financial security. I’ve seen nothing that would lead me to believe nature’s secrets are encoded where you direct me. I’ll volunteer this much: You would NOT need code to figure that out. You would only need DEEP, SOUND conceptual foundations. Suggestion: Less luxurious attention to distracting sports and more strategic, tactical focus on the war of survival. I acknowledge that your interests may lie more in the area of climate politics whereas mine are firmly rooted in the area of nature exploration. All the best.

    Gosh, Paul, I see that what I thought was just boasting was actually 100% factual. You said:

    … I’ve NEVER run into a scenario where I needed to reproduce results and could not. Even when others have made mistakes, I’ve always been able to reproduce their results – by figuring out EXACTLY what mistake they made.

    Now, having seen you run as fast as you can away from a challenge to reproduce some actual result, I’m sure your boast is true.

    I’m now convinced that as you say, you’ve “NEVER run into a scenario” where you could not reproduce the results … because I’ve witnessed the speed at which you run away from such scenarios. It all makes sense now.

    w.

  192. joeldshore says:
    April 21, 2012 at 1:48 pm

    Willis,

    As has been pointed out, a claim can be falsified without having access to all the code.

    No, Joel, that hasn’t “been pointed out”. I have asked repeatedly in this thread how you could falsify the Hockeystick without access to the code. Neither you nor anyone has answered, so your claim is false.

    w.

  193. joeldshore says:
    April 21, 2012 at 1:48 pm

    Here’s the short version. Scientists who publish but won’t reveal their code are peddling unfalsifiable non-science

    I think that is a harsh thing to say about Spencer and Christy, but if you feel that way, then I guess you need to take it up with them.

    RSS found an error in S&C’s code, so it is obvious that they have revealed their code, or RSS couldn’t have done that. Since RSS falsified part of the S&C code, the S&C code is OBVIOUSLY FALSIFIABLE … and since I pointed this out before, I do wish you’d pay closer attention.

    As a result of a portion of their code being falsified, it is patently clear that S&C are not “peddling unfalsifiable non-science”.

    w.

  194. joeldshore says:
    April 21, 2012 at 1:48 pm

    … I am not dead-set against requiring the release of code, but I think there are lots of significant questions that need to be addressed both regarding the necessity of such a requirement and how it can work the way science is currently practiced. From my work at Kodak, I can say that there was little enough internal incentive to publish externally as it was; it would become a lot worse if scientists had to convince their managers to allow them to release all of the computer code related to the publication!

    So what? Seriously, so what if some Kodak guy can’t publish unfalsifiable claims? That just reduces the amount of anti-scientific, unsupported, unverifiable BS I have to wade through in the journals.

    Kodak guys publishing claims without code are just stroking their egos. They are not adding to the world’s knowledge, because without the code their claims are nothing but anecdotes. I see that you think they should be free to publish such anecdotes as if they were actually science … me, not so much.

    So I hope that you are correct, and that as a result the number of anecdotes published in the journals goes down as more and more of the journals require code.

    w.

  195. Willis “If a claim cannot be falsified, it is not a scientific claim.”

    Oh dear, Popper’s rather simplistic view of science strikes again. However, let that pass for the moment.

    OK, some of Mann’s work is not science (that would make a good quote if truncated there…), rather it is maths. The process of taking a lot of data and fitting a curve to it is not science, it is maths, and the computer program is the equivalent of all those scribbles on the back of the exam paper. The science comes in the interpretation of the curve. So since it is maths there are two ways, at least, of showing it is wrong. Imagine you are a maths teacher. You could sit down and go through the student’s scribbles line by line, or you could present him with your working instead, and say “Here, this is how it should be done, go find your mistake”. Both ways show the original answer to be in error.

    Now what I have suggested is the equivalent of the maths teacher writing the correct solution on the board. Get all the data, write your own code, publish the result. You say Mann may just ignore this? Well he can’t, not if it is done properly. A lot of people think that “peer review” stops when a paper is published, but it doesn’t. There is a mechanism for forcing revision. You write a paper titled “Comment on Mann 1999” and send it to the original journal. The authors you are criticising will have a right of reply, and both your comment and their reply will be published next to each other in the journal (of course it is way too late for that by now – it would have had to be done at the time). Then everyone can read them and make their own judgement. I have done this myself in the area I work in. This is what should have been done a decade ago.

  196. Here is why the issue is so important right now. You guys seem to think that this is about honest scientists. It’s not. In large and increasing part it is about the age-old fight to keep humans from gaming the system.

    For example, jimmi says above:

    Now what I have suggested is the equivalent of the maths teacher writing the correct solution on the board. Get all the data, write your own code, publish the result. You say Mann may just ignore this? Well he can’t, not if it is done properly. A lot of people think that “peer review” stops when a paper is published, but it doesn’t. There is a mechanism for forcing revision. You write a paper titled “Comment on Mann 1999” and send it to the original journal.

    First, the reviewers of Mann’s paper were likely his friends, the climategate emails reveal extensive packing of the reviewer box. I say that because a real review would not have let his paper pass. As a result, it is unlikely they would let an opposing paper through. Second, Mann had also refused to release the data, although later he did let it go. Third, he didn’t publish his method in enough detail to allow anyone to replicate it. Fourth, journals don’t like to publish papers blowing previous papers out of the water, they favor new research that will garner them headlines, and disagreements over old papers don’t do that.

    But suppose they did let it through. Suppose I write my own code and publish my own paper … then what? I can’t claim I’ve replicated Mann’s method, because he hasn’t left sufficient clues to do that. He’s left lots and lots of steps out of what he did, so I’d be lying if I said I was sure I’d replicated his actions. Then what?

    Then Mann does nothing. There’s no reason for him to do anything. He’s already won, the IPCC has splashed his graph all over the planet, he’s been catapulted to fame from deserved obscurity … why say a word? At a maximum he just has to say, “No, Willis is doing something different from what I did.” And almost assuredly, given how vague his description was, he’s right … I would be doing something different, because I’m just guessing.

    In that situation, why on earth would he say anything? Peer pressure? Get real, there’s no peer pressure in climate science, climategate proved that beyond a shadow of a doubt. He has absolutely no reason to say a single word. After all, he knows he’s a fraud, and that anything he says will not be to his advantage. All he has to do is keep his mouth shut, say he’s “moved on”, and publish a new paper. There is no pressure on him at all to discuss, much less expose, his previous work.

    You are living in a dream world, jimmi, if you think Michael Mann would care in the slightest about your math teacher example. He’s not a student, he’s a crook … and you actually think peer pressure would cause him to admit his malfeasance and thereby throw away his career? Really?

    My dear fellow, you truly haven’t spent enough time around climate science. You are advancing ideas that might, and I emphasize might, keep honest scientists in line if they were to wander a bit.

    But here in climate science, we’re kinda short on the honest breed, the dishonest ones are ruling the roost, and tragically, the honest scientists are keeping their heads down and their mouths shut.

    In such a situation, your idealistic pollyanna solutions involving stories of how it works in school are just a bitter joke … remember, this is a guy who flat-out lied to a Congressional committee, and you think your fairy tales about maths teachers are going to bring him into line?

    I don’t know whether to laugh or cry about that level of naivete …

    w.

  197. jimmi_the_dalek says:
    April 21, 2012 at 4:37 pm

    Willis

    “If a claim cannot be falsified, it is not a scientific claim.”

    Oh dear, Popper’s rather simplistic view of science strikes again. However, let that pass for the moment.

    No, let’s not let that snide attempt at falsification via contempt pass for the moment. You appear to be taking the position that a claim that cannot be falsified is still a scientific claim. You’ll have to back that up, you can’t just assert it and then walk away.

    Perhaps you could provide us with say three samples of claims which cannot be falsified, but which in your view are still scientific statements?

    w.

  198. Willis Eschenbach says:

    RSS found an error in S&C’s code, so it is obvious that they have revealed their code, or RSS couldn’t have done that. Since RSS falsified part of the S&C code, the S&C code is OBVIOUSLY FALSIFIABLE … and since I pointed this out before, I do wish you’d pay closer attention.

    As a result of a portion of their code being falsified, it is patently clear that S&C are not “peddling unfalsifiable non-science”.

    McIntyre found an error in Mann et al.’s code, so it is obvious that they have revealed their code, or McIntyre couldn’t have done that. Since McIntyre falsified part of the Mann code, the Mann code is OBVIOUSLY FALSIFIABLE … and since I pointed this out before, I do wish you’d pay closer attention.

    As a result of a portion of their code being falsified, it is patently clear that Mann et al are not “peddling unfalsifiable non-science”.

    Seriously, I don’t understand the difference, except for the following facts:

    Mann has released the code for his 2008 paper so completely for all to see that even McIntyre, who has made his second career out of finding things to complain about, can’t find anything to complain about. (He has also publicly released his 1998 code, although there are apparently some things that McIntyre and you can find to complain about.) In contrast, S&C have never publicly released ANY of their code as far as I can tell…and while they did eventually share a small piece of it with RSS, what I have heard is that it involved quite a bit of work just to get that to happen.

    Willis, I think that everyone can see the obvious double-standard here. It is really quite amusing.

  199. “Perhaps you could provide us with say three samples of claims which cannot be falsified, but which in your view are still scientific statements?”

    I have no beef with the concept of falsification per se, except that it leads people to think that the only way to progress is by the equivalent of going through a student’s maths line by line. There are other ways to make progress. Put it this way: does using an idea which, technically speaking, has already been falsified mean you are not doing science?

  200. jimmi_the_dalek says:
    April 21, 2012 at 7:34 pm

    “Perhaps you could provide us with say three samples of claims which cannot be falsified, but which in your view are still scientific statements?”

    I have no beef with the concept of falsification per se, except that it leads people to think that the only way to progress is by the equivalent of going through a student’s maths line by line. There are other ways to make progress.

    So everything I said is true, but your beef is that it is not complex enough? Certainly there are other ways to make progress, lots of them … but that has nothing to do with falsification.

    Put it this way: does using an idea which, technically speaking, has already been falsified mean you are not doing science?

    Depends on what you are doing with the idea, so your question is far too vague to answer.

    w.

  201. joeldshore says:
    April 21, 2012 at 7:25 pm

    Willis Eschenbach says:

    RSS found an error in S&C’s code, so it is obvious that they have revealed their code, or RSS couldn’t have done that. Since RSS falsified part of the S&C code, the S&C code is OBVIOUSLY FALSIFIABLE … and since I pointed this out before, I do wish you’d pay closer attention.

    As a result of a portion of their code being falsified, it is patently clear that S&C are not “peddling unfalsifiable non-science”.

    McIntyre found an error in Mann et al.’s code, so it is obvious that they have revealed their code, or McIntyre couldn’t have done that. Since McIntyre falsified part of the Mann code, the Mann code is OBVIOUSLY FALSIFIABLE … and since I pointed this out before, I do wish you’d pay closer attention.

    Mann’s code was only falsifiable because he left it on an open server. Yes, once that great stroke of luck happened, it became falsifiable … but not by any action of Manns, and against his will.

    S&C’s code, on the other hand, was given voluntarily by Spencer and Christy to their strongest opponents and nay-sayers, RSS.

    If Mann had done that, we wouldn’t be discussing this. But Mann did everything he could to keep his code private and unfalsifiable, while S&C made their code falsifiable. If you can’t see the difference, I do wish you’d pay closer attention.

    w.

  202. joel shore says:

    “McIntyre found an error in Mann et al.’s code, so it is obvious that they have revealed their code, or McIntyre couldn’t have done that.”

    Shore apparently isn’t up to speed on any of this. The fact is that McIntyre and McKitrick spent years reverse engineering Mann’s code, and got it almost exactly right — with no help from Mann.

    When Steve McIntyre himself states that Mann has fully cooperated in sharing his code, methodologies, data and metadata, I will accept that as definitive. But so far he has not stated that, AFAIK.

    It is amazing to see the ease with which Willis cuts the heart out of Joel Shore’s weak arguments. Joel would be smart to quit digging. But no one has ever accused Joel Shore of being smart, as far as I know.

    The scientific method absolutely requires transparency. Otherwise, it is extremely difficult, if not impossible, to replicate experiments and support or falsify hypotheses. That is why the alarmist clique hides their data, code and methods: they know that their conjectures would be easily falsified if they practiced transparency per the scientific method.

    Really, all of Joel Shore’s arguments in support of Mann’s obfuscation stem from the knowledge that CAGW is unsupportable. Planet earth is b!tch slapping Shore’s nonsense, but he endlessly repeats Mann’s deconstructed narrative, hoping to keep his pals riding the tax-sucking grant gravy train.

  203. Willis: “Certainly there are other ways to make progress, lots of them”

    Good, so now you agree with me. There are other ways to make progress. In fact the main way to make progress does not use Popper’s philosophical theories, but instead progresses by the simple expedient of replacing partially incorrect ideas with less incorrect ideas. All science currently known is incorrect (or perhaps you would prefer ‘incomplete’); it is just that some of it is more incorrect than the rest. Even Mann99 must have some bits correct…..

    “Depends on what you are doing with the idea, so your question is far too vague to answer.”
    Ok, on using already falsified ideas, what if you were using some equations which were known to be incorrect to make a prediction about some real world phenomenon – go on, take a stab at it.

  204. I didn’t read Willis’ comment before I posted, but it looks like we’re on the same page.

    . . .

    Jimmy the d says:

    “…on using already falsified ideas, what if you were using some equations which were known to be incorrect to make a prediction about some real world phenomenon…”

    I don’t understand that at all. Could you please re-phrase, with context? Thanks.

    Smokey says:

    When Steve McIntyre himself states that Mann has fully cooperated in sharing his code, methodologies, data and metadata, I will accept that as definitive. But so far he has not stated that, AFAIK.

    (1) It must certainly be fun to be Steve McIntyre: All you have to do is feed people like Smokey what they want to hear and they will unquestioningly believe you, while self-parodyingly calling themselves “skeptics”.

    (2) In fact, McIntyre can’t find anything to complain about in regard to the sharing of the code, methodologies, data, and metadata for Mann’s major work in this area that is not already over a decade old.

    Meanwhile, I am still waiting for someone to show me where I can find Spencer and Christy’s code. Heck, forget their whole code, at this point, I might even settle for one or two subroutines!

    Willis Eschenbach says:

    S&C’s code, on the other hand, was given voluntarily by Spencer and Christy to their strongest opponents and nay-sayers, RSS.

    This statement is pure spin. Even Christy has admitted that they only gave RSS a part of the code and it is unclear how much effort RSS had to expend to get even this much. I have given links to Mann’s 2008 code that is so completely public that even McIntyre can’t find a nit to pick in that regard. And, I have given links to most, if not all, of Mann’s 1998 code. You in return haven’t been able to find one line of Spencer and Christy’s code to provide to me.

  206. joel shore says:

    “I have given links to Mann’s 2008 code that is so completely public that even McIntyre can’t find a nit to pick in that regard.”

    If you don’t mind, I will take Steve McIntyre’s word for it.

    Not yours.

  207. Smokey :”I don’t understand that at all. Could you please re-phrase, with context? Thanks.”

    Read the previous comments re Popper. The context basically is that I think that Willis is too hung up on falsification.

  208. jimmi,

    Falsification is absolutely essential to the scientific method. Anything less is post-normal science; AKA: pseudo-science.

    Except, Smokey, were we to apply Popper’s own criteria to Popper’s theory, it would be falsified, though that would be a paradox….

    Science, I regret to inform you, proceeds not by Popper’s ideas in practice, but by the good old incremental improvement model, coupled with the occasional Kuhn-style paradigm shift. And the relevance of that observation is that I hold that the way to deal with examples like Mann is not to spend a decade banging on about falsification, but instead just to publish a better version.

  210. jimmi,

    You do realize that Kuhn was a post-normal guy, don’t you? Sorry, but I don’t think post-normal science is science at all. Testability is essential. If something [like AGW] is not testable, it is not science, it is belief; a conjecture. So I guess we’ll just have to disagree about this. Even though it looks like we’ve both got Michael Mann’s number.

  211. jimmi_the_dalek says:
    April 21, 2012 at 8:08 pm

    Willis:

    “Certainly there are other ways to make progress, lots of them”

    Good, so now you agree with me.

    Dang, you’re in a hurry. I have no idea if I agree with you, since neither of us have defined what we mean by “other ways to make progress”. From what you say below, we may not agree at all.

    There are other ways to make progress. In fact the main way to make progress does not use Popper’s philosophical theories, but instead progresses by the simple expedient of replacing partially incorrect ideas with less incorrect ideas.

    Indeed … and the reason ideas get replaced is because they are falsified. Not necessarily 100% falsified root and branch; Newton’s ideas weren’t thrown out by Einstein’s insights. But the applicability of Newton’s laws in all situations was certainly falsified. And that’s why Newton’s ideas were replaced in those situations by Einstein’s work.

    Which is why I said above that we may not agree. What you have done is just give another example of falsification, not another way to make progress.

    I would put out, as another way to make scientific progress, the discovery of graphene. There was nothing that needed to be falsified to make that discovery, so totally new discoveries are another way to make scientific progress.

    Please note, however, that the new discovery must be falsifiable to endure. If you say “I’ve discovered a new arrangement of carbon, but I’m not going to tell you what it looks like or what its properties are”, that is not falsifiable. On the other hand, “I’ve discovered a flat, single-atom thick hexagonally structured form of carbon” is very falsifiable … and has not been falsified.

    All science currently known is incorrect (or perhaps you would prefer ‘incomplete’); it is just that some of it is more incorrect than the rest. Even Mann99 must have some bits correct…..

    I don’t understand what you are getting at here. Things are never any more than provisionally “true”, they are only valid until they are falsified. Which may be never, or may be tomorrow. But this doesn’t mean that “all of science … is incorrect”. For example, it’s not “incorrect” that 2 + 2 = 4, or that hydrogen is the simplest element, or that nuclear fission can be initiated by certain specified procedures. So I don’t understand what you mean that “all of science … is incorrect”. That seems like far too broad a statement to be true.

    For your final comment, I have to back up to provide context, as you have not quoted your claim that started that part of the discussion. You had said:

    Put it this way : does using an idea which technically speaking has already been falsified, mean you are not doing science?

    The conversation continued with my response and your reply:

    “Depends on what you are doing with the idea, so your question is far too vague to answer.”

    Ok, on using already falsified ideas, what if you were using some equations which were known to be incorrect to make a prediction about some real world phenomenon – go on, take a stab at it.

    If I understand you correctly, you are asking if using incorrect equations to make a prediction is whatever you are calling “doing science” … I haven’t a clue, because I don’t know what you are calling “doing science”.

    I’d say in general that using known incorrect equations is just plain dumb, but I have no idea whether it fits into whatever you are calling “doing science”.

    In any case, why not give up the Socratic method of asking impenetrable questions, and just say what’s on your mind?

    Because I still haven’t a clue what you are getting at with your questions. What does using equations which are known to be incorrect have to do with science? Who would do such a foolish thing as knowingly using incorrect equations? I don’t get it.

    w.

  212. jimmi_the_dalek says:
    April 21, 2012 at 9:34 pm

    Science, I regret to inform you, proceeds not by Popper’s ideas in practice, but by the good old incremental improvement model.

    The problem with that theory is that the “good old incremental improvement model” proceeds because whatever was “incrementally improved” was falsified. See my discussion of Einstein and Newton directly above.

    Newton’s Laws were “incrementally improved” by Einstein’s ideas because Newton’s Laws were falsified in certain unusual situations. This was tested using the precession of Mercury’s perihelion, for which Newton’s Laws didn’t work … in other words, in those certain situations Newton’s Laws were falsified.

    If that had not occurred, if Newton’s Laws had worked perfectly to predict the precession of Mercury’s perihelion, Newton would not have been falsified, and Einstein’s ideas would have been thrown in the trash.

    In other words, your “good old incremental improvement model” depends completely on falsification.

    w.

  213. joeldshore says:
    April 21, 2012 at 8:34 pm

    Willis Eschenbach says:

    S&C’s code, on the other hand, was given voluntarily by Spencer and Christy to their strongest opponents and nay-sayers, RSS.

    This statement is pure spin. Even Christy has admitted that they only gave RSS a part of the code and it is unclear how much effort RSS had to expend to get even this much.

    Thanks, Joel. Since S&C were not forced or ordered to hand over their code, it was voluntary, so no spin there. I’ve heard nothing from RSS saying they asked for other parts of the code and didn’t get it. Nor have I seen anything from Christy about how much code they gave to RSS, and you provide no cites. So I’m not sure what you call spin, what I said seems to be all true to me. I have not made any overarching claims about any of that.

    But this is because I have little interest in Spencer and Christy’s code. I can make sense of Mann’s code and understand it. But what S&C are doing to transform satellite observations into temperatures is another universe from where I work, so perforce I must leave it to folks like the guys at RSS. Obviously, since they found an error in S&C’s code, they must understand it, so I leave it in their hands.

    You seem to think I should take sides in every dispute, even parts of the grand scientific adventure in which I have no interest … sorry, not my job. The world is full of issues, every man has to choose which battles to fight, and this is not one I care to fight, I lack both information and interest.

    If this issue is as important to you as you claim, if you want S&C to do something specific with their code, how about you ask them? Or alternatively, since you haven’t asked them, why are you acting like this issue is so important to you?

    I have no idea why you think it is my business to take a position on S&C and their code, when I know very little about it other than that they gave enough of it voluntarily to RSS for the RSS guys to find the error that they suspected. You seem to want to make that into a crime … but what does that have to do with me?

    Seriously, Joel, you are obsessing about my position on S&C’s code, and I don’t have one, I don’t have the information to form one, and I don’t particularly care about the issue. I have far too many battles to fight without taking on the ones you think are important. If there were a blowup where RSS was demanding the code and S&C were refusing them, I might take a side if the issues were clear.

    But that’s not the case. S&C voluntarily gave enough code to RSS to allow them to find the error they suspected, RSS isn’t making any noise about being denied code, and I have neither the information nor the interest to take some stand on the issue just to please you. So sue me …

    w.

    PS—Above, you claim equivalency between Mann and S&C, saying that both had been falsified. The difference is, when S&C were shown that their code contained an error, they fixed it. When Mann was shown that his code contained an error, he denied it. Hardly equivalent … heck, Mann is still using Tiljander upside-down …

  214. Willis Eschenbach says:

    Thanks, Joel. Since S&C were not forced or ordered to hand over their code, it was voluntary, so no spin there.

    Mann wasn’t “forced” to hand over his code either. And, he certainly wasn’t forced to release all of the code for his 2008 paper.

    I’ve heard nothing from RSS saying they asked for other parts of the code and didn’t get it. Nor have I seen anything from Christy about how much code they gave to RSS, and you provide no cites. So I’m not sure what you call spin, what I said seems to be all true to me. I have not made any overarching claims about any of that.

    Here is what Christy says ( http://climateaudit.org/2006/10/02/christy-on-source-code/ ):

    We gave RSS the part of the code that was still a source of confusion (a correction for diurnal drift for the LT product). In addition, we provided intermediate adjustment datafiles for both MT and LT – going far beyond only the “final product” that Connolley seems to think. We did this as early as 2003.

    It is true that RSS has not audited our complete code (really codes), but they were essentially able to reproduce the intermediate and final results for the various adjustments based on descriptions in our papers and in dozens of emails with more detailed information.

    So, that is Christy, putting his best face on what they did, admitting that they have not allowed RSS access to the complete codes and using the excuse that they haven’t seemed to need it in order to basically figure out what Spencer and Christy have done from their description of their methods. Sound familiar?

    And, I am not saying what Spencer and Christy did was wrong or that we should be calling them frauds. But, then I am not the one who is making broad pronouncements about how publishing papers without the full release of code is not science, etc., etc.

    But this is because I have little interest in Spencer and Christy’s code. I can make sense of Mann’s code and understand it. But what S&C are doing to transform satellite observations into temperatures is another universe from where I work, so perforce I must leave it to folks like the guys at RSS. Obviously, since they found an error in S&C’s code, they must understand it, so I leave it in their hands.

    Fine…So, you personally don’t have such a strong interest in it. However, given that you have used full release of code to make broad pronouncements about scientists and to say whether something is “science” or “anecdote” (even in fields that you know absolutely nothing about, I might add), it is strange how you are so quiet on the issue of the full release of Spencer and Christy’s code because it is not quite in the same subfield of the general field of climate studies that you are most interested in.

    Seriously, Joel, you are obsessing about my position on S&C’s code, and I don’t have one, I don’t have the information to form one, and I don’t particularly care about the issue.

    That doesn’t seem to stop you in other situations.

    RSS isn’t making any noise about being denied code, and I have neither the information nor the interest to take some stand on the issue just to please you.

    That is because RSS doesn’t think that they are entitled to every shred of code that they may want to look at, because they are engaged in real science rather than grandstanding.

    PS—Above, you claim equivalency between Mann and S&C, saying that both had been falsified. The difference is, when S&C were shown that their code contained an error, they fixed it. When Mann was shown that his code contained an error, he denied it. Hardly equivalent …

    Mann has moved on. The best objective scientific assessment of Mann’s codes and temperature reconstructions in general was provided by the NAS panel convened on the matter. And, they did not say that Mann’s results were erroneous, although they did point out some pitfalls of the method that Mann originally used (and no longer uses) and identified some other issues that need to be addressed to get greater certainty (and better quantification of uncertainty) in the reconstructions, particularly before 1600. Mann’s 2008 paper is an attempt to deal with these issues.

    heck, Mann is still using Tiljander upside-down …

    I am surprised that you are so tempted to use this sort of talking point. The method that Mann used decides the orientation of the proxy automatically on the basis of the correlation with the historical temperature record over a certain time period. Yes, the Tiljander proxy ended up upside down from the way in which the originators of the data apparently thought the data would correlate with temperature. So, as a result of this (and the belief of those authors that other anthropogenic interference might have contaminated the proxy over the period it seemed correlated with the temperature record), Mann also repeated the analysis with this proxy removed.

  215. @Willis Eschenbach (April 21, 2012 at 1:55 pm)

    Willis, there’s important work to do.
    Why do you waste your time chasing red herrings and attacking straw men?

    The only possible reason I can see:
    Your interest in politics FAR exceeds your desire to explore & know nature.

    Rather than succumb to incessant tabloid-style fascination with Dr. Michael Mann, the “hockey stick”, “climategate”, etc., you could redirect your focus productively by helping out with efforts to understand natural variability. See for example:

    http://wattsupwiththat.com/2012/03/30/open-thread-weekend-9/#comment-940636


    The incessant whining for code spoonfeeding makes the community look quantitatively weak. If the political & quantitative education systems have failed an individual, that individual has the option of attempting to take personal responsibility:

    http://en.wikipedia.org/wiki/Peter_Principle

    We may one day lose our freedom for the simple reason that we wasted it.

  216. Paul Vaughan says:
    April 22, 2012 at 8:09 am

    @Willis Eschenbach (April 21, 2012 at 1:55 pm)

    Willis, there’s important work to do.
    Why do you waste your time chasing red herrings and attacking straw men?

    The only possible reason I can see:
    Your interest in politics FAR exceeds your desire to explore & know nature.

    So according to you, you think Science magazine should not be interested in supporting transparent science because it’s “politics” …

    My friend, your post is truly way, way out in left field. I have no idea why you think supporting transparent science is “politics”, but it’s not. Transparency is a crucial part of science.

    The incessant whining for code spoonfeeding makes the community look quantitatively weak.

    “The community”? Science magazine is “the community” in your curious geography? What planet are you living on?

    How does requiring code before publication make anyone “look quantitatively weak”? You’ve lost the plot completely.

    w.

  217. Joel, you seem obsessed with Spencer and Christy’s code. And that’s fine. Every man chooses what is important to him, every man decides which battles he wants to fight.

    What is bizarre is your continued insistence that because you care about their code, that I should care about their code. I don’t. Simple as that. It’s not an issue in my world.

    And what is truly over the top is that despite your claimed deep and passionate concern about the matter, in the real world you don’t care enough about S&C’s code to get up off your dead … chair and do one single solitary damn thing about it.

    I’m sorry, Joel, but I’m not going to fight your fights. I have plenty of my own. I don’t think there’s a problem with the S&C code. You claim to care about it, but instead of actually doing something about it, you keep insisting that I should carry your load, that I should share your concern, that I should go into battle over this issue.

    Sorry, my friend, but you have to conduct your own wars. I have neither the time nor the interest to fight them for you, I have battles of my own to fight. If you truly want to see more of their code, then I suggest that you go ask them for it. Because it’s useless to be talking to me about them not doing what you think they should be doing.

    Me, I agree with Steve McIntyre, who commented in your citation,

    It sounds like Christy has made and is continuing to make a diligent effort to provide support and documentation for his analyses, as compared to the obfuscation of the Hockey Team e.g. Michael “I will not be intimidated into disclosing my code”, “I did not calculate the verification r2 statistic – that would be a foolish and incorrect thing to do” Mann.

    Now, it seems that Spencer and Christy’s effort is not enough for you, you want them to do more. And that’s a legitimate point of view, although I don’t share it myself. But since you hold that position, OK, fine, then GO DO SOMETHING ABOUT IT.

    Because bitching about it to me is meaningless—I don’t care about it, and I already have more battles on my plate than I know what to do with.

    If you don’t have the balls to fight your own fights, Joel, don’t come complaining to me and insisting that I should do something you are unwilling to do yourself. That’s your business, not mine. If it truly is an issue to you as you keep claiming, then man up and confront Christy about it.

    And if you’re not willing to man up about it, then my suggestion is that you shut up about it, because trying to bust me for not doing something that you are unwilling to do yourself just makes you look like a hypocrite.

    w.

  218. Willis Eschenbach says:

    If you don’t have the balls to fight your own fights, Joel, don’t come complaining to me and insisting that I should do something you are unwilling to do yourself. That’s your business, not mine. If it truly is an issue to you as you keep claiming, then man up and confront Christy about it.

    And if you’re not willing to man up about it, then my suggestion is that you shut up about it, because trying to bust me for not doing something that you are unwilling to do yourself just makes you look like a hypocrite.

    What you don’t get is that it’s not my fight…It is your fight. I am not the one going around making bombastic statements about people being frauds and people engaging in anecdotes instead of science because they aren’t publishing their code along with their papers. I am just the one who is asking YOU not to be a hypocrite and, if you want to make such bombastic statements, then to apply them even to people who you might like.

    Or, better yet, admit that your statements are out-of-line. That would be a rather smart thing to do, as Paul Vaughan notes, even from a pragmatic point-of-view. Calling respected scientists “frauds” or claiming that papers published without code constitute “anecdotes” is a good way to make sure that the larger scientific community doesn’t take you seriously.

    Oh, and by the way, I was thinking a little bit more about your statement regarding the difference between Mann and Spencer & Christy being that the latter have acknowledged their mistakes. I already noted some other issues I have with that claim, but I will also note that, although Spencer and Christy have eventually acknowledged mistakes with the satellite data record, they have tried to downplay them (e.g., they have said things that seem to imply that the mistakes made minimal difference when in fact it is easy to check that they are responsible for about half the difference between the LT satellite trend they claimed in the late 90s and the current trend they claim, with the other half being due to the longer data record).

    Furthermore, on other issues, they have not come clean:

    (1) To my knowledge, Christy has never forthrightly acknowledged the huge statistical blunder they made in the Douglass et al paper ( http://onlinelibrary.wiley.com/doi/10.1002/joc.1651/abstract ) by comparing the data record to the models but using the standard error rather than the standard deviation as a measure of the uncertainty in the model predictions.

    (2) To my knowledge, Spencer has never acknowledged the huge mathematical blunder he made in a post here (pointed out by tamino, but pretty obvious to anyone once it is pointed out) in which he tried to argue for evidence that the rising CO2 trend might not be due to humans.

    Few scientists are as forthright as they should be about acknowledging errors. I guess it is a general defect of human nature.
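    [Editor’s note: For readers unfamiliar with the statistical point in (1) above, here is a minimal sketch of the difference between the standard deviation of a model ensemble and the standard error of its mean. The trend values are made up purely for illustration and are not the actual Douglass et al. or model data.]

    ```python
    import statistics

    # Hypothetical ensemble of model-predicted trends (made-up numbers,
    # purely illustrative -- not real climate-model output).
    model_trends = [0.10, 0.18, 0.25, 0.31, 0.12, 0.22, 0.28, 0.15]

    n = len(model_trends)
    sd = statistics.stdev(model_trends)   # spread of individual model runs
    se = sd / n ** 0.5                    # uncertainty of the ensemble MEAN

    print(f"SD = {sd:.3f}, SE = {se:.3f}")
    ```

    The standard deviation describes where a single realization (a model run, or the real climate if the models are right) is likely to fall; the standard error only describes how precisely the ensemble mean is pinned down, and it shrinks as more models are added. Testing whether an observed trend is consistent with the models calls for the SD; using the SE makes the “consistency” band far too narrow.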

  219. joeldshore says:
    April 22, 2012 at 2:35 pm

    Willis Eschenbach says:

    If you don’t have the balls to fight your own fights, Joel, don’t come complaining to me and insisting that I should do something you are unwilling to do yourself. That’s your business, not mine. If it truly is an issue to you as you keep claiming, then man up and confront Christy about it.

    And if you’re not willing to man up about it, then my suggestion is that you shut up about it, because trying to bust me for not doing something that you are unwilling to do yourself just makes you look like a hypocrite.

    What you don’t get is that it’s not my fight…It is your fight.

    So you now think that you are in charge of deciding what is my fight and what is not? Really? Your ego has expanded to the point that you are now deciding what fights other people should fight? Man, you AGW folks know no bounds.

    My friend, you’ve jumped the shark entirely. Every man gets to choose which fights he wants to fight, and which fights are not important enough to be worth the candle … and no man can decide that for another man.

    Come back when your ego is no longer large enough to require its own postal code, and we can discuss this … in the meantime, consider which of us is claiming loudly that this is a huge issue. That would be you. I don’t think it’s a big issue at all. If it’s a huge issue for you, that’s fine with me.

    But saying that you, the almighty Joel, have decided that it has to be a huge issue for me as well?

    Sorry, never gonna happen. You are welcome to fight with S&C over the availability of their code if that is as important as you claim, or not as you see fit. That’s up to you … although it’s odd that you refuse to do anything about it while simultaneously claiming that it is hugely important.

    But it’s not a fight I’m interested in. Me, I have other fights to fight, and there’s no way you get to pick which fights those are.

    w.

    PS—Christy said in 2010:

    We are in a program with NOAA to transfer the code to a certified system that will be mounted on a government site and where almost anyone should be able to run it. We actually tried this several years ago, but our code was so complicated that the transfer was eventually given up after six months.

    I talked with John Bates of NOAA two weeks ago and indicated I wanted to be early (I said the “first guinea pig”) in the program. He didn’t have a firm date on when his IT/programming team would be ready to start the transition, so I don’t know.

    So Christy is working with NOAA to get it set up so that you and I can not only see the code, but actually run it … but noooo, that’s not good enough for Joel, he wants the code RIGHT NOW or he’ll throw a tantrum and accuse me of mopery on the skyways and other unspecified crimes.

    In any case, NOAA is handling it, and the IT/programming team hasn’t gotten it done yet. Go complain to John Bates of NOAA if you don’t like that.

    Now, I’m sorry if Christy’s actions in that regard don’t satisfy you. I regret that’s not enough for you, that you want something different.

    But you don’t get to decide if that’s enough for me, Joel, and all of your petty bitching about it and all your nasty accusations about me don’t change that one bit.

  220. Oh, yeah, Joel, you make this claim:

    joeldshore says:
    April 22, 2012 at 2:35 pm

    … although Spencer and Christy have eventually acknowledged mistakes with the satellite data record, they have tried to downplay them (e.g., they have said things that seem to imply that the mistakes made minimal difference when in fact it is easy to check that they are responsible for about half the difference in the LT satellite trend they claimed in the late 90s vs the current trend they claim, with the other half being due to the longer data record).

    Well, let’s see. They’ve said:

    Update 24 Aug 2001 *********************

    I’ve discovered a Y2K error in the program which reads
    the diurnal corrections. The corrections for NOAA-14
    were not applied after 1999. These will be applied
    when the August data are processed. Preliminary checks
    indicate the impact is less than 0.01 C/decade. Because
    the diurnal corrections were not completely applied, I
    recomputed the PRT coefficients to adjust for heating
    of the instrument. This impact is between .001 and .003
    C/decade on the full trend, so it is tiny.
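
    That kind of Y2K slip is easy to picture. Here's a hypothetical sketch (not the actual UAH code, which is not public and is far more involved) of how a correction table keyed on a two-digit year silently stops applying once the calendar passes 1999:

    ```python
    # Hypothetical illustration only, not the real UAH code: diurnal
    # corrections stored against a two-digit year stop matching once
    # the four-digit year rolls past 1999.
    corrections = {97: -0.010, 98: -0.012, 99: -0.011}  # made-up values, deg C

    def corrected(anomaly, year):
        yy = year % 100                            # 2000 -> 0: no table entry
        return anomaly + corrections.get(yy, 0.0)  # correction silently skipped

    print(corrected(0.5, 1999))  # correction applied
    print(corrected(0.5, 2000))  # bug: identical to the raw value
    ```

    Note that nothing crashes, which is exactly why such errors can sit unnoticed until someone checks the output.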

    and

    Update 8 April 2002 **********************

    Roy Spencer and I are in the process of upgrading
    the MSU/AMSU data processing to include a new
    non-linear approximation of the diurnal cycle
    correction (currently the approximation is linear).
    In preliminary results, the effect is very small,
    well within the estimated 95% C.I. of +/- 0.06
    C/decade. In the products released today, some
    minor changes have been included (though not the
    new non-linear diurnal adjustment). The 2LT trend
    is +0.053 C/decade through Mar 2002. The difference
    in today’s release vs. last month’s is a slight
    warming of monthly data after 1998. Essentially,
    this release corrects an error in the linear diurnal
    adjustment and produces better
    agreement between the MSU on NOAA-14 and the AMSU
    on NOAA-15. The single largest global anomaly
    impact is a relative increase of +0.041 (April 2001)
    while most are within 0.02 of the previous values.
    The net change in the overall trend was toward a more
    positive value by +0.012 C/decade.

    and

    Update 10 Jan 2003 *****************************

    Roy Spencer has updated the ephemeris corrections which
    had been estimated for the past 9 months. Changes are minor
    with a few monthly global averages a few hundredths C
    warmer in 2002 than as shown last month for LT.

    With the completion of 2002, we will be recalculating
    the target coefficients and if the changes impact the
    trends much, we may up the version number to 5.1 from
    5.0. This should be complete with the calculation of
    the January 2003 values.

    Coming in 2003 – diurnal drift corrections for the AMSUs
    on NOAA-15 and NOAA-16. There should be little impact.
    We will also be merging data from NOAA-17 this year.
    Also, the paper describing Version 5.0 of the microwave
    data will be appearing in J. Atmos. Oceanic Tech. this
    year.

    and

    Update 7 Mar 2003 *****************************

    We have made some changes to the data processing that were
    quite minor. Even so, we decided to change the version
    number to 5.1 from 5.0. These changes will not
    affect scientific results for those of you
    in the process of publishing work from version 5.0.

    For all three products we have
    strengthened the requirement a bit for acceptable data
    to enter into the routine that calculates the intersatellite
    biases. This resulted in a very slightly more negative trend
    in LT by 0.004 c/decade and for MT by about 0.003 C/decade. In
    addition as noted in the 10 Jan 03 entry, we have updated the
    Target Temperature coefficients since 2002 added some MSU data
    from NOAA-14. The update only affected LS (T4). One of the
    coefficients barely exceeded the 40% explained variance threshold
    this time (NOAA-11), so it was employed in the processing. This
    helped reduce the daily error variance and the difference in trends
    between NOAA10 v. 11 and NOAA11 v. 12. The net effect on the trend was
    about 0.02 C/decade (more positive)

    and regarding the error discovered by RSS they say

    Update 7 Aug 2005 ****************************

    An artifact of the diurnal correction applied to LT
    has been discovered by Carl Mears and Frank Wentz
    (Remote Sensing Systems). This artifact contributed an
    error term in certain types of diurnal cycles, most
    notably in the tropics. We have applied a new diurnal
    correction based on 3 AMSU instruments and call the dataset
    v5.2. This artifact does not appear in MT or LS. The new
    global trend from Dec 1978 to July 2005 is +0.123 C/decade,
    or +0.035 C/decade warmer than v5.1. This particular
    error is within the published margin of error for LT of
    +/- 0.05 C/decade (Christy et al. 2003). We thank Carl and
    Frank for digging into our procedure and discovering this
    error. All radiosonde comparisons have been rerun and the
    agreement is still exceptionally good. There was virtually
    no impact of this error outside of the tropics.

    and a few more errors

    Update 10 Nov 2006 *******************************

    Notice that data products are back to version 5.2 for LT and 5.1 for MT and LS.

    We had hoped to solve the inconsistencies between NOAA-15 and NOAA-16 by this time, but we are still working on the problem. The temperature data for LT and MT are diverging, and we had originally thought that the main error lay with NOAA-15. However, after looking closely, there is evidence that both satellites have calibration drifts. We will assume, therefore, that the best guess is simply the average of the two. This is what is represented in LT 5.2, MT 5.1 and LS 5.1. These datasets have had error statistics already published, so we shall stick with these datasets for a few more months until we get to the bottom of the calibration drifts in the AMSUs. However, the error statistics only cover the period 1978 – 2004. The last two years cover the period where the two AMSUs are drifting apart, so caution is urged on the most recent data.

    and

    Update 15 Dec 2006 ******************************

    Due to a dumb mistake, the values for MT were in error when loaded up for the period ending Nov 2006. Rather than eliminating NOAA-16 data (the bad satellite) I had eliminated NOAA-15 (the good satellite) after Sept 2005. So, the values for MT have all been rerun and replaced.

    There are slight changes throughout the time series since the mean annual cycle was affected. I’ve also replaced all of LT to make sure they were consistent.

    and

    Update 13 Apr 2010 *********************************

    The addition of NOAA-18 on the gridded monthly anomalies has created a sudden divergence between land and ocean temperatures beginning in 2005 (when NOAA-18 began) in v 5.3.

    I will update v5.3 through March 2010 without NOAA-18 and place it on the website. There is likely some error in the merging of NOAA-18 that creates this rather spurious redistribution.

    The version 5.3 files without NOAA-18 are appended with 5.3a, i.e. tltmonamg.2000_5.3a and for the sections as uahncdc.lt53a.

    Now Joel, perhaps you can point out in there just exactly where they “have tried to downplay” their errors. Seems to me that it would be hard to be more honest about the exact nature and size of the errors than they have been. So please indicate for us just where you claim they are downplaying the size of the errors.

    Note also that they reported these errors clearly and openly, rather than hiding or dissembling about them.

    You also falsely claim that they have “eventually acknowledged mistakes”, when in fact they have reported them in detail in a timely manner. You’re just making up nasty accusations and hoping they stick.

    In short, I don’t have a clue why you are hating on them regarding their handling of errors. If Mann and others were as open about their errors as S&C have been, life would be much simpler.
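
    And note that the sizes involved aren't hidden anywhere — they're stated right in the update notes quoted above, and anyone can do the arithmetic. A quick check using the trend figures from the 8 Apr 2002 and 7 Aug 2005 entries:

    ```python
    # Figures quoted verbatim in the UAH update notes above (deg C per decade)
    trend_mar_2002 = 0.053   # LT trend through Mar 2002 (8 Apr 2002 note)
    trend_jul_2005 = 0.123   # LT trend Dec 1978 - Jul 2005, v5.2 (7 Aug 2005 note)
    v52_correction = 0.035   # warming added by the v5.2 diurnal fix found by RSS

    change = trend_jul_2005 - trend_mar_2002   # total change in the stated trend
    share = v52_correction / change            # fraction due to the corrected error

    print(f"total change: {change:.3f} C/decade")
    print(f"share from the v5.2 correction: {share:.0%}")
    ```

    That works out to 50% from the correction, with the rest from the longer record — numbers that were published openly, not concealed.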

    w.

  221. John Christy, from here:

    Sharing data and computer code

    Dr. Roy Spencer and I created the first satellite-based temperature dataset for climate studies in 1990. At present we are working on improvements for the 8th adjustment to the dataset brought about by the divergence of the most recent two satellites. Of the 7 previous changes in methodology, two were discovered by other scientists while the other 5 were discovered by us. Satellite instruments and data are complicated and affected by processes which no one really understands completely. Since we cannot go back in time with better instruments, we have to study the ones that were in orbit then and do the best we can to understand how confounding influences affect the measurements.

    The computer code we employ consists of 6 complicated programs which at times run sequentially on 3 different machines. The raw datafiles are enormous. When asked, we have shared with others parts of the computer code that were important to understanding how our methodology worked as well as intermediate products which served as a test to check that are [our] methodology was doing what it was intended to do.

    When asked, we provided Remote Sensing Systems (RSS) a section of our code which calculated part of the adjustment for the satellites’ east-west drift as well as files with the actual values of the adjustment to be sure that our intention in the code and the output matched. They believed our accounting of this particular adjustment was incorrect. Frankly, this was a difficult process from a personal standpoint. By sharing this information, we opened ourselves up to exposure of a possible problem in the code which we had somehow missed. Or worse, a simple disagreement which would lead to arguments about obscure technical aspects of the problem might arise for which there was no simple answer. However, and more importantly, if there was a problem, we certainly wanted to know about it and fix it.

    Not knowing the outcome of their work, I received a request from RSS for permission to publish one of the files that we had sent to them. In my formal scientific response I wrote, “Oh what the heck” … “I think it would be fine to use and critique … that’s sort of what science is all about.”

    And so it was that in August 2005 RSS published a clear example of an artifact in our adjustment procedure which created erroneous values in our tropical temperature trend (Mears and Wentz 2005). In Science magazine the following November we published information about our now-corrected temperatures and expressed our gratitude to RSS for discovering our error (Christy and Spencer, 2005, below). The UAH dataset is better as a result. RSS has also generated a set of satellite temperature products which still differ from ours in some aspects and explanations of those differences are being explored and documented in soon-to-be published material.

    The NAS report on temperature reconstructions made the point that when datasets and methods are fully exposed to independent eyes the results will carry more confidence within the scientific community.

    In his testimony, Christy has clearly laid out the issues and the decisions and the outcomes regarding his experience with the sharing of code. He is strong and emphatic about the advantages of openness and transparency.

    And that, Joel, is why the availability of S&C’s code is not an issue for me … yes, as yet there’s no URL available where I can inspect either their six programs that run sequentially on three separate machines or the terabytes of data. But John Christy is an ethical scientist who understands the importance of the sharing of both. And as a result, it does not concern me.

    You keep insisting I have some imagined obligation to speak out against his practices.

    Unfortunately, from my point of view, there’s nothing I can either teach him or reproach him with regarding transparency. What would I say?

    w.

  222. @Willis Eschenbach (April 22, 2012 at 10:24 am)

    I will leave you to your unproductive pursuits.

  223. Paul Vaughan says:
    April 23, 2012 at 6:47 am

    @Willis Eschenbach (April 22, 2012 at 10:24 am)

    I will leave you to your unproductive pursuits.

    Thank goodness, I thought you’d never leave.

    w.

  224. I think another reason scientists hesitate to publish their code is that *they do not want to support it*. People sometimes assume that the original authors of a piece of code are obliged to solve their problems with it, which is why the “no warranty” clause of the GPL is an important one. The culture of “patches welcome” hasn’t really made it into academic science, where editing somebody else’s work is more likely to be seen as rude.
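
    For what it’s worth, that “no warranty” point is already built into the standard per-file notice the GPL itself recommends (from the license’s “How to Apply These Terms” appendix):

    ```text
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    ```

    A researcher who releases code under such a notice has explicitly disclaimed any duty to support it.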
