Anthony has suggested that I post a paper of mine that was written in 2006 about unusual temperatures in Svalbard. It was published as a peer-reviewed submission in Energy and Environment, the journal that AGW supporters love to hate.
First, where the heck is Svalbard? It’s up at the top of the world, at 78° North, where it’s frozen most of the year.
Figure N1. Location of Svalbard. The islands are north of Norway, marked by a snowflake. The North Pole is shown as a red star, and Greenland is to the left.
In April 2006, there was an anomalously high temperature recorded at Svalbard, which was the subject of my paper.
How anomalous was April ’06? Here’s the month-by-month record for Svalbard, showing the temperatures from January (1) to December (12) since 1912.
Figure N2. Monthly temperatures in Svalbard. The green circle is the anomalously high temperature in April 2006. Note also the orange circle (April 1917), blue circle (September 1990) and the yellow circle (October 1968). Photo is of Svalbard.
The most obvious feature of the Svalbard record is that there are huge year-to-year swings in the winter temperature. The months from November to March show a temperature range of about 20°C.
Is the April 2006 temperature (green circle) unusual? Yes, it is. It is 3.0 standard deviations (SD) from the average of all of the April data. However, it is not the most unusual. That would be the October 1968 (yellow circle) at 3.5 standard deviations from the October average. We also have September 1990 at 2.9 SD, and April 1917 at 2.8 SD out.
So with that as a prelude, here is my 2006 paper.
————————————————————–
PROBLEMS WITH PUBLISHING SCIENTIFIC INFORMATION ON THE WEB: HOW UNUSUAL WERE TEMPERATURES IN SVALBARD, NORWAY?
Willis Eschenbach
INTRODUCTION
Traditional science journals accept papers for publication following peer review of their contents that provides at least some independent assessment of the paper’s contents. As part of the scientific process, the content of such papers can be challenged in the journal that publishes them. However, with the advent of the web it is becoming common for some scientists to publish work on web sites. This practice has the advantage of rapid communication to a wide audience, but it by-passes the safeguards inherent in the traditional publication procedure. This increases the risk of flawed information becoming published, with inevitable lowering of scientific standards and possible harm to the reputations of individual scientists.
This paper discusses an example that raises the problems of ‘web publication’ for scientific information. The web site in this case is RealClimate.org. Michael Mann and Gavin Schmidt are listed as the first two principals of the site. Both are well known for their scientific research in support of the hypothesis that we are currently experiencing dangerous anthropogenic (human-caused) global warming, and both have strongly defended this position in public as scientific truth.
The stated purpose of the RealClimate site is the presentation and discussion of factually accurate scientific information concerning climate change. The site is commonly known by the abbreviation ‘RC’. The example concerns publication on that web site of a claim concerning unusually warm temperatures in April 2006 at Svalbard, Norway.
This discussion should be understood in the context of the fierce debate about the causation of current Arctic temperatures, as well as the prediction of rapid further rises by many climate models. Both form a major platform in current energy policy debates worldwide, especially in the UK and USA.
THE EVENTS
On May 22nd, 2006, the RealClimate web site published an article titled “More On The Arctic” by two very well-known climate scientists, Michael Mann and Phil Jones. The full text of the report published on the RealClimate web site is as follows:
Svalbard, an Arctic island in the Northern North Atlantic, is predicted to warm considerably more than most of the rest of the earth in many model-based scenarios [http://www.grida.no/climate/ipcc_tar/wg1/fig9-10.htm]. See for example the figure to the right, which represents a relatively high-end IPCC Third Assessment Report scenario for the projected surface temperature difference between the period 2071–2100 and 1961–1990. Svalbard is the island north of Norway at about 80N between 15-30E. The enhanced warming in this region is related to the issue of polar amplification that we have discussed previously on RC. It also happens that the Svalbard meteorological station is the 2nd station in the World Meteorological Organization (WMO) meteorological station list. This means that it tends to get noticed. The Climatic Research Unit (CRU) of the University of East Anglia maintains one prominent version of the global surface temperature data set and as part of its routine quality control, CRU flags any unusual (anomalous warm or cold) new measurements that come in. Svalbard has now been flagged consistently over the past several months, but the values have been confirmed as accurate by the Norwegian Met Service, which operates the Svalbard station. Here are the recent Svalbard monthly surface temperature measurements, the longterm (1961–1990) means (“ybar”) and standard deviations (“sd”), and associated anomalies i.e., departure from average (“delta”) for Dec 2005 through April 2006 (all in degrees C):
Month, Value, ybar, sd, Delta
Dec 05, -3.8, -13.3, 4.4, +9.5
Jan 06, -2.7, -15.3, 4.7, +12.6
Feb 06, -9.8, -16.3, 3.7, +6.5
Mar 06, -13.1, -15.8, 3.7, +2.7
Apr 06, 0.0, -12.4, 2.7, +12.4
The numbers are fairly remarkable. April ‘06 was warmer than any previously recorded May, and January ‘06 was warmer than any previously recorded April. The previously warmest April was -7.0C (1996). There is currently an absence of sea ice off much of the coast of Svalbard, which is also unprecedented for so early in the year.
The April mean temperature is almost 5 standard deviations above the mean, a “5 sigma event” in statistical parlance. Under the assumption of stationary ‘normal’ statistics, such an event is considered astronomically improbable (< 1 in 10^6), and, like the summer heat wave in Europe in 2003 (which was a 5 sigma event in Switzerland, 3 sigma over Europe as a whole), deserves special attention. As we have nonetheless remarked before on RC, particular events, even seasonally-persistent anomalies as unusual as these, do not “prove” anthropogenic warming. But in a statistical sense, large outliers like this make it more probable that the underlying distributions are shifting and give us a glimpse into the types of anomalies we might expect to become more common in the decades ahead.
In their article on RC, a number of claims are made, including that the recent unusually warm April temperatures in Svalbard, Norway, represented a “five sigma event”.
“Sigma” is the distance of a data point from the average of a data set, as measured by the standard deviation. The higher the sigma of an event, the less likely it is that the event is the result of chance. The odds of a “five sigma event” happening by chance are less than one in a million. If this were actually a “five sigma event”, then the reported warm April 2006 Svalbard temperatures would be greatly outside the normal temperature range, and would thus represent a very significant occurrence with important implications.
This was an extraordinary claim which, if true, would have had great importance for detection of climate change in the region of Svalbard. Therefore, replication of the finding was warranted. This replication required use of the source data for the calculation conducted by Mann & Jones and reported in the article. The text quoted above states that this data is archived at CRU.
My investigation failed to reveal the data at CRU, but I found that the GISS temperature database does have a record of temperatures measured at Svalbard. However, although the data reported on RC was for the period “1961-1990”, the GISS Svalbard record contains no data prior to 1977. In an attempt to resolve this problem I tried to post a comment to the RC thread. The comment I tried to post said in full:
According to GISS, the Svalbard station only started reporting temperatures in mid 1977. How did you determine the “long-term (1961–1990) means (“ybar”) and standard deviations (“sd”)”? Thanks, w.
This seemed to be a reasonable question, but it was not published by RC and did not appear on the web site. I did not receive any answer to my question either in public or as personal correspondence. Therefore, I re-sent the question, and suggested that my original post might have been caught in the spam filter. This was answered on RC, but the entire answer said:
[Response: The comment was most likely deleted because it was already indicated above that the record is available back to 1911, and the source of the data (Climatic Research Unit, not GISS) was already indicated. —mike]
Presumably, the “Mike” who answered the question was Michael Mann. It seems that my question had lacked precision, and that I should have asked exactly where I could find their dataset. But their answer repeated that the data was from CRU, so I went again to look at the available data at CRU. Again I failed to find the relevant data. In the meantime, Hans Erren picked up the topic, in a posting that was accepted and posted by RC which said:
Fair enough, but Svalbard Luft started in October 1977 and Isfjord Radio stopped with continuous recording in June 1976. How was the homogenization obtained between the two stations. Moreover the nearest other station Bjornoya has the 50’s hotter than the 00’s, Which suggests an inhomogeneity in Svalbard in 1977. Here is the giss list:
0 km (*) Svalbard Luft 78.2 N 15.5 E 634010080002 rural area 1977–2006
47 km (*) Isfjord Radio 78.1 N 13.6 E 634010050010 rural area 1912–1980
425 km (*) Bjornoya 74.5 N 19.0 E 634010280003 rural area 1949–2006
RC posted a reply to that, saying in full:
[Response: For any further details, you and any other interested readers should refer to the linked CRU website for information and references leading to an extensive body of literature that describes how CRU forms representative composite 5 degree latitude × longitude gridbox estimates (which is what is referred to here) that account for time-dependent sampling variations and potential inhomogeneities associated with the individual recording stations that fall within the same grid cell. – mike]
Once again I examined the “linked CRU website”. Still finding nothing, I asked directly for the source of their data, saying in an attempted post to RC (in full):
Mike, thank you for your answer to Hans Erren. You say:
For any further details, you and any other interested readers should refer to the linked CRU website for information and references leading to an extensive body of literature that describes how CRU forms representative composite 5 degree latitude × longitude gridbox estimates (which is what is referred to here) that account for time-dependent sampling variations and potential inhomogeneities associated with the individual recording stations that fall within the same grid cell.
Now I’m confused. You say that you are referring to the representative composite 5° × 5° gridbox temperatures … but in your original post, you say “Here are the recent Svalbard monthly surface temperature measurements …” and you compare them to the December ‘05 to April ‘06 Svalbard surface temperature measurements. You also refer to a check made with the Norwegian Met Service to make sure that your figures were correct, which clearly means that they are not gridbox temperatures.
Could you clarify which dataset you used for your calculations, the merged Svalbard/Isfjord Radio surface temperature dataset, or the 5° × 5° gridbox temperatures?
I ask in part because, despite an extensive search the CRU site, I cannot find any dataset for Svalbard or Isfjord Radio. Is the CRU the source of your dataset, and if so, where?
Many thanks for your clarification, w.
This comment was a simple request for the actual location of the dataset, but once again it was not published, and did not appear on RC. Again, I received no reply, either by email or on the web. I again tried to obtain an answer by attempting to post the following comment:
Thank you, Mike. Much appreciated. However, the “source of the data” (CRU) doesn’t make individual station records available, so I had to make do with what I had.
Upon further research, I found that the complete spliced dataset is available at http://www.unaami.noaa.gov/analyses/sat/#table.
In any case, the record is a spliced record, with a one year period between the two records without any data. I’m curious about the justification for the splice, since there is no overlap, and I’m curious that a spliced record with no overlap between the two station records would be used for this type of statistical procedure.
There is a larger problem with the claim, however. This is that you have taken the mean and standard deviation of a subset of the data (1961–1990), and are using these figures to derive sigma figures for a time period which is out of the subset (Dec ‘05 to April ‘06).
Mathematically, this is simply not correct. For example, could we take the mean and SD of a ten year subset, say 1983–1992, and compare that to April ‘06? If we do, we will get a very different answer. By that measure, the odds of April ‘06 are only about one in a hundred.
The only way to do a sigma calculation is to include all of the data when calculating the mean and standard deviation. Otherwise, the answer depends on which subset of the data is chosen, which is clearly wrong.
This is a very serious error, which materially affects the stated results. I expect that, in conformance with your stated policies, my message (being scientific in nature, not a troll, and not abusive in any way) will not be censored. Many thanks, w.
This posting was also not published, despite the reminder of the stated policy of RC. The censoring of this comment is in direct contravention of their stated policy which says in part:
1) Questions, clarifications and serious rebuttals and discussions are welcomed.
2) Only comments that are germane to the post will be approved. Posts that only contain links to inappropriate, irrelevant or commercial sites will be deleted.
3) No flames, profanity, ad hominen comments, or you said/he said type arguments.
4) As stated in the blog description, no discussions of non-scientific subjects will be allowed.
Although my attempted post was a germane, scientific, and serious rebuttal of their claim, it was not “welcomed” as they claimed; instead, it was not published at all. Soon afterwards, however, a curious “update” from Michael Mann appeared on RC which said:
[Response:(update) I was a bit careless in my wording. The long-term series from 1911 to date currently maintained by the Norwegian Met Service (and to which we are referring to here) is based on the combined record for the Svalbard airport and Isfjord Radio sites. The latter started in 1911. The Norwegian Met Service combined the two records in one of the recent nordic projects, and any inhomogeneities were taken into account in the process. There is little doubt that the anomalies observed so far this year are unprecedented as far back as the measurements go (early 20th century). – mike]
This was very strange. In the original posting (quoted above), RC clearly said the data was station data from Svalbard maintained at CRU. Then in their first response to me, they again cited CRU as the source, saying “the source of the data (Climatic Research Unit, not GISS) was already indicated”. In their response to Hans Erren (quoted above), they said they were discussing, not CRU station data, but “composite 5 degree latitude × longitude gridbox estimates (which is what is referred to here)”.
Finally, they said they had used a totally different dataset, not maintained by CRU, but by the Norwegian Met Service. This is not “careless wording” — either it is three very large but honest errors, or it is an attempt to obscure the data source.
Then on the 28th of May, RC closed the thread without further comment or explanation. The thread had existed for only 6 days from posting to closure, which is very much shorter than other threads on RC. A number of very important questions remained unresolved. By censoring posts and closing down the thread, RC has made it impossible to resolve those questions on the thread, making it worthwhile to discuss them here.
The Problems With The Svalbard Record
I eventually found the Svalbard dataset, although Mann & Jones never revealed its location despite repeated requests for the exact place where it could be found.
The Svalbard dataset is available here. The file, Nordklim_data_set_v1_0_2002.xls, is a 9.1 meg Excel file containing a variety of climate data for a number of Norwegian sites. The Svalbard station has the Norwegian station ID 99840, and the Svalbard monthly mean temperature data starts on line 4639 of the sheet called “101”.
As Mann finally admitted, the source of the Svalbard record is a “spliced” dataset, meaning it is a combination of the Svalbard and Isfjord Radio records. He does not mention, however, and may not have known, that the location of the Isfjord Radio station was changed three times. The station then fell into German hands during WWII, and subsequently was returned to Norwegian control at the end of WWII. The Norwegian scientists have done an admirable job combining the data from these four locations into one record. They have made a number of adjustments to each section of the Isfjord Radio dataset to try to make it have relevance to the final dataset, that of Svalbard. The complete merged Svalbard record is provided here as Figure 1.
Figure 1: The Svalbard temperature data set. The blue line is monthly temperatures. The black line is a 35 month (+/- 3 std dev) Gaussian average of the monthly temperatures.
There are two things of note about this record. One is that there is no sign of any recent “polar amplification” of warming that was claimed on RC. On the contrary, the warming occurred at the beginning of the record, not at the end. Second, the record shows no signs of any recent significant warming. For the period 1920–2006, there is no statistically significant trend in the data. The 1920–2006 trend is 0.04°/decade, with a two standard deviation error of 0.08° per decade, which means it is not a significant trend. The 1970–2006 trend is 0.04°/decade, with a two standard deviation error of 0.2° per decade, which again is not significant.
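The significance test used here (a trend counts as significant only if it is larger than its two standard deviation error) can be sketched as follows. This is a minimal illustration assuming an ordinary least-squares fit with independent residuals; the function names are hypothetical, and the paper does not say exactly how its error figures were computed. With autocorrelated data the true error would be larger than this simple estimate.

```python
import math

def trend_with_two_sd(xs, ys):
    """Ordinary least-squares slope and twice its standard error.

    Assumes independent residuals (an assumption of this sketch,
    not a statement about how the paper's errors were derived).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    standard_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, 2.0 * standard_error

def is_significant(slope, two_sd_error):
    """A trend is treated as significant only if it exceeds its 2 SD error."""
    return abs(slope) > two_sd_error

# The two trends quoted in the text, in degrees per decade.
print(is_significant(0.04, 0.08))  # 1920-2006
print(is_significant(0.04, 0.2))   # 1970-2006
```

Both quoted trends are smaller than their two standard deviation errors, which is why neither is treated as significant.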
Finally, it must be remembered that this record is a spliced record from three Isfjord Radio locations in one area (with data collected by two different countries at different times), plus the Svalbard record from a location in a different climate zone 43 kilometres away. Since it is a spliced record from four different locations, any statistical conclusions from the record must be suspect. With that caution in mind, let us look at the claims made by Mann and Jones in their article published on RC.
THE ERRORS IN THE MANN-JONES ANALYSIS
Inflated Claims
According to their own figures, this was not a “five sigma event” with odds of less than one in 10^6. It was a 4.6 sigma event (12.4 / 2.7), which is roughly an order of magnitude more likely to occur by chance, with odds of less than one in 10^5. (An “order of magnitude” means one number is ten times larger or smaller than another.)
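The sigma of each month can be recomputed directly from the table published on RC, since sigma is just the departure from the mean divided by the standard deviation. The sketch below uses only the figures quoted in that table; the variable and function names are illustrative.

```python
# Rows as published on RC: (month, value, ybar, sd), all in degrees C.
RC_TABLE = [
    ("Dec 05", -3.8, -13.3, 4.4),
    ("Jan 06", -2.7, -15.3, 4.7),
    ("Feb 06", -9.8, -16.3, 3.7),
    ("Mar 06", -13.1, -15.8, 3.7),
    ("Apr 06", 0.0, -12.4, 2.7),
]

def sigma(value, mean, sd):
    """Distance of a value from the mean, in units of standard deviation."""
    return (value - mean) / sd

sigmas = {month: sigma(value, ybar, sd) for month, value, ybar, sd in RC_TABLE}

for month, s in sigmas.items():
    print(f"{month}: {s:.2f} sigma")
```

Even on the published figures, no month in the table reaches five sigma.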
Improper Choice of Statistical Methods
As they stated, the authors used statistics designed for stationary normal datasets. However, the temperature dataset they used is neither stationary (which in this context corresponds to trendless) nor normal. Both the standard deviation and the average change over time. Because of this, the results depend on the period chosen for the baseline. Figure 2 shows the variation in the results for different thirty year periods.
To illustrate the effect of the choice of the exact thirty year period, I calculated the “sigma” of the April 2006 temperature using all possible thirty year periods in the record. Using different thirty year periods, the odds of the April 2006 Svalbard temperatures range from a low of one in 28, to a high of one in 5,392,753. Hence, one could pick any period to prove any point. The “five sigma event” claimed by Mann & Jones is merely one of many possibilities.
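The sliding-baseline calculation described above can be sketched as follows. The real Svalbard April series is not reproduced here, so the example runs on a synthetic placeholder series of made-up values; the function name, the use of the sample standard deviation, and the synthetic parameters are all assumptions of this sketch.

```python
import random
import statistics

def sigma_by_baseline(values, target, window=30):
    """Sigma of `target` against every consecutive `window`-value baseline."""
    results = []
    for start in range(len(values) - window + 1):
        baseline = values[start:start + window]
        mean = statistics.mean(baseline)
        sd = statistics.stdev(baseline)  # sample standard deviation (n - 1)
        results.append((target - mean) / sd)
    return results

# Synthetic placeholder data, NOT the Svalbard record: 94 made-up April
# means drawn from a normal distribution, used only to exercise the code.
rng = random.Random(0)
synthetic_aprils = [rng.gauss(-12.0, 3.0) for _ in range(94)]

sigmas = sigma_by_baseline(synthetic_aprils, target=0.0)
print(f"{len(sigmas)} baselines, sigma from {min(sigmas):.2f} to {max(sigmas):.2f}")
```

Even for a stationary synthetic series the sigma varies from baseline to baseline; for a record whose mean and spread change over time, the variation is larger still.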
In addition, the authors neglect to give error figures for their result. The green line in Figure 2 shows the error bar (+/- 2SD) for their results (using the ’61–’90 data).
Incorrect Calculation of Standard Deviation
It appears that the Mann & Jones calculations for the 1961–1990 period contain an error. Although the averages they published on RC were exactly correct, their calculated standard deviations (sd) were not. While it is possible that this is from some corrections made to the data (removal of incorrect data points etc.), since they have not made their data available it is not possible to say whether this is actually an error or is an unrevealed alteration of the data. Here are their sd values, compared to the actual sd values for the 1961–1990 period.
Month, M&J sd, Actual sd
Dec 05, 4.4, 4.7
Jan 06, 4.7, 5.1
Feb 06, 3.7, 4.0
Mar 06, 3.7, 4.2
Apr 06, 2.7, 3.0
It is of interest that in all cases, the effect of their error was to reduce the calculated standard deviation, and thus increase the sigma. In an attempt to determine the source of their erroneous standard deviations, I calculated the standard deviation of all possible periods in the record. There are only three periods where the five months have a standard deviation that matches sd values published on RC by Mann & Jones. These periods are 53 years starting in 1938, 52 years starting in 1939, and 51 years starting in 1940. While this was probably an honest error, Mann & Jones chose one of the lowest periods in the record. For example, of the 44 fifty-one year periods in the record, only 9 of them give smaller standard deviations and thus larger sigmas. Once again, their error artificially increases the calculated sigma.
Figure 2: Average, standard deviation, and sigma for 30 year (trailing) periods of the Svalbard April temperature data set. The blue line is the thirty year trailing mean (right scale). The red line is the 30 year trailing standard deviation. The black line is the sigma of the April ‘06 temperature record, based on that mean and standard deviation. The green line is the error bar (+/-2SD) for the sigma of the authors’ chosen period (61–90).
While the post 1990 average is rising, this rise is not statistically significant.
Lack of Correction For Autocorrelation
At the beginning of their statistical section (above), Mann & Jones say “Under the assumption of stationary ‘normal’ statistics, …”. A “normal” dataset is one that obeys the standard laws of random events (see “Further Reading” at the end for more information on normal distributions).
However, their assumption of stationary normal distribution is totally unwarranted. As professionals in the field, they know that temperature series are generally not “stationary ‘normal’” distributions. In particular, even a cursory examination shows that the April temperatures of this dataset are significantly non-stationary and non-normal. Both the average and the standard deviation of the April dataset change over time (see Figure 2 above), showing that they are not stationary. The dataset is also what is called “fat-tailed”, which means that it contains an unusually large number of temperatures which are either higher or lower than the expected range. Figure 3 shows the Svalbard temperature data compared to a normal distribution. The Jarque-Bera Test for normality finds the Svalbard dataset non-normal with p<0.02.
Figure 3. Histogram of Svalbard Data. Blue dots show midpoints of bins for data. Red line shows normal distribution given the same bins.
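For readers unfamiliar with the Jarque-Bera test, the sketch below shows how the statistic is built from the skewness and kurtosis of a sample. It uses the standard textbook form with the asymptotic chi-square (2 degrees of freedom) p-value, which is only approximate for small samples; it is not necessarily the exact procedure or software used for the result quoted above, and the function name is illustrative.

```python
import math

def jarque_bera(values):
    """Jarque-Bera statistic and asymptotic p-value for a sample.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness
    and K the sample kurtosis, both from biased (divide-by-n) moments.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square distribution with 2 degrees of freedom the
    # survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value
```

A small p-value means the sample's skewness and kurtosis are unlikely under a normal distribution; a fat-tailed sample shows up as kurtosis well above 3.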
Temperature series are also generally “autocorrelated”, meaning that a hot year is often followed by another hot year, and vice versa. Because of the non-normality and the autocorrelation of the Svalbard April dataset, standard statistical tests such as they have used give inaccurate results. In particular, they produce artificially inflated sigma values. They also often indicate significant trends where no such significant trend exists.
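One common way to quantify autocorrelation is the lag-1 autocorrelation coefficient, together with the "effective sample size" it implies under a first-order autoregressive model. The sketch below shows both. This is an assumed, generic adjustment offered for illustration; the paper does not state which correction was applied, and the function names are hypothetical.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator

def effective_sample_size(values):
    """Number of effectively independent values, assuming an AR(1) process.

    n_eff = n * (1 - r1) / (1 + r1); positive autocorrelation (r1 > 0)
    gives n_eff < n, which widens error estimates.
    """
    r1 = lag1_autocorrelation(values)
    return len(values) * (1.0 - r1) / (1.0 + r1)
```

When hot years tend to follow hot years, r1 is positive and the record contains fewer independent observations than its length suggests, so tests that treat every year as independent overstate significance.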
The True Sigma Value Of The Claimed ‘5 Sigma Event’
The errors of Mann & Jones all combine to artificially increase the apparent significance of the April temperature record. Let me correct them, one at a time.
a) The odds of a five-sigma event occurring by chance are one in 1,744,000. This is the “less than one in 10^6” figure Mann & Jones claimed in their article.
b) However, according to their own figures, this was a 4.6 sigma event. The odds of this are one in 236,700, about an order of magnitude smaller than their claim.
c) In addition, they made an error in their figures. When their erroneous 1961–1990 standard deviation calculations are corrected, it becomes a 4.2 sigma event. The odds drop by another order of magnitude, to one in 37,465.
d) Next, we can get a more accurate idea of the sigma by using the average and standard deviation of the full dataset. Using the full dataset, April 2006 represents a 3.4 sigma event, with odds of one in 1,484.
e) Finally, when the calculations are corrected for autocorrelation, we get a sigma of 2.87. The odds of this are one in 244.
So correcting for the above errors in the calculations reduces the sigma of the April 2006 temperatures from their claimed “5 sigma event” to a more accurate 2.87 sigma event. Corresponding to the drop in the sigma, the odds of the April 2006 temperature occurring by chance change from one in 1,744,000 to one in 244. And these corrections do not include correction for the non-normality of the dataset in any measure except autocorrelation.
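The odds quoted in steps a) to e) appear consistent with the two-sided tail probability of a normal distribution, that is, the chance of a value at least that many standard deviations from the mean in either direction. A minimal sketch of that conversion, assuming the two-sided convention (the function name is illustrative):

```python
import math

def odds_against(sigma):
    """Return N such that a two-sided normal event has odds of 'one in N'."""
    # erfc(z / sqrt(2)) is the probability that a standard normal value
    # lies more than z standard deviations from the mean, in either tail.
    probability = math.erfc(sigma / math.sqrt(2.0))
    return 1.0 / probability

for s in (5.0, 4.6, 4.2, 3.4, 2.87):
    print(f"{s:.2f} sigma -> one in {odds_against(s):,.0f}")
```

Running this reproduces, to rounding, the figures listed in steps a) to e).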
In other words, the April 2006 temperature in Svalbard is not statistically unusual in any way. It does not indicate “enhanced warming”. It is not “astronomically improbable”. It does not support the claimed “polar amplification”. It does not “make it more probable that the underlying distributions are shifting.” In fact, it doesn’t indicate anything at all—it’s just another April in Svalbard.
CONCLUSIONS
The ‘Svalbard Affair’ is a clear demonstration of the problems that can be posed to scientific standards by the increasing use of ‘web publication’ that by-passes the traditional method of scientific publication. Errors that could be expected to be detected by peer review can be published on web sites. In the case of the ‘Svalbard Affair’ the problems with the published work were so obvious that they were immediately seen and reported to the authors. But the response of the authors was to try to stop any questioning of the matter, which would have been difficult had traditional publication been used. Other less obvious cases may go undetected before data published on the web have become used in related studies by other workers.
Indeed, the analysis of Svalbard temperatures published on the RC web site continues to be used after the ‘Svalbard Affair’. For example, the UKCIP (UK Climate Impacts Programme) is based at the University of Oxford and is funded by the UK government to help organizations assess how they will be affected by climate change so they can prepare for its impacts. On 12 June 2006 (i.e. weeks after RC closed the Svalbard thread) UKCIP circulated an email newsletter saying:
“5. Artic study demonstrates temperature rise
Real Climate has published a paper detailing new climate observations in Svalbard, an Arctic island in the North Atlantic. Observed temperatures this year are far warmer than comparable data from previous years. In statistical terms, the April 2006 temperature difference from average is equivalent (if not greater) to the summer 2003 heat wave in NW Europe. You can read the article at http://www.realclimate.org/index.php?p=309”
It is difficult to imagine a clearer example of how ‘web publication’ can result in effects that are not justified by the worth of the published work. The errors of Mann & Jones in the ‘Svalbard Affair’ would very probably have prevented their analysis passing through peer review for publication in the traditional manner.
The detected errors of the RC authors in the ‘Svalbard Affair’ were:
1) They exaggerated their own calculations by a factor of ten, from calculated odds of less than one in 10^5 to claimed odds of less than one in 10^6 (seemingly for shock effect).
2) They used an inaccurate procedure, calculating the sigma from a subset of the data rather than from the full data set.
3) They did not correctly calculate the standard deviation of the 1961–1990 subset.
4) Although they know that temperature records are rarely “stationary, ‘normal’” datasets, and if they had looked they would have known that this temperature dataset was not a “stationary, ‘normal’” dataset, they used statistical procedures designed solely for use with stationary normal datasets.
5) They were either “careless in their wording” in specifying three different incorrect locations as the source of their data (they twice claimed it was CRU station data, then said it was CRU 5° × 5° gridbox data), or they did not know the source of their data, or they were deliberately obscuring the source of their data.
6) Rather than choosing a complete record from a single station for their analysis, they used a spliced record from four different locations. This makes the result of any statistical analysis very questionable. They did not reveal this in their initial posting, and but for the prodding of Hans Erren and myself, the fact would have gone unremarked.
7) They claimed “enhanced warming in this region” and a “polar amplification” of warming. However, not only is there no polar amplification in the Svalbard record, there is no significant warming in most of the Svalbard record (1920–2006 trend is 0.04° per decade +/– 0.08°C per decade [2 SD]). Even if one were to take their numbers at face value and to ignore the lack of statistical significance, this is only 60% of the reported global warming for that period, which is the opposite of “amplification”.
8) Despite repeated requests, they never indicated the actual web address of the dataset.
9) When scientific questions were asked about some of these errors, they did not publish the questions. When the questions were asked again, they did not publish the new questions.
10) To shut off the scientific questions entirely, they closed down the thread after only six days.
The last three of these problems are the most disturbing. Everyone makes mistakes, and some scientists have been known to try various types of unusual or doubtful methods with the data and calculations, or to exaggerate their claims. It is part of science. But to refuse to publish important scientific questions, to obfuscate regarding the source of your data, and to close down discussion when someone points out an error, is the antithesis of science.
Science proceeds by trial and error, and the finding and subsequent publication of scientific mistakes is a critical part of that process. The participation of Drs. Mann and Jones in this affair clouds their scientific reputations, and for the RealClimate web site to claim it welcomes scientific discussion when in fact it shuts out scientific discussion is a grave misrepresentation of the actual situation that may not be known to many who use the site as an information resource.
I am not an opponent of the publishing of scientific information on the web. I think that it is a very good way to get new ideas out into the scientific community, and serves as an excellent complement to traditional scientific journals. However, as is the case with publication in scientific journals, if scientists choose to use this method they need to publicly answer scientific questions about their claims. To do otherwise brings all of their claims into question.
I would welcome any comments from Drs. Mann and Jones, or from any of the principals of RealClimate such as Gavin Schmidt, regarding any of these issues. Unfortunately, despite repeated attempts to get them to answer even one of these questions, to date Drs. Mann, Jones, and Schmidt have bluntly refused to address these matters that seem to be egregious, grave and serious errors of fact, method, and scientific conduct.
Willis Eschenbach 6 June 2006
FURTHER READING
There is an excellent explanation of statistics as applied to temperature series called “Statistical Issues Regarding Trends”, by Tom M. L. Wigley. It is available at Climatescience.gov/Library/sap/sap1-1/finalreport/sap1-1-final-appA.pdf. It is a short paper, and is well worth reading by anyone interested in the theoretical underpinnings of the statistical concepts I have discussed in this paper.
———————————————————————–
FINAL THOUGHTS:
1. It’s always interesting to re-read something I wrote a while ago. I’d write it different now, but that’s how life works.
2. The high April temperatures in 2006 have not been a harbinger that “the underlying distributions are shifting” as Mann and Jones claimed. Svalbard temperatures peaked in 2006, and have dropped since then:
Figure N3. Monthly and annual temperatures in Svalbard. Photo is of Longyearbyen, Svalbard, the town where the current temperature station is located.
3. When scientific studies are published on the Web, it is crucial that the authors stick around and answer all relevant scientific questions. It is also vital that scientific questions are not censored. Watts Up With That is a good example of a web site doing this the right way. Yes, some of the claims posted here are wrong. I’ve been wrong on WUWT before. But I’ve read and attempted to answer all scientific questions, and acknowledged my errors when they are pointed out. This is how science advances.
4. Given our current knowledge of the emails and the evasion of Freedom of Information requests at CRU, the Mann/Jones claim that “the source of the data (Climatic Research Unit, not GISS) was already indicated” is particularly risible.
Final Thought 4: Yep. Climategate was one big “Told ya so”!!!
The arrogance, cowardice, and disingenuity of the so-called scientists running RC never cease to amaze.
You have something going until you hit this spot,
‘According to their own figures, this was not a “five sigma event” with odds of less than one in 10^6. It was a 4.6 sigma event, which has a probability a full order of magnitude smaller, less than one in 10^5. (An “order of magnitude” means one number is ten times larger or smaller than another.)’
One order of magnitude between 100000 and 10000 is not a significant difference when talking about one event. Either way, 5 sigmas or 4.6 sigmas, it’s still an extraordinarily small chance of it happening.
Getting down to the 1 in 200 range is significant, but as you point out, it’s questionable how significant a ‘spliced’ set of data is in assessing the temperature changes that have occurred over the last 80 or so years.
Also,
‘It is of interest that in all cases, the effect of their error was to reduce the calculated standard deviation, and thus increase the sigma.’
is contradictory. Sigma is the standard deviation. If we are reducing the standard deviation, then we are also reducing sigma. Reducing the standard deviation will make a single point seem farther from the mean, thus increasing the number of sigmas that single point is away from the mean. That may have been the point you were trying to make, but you’re saying it in a roundabout and confusing way. It just needs to be cleared up.
As for,
‘While the post 1990 average is rising, this rise is not statistically significant.’
what significance test/s did you use to determine whether this trend is ‘significant’ or not?
I think that you’re correct in your assessment in the way that RC dealt with this particular post and the comments you brought up. It’s likely that they felt no one would call out that type of aspect of the analysis.
Thanks for the post and the place to comment.
Cheers.
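For readers who want to put numbers on the sigma discussion in the comment above, here is a minimal sketch. It assumes a normal distribution and independent observations, which real temperature series only approximate, and the station values in the second half are hypothetical, chosen only to show the direction of the effect.

```python
import math


def two_sided_tail(sigmas):
    """Probability that a normal variable lies more than `sigmas` SD from its mean."""
    return math.erfc(sigmas / math.sqrt(2))


# Tail probabilities for the sigma levels discussed in the thread.
for s in (3.0, 4.6, 5.0):
    p = two_sided_tail(s)
    print(f"{s} sigma: p = {p:.2e}, about 1 in {1 / p:,.0f}")


def sigmas_from_mean(value, mean, sd):
    """Number of standard deviations between a value and the mean."""
    return (value - mean) / sd


# Hypothetical numbers: the same reading looks more extreme when the
# standard deviation is underestimated.
print(sigmas_from_mean(2.0, -10.0, 4.0))  # SD of 4 gives 3.0 sigma
print(sigmas_from_mean(2.0, -10.0, 3.0))  # SD of 3 gives 4.0 sigma
```

Under these assumptions the 4.6 sigma and 5 sigma tail probabilities differ by a factor of about seven, and shrinking the standard deviation pushes the same reading further out in sigma terms.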
Thanks, Willis!
I found it interesting that you pitched this as primarily a critique on web-publishing of scientific documents.
Reading it, it is also a detailed refutation of a web-published paper, a strong criticism of the site in question, and a criticism of the scientists involved. Did you first consider publishing it only as a refutation of the web-published paper and add the critique parts later, or were they an integral part of the development of this paper?
Re the author’s points 1-10 “… detected errors of the RC authors in the ‘Svalbard Affair…’
What is the probability that all of these errors could occur in one article, written by professional scientists, by chance?
10 May: Space and Science Research Center: Food and Ethanol Shortages Imminent as Earth Enters New Cold Climate Era
Press Release SSRC 2-2010 :
The Space and Science Research Center (SSRC), the leading independent research organization in the United States on the subject of the next climate change, issues today the following warning of imminent crop damage expected to produce food and ethanol shortages for the US and Canada:
Over the next 30 months, global temperatures are expected to make another dramatic drop even greater than that seen during the 2007-2008 period. As the Earth’s current El Nino dissipates, the planet will return to the long term temperature decline brought on by the Sun’s historic reduction in output, the on-going “solar hibernation.” In follow-up to the specific global temperature forecast posted in SSRC Press Release 4-2009, the SSRC advises that in order to return to the long term decline slope from the current El Nino induced high temperatures, a significant global cold weather re-direction must occur. According to SSRC Director John Casey, “The Earth typically makes adjustments in major temperature spikes within two to three years. In this case as we cool down from El Nino, we are dealing with the combined effects of this planetary thermodynamic normalization and the influence of the more powerful underlying global temperature downturn brought on by the solar hibernation. Both forces will present the first opportunity since the period of Sun-caused global warming period ended to witness obvious harmful agricultural impacts of the new cold climate. Analysis shows that food and crop derived fuel will for the first time, become threatened in the next two and a half years. Though the SSRC does not get involved with short term weather prediction, it would not be unusual to see these ill-effects this year much less within the next 30 months.”
The SSRC further adds that the severity of this projected near term decline may be on the order of 0.9 C to 1.1 C from present levels. Surprising cold weather fronts will adversely impact all northern grain crops including of course wheat and the corn used in ethanol for automotive fuel….ETC
http://www.spaceandscience.net/id16.html
So now they are hiding their OWN decline! LOL
They decline to answer questions. They decline to show their data. They decline to even allow further questions. They decline, indeed, to be scientists because they decline to follow the scientific method and they decline to admit to mistakes they’ve made while appearing to be scientists…
And so they continue to manufacture false premises and as they announce one falsified horror after another, their coffers fill up with government Grants fed by the fear they create.
And so the decline of Science continues into the abyss of Political venality…
I used statistical analysis for many years in an industrial experimentation context. I can appreciate the complexity of analysis when not dealing with a designed experiment but rather analyzing “raw” data. (As was often the case to determine the experimental “bite size” of the designed study to follow.)
Getting “corrections” after the fact and especially after recommendations were provided to management were particularly onerous. Care must be taken when straying from what you know by using methods and techniques that can be helpful when used correctly and extremely misleading when used improperly.
A cautionary tale with “significant” implications.
What are the odds that all of Mann’s errors were unintentional and unbiased? The answer is probably a six-sigma event.
Willis,
No disrespect, but it seems like a strange thing to write about in a peer-reviewed paper. Of what interest to academia is the to-and-fro of emails and blog postings between you and Mann, or even the incorrect posting of data on a blog? What is the academic value of your E&E paper?
“Real Climate Science by Real Climate Scientists”
In the light of recent events, this is unfortunately all too true. Which is a sad commentary on the state of “climate science”.
With the current state of Peer Review this might have been published in Nature or Science with your conclusions also being ignored.
Some pretty dodgy stuff gets through peer review as well – web publishing as exemplified by this post might well bring a new layer of accountability because of greater transparency. It also circumvents some of the covert censorship inherent in peer review processes. Ultimately, the greater market freedom of the web (while more anarchic) may allow for frank and free discussions.
In a geological context, perhaps it’s worth mentioning that, 56 to 34 million years ago, during the Eocene, the island of Spitsbergen, which is in the Svalbard archipelago, had alligators.
And I know what you all are thinking. No, it wasn’t at a much lower latitude. The Arctic was just MUCH warmer back then.
HR says:
May 12, 2010 at 8:13 pm
That’s an interesting question, HR. More and more science is being done in the context of the Web, from pre-publication of interesting ideas to the type of analysis published by RC about Svalbard. All of this is (or should be) of interest to academia.
More to the point, when scientists such as Mann and Jones misbehave in public on a very visible blog, it is damaging to all of science, academia included. Academics seem to think that what happens in the blogosphere is of no interest to them. But millions of people (including academics) read blogs such as this one, and form their opinions of both the science and the academics from what they read.
As such, it behooves us all to behave “scientifically” when posting about science. For me this means that the work (like all scientific work) needs to be transparent, well cited, with the data clearly identified and code and methods explained, and that objections to the work be answered or at least acknowledged. You know, the standard scientific stuff that Mann and Jones didn’t do.
If we don’t do that, all of science suffers … including the academics. The arrogance of the people who run and write head posts at RC is one of the reasons that people don’t trust climate scientists much, and that affects us all.
A-pril in Svalbard … Sig-mas in blos-som …
zzzMmmpf! Sorry, must have dozed for a moment. Once again, thanks for an excellent article.
This is yet more confirmation that Mann, Jones, RC & Co. are not doing science, they are desperately doing propaganda to keep a fundamentally ridiculous fraud alive — which has been obvious for over a decade and is made devastatingly clear in the Climategate emails and programmer’s notes.
maxwell says:
May 12, 2010 at 7:28 pm
My point exactly. Their claim that the event was extraordinary and highly significant was not true. It was unusual … but not unusually so.
I wonder why Willis’ stymied search for responses and accurate information from Mann & Jones sounds so ummm, …familiar?
Willis,
Awesome piece of detective work, wow.
That said, I followed the link in your post to the article on polar amplification on RC. I noticed something rather odd in the article.
Fig. 1 in the article quotes a 1980 study predicting (modeling?) temperature increase by latitude and height for CO2 quadrupling. Not double, quadruple. Based on the last few decades of CO2 measurements that would take over 600 years. They seem to show (I think) that surface temps in the tropics will rise about 3 degrees and arctic temps about 9. But then… look at Fig 2 which shows the surface temperature warming in degrees PER CENTURY from a DIFFERENT study and shows the arctic at…. 9! Now if in 1980 some wild eyed climatologists thought we would be averaging 600 million barrels per day by now, I can roll my eyes and laugh. What’s the excuse for still using that number in 2010? Or am I reading this wrong?
BTW – nothing in the article about surface temps being much lower in the arctic in the first place and hence by Stefan-Boltzmann more sensitive to CO2 forcing, which I find equally odd in an article about why the polar regions should be expected to heat up faster!
“What’s the excuse for still using that number in 2010?”
I should be in bed already. The date on the RC article is 2006. Which doesn’t really change the question.
Graeme W says:
May 12, 2010 at 7:32 pm
It was written as all of those. I was upset by the bogus scientific claims, outraged that they would not allow even simple questions about their work, and astounded that they either didn’t know or wouldn’t say where they got the data … so I wrote about it all.
The biggest statistical error is not accounting for selection bias.
How many stations are there to choose from? If there are 100, then the odds of a 2 or 3 sigma event happening at one station are pretty high.
Now if they want to pick a SINGLE station this year to be analyzed for outliers NEXT year, then I might pay attention. Otherwise, it is just cherry picking.
I would not be surprised if after correcting for the number of stations available that this result becomes a 1 sigma event.
James
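The selection-bias point in the comment above can be roughed out numerically. This is a sketch only: it assumes normally distributed, fully independent station records, which real neighbouring stations are not, so it overstates the effect. The station count of 100 is the commenter's hypothetical, and the 1200 figure (100 stations times 12 calendar months) is an added assumption for illustration.

```python
import math


def two_sided_tail(sigmas):
    """Probability that a normal variable lies more than `sigmas` SD from its mean."""
    return math.erfc(sigmas / math.sqrt(2))


def prob_at_least_one(sigmas, n_series):
    """Chance that at least one of n independent series shows an event this extreme."""
    p = two_sided_tail(sigmas)
    return 1.0 - (1.0 - p) ** n_series


# 1 series, 100 stations, and 100 stations x 12 calendar months.
for n in (1, 100, 1200):
    for s in (2.0, 3.0):
        chance = prob_at_least_one(s, n)
        print(f"{n:5d} series, {s} sigma: {chance:.3f}")
```

Under these assumptions a 3 sigma reading somewhere among 100 independent stations is not rare, which is why a station picked after the fact says little on its own.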
I note that your last graph shows the same pattern post 1990 that I’ve found in many other locations. A “hockey blade” rising with a ‘clipping’ of low going monthly data. Note that the tops run along about the 5 line, it is the bottoms that make the “trend” in the annual line. As near as I can tell, this is an artifact of the “QA” process that tosses out low going extremes more than high going ones. It is a serious flaw in the product of GHCN and GISS (and one presumes, CRU as well).
A very valuable project would be to gather truly raw daily data from some selected sites and compare it to the computed monthly means. Basically, audit the “QA” process (that despite its name, does more than simply check quality… For the USHCN it will also create and fill in ‘missing data’ with averages from surrounding locations; so if you toss low extreme events and fill them in with averages, what happens to the monthly mean you then calculate from those ‘data’?… ) I suspect a similar thing is done to the GHCN data.
You misspelled “heck” in the title. Touch typing is not for the inexperienced.
As one whose initials are ‘RC’ I take personal offense at the association.