A question about proxies and calibration with the adjusted temperature record

WUWT reader Tom O’Hara writes in with a question that seemed worthwhile to discuss. Paleo specialists can weigh in on this. It seems to me that he has a point, but like him, I don’t know all the nuances of calibrating a proxy. (Graphic at right by Willis Eschenbach, from another discussion.)

O’Hara writes:

[This] is a puzzle to me.

Everything we know about past climate is based on “proxies.”  As I understand the concept, science looks at “stuff” and finds something that tends to mirror the changes in temperature, or whatever, and uses that as a means to determine what the likely temperature would have been at an earlier time.  This is, I am sure, an oversimplified explanation.

So what we have, in essence, is a 150-year-or-so record of temperature readings with which to determine our proxy’s hoped-for accuracy.

Now my question would be, if we are continuously adjusting the “readings” of that record, how does that affect the usefulness of the proxy information?

If I have correlated my proxy to a moving target, doesn’t that affect the likelihood that the proxy will yield useful information?

It would seem to me that this constant massaging of the database used to define and tune my proxy, would, in the end, destroy the utility of my proxy to deliver useful information.  Or have I got it all wrong?

A few primers for discussion:

1. Detecting instabilities in tree-ring proxy calibration – Visser et al.

Abstract. Evidence has been found for reduced sensitivity of tree growth to temperature in a number of forests at high northern latitudes and alpine locations. Furthermore, at some of these sites, emergent subpopulations of trees show negative growth trends with rising temperature. These findings are typically referred to as the “Divergence Problem” (DP). Given the high relevance of paleoclimatic reconstructions for policy-related studies, it is important for dendrochronologists to address this issue of potential model uncertainties associated with the DP. Here we address this issue by proposing a calibration technique, termed “stochastic response function” (SRF), which allows the presence or absence of any instabilities in growth response of trees (or any other climate proxy) to their calibration target to be visualized and detected. Since this framework estimates confidence limits and subsequently provides statistical significance tests, the approach is also very well suited for proxy screening prior to the generation of a climate-reconstruction network.

Two examples of tree growth/climate relationships are provided, one from the North American Arctic treeline and the other from the upper treeline in the European Alps. Instabilities were found to be present where stabilities were reported in the literature, and vice versa, stabilities were found where instabilities were reported. We advise to apply SRFs in future proxy-screening schemes, next to the use of correlations and RE/CE statistics. It will improve the strength of reconstruction hindcasts.

Citation: Visser, H., Büntgen, U., D’Arrigo, R., and Petersen, A. C.: Detecting instabilities in tree-ring proxy calibration, Clim. Past, 6, 367-377, doi:10.5194/cp-6-367-2010, 2010.
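The gist of a calibration-stability check like Visser et al.’s can be sketched numerically. The toy below is a generic moving-window correlation, not the paper’s actual “stochastic response function”, and every number in it is invented: a synthetic proxy tracks temperature for a century and then diverges, and the sliding correlation reveals the break.

```python
import math, random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
years = 150
temp = [0.01 * t + random.gauss(0, 0.3) for t in range(years)]
# The proxy tracks temperature for the first 100 years, then "diverges":
proxy = [temp[t] + random.gauss(0, 0.2) if t < 100 else random.gauss(0, 0.5)
         for t in range(years)]

window = 30
r_by_window = [pearson(temp[i:i + window], proxy[i:i + window])
               for i in range(0, years - window + 1, 10)]
# Early windows show a strong correlation; the last windows show none.
```

A single full-period correlation would average over the break and could easily still “pass” a screening test, which is exactly why window-by-window checks matter.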


From WUWT: A new paper now in open review in the journal Climate of the Past suggests that “modern sample bias” has “seriously compromised” tree-ring temperature reconstructions, producing an “artificial positive signal [e.g. ‘hockey stick’] in the final chronology.”

Basically, older trees grow slower, and that mimics the temperature signal paleo researchers like Mann look for. Unless you correct for this issue, you end up with a false temperature signal, like a hockey stick in modern times. Separating a valid temperature signal from the natural growth pattern of the tree becomes a larger challenge with this correction.  More here



2. Calibration trails using very long instrumental and proxy data – Esper et al. 2008


The European Alps are one of the few places that allow comparisons of natural climate proxies, such as tree-rings, with instrumental and documentary data over multiple centuries. Evidence from local and regional tree-ring analyses in the Alps clearly showed that tree-ring width (TRW) data from high elevation, near treeline environments contain substantial temperature signals (e.g., Büntgen et al. 2005, 2006, Carrer et al. 2007, Frank and Esper 2005a, 2005b, Frank et al. 2005). This sensitivity can be evaluated over longer timescales by comparison with instrumental temperature data recorded in higher elevation (>1,500 m asl) environments back to the early 19th century, and, due to the spatially homogenous temperature field, back to the mid 18th century using observational data from stations surrounding the Alps (Auer et al. 2007, Böhm et al. 2001, Casty et al. 2005, Frank et al. 2007a, Luterbacher et al. 2004). Further, the combination of such instrumental data with even older documentary evidence (Pfister 1999, Brázdil et al. 2005) allows an assessment of temporal coherence changes between tree-rings and combined instrumental and documentary data back to AD 1660. Such analyses are outlined here using TRW data from a set of Pinus cembra L. sampling sites from the Swiss Engadin, and calibrating these data against a gridded surface air temperature reconstruction integrating long-term instrumental and multi-proxy data (Luterbacher et al. 2004).

paper here: Esper_et_al_TraceVol_6 (PDF)


nutso fasst

How does CO2 concentration affect growth rates?

Coach Springer

“Lay person” reaction: Proxy is projection of the present onto the past. Change the present, change the projection. The issue serves to remind me that these are only projections rather than assuming a false accuracy just because different projections might produce minute variances. (A whole set of forecasts could be off by a mile but only vary from one another minimally. Sound familiar with regard to projections into the future?)

Maybe the reason for the divergence problem is also bad weather station placement, especially in the Arctic. If you “train” your proxies using “bad” stations whose situation has changed, you may see artificial warming that is not visible in the tree rings, simply because there was no real warming.

David McKeever

Statistics can look at the same data and look at it from a different angle (so to speak) and find a stronger correlation with some subset of the data (pca analysis is just one technique). Once you have a data set you aren’t frozen into one analysis. That also opens the door to abusing these same techniques (see Steve McIntyre on the hockey stick). Abusing the methods to find a predetermined pattern doesn’t nullify all the methods (used appropriately).

Joseph Murphy

A general rule of mine, you can not do hard science on anything with a specific date attached to it. Experimentation requires that time be irrelevant. (You can do an experiment that shows x causes y. But, if you know that y occurred at sometime in the past, you can not do an experiment to show that x was the cause of that y.) This post seems to be pondering some of the extra assumptions required when specific ‘times’ are incorporated into science.

Jim G

Don’t trees, like most living things, adapt over time to their environment? Plus the variables are many, ie CO2, moisture, temperature, sunlight, humidity, etc.


– It is important to realize that field temperature readings are themselves proxies of particle velocity or kinetic energy.
– In addition, the unit scales (Celsius, Kelvin, etc.) employed are also proxies for reality.
– calibration is also a proxy for ‘accuracy’, since precision and the limits of observation make the resulting readings a ‘fuzzy’ probability cloud rather than a single value.


Related to proxies, what is the resolution of the various proxies? I always hear (mostly from skepticalscience.com) about how proxies show that we’ve never seen as rapid a temperature rise as we have in the last century anywhere in the historical record. My impression, though, is that there’s not enough resolution in the proxies at the sub-centennial scale. Is this true? Can someone help shed some light on this for me? Thank you!


All you are really pointing out is that tree rings in particular make lousy proxies, because tree growth rates are highly multivariate and because any process you use to include or exclude specific trees on the basis of IMAGINED confounding processes are open opportunities for undetectable confirmation bias to creep into your assessment. You can only reject trees if you think you know the answer they are supposed to be providing, outside of the usual statistical process of rejecting extreme outliers. But one of the problems with Bayesian reasoning in this context is that one man’s Bayesian prior can all too easily become another man’s confirmation bias that prejudices a particular answer. One has to have a systematic way of reassessing the posterior probabilities based on data.
But data is what this approach can never obtain. We cannot ever know the temperatures in the remote, pre-thermometric past. Hell, we can barely assess them now, with thermometers! One could do a multi-proxy analysis, using things like O18 levels that might be a completely independent proxy with independent confounding errors to improve Bayesian confidence levels, but I’ve always thought “dendroclimatology” is largely nonsense because of my work on random number generator testers (dieharder).
Here’s an interesting question. Once upon a time, before computer generation of pseudorandom numbers became cheap and reliable in situ, books like the CRC handbook or Abramowitz and Stegun (tables) often included pages of “tested” random numbers for people to use in Monte Carlo computations done basically by hand. Even into the 90’s, one of the premier experts on random number generators and testing (George Marsaglia) distributed tables of a few million “certified” random numbers — sets that passed his diehard battery of random number generator tests — along with the tests themselves on a CD you could buy. What is wrong with this picture?
Random number generators are tested on the basis of a pure (null) hypothesis test. One assumes that the generator is a perfect generator (and that the test is a perfect test!), uses it to generate some systematically improvable/predictable statistic that can be computed precisely some other way, and then computes the probability of getting the answer you got from using the RNG if it were a perfect RNG. In case this is obscure, consider testing a coin, presumed to be 50-50 heads and tails. If we flip the coin 100 times and record the number of heads (say) we know that the distribution of outcomes should be the well-known binomial distribution. We know exactly how (un)likely it is to get (say) 75 heads and 25 tails — it’s a number that really, really wants to be zero. If we have a coin that produces 75 heads and 25 tails, we compute this probability — known as the p-value of the test — and if it is very, very low, we conclude that it is very, very unlikely that a fair coin would produce this outcome, and hence it is very, very unlikely that the coin is, indeed, the unbiased coin we assumed that it was. We falsify the null hypothesis by the data.
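The coin-flip arithmetic above is easy to check exactly with the binomial distribution (a minimal sketch; the function name is mine, not from any library discussed here):

```python
from math import comb

def binom_tail_p(n, k):
    """P(X >= k) for X ~ Binomial(n, 0.5): the chance that a fair coin
    shows at least k heads in n flips (a one-sided p-value)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p75 = binom_tail_p(100, 75)   # 75 or more heads in 100 flips: vanishingly small
p50 = binom_tail_p(100, 50)   # 50 or more heads: roughly even odds
```

The 75-heads tail probability really does “want to be zero” (it is well below one in a hundred thousand), while 50-or-more heads comes out just over one half, as symmetry demands.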
Traditionally, one sets the rejection threshold to p = 0.05. There isn’t the slightest good reason for this — all this means is that a perfect coin will be falsely rejected on average 1 time out of 20 trials of many samples, no matter how many samples there are in each trial. A similar problem with accepting a result if it reaches at least a p-value of 0.05 “significance” plagues medical science as it enables data dredging, see:
which is an entire education on this in a single series of cartoon panels.
However, there is a serious problem with distributing only sets of random numbers that have passed a test at the 0.05 level. Suppose you release 200 such sets — all of them pass the test at the 0.05 level, so they are “certified good” random numbers, right? Yet if you feed all 200 sets into a good random number generator tester, it will without question reject the series! What’s up with that?
It’s simple. The set is now too “random”! You’ve applied an accept/reject criterion to the sets with some basically arbitrary threshold. That means that all the sets of 100 coin flips that have just enough heads or tails to reach a p-value of 0.04 have been removed. But in 200 sets, 8 of them should have had p-values this low or lower if the coin was a perfectly random coin! You now have too few outliers. The exact same thing can be understood if one imagines testing not total numbers of heads but the probability of (say) 8 heads in a row. Suppose heads are 1’s and tails are 0’s. 8 heads in a row is something we’d consider (correctly) pretty unlikely — 1 in 256. Again, we “expect” most coin flips to have roughly equal numbers of heads and tails, so combinations with 4 0’s and 4 1’s are going to be a lot more likely than combinations with 8 0’s or 8 1’s.
We are then tempted to reject all of the sets of flips that contain 6, 7, or 8 1’s or 0’s as being “not random enough” and reject them from a table of “random coin flips”. But this too is a capital mistake. The probability of getting the sequence 11111111 is indeed 1/256. But so is the probability of getting 10101010, or 11001010, or 01100101! In fact, the probability of getting any particular bit pattern is 1/256. A perfect generator should produce all such bit patterns with equal probability. Omitting any of them on the basis of accept/reject at some threshold results in an output data set that is perfectly biased and that will fail any elementary test for randomness except the one used to establish the threshold. This is one reason that humans make lousy random number generators. If you are asked to put down a random series of 1’s and 0’s on the page, or play rock-paper-scissors with random selection, you simply cannot do it. We aren’t wired right. We will always produce series that lack sufficient outliers because 11111111 doesn’t look random, where 10011010 does.
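The “too random” effect described above is easy to reproduce in a toy simulation (the rejection band below is my stand-in for a two-sided 5% test on 100 flips; all other numbers are arbitrary):

```python
import random, statistics

random.seed(1)

def heads(n=100):
    """Number of heads in n fair coin flips."""
    return sum(random.random() < 0.5 for _ in range(n))

all_sets = [heads() for _ in range(5000)]
# "Certify" a set only if its head count passes a two-sided 5% test
# (for n = 100 that is roughly |heads - 50| <= 9):
certified = [h for h in all_sets if abs(h - 50) <= 9]

var_all  = statistics.pvariance(all_sets)    # near the binomial value n/4 = 25
var_cert = statistics.pvariance(certified)   # noticeably smaller: outliers gone
```

The certified collection has visibly too little variance, which is precisely the kind of deficit a battery like diehard detects when it is fed nothing but pre-screened “passing” sets.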
With that (hopefully) clear, the relevance to data selection in any sort of statistical analysis should be clear. This is an area where angels should rightly fear to tread! Even the process of rejecting data outliers is predicated on a Bayesian assumption that they are more likely to be produced by e.g. spurious errors in our measuring process or apparatus than to be “real”. However, that assumption is not always correct! When Rutherford (well, really Geiger and Marsden) started bombarding thin metal foil with alpha particles, most passed through as expected. However, some appeared to bounce back at “impossible” angles. These were data outliers that contradicted all prior expectations, and it would have been all too easy to reject them as fluctuations in the apparatus or other method errors — in which case a Nobel prize for discovering the nucleus would have been lost.
In the case of tree ring analysis, it is precisely this sort of accept/reject data selection on the basis of an arbitrary criterion that led Mann to make the infamous Hockey Stick Error in his homebrew PCA code — a bias for random noise to be turned into hockey sticks. Even participants in this sort of work acknowledge that it is as much guesswork and bias as it ever is science — usually off the record. In the Climategate letters, one such researcher laments that the trees in his own back yard don’t reflect the perfectly well known temperature series for that back yard, causing his own son to get a null result in a science fair contest (IIRC, it is some time since I read them:-). Then there is the infamous remark ON the record about the need to pick cherries to make cherry pie — to the US Congress!
I spent 15 years plus doing Monte Carlo computations, which rely even now on very, very reliable sources of random numbers. Applying heuristic selection criteria to any data series with a large, unknown, unknowable set of “random” confounding influences in the remote past to “improve” the result compared to just sampling the entire range of data and hope that the signal exceeds the noise (eventually) by dint of sheer statistics is trying to squeeze proverbial statistical blood from a very, very hard no-free-lunch stone. Chances are excellent that your criterion will simply bias your answer in a way you can never detect and that will actually make your answers systematically worse as you gather more data relative to the unknown true answer.

Pamela Gray

Proxies based on solar metrics may also find themselves with a published temperature paper that has morphed from a gold standard to one that is now questionable and possibly unreliable. But this should be seen as part of the scientific process and not reflect poorly on the authors of such papers. There have been many examples in the past where understanding at that time was accepted only to see that understanding nearly stand on its head decades later (or even a few years later) yet those papers were not pulled and can still be read today. Which is the way it should be. The fact that tree ring and other proxies are now being questioned, and temperature observations adjusted up or down is rather normal in the history of science advances and paradigm shifts rather than an exception, and the process whereby these things happen should remain in the journals instead of removed.
Which reminds me of a very important step in defensible research. Do your literature review very thoroughly. That vetting process should not be quickly dispatched lest you find yourself basing your entire work on out of date information or somewhat paradoxically, current science fads that will eventually go down the same path.

Robert of Texas

There is a related issue I would like to hear someone address:
Proxies such as Tree Ring measurements have multiple confounding factors: Temperature, Water Availability, Sunlight, CO2 Availability, Nutrient Availability (other than CO2), other stress factors (pests, disease, early winter). There may be others.
So not only does the baseline move, but how do you assign the growth of a ring to all of these factors (and probably more I didn’t think of)? Each of these factors may change year to year or decade to decade. I just do not understand how you untangle them without introducing bias.
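The attribution problem raised above can be made concrete with a toy regression. All numbers here are invented; the effect being illustrated is the textbook “omitted-variable bias”: if moisture is correlated with temperature and you calibrate against temperature alone, the temperature coefficient silently absorbs the moisture effect.

```python
import random

random.seed(2)
n = 200
T = [random.gauss(0, 1) for _ in range(n)]        # temperature anomaly
M = [0.8 * t + random.gauss(0, 0.3) for t in T]   # moisture, correlated with T
# Ring width responds to BOTH factors, equally, in this made-up model:
ring = [1.0 * t + 1.0 * m + random.gauss(0, 0.2) for t, m in zip(T, M)]

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

# Calibrating against temperature alone absorbs moisture's contribution:
b_T_alone = slope(T, ring)   # near 1.8, far from the true coefficient of 1.0
```

The fit looks perfectly good statistically, yet the recovered “temperature sensitivity” is nearly double the true one, and nothing in the calibration-period data flags the problem.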

Bob Kutz

No, I think the part where dendro-chronology falls off the rails has nothing to do with revising data. The fact that our proxy goes completely the wrong way for about the last 50 years (aka “hide the decline” in the original context of ‘Mike’s Nature trick’) means that this particular proxy should be completely disregarded until such time as the difference can be reconciled.
THAT, at least, shouldn’t be hard for anybody to understand.
Too bad the media completely glossed over Muller’s real comments on that point in favor of his ‘vindication’ of the historical temp data.

BioBob says:
June 13, 2014 at 8:07 am

– It is important to realize that field temperature readings are themselves proxies of particle velocity or kinetic energy.
– In addition, the unit scales (Celsius, Kelvin, etc.) employed are also proxies for reality.
– calibration is also a proxy for ‘accuracy’, since precision and the limits of observation make the resulting readings a ‘fuzzy’ probability cloud rather than a single value.

The first two are true, but they have the advantage of being replicable (ie, I can build a thermometer in my kitchen, calibrate it to the freezing & boiling points of water, & I will be very close to everyone else’s thermometers), unlike the paleo-proxies, which have to simply be trusted.
The third point is a great big “So what? That’s life, get used to it.”


RGB, Plus One, as usual.

Ron Clutz

In Soviet Russia, they used to say: “The future is known, it is the past that keeps changing.”


From the article above: “Furthermore, at some of these sites, emergent subpopulations of trees show negative growth trends with rising temperature. These findings are typically referred to as the “Divergence Problem” (DP). ”
This is not a divergence problem – this is basic biology. All plants have optimum growing ranges – a bell curve. Too cold and plants grow slowly; as it warms, plants grow faster until growth reaches an optimum; as it gets even warmer, growth slows down again. An important question for the dendro experts is how you can tell the difference, given all the other factors: light, nutrition, rainfall, etc.
Virtually all plant species have geographical ranges. There is a reason plants growing in northern latitudes don’t grow well in southern latitudes, and vice versa for plant species growing in southern latitudes.
Is the Yamal/Ural divergence problem due to it getting too warm? I don’t know.
Are the proxies that are not picking up the MWP due to it getting too warm and therefore having slower growth?

lemiere jacques

Using a proxy means you are making an assumption.
Being able to assess a proxy means you didn’t need it.
Well, if you have several independent proxies, you can begin to work more seriously.


I have not seen a discussion about sampling technique and sampling bias. I teach a basic statistics course and have some insight into these problems when associated with any study. Personally, I do not think that taking a few trees in the Northern Hemisphere constitutes a valid sampling technique. In addition, a researcher must be very careful to extrapolate conclusions beyond the region in which the samples were taken. From my perspective, the only valid data we have for the entire planet is the satellite data and we just don’t have enough of it to be drawing firm conclusions about anything. Maybe an expert can weigh in on sampling.

Gunga Din

My conclusion? There’s more than one bug in Mann’s tree rings.


at a minimum every time the data is “adjusted” any modeling done using said data becomes invalidated and must be rerun … if the modeler used hindcasting to tune his model he would have to retune the model with the new historic data and rerun the model for future forecasts …


Stark Dickflüssig says: June 13, 2014 at 8:34 am The third point is a great big “So what? That’s life, get used to it.”
So what? Every AGW graph, every temperature reading I have ever seen ignores point 3. Liquid-in-glass thermometers typically have a plus or minus 0.5 degree F limit of observability reported by the manufacturer, and yet weather stations that employ such devices report temperatures with a supposed precision of 0.01 to 0.001, rather than to the nearest degree F that the instrument proxy can actually discern.
That’s what, Stark. Read ’em and weep for “life as we do NOT know it”. The central limit theorem (which likely does not apply in any case) concerns variance, not instrument limitations.
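The resolution point can be illustrated with a toy quantized thermometer (the numbers below are made up; rounding to the nearest whole degree stands in for the ±0.5 degree limit of observability discussed above):

```python
import random

random.seed(7)
TRUE_TEMP = 20.3

def read_quantized(t):
    """A thermometer that only resolves whole degrees."""
    return float(round(t))

# Case 1: no noise. No amount of averaging beats the instrument's resolution;
# every reading is identically 20.0 and the 0.3-degree error never shrinks:
mean_quiet = sum(read_quantized(TRUE_TEMP) for _ in range(10000)) / 10000

# Case 2: readings dithered by independent noise wider than the step.
# Only then can averaging recover sub-resolution information:
noisy = [read_quantized(TRUE_TEMP + random.gauss(0, 0.5)) for _ in range(10000)]
mean_noisy = sum(noisy) / len(noisy)
```

Whether real station records behave like case 1 or case 2 depends on whether their errors really are independent dither across readings, which is exactly the assumption being disputed in this thread.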


By comparing two very respectable proxy-based science reconstructions and one observational data calculation, good agreement is attained (HERE).
A provisional conclusion could be that the accord is not coincidental, or at least unlikely to be.
a – …annual band counting on three radiometrically dated stalagmites from NW Scotland, provides a record of growth rate variations for the last 300 years. Over the period of instrumental meteorological records we have a good historical calibration with local climate (mean annual temperature/mean annual precipitation), regional climate (North Atlantic Oscillation) and sea surface temperature (SST; strongest at 65-70°N, 15-20°W)….- Baker, A., Proctor, C., – NOAA/NGDC Paleoclimatology Program, Boulder CO, USA.
b – ….. observational results indicate that Summer NAO variations are partly related to the Atlantic Multidecadal Oscillation.Reconstruction of NAO variations back to 1706 is based on tree-ring records from specimens collected in Norway and United Kingdom …. – C. Folland, UK Met Office Hadley Centre.
c – …. Solar magnetic cycles (SIDC-SSN based) & geomagnetic variability (A. Jackson, J. Bloxham data) interaction. Calculations – vukcevic


Stanleysteamer says: June 13, 2014 at 9:33 am From my perspective, the only valid data we have for the entire planet is the satellite data
Even sat data has its problems, or we would not have things like this:
example from the post:
“The U.S. physicist agrees there may now be thousands of temperatures in the range of 415-604 degrees Fahrenheit automatically fed into computer climate models and contaminating climate models with a substantial warming bias. This may have gone on for a far longer period than the five years originally identified.”


Stark Dickflüssig says: June 13, 2014 at 8:34 am I can build a thermometer in my kitchen, calibrate it to the freezing & boiling points of water, & I will be very close to everyone else’s thermometers
LOL – when pigs fly.
Build your thermometer:
1) demonstrate how it’s response is linear between -200 c to +100 c or whatever range you use,
2) determine the limits of the changes it can reliably discern (limits of observation)
3) let me know how you define “very close” when identical machine produced devices placed in identical seeming Stevenson screens at identical heights, etc vary significantly with age, etc etc.
In short bullcrap !!

Peter Miller

Ron C, I hope you know that saying is the First Rule of Mann – The future is known, it is the past which keeps changing.
The gatekeepers of the earth’s ground temperature records obviously believe the same.

The fork in tree proxies for me is that I planted 10 or 12 trees of the same age/size 14 years ago in my yard. Most of them are about the same size, but I have the largest (+10/20%) right next to the smallest (-10/20%). I know why they grew differently (water), but once the trees were turned into lumber (IIRC, where the oldest proxies came from), there’s no way you’d know.


In my humble opinion, ALL proxies must be viewed with extreme caution. For a multitude of reasons, but mostly because
a) it is very unusual for a proxy measurement to be in effect a remote measurement of a single parameter itself determined by a single factor/parameter (I can’t think of one offhand). Hence, all proxies are based on assumptions of other factors (often many, as in tree ring width!).
b) the accuracy of the ‘proxy’ measurement itself – i.e. the physical analysis and measurement of the proxy – e.g. isotope analysis – for such micro detections – accuracy (and therefore the deduced proxy effect) is paramount.
c) the physical issue of the proxy itself – e.g. take ice cores, where we have the additional assumption that the trapped air is indeed ‘trapped’ and has not been altered since trapping or cross contaminated with adjacent ‘bubbles’, etc. Is there an error in there? How would we know?
d) the cross comparison or calibration (if you like) with modern measurements. How do we know the modern measurement and calibration is realistic? In truth, we simply cannot ‘know’ for such long term things as climate assumptions and unless humankind lives on for another few millennia, it is unlikely we will ever be able to check our calibrations!
When you add all these potential errors together – it is clear that they could compound/combine together very badly and give extremely misleading results, or at least a ‘value’ with very large error bars!. The primary scientific assumption, with which I do not agree – is that any errors will be evened out over the dataset. For example, I consider it a bit wild to assume that the effect of the modern calibration for isotope analysis of ice cores since the last ice age should be assumed/applied for much older times, e.g. before the last ice age!


In the case of tree ring analysis, it is precisely this sort of accept/reject data selection on the basis of an arbitrary criterion that led Mann to make the infamous Hockey Stick Error in his homebrew PCA code — a bias for random noise to be turned into hockey sticks.
No, that only happened if you use Monte-Carlo data with an extremely high autocorrelation parameter. The NRC had to use AR1(.9) to get that result.
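For readers unfamiliar with the notation, AR1(.9) means “red noise” in which each value retains 90% of the previous one; such series wander in long, persistent excursions that can mimic trends. A sketch of generating and verifying it (this is my illustration, not the NRC’s actual code):

```python
import random

random.seed(3)
phi, n = 0.9, 20000

# AR(1) red noise: x[t] = phi * x[t-1] + white noise
x = [0.0]
for _ in range(n - 1):
    x.append(phi * x[-1] + random.gauss(0, 1))

# Estimate the lag-1 autocorrelation to confirm it is near phi:
mean = sum(x) / n
num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
den = sum((v - mean) ** 2 for v in x)
lag1 = num / den
```

The higher phi is, the longer the excursions last, which is why the choice of autocorrelation parameter matters so much when testing whether a method mines hockey sticks out of noise.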

Steve McIntyre

As someone that’s spent a lot of time on proxy data, I don’t regard adjustments to temperature data as an important issue in proxy reconstructions. Or even a minor issue.


“Is the yamal ural divergence problem due to getting too warm?”
No. This is a problem with climatologists and their strange data processing. The divergence of dendros is a myth.


Proxy construction is another area where climate research hopes that combining/averaging a number of mediocre estimators might improve the accuracy of the resulting estimate. Of course when the proxies are highly correlated, reduction in the variance/dispersion of the average or other linear combination will be small. Laws of large numbers (LLNs) require something close to independence (or else low correlation) between the things being averaged. A point rgbatduke often makes about averaging GCM simulations.
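The correlated-averaging point can be checked directly. For estimators with common variance σ² and pairwise correlation ρ, the variance of an n-way average is σ²((1−ρ)/n + ρ), so it can never fall below σ²ρ no matter how many proxies are stacked. A toy simulation with made-up parameters:

```python
import random, statistics

random.seed(4)
rho, sigma = 0.8, 1.0          # pairwise correlation and per-proxy std dev
n_proxies, trials = 50, 4000

means = []
for _ in range(trials):
    shared = random.gauss(0, 1)   # component common to every proxy
    proxies = [sigma * (rho ** 0.5 * shared
                        + (1 - rho) ** 0.5 * random.gauss(0, 1))
               for _ in range(n_proxies)]
    means.append(sum(proxies) / n_proxies)

var_mean = statistics.pvariance(means)
floor = sigma ** 2 * rho
# Fifty INDEPENDENT proxies would give variance sigma^2/50 = 0.02;
# correlation keeps the actual variance pinned near the floor of 0.8.
```

Averaging fifty highly correlated proxies buys almost nothing: the simulated variance sits near 0.8 rather than the 0.02 that independence would deliver.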


My understanding follows, taking tree rings as an example:
If we know what the temperatures were over a period of time and, further, if we know the amount the tree rings expanded over that time, we can ascribe (calibrate) a certain amount of tree expansion to a change in temperature.
So it is assumed, ceteris paribus, that we may infer the ambient temperature from tree rings at a time when no temperature record exits.
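That two-step logic (calibrate against the instrumental period, then invert) can be sketched in a few lines. This is a toy linear model with invented numbers; real calibrations are far more elaborate, but the structure is the same:

```python
import random

random.seed(5)
# Calibration period: 150 years of known temperature and measured ring width.
temps  = [10 + 0.05 * t + random.gauss(0, 0.4) for t in range(150)]
widths = [0.3 * T + random.gauss(0, 0.15) for T in temps]  # assumed linear response

# Fit width = a + b * T by least squares over the calibration period:
n = len(temps)
mT, mW = sum(temps) / n, sum(widths) / n
b = (sum((T - mT) * (W - mW) for T, W in zip(temps, widths))
     / sum((T - mT) ** 2 for T in temps))
a = mW - b * mT

def reconstruct(width):
    """Invert the fit: infer a temperature from a ring width."""
    return (width - a) / b

# Worst-case error over the calibration period itself:
err = max(abs(reconstruct(W) - T) for T, W in zip(temps, widths))
```

Note that the inversion divides the width noise by the slope, so a weak response (small b) inflates the reconstruction error even when the fit itself looks respectable.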

“Is the yamal ural divergence problem due to getting too warm? I dont know.
Are the proxies that are not picking up the MWP due to it getting too warm and therefore having slower growth?”
The divergence has been variously explained.
I recall one study that put some of the issue down to temperature adjustments
(need to verify)
More recently


Jez, I wish Jim Bouldin would join this discussion. He has become an outlier in the paleo world hisself.

BioBob says:
June 13, 2014 at 10:40 am

1) demonstrate how it’s response is linear between -200 c to +100 c or whatever range you use,

Demonstrate the difference between a possessive & a contraction, you illiterate goober.


Botanists and paleobotanists have spent the past 30 years telling the “climate scientists” that you can’t use tree rings as a proxy for temperature. The annual growth ring size depends on too many factors for it to be a proxy for temperature. I am surprised to see this website further the scientific nonsense of tree rings as temperature proxies. You should know better.
You also really didn’t answer the question of how you can calibrate any proxy with only 150 years of poor quality temperature data. The simple fact is, they really aren’t calibrated; they are used to define temperatures that are outside the range available in the 150 years of poor quality data. Most of the proxies are really just bench tests (unverified by real world data), extrapolations beyond the range, and “educated guess work”.
In short, there is very little actual science in “climate science.”

Willis Eschenbach

Well, now, that’s odd. I took a look at the Esper-Frank 2008 study linked to above. They compared tree rings and observational data. They say the stations they used were:

Stations include Bernina Pass (Ber), Bever (Bev), Buffalora (Buf), Samedan (Sam), Sils Maria (Sil), Station Maria (Stm).

I think that what they call “Station Maria” is a misreading of the Santa Maria station, which is abbreviated “Sta. Maria”. And I’ve located Samedan and Buffalora in the Berkeley Earth dataset.
However, none of those three have more than a few decades of data, and I can’t locate the other ones in either the GISS or the Berkeley Earth dataset (Switzerland station map).
Anyone have any guesses about why this might be? The authors show data from ~1960 for their stations. I can’t find it.

Follow the Money

I will say this discussion is steps above IPCC science. The trees are limited here to “alpine” and northern treeline specimens, whose rings, even pre-Kyoto, were thought to be maybe partially related to summer temperature or summer season length. Remember, partially, not even mainly. However, IPCC science, perhaps I should specify as Australian Climate Science, uses studies (mostly Australian) that almost any old tree in Australia and NZ is a tree-mometer. What is it about Australia? These also show up in AR5 S. Hemisphere multi-proxy thingeroos.

Willis Eschenbach

Steve McIntyre says:
June 13, 2014 at 12:25 pm

As someone that’s spent a lot of time on proxy data, I don’t regard adjustments to temperature data as an important issue in proxy reconstructions. Or even a minor issue.

Thanks for that, Steve. Any ideas on my question immediately above?
Also, it seems to me that whether adjustments matter in proxy reconstructions depends on how the reconstruction was calibrated. Esper and Frank appear to have used the Luterbacher temperature reconstruction to calibrate their results, so how that reconstruction was created, including adjustments to the stations, could affect their results. The problem, from my perspective, is that the overall trend in the proxy reconstruction is generally some linear function of the overall trend in the temperature data used to calibrate it.
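That last point can be sketched with synthetic numbers. Everything below is hypothetical and generated for illustration; it is not Esper and Frank's method or data, just a minimal ordinary-least-squares calibration showing that if the calibration target's trend is "adjusted," the refitted reconstruction's trend moves roughly in proportion:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical series, generated for illustration only.
years = np.arange(1850, 2000)
target = 0.005 * (years - 1850) + rng.normal(0, 0.1, years.size)  # calibration target
proxy = 2.0 * target + rng.normal(0, 0.2, years.size)             # proxy tracks target

# Ordinary least-squares calibration: temperature ~ a * proxy + b
a, b = np.polyfit(proxy, target, 1)
recon = a * proxy + b

# Steepen the target's trend by 20% ("adjusting" the record) and refit:
# the reconstruction's trend changes roughly in proportion.
adjusted = target + 0.001 * (years - 1850)
a2, b2 = np.polyfit(proxy, adjusted, 1)
recon2 = a2 * proxy + b2

trend = np.polyfit(years, recon, 1)[0]
trend2 = np.polyfit(years, recon2, 1)[0]
print(trend2 / trend)  # noticeably above 1: the adjustment propagates
```

The ratio is attenuated somewhat by the proxy's noise, but the direction is unavoidable: regress against a steeper target, get a steeper reconstruction.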

I learned more than enough about proxies and statistics from "Hockey Stick Illusion." I consider proxies not far removed from tea leaves and Ouija boards. How can any reasonable person take the results seriously? And to "..torture and molest.." a line out of a huge cloud of proxy data and then plot that "result" with real instrument data from MLO, that's not just sloppy data presentation; IMHO it's plain, ordinary fraud!


RGB: tree rings in particular make lousy proxies, because tree growth rates are highly multivariate
C. Folland from the UK's Met Office used tree-ring records collected from specimens in the UK and Norway, suggesting the data should be a good proxy for the summer NAO (atmospheric pressure, related to both temperature and precipitation during the growing season). As it happens, it also appears to be an excellent proxy for solar/geomagnetic activity for the 1700–1850 period (see the second graph in my comment above, June 13, 2014 at 10:20 am). Alas, nothing is forever; over the following 150 years the correlation is only sporadic (non-stationary). As it happens, the Scottish stalagmites' growth holds up over a longer range. Just another natural coincidence, you might say.


Willis Eschenbach,
Sils Maria, homogenized data from 1864:
Apparently, Esper uses the raw series.

That’s it, Global Dimming!
“So the trees aren’t acting as thermometers over a significant fraction of the instrumental era.
Ah yes it could be “increased CO2”, “global dimming”, “atmospheric nitrate deposition”.
Virtually anything in fact.”
Mosher, you are a hoot…

Gary Pearse

Using apriori reasoning:
Wouldn't an ancient forest have richer nutrients, particularly minerals, than it would have a thousand years later? I know that virgin prairie soil grew crops like crazy for a few generations, and then nutrients had to be added annually.
How is it possible to control out the effect of moisture limitations versus the putative temperature differences in the signal? Wouldn't a drought look like cold temperatures in the tree growth? Wouldn't following years of adequate moisture look like warmer temperatures?
Wouldn’t too much water reduce a tree’s growth?
Would it not be better to look at several species at a time? If you had a forest of pine and spruce, but with some boggy areas that might support ash, willow, or other water-loving trees, and you had a drought, wouldn't you tend to find the ash affected more, pine the most resistant, and spruce in between? Wouldn't the spruce tend to encroach on the ash bog as it dried?
Do proxilitizers consider these types of questions?

Steve McIntyre

RE Mosher comment above – I tried to look at the data underpinning the argument supposedly linking divergence to global dimming. Unfortunately the global dimming data was password protected and I have thus far been unsuccessful in getting access to it. If I can’t get data, it’s hard to analyse the supposed linkage.
Willis, I don't know what question you're talking about. Is it about station data? If it is, I don't know.

Pat Frank

Steve Mc, as temperature data are used to calibrate proxy data, any magnitude uncertainty in the temperature record is immediately transferred to the proxy reconstruction. Accuracy is a major issue, quite independent of whether the proxy is actually a proxy, or whether the proxy record exhibits proper statistics.
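A toy Monte Carlo makes this transfer concrete. The series and the 0.2 °C figure below are assumptions for illustration, not taken from any actual record; the point is simply that a systematic error in the calibration target passes through a linear calibration essentially one-to-one:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration period: a proxy regressed against a temperature
# record assumed to carry a 0.2 degC systematic uncertainty (made-up figure).
years = np.arange(1900, 2000)
temps = 0.007 * (years - 1900) + rng.normal(0, 0.1, years.size)
proxy = 1.5 * temps + rng.normal(0, 0.2, years.size)
sigma_T = 0.2

# Monte Carlo: shift the whole calibration target by one systematic error
# draw per trial, refit, and collect the spread of the reconstructions.
recons = []
for _ in range(500):
    offset = rng.normal(0, sigma_T)            # systematic error, single draw
    a, b = np.polyfit(proxy, temps + offset, 1)
    recons.append(a * proxy + b)

# The reconstruction inherits the record's systematic uncertainty ~1:1
# (the slope is unchanged; the intercept absorbs the offset in full).
spread = np.std(np.array(recons), axis=0).mean()
print(spread)  # close to sigma_T
```

Random (uncorrelated) measurement noise, by contrast, averages down during fitting; it is the systematic component that survives intact, which is the accuracy issue raised above.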


Maybe there are different issues here. My understanding (which may be wrong) is that the proxy data is initially calibrated, or trained, against a first subset of the instrumental record and then tested against a second, different subset as a verification step. This has always smacked of something susceptible to cherry-picking to me, since I don't recall reading about any accepted standard for where the boundary between the training set and the testing set should fall. Someone could simply adjust that boundary (or the relative sizes of the two sets) until the proxy trends matched the interval of the instrumental record set aside for verification. In other words, you just choose what percentage of the instrumental record is used to calibrate the reconstruction so that the reconstruction matches the remainder.
Anyway, what happens if you have a proxy reconstruction that showed a good match against the portion of the instrumental record set aside as testing data when the reconstruction was made, but ten years later the instrumental record is "adjusted" in light of newly discovered biases or supposedly better statistical techniques, and the reconstruction no longer matches?
Similarly, if the portion of the instrumental record against which the reconstruction was trained is later revised, does the proxy reconstruction have to be adjusted? And do you get to start all over again, selecting a new interval as training data and another as testing data, to keep the reconstruction from changing too much?
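The boundary-shopping worry described above is easy to sketch. All series below are synthetic and hypothetical (no real station or proxy data); the code just shows that with no agreed rule for the calibration/verification split, one can scan boundaries and report whichever verification score looks best:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for an instrumental record and a candidate proxy.
years = np.arange(1880, 2000)
temps = 0.006 * (years - 1880) + rng.normal(0, 0.15, years.size)
proxy = temps + rng.normal(0, 0.3, years.size)   # imperfect proxy

def calibrate_and_verify(split):
    """Fit proxy -> temperature on years[:split], score on years[split:]."""
    a, b = np.polyfit(proxy[:split], temps[:split], 1)
    recon = a * proxy[split:] + b
    return np.corrcoef(recon, temps[split:])[0, 1]

# Scan candidate boundaries and keep the most flattering verification score.
scores = {s: calibrate_and_verify(s) for s in range(40, 100, 10)}
best_split = max(scores, key=scores.get)
print(best_split, scores[best_split])
```

Pre-registering the split (or reporting scores for every boundary, as some screening procedures do) is the obvious guard against this kind of selection.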


As someone who's spent the greater part of my adult life trying to divine this crap... all proxies are crap!
Too many assumptions have to be made, too much data has to be ignored, and all of it is built on previous proxies that made the same assumptions and picked their data the same way.

rgb – I find your explanation of random numbers and their pitfalls intriguing. Logically, it is impossible to examine a set of numbers and determine whether they are random. [Explanation: Start with a sequence of binary numbers, length one. You can’t tell. Length two – no matter what the combination 00, 01, 10, 11 – you can’t tell. Etc, ad infinitum]. So you can test a generator for randomness, but you can’t do that just by testing its output!
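The "you can't certify randomness from output alone" point can be demonstrated in a few lines. Here Python's own deterministic generator stands in for an arbitrary sequence: its output passes a naive frequency check, yet re-seeding reproduces it exactly, so no such check could have established randomness:

```python
import random

# A fully deterministic generator with a fixed seed: its output looks
# uniform to a simple frequency test, yet the sequence is reproducible
# at will, so the output alone cannot prove randomness.
random.seed(42)
bits = [random.randint(0, 1) for _ in range(10000)]
ones = sum(bits)
print(ones)  # near 5000: passes the naive uniformity check

random.seed(42)
bits_again = [random.randint(0, 1) for _ in range(10000)]
print(bits == bits_again)  # True: identical sequence, hence deterministic
```

This is why randomness is a property of the generating process, not of any finite sample of its output.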
On proxies – IMHO if you have to torture the data then what you end up with is of little or no value. I can’t see how random numbers can be relevant to proxies if you aren’t torturing – either the proxies have a clear theoretical basis and give a clear picture or they don’t. Re temperature, trees don’t have a clear theoretical basis (too many competing factors) so they are of little or no value as proxies. Period.

Catherine Ronconi

Steven Mosher says:
June 13, 2014 at 1:30 pm
Has anyone here read the awful offal on Yamal currently stinking up RationalWiki? It bears the smelly imprint of Connolley, to wit:
“The Yamal controversy was an explosion of drama in the global warming blog wars that spilled over into the pages of mainstream newspapers. In the wake of Climategate, deniers latched onto a set of tree-ring data called the Yamal series that had been the topic of some of the leaked e-mails (after they were done squawking about “nature tricks” and “hiding the decline,” of course). The Yamal series refers to the tree-ring data taken from the Yamal Peninsula in Siberia by a team of Russian researchers, Hantemirov and Shiyatov, in the late ’90s. Hantemirov and Shiyatov released more of their data in 2009 and Steve McIntyre jumped all over it, snarking:
“”I’m assuming that CA readers are aware that, once the Yamal series got on the street in 2000, it got used like crack cocaine by paleoclimatologists, and of its critical role in many spaghetti graph reconstructions, including, most recently, a critical role in the Kaufman reconstruction.[1]
“Keith Briffa, a climatologist at the Climatic Research Unit (CRU) in East Anglia, had based a number of temperature reconstructions on a subset of the Yamal data. He claimed he had used a different methodology than Hantemirov and Shiyatov because the original methodology didn’t preserve long-term climate change.[2] McIntyre accused Briffa of cherry-picking. Of course, it would be perfectly legitimate to criticize Briffa’s reconstruction and perform a new reconstruction on one’s own. However, McIntyre just downloaded some other unrelated Yamal dataset from the internet and chucked it into the original set.[3] Deniers, obviously, failed to care about this and the “Yamal is a lie!” claim shot through the deniosphere, with Anthony Watts picking up the story next.[4] It then found its way into the right-wing rags, with James Delingpole and others declaring that the “hockey stick” graph had been soundly “debunked.”[5][6]
“However, Briffa’s Yamal reconstructions were only included in four of the twelve hockey stick reconstructions and even McIntyre criticized other deniers for blowing his “critique” of Briffa out of proportion and walked back his accusations of cherry-picking. Sure enough, both Briffa and a member of the original Russian team released full reconstructions using the previously unreleased data and the hockey stick shape returned, confirming Briffa’s original assertions.[7][8]
“However, the incident was still missing something: That classic McIntyre hypocrisy. McIntyre had been whining for quite some time that Briffa had been blowing him off (gee, wonder why?). However, Briffa, even though he had a good excuse, hadn’t been stonewalling McIntyre — the complete dataset was under the control of the Russian team that had collected it. After Briffa notified him of this, McIntyre then flippantly replied he had had the data all along!
“”In response to your point that I wasn’t “diligent enough” in pursuing the matter with the Russians, in fact, I already had a version of the data from the Russians, one that I’d had since 2004.[9]”
Correct me if wrong, but doesn’t “Yamal” come down to a single tree, which somehow missed out on the now touted “global dimming” magically affecting all its neighboring trees?