Guest post by Pat Frank
It’s become very clear that most published proxy thermometry since 1998 [1] is not science at all, most thoroughly so because Steve McIntyre and Ross McKitrick revealed its foundation in ad hoc statistical numerology. A while back, Michael Tobis and I had a conversation here at WUWT about the non-science of proxy paleothermometry, starting with Michael’s comment here and my reply here. Michael quickly appealed to his home authorities at Planet3.org. We all had a lovely conversation that ended with moderator-cum-debater Arthur Smith indulging a false claim of insult to impose censorship (insulting comment in full here, for the strong of stomach).
But in any case, two local experts in proxy thermometry came to Michael’s aid: Kaustubh Thirumalai, a grad student in proxy climatology at U. Texas, Austin, and Kevin Anchukaitis, a dendroclimatologist at Columbia University. Kaustubh also posted his defense at his own blog here.
Their defenses shared this peculiarity: an exclusive appeal to stable isotope temperature proxies — not word one in defense of tree-ring thermometry, which provides the vast bulk of paleotemperature reconstructions.
The non-science of published paleothermometry was proved by their non-defense of its tree-ring center; an indictment of discretionary silence.
Nor was there one word in defense of the substitution of statistics for physics, a near universal in paleo-thermo.
But their appeal to stable isotope proxythermometry provided an opportunity for examination. So, that’s what I’m offering here: an analysis of stable isotope proxy temperature reconstruction followed by a short tour of dendrothermometry.
Part I. Proxy Science: Stable Isotope Thermometry
The focus is on oxygen-18 (O-18), because that’s the heavy atom proxy overwhelmingly used to reconstruct past temperatures. NASA has a nice overview here. The average global stable isotopic ratios of oxygen are: O-16 = 99.757%, O-17 = 0.038%, O-18 = 0.205%. If there were no thermal effects (and no kinetic isotope effects), the oxygen isotopes would be distributed in minerals at exactly their natural ratios. But local thermal effects cause the ratios to depart from the average, and this is the basis for stable isotope thermometry.
Let’s be clear about two things immediately: first, the basic physics and chemistry of thermal isotope fractionation is thoroughly established and fully legitimate. [2-4]
Second, the mass spectrometry (MS) used to determine O-18 is very precise and accurate. In 1950, MS of O-18 already had a reproducibility of 5 parts in 100,000, [3] and presently is 1 part in 100,000. [5] These tiny values are represented as “%o” (the per mil symbol, ‰), where 1 %o = 0.1% = 0.001. So dO-18 MS detection has improved by a factor of 5 since 1950, from (+/-)0.05%o to (+/-)0.01%o.
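For reference, a dO-18 value is a delta value: the per mil deviation of a sample’s O-18/O-16 ratio from that of a reference standard, i.e., dO-18 = {[(O-18/O-16)sample / (O-18/O-16)standard] – 1} x 1000 %o.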
The O-18/O-16 ratio in sea water has a first-order dependence on the evaporation/condensation cycle of water. H2O-18 has a higher boiling point than H2O-16, and so evaporates and condenses at a higher temperature. Here’s a matter-of-fact Wiki presentation. The partition of O-18 and O-16 due to evaporation/condensation means that the O-18 fraction in surface waters rises and falls with temperature.
There’s no dispute that O-18 mixes into CO2 to produce heavy carbon dioxide – mostly isotopically mixed as C(O-16)(O-18).
Dissolved CO2 is in equilibrium with carbonic acid. Here’s a run-down on the aqueous chemistry of CO2 and calcium carbonate.
Dissolved light-isotope CO2 [as C(O-16)(O-16)] becomes heavy CO2 by exchanging an oxygen with heavy water, like this:
C(O-16)(O-16) + H2(O-18) => C(O-16)(O-18) + H2(O-16)
This heavy CO2 finds its way into the carbonate shells of mollusks, and the skeletons of foraminifera and corals in proportion to its ratio in the local waters (except when biology intervenes. See below).
This process is why the field of stable isotope proxy thermometry has focused primarily on O-18 CO2: it is incorporated into the carbonate of mollusk shells, corals, and foraminifera and provides a record of temperatures experienced by the organism.
Even better, fossil mollusk shells, fossil corals, and foraminiferal sediments in sea floor cores promise physically real reconstructions of O-18 paleotemperatures.
Before it can be measured, O-18 CO2 must be liberated from the carbonate matrix of mollusks, corals, or foraminifera. Liberation of CO2 typically involves treating solid CaCO3 with phosphoric acid.
3 CaCO3 + 2 H3PO4 => 3 CO2 + Ca3(PO4)2 + 3 H2O
CO2 is liberated from biological calcium carbonate and piped into a mass spectrometer. Laboratory methods are never perfect. They incur losses and inefficiencies that can affect the precision and accuracy of results. Anyone who’s done wet analytical work knows about these hazards and has struggled with them. The practical reliability of dO-18 proxy temperatures depends on the integrity of the laboratory methods to prepare and measure the intrinsic O-18.
The paleothermometric approach is to first determine a standard relationship between water temperature and the O-18/O-16 ratio in precipitated calcium carbonate. One can measure how the O-18 in the water fractionates itself into solid carbonate over the range of typical SSTs, such as 10 C through 40 C. A plot of carbonate O-18 v. temperature is prepared.
Once this standard plot is in hand, the temperature is regressed against the carbonate dO-18. The result is a least-squares fitted equation that tells you the empirical relationship of T:dO-18 over that temperature range.
This empirical equation can then be used to reconstruct the water temperature whenever carbonate O-18 is known. That’s the principle.
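For concreteness, here is a minimal sketch of that calibrate-then-invert logic. All numbers are invented for illustration; they are not any published calibration:

```python
import numpy as np

# Hypothetical calibration data: temperatures (C) at which carbonate was
# precipitated, and the measured carbonate dO-18 (per mil) at each one.
T_cal = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0])
d18O_cal = np.array([2.1, 1.0, -0.1, -1.2, -2.3, -3.4, -4.5])  # invented values

# Least-squares regression of T against dO-18 gives the empirical equation.
slope, intercept = np.polyfit(d18O_cal, T_cal, 1)
print(f"T = {intercept:.2f} + ({slope:.2f}) * d18O")

# The fitted equation then converts any measured carbonate dO-18 into an
# inferred water temperature; that is the whole principle of the proxy.
d18O_measured = -0.5
print(f"Reconstructed T = {intercept + slope * d18O_measured:.1f} C")
```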
The question I’m interested in is whether the complete physico-chemical method yields accurate temperatures. Those who’ve read my paper (pdf) on neglected systematic error in the surface air temperature record will recognize the ‘why’ of focusing on measurement error. It’s the first and minimum error entering any empirically determined magnitude. That makes it the first and basic question about error limits in O-18 carbonate proxy temperatures.
So, how does the method work in practice?
Let’s start with the classic: J. M. McCrea (1950) “On the Isotopic Chemistry of Carbonates and a Paleotemperature Scale“[3], which is part of McCrea’s Ph.D. work.
McCrea’s work is presented in some detail to show the approach I took to evaluate error. After that, I promise more brevity. Nothing below is meant to be, or should be taken to be, criticism of McCrea’s absolutely excellent work — or criticism of any of the other O-18 authors and papers to follow.
McCrea did truly heroic, pioneering experimental work establishing the O-18 proxy temperature method. Here’s his hand-drawn picture of the custom glass apparatus used to produce CO2 from carbonate. I’ve annotated it to identify some bits:

Figure 1: J. McCrea’s CO2 preparative glass manifold for O-18 analysis.
I’ve worked with similar glass gas/vacuum systems with lapped-in ground-glass joints, and the opportunity for leak, crack, or crash-tastrophe is ever-present.
McCrea developed the method by precipitating dO-18 carbonate at different temperatures from marine waters obtained off East Orleans, MA, on the Atlantic side of Cape Cod, and off Palm Beach, Florida. The O-18 carbonate was then chemically decomposed to release the O-18 CO2, which was analyzed in a double-focusing mass spectrometer, apparently custom built in-house.
The blue and red lines in the Figure below show his results (Table X and Figure 5 in his paper). The %o O-18 is the divergence of his experimental samples from his standard water.

Figure 2: McCrea, 1950, original caption (color-modified): “Variation of isotopic composition of CaCO3(s) with reciprocal of deposition temperature from H2O (Cape Cod series (red); Florida water series (blue)).” The vertical lines interpolate temperatures at %o O-18 = 0.0. Bottom: color-coded experimental point scatter around a zero line (dashed purple).
The lines are linear least-squares (LSQ) fits, and they reproduce McCrea’s almost exactly (T is in Kelvin):
Florida: McCrea: d18O = 1.57 x (10^4/T) – 54.2;
LSQ: d18O = 1.57 x (10^4/T) – 53.9; r^2 = 0.994.
Cape Cod: McCrea: d18O = 1.64 x (10^4/T) – 57.6;
LSQ: d18O = 1.64 x (10^4/T) – 57.4; r^2 = 0.995.
About his results, McCrea wrote this: “The respective salinities of 36.7 and 32.2%o make it not surprising that there is a difference in the oxygen composition of the calcium carbonate obtained from the two waters at the same temperature.” (bold added)
The boiling temperature of water increases with the amount of dissolved salt, which in turn affects the relative rates at which H2O-16 and H2O-18 evaporate away. Marine salinity can also change from the influx of fresh water (precipitation, riverine input, or direct runoff), from upwelling, from wave-mixing, and from currents. The O-16/O-18 ratio of fresh water, of upwelling water, or of distant water transported by currents may differ from the local marine ratio. The result is that marine waters of the same temperature can have different O-18 fractions. Disentangling the effects of temperature and salinity in a marine O-16/O-18 ratio can be difficult to impossible in paleo-reconstructions.
The horizontal green line at %o O-18 = zero intersects the Florida and Cape Cod lines at different temperatures, represented by the vertical drops to the abscissa. These show that the same dO-18 produces a difference of about 4 C, depending on which equation one chooses, with the apparent T covarying with the 4.5%o difference in salinity.
That means if one generates a paleotemperature by applying a specific dO-18:T equation to paleocarbonates, and one does not know the paleosalinity, the derived paleotemperature can be uncertain by as much as (+/-)2 C due to a hidden systematic covariate (salinity).
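To see the size of that covariance directly, one can invert the two fitted equations at the same dO-18. A minimal sketch, using McCrea’s published coefficients (which put the spread near 5 C at dO-18 = 0; the drop lines read off Figure 2 give about 4 C):

```python
# Invert d18O = m*(1e4/T) + c  =>  T = 1e4 * m / (d18O - c), T in Kelvin.
def T_from_d18O(d18O, m, c):
    return 1.0e4 * m / (d18O - c)

T_fl = T_from_d18O(0.0, 1.57, -54.2)  # McCrea's Florida equation
T_cc = T_from_d18O(0.0, 1.64, -57.6)  # McCrea's Cape Cod equation
print(T_fl - 273.15, T_cc - 273.15, T_fl - T_cc)
# ~16.5 C vs ~11.6 C: the same dO-18 implies two temperatures roughly 5 C
# apart, purely because the two waters had different salinities.
```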
But I’m interested in experimental error. From those plots one can estimate the point scatter in the physico-chemical method itself as the variation around the fitted LSQ lines. The point scatter is plotted along the purple zero line at the bottom of Figure 2. Converted to temperature, the scatter is (+/-)1 C for the Florida data and (+/-)1.5 C for the Cape Cod data.
All the data were determined by McCrea in the same lab, using the same equipment and the same protocol. Therefore, it’s legitimate to combine the two sets of errors in Figure 2 and determine the average uncertainty in any derived temperature. The standard deviation of the combined errors is (+/-)0.25%o O-18, which translates into an average temperature uncertainty of (+/-)1.3 C. This emerged under ideal laboratory conditions, where the water temperature was known from direct measurement and the marine O-18 fraction was independently measured.
Next, it’s necessary to know whether the errors are systematic or random. Random errors diminish as 1/sqrtN, where N is the number of repeated analyses. If the errors are random, one can hope for a very precise temperature measurement just by repeating the dO-18 determination enough times. For example, in McCrea’s work, 25 repeats would reduce the average error in any single temperature from (+/-)1.3 C to 1.3/5 = (+/-)0.26 C.
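The difference matters enough to warrant a toy demonstration. This sketch uses invented numbers, not McCrea’s data:

```python
import numpy as np

rng = np.random.default_rng(0)
true_T, N, trials = 20.0, 25, 10000

# Purely random error: the scatter of the N-repeat average shrinks as 1/sqrt(N).
random_runs = true_T + rng.normal(0.0, 1.3, size=(trials, N))
print(np.std(random_runs.mean(axis=1)))   # ~1.3/sqrt(25) = 0.26 C

# Systematic error: every repeat within an experiment shares one unknown bias.
# Averaging the repeats does nothing to remove it.
bias = rng.normal(0.0, 1.3, size=(trials, 1))   # one bias per experiment
syst_runs = true_T + bias + rng.normal(0.0, 0.1, size=(trials, N))
print(np.std(syst_runs.mean(axis=1)))     # still ~1.3 C
```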
To bridge the random/systematic divide, I binned the point scatter over (+/-)3 standard deviations, which covers 99.7% of a normal distribution and so should capture essentially the full range of error. There were no outliers, meaning all the scatter fell within the 99.7% bound. There are only 15 points, which is not a good statistical sample, but we work with what we’ve got. Figure 3 shows the histogram plot of the binned point scatter, and a Gaussian fit. It’s a little cluttered, but bear with me.

Figure 3: McCrea, 1950 data: blue points, binned point scatter from Figure 2; red line, two-Gaussian fit to the binned points; dashed green lines, the two fitted Gaussians. Thin purple points and line, separately binned Cape Cod point scatter; thin blue points and line, separately binned Florida point scatter.
The first thing to notice is that the binned points are distinctly not normally distributed. This immediately suggests the measurement error is systematic, not random. The two-Gaussian fit is pretty good, but should not be taken as more than a numerical convenience. An independent set of measurement scatter points from a different set of experiments may well require a different set of Gaussians.
The two Gaussians imply at least two modes of experimental error operating simultaneously. The two thin single-experiment lines spread across the full scatter width, demonstrating that the point scatter in each data set participates in both error modes simultaneously. But notice that the two data sets do not participate equivalently. This non-equivalence again indicates a systematic measurement error that does not repeat consistently.
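For anyone who wants to reproduce this kind of analysis, fitting a two-Gaussian model to binned residuals takes only a few lines. The counts below are synthetic, generated from the model itself, since McCrea’s binned values are read off a figure:

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(x, a1, mu1, s1, a2, mu2, s2):
    """Sum of two Gaussian components."""
    return (a1 * np.exp(-0.5 * ((x - mu1) / s1) ** 2)
            + a2 * np.exp(-0.5 * ((x - mu2) / s2) ** 2))

bins = np.linspace(-0.75, 0.75, 13)                    # bin centers, per mil
counts = two_gaussians(bins, 4.0, -0.25, 0.15, 3.0, 0.20, 0.18)  # synthetic

p0 = [3.0, -0.3, 0.2, 3.0, 0.3, 0.2]                   # rough initial guesses
params, _ = curve_fit(two_gaussians, bins, counts, p0=p0)
print(np.round(params, 3))    # recovers the two amplitude/mean/width triplets
```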
The uncertainty from systematic measurement error does not diminish as 1/sqrtN. The error is not a constant offset, and does not subtract away in a difference between data sets. It propagates into a final value as (+/-)sqrt[(sum of the N squared errors)/(N-1)].
The error in any new proxy temperature derived from those methods will probably fall somewhere in the Figure 3 envelope, but the experimenter will not know where. That means the only way to honestly present a result is to report the average systematic error, and that would be T(+/-)1.3 C.
This estimate is conservative: McCrea noted that “The average deviation of an individual result from the relation is 0.38%o,” which is equivalent to an average error of (+/-)2 C (my calculation gives 1.95 C). McCrea wrote later, “The average deviation of an individual experimental result from this relation is 2°C in the series of slow precipitations just described.”
The slow precipitation experiments were the tests with Cape Cod and Florida water shown in Figure 2, and McCrea mentioned their paleothermal significance at the end of his paper: “The isotopic composition of calcium carbonate slowly formed from aqueous solution has been noted to be usually the same as that produced by organisms at the same temperature.”
Anyone using McCrea’s standard equations to reconstruct a dO-18 paleotemperature must include the experimental uncertainties hidden inside them. However, these are invariably neglected. I’ll give an example below.
Another methodological classic is Sang-Tae Kim et al. (2007) “Oxygen isotope fractionation between synthetic aragonite and water: Influence of temperature and Mg2+ concentration“.[6]
Kim, et al., measured the relationship between temperature and dO-18 incorporation in aragonite, a form of calcium carbonate found in mollusk shells and corals (the other typical form is calcite). They calibrated the T:dO-18 relationship at five temperatures (0, 5, 10, 25, and 40 C), which covers the entire range of SST. Figure 4a shows their data.

Figure 4: a. Blue points: Aragonite T:dO-18 calibration experimental points from Kim, et al., 2007; purple line: LSQ fit. Below: green points, the unfit residual representing experimental point-scatter, 1-sigma = (+/-)0.21. b. 3-sigma histogram of the experimental unfit residual (points) and the 3-Gaussian fit (purple line). The thin colored lines plus points are separate histograms of the four data sub-sets making up the total.
The alpha in “ln-alpha” is the O-18 “fractionation factor,” which is a ratio of O-18 ratios. That sounds complicated, but it’s just (the ratio of O-18 in carbonate divided by the ratio of O-18 in water): {[(O-18)c/(O-16)c] / [(O-18)w/(O-16)w]}, where “c” = carbonate, and “w” = water.
The LSQ fitted line in Figure 4a is 1000 x ln-alpha = 17.80 x (1000/T) – 30.84; R^2 = 0.99, which almost exactly reproduces the published line, 1000 x ln-alpha = 17.88 x (1000/T) – 31.14.
The green points along the bottom of Figure 4a are the unfit residual, representing the experimental point scatter. These have a 1-sigma standard deviation = (+/-)0.21, which translates into an experimental uncertainty of (+/-)1 C.
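That conversion from dO-18 scatter to temperature uncertainty is just the residual scatter divided by the local slope of the calibration line. A short sketch using the published Kim, et al., coefficients:

```python
a = 17.88                 # Kim et al. (2007): 1000*ln-alpha = a*(1000/T) + b
sigma_lnalpha = 0.21      # 1-sigma residual scatter, in 1000*ln-alpha units

T = 298.15                # a representative SST of 25 C, in Kelvin
slope = 1000.0 * a / T**2         # |d(1000*ln-alpha)/dT| at temperature T
print(sigma_lnalpha / slope)      # ~1.0 C, the uncertainty quoted above
```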
In Figure 4b is a histogram of the unfit residual point scatter in part a, binned across (+/-)3-sigma. The purple line is a three-Gaussian fit to the histogram, with the point at (-0.58, 3) left out because it destabilized the fit. In any case, the experimental data appear to be contaminated with at least three modes of divergence, again implying systematic error.
Individual data sub-sets are shown as the thin colored lines in Figure 4b. They all spread across at least two of the three experimental divergence modes, but not equivalently. Once again, that means every data set is uniquely contaminated with systematic measurement error.
Kim, et al., reported a smaller analytical error (+/-)0.13, equivalent to an uncertainty in T = (+/-)0.6 C. But their (+/-)0.13 is the analytical precision of the mass spectrometric determination of the O-18 fractions. It’s not the total experimental scatter. Residual point scatter is a better uncertainty metric because the Kim, et al., equation represents a fit to the full experimental data, not just to the O-18 fractions found by the mass spectrometer.
Any researcher using the Kim, et al., 2007 dO-18:T equation to reconstruct a paleotemperature must propagate at least (+/-)0.6 C of uncertainty into their result, and more realistically (+/-)1 C.
I’ve done similar analyses of the experimental point-scatter in several studies used to calibrate the T:O-18 temperature scale. Here’s a summary of the results:
Study         (+/-)1-sigma     n    syst err?    Ref.
McCrea            1.3 C       15        Y        [3]
O’Neil*            29 C       11        ?        [7]
Epstein          0.76 C       25        ?        [8]
Bemis             1.7 C       14        Y        [9]
Kim               1.0 C       70        Y        [6]
Li                2.2 C        5                 [10]
Friedman          1.1 C        6                 [11]
*O’Neil’s was a 0-500 C experiment.
All the Summary uncertainties represent only measurement point scatter, which often behaved as systematic error. The O’Neil 1969 point scatter was indeterminate, and the Epstein question mark is discussed below.
Epstein, et al., (1953), chose to fit their T:dO-18 calibration data with a second-order polynomial rather than with a least squares straight line. Figure 5 shows their data with the polynomial fit, and for comparison a LSQ straight line fit.

Figure 5: Epstein, 1953 data fit with a second-order polynomial (R^2 = 0.996; sigma residual = (+/-)0.76 C) and with a least-squares line (R^2 = 0.992; sigma residual = (+/-)0.80 C). Insets: histograms of the point scatter plus Gaussian fits; upper right, polynomial; lower left, linear.
The scatter around the polynomial was fairly Gaussian, but left a >3-sigma outlier at 2.7 C. The LSQ fit did almost as well, and brought the polynomial’s 3-sigma outlier within the 3-sigma confidence limit. The histogram of the linear-fit scatter required two Gaussians, and left an unfit point at 2.5-sigma (-2 C).
Epstein, et al., had no good statistical reason to choose the polynomial fit over the linear fit, and didn’t mention a rationale. The polynomial fit came closer to the high-temperature end point at 30 C, but the linear fit came closer to the low-T end point at 7 C, and was just as good through the internal data points. So, the higher-order fit may have been an attempt to save the point at 30 C.
Before presenting an application of these lessons, I’d like to look at a review paper that compares all the different dO-18:T calibration equations in current use: B. E. Bemis, H. J. Spero, J. Bijma, and D. W. Lea, “Reevaluation of the oxygen isotopic composition of planktonic foraminifera: Experimental results and revised paleotemperature equations.” [9]
This paper is particularly valuable because it reviews the earlier equations used to model the T:dO18 relationship.
Figure 6 below reproduces an annotated Figure 2 from Bemis, et al. It compares several T:dO-18 calibration equations from a variety of laboratories. They have similar slopes but are offset. The result is that a given dO-18 predicts a different temperature, depending on which calibration equation one chooses. The Figure is annotated with a couple of very revealing drop lines.

Figure 6: Original caption: “Comparison of temperature predictions using new O. universa and G. bulloides temperature:dO-18 relationships and published paleotemperature equations. Several published equations are identified for reference. Equations presented in this study predict lower temperatures than most other equations. Temperatures were calculated using the VSMOW to VPDB corrections listed in Table 1 for dO-18w values.”
The green drop lines show that a single temperature associates with dO-18 values ranging across 0.4%o. That’s about 10-40x larger than the precision of a mass spectrometric dO-18 measurement. Alternatively, the horizontal red extensions show that a single dO-18 measurement predicts temperatures across a ~1.8 C range, representing an uncertainty of (+/-)0.9 C in the choice of calibration equation.
The 1.8 C excludes the three lines, labeled 11-Ch, 12-Ch, and 13-Ch. These refer to G. bulloides with 11-, 12-, and 13-chambered shells. Including them, the spread of temperatures at a single dO-18 is ~3.7 C (dashed red line).
In G. bulloides, the number of shell chambers increases with age. Specific gravity increases with the number of chambers, causing G. bulloides to sink into deeper waters. Later chambers therefore sample different waters than the earlier ones, and incorporate the O-18 ratio at depth. The three chamber-number lines show the vertical change in dO-18 is significant, and imply a false spread in T of about 0.5 C.
Here’s what Bemis, et al., say about it (p. 150), “Although most of these temperature:d18O relationships appear to be similar, temperature reconstructions can differ by as much as 2 C when ambient temperature varies from 15 to 25 C.”
That “2 C” reveals a higher level of systematic error that appears as variations among the different temperature reconstruction equations. This error should be included as part of the reported uncertainty whenever any one of these standard lines is used to determine a paleotemperature.
Some of the variations in standard lines are also due to confounding factors such as salinity and the activity of photosynthetic foraminiferal symbionts.
Bemis, et al., discuss this problem on page 152: “Non-equilibrium d18O values in planktonic foraminifera have never been adequately explained. Recently, laboratory experiments with live foraminifera have demonstrated that the photosynthetic activity of algal symbionts and the carbonate ion concentration ([CO32-]) of seawater also affect shell d18O values. In these cases an increase in symbiont photosynthetic activity or [CO32-] results in a decrease in shell d18O values. Given the inconsistent SST reconstructions obtained using existing paleotemperature equations and the recently identified parameters controlling shell d18O values, there is a clear need to reexamine the temperature:d18O relationships for planktonic foraminifera.”
Bemis, et al., are thoughtful and modest in this way throughout their paper. They present a candid review of the literature. They discuss the strengths and pitfalls in the field, and describe where more work needs to be done. In other words, they are doing honest science. The contrast could not be more stark between their approach and the pastiche of million dollar claims and statistical maneuvering that swamp AGW-driven paleothermometry.
When the inter-methodological ~(+/-)0.9 C spread of standard T:dO-18 equations is combined in quadrature with the (+/-)1.34 C average measurement error from the Summary Table, the combined 1-sigma uncertainty in a dO-18 temperature is (+/-)sqrt(1.34^2 + 0.9^2) = (+/-)1.6 C. That doesn’t include any further invisible environmental effects that might confound a paleo-O-18 ratio, such as shifts in monsoon, in salinity, or in upwelling.
A (+/-)1.6 C uncertainty is already 2x larger than the commonly accepted 0.8 C of 20th century warming. T:dO-18 proxies are entirely unable to determine whether recent climate change is in any way historically or paleontologically unusual.
Now let’s look at Keigwin’s justly famous Sargasso Sea dO-18 proxy temperature reconstruction: (1996) “The Little Ice Age and Medieval Warm Period in the Sargasso Sea.” [12] The reconstructed Sargasso Sea paleotemperature rests on G. ruber calcite. G. ruber has photosynthetic symbionts, which induces the T:dO-18 artifacts mentioned by Bemis, et al. Keigwin is a good scientist and attempted to account for this by applying an average G. ruber correction. But removal of an average bias is effective only when the error envelope is random around a constant offset. Subtracting the average bias of a systematic error does not reduce the uncertainty width, and may even increase the total error if the systematic bias in your data set is different from the average bias. Keigwin also assumed an average salinity of 36.5%o throughout, which may or may not be valid.
More to the point, no error bars appear on the reconstruction. Keigwin reported changes in paleotemperature of 1 C or 1.5 C, implying a temperature resolution with smaller errors than these values.
Keigwin used the T:dO-18 equation published by Shackleton in 1974,[13] to turn his Sargasso G. ruber dO-18 measurements into paleotemperatures. Unfortunately, Shackleton published his equation in the International Colloquium Journal of the French C.N.R.S., and neither I nor my French contact (thank-you Elodie) have been able to get that paper. Without it, one can’t directly evaluate the measurement point scatter.
However in 1965, Shackleton published a paper demonstrating his methodology to obtain high precision dO-18 measurements. [14] Shackleton’s high precision scatter should be the minimum scatter in his 1974 T:dO-18 equation.
Shackleton, 1965 made five replicate measurements of the dO-18 in five separate samples of a single piece of Italian marble (marble is calcium carbonate). Here’s his Table of results:
Reaction No.:    1      2      3      4      5     Mean     Std dev.
dO-18 value:   4.1   4.45   4.35   4.2    4.2    4.26%o    0.12%o
Shackleton mistakenly reported the root-mean-square of the point scatter instead of the standard deviation. No big deal: the true 1-sigma = (+/-)0.14%o, not very different.
In Shackleton’s 1965 words, “The major reason for discrepancy between successive measurements lies in the difficulty of preparing and handling the gas.” That is, the measurement scatter is due to the inevitable systematic laboratory error we’ve already seen above.
Shackleton’s 1974 standard T:dO-18 equation appears in Barrera, et al., [15] and it’s T = 16.9 – 4.38(dO-18) + 0.10(dO-18)^2. Plugging Shackleton’s high-precision 1-sigma=0.14%o into his equation yields an estimated minimum uncertainty of (+/-)0.61 C in any dO-18 temperature calculated using the Shackleton T:dO-18 equation.
At the ftp site where Keigwin’s data are located, one reads “Data precision: ~1% for carbonate; ~0.1 permil for d18-O.” So, Keigwin’s independent dO-18 measurements were good to about (+/-)0.1%o.
The uncertainty in temperature represented by Keigwin’s (+/-)0.1%o spread in measured dO-18 equates to (+/-)0.44 C in Shackleton’s equation.
The total measurement uncertainty in Keigwin’s dO-18 proxy temperature is the quadratic sum of the uncertainty in Shackleton’s equation plus the uncertainty in Keigwin’s own dO-18 measurements. That’s (+/-)sqrt[(0.61)^2+(0.44)^2]=(+/-)0.75 C. This represents measurement error, and is the 1-sigma minimum of error.
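The arithmetic, as a short sketch (the slope is evaluated near dO-18 = 0; it changes only slightly across the measured range):

```python
import numpy as np

# Shackleton (1974): T = 16.9 - 4.38*d18O + 0.10*d18O^2, so dT/d(d18O) near
# d18O = 0 is about -4.38 C per per mil.
slope = abs(-4.38 + 0.20 * 0.0)

sT_cal = slope * 0.14    # Shackleton's 1965 replicate scatter, 0.14 per mil
sT_meas = slope * 0.10   # Keigwin's stated dO-18 precision, 0.1 per mil
print(sT_cal, sT_meas, np.hypot(sT_cal, sT_meas))  # ~0.61, ~0.44, ~0.75 C
```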
And so now we get to see something possibly never before seen anywhere: a proxy paleotemperature series with true, physically real, 95% confidence level 2-sigma systematic error bars. Here it is:

Figure 7: Keigwin’s Sargasso Sea dO-18 proxy paleotemperature series, [12] showing 2-sigma systematic measurement error bars. The blue rectangle is the 95% confidence interval centered on the mean temperature of 23.0 C.
Let’s be clear on what Keigwin accomplished. He reconstructed 3175 years of nominal Sargasso Sea dO-18 SSTs with a precision of (+/-)1.5 C at the 95% confidence level. That’s an uncertainty of 6.5% about the mean, and is a darn good result. I’ve worked hard in the lab to get spectroscopic titrations to that level of accuracy. Hats off to Keigwin.
But it’s clear that changes in SSTs on the order of 1-1.5 C can’t be resolved in those data. The most that can be said is that it’s possible Sargasso Sea SSTs were higher 3000 years ago.
If we factor in the uncertainty due to the (+/-)0.9 C variation among all the various T:dO-18 standard equations (Figure 6), then the Sargasso Sea 95% confidence interval expands to (+/-)2.75 C.
This (+/-)2.75 C = (uncertainty in experimenter d-O18 measurements) + (uncertainty in any given standard T:dO-18 equation) + (methodological uncertainty across all T:dO-18 equations).
So, (+/-)2.75 C is probably a good estimate of the methodological 95% confidence interval in any determination of a dO-18 paleotemperature. The confounding artifacts of paleo-variations in salinity, photosynthesis, upwelling, and meteoric water will bring further errors into any dO-18 reconstruction of paleotemperatures; errors that are invisible, but perhaps of analogous magnitude.
At the end, it’s true that the T:dO18 relationship is soundly based in physics. However, it is not true that the relationship has produced a reliably high-resolution proxy for paleotemperatures.
Part II: Pseudo-Science: Statistical Thermometry
Now on to the typical published proxy paleotemperature reconstructions. I’ve gone through a representative set of eight high-status studies, looking for evidence of science. Evidence of science is whether any of them make use of physical theory.
Executive summary: none of them are physically valid. Not one of them yields a temperature.
Before proceeding, a necessary word about correlation and causation. Here’s what Michael Tobis wrote about that, “If two signals are correlated, then each signal contains information about the other. Claiming otherwise is just silly.”
There’s a lot of that going around in proxythermometry, and clarification is a must. John Aldrich has a fine paper [16] describing the battle between Karl Pearson and G. Udny Yule over correlation indicating causation. Pearson believed it, Yule did not.
On page 373, Aldrich makes a very relevant distinction: “Statistical inference deals with inference from sample to population while scientific inference deals with the interpretation of the population in terms of a theoretical structure.”
That is, statistics is about the relations among numbers. Science is about deductions from a falsifiable theory.
We’ll see that the proxy studies below improperly mix these categories. They convert true statistics into false science.
To spice up the point, here are some fine examples of spurious correlations, and here are the winners of the 1998 Purdue University spurious correlations contest, including correlations between ice cream sales and death-by-drowning, and between ministers’ salaries and the price of vodka. Pace Michael Tobis, each of those correlated “signals” so obviously contains information about the other, and I hope that irony lays the issue to rest.
Diaz and Osuna [17] point out that distinguishing, “between alchemy and science … is (1) the specification of rigorously tested models, which (2) adequately describe the available data, (3) encompass previous findings, and (4) are derived from well-based theories. (my numbers, my bold)”
The causal significance of any correlation is revealed only within the deductive context of a falsifiable theory that predicts the correlation. Statistics (inductive inference) never, ever, of itself reveals causation.
AGW paleo proxythermometry will be shown to be missing Diaz and Osuna elements 1, 3, and 4 of science. That makes it alchemy, otherwise known as pseudoscience.
That said, here we go: AGW proxythermometry:
1. Thomas J. Crowley and Thomas S. Lowery (2000) “How Warm Was the Medieval Warm Period?” [18]
They used fifteen series: three dO-18 (Keigwin’s Sargasso Sea proxy, GISP 2, and the Dunde Ice cap series), eight tree-ring series, the Central England temperature (CET) record, an Iceland temperature (IT) series, and two plant-growth proxies (China phenology and Michigan pollen).
All fifteen series were scaled to vary between 0 and 1, and then averaged. The physical meaning of the five physically valid series (3 x dO-18, IT, and CET) was completely and utterly neglected. All of them were scaled to the same physically meaningless unitary bound.
Think about what this means: Crowley and Lowery took five physically meaningful series, and discarded the physics. That made the series fit for use in AGW-related proxythermometry.
There is no physical theory that converts tree-ring metrics into temperatures. That theory does not exist, and any exact relationship remains entirely obscure.
So then how did Crowley and Lowery convert their unitized proxy average into temperature? Well, “The two composites were scaled to agree with the Jones et al. instrumental record for the Northern Hemisphere…,” and that settles the matter.
In short, the fifteen series were numerically adjusted to a common scale, averaged, and scaled up to the measurement record. Then C&L reported their temperatures to a resolution of (+/-)0.05 C. Measurement uncertainty in the physically real series was ignored in their final composite. That’s how you do science, AGW proxythermometry style.
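As I read their description, the procedure amounts to something like the following sketch. All data here are invented stand-ins; this illustrates the compositing logic, not Crowley and Lowery’s actual code:

```python
import numpy as np

rng = np.random.default_rng(1)
proxies = rng.normal(size=(15, 500)).cumsum(axis=1)   # 15 fake 500-yr series
instrumental = rng.normal(14.0, 0.3, size=100)        # fake 100-yr record

# Step 1: force every series onto the same unitless 0..1 scale, discarding
# whatever physical units (dO-18, ring width, degrees C) each one had.
lo = proxies.min(axis=1, keepdims=True)
hi = proxies.max(axis=1, keepdims=True)
composite = ((proxies - lo) / (hi - lo)).mean(axis=0)  # Step 2: average

# Step 3: rescale the composite so its overlap with the instrumental period
# matches the instrumental mean and variance. The output now *looks* like
# temperature, though no physical theory connected the inputs to temperature.
overlap = composite[-100:]
pseudo_T = ((composite - overlap.mean()) / overlap.std()
            * instrumental.std() + instrumental.mean())
```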
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physical content: none.
Physical validity: none.
Temperature meaning of the final composite: none.
2. Timothy J. Osborn and Keith R. Briffa (2006) “The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years.” [19]
Fourteen proxies — eleven of them tree rings, one a dO-18 ice core (W. Greenland) — were divided by their respective standard deviations to produce a common unit magnitude, and then scaled into the measurement record. The ice core dO-18 had its physical meaning removed and its experimental uncertainty ignored.
Interestingly, between 1975 and 2000 the composite proxy declined away from the instrumental record. Osborn and Briffa didn’t hide the decline, to their everlasting credit, but instead wrote that this disconfirmation is due to, “the expected consequences of noise in the proxy records.”
I estimated the “noise” by comparing the proxy’s offset with respect to the temperature record; it’s worth about 0.5 C. It didn’t appear as an uncertainty on their plot. In fact, they artificially matched the 1856-1995 means of the proxy series and the surface air temperature record, making the proxy look like temperature. The 0.5 C “noise” divergence got suppressed, and looks much smaller than it really is. Actual 0.5 C “noise” error bars scaled onto the temperature record of their final Figure 3 would have made the whole enterprise theatrically useless, no matter that it is bereft of science in any case.
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physical uncertainty in T: none.
Physical validity: none.
Temperature meaning of the composite: none.
3. Michael E. Mann, Zhihua Zhang, Malcolm K. Hughes, Raymond S. Bradley, Sonya K. Miller, Scott Rutherford, and Fenbiao Ni (2008) “Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia.” [20]
A large number of proxies of multiple lengths and provenances. They included some ice core, speleothem, and coral dO-18, but the data are vastly dominated by tree ring series. Mann & co., statistically correlated the series with local temperature during a “calibration period,” adjusted them to equal standard deviation, scaled into the instrumental record, and published the composite showing a resolution of 0.1 C (Figure 3). Their method again removed and discarded the physical meaning of the dO-18 proxies.
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physical uncertainty in T: none.
Physical validity: none.
Temperature meaning of the composite: none.
4. Rosanne D’Arrigo, Rob Wilson, Gordon Jacoby (2006) “On the long-term context for late twentieth century warming.” [21]
Tree-ring series from 66 sites, variance adjusted, scaled into the instrumental record, and published with a resolution of 0.2 C (Figure 5C).
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physically valid temperature uncertainties: no
Physical meaning of the 0.2 C divisions: none.
Physical meaning of tree-ring temperatures: none available.
Temperature meaning of the composite: none.
5. Anders Moberg, Dmitry M. Sonechkin, Karin Holmgren, Nina M. Datsenko and Wibjörn Karlén (2005) “Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data.” [22]
Eighteen proxies: two foraminiferal dO-18 SST series (Sargasso and Caribbean Seas), one stalagmite dO-18 (Soylegrotta, Norway), and seven tree-ring series, plus other composites.
The proxies were processed using an excitingly novel wavelet transform method (it must be better), combined, variance adjusted, intensity scaled to the instrumental record over the calibration period, and published with a resolution of 0.2 C (Figure 2 D). Following standard practice, the authors extracted the physical meaning of the dO-18 proxies and then discarded it.
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physical uncertainties propagated from the dO18 proxies into the final composite? No.
Physical meaning of the 0.2 C divisions: none.
Temperature meaning of the composite: none.
6. B.H. Luckman, K.R. Briffa, P.D. Jones and F.H. Schweingruber (1997) “Tree-ring based reconstruction of summer temperatures at the Columbia Icefield, Alberta, Canada, AD 1073-1983.” [23]
Sixty-three regional tree ring series, plus 38 fossilwood series; used the standard statistical (not physical) calibration-verification function to convert tree rings to temperature, overlaid the composite and the instrumental record at their 1961-1990 mean, and published the result at 0.5 C resolution (Figure 8). But in the text they reported anomalies to (+/-)0.01 C resolution (e.g., Tables 3&4), and the mean anomalies to (+/-)0.001 C. That last is 10x greater claimed accuracy than the typical rating of a two-point calibrated platinum resistance thermometer within a modern aspirated shield under controlled laboratory conditions.
Any physical theory employed?: No
Strictly statistical inference?: Yes
Physical meaning of the proxies: none.
Temperature meaning of the composite: none.
7. Michael E. Mann, Scott Rutherford, Eugene Wahl, and Caspar Ammann (2005) “Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate.” [24]
This study is, in part, a methodological review of the recommended ways to produce a proxy paleotemperature made by the premier practitioners in the field:
Method 1, the composite-plus-scale (CPS) method: “a dozen proxy series, each of which is assumed to represent a linear combination of local temperature variations and an additive “noise” component, are composited (typically at decadal resolution;…) and scaled against an instrumental hemispheric mean temperature series during an overlapping “calibration” interval to form a hemispheric reconstruction. (my bold)”
Method 2, Climate field reconstruction (CFR): “Our implementation of the CFR approach makes use of the regularized expectation maximization (RegEM) method of Schneider (2001), which has been applied to CFR in several recent studies. The method is similar to principal component analysis (PCA)-based approaches but employs an iterative estimate of data covariances to make more complete use of the available information . As in Rutherford et al. (2005), we tested (i) straight application of RegEM, (ii) a “hybrid frequency-domain calibration” approach that employs separate calibrations of high (shorter than 20-yr period) and low frequency (longer than 20-yr period) components of the annual mean data that are subsequently composited to form a single reconstruction, and (iii) a “stepwise” version of RegEM in which the reconstruction itself is increasingly used in calibrating successively older segments. (my bold)”
Restating the obvious: CPS: Assumed representative of temperature; statistical scaling into the instrumental record; methodological correlation = causation. Physical validity: none. Scientific content: none.
CFR: Principal component analysis (PCA): a numerical method devoid of intrinsic physical meaning. Principal components are numerically, not physically, orthogonal. Numerical PCs are typically composites of multiple decomposed (i.e., partial) physical signals of unknown magnitude. They have no particular physical meaning. Quantitative physical meaning cannot be assigned to PCs by reference to subjective judgments of ‘temperature dependence.’
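To make that concrete, here is what a PCA of a proxy matrix actually yields; a sketch on random data:

```python
import numpy as np

rng = np.random.default_rng(2)
proxies = rng.normal(size=(50, 300))     # invented: 50 proxy series, 300 years

# Center each series, then decompose. The components are the orthogonal
# directions of maximum numerical variance; orthogonality is a property of
# the matrix algebra, not of any physical mechanism.
centered = proxies - proxies.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

pcs = Vt[:5]                  # leading five PC time series
loadings = U[:, :5] * s[:5]   # how strongly each proxy projects onto each PC
# Nothing in U, s, or Vt says which mixture of temperature, moisture, CO2
# fertilization, etc., any given component represents.
```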
Scaling the PCs into the temperature record? Correlation = causation.
Correlation = causation is possibly the most naive error possible in science. Mann, et al., unashamedly reveal it as undergirding the entire field of tree-ring proxy thermometry.
Scientific content of the Mann-Rutherford-Wahl-Ammann proxy method: zero.
Finally, an honorable mention:
8. Rob Wilson, Alexander Tudhope, Philip Brohan, Keith Briffa, Timothy Osborn, and Simon Tett (2006), “Two-hundred-fifty years of reconstructed and modeled tropical temperatures.” [25]
Wilson, et al., reconstructed 250 years of SSTs using only coral records, including dO-18, strontium/calcium, uranium/calcium, and barium/calcium ratios. I’ve not assessed the latter three in any detail, but inspection of their point scatter is enough to imply that none of them will yield more accurate temperatures than dO-18.
However, all the Wilson, et al., temperature proxies had real physical meaning. What a great opportunity to challenge the method, and discuss the impacts of salinity, biological disequilibrium, and how to account for them, and explore all the other central elements of stable isotope marine temperatures.
So what did they do? Starting with about 60 proxy series, they threw out all those that didn’t correlate with local gridded temperatures. That left 16 proxies, 15 of which were dO-18. Why didn’t the other proxies correlate with temperature? Rob Wilson & co., were silent on the matter. After tossing two more proxies to avoid the problem of filtering away high frequencies, they ended up with 14 coral SST proxies.
After that, they employed standard statistical processing: divide by the standard deviation, average the proxies together (they used the “nesting procedure,” which adjusts for individual proxy length), and scale up to the instrumental record.
The honorable mention for these folks derives from the fact that they used only physically real proxies, and then discarded the physical meaning of all of them.
That puts them ahead of the other seven exemplars, who included proxies that had no known physical meaning at all.
Nevertheless,
Any physical theory employed?: No
Strictly statistical inference?: Yes
Any physically valid methodology? No.
Physical meaning of the proxies: present and accounted for, and then discarded.
Temperature meaning of the composite: none.
Summary Statement: AGW-related paleo proxythermometry as ubiquitously practiced consists of composites that rely entirely on statistical inference and numerical scaling. Not only do the composites have no scientific content, the methodology actively discards scientific content.
Statistical methods: 100%.
Physical methods: nearly zero (stable isotopes excepted, but their physical meaning is invariably discarded in composite paleoproxies).
Temperature meaning of the numerically scaled composites: zero.
These eight studies are typical, and representative of the entire field of AGW-related proxy thermometry. As commonly practiced, it is a scientific charade. It’s pseudo-science through-and-through.
Stable isotope studies are real science, however. That field is cooking along and the scientists involved are properly paying attention to detail. I hereby fully except them from my general condemnation of the field of AGW proxythermometry.
With this study, I’ve now examined the reliability of all three legs of AGW science: Climate models (GCMs) here (calculations here), the surface air temperature record here (pdf downloads, all), and now proxy paleotemperature reconstructions.
Every one of them thoroughly neglects systematic error. The neglected systematic error shows that none of the methods – not one of them — is able to resolve or address the surface temperature change of the last 150 years.
Indeed, the pandemic pervasiveness of this neglect is the central mechanism by which AGW alarmism survives. This has been going on for at least 15 years; for GCMs, 24 years. Granting integrity, one can only conclude that the scientists, their reviewers, and their editors are uniformly incompetent.
Summary conclusion: When it comes to claims about unprecedented this-or-that in recent global surface temperatures, no one knows what they’re talking about.
I’m sure there are people who will dispute that conclusion. They are very welcome to come here and make their case.
References:
1. Mann, M.E., R.S. Bradley, and M.K. Hughes, Global-scale temperature patterns and climate forcing over the past six centuries. Nature, 1998. 392(6678): p. 779-787.
2. Dansgaard, W., Stable isotopes in precipitation. Tellus, 1964. 16(4): p. 436-468.
3. McCrea, J.M., On the Isotopic Chemistry of Carbonates and a Paleotemperature Scale. J. Chem. Phys., 1950. 18(6): p. 849-857.
4. Urey, H.C., The thermodynamic properties of isotopic substances. J. Chem. Soc., 1947: p. 562-581.
5. Brand, W.A., High precision Isotope Ratio Monitoring Techniques in Mass Spectrometry. J. Mass. Spectrosc., 1996. 31(3): p. 225-235.
6. Kim, S.-T., et al., Oxygen isotope fractionation between synthetic aragonite and water: Influence of temperature and Mg2+ concentration. Geochimica et Cosmochimica Acta, 2007. 71(19): p. 4704-4715.
7. O’Neil, J.R., R.N. Clayton, and T.K. Mayeda, Oxygen Isotope Fractionation in Divalent Metal Carbonates. J. Chem. Phys., 1969. 51(12): p. 5547-5558.
8. Epstein, S., et al., Revised Carbonate-Water Isotopic Temperature Scale. Geol. Soc. Amer. Bull., 1953. 64(11): p. 1315-1326.
9. Bemis, B.E., et al., Reevaluation of the oxygen isotopic composition of planktonic foraminifera: Experimental results and revised paleotemperature equations. Paleoceanography, 1998. 13(2): p. 150-160.
10. Li, X. and W. Liu, Oxygen isotope fractionation in the ostracod Eucypris mareotica: results from a culture experiment and implications for paleoclimate reconstruction. Journal of Paleolimnology, 2010. 43(1): p. 111-120.
11. Friedman, G.M., Temperature and salinity effects on 18O fractionation for rapidly precipitated carbonates: Laboratory experiments with alkaline lake water - Perspective. Episodes, 1998. 21: p. 97-98.
12. Keigwin, L.D., The Little Ice Age and Medieval Warm Period in the Sargasso Sea. Science, 1996. 274(5292): p. 1503-1508; data site: ftp://ftp.ncdc.noaa.gov/pub/data/paleo/paleocean/by_contributor/keigwin1996/.
13. Shackleton, N.J., Attainment of isotopic equilibrium between ocean water and the benthonic foraminifera genus Uvigerina: Isotopic changes in the ocean during the last glacial. Colloq. Int. C.N.R.S., 1974. 219: p. 203-209.
14. Shackleton, N.J., The high-precision isotopic analysis of oxygen and carbon in carbon dioxide. J. Sci. Instrum., 1965. 42(9): p. 689-692.
15. Barrera, E., M.J.S. Tevesz, and J.G. Carter, Variations in Oxygen and Carbon Isotopic Compositions and Microstructure of the Shell of Adamussium colbecki (Bivalvia). PALAIOS, 1990. 5(2): p. 149-159.
16. Aldrich, J., Correlations Genuine and Spurious in Pearson and Yule. Statistical Science, 1995. 10(4): p. 364-376.
17. Díaz, E. and R. Osuna, Understanding spurious correlation: a rejoinder to Kliman. Journal of Post Keynesian Economics, 2008. 31(2): p. 357-362.
18. Crowley, T.J. and T.S. Lowery, How Warm Was the Medieval Warm Period? AMBIO, 2000. 29(1): p. 51-54.
19. Osborn, T.J. and K.R. Briffa, The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years. Science, 2006. 311(5762): p. 841-844.
20. Mann, M.E., et al., Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia. Proc. Natl. Acad. Sci., 2008. 105(36): p. 13252-13257.
21. D’Arrigo, R., R. Wilson, and G. Jacoby, On the long-term context for late twentieth century warming. J. Geophys. Res., 2006. 111(D3): p. D03103.
22. Moberg, A., et al., Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data. Nature, 2005. 433(7026): p. 613-617.
23. Luckman, B.H., et al., Tree-ring based reconstruction of summer temperatures at the Columbia Icefield, Alberta, Canada, AD 1073-1983. The Holocene, 1997. 7(4): p. 375-389.
24. Mann, M.E., et al., Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate. J. Climate, 2005. 18(20): p. 4097-4107.
25. Wilson, R., et al., Two-hundred-fifty years of reconstructed and modeled tropical temperatures. J. Geophys. Res., 2006. 111(C10): p. C10007.
William Connolley cites a poster here who said: “It is beyond my knowledge to judge the validity of Pat Frank’s thesis, without years of study, but it seems thorough and is well presented”.
This gets precisely to the center of the problem with 90% of skeptics. Even though most of them know very little about climate science (and probably not much about science in general), anything that supports their preconceived position is immediately praised, and anything that goes against it is immediately damned. Thus the praise for Pat Frank’s post. The only way to judge its validity is to see if he can get it published in a reputable journal. What’s the betting that it never gets beyond WUWT?
Brian, speaking only for myself, I report 1-sigma. But among physical scientists, 1-sigma is enough. Everyone knows what it means.
Monty, you wrote, “The only way to judge its validity is to see if he can get it published in a reputable journal.”
Not correct. The only way to judge its validity is on the internal merits. The same is true of any scientific argument.
OK Pat. Where are you going to submit this? GRL? Climate of the Past? Honestly, I encourage you to. If it passes peer review and has an impact then you will have made a contribution. A blog post just doesn’t cut it I’m afraid.
Monty, your original point was about “validity.” Having lost that point, you’ve now shifted your ground to “impact.”
Having an “impact” depends on whether a result is noticed. Publication doesn’t guarantee that. Whether the result is correct or not is the baseline issue. I stand by my results, and they’re all right here to be judged. So far those in opposition — Connolley, Tobis, Thirumalai — have been objectively ineffective.
Kaustubh Thirumalai, who has his own blog, has wasted space on a long essay in reply that is nothing more than a personal attack. One would think that, as a graduate student in the very field, he’d have made a quantitative rebuttal. Guess not.
I’d suggest a blog post *does* cut it, in that the argument can be completely valid. What sort of impact it has depends on the reaction and position of those who notice it.
This is just a long-winded attempt to justify not sending this to peer review, where it can be judged by experts in the field rather than praised by sycophants who don’t understand a word of it. The only conclusion that can be drawn is that you know it wouldn’t get into a reputable journal. I can only imagine the fuss that the skeptics would make if climate science papers were similarly treated.
You don’t understand a word of it either, Monty, and yet you criticise anyway. That makes you their opposite — a bombast also with a worthless opinion.
I’ve already published three peer-reviewed critical articles on the neglect of error in AGW-related climate science, and expect to have the surface air temperature record by the short hairs in my next paper. Maybe after that, I’ll write up a more complete analysis of the effect of measurement error on the reliability of temperature proxies. But that would concern physically real proxies. The temperature-proxy-by-statistical-scaling field is hopeless pseudo-science. A criticism of that would have to be published in a philosophical journal.
You said; “You don’t understand a word of it either, Monty, and yet you criticise anyway”.
As a matter of fact, I have a PhD in climate science and have published around 70 papers in the peer-reviewed science literature on various aspects of climate change. These include lots of papers in the leading journals in my field.
Now, I’m not an expert in the use of oxygen isotope ratios but that’s not the point. The point is that such experts do exist and it is very telling that you dare not expose your ‘research’ to their scrutiny.
Makes me think your ‘research’ isn’t up to much.
Pat:
1) A paired t-test is an experimental design that doesn’t demand that the uncertainty in the experimental and control groups be added in quadrature. Likewise, with foraminifera, uncertainty can be reduced by analyzing only samples that are expected to fall on one calibration line, instead of the full range of possibilities shown in Figure 6.
2) In the real world, the uncertainty introduced using any calibration curve or standard curve is always established by running an adequate number of positive controls. No matter what you think the uncertainty should be from error propagation, the scientists running these experiments know what the uncertainty IS with positive controls. When they say the temperatures reconstructed for positive controls are typically good to 0.5 degC, there is no point in arguing. If you don’t have access to positive controls, the uncertainty can be determined from the confidence intervals for the derived slope and intercept (linear fit) or coefficients a, b and c (quadratic fit) determined during the least-squares fit. See the sixth paragraph on calibration curves in Wikipedia.
4) You advocate adding in quadrature the isotope uncertainty during calibration to the isotope uncertainty during analysis; a procedure that appears flawed: For simplicity, consider a linear calibration. When calibration is done with a precision of 0.1%o at two temperatures, 21 and 24 degC (roughly the range of the Sargasso Sea), the slope and intercept will be determined within certain confidence intervals. If the calibration is performed with samples every 2 degC between 16 and 28 degC, the slope and intercept will have much tighter confidence intervals (particularly the slope), despite being constructed from equally precise isotopic data. The reliability of a standard curve depends on more than just precision of the raw data used to construct it. The uncertainty in the isotope data enters the error propagation analysis ONCE and is combined with the uncertainty in the parameters derived when fitting the standard curve. Since we can get better answers from positive controls, this type of error propagation analysis is rarely needed.
5) To some extent, you MAY be trying to add SYSTEMATIC error that might be introduced by changes in salinity, light, and other factors to the experimental uncertainty (random error). This can’t be done: experimental uncertainty can be quantified by statistical analysis of observations; systematic error cannot. When systematic bias can be accurately quantified, we remove it. If salinity or some other factor has changed enough over time to affect the O18-temperature relationship, then Keigwin’s reconstruction will have a systematic error, not a larger experimental uncertainty. All scientists know that reasonably tight experimental results (temp SE =<0.5 degC, or p<0.01) can be invalidated by systematic error. Experimental variability is covered statistically and quantitatively; possible sources of systematic error are discussed qualitatively in the paper.
The amount of O18 in seawater increased during the ice ages as O18-depleted ice accumulated on land. Someone has demonstrated that this systematic error is large enough to invalidate temperature reconstructions from foraminifera extending into the ice ages, so O18 in foraminifera is not used for those periods. For the same reason, a systematic error due to changing salinity or seawater O18 COULD invalidate Keigwin’s reconstruction. Knowing that one can produce a change in the isotope/temperature calibration by making a large change in salinity in the laboratory doesn’t demonstrate a systematic error in Keigwin’s reconstruction. First, you need to estimate how much salinity (like O18 during the ice ages) might have changed with time in the Sargasso Sea, and then consult laboratory experiments with salinity to see how big a systematic error that might produce.
6) Several papers I glanced at show inadequate accuracy in reconstructions of SST at different locations, demonstrating the existence of systematic error when this methodology is used at different sites. Bemis is trying to identify the cause of these systematic errors and develop methods for correcting them. Using these proxies to determine absolute temperature at different locations is a perilous operation, but assessing temperature CHANGE at one site seems reasonable.
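To make point 4 above concrete, here is a minimal Python sketch (all numbers invented for illustration; numpy’s polyfit with cov=True returns the parameter covariance from which the confidence intervals come):

import numpy as np

# Hypothetical dense calibration: samples every 2 degC from 16 to 28 degC,
# each dO-18 measurement carrying ~0.1 per-mil scatter.
rng = np.random.default_rng(1)
T = np.arange(16.0, 29.0, 2.0)                        # 16, 18, ..., 28 degC
d18O = -0.22 * T + 3.5 + rng.normal(0.0, 0.1, T.size)

# Linear least-squares fit; cov=True returns the parameter covariance matrix.
(slope, intercept), cov = np.polyfit(T, d18O, 1, cov=True)
slope_se, intercept_se = np.sqrt(np.diag(cov))

# A two-point calibration (21 and 24 degC only) leaves zero residual degrees
# of freedom, so the fit cannot even estimate its own parameter uncertainties;
# the dense design pins them down despite equally precise raw data.
print(f"slope     = {slope:+.3f} +/- {slope_se:.3f} %o/degC")
print(f"intercept = {intercept:+.2f} +/- {intercept_se:.2f} %o")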
“The first principle is that you must not fool yourself – and you are the easiest person to fool.” Richard Feynman, in Cargo Cult Science.
Monty, if you’re as expert as you say, you’d have no problem evaluating and criticising my analysis yourself. You’d also know that the merit of any scientific case depends on content, and in no way on peer-review.
And yet, you’ve posted here four times without one word of objective criticism, and without evidencing any understanding of the criteria of validity of a scientific argument.
My analysis is here in public, on a prominent website read by millions and available for criticism by any professional including you. It could not be more exposed than it would be on any open access journal peer-review.
Any scientist can come here with the opportunity to knock it down in full view of a sophisticated reading public. If you have some valid criticism, go ahead and make it. Otherwise, you’re just making noise.
I can only imagine the screaming and shouting from the skeptics if climate scientists posted their research on ‘pro-AGW’ blogs and refused to submit to peer-review. It’s clear that the only reason that you refuse to submit to a leading science journal is that you know your ‘research’ is flawed and that it wouldn’t pass peer review.
There are open-access journals such as Climate of the Past Discussions where everyone can see the refereeing process as it happens (I am often asked to referee papers there myself). You could submit there….but I bet you don’t.
And not a single criticism from the sycophants who populate this blog! Amazing!
Monty needs to debate the work presented here … not the venue chosen.
An attempt at pea-shuffling seems to be all he can offer … much like his posts on blogs such as Deltoid.
He consistently says things such as, “… who always trumpets his PhD … You are right to be suspicious …” and “I’m always suspicious of those who make a song and dance about their qualifications.”
Yet he follows up, as here, with “As an aside; I have a PhD and I’m a climate scientist.”
Posted by: monty | June 24, 2011 8:57 AM http://scienceblogs.com/deltoid/2011/06/the_conversation_on_climate_ch_2.php
” I have a PhD in a relevant subject but don’t feel the need to advertise it with my blog name.”
Posted by: monty | March 21, 2011 5:57 AM http://scienceblogs.com/deltoid/2011/03/ian_enting_on_climate_science.php
Sorry Kim, I don’t quite understand. The only reason I mentioned I was an academic working in climate science is because Pat wrote “You don’t understand a word of it either, Monty, and yet you criticise anyway”. Had he not said this I wouldn’t have felt the need to advertise my qualifications.
My basic point still stands. For all his bluster, Pat Frank will not allow expert scientists to review his work. If there are any lurkers here, you can draw your own conclusions.
Monty says:
April 11, 2012 at 7:13 am
Sorry Kim, I don’t quite understand. The only reason I mentioned I was an academic working in climate science is because Pat wrote “You don’t understand a word of it either, Monty, and yet you criticise anyway”. Had he not said this I wouldn’t have felt the need to advertise my qualifications.
xxxxxxxxxxxxxxxxxxxxx
C’mon you advertise your qualifications on many threads. [ See links in above post ].
xxxxxxxxxxxxxxxxxxxxxx
Monty says:
April 11, 2012 at 7:13 am
“My basic point still stands. For all his bluster, Pat Frank will not allow expert scientists to review his work. If there are any lurkers here, you can draw your own conclusions.”
xxxxxxxxxxxxxxxxxxxxxxxxxx
For a PhD, don’t you have to learn about “logic fallacies”? Your logic fallacies here are your continued insistence on an “Appeal to Authority”, compounded by the “Red Herring” fallacy. [ Actually, you also throw in a “Straw Man” fallacy ].
[ Learn about them here ] http://www.fallacyfiles.org/redherrf.html
1: Expert Scientists have embraced numerous faulty papers [ Appeal to Authority ]
2: Diverting from the debated paper / conclusions made by Mr Pat Frank [ Red Herring ]
3: Experts CAN review here – Mr Pat Frank hasn’t refused access [ Straw man ].
Unlike echo-chambers and edit-mills…. WUWT will teach you debate skills.
Actually, yes I would appeal to authority here. I’m guessing that the world’s leading experts in isotope chemistry and paleoclimatology probably do know more than Pat Frank about isotope chemistry and paleoclimatology. His refusal to submit his ‘research’ so that it can be judged by such experts is damning. I wonder what the odds are of Pat Frank ever making a significant contribution to this field? Zero?
Kim2000, you must be in possession of some pretty impressive intellectual blinkers if you can’t see that.
Bye bye.
Monty says:
April 11, 2012 at 9:45 am
Actually, yes I would appeal to authority here.
xxxxxxxxxxxxxx
Not yours!
That is what you are demanding. You want Mr Pat Frank to surrender to YOUR authority and submit to YOUR desires.
My intellectual “blinkers” saw through your logic fallacies.
You’re dismissed 🙂 bye bye
Frank, using your numbering:
1) A paired t-test only reveals correlation. It says nothing about accuracy (or precision).
In Figure 6 (Bemis Figure 2), the shifts in the standard lines are due to uncontrolled variables. That means unknown influences that materially impact the result. In turn, that means no one knows why the various standard lines differ. For any given new data set, no one knows which of those lines is relevant, or whether any of them is relevant.
2) You wrote, “When they say … there is no point in arguing.” You’re making an argument from authority. Invalid to the max.
McCrae reported the same uncertainty as I derived. I quoted him above: “The average deviation of an individual experimental result from this relation is 2°C in the series of slow precipitations just described.” The point scatter in the other calibrations led to uncertainties of similar magnitude, and the error envelopes behaved as systematic error. You have no case.
Taking your error in slope and intercept, for any line you’d have this: y = [M(+/-)m]*x + [B(+/-)b], where M is the mean slope, m is the uncertainty in the slope, B is the mean intercept, and b is the uncertainty in the intercept.
With two slopes and two intercepts about the mean, you have five lines that describe any one relation between x and y (or T and dO-18): the mean line and the four lines that bound your uncertainty.
Given a T:dO-18 data set, every dO-18 will have five temperatures associated with it; the mean temperature and four uncertainty-bound temperatures about that mean, i.e., Tmean and t1, t2, t3, t4, where the t’s are the four temperatures that define your uncertainty bound. Those four uncertainty temperatures combine to give you a standard deviation around your Tmean, calculated as sqrt[sum of (t-T)^2/3].
You can’t escape it, Frank. And note that the uncertainty is systematic not random.
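A toy numeric version of that recipe (slope, intercept, and uncertainties invented for illustration, not taken from any calibration above):

import numpy as np

# Invented calibration: T = M*d18O + B, with uncertainties m and b.
M, m = -4.5, 0.3          # mean slope and its uncertainty, degC per %o
B, b = 16.9, 0.4          # mean intercept and its uncertainty, degC

d18O = -2.0               # a measured isotope ratio, %o

# The mean line plus the four uncertainty-bound lines give five temperatures.
T_mean = M * d18O + B
t = np.array([(M + m) * d18O + (B + b),
              (M + m) * d18O + (B - b),
              (M - m) * d18O + (B + b),
              (M - m) * d18O + (B - b)])

# Standard deviation about T_mean, per sqrt[sum of (t - T)^2 / 3].
sigma = np.sqrt(np.sum((t - T_mean) ** 2) / 3.0)
print(f"T = {T_mean:.2f} +/- {sigma:.2f} degC")        # 25.90 +/- 0.83 degC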
3) You had no point three.
4) I’m using standard error propagation, Frank. I’m not “advocating” it. The method is a basic tool in science and has been in common use for more than 100 years.
You wrote, “When calibration is done with a precision of 0.1%o…” You still haven’t realized that the point scatter comes from the entire methodology. Mass spec is only part of that.
The rest of your point is just hand-waving, ‘will this’ and ‘would that.’ Speculations like those contribute nothing without having done the experiment.
In my recent post, I showed that the uncertainties in Kim & O’Neil’s “positive controls” are (+/-)2 C to (+/-)4 C. Your claim about “better answers from positive controls” has already been refuted by direct demonstration.
Regarding the Sargasso Sea analysis, Keigwin didn’t report any experimental error, and none of his dO-18 points include error bars. But he reported that salinity statistically accounted for 30% of his isotopic signal in his calibration against recent SSTs.
Since we do not know how salinity changed at that location across the 3000 years of Keigwin’s analysis (he assumed constant salinity, an assumption refuted by the recent variations in salinity), a properly conservative and critical view of uncertainty would put 30% error bars on his dO-18 reconstructed SSTs. That would be (+/-)7 C.
So, if anything, the (+/-)0.75 C uncertainty I calculated just from Keigwin’s point scatter is extremely generous.
5) The experimental error behaves as systematic, not random. Figures 2-5 and Table 1 report the errors in calibration experiments. They are not subject to unknowns of salinity, light, or temperature.
The uncertainties I calculated for Keigwin come from his method, not from unknown marine variables. I’ve explained this to you now at least twice, e.g., here, and in 7) and 8) here, and you still repeat the mistake. Your entire argument about Keigwin is based in a persistent misconception.
6) Assessing temperature differences does not remove any uncertainty due to systematic error. Systematic error is removed by differencing only if it is constant and of known magnitude. Neither of those conditions is true for dO-18 proxy temperatures, most especially for those representing paleo-temperatures.
Feynman’s quote is a double-pointed spear, Frank. It points at you as much as me. Speaking for myself, I’ve had no trouble objectively defending my analysis.
Monty, you’re obviously hostile to my analysis. It’s clear that if you were able to make a criticism, you’d have done so. However, your posts remain insubstantial and your competence remains undemonstrated.
In the absence of competence, your argument about peer-review is no more than an argument from authority.
My field is x-ray absorption spectroscopy applied to elements of biological interest, especially transition metals. Were there a blog critical of the method, it would be no problem to evaluate the criticisms and either dispute them or agree with them, in some detail.
This is typical of any scientist encountering his/her field. However, your posts are empty of any trace of scientific familiarity. You haven’t even shown a familiarity with common error analysis.
You’ve given us no reason to think you know what you’re talking about.
As to my analysis, it’s only a couple of weeks old. Who knows, I may decide to write it up formally and submit it somewhere. But I’m in the middle of a more extended air temperature reliability study, and that’s first priority for committed time. The proxy business was a side-light consequent to my conversation with Michael Tobis.
Monty wrote, “I mentioned I was an academic working in climate science … because Pat wrote “You don’t understand a word of it either, Monty, and yet you criticize anyway”. Had he not said this I wouldn’t have felt the need to advertise my qualifications.”
“My basic point still stands. For all his bluster, Pat Frank will not allow expert scientists to review his work.”
But you claim to be an “expert scientist,” Dr. academic-working-in-climate-science-Monty — complete with “qualifications.” And yet you’ve been unable to produce a single substantive sentence.
So, what are we to conclude?
From the evidence we have two choices: either you’re not a climate scientist, or one can be a climate scientist without displaying any competence.
Any climate scientist can come here and lay on the criticism. I’ve been here consistently, taking up all challenges. Unlike you, who has sniped without end. And then you accuse me of bluster. What a laugh. I can’t allow or disallow anything here. I’ve no control over posting.
You’re a climate scientist of high standing — we all know that because you’ve said so. So, how about you round up a few of your climate scientist buddies and come back with something relevant to say. I’ll be here, and you’ll all be allowed to post whatever you like within the bounds of Anthony’s posted blog ethics.
Put up or shut up, Monty.
Monty wrote, “I’m guessing that the world’s leading experts in isotope chemistry and paleoclimatology probably do know more than Pat Frank about isotope chemistry and paleoclimatology.”
Except that my post is about error analysis. Irrelevant again, Monty.
Pat Frank says:
April 10, 2012 at 10:01 am
You don’t understand a word of it either, Monty, and yet you criticise anyway. That makes you their opposite — a bombast also with a worthless opinion….
_______________________________________
What makes Monty’s criticism without any facts here on WUWT so interesting is that Monty has a PhD in Physics. (I followed the link in his name when he first came onto WUWT.) However, Monty does not use his physics background in his rebuttals; instead he seems to be using Alinsky’s Rules for Radicals.
I have been following WUWT for years. Most people here seem to have a high level of science background, as some of the more lively discussions have shown. The newest crop of trolls seems to be following the USDA’s handbook suggesting staff address farmers at a sixth-grade level. I find that extremely insulting. At work I routinely sat in on critiques for new products where the chemistry and engineering were discussed, and was on many occasions able to spot problems based on logic and not on intimate knowledge of the process.
A scientific paper SHOULD be written so others can follow it. Unfortunately, when writing for peer-reviewed journals, bafflegab pays. Dr. Scott Armstrong even wrote a paper on it.
Pat: I’m painfully aware that the Feynman quote points both ways. I immediately and clearly acknowledged at one point my gross error in accepting the resolution of the mass spec as the uncertainty in isotope measurements. I didn’t fully understand where all of the numbers came from in your calculated error bars for Keigwin, and I should have acknowledged that your replies did clear that up. (You introduced uncertainty from Shackleton, not Bemis.) However, you haven’t acknowledged that ANY of the points I have made might have any value. Of course, they could all be wrong, so I did ask if I was fooling myself (as Feynman recommends).
1) When you objected to my initial estimate of an uncertainty of 0.5 degC, I did some research that led me to Lea’s published paper with a value of 0.5 degC. Lea could be wrong, but someone who understands the field does agree with me.
2) When you continued to insist on your method of adding in quadrature Shackleton’s calibration error to Keigwin’s experimental error, I dug up a Wikipedia reference that confirmed my initial thought that you should have used the uncertainty in the least-squares coefficients obtained by Shackleton. Wikipedia isn’t a great source, but it reduced the chance I might be fooling myself.
3) My inability to find a better authority than Wikipedia on the error introduced by standard curves eventually reminded me that such error is invariably established experimentally by running control samples – not by error propagation. (Do you have any experience with standard curves? The analytical chemists and other scientists I have worked with always include positive control samples in every experiment.) If I had designed Keigwin’s experimental work, analysis of the “unknown” samples from the sediment core would have been randomly interspersed with multiple control samples grown at several different temperatures 0.5 degC apart and covering the full range of expected temperature. There would be no doubt about whether the experimental technique used with THESE samples reliably resolved temperature CHANGES of 0.5 or 1.0 degC. (Those with experience in this field would have access to control samples from earlier projects.) When I said that Keigwin should know his uncertainty from EXPERIMENTAL control samples, you dismissed this as an “appeal to authority”.
4) You have intermixed discussion of random and systematic error. You haven’t acknowledged that moving from one calibration line to another is a systematic error, but that no matter which calibration line you are on, a CHANGE of 0.2%o in isotope ratio always translates to about a 1 degC temperature CHANGE. (The uncertainty inherent in a CHANGE in isotope ratio is determined by adding the standard errors of the individual isotope measurements in quadrature. With equal variances and sample sizes, the uncertainty increases by a factor of 1.4.) Although you have shown that systematic errors are easy to find BETWEEN sites, you have refused to discuss in quantitative terms how much various factors (like salinity, sea water isotope ratio, and light) would need to change OVER TIME to introduce a significant error in Keigwin’s reconstruction.
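To put numbers on that parenthetical (a sketch with a hypothetical 0.1%o standard error):

import numpy as np

# Uncertainty of a CHANGE in isotope ratio: the two standard errors add in
# quadrature, so with equal SEs the result grows by sqrt(2) ~ 1.4.
se = 0.1                               # hypothetical per-measurement SE, %o
se_change = np.sqrt(se**2 + se**2)     # ~0.14 %o
degC_per_permil = 1.0 / 0.2            # ~1 degC per 0.2 %o, as stated above
print(f"SE of the change: {se_change:.3f} %o "
      f"-> {se_change * degC_per_permil:.2f} degC")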
The internet seems to be full of climate change skeptics who need to be reminded of Feynman’s saying “the easiest person to fool is yourself”. Perhaps you will recognize that your replies have forced me to test what I previously believed to be correct and that some of my criticisms have survived my review. Whether you are capable of applying Feynman’s quote to your own work remains in doubt as long as you seem to be “stonewalling” constructive criticism.
Gail, I don’t understand Monty’s approach either. As a physicist he should have objected quantitatively. But then Arthur Smith at Planet3.org responded similarly, and he’s a Ph.D. physicist, too.
Your experience in meetings matches my own. A professional with experience and training can spot logical errors in an out-of-specialty process. In fact, often those without any specialist training can spot logical flaws in a process, though not errors in the science or the data.
You’re right about bafflegab, too. Steve McIntyre showed beyond any doubt that Michael Mann engaged in exactly that when writing MBH98/99. His obscurantism in phrasing and methodology were deliberately calculated to impress without informing.
It happens elsewhere, too. I can recall a graduate student complaining that her advisor told her she’d made their paper “too pedagogical.” By that, he meant she’d been too clear in explaining what they’d done and how they’d done it.
People, even many scientists, are afraid to ask for clarifications for fear of revealing ignorance (or stupidity). The ambitious, the egotistical and, unfortunately, the dishonest exploit that fear to their benefit.
Frank, first off, the Feynman quote came at the end of your long set of criticisms. It carried all the implicit meaning of being directed at me.
Second, I’m sorry to observe that you haven’t made any substantive points.
Frank, you strike me as a good guy. You’ve been unfailingly polite and have come across as sincere and honest, and doing your best. I’ve appreciated that.
But try as you might, it was quickly clear that you started your criticism without knowing anything about mass spectrometry, without knowing anything about how the dO-18 proxy works, without knowing anything about measurement error or the difference between accuracy and precision, and without knowing anything about how to propagate error. But that didn’t stop you from sailing in.
On the other hand, you have tried to provide substance while knowing nothing, while Monty has provided nothing while claiming everything. So, you get an “A” for effort, in any case, even if not for content.
Following your numbering:
1) Lea doesn’t agree with you. You agree with Lea, which is an entirely different matter.
Look at Lea, Table 1. The estimated 0.5 C SE is relevant “when d18O-sw is known.” That means it’s relevant when the dO-18 content of sea water is known. However, sea water paleo-dO-18 is not known. The 0.5 C SE is irrelevant for dO-18 paleo-temperature reconstructions.
Second, none of the Table 1 estimated SE’s are referenced to a study. We don’t know where they came from, or how those estimates were derived. They are apparently Lea’s own ball-park estimates.
In discussing proxy calibrations and their error, Lea cites McCrae, 1950; Epstein, 1953; Shackleton, 1974; Kim & O’Neil, 1997; Bemis, 1998; and Zhou & Zheng, 2003.
I’ve already discussed the measurement error in McCrae in Figures 2&3, in Epstein, 1953 in Figure 5, and in Kim & O’Neil, 2007 in Figure 4. Those results and others, including Shackleton and Bemis, 1998, are summarized in Table 4. Where they could be examined, they all exhibit basic systematic measurement error ranging from 0.6 to 2.2 C; i.e., uniformly more than allowed by Lea.
But to respond even more fully to your concerns, I’ve now looked at Kim&O’Neil, 1997 here: http://i41.tinypic.com/r8x99y.jpg and Zhou & Zheng, 2003 here: http://i43.tinypic.com/ionhg7.jpg
Both data sets show unmistakable evidence of systematic measurement error. The Kim & O’Neil, 1997 1-sigma is (+/-)4 C, and the Zhou & Zheng, 2003 1-sigma is an incredible (+/-)180 C.
These analyses now cover all of Lea’s principal citations. None of the calibrations are good to (+/-)0.5 C, and many of them are far poorer.
2) It’s as though, once again, you didn’t read the essay. I used Shackleton’s 1969 precision mass spec experiment to calculate a lower limit of measurement error in his method. That lower limit stands.
Shackleton published in a French journal that I can’t access. We, including you, don’t even know whether he published the LSQ uncertainties. So you could never have seen them to suggest using them.
The LSQ uncertainties will produce an experimental uncertainty similar to the standard deviation of the experimental points. You achieve nothing by that route.
Keigwin’s experiment is independent of Shackleton’s. Propagating Shackleton’s experimental error into Keigwin’s uncertainty is no more than standard propagation of experimental error. You have no point here.
3) All of the experiments I evaluated above, apart from Keigwin’s, are calibrations of the method, what you call “positive controls.” How is it you still don’t know that?
I do analytical work regularly.
You wrote, “When I said that Keigwin should know his uncertainty from EXPERIMENTAL control samples, you dismissed this as an “appeal to authority”.” None of my posts here, to you or to anyone else, contain the phrase “appeal to authority.” kim2000 used that phrase in a reply to Monty.
Looking through your posts, you mentioned Keigwin’s experiment in 8), here. Item 8) in my reply was on point and didn’t mention anything about appeals to authority in any form.
Here, I wrote that “Your entire argument about Keigwin is based in a persistent misconception,” which is true.
Under “2)” in that post, I wrote that you were making an argument from authority when you wrote that, “When they say the temperatures reconstructed for positive controls are typically good to 0.5 degC, there is no point in arguing.” That had to do with Lea, not Keigwin and was indeed an argument from authority.
It appears you’re conflating one exchange with another. Combining disparate conversations produces mistakes, not history.
Science (except lately in the UK) is “nullius in verba,” remember? The point of arguing is reached when an assertion is demonstrated to be untrue. The analyses presented here falsify the assertion of an average (+/-)0.5 C dO-18 proxy uncertainty.
4) You wrote, “You have intermixed discussion of random and systematic error.” No, my discussion has invariably been about systematic error. The non-Gaussian histograms of the experimental residuals justify a conclusion of systematic measurement error. That has been the case throughout.
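For what it’s worth, a minimal sketch of that residual test (synthetic data, not the published calibration points; scipy’s Shapiro-Wilk test is one standard normality check):

import numpy as np
from scipy import stats

# Synthetic calibration data with a deliberate systematic offset injected
# into a subset of points, to mimic a second error mode.
rng = np.random.default_rng(0)
T = np.linspace(5.0, 30.0, 60)
d18O = -0.22 * T + 3.5 + rng.normal(0.0, 0.1, T.size)
d18O[::3] += 0.25                       # the injected systematic component

# Fit the line, then test the residuals for normality.
resid = d18O - np.polyval(np.polyfit(T, d18O, 1), T)
stat, p = stats.shapiro(resid)          # Shapiro-Wilk test
print(f"Shapiro-Wilk p = {p:.3g} (small p -> residuals non-Gaussian)")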
You wrote, “no matter which calibration line you are on, a CHANGE of 0.2%o in isotope ratio always translates to about a 1 degC temperature CHANGE”
Frank, experimenters do not use those calibration lines to obtain temperature changes. They do not calculate a temperature from one part of the line, a second temperature from another part of the line, and then subtract them.
The proxy experiment works like this: an experimenter calculates the dO-18 in, say, a foraminiferal sediment. S/He assumes the standard sea water salinity and dO-18 for that locale has persisted over the intervening time, unless another proxy is used to make a correction.
The dO-18 ratio in the calcareous fossil is ratioed to the standard sea water dO-18 assumed to mirror the sea water of that past time. A calibration curve is chosen — Bemis, 1998, Epstein, 1953, or whichever one is deemed appropriate. A sea surface temperature (SST) is calculated using the standard curve. If it’s a tropical locale, the paleo SST will be somewhere around 30(+/-)1 C, with that (+/-)1 C due to systematic measurement error only. Any errors that covary with salinity are not included, because standard salinity was assumed.
To get a temperature change since that time, that paleo 30 C must be subtracted from the recent SST in that locale. Suppose the local modern SST is obtained from floating buoys, and is 30.5 C. The uncertainty in buoy temperatures has been estimated, for example, by W. J. Emery, et al., (2001) “Accuracy of in situ sea surface temperatures used to calibrate infrared satellite measurements,” JGR 106(C2), 2387-2405. That error is about (+/-)0.5 C.
The difference SST is 30.5-30 = 0.5 C. The uncertainty in that 0.5 C difference is the uncertainties in the proxy paleo-SST and the modern SST in quadrature, and 1-sigma = sqrt(1^2+0.5^2)=1.1 C. So, you’d have to report your temperature difference as 0.5(+/-)1.1 C.
How significant is the difference temperature?
If two paleo-temperatures are subtracted to get a difference, say Keigwin’s data at -1ky and -2ky, the propagation of uncertainty following a subtraction applies. In Keigwin’s case, that would be 1-sigma = sqrt(2×0.75^2)=1.1 C.
The (+/-)0.75 C uncertainty in each temperature does not subtract away because the error is systematic. That 0.75 C is only an average, and the magnitude of the systematic error in each of the two temperatures is not known.
Look again at the replicate measurements in the two jpegs I linked. Notice the replicate points are vertically displaced, even though they represent measurements of the same quantity. Each point has its own magnitude of systematic error. Subtracting two individual points can actually decrease or increase the error in the difference, and we can never know which of those two results occurred in any given case because we don’t know the true answer. The only way out is to be conservative about uncertainty and propagate the average error and report that.
The same logic applies to a difference of averages.
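The two differences worked above, reduced to arithmetic (using the numbers already given):

import numpy as np

# Difference 1: modern buoy SST minus proxy paleo-SST.
sst_paleo, u_paleo = 30.0, 1.0       # degC; proxy systematic measurement error
sst_modern, u_modern = 30.5, 0.5     # degC; buoy error per Emery et al. (2001)
u_diff1 = np.sqrt(u_paleo**2 + u_modern**2)
print(f"modern - paleo: {sst_modern - sst_paleo:.1f} +/- {u_diff1:.1f} degC")

# Difference 2: two of Keigwin's paleo points, each carrying (+/-)0.75 C.
u_keigwin = 0.75
u_diff2 = np.sqrt(2.0 * u_keigwin**2)
print(f"paleo - paleo: +/- {u_diff2:.1f} degC")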
You wrote, “Although you have shown that systematic errors are easy to find BETWEEN sites…”. Nothing I’ve written here has ever dealt with systematic error between sites. Everything I’ve done here has concerned methodological uncertainty due to systematic measurement error within dO-18 calibration experiments.
After all this time, all this conversation, and after my repeated explanations, I don’t understand how such a fundamental misunderstanding is possible.
You wrote, “you have refused to discuss in quantitative terms how much various factors (like salinity, sea water isotope ratio, and light) would need to change OVER TIME to introduce a significant error in Keigwin’s reconstruction.”
First, in item 5) here, I pointed out to you that “The uncertainties I calculated for Keigwin come from his method, not from unknown marine variables,” and provided you with two links to prior posts where I had made the same point. You’ve now made that same error four times running.
I haven’t refused to do anything except be distracted. Apart from discussing what I have in fact done, I have tried to repair your continual misperceptions about what I’ve done.
Second, in paragraph 5 under item 4) in my immediately prior response, I did in fact estimate the effect of salinity, using Keigwin’s own description of the statistical covariance of salinity with dO-18 in recent Sargasso Sea waters. That covariance was ~30%, and introduced a (+/-)7 C uncertainty into his paleotemperature reconstruction.
Finally, you wrote, “…some of my criticisms have survived my review. Whether you are capable of applying Feynman’s quote to your own work remains in doubt as long as you seem to be “stonewalling” constructive criticism.”
Your criticisms have been neither valid nor constructive. They’ve either been wrong outright or based in misreadings or factual misunderstandings. It’s not “stonewalling” to point that out, or to demonstrate your apparently inevitable errors.
You have pushed on with your criticisms no matter that you don’t know mass spectrometry, that you don’t understand the dO-18 proxy method, that you have no understanding of measurement error or its significance or how to propagate error or how such error impacts the significance of a result. Was that wise?
But you’ve tried hard, for which I again give you credit.
Shoot — I just noticed that I mistakenly included the Zhou & Zheng 2003 error histogram in the Kim & O’Neil,1997 analysis plot. Here’s the corrected Kim & O’Neil 1997 analysis Figure: http://i42.tinypic.com/x3thlc.jpg
The fit to the histogram of point scatter shows at least two error modes in the data.
Pat: Thanks for your kind reply. I do appreciate the ad hominem remarks; they make it more rewarding to finally expose the facts. The science underlying the correlation between O18 and temperature is relatively simple: in Figure 4, the linear plot of the natural logarithm of the isotope ratio vs. 1/T comes from applying the Arrhenius equation to kinetic isotope effects, and the slope is the difference in activation energy for the isotopes divided by R. The biological process would probably also show a linear relationship if plotted this way. Analyzing the uncertainty associated with the calibration curves used by this method is more involved. The approach in your post exaggerates uncertainty compared to what follows below.
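Writing that reasoning out (a sketch only, with rate constants k for each isotope, pre-exponentials A, and \(\Delta E_a = E_{18} - E_{16}\); nothing here is taken from the post’s data):

\[
\frac{k_{16}}{k_{18}} \;=\; \frac{A_{16}}{A_{18}}\,\exp\!\left(\frac{\Delta E_a}{RT}\right)
\quad\Longrightarrow\quad
\ln\alpha \;=\; \ln\frac{A_{16}}{A_{18}} \;+\; \frac{\Delta E_a}{R}\cdot\frac{1}{T},
\]

so a plot of ln(alpha) against 1/T is a straight line whose slope is the activation-energy difference over R, as described above.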
I previously suggested that the uncertainty arising from standard curves is normally calculated from the uncertainty in the parameters of the least-squares fit. A presentation showing how this is done for a linear fit can be seen at: http://ull.chemistry.uakron.edu/chemometrics/07_Calibration.pdf
The key section is located on slide 20: “Sensitivity: Smallest CHANGE in amount we can see with a known level of confidence.” We want to know the sensitivity of O18:T dating, which is shown graphically in slide 21. Lea’s figure of 0.5 degC may be the sensitivity derived from Shackleton’s calibration curve.
The Shackleton calibration curve can be used to convert isotope data into temperature data for multiple specimens obtained from a layer of a sediment core that covers some period of time, presumably many decades. The resulting temperature data will have a mean and a sample standard deviation that reflect: a) annual (and possibly decadal) variation in temperature at the site, b) annual (and possibly decadal) variation in salinity, seawater O18, light, and other factors that might perturb the relationship between temperature and O18, and c) experimental variability when O18 is measured. The uncertainty contributed by all of these factors to the standard error of the mean temperature diminishes with the square root of the number of samples analyzed, but the overall uncertainty for the method can never drop below the sensitivity of the calibration. Therefore:
a) If site variability was relatively high and/or Keigwin analyzed a relatively small number of samples from a layer, the standard error of his mean temperature for that period might be greater than the sensitivity of the calibration method. The observed SE should be reported as the uncertainty. Note that this ALREADY includes the experimental variability associated with measuring O18.
b) If site variability was lower and/or Keigwin analyzed a larger number of samples, the standard error of his mean temperature might be less than the sensitivity of the calibration method. Under these circumstances, the sensitivity of the calibration, not the standard error of the mean, should be reported as the uncertainty. (If Keigwin analyzed 100 control samples, the standard error of the mean control isotope ratio would be very low, but no mean isotope ratio can be converted into temperature more accurately than the regression boundaries of the calibration curve permit, i.e., than the sensitivity of the method.)
c) Shackleton’s and Keigwin’s uncertainties are never added in quadrature. The sensitivity calculated from Shackleton’s calibration merely provides a lower limit to the uncertainty that Keigwin can’t overcome by analyzing more samples.
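Rules a) through c), reduced to arithmetic (a sketch; the layer temperatures and the 0.5 degC sensitivity are hypothetical):

import numpy as np

# Hypothetical temperatures converted from one sediment-core layer.
temps = np.array([22.8, 23.4, 23.1, 22.6, 23.0, 23.3])   # degC
sensitivity = 0.5                     # degC; lower limit from the calibration

sem = temps.std(ddof=1) / np.sqrt(temps.size)   # standard error of the mean
reported = max(sem, sensitivity)      # case b): SEM falls below the floor
print(f"SEM = {sem:.2f} degC; reported uncertainty = {reported:.2f} degC")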