Proxy Science and Proxy Pseudo-Science

Guest post by Pat Frank

It’s become very clear that most published proxy thermometry since 1998 [1] is not at all science, most thoroughly so because Steve McIntyre and Ross McKitrick revealed its foundation in ad hoc statistical numerology. A while back, Michael Tobis and I had a conversation here at WUWT about the non-science of proxy paleothermometry, starting with Michael’s comment here and my reply here. Michael quickly appealed to his home authorities at Planet3.org. We all had a lovely conversation that ended with moderator-cum-debater Arthur Smith indulging a false claim of insult to impose censorship (insulting comment in full here for the strong of stomach).

But in any case, two local experts in proxy thermometry came to Michael’s aid: Kaustubh Thimuralai, a grad student in proxy climatology at U. Texas, Austin and Kevin Anchukaitis, a dendroclimatologist at Columbia University. Kaustubh also posted his defense at his own blog here.

Their defenses shared this peculiarity: an exclusive appeal to stable isotope temperature proxies — not word one in defense of tree-ring thermometry, which provides the vast bulk of paleotemperature reconstructions.

The non-science of published paleothermometry was proved by their non-defense of its tree-ring center; an indictment of discretionary silence.

Nor was there one word in defense of the substitution of statistics for physics, a near universal in paleo-thermo.

But their appeal to stable isotope proxythermometry provided an opportunity for examination. So, that’s what I’m offering here: an analysis of stable isotope proxy temperature reconstruction followed by a short tour of dendrothermometry.

Part I. Proxy Science: Stable Isotope Thermometry

The focus is on oxygen-18 (O-18), because that’s the heavy atom proxy overwhelmingly used to reconstruct past temperatures. NASA has a nice overview here. The average global stable isotopic ratios of oxygen are O-16 = 99.757%, O-17 = 0.038%, O-18 = 0.205%. If there were no thermal effects (and no kinetic isotope effects), the oxygen isotopes would be distributed in minerals at exactly their natural ratios. But local thermal effects cause the ratios to depart from the average, and this is the basis for stable isotope thermometry.

Let’s be clear about two things immediately: first, the basic physics and chemistry of thermal isotope fractionation is thorough and fully legitimate. [2-4]

Second, the mass spectrometry (MS) used to determine O-18 is very precise and accurate. In 1950, MS of O-18 already had a reproducibility of 5 parts in 100,000, [3] and presently is 1 part in 100,000. [5] These tiny values are represented as “%o,” where 1 %o = 0.1% = 0.001. So dO-18 MS detection has improved by a factor of 5 since 1950, from (+/-)0.05%o to (+/-)0.01%o.

The O-18/O-16 ratio in sea water has a first-order dependence on the evaporation/condensation cycle of water. H2O-18 has a higher boiling point than H2O-16, and so evaporates and condenses at a higher temperature. Here’s a matter-of-fact Wiki presentation. The partition of O-18 and O-16 due to evaporation/condensation means that the O-18 fraction in surface waters rises and falls with temperature.

There’s no dispute that O-18 mixes into CO2 to produce heavy carbon dioxide – mostly isotopically mixed as C(O-16)(O-18).

Dissolved CO2 is in equilibrium with carbonic acid. Here’s a run-down on the aqueous chemistry of CO2 and calcium carbonate.

Dissolved light-isotope CO2 [as C(O-16)(O-16)] becomes heavy CO2 by exchanging an oxygen with heavy water, like this:

C(O-16)2 + H2(O-18) => C(O-16)(O-18) + H2(O-16)

This heavy CO2 finds its way into the carbonate shells of mollusks, and the skeletons of foraminifera and corals in proportion to its ratio in the local waters (except when biology intervenes. See below).

This process is why the field of stable isotope proxy thermometry has focused primarily on O-18 CO2: it is incorporated into the carbonate of mollusk shells, corals, and foraminifera and provides a record of temperatures experienced by the organism.

Even better, fossil mollusk shells, fossil corals, and foraminiferal sediments in sea floor cores promise physically real reconstructions of O-18 paleotemperatures.

Before it can be measured, O-18 CO2 must be liberated from the carbonate matrix of mollusks, corals, or foraminifera. Liberation of CO2 typically involves treating solid CaCO3 with phosphoric acid.

3 CaCO3 + 2 H3PO4 => 3 CO2 + Ca3(PO4)2 + 3 H2O

CO2 is liberated from biological calcium carbonate and piped into a mass spectrometer. Laboratory methods are never perfect. They incur losses and inefficiencies that can affect the precision and accuracy of results. Anyone who’s done wet analytical work knows about these hazards and has struggled with them. The practical reliability of dO-18 proxy temperatures depends on the integrity of the laboratory methods to prepare and measure the intrinsic O-18.

The paleothermometric approach is to first determine a standard relationship between water temperature and the ratio of O-18/O-16 in precipitated calcium carbonate. One measures how the O-18 in the water fractionates into solid carbonate over a range of typical SSTs, such as 10 C through 40 C. A plot of carbonate O-18 vs. temperature is prepared.

Once this standard plot is in hand, the temperature is regressed against the carbonate dO-18. The result is a least-squares fitted equation that tells you the empirical relationship of T:dO-18 over that temperature range.

This empirical equation can then be used to reconstruct the water temperature whenever carbonate O-18 is known. That’s the principle.
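The calibrate-then-invert principle can be sketched in a few lines of code. This is a toy illustration with made-up calibration numbers; it is not data from any of the papers discussed below:

```python
# Sketch of the calibration principle: fit T against carbonate d18O,
# then invert the fit to reconstruct temperature. Illustrative numbers only.
import numpy as np

# Hypothetical calibration experiment: carbonate precipitated at known
# temperatures, with the measured d18O (per mil) of each precipitate.
T_cal = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0])   # deg C
d18O_cal = np.array([1.9, 0.8, -0.3, -1.4, -2.5, -3.6, -4.7])  # per mil

# Least-squares fit: T as a linear function of d18O.
slope, intercept = np.polyfit(d18O_cal, T_cal, 1)

def reconstruct_T(d18O):
    """Apply the fitted calibration to a measured carbonate d18O."""
    return slope * d18O + intercept

print(round(float(reconstruct_T(0.0)), 1))
```

Any measured carbonate d18O can then be fed through `reconstruct_T` to yield a nominal water temperature; the rest of this essay is about the error that rides along with that number.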

The question I’m interested in is whether the complete physico-chemical method yields accurate temperatures. Those who’ve read my paper pdf on neglected systematic error in the surface air temperature record will recognize the ‘why’ of focusing on measurement error. It’s the first and minimum error entering any empirically determined magnitude. That makes it the first and basic question about error limits in O-18 carbonate proxy temperatures.

So, how does the method work in practice?

Let’s start with the classic: J. M. McCrea (1950) “On the Isotopic Chemistry of Carbonates and a Paleotemperature Scale”[3], which was part of McCrea’s Ph.D. work.

McCrea’s work is presented in some detail to show the approach I took to evaluate error. After that, I promise more brevity. Nothing below is meant to be, or should be taken to be, criticism of McCrea’s absolutely excellent work — or criticism of any of the other O-18 authors and papers to follow.

McCrea did truly heroic, pioneering experimental work establishing the O-18 proxy temperature method. Here’s his hand-drawn picture of the custom glass apparatus used to produce CO2 from carbonate. I’ve annotated it to identify some bits:

Figure 1: J. McCrea’s CO2 preparative glass manifold for O-18 analysis.

I’ve worked with similar glass gas/vacuum systems with lapped-in ground-glass joints, and the opportunity for leak, crack, or crash-tastrophe is ever-present.

McCrea developed the method by precipitating O-18 carbonate at different temperatures from marine waters obtained off East Orleans, MA, on the Atlantic side of Cape Cod, and off Palm Beach, Florida. The O-18 carbonate was then chemically decomposed to release the O-18 CO2, which was analyzed in a double-focusing mass spectrometer that he apparently custom-built himself.

The blue and red lines in the Figure below show his results (Table X and Figure 5 in his paper). The %o O-18 is the divergence of his experimental samples from his standard water.

Figure 2, McCrea, 1950, original caption (color-modified): “Variation of isotopic composition of CaCO3(s) with reciprocal of deposition temperature from H2O (Cape Cod series (red); Florida water series (blue)).” The vertical lines interpolate temperatures at %o O-18 = 0.0. Bottom: Color-coded experimental point scatter around a zero line (dashed purple).

The lines are linear least-squares (LSQ) fits, and they reproduce McCrea’s almost exactly (T is in Kelvin):

Florida: McCrea: d18O=1.57 x (10^4/T)-54.2;

LSQ: d18O=1.57 x (10^4/T)-53.9; r^2=0.994.

Cape Cod: McCrea: d18O=1.64 x (10^4/T)-57.6;

LSQ: d18O=1.64 x (10^4/T)-57.4; r^2=0.995.

About his results, McCrea wrote this: “The respective salinities of 36.7 and 32.2%o make it not surprising that there is a difference in the oxygen composition of the calcium carbonate obtained from the two waters at the same temperature.” (bold added)

The boiling temperature of water increases with the amount of dissolved salt, which in turn affects the relative rates at which H2O-16 and H2O-18 evaporate away. Marine salinity can also change from the influx of fresh water (precipitation, riverine inflow, or direct runoff), from upwelling, from wave-mixing, and from currents. The O-16/O-18 ratio of fresh water, of upwelling water, or of distant water transported by currents may differ from the local marine ratio. The result is that marine waters of the same temperature can have different O-18 fractions. Disentangling the effects of temperature and salinity in a marine O-16/O-18 ratio can be difficult to impossible in paleo-reconstructions.

The horizontal green line at %o O-18 = zero intersects the Florida and Cape Cod lines at different temperatures, represented by the vertical drops to the abscissa. These show that the same dO-18 produces a difference of 4 C, depending on which equation one chooses, with the apparent T covarying with the 4.5%o difference in salinity between the two waters.

That means if one generates a paleotemperature by applying a specific dO18:T equation to paleocarbonates, and one does not know the paleosalinity, the derived paleotemperature can be uncertain by as much as (+/-)2 C due to a hidden systematic covariance (salinity).

But I’m interested in experimental error. From those plots one can estimate the point scatter in the physico-chemical method itself as the variation around the fitted LSQ lines. The point scatter is plotted along the purple zero line at the bottom of Figure 2. Converted to temperature, the scatter is (+/-)1 C for the Florida data and (+/-)1.5 C for the Cape Cod data.

All the data were determined by McCrea in the same lab, using the same equipment and the same protocol. Therefore, it’s legitimate to combine the two sets of errors in Figure 2 to determine their average, and the resulting average uncertainty in any derived temperature. The standard deviation of the combined errors is (+/-)0.25%o O-18, which translates into an average temperature uncertainty of (+/-)1.3 C. This emerged under ideal laboratory conditions, where the water temperature was known from direct measurement and the marine O-18 fraction was independently measured.

Next, it’s necessary to know whether the errors are systematic or random. Random errors diminish as 1/sqrtN, where N is the number of repetitions of the analysis. If the errors are random, one can hope for a very precise temperature measurement just by repeating the dO-18 determination enough times. For example, in McCrea’s work, 25 repeats would reduce the average error in any single temperature to 1.3/sqrt(25) = (+/-)0.26 C.
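The 1/sqrtN behavior is easy to verify numerically. A small sketch, using the (+/-)1.3 C figure from above plus a Monte Carlo check (the simulation setup is my own illustration):

```python
import math
import random

sigma_single = 1.3   # deg C, average single-measurement error from above
N = 25               # number of repeat determinations

# Theoretical shrinkage for purely random error:
sigma_mean = sigma_single / math.sqrt(N)
print(round(sigma_mean, 2))   # 0.26

# Monte Carlo check: the spread of N-measurement averages.
random.seed(0)
means = [sum(random.gauss(0.0, sigma_single) for _ in range(N)) / N
         for _ in range(20000)]
empirical = math.sqrt(sum(m * m for m in means) / len(means))
print(round(empirical, 2))    # close to 0.26
```

The point of the check is the premise: the shrinkage only happens when the errors really are independent random draws, which is exactly what the histogram analysis below calls into question.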

To bridge the random/systematic divide, I binned the point scatter over (+/-)3 standard deviations, which spans 99.7% of a normal error distribution. There were no outliers, meaning all the scatter fell within the 99.7% bound. There are only 15 points, which is not a good statistical sample, but we work with what we’ve got. Figure 3 shows the histogram plot of the binned point-scatter, and a Gaussian fit. It’s a little cluttered, but bear with me.

Figure 3: McCrea, 1950 data: (blue points), binned point scatter from Figure 2; red line, two-Gaussian fit to the binned points; dashed green lines, the two fitted Gaussians. Thin purple points and line: separately binned Cape Cod point scatter; thin blue line and points, separately binned Florida point scatter.

The first thing to notice is that the binned points are clearly not normally distributed. This immediately suggests the measurement error is systematic, not random. The two-Gaussian fit is pretty good, but should not be taken as more than a numerical convenience. An independent set of measurement scatter points from a different set of experiments may well require a different set of Gaussians.

The two Gaussians imply at least two modes of experimental error operating simultaneously. The two thin single-experiment lines are spread across the full scatter width. This demonstrates that the point scatter in each data set participates in both error modes simultaneously. But notice that the two data sets do not participate equivalently. This non-equivalence again indicates a systematic measurement error that does not repeat consistently.
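For readers who want to try this diagnosis themselves, here is a sketch of the bin-and-fit procedure on synthetic scatter. The two error modes and all their parameters are invented for illustration; they are not McCrea's values:

```python
# Bin measurement scatter into a histogram and fit a sum of two Gaussians.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
# Two simultaneous error modes (illustrative centers/widths, per mil):
scatter = np.concatenate([rng.normal(-0.2, 0.08, 500),
                          rng.normal(0.25, 0.10, 500)])

counts, edges = np.histogram(scatter, bins=30)
centers = 0.5 * (edges[:-1] + edges[1:])

def two_gauss(x, a1, m1, s1, a2, m2, s2):
    return (a1 * np.exp(-0.5 * ((x - m1) / s1) ** 2)
            + a2 * np.exp(-0.5 * ((x - m2) / s2) ** 2))

# Initial guesses near the visible histogram peaks.
p0 = [counts.max(), -0.2, 0.1, counts.max(), 0.25, 0.1]
popt, _ = curve_fit(two_gauss, centers, counts, p0=p0)
print(round(popt[1], 2), round(popt[4], 2))  # fitted mode centers
```

A single-Gaussian fit to the same histogram fails badly, which is the visual cue that more than one error mode is present.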

The uncertainty from systematic measurement error does not diminish as 1/sqrtN. The error is not a constant offset, and does not subtract away in a difference between data sets. It propagates into a final value as (+/-)sqrt[(sum of N squared errors)/(N-1)].
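As a formula check, the propagation rule just stated is the root of the mean squared error (with N-1 in the denominator). A minimal sketch, with illustrative error values:

```python
# Systematic errors combine as sqrt(sum(e_i^2)/(N-1)); they do not
# average away with repetition the way random errors do.
import math

def propagated_uncertainty(errors):
    n = len(errors)
    return math.sqrt(sum(e * e for e in errors) / (n - 1))

# Illustrative set of per-measurement errors (deg C):
errs = [1.1, -0.9, 1.4, -1.2, 1.0]
print(round(propagated_uncertainty(errs), 2))  # 1.27
```

Note the result is on the order of the individual errors themselves, no matter how many measurements go in.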

The error in any new proxy temperature derived from those methods will probably fall somewhere in the Figure 3 envelope, but the experimenter will not know where. That means the only way to honestly present a result is to report the average systematic error, and that would be T(+/-)1.3 C.

This estimate is conservative, as McCrea noted that, “The average deviation of an individual result from the relation is 0.38%o,” which is equivalent to an average error of (+/-)2 C (I calculate (+/-)1.95 C; McCrea rounded to 2 C). McCrea wrote later, “The average deviation of an individual experimental result from this relation is 2°C in the series of slow precipitations just described.”

The slow precipitation experiments were the tests with Cape Cod and Florida water, shown in Figure 2, and McCrea mentioned their paleothermal significance at the end of his paper: “The isotopic composition of calcium carbonate slowly formed from aqueous solution has been noted to be usually the same as that produced by organisms at the same temperature.”

Anyone using McCrea’s standard equations to reconstruct a dO-18 paleotemperature must include the experimental uncertainties hidden inside them. However, these are invariably neglected. I’ll give an example below.

Another methodological classic is Sang-Tae Kim et al. (2007) “Oxygen isotope fractionation between synthetic aragonite and water: Influence of temperature and Mg2+ concentration“.[6]

Kim, et al., measured the relationship between temperature and dO-18 incorporation in aragonite, a form of calcium carbonate found in mollusk shells and corals (the other common form is calcite). They calibrated the T:dO-18 relationship at five temperatures (0, 5, 10, 25, and 40 C), which covers the entire range of SST. Figure 4a shows their data.

Figure 4: a. Blue points: Aragonite T:dO-18 calibration experimental points from Kim, et al., 2007; purple line: LSQ fit. Below: green points, the unfit residual representing experimental point-scatter, 1-sigma = (+/-)0.21. b. 3-sigma histogram of the experimental unfit residual (points) and the 3-Gaussian fit (purple line). The thin colored lines plus points are separate histograms of the four data sub-sets making up the total.

The alpha in “ln-alpha” is the O-18 “fractionation factor,” which is a ratio of O-18 ratios. That sounds complicated, but it’s just the O-18/O-16 ratio in the carbonate divided by the O-18/O-16 ratio in the water: {[(O-18)c/(O-16)c] / [(O-18)w/(O-16)w]}, where “c” = carbonate and “w” = water.

The LSQ fitted line in Figure 4a is 1000 x ln-alpha = 17.80 x (1000/T)-30.84; R^2 = 0.99, which almost exactly reproduces the published line, 1000 x ln-alpha = 17.88 x (1000/T)-31.14.

The green points along the bottom of Figure 4a are the unfit residual, representing the experimental point scatter. These have a 1-sigma standard deviation = (+/-)0.21, which translates into an experimental uncertainty of (+/-)1 C.

In Figure 4b is a histogram of the unfit residual point scatter in part a, binned across (+/-)3-sigma. The purple line is a three-Gaussian fit to the histogram, but with the point at (-0.58, 3) left out because it destabilized the fit. In any case, the experimental data appear to be contaminated with at least three modes of divergence, again implying systematic error.

Individual data sub-sets are shown as the thin colored lines in Figure 4b. They all spread across at least two of the three experimental divergence modes, but not equivalently. Once again, that means every data set is uniquely contaminated with systematic measurement error.

Kim, et al., reported a smaller analytical error (+/-)0.13, equivalent to an uncertainty in T = (+/-)0.6 C. But their (+/-)0.13 is the analytical precision of the mass spectrometric determination of the O-18 fractions. It’s not the total experimental scatter. Residual point scatter is a better uncertainty metric because the Kim, et al., equation represents a fit to the full experimental data, not just to the O-18 fractions found by the mass spectrometer.

Any researcher using the Kim, et al., 2007 dO-18:T equation to reconstruct a paleotemperature must propagate at least (+/-)0.6 C of uncertainty into their result, and preferably (+/-)1 C.
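The Kim, et al., equation quoted above is easy to use in the forward direction. A small sketch computing the fractionation factor alpha from their published fit; the choice of 25 C as the example temperature is mine:

```python
# Forward use of the published Kim et al. (2007) aragonite calibration:
# given water temperature, predict 1000*ln(alpha) and hence alpha.
import math

def thousand_ln_alpha(T_kelvin):
    """Published fit: 1000*ln(alpha) = 17.88*(1000/T) - 31.14."""
    return 17.88 * (1000.0 / T_kelvin) - 31.14

T = 25.0 + 273.15          # 25 C expressed in kelvin
alpha = math.exp(thousand_ln_alpha(T) / 1000.0)
print(round(alpha, 5))     # about 1.029
```

In a reconstruction the equation is run the other way: a measured alpha is solved for T, and the (+/-)1 C scatter discussed above rides along with that solution.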

I’ve done similar analyses of the experimental point-scatter in several studies used to calibrate the T:O-18 temperature scale. Here’s a summary of the results:

Study        (+/-)1-sigma     n     syst err?    Ref.
McCrea          1.3 C        15        Y         [3]
O'Neil*          29 C        11        ?         [7]
Epstein         0.76 C       25        ?         [8]
Bemis           1.7 C        14        Y         [9]
Kim             1.0 C        70        Y         [6]
Li              2.2 C         5                  [10]
Friedman        1.1 C         6                  [11]

*O'Neil's was a 0-500 C experiment.

All the Summary uncertainties represent only measurement point scatter, which often behaved as systematic error. The O’Neil 1969 point scatter was indeterminate, and the Epstein question mark is discussed below.

Epstein, et al., (1953), chose to fit their T:dO-18 calibration data with a second-order polynomial rather than with a least squares straight line. Figure 5 shows their data with the polynomial fit, and for comparison a LSQ straight line fit.

Figure 5: Epstein, 1953 data fit with a second-order polynomial (R^2 = 0.996; sigma residual = (+/-)0.76 C) and with a least squares line (R^2 = 0.992; sigma residual = (+/-)0.80 C). Insets: histograms of the point scatter plus Gaussian fits; upper right, polynomial; lower left, linear.

The scatter around the polynomial was pretty Gaussian, but left a >3-sigma outlier at 2.7 C. The LSQ fit did almost as well, and put the polynomial 3-sigma outlier within the 3-sigma confidence limit. The histogram of the linear fit scatter required two Gaussians, and left an unfit point at 2.5-sigma (-2 C).

Epstein had no good statistical reason to choose the polynomial fit over the linear fit, and didn’t mention his rationale. The poly fit came closer to the high-temperature end-point at 30 C, but the linear fit came closer to the low-T end-point at 7 C, and was just as good through the internal data points. So, the higher-order fit may have been an attempt to save the point at 30 C.

Before presenting an application of these lessons, I’d like to show a review paper, which compares all the different dO-18:T calibration equations in current use: B. E. Bemis, H. J. Spero, J. Bijma, and D. W. Lea, Reevaluation of the oxygen isotopic composition of planktonic foraminifera: Experimental results and revised paleotemperature equations. [9]

This paper is particularly valuable because it reviews the earlier equations used to model the T:dO18 relationship.

Figure 6 below reproduces an annotated Figure 2 from Bemis, et al. It compares several T:dO-18 calibration equations from a variety of laboratories. They have similar slopes but are offset. The result is that a given dO-18 predicts a different temperature, depending on which calibration equation one chooses. The Figure is annotated with a couple of very revealing drop lines.

Figure 6: Original caption: “Comparison of temperature predictions using new O. universa and G. bulloides temperature:dO-18 relationships and published paleotemperature equations. Several published equations are identified for reference. Equations presented in this study predict lower temperatures than most other equations. Temperatures were calculated using the VSMOW to VPDB corrections listed in Table 1 for dO-18w values.”

The green drop lines show that a single temperature associates with dO-18 values ranging across 0.4 %o. That’s about 10-40x larger than the precision of a mass spectrometer dO-18 measurement. Alternatively, the horizontal red extensions show that a single dO-18 measurement predicts temperatures across a ~1.8 C range, representing an uncertainty of (+/-)0.9 C in choice of standards.

The 1.8 C excludes the three lines, labeled 11-Ch, 12-Ch, and 13-Ch. These refer to G. bulloides with 11-, 12-, and 13-chambered shells. Including them, the spread of temperatures at a single dO-18 is ~3.7 C (dashed red line).

In G. bulloides, the number of shell chambers increases with age. Specific gravity increases with the number of chambers, causing G. bulloides to sink into deeper waters. Later chambers sample different waters than the earlier ones, and incorporate the O-18 ratio at depth. The three different lines show that the vertical change in dO-18 is significant, and imply a false spread in T of about 0.5 C.

Here’s what Bemis, et al., say about it (p. 150): “Although most of these temperature:d18O relationships appear to be similar, temperature reconstructions can differ by as much as 2 C when ambient temperature varies from 15 to 25 C.”

That “2 C” reveals a higher level of systematic error that appears as variations among the different temperature reconstruction equations. This error should be included as part of the reported uncertainty whenever any one of these standard lines is used to determine a paleotemperature.

Some of the variations in standard lines are also due to confounding factors such as salinity and the activity of photosynthetic foraminiferal symbionts.

Bemis, et al., discuss this problem on page 152: “Non-equilibrium d18O values in planktonic foraminifera have never been adequately explained. Recently, laboratory experiments with live foraminifera have demonstrated that the photosynthetic activity of algal symbionts and the carbonate ion concentration ([CO32-]) of seawater also affect shell d18O values. In these cases an increase in symbiont photosynthetic activity or [CO32-] results in a decrease in shell d18O values. Given the inconsistent SST reconstructions obtained using existing paleotemperature equations and the recently identified parameters controlling shell d18O values, there is a clear need to reexamine the temperature:d18O relationships for planktonic foraminifera.”

Bemis, et al., are thoughtful and modest in this way throughout their paper. They present a candid review of the literature. They discuss the strengths and pitfalls in the field, and describe where more work needs to be done. In other words, they are doing honest science. The contrast could not be more stark between their approach and the pastiche of million dollar claims and statistical maneuvering that swamp AGW-driven paleothermometry.

When the inter-methodological ~(+/-)0.9 C spread of standard T:dO-18 equations is combined in quadrature with the (+/-)1.34 C average measurement error from the Summary Table, the combined 1-sigma uncertainty in a dO-18 temperature = (+/-)sqrt(1.34^2+0.9^2) = (+/-)1.6 C. That doesn’t include any further invisible environmental effects that might confound a paleo-O18 ratio, such as shifts in monsoon patterns, in salinity, or in upwelling.

A (+/-)1.6 C uncertainty is already 2x larger than the commonly accepted 0.8 C of 20th century warming. T:dO-18 proxies are entirely unable to determine whether recent climate change is in any way historically or paleontologically unusual.

Now let’s look at Keigwin’s justly famous Sargasso Sea dO-18 proxy temperature reconstruction: (1996) “The Little Ice Age and Medieval Warm Period in the Sargasso Sea.” [12] The reconstructed Sargasso Sea paleotemperature rests on G. ruber calcite. G. ruber has photosynthetic symbionts, which induce the T:dO-18 artifacts mentioned by Bemis, et al.

Keigwin is a good scientist and attempted to account for this by applying an average G. ruber correction. But removal of an average bias is effective only when the error envelope is random around a constant offset. Subtracting the average bias of a systematic error does not reduce the uncertainty width, and may even increase the total error if the systematic bias in your data set differs from the average bias. Keigwin also assumed an average salinity of 36.5%o throughout, which may or may not be valid.

More to the point, no error bars appear on the reconstruction. Keigwin reported changes in paleotemperature of 1 C or 1.5 C, implying a temperature resolution with smaller errors than these values.

Keigwin used the T:dO-18 equation published by Shackleton in 1974,[13] to turn his Sargasso G. ruber dO-18 measurements into paleotemperatures. Unfortunately, Shackleton published his equation in the International Colloquium Journal of the French C.N.R.S., and neither I nor my French contact (thank-you Elodie) have been able to get that paper. Without it, one can’t directly evaluate the measurement point scatter.

However, in 1965, Shackleton published a paper demonstrating his methodology for obtaining high-precision dO-18 measurements. [14] Shackleton’s high-precision scatter should be the minimum scatter in his 1974 T:dO-18 equation.

Shackleton, 1965 made five replicate measurements of the dO-18 in five separate samples of a single piece of Italian marble (marble is calcium carbonate). Here’s his Table of results:

Reaction No.     1      2      3      4      5     Mean      Std dev.
dO-18 value     4.1    4.45   4.35   4.2    4.2   4.26%o    0.12%o

Shackleton mistakenly reported the root-mean-square of the point scatter instead of the standard deviation. No big deal, the true 1-sigma = (+/-)0.14%o; not very different.
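Shackleton's table is small enough to recheck directly. This sketch recomputes the mean, the rms deviation (the quantity his table reports as the standard deviation), and the true sample standard deviation:

```python
# Recomputing the replicate statistics from Shackleton's 1965 table.
import math

d18O = [4.1, 4.45, 4.35, 4.2, 4.2]   # per mil, five replicates
mean = sum(d18O) / len(d18O)

sq = [(x - mean) ** 2 for x in d18O]
rms = math.sqrt(sum(sq) / len(d18O))        # what the table reports
sd = math.sqrt(sum(sq) / (len(d18O) - 1))   # true sample 1-sigma

print(round(mean, 2), round(rms, 2), round(sd, 2))  # 4.26 0.12 0.14
```

The output reproduces the table's 4.26%o and 0.12%o, and confirms that the proper 1-sigma is (+/-)0.14%o.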

In Shackleton’s 1965 words, “The major reason for discrepancy between successive measurements lies in the difficulty of preparing and handling the gas.” That is, the measurement scatter is due to the inevitable systematic laboratory error we’ve already seen above.

Shackleton’s 1974 standard T:dO-18 equation appears in Barrera, et al., [15] and it’s T = 16.9 – 4.38(dO-18) + 0.10(dO-18)^2. Plugging Shackleton’s high-precision 1-sigma=0.14%o into his equation yields an estimated minimum uncertainty of (+/-)0.61 C in any dO-18 temperature calculated using the Shackleton T:dO-18 equation.

At the ftp site where Keigwin’s data are located, one reads “Data precision: ~1% for carbonate; ~0.1 permil for d18-O.” So, Keigwin’s independent dO-18 measurements were good to about (+/-)0.1%o.

The uncertainty in temperature represented by Keigwin’s (+/-)0.1%o spread in measured dO-18 equates to (+/-)0.44 C in Shackleton’s equation.

The total measurement uncertainty in Keigwin’s dO-18 proxy temperature is the quadratic sum of the uncertainty in Shackleton’s equation plus the uncertainty in Keigwin’s own dO-18 measurements. That’s (+/-)sqrt[(0.61)^2+(0.44)^2]=(+/-)0.75 C. This represents measurement error, and is the 1-sigma minimum of error.
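The error chain just assembled can be written out explicitly: propagate each d18O uncertainty through Shackleton's equation via its derivative, then combine in quadrature. Evaluating the derivative near d18O = 0 (i.e., near the standard) is my own simplifying assumption; the input numbers are those quoted above:

```python
# Propagate d18O uncertainties through Shackleton's 1974 equation and
# combine them in quadrature to get the minimum measurement uncertainty.
import math

def shackleton_T(d):
    """T = 16.9 - 4.38*d18O + 0.10*d18O^2 (Shackleton, 1974)."""
    return 16.9 - 4.38 * d + 0.10 * d * d

def dT_dd(d):
    return -4.38 + 0.20 * d           # derivative of the equation above

d0 = 0.0                              # assumption: evaluate near the standard
u_equation = abs(dT_dd(d0)) * 0.14    # Shackleton's (+/-)0.14 per mil
u_keigwin  = abs(dT_dd(d0)) * 0.10    # Keigwin's (+/-)0.1 per mil precision
u_total = math.sqrt(u_equation**2 + u_keigwin**2)
print(round(u_equation, 2), round(u_keigwin, 2), round(u_total, 2))
```

The printed values reproduce the (+/-)0.61 C, (+/-)0.44 C, and (+/-)0.75 C figures in the text.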

And so now we get to see something possibly never before seen anywhere: a proxy paleotemperature series with true, physically real, 95% confidence level 2-sigma systematic error bars. Here it is:

Figure 7: Keigwin’s Sargasso Sea dO-18 proxy paleotemperature series, [12] showing 2-sigma systematic measurement error bars. The blue rectangle is the 95% confidence interval centered on the mean temperature of 23.0 C.

Let’s be clear on what Keigwin accomplished. He reconstructed 3175 years of nominal Sargasso Sea dO-18 SSTs with a precision of (+/-)1.5 C at the 95% confidence level. That’s an uncertainty of 6.5% about the mean, and is a darn good result. I’ve worked hard in the lab to get spectroscopic titrations to that level of accuracy. Hats off to Keigwin.

But it’s clear that changes in SSTs on the order of 1-1.5 C can’t be resolved in those data. The most that can be said is that it’s possible Sargasso Sea SSTs were higher 3000 years ago.

If we factor in the uncertainty due to the (+/-)0.9 C variation among all the various T:dO-18 standard equations (Figure 6), then the Sargasso Sea 95% confidence interval expands to (+/-)2.75 C.

This (+/-)2.75 C = (uncertainty in experimenter d-O18 measurements) + (uncertainty in any given standard T:dO-18 equation) + (methodological uncertainty across all T:dO-18 equations).

So, (+/-)2.75 C is probably a good estimate of the methodological 95% confidence interval in any determination of a dO-18 paleotemperature. The confounding artifacts of paleo-variations in salinity, photosynthesis, upwelling, and meteoric water will bring into any dO-18 reconstruction further errors, invisible but perhaps of analogous magnitude.

At the end, it’s true that the T:dO18 relationship is soundly based in physics. However, it is not true that the relationship has produced a reliably high-resolution proxy for paleotemperatures.

Part II: Pseudo-Science: Statistical Thermometry

Now on to the typical published proxy paleotemperature reconstructions. I’ve gone through a representative set of eight high-status studies, looking for evidence of science. Evidence of science is whether any of them make use of physical theory.

Executive summary: none of them are physically valid. Not one of them yields a temperature.

Before proceeding, a necessary word about correlation and causation. Here’s what Michael Tobis wrote about that: “If two signals are correlated, then each signal contains information about the other. Claiming otherwise is just silly.”

There’s a lot of that going around in proxythermometry, and clarification is a must. John Aldrich has a fine paper [16] describing the battle between Karl Pearson and G. Udny Yule over correlation indicating causation. Pearson believed it, Yule did not.

On page 373, Aldrich makes a very relevant distinction: “Statistical inference deals with inference from sample to population while scientific inference deals with the interpretation of the population in terms of a theoretical structure.”

That is, statistics is about the relations among numbers. Science is about deductions from a falsifiable theory.

We’ll see that the proxy studies below improperly mix these categories. They convert true statistics into false science.

To spice up the point, here are some fine examples of spurious correlations, and here are the winners of the 1998 Purdue University spurious correlations contest, including correlations between ice cream sales and death-by-drowning, and between ministers’ salaries and the price of vodka. Pace Michael Tobis, each of those correlated “signals” so obviously contains information about the other, and I hope that irony lays the issue to rest.

Diaz and Osuna [17] point out that distinguishing, “between alchemy and science … is (1) the specification of rigorously tested models, which (2) adequately describe the available data, (3) encompass previous findings, and (4) are derived from well-based theories. (my numbers, my bold)”

The causal significance of any correlation is revealed only within the deductive context of a falsifiable theory that predicts the correlation. Statistics (inductive inference) never, ever, of itself reveals causation.

AGW paleo proxythermometry will be shown to be missing Diaz and Osuna elements 1, 3, and 4 of science. That makes it alchemy, otherwise known as pseudoscience.

That said, here we go: AGW proxythermometry:

1. Thomas J. Crowley and Thomas S. Lowery (2000) "How Warm Was the Medieval Warm Period?" [18]

They used fifteen series: three dO-18 (Keigwin’s Sargasso Sea proxy, GISP 2, and the Dunde Ice cap series), eight tree-ring series, the Central England temperature (CET) record, an Iceland temperature (IT) series, and two plant-growth proxies (China phenology and Michigan pollen).

All fifteen series were scaled to vary between 0 and 1, and then averaged. The physical meaning of the five physically valid series (3 x dO18, IT, and CET) was entirely neglected; all of them were scaled to the same physically meaningless unitary bound.

Think about what this means: Crowley and Lowery took five physically meaningful series, and discarded the physics. That made the series fit for use in AGW-related proxythermometry.

There is no physical theory that converts tree ring metrics into temperatures. No such theory exists, and the exact relationship remains entirely obscure.

So then how did Crowley and Lowery convert their unitized proxy average into temperature? Well, “The two composites were scaled to agree with the Jones et al. instrumental record for the Northern Hemisphere…,” and that settles the matter.

In short, the fifteen series were numerically adjusted to a common scale, averaged, and scaled up to the measurement record. Then C&L reported their temperatures to a resolution of (+/-)0.05 C. Measurement uncertainty in the physically real series was ignored in their final composite. That’s how you do science, AGW proxythermometry style.
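For concreteness, here is a toy sketch, with invented numbers, of the unitize-average-rescale procedure as described above. Note that the degrees C attached to the physically meaningful series vanish at step one and are re-imposed by fiat at step three:

```python
import statistics

def unitize(series):
    """Rescale a series to run from 0 to 1, discarding its units."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

def scale_to_record(composite, instrumental):
    """Linearly map the composite onto the mean and spread of the instrumental record."""
    mc, sc = statistics.fmean(composite), statistics.stdev(composite)
    mi, si = statistics.fmean(instrumental), statistics.stdev(instrumental)
    return [(v - mc) / sc * si + mi for v in composite]

# Toy stand-ins: one series in degrees C, one a raw tree-ring width index.
temps_c    = [14.1, 14.3, 13.9, 14.6, 14.8]   # physically meaningful
ring_width = [0.80, 1.10, 0.95, 1.40, 1.20]   # physical meaning unknown

# Step 1: both become unitless [0, 1] -- the degrees C are gone.
u1, u2 = unitize(temps_c), unitize(ring_width)

# Step 2: average the unitless series.
composite = [(a + b) / 2 for a, b in zip(u1, u2)]

# Step 3: scale the average back up to an instrumental record.
reconstruction = scale_to_record(composite, temps_c)
print(reconstruction)
```

The output carries temperature units again, but those units were imposed by the scaling, not derived from any physical relation between ring width and temperature.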

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physical content: none.

Physical validity: none.

Temperature meaning of the final composite: none.

2. Timothy J. Osborn and Keith R. Briffa (2006) "The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years." [19]

Fourteen proxies — eleven of them tree rings, one dO-18 ice core (W. Greenland) — were divided by their respective standard deviation to produce a common unit magnitude, and then scaled into the measurement record. The ice core dO-18 had its physical meaning removed and its experimental uncertainty ignored.

Interestingly, between 1975 and 2000 the composite proxy declined away from the instrumental record. Osborn and Briffa didn't hide the decline, to their everlasting credit, but instead wrote that this disconfirmation is due to "the expected consequences of noise in the proxy records."

I estimated the “noise” by comparing its offset with respect to the temperature record, and it’s worth about 0.5 C. It didn’t appear as an uncertainty on their plot. In fact, they artificially matched the 1856-1995 means of the proxy series and the surface air temperature record, making the proxy look like temperature. The 0.5 C “noise” divergence got suppressed and looks much smaller than it really is. Actual 0.5 C “noise” error bars scaled onto the temperature record of their final Figure 3 would have made the whole enterprise theatrically useless, no matter that it is bereft of science in any case.
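The effect of matching means is easy to demonstrate with a toy example (invented numbers): shifting a diverging proxy so that its full-period mean equals the record's mean makes the two series agree on average, while the divergence is merely redistributed, not removed:

```python
import statistics

# Toy series: a "temperature record" and a "proxy" that tracks it early
# but diverges downward over the last segment (all values invented).
record = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7]
proxy  = [0.0, 0.1, 0.2, 0.2, 0.1, 0.2]   # diverges late

# Standard practice: shift the proxy so the two full-period means match.
shift = statistics.fmean(record) - statistics.fmean(proxy)
matched = [v + shift for v in proxy]

# The mean offset is now zero by construction...
print(f"mean offset after matching: "
      f"{statistics.fmean(record) - statistics.fmean(matched):.2f}")
# ...but the late divergence is only redistributed, not removed:
print(f"final-point divergence: {record[-1] - matched[-1]:.2f} C")
```

Overlaying the shifted proxy on the record makes it "look like temperature" while a substantial late offset survives underneath.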

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physical uncertainty in T: none.

Physical validity: none.

Temperature meaning of the composite: none.

3. Michael E. Mann, Zhihua Zhang, Malcolm K. Hughes, Raymond S. Bradley, Sonya K. Miller, Scott Rutherford, and Fenbiao Ni (2008) “Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia.” [20]

This study used a large number of proxies of multiple lengths and provenances. They included some ice core, speleothem, and coral dO-18, but the data are vastly dominated by tree ring series. Mann & co. statistically correlated the series with local temperature during a "calibration period," adjusted them to equal standard deviation, scaled them into the instrumental record, and published the composite showing a resolution of 0.1 C (Figure 3). Their method again removed and discarded the physical meaning of the dO-18 proxies.

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physical uncertainty in T: none.

Physical validity: none.

Temperature meaning of the composite: none.

4. Rosanne D'Arrigo, Rob Wilson, Gordon Jacoby (2006) "On the long-term context for late twentieth century warming." [21]

Tree ring series from 66 sites, variance adjusted, scaled into the instrumental record and published with a resolution of 0.2 C (Figure 5 C).

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physically valid temperature uncertainties: no

Physical meaning of the 0.2 C divisions: none.

Physical meaning of tree-ring temperatures: none available.

Temperature meaning of the composite: none.

5. Anders Moberg, Dmitry M. Sonechkin, Karin Holmgren, Nina M. Datsenko and Wibjörn Karlén (2005) "Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data." [22]

Eighteen proxies: two foraminiferal dO-18 SST series (Sargasso and Caribbean Seas), one stalagmite dO-18 (Soylegrotta, Norway), and seven tree ring series, plus other composites.

The proxies were processed using an excitingly novel wavelet transform method (it must be better), combined, variance adjusted, intensity scaled to the instrumental record over the calibration period, and published with a resolution of 0.2 C (Figure 2 D). Following standard practice, the authors extracted the physical meaning of the dO-18 proxies and then discarded it.

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physical uncertainties propagated from the dO18 proxies into the final composite? No.

Physical meaning of the 0.2 C divisions: none.

Temperature meaning of the composite: none.

6. B.H. Luckman, K.R. Briffa, P.D. Jones and F.H. Schweingruber (1997) “Tree-ring based reconstruction of summer temperatures at the Columbia Icefield, Alberta, Canada, AD 1073-1983.” [23]

Sixty-three regional tree ring series, plus 38 fossil-wood series; used the standard statistical (not physical) calibration-verification procedure to convert tree rings to temperature, overlaid the composite and the instrumental record at their 1961-1990 mean, and published the result at 0.5 C resolution (Figure 8). But in the text they reported anomalies to (+/-)0.01 C resolution (e.g., Tables 3 & 4), and the mean anomalies to (+/-)0.001 C. That last is a claimed accuracy 10x greater than the typical rating of a two-point calibrated platinum resistance thermometer in a modern aspirated shield under controlled laboratory conditions.

Any physical theory employed?: No

Strictly statistical inference?: Yes

Physical meaning of the proxies: none.

Temperature meaning of the composite: none.

7. Michael E. Mann, Scott Rutherford, Eugene Wahl, and Caspar Ammann (2005) “Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate.” [24]

This study is, in part, a methodological review, by the premier practitioners in the field, of the recommended ways to produce a proxy paleotemperature reconstruction:

Method 1, the composite-plus-scale (CPS) method: "a dozen proxy series, each of which is assumed to represent a linear combination of local temperature variations and an additive "noise" component, are composited (typically at decadal resolution;…) and scaled against an instrumental hemispheric mean temperature series during an overlapping "calibration" interval to form a hemispheric reconstruction. (my bold)"

Method 2, Climate field reconstruction (CFR): “Our implementation of the CFR approach makes use of the regularized expectation maximization (RegEM) method of Schneider (2001), which has been applied to CFR in several recent studies. The method is similar to principal component analysis (PCA)-based approaches but employs an iterative estimate of data covariances to make more complete use of the available information . As in Rutherford et al. (2005), we tested (i) straight application of RegEM, (ii) a “hybrid frequency-domain calibration” approach that employs separate calibrations of high (shorter than 20-yr period) and low frequency (longer than 20-yr period) components of the annual mean data that are subsequently composited to form a single reconstruction, and (iii) a “stepwise” version of RegEM in which the reconstruction itself is increasingly used in calibrating successively older segments. (my bold)”

Restating the obvious: CPS: Assumed representative of temperature; statistical scaling into the instrumental record; methodological correlation = causation. Physical validity: none. Scientific content: none.

CFR: Principal component analysis (PCA): a numerical method devoid of intrinsic physical meaning. Principal components are numerically, not physically, orthogonal. Numerical PCs are typically composites of multiple decomposed (i.e., partial) physical signals of unknown magnitude. They have no particular physical meaning. Quantitative physical meaning cannot be assigned to PCs by reference to subjective judgments of ‘temperature dependence.’
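The point that principal components are numerical rather than physical objects can be illustrated directly: run PCA on pure noise and you still get orthogonal, variance-ordered components. A sketch (Python with NumPy; the data are random by construction, so no component can carry physical meaning):

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 "proxy" series of pure noise, 200 time steps each: no physics at all.
X = rng.standard_normal((200, 20))
Xc = X - X.mean(axis=0)   # center each series

# PCA via singular value decomposition: a purely numerical operation.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)

# A leading "PC" emerges with the largest share of variance, and the PCs
# are numerically orthogonal -- yet the input was noise.
pc1 = Xc @ Vt[0]
print(f"PC1 'explains' {explained[0]:.1%} of the variance of pure noise")
```

The decomposition always succeeds and always orders the components by variance; nothing in the arithmetic distinguishes a temperature signal from noise.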

Scaling the PCs into the temperature record? Correlation = causation.

Correlation = causation is perhaps the most naive error in science. Mann et al. unashamedly reveal it as undergirding the entire field of tree ring proxy thermometry.

Scientific content of the Mann-Rutherford-Wahl-Ammann proxy method: zero.

Finally, an honorable mention:

8. Rob Wilson, Alexander Tudhope, Philip Brohan, Keith Briffa, Timothy Osborn, and Simon Tett (2006), "Two-hundred-fifty years of reconstructed and modeled tropical temperatures." [25]

Wilson et al. reconstructed 250 years of SSTs using only coral records, including dO-18, strontium/calcium, uranium/calcium, and barium/calcium ratios. I've not assessed the latter three in any detail, but inspection of their point scatter is enough to imply that none of them will yield more accurate temperatures than dO-18.

However, all of the Wilson et al. temperature proxies had real physical meaning. What a great opportunity to challenge the method: to discuss the impacts of salinity and biological disequilibrium, how to account for them, and all the other central elements of stable isotope marine temperatures.

So what did they do? Starting with about 60 proxy series, they threw out all those that didn't correlate with local gridded temperatures. That left 16 proxies, 15 of which were dO-18. Why didn't the other proxies correlate with temperature? Rob Wilson & co. were silent on the matter. After tossing two more proxies to avoid the problem of filtering away high frequencies, they ended up with 14 coral SST proxies.

After that, they employed standard statistical processing: divide by the standard deviation, average the proxies together (they used the “nesting procedure,” which adjusts for individual proxy length), and scale up to the instrumental record.
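The screening step deserves emphasis, because selecting only the series that correlate with the calibration temperature guarantees agreement over the calibration window, even for pure noise. A sketch (Python with NumPy; the 0.15 threshold and all counts are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_years, n_cal = 60, 250, 50

# "Temperature" over the calibration window: a simple warming trend.
cal_temp = np.linspace(0.0, 1.0, n_cal)

# 60 candidate "proxies" of pure noise -- none has any temperature signal.
proxies = rng.standard_normal((n_series, n_years))

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Screening step: keep only series that happen to correlate with the
# calibration temperature (threshold chosen for illustration only).
kept = [p for p in proxies if corr(p[-n_cal:], cal_temp) > 0.15]

# The composite of the survivors trends with cal_temp over the calibration
# window by construction, while its pre-calibration portion averages
# toward zero -- a calibration-period match manufactured from noise.
composite = np.mean(kept, axis=0)
print(f"kept {len(kept)} of {n_series} noise series")
```

Screening on correlation, then averaging, is a selection effect: some noise series will always pass, and their average is forced to resemble the screening target.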

The honorable mention for these folks derives from the fact that they used only physically real proxies, and then discarded the physical meaning of all of them.

That puts them ahead of the other seven exemplars, who included proxies that had no known physical meaning at all.

Nevertheless,

Any physical theory employed?: No

Strictly statistical inference?: Yes

Any physically valid methodology? No.

Physical meaning of the proxies: present and accounted for, and then discarded.

Temperature meaning of the composite: none.

Summary Statement: AGW-related paleo proxythermometry as ubiquitously practiced consists of composites that rely entirely on statistical inference and numerical scaling. Not only do the composites have no scientific content; the methodology actively discards scientific content.

Statistical methods: 100%.

Physical methods: nearly zero (stable isotopes excepted, but their physical meaning is invariably discarded in composite paleoproxies).

Temperature meaning of the numerically scaled composites: zero.

These studies are typical, and representative of the entire field of AGW-related proxy thermometry. As commonly practiced, it is a scientific charade: pseudo-science through and through.

Stable isotope studies are real science, however. That field is cooking along and the scientists involved are properly paying attention to detail. I hereby fully except them from my general condemnation of the field of AGW proxythermometry.

With this study, I’ve now examined the reliability of all three legs of AGW science: Climate models (GCMs) here (calculations here), the surface air temperature record here (pdf downloads, all), and now proxy paleotemperature reconstructions.

Every one of them thoroughly neglects systematic error. The neglected systematic error shows that none of the methods, not one of them, is able to resolve the surface temperature change of the last 150 years.

The pervasiveness of this neglect is the central mechanism by which AGW alarmism survives. This has been going on for at least 15 years; for GCMs, 24 years. Granting integrity, one can only conclude that the scientists, their reviewers, and their editors are uniformly incompetent.

Summary conclusion: When it comes to claims about unprecedented this-or-that in recent global surface temperatures, no one knows what they’re talking about.

I’m sure there are people who will dispute that conclusion. They are very welcome to come here and make their case.

References:

1. Mann, M.E., R.S. Bradley, and M.K. Hughes, Global-scale temperature patterns and climate forcing over the past six centuries. Nature, 1998. 392: p. 779-787.

2. Dansgaard, W., Stable isotopes in precipitation. Tellus, 1964. 16(4): p. 436-468.

3. McCrea, J.M., On the Isotopic Chemistry of Carbonates and a Paleotemperature Scale. J. Chem. Phys., 1950. 18(6): p. 849-857.

4. Urey, H.C., The thermodynamic properties of isotopic substances. J. Chem. Soc., 1947: p. 562-581.

5. Brand, W.A., High precision Isotope Ratio Monitoring Techniques in Mass Spectrometry. J. Mass. Spectrosc., 1996. 31(3): p. 225-235.

6. Kim, S.-T., et al., Oxygen isotope fractionation between synthetic aragonite and water: Influence of temperature and Mg2+ concentration. Geochimica et Cosmochimica Acta, 2007. 71(19): p. 4704-4715.

7. O’Neil, J.R., R.N. Clayton, and T.K. Mayeda, Oxygen Isotope Fractionation in Divalent Metal Carbonates. J. Chem. Phys., 1969. 51(12): p. 5547-5558.

8. Epstein, S., et al., Revised Carbonate-Water Isotopic Temperature Scale. Geol. Soc. Amer. Bull., 1953. 64(11): p. 1315-1326.

9. Bemis, B.E., et al., Reevaluation of the oxygen isotopic composition of planktonic foraminifera: Experimental results and revised paleotemperature equations. Paleoceanography, 1998. 13(2): p. 150-160.

10. Li, X. and W. Liu, Oxygen isotope fractionation in the ostracod Eucypris mareotica: results from a culture experiment and implications for paleoclimate reconstruction. Journal of Paleolimnology, 2010. 43(1): p. 111-120.

11. Friedman, G.M., Temperature and salinity effects on 18O fractionation for rapidly precipitated carbonates: Laboratory experiments with alkaline lake water - Perspective. Episodes, 1998. 21: p. 97-98.

12. Keigwin, L.D., The Little Ice Age and Medieval Warm Period in the Sargasso Sea. Science, 1996. 274(5292): p. 1503-1508; data site: ftp://ftp.ncdc.noaa.gov/pub/data/paleo/paleocean/by_contributor/keigwin1996/.

13. Shackleton, N.J., Attainment of isotopic equilibrium between ocean water and the benthonic foraminifera genus Uvigerina: Isotopic changes in the ocean during the last glacial. Colloq. Int. C.N.R.S., 1974. 219: p. 203-209.

14. Shackleton, N.J., The high-precision isotopic analysis of oxygen and carbon in carbon dioxide. J. Sci. Instrum., 1965. 42(9): p. 689-692.

15. Barrera, E., M.J.S. Tevesz, and J.G. Carter, Variations in Oxygen and Carbon Isotopic Compositions and Microstructure of the Shell of Adamussium colbecki (Bivalvia). PALAIOS, 1990. 5(2): p. 149-159.

16. Aldrich, J., Correlations Genuine and Spurious in Pearson and Yule. Statistical Science, 1995. 10(4): p. 364-376.

17. Díaz, E. and R. Osuna, Understanding spurious correlation: a rejoinder to Kliman. Journal of Post Keynesian Economics, 2008. 31(2): p. 357-362.

18. Crowley, T.J. and T.S. Lowery, How Warm Was the Medieval Warm Period? AMBIO, 2000. 29(1): p. 51-54.

19. Osborn, T.J. and K.R. Briffa, The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years. Science, 2006. 311(5762): p. 841-844.

20. Mann, M.E., et al., Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia. Proc. Natl. Acad. Sci., 2008. 105(36): p. 13252-13257.

21. D’Arrigo, R., R. Wilson, and G. Jacoby, On the long-term context for late twentieth century warming. J. Geophys. Res., 2006. 111(D3): p. D03103.

22. Moberg, A., et al., Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data. Nature, 2005. 433(7026): p. 613-617.

23. Luckman, B.H., et al., Tree-ring based reconstruction of summer temperatures at the Columbia Icefield, Alberta, Canada, AD 1073-1983. The Holocene, 1997. 7(4): p. 375-389.

24. Mann, M.E., et al., Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate. J. Climate, 2005. 18(20): p. 4097-4107.

25. Wilson, R., et al., Two-hundred-fifty years of reconstructed and modeled tropical temperatures. J. Geophys. Res., 2006. 111(C10): p. C10007.

Keith Sketchley
April 4, 2012 4:23 pm

The speculation by "Kev-in-Uk" about Pat Frank intuitively knowing something before his thorough review of the subject may be confusing. The word "intuitive" is vague and often mis-used, a floating abstraction in many cases.
People can have discomfort, or inklings, from subconscious processing of information, but whatever pops up must be validated. (The cause might be an error of their own somewhere – misunderstanding some information for example, or a true contradiction in information.) People can suspect something, but they have to investigate and validate.
People can be concerned about a claim, as I am about alarmists’ anti-human conclusions, but the claim must be examined – doing that objectively is a challenge.

Dr. Deanster
April 4, 2012 6:25 pm

wmconnelley says:
> borehole thermometry of the ice streams/sheets and the analysis of o18 etc within layers
You might find Jouzel et al. interesting (http://courses.washington.edu/proxies/JouzelJGR1997.pdf). Since it's one of the foundation papers for ice-core interpretation of d-O-18, it's odd to find this "comprehensive" review missed it.
> Mr Connelly
Dr (I mention that every now and again because there is a minority here that is interested in politeness; you might well be one of them). And the spelling, of course.
Well Dr. Connelley … I read your paper, and it deals with a completely different subject from that illustrated in the OP. The OP was concerned with the experimental error in the isotope readings liberated from Ca deposits: error intrinsic to the extraction method and error intrinsic to the formation process.
In contrast, the paper you linked is concerned with directly measuring D and O18 in water samples from precipitation. As noted in figure 8, the further one goes back in time, the more scatter we see, to the point that the paper itself proves the issue of the OP: that isotopes in paleothermometers are not as accurate as one would think. The only point on the graph where the scatter is tight is at ZERO years before present.
One of the limitations mentioned in the text is again consistent with the OP. Conditions at the time of formation of precipitation have a significant influence on the D or O18 in the precipitate. We find this same difficulty with the Ca deposits. Further, spatial differences are also a source of error, as illustrated by the fact that the curves change over space, and isotopes in precipitation are practically useless in the tropics and equatorial areas.
There are several other issues that you seem to ignore as well. The OP is talking about estimating temperature within 1 C. I don't see the isotopes in this study making that fine a measurement across time, not in the graphs, nor in the text. Even the "calculated" vs. "observed" graphs show spatial differences of as much as 2 C and greater, again consistent with the claim in the OP of the 2.75 C error resident in isotope studies of Ca.
A final issue is that the OP really doesn't criticize the isotope guys, and in fact claims that they do very good science. However, the "tree-ring circus" lacks sound scientific theory to back up its claims.

April 4, 2012 9:22 pm

Tenuk, agreed. I’ve found the same lack of physical error bars looking at GCM outputs and at the surface temperature record. The neglect is endemic in AGW-relevant climatology.
Frank, you wrote, "The absolute size of peaks in a mass spectrum is irrelevant; only ratios are reported."
Mass spectrometers record m/z peaks, which means mass divided by charge. In heavy isotope mass spec, that's the mass of the parent ion. The absolute intensities of all peaks are necessarily measured. The absolute peak intensities are needed to calculate the ratio of heavy isotope parent to light isotope parent. Your first statement, therefore, is wrong.
Here, for your edification, is a mass spectrum from K.I. Öberg, et al., "Photodesorption of ices I: CO, N2, and CO2." Among other things, they reported this heavy isotope mass spectrum:
[Image: http://i41.tinypic.com/2ivbrsg.gif]
Partial Figure Legend: "Mass spectra acquired during irradiation of a 6.2 ML thick 13C18O2 ice at 20 and 60 K … there are some background CO (m/z=12, 16 and 28), CO2 (m/z=44) and possibly some background H2O as well (m/z=18)."
Note the peaks: absolute intensities, not ratios. Also relevant to our interests here, note the 18O2 at m/z=36, and the 13C18O2 at m/z=49.
You wrote, "Any "losses and inefficiencies" won't cause a change in the ratio unless they provide a path for separating some CO2 with O18 from ordinary CO2."
Your confidence is directly refuted by Shackleton’s scattered results, noted above. Using the identical sample, his measured dO-18 varied by (+/-)0.14%o. The scatter present in everyone else’s data likewise provides direct evidence of variably discrepant isotope measurements.
Clearly, there are uncontrolled variables affecting measured isotope ratios.
This comment, “Even IF such a path existed, that wouldn’t necessarily cause a problem WHEN we interpret CHANGES in these ratios, not the absolute value of these ratios.” assumes a constant systematic error that can be subtracted away. There’s no reason to suppose constant systematic error.
In fact, systematic error is rarely constant. It’s typically due to uncontrolled variables — not uncontrolled constants. And not necessarily uncontrolled variables of the same magnitude or influence. The relative impacts of uncontrolled variables can change with operator, with method, with instrument, and so forth — including between laboratories — and no one can predict the outcome. Look at the scatter in Figures 2-5. None of the systematic error is constant. I’ve yet to run across a case where it’s constant.
You wrote, “The appropriate issue is: How reproducible are these measurements over the full period of the study?” and the answer is in Figures 2-5, the table of Shackleton’s results, and the “~0.1 permil” error reported in Keigwin’s mass spec data. All of them indicate variable scatter.
Volker Doormann, O-18 is a stable isotope. It doesn't have a decay time.

April 4, 2012 9:23 pm

Well, too bad. The image of the mass spectrum didn’t come through.

April 4, 2012 9:24 pm

But if you click on the link, you’ll see the mass spectrum at tinypic.com.

Kev-in-Uk
April 4, 2012 10:22 pm

Keith Sketchley says:
April 4, 2012 at 4:23 pm
Actually, I was meaning in the context of say a car mechanic, who, on hearing someone describe a symptom, will, from experience – likely ‘know’ roughly what the problem is. So, for myself, when reading some of the reports I have to review – if I read something and think ‘that doesn’t sound right’ I’ll re-read and double check. Within the context of Pat Franks review, this means/meant going back through to the roots (including previously quoted or referenced papers within other papers) to work it through and demonstrate the problem(s) to others. There are many papers technically ‘reliant’ on previous works, often as a result of the general acceptance that a published peer reviewed paper is ‘correct’. Clearly, this is not necessarily a good aspect of the science process when it happens.

April 4, 2012 10:25 pm

dave38, deuterium in water does produce a heavy water — DHO instead of H2O. DHO does have a higher boiling point than H2O. But deuterium and O-18 are so rare that the amount of DHO-18 in the world is vanishingly small.
Frank, you wrote, "Pat: The error bars you added to Keigwin's Sargasso Sea reconstruction (Figure 7) are potentially correct and at the same time GROSSLY misleading. Stable isotopes are much more accurate describing temperature change, rather than absolute temperature. Most publications plot stable isotope ratios, rather than derived temperatures, on the vertical axis for precisely this reason."
Well, at least you think the error bars correct. 🙂 But you’re wrong about the grossly misleading part, as well as in the rest of that paragraph. Stable isotopes are incorporated according to local temperature, not according to the change in local temperature. Let’s be clear. All else being constant, if local temperature changes the O-18 ratio changes. It changes from the old O-18 ratio reflecting the prior temperature to the new O-18 ratio reflecting the new temperature. If you want a difference between two O-18 temperatures, you can take the difference between two O-18 ratios. The O-18 isotope ratios are a direct proxy for temperature, not temperature differences.
An O-18 ratio is taken between the O-18 in water v. the O-18 in carbonate at some temperature and at some given time. Or in some O-18-containing measurement standard at a series of set temperatures. A single ratio does not reflect different temperatures at different times.
I suspect you’re confusing a ratio with a difference (an anomaly).
But O-18 ratios do not transmit temperature differences; they transmit temperature. They are not more accurate than shown in Figures 2-7.
Furthermore, temperature differences (anomalies) do not decrease systematic error unless the error is of constant magnitude in both temperatures. That is rarely the case, and virtually never the case in real-world field-measurements.
In short, the error bars in Keigwin’s Sargasso Sea reconstruction are good estimates of his systematic measurement error. I’ll demonstrate that in discussing your next mistake.
You wrote, “In simple terms,
t = m*d + b
“The slope, m, is fairly similar in all of these lines, but the y-intercept, b, is not: On the isotope delta scale -1 %o is always about +5 degC. If we consider two isotope measurements, d1 and d2 and calculate the temperature difference they represent:
t2 – t1 = m * (d2 – d1)…"

Let’s take your d2-d1 and apply that to Keigwin’s data. Each T has a systematic error of 1-sigma=(+/-)0.75 C. Here is a nice page on error propagation.
Scroll down until you find this: "Addition and Subtraction: The square of the uncertainty in the sum or difference of two numbers is the sum of the squares of individual absolute errors."
Got that? The sum of the squares of the individual errors. So now let’s look at your d2-d1 difference. The error in your (d2-d1)=e(2-1) = sqrt[(e1)^2+(e2)^2].
For any two of Keigwin's temperatures, the error in your temperature difference is e(2-1)=sqrt(0.75^2+0.75^2) = (+/-)1.06 C. That makes the 95% confidence limit of e(2-1) = (+/-)2.12 C in your difference temperature, up from (+/-)1.5 C. Your method has increased the uncertainty in the result by 41%.
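The quadrature arithmetic is trivial to check (Python sketch; the (+/-)0.75 C value is the 1-sigma figure used above):

```python
from math import sqrt

# Systematic 1-sigma uncertainty in each of Keigwin's proxy temperatures.
sigma = 0.75  # degrees C

# Uncertainty in a difference T2 - T1 adds in quadrature.
sigma_diff = sqrt(sigma**2 + sigma**2)

ci95_single = 2 * sigma        # ~95% limit for one temperature
ci95_diff   = 2 * sigma_diff   # ~95% limit for the difference

print(f"1-sigma of difference: {sigma_diff:.2f} C")   # 1.06 C
print(f"95% limit of difference: {ci95_diff:.2f} C")  # 2.12 C
```

Differencing two equally uncertain values always inflates the uncertainty by a factor of sqrt(2); it never reduces it.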
The reason the error propagates like that is because the error is systematic, not random. But don’t feel too badly. AGW-related climate scientists make the same mistake all the time.
By this time, you should know that, “Looking at Figure 7, it may be possible to conclude that the temperature in the Sargasso Sea rose about 1-2 degC between 1500 years ago and 1000 years ago” is wrong.
And this comment, “Furthermore, if we know today’s temperature and isotope ratio (as well as m) we can calculate b for the site and completely eliminate its uncertainty.” merely shows that you’ve failed to grasp that the errors I’ve calculated have nothing to do with knowledge of “m” or “b,” which after all are only fitted constants.
The errors I calculated are methodological systematic errors made during the course of laboratory measurement. They are from the empirical point scatter around the line defined by your “m” and “b,” and are independent of “m” and “b.”
You suggested, "Perhaps a Monte Carlo approach would help."
Monte Carlo assumes random distributions. My post has to do with systematic error. A Monte Carlo approach could not be more irrelevant.

April 4, 2012 11:13 pm

Tom Ragsdale, if you know that for sure, where the heck is my money? 🙂
PG, thanks and good point. Frank mounted a real effort.
Mondo, I gave links to the discussion that’s got Steve Mosher so steamed. Do take a look and decide for yourself whether he’s got a case.
LazyTeenager, how's this: ‘stable isotopes excepted [from a diagnosis of zero physical methods], but [the] physical meaning [of stable isotope proxies] is invariably discarded in composite paleoproxies.’
You wrote, “If [proxy methods correlate with temperature under all relevant conditions] then a proxy measurement can be used to infer a temperature irrespective of its physical basis.
You’re equating induction to deduction, LT. Doing so is entirely wrong. Proxies have no physical meaning outside a physical theory. It doesn’t matter how well they correlate with temperature.
At best, a strong correlation may imply an underlying causal process somewhere. However, that process may be independently driving both the “proxy” and the temperature. If that’s true, the “proxy” is not a proxy. If the cause changes or turns off, the correlation disappears (as late 20th century tree rings have done).
Without a physical theory, there is no way to know anything about why the proxy is behaving as it does. And total ignorance makes it a “proxy.” As such, its use would be no more than an irony of science. It’s not a true proxy.
PCA isn’t adding up time series. PC’s have no physical meaning. Without an organizing physical theory, they’re no more than just a series of numbers.
Epicycles were an empirical model with parameters adjusted by observations. PCA is none of that.
Michael Tobis, so, then, what information do we get about ministers salaries from recordings of the price of vodka?
markx, are we OK now? 🙂
Len, thanks.
William M. Connolley, I asked you a fair question in reply. You ignored that. What’s the point of referencing boreholes if they have a large uncertainty bound?
LazyTeenager, the answer to your #2 is, ‘no, not out of bound.’ Two correlated series can only reproduce one another’s values within the length of the correlation. They have no predictive power — a correlation doesn’t predict past the regressed length of the correlated series. And that means they have zero explanatory power.
NW, hope the response worked for you.
Aaron, the early work looked at the effects of different acids. McCrea made such a study, for example. Phosphoric acid gave by far the best dO-18 results in all the test experiments.

April 5, 2012 6:09 am

wmconnolley says:
April 4, 2012 at 7:21 am
Will Mr Connolley be addressing me as Mistress Kim2ooo? 🙂

April 5, 2012 6:26 am

Lazy Teenager
I think you have your reasoning backwards…
http://www.socialresearchmethods.net/kb/dedind.php

Roy
April 5, 2012 9:34 am

Excellent post. Once you get past the not inconsiderable difficulties of measurement, interpretation of the results adds a whole new layer of uncertainty. Leaving aside salinity, a further correction for the extent of glaciation is required, as precipitation which forms the polar icecaps is deficient in O18 by about 60 parts per thousand compared to ocean water. This can mean that ‘every isotopic curve has to be re-read taking cold to mean extensive continental glaciation and warm to mean glaciers reduced to their present level’. This is not new science; it’s a quote from Shackleton NJ (1967) Oxygen Isotope Analyses and Pleistocene Temperatures Reassessed, Nature 215, pp. 15-17. Available without paywall from http://www.mendeley.com/research/oxygen-isotope-analyses-pleistocene-temperatures-reassessed/#

William M. Connolley
April 5, 2012 1:21 pm

> why should I know you’re a PhD?
I’m not.
> the climate science meme
Meme? What are you talking about?
> general public were filling their cars with fuel measured
Rather a poor example. Fuel is measured, but error bars are not given. Obviously. Have another go?
> Will Mr Connolley beaddressing me as Mistress Kim2ooo? 🙂
Since you say nothing at all worth responding to I rather doubt I’ll be addressing you at all. Oh, wait…
> It is beyond my knowledge to judge the validity of Pat Frank’s thesis, without years of study, but it seems thorough and is well presented.
Errm, can no-one see how crass this, and similar comments are? If you can’t see beyond the surface, you have no business to be praising it. Unless the surface gloss is all you’re interested in, of course.
> the paper you linked is concerned with directly measuring D and O18 in water samples from precipitation
Well spotted. You’ll immediately see the relevance, I’m sure.
> I’ve found the same lack of physical error bars looking at GCM outputs
GCMs don’t have error bars in the usual sense, because the output is exact, of course. But GCMs have interannual variability, and you’ll find that reported. If you actually read the papers.
> (a) You had no point, but mentioned “borehole thermometry”
If you can’t see the relevance of the borehole thermometry, no amount of further explanation from me will help you.
> William M. Connolley, I asked you a fair question in reply.
No. You asked a question that amounted to “I don’t want to read the paper you referenced, so I’m going to ask temporising questions that appear to excuse my not bothering to read it”. If you were actually interested in the information, that wouldn’t be your response.

Frank
April 5, 2012 4:31 pm

Pat: Thanks for taking the time to reply. I will, however, decline your kind invitation to join the ranks of pro-AGW climate scientists who don’t (or do) understand error propagation, and attempt to improve the quality of the science being offered to WUWT readers. (I see from Steve Mosher’s comment that you have tussled with picky skeptics before. I added a comment at the end of the Air Vent post he linked (:)). I would accept an invitation to join their company, but I haven’t earned that privilege.)
We are far more interested in climate change than in the exact mean annual temperature at any particular location. Therefore, the uncertainty in temperature change (t2 – t1) in any analysis is far more important than the uncertainty in temperature (t2 or t1), which is what you misleadingly included in the absurd error bars in Figure 7. There are at least two ways to estimate the uncertainty in temperature change (t2 – t1). The dumb way is by the formula in your reply for the uncertainty of a sum or difference.
The key point of my post was that the equation below provides a method for calculating a low uncertainty for temperature differences (t2 – t1) from the low uncertainty in the mass spec data (d2 – d1), when the y-intercept b is constant. Here is an improved explanation:
t2 – t1 = m * (d2 – d1)
According to your post, modern mass spectrometers can measure d2 and d1 with an accuracy of 1 part in 100,000. From your post, 1%o is 1 part in 1,000. The mass spec data therefore allow us to distinguish between d = -1.00 %o and d = -1.02 %o on the x-axis of Figure 6. When one translates uncertainty on the x-axis (%o) to the y-axis (temperature) via a slope of approximately -5 degC per 1%o, we should be able to trust temperature DIFFERENCES of roughly 0.1 degC – even though you believe the uncertainty in INDIVIDUAL temperatures is greater than 1 degC! (I didn’t personally check your error bars, but your discussion seemed sensible.) Since the uncertainty in d2 – d1 is so small, it doesn’t make any practical difference whether the slope of the lines in Figure 6 is -4, -5 or -6 degC per 1%o.
As I said in my original comment, we MUST be analyzing data belonging on a single line of Figure 6; the y-intercept, b, must be constant. Different values of b and m arise from different kinetic isotope effects during the processes that incorporate O18 into the proxy material. We can’t use the difference between d2 from Cape Cod oysters and d1 from Florida coral or – as you absurdly noted – d2 from water and d1 from calcium carbonate. HOWEVER, the data in your Figure 7 is presumably all from a sediment core from the Sargasso Sea, and probably from CaCO3 in that core. If you want to assert that b varies for this core, you need to explain why incorporation of O18 into the shells of the organisms that presumably deposited this CaCO3 has changed over the past several thousand years. Keigwin’s paper may explain how they ensured that their proxy material was as homogeneous as possible (a single organism?). Maybe these researchers were sloppy and there is no reason to assume b and m are constant, but you need to convince readers why different lines in Figure 6 are appropriate for the data points in Figure 7.
If my uncertainty analysis (+/- 0.1 degC) is correct, do you still think your error bars properly reflect our understanding of temperature CHANGE in the Sargasso Sea? Does anyone care if the temperature there was 22 or 25 degC 1000 years ago?
Some publications leave isotope data in %o form (accurate to 0.01%o), rather than confront the complications of converting to temperature that you have exaggerated in this post. If Keigwin translated his isotope data into temperature data, he may have had good reasons for believing he knew a reliable method for translating isotope data into temperature data at his site and may have discussed his method in his paper.
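Frank’s difference argument above can be sketched numerically. This is a hypothetical illustration only; the -5 degC per 1%o slope and the 0.02%o resolution are the values assumed in his comment, not measured data.

```python
# Sketch of the argument: on a single calibration line t = m*d + b,
# the intercept b cancels when taking a difference, so only the
# precision of (d2 - d1) matters.  Slope is the assumed -5 degC/%o.

def temp_difference(d2, d1, slope=-5.0):
    """Temperature change implied by an isotope difference (degC)."""
    return slope * (d2 - d1)

# Resolving d = -1.00%o from d = -1.02%o implies a ~0.1 degC difference:
dt = temp_difference(-1.02, -1.00)
print(round(dt, 2))  # 0.1
```

Whether this small figure is meaningful is exactly what the subsequent exchange disputes: it holds only if both measurements lie on one calibration line and the scatter about that line is ignored.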

Dr. Deanster
April 5, 2012 6:00 pm

Dr. Deanster > the paper you linked is concerned with directly measuring D and O18 in water samples from precipitation
wmconnolley says > Well spotted. You’ll immediately see the relevance, I’m sure.
Dr. Deanster says … I immediately see how shallow this response is, and how it ignored the rest of the post regarding observations of your linked study. At least you could respond to the meat of the post, as opposed to a sound bite … that says nothing.

April 5, 2012 8:02 pm

William M. Connolley wrote, “GCMs don’t have error bars in the usual sense, because the output is exact, of course. But GCMs have interannual variability, and you’ll find that reported. If you actually read the papers.”
I’ve read the papers. The reason GCMs don’t have error bars — true physical error bars — is because no one knows what they are. No one has propagated the error through the physical theory in a GCM. No one has propagated the physical error per time step into a projection.
The GCM interannual variability, without physical error bars, represents little more than the internal variability of the model. It has numerical meaning only. Reporting those alone is a charade.
I’ve calculated the average cloudiness error made by GCMs (here, here); it equates to (+/-)100% of all the Anthro-GHG forcing, per time step. And that error propagates into rapidly increasing uncertainty with each time step. GCMs tell us nothing about future climate.
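A toy sketch of the step-wise propagation claim, assuming independent per-step errors that combine in quadrature (root-sum-square). The numbers are illustrative only, not taken from any GCM or from the linked calculation.

```python
import math

def propagated_error(per_step_error, n_steps):
    """Root-sum-square growth of an independent per-step uncertainty."""
    return per_step_error * math.sqrt(n_steps)

# An uncertainty of 1 unit per step grows tenfold over a 100-step projection:
print(propagated_error(1.0, 100))  # 10.0
```

Correlated or systematic per-step errors would grow even faster than this square-root sketch, which is the point being argued.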
You wrote, first quoting me, “William M. Connolley, I asked you a fair question in reply.”
No. You asked a question that amounted to “I don’t want to read the paper you referenced…”
Except that you had referenced no paper. Here’s your entire original comment to me: “The obvious thing to look at is the comparison of borehole thermometry to D-O18 in Greenland.”
I don’t see a reference there, do you? You made a lazy off-hand riposte, worth exactly nothing. My question about error was entirely appropriate. Your next response was another imposture, “Interesting to see how you react to new leads, new ideas…,” showing a lack of awareness that you had offered no new leads, no new anything.
And now you wrote, “No. You asked a question that amounted to “I don’t want to read the paper you referenced, so I’m going to ask temporising questions that appear to excuse my not bothering to read it”. If you were actually interested in the information, that wouldn’t be your response.” which is a crock because you had never referenced a paper.
So stop with the pious posturing, William. Your own words expose you.
Finally, in your next post, you actually did link to a paper — Jouzel’s 1997 ice core paper — as your evidence of the high-level reliability that will refute my critique.
I have that paper, and I am guessing you never read it, because immediately in Figures 1 and 2 the scatter in the T:dO-18 points is so large that the error it represents will dwarf the errors described in my head-post analysis.
The Greenland dO-18 data in Jouzel Figure 1 exhibit by far the least amount of scatter. Those data are referenced to S. J. Johnsen and J. W. C. White (1989) “The origin of Arctic precipitation under present and glacial conditions” Tellus 41B, 452-468, where they appear in Figure 3.
So, here’s what I did for you, William. I digitized Johnsen’s Greenland dO-18 data, and regressed it against T, just as he did.
Johnsen’s equation: dO-18%o = [0.67(+/-)0.02]*T - [13.7(+/-)0.5]
My regression: dO-18%o = [0.71(+/-)0.02]*T - [13.01(+/-)0.5]; r^2 = 0.99; not a bad reproduction.
Standard deviation of Johnsen’s point scatter about that line: (+/-)dO-18 = 0.359%o => (+/-)0.5 C
That (+/-)0.5 C is the lower limit of error in Jouzel’s (Johnsen’s) dO-18:T relationship — your exemplar of AGW-recouping wonderfulness, remember.
That makes the 95% confidence limit (+/-)1.0 C.
And that once again precludes any possible conclusion about historical, millennial, or geological unprecedentedness in 20th century temperature from dO-18 measurements.
Now what, William M. Connolley?
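The scatter-to-temperature conversion in the comment above is just a division by the regression slope. A minimal check, using only the numbers quoted there:

```python
# Johnsen's fit is dO-18 = 0.67*T - 13.7, so a scatter of s %o about
# the fitted line corresponds to a temperature uncertainty of s / 0.67 degC.

scatter_permil = 0.359          # standard deviation about the fitted line
slope_permil_per_degC = 0.67    # Johnsen's regression slope

sigma_T = scatter_permil / slope_permil_per_degC
print(round(sigma_T, 1))  # 0.5  (doubling gives the ~1 C 95% limit)
```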

April 5, 2012 9:33 pm

Frank, Steve Mosher’s comment was that I “[refuse] to engage the argument.” He was untruthful. He also wrote that I “continue to make the same mistakes,” which is conveniently non-specific. And Steve didn’t link the tAV post. I did.
Your up-dated explanation is just as mistaken as your original. The error I discuss is not mass spec error. It’s laboratory error. It’s sample prep error. It’s handling error. It’s systematic methodological error that shows up as point scatter in the results. That point scatter has nothing to do with the slopes of the lines.
You wrote, “The mass spec data therefore allow us to distinguish between d = -1.00 %o and d = -1.02 %o on the x-axis of Figure 6.”
No, they don’t because the lines on Figure 6 don’t represent data. They are just the lines fitted to T:dO-18 data. Look again at Figure 6. One of the arrows points to a line and says, “Epstein, 1953.” Now look at Figure 5. That shows the data of Epstein, 1953. Look at the point scatter in the dO-18 measurements. The temperature standard deviation of the scatter itself is (+/-)0.76 C.
The reverse polynomial regression yields 0.16%o as the experimental scatter in dO-18. That is again methodological error and it is 16 times larger than the mass spec precision, on which you have mistakenly focused. From Epstein’s equation, that (+/-)0.16%o scatter in dO-18 is equivalent to a systematic temperature uncertainty of (+/-)0.7 C.
Your point about mass spec accuracy is irrelevant.
Difference temperatures or difference dO-18s each encounter the equivalent systematic error width. It doesn’t matter which one you decide to emphasize. The error remains present and equivalent. Systematic errors combine in differences as the root-sum-square, as I already pointed out to you. You called error propagation “dumb,” and are advised to re-think that.
You’re also disputing direct evidence of uncertainty in results that the scientists involved in the dO-18 method development have themselves openly acknowledged. How smart is that?
You wrote, “We can’t use the difference between d2 from Cape Cod oysters and d1 from Florida coral or – as you absurdly noted – d2 from water and d1 from calcium carbonate.
You entirely misunderstood the point of Florida v. Cape Cod. The difference in T per dO-18 is due to the difference in salinity. The point to be raised in that context is that no one knows what the paleosalinity was when fossil shells were formed. That ignorance provides a potential for an error in derived T of up to about 2 C.
And your comment about “absurdly” using carbonate v. water O-18 merely shows that you don’t at all understand the method you’re arguing about.
You wrote, “… you need to convince readers why different lines in Figure 6 are appropriate for the data points in Figure 7.”
Frank, I begin to think you didn’t even read my post before arguing about its content. I didn’t apply different lines from Figure 6 to the data of Figure 7 (Keigwin’s data).
The error bars in Figure 7 come from the (+/-)0.14%o scatter in Shackleton’s method — Keigwin used Shackleton’s equation — and Keigwin’s own reported (+/-)0.1%o mass spec limit of precision. Those error bars reflect the empirical uncertainty in Keigwin’s own experimental method.
You wrote, “If my uncertainty analysis (+/- 0.1 degC) is correct…“. It isn’t. Nothing you’ve written has been correct, including calling standard error propagation, “dumb.”
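The Figure 7 error-bar construction described above combines two error sources in quadrature and converts to temperature through a calibration slope. In this sketch, the 0.14%o and 0.1%o figures are the ones quoted in the comment; the 4.8 degC per 1%o slope is an assumed illustrative value (taken from Frank’s later comment), not a number from Keigwin’s paper.

```python
import math

# Two independent error sources combine as a root-sum-square:
method_scatter = 0.14        # %o, scatter in Shackleton's method
mass_spec_precision = 0.10   # %o, Keigwin's reported precision

total_permil = math.sqrt(method_scatter**2 + mass_spec_precision**2)
total_degC = 4.8 * total_permil   # assumed slope, degC per %o

print(round(total_permil, 2))  # 0.17
print(round(total_degC, 1))    # 0.8
```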

William M. Connolley
April 6, 2012 12:01 am

> The reason GCMs don’t have error bars — true physical error bars — is because no one knows what they are.
So, you haven’t read the papers. Like I say, all this stuff is lost and confused. Read up on the basics before trying to walk on your own.
> I’ve calculated the average cloudiness error made by GCMs
Why did you do that? There are plenty of papers out there that do it properly. You should read them, instead.
> I digitized Johnsen’s Greenland dO-18 data, and regressed it against T, just as he did.
Well, that was pointless then. Try reading the paper instead.
There’s a theme here.

April 6, 2012 12:32 am

Willy Conn says in reply to Dr Frank:
> I’ve calculated the average cloudiness error made by GCMs
Why did you do that? says Conn-man. There are plenty of papers out there that do it properly. You should read them, instead.
“Papers” are not a substitute for calculations, chump.
And:
> I digitized Johnsen’s Greenland dO-18 data, and regressed it against T, just as he did.
Well, says the Wiki-Connster, that was pointless then. Try reading the paper instead. There’s a theme here.
Yes, and the ‘theme’ is the Conn-man’s anti-science narrative that claims a “paper” trumps data. It does not, particularly in climate science. Willy Con should stick to the only thing he is competent at: censoring scientific views different from his own. Readers at the internet’s Best Science site (http://2012.bloggi.es/#science) are not Conned by this jamoke. His censorship only works at Wikipedia. Not here. Tough noogies, conman.

Gail Combs
April 6, 2012 7:33 am

When William M. Connolley comes on to WUWT to jump down on a concept with both feet, it is a good indication that someone is posting important information.
I am not nearly as knowledgeable as Pat Frank is in statistics (a major failing of our university science programs IMHO) but as a chemist I certainly see his point about the error inherent in any and all test methods and how that error is compounded when you are talking multiple lab techs, different laboratories and samples from different locations with who knows what confounding factors built in.
REPLY: I wonder who pays Connolley to spend so much time here? Obviously he’s on a mission – Anthony

Agile Aspect
April 6, 2012 2:10 pm

Excellent article!
The link to your paper on global temperatures is still broken (although I was able to edit the URL in my browser and get the download to work.)
This might explain why moser, frank, mtobis and the wiki molester are having trouble grokking standard experimental error analysis taught in undergraduate physical science courses.
It would be nice if someone could follow up this post with a post on the 83 year shift in the CO2 data from Shipley by Keeling.

Agile Aspect
April 6, 2012 2:24 pm

dave38 says:
April 3, 2012 at 11:40 am
“The O-18/O-16 ratio in sea water has a first-order dependence on the evaporation/condensation cycle of water. H2O-18 has a higher boiling point than H2O-16, and so evaporates and condenses at a higher temperature.”
I can accept that, but I wonder what effect the presence of deuterium (HDO) has on the temperature and the evaporation/condensation, and whether it can make much difference?
;————————–
The isotopic composition of rain and snow can vary by 4% at mid latitudes and up to 40% at the poles as a result of evaporation and condensation.
The isotopic composition of deep offshore ocean water is remarkably uniform.

Frank
April 7, 2012 1:39 pm

Pat: Let’s see what we can agree upon.
1) When interpreting Figure 7, I believe we want to know which large temperature changes are significant and which smaller changes may be due to experimental variability. The accuracy of the absolute temperature at any one time and place usually isn’t important. Do you agree?
2) Error propagation is important, but analyzing data in a manner which unnecessarily inflates the error is dumb. Statistics and signal processing are all about extracting reliable information from noisy data. My earlier comments show a better method for calculating the uncertainty in temperature CHANGE than the brute force method you applied to the absolute temperatures. However, my method only works when all of the data points belong on the same O18/temperature calibration line. Do you agree?
3) When a dO18/temperature calibration graph is constructed (with temperature on the y-axis as traditionally shown, even though temperature is really the independent variable during calibration), the isotope data should be presented with a horizontal error bar to reflect standard error of the mean for the isotope ratio obtained from shells grown at a single temperature. (See Figure 1a in Bemis) In theory, the width of those error bars can be reduced by more reproducible experimental technique and analyzing more samples. Do you agree?
4) You were right; it was ridiculous for me to suggest that the width of the horizontal error bars was determined by the ability of a mass spec to resolve 1.00 and 1.02%o. I should have read your post more carefully.
5) With some caveats discussed below, the standard error of the isotope data used to construct a calibration curve can be converted into the standard error of the reconstructed temperatures by multiplying by the absolute value of the slope of the calibration curve (4.8 degC per %o). The variability of this slope isn’t large enough to influence this conversion. Do you agree?
6a) Table 1 of a Lea review article summarizing various isotope temperature proxies claims the standard error of O18 temperature reconstructions is 0.5 degC (when O18 in seawater is known), suggesting that standard error in calibrating isotope data typically is 0.1%o. (Lea http://www.geol.ucsb.edu/faculty/lea/pdfs/Lea%20TOG%20proof.pdf) One paper I looked at reported a figure of 0.08%o in their methods section. (http://epic.awi.de/24738/1/Wit2010e.pdf) In the absence of further information, we should trust the significance of temperature changes greater than 1 degC between any two points. Changes of 0.5-1.0 degC between periods with multiple readings (the MWP vs the LIA) could be significant.
6b) Main caveat: Foraminifera deposited at a real site will be less homogeneous than the foraminifera used to construct a calibration. Temperature changes during the year, so the spread in the O18 data will reflect the temperature range during the year and the mean O18 will reflect the mean temperature. Laboratory studies show that salinity, light and especially seawater O18 (which changes with rainfall, evaporation and ice ages) can influence O18 incorporation. The calcium carbonate in real samples will have a mean and spread of O18 values that reflect the annual variability in all of these factors. As long as the mean of these influences doesn’t change appreciably with time, changes in the O18 record will reflect changes in the LOCAL mean annual temperature. (O18 is a dubious proxy for analyzing changes due to ice ages because O18 in sea water changed with the size of the ice caps.)
6c) I agree with you that reconstructed absolute temperatures from different sites need error bars reflecting all of the possible lines in Figure 6.
7) If you believe that changes in mean annual salinity (or other factor) might significantly increase the uncertainty of O18 temperature reconstructions, calculate how large a salinity change is required to produce a bias of 0.25 degC and make the case that salinity could have changed this much in the Sargasso Sea over the last 3000 years. If you can’t make such a case, tell your readers.
8) The reliability of Keigwin’s reconstruction depends on the standard error of the O18 data HIS LAB obtained from control foraminifera raised under well-defined conditions. Lea’s standard error for O18 reconstructions (0.5 degC) is only valid if Keigwin’s standard error for control O18 samples was 0.1%o.
9) Bemis varied temperature in the lab by almost 10 degC, so he could easily study the relationship between O18 and temperature (without tight O18 data). The standard error of his O18 data (see his Figure 1) was much greater than 0.1%o; his methodology certainly wouldn’t have resulted in a reconstruction with a standard error of 0.5 degC. However, you have no business attaching error bars from Bemis to results from a study by Keigwin. You need Keigwin’s quality control data to attach error bars to Keigwin’s study.
Adding insult to injury in your Figure 6, you show an isotope uncertainty of 0.4%o from Bemis intersecting DIFFERENT lines before being translated horizontally into temperature uncertainty. Until you demonstrate that the factors that produced the different lines in Figure 6 changed over time in the Sargasso Sea, you should have intersected only one of these lines.
10) With many graphs, it is traditional to put error bars that extend one standard deviation of the mean above and below the point. When the top of the error bar for a low result overlaps the bottom of the error bar for a high result, the difference between the low and high results is usually not significant. Your 95% confidence intervals probably confused many readers.
11) The most important thing I learned from this debate was that the standard error (according to Lea) is far lower for O18 than for the other isotope reconstruction methods:
Mg/Ca, +/-1 degC; alkenone index, +/-1.5 degC. It is probably easier to find scandalous misuse of these proxies than of O18.
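Frank’s point 5, that an isotope standard error times the absolute calibration slope gives the reconstructed-temperature standard error, is a one-line conversion. A minimal sketch using the 4.8 degC per 1%o slope and the 0.1%o figure cited in his list, offered as illustration only:

```python
def temp_standard_error(isotope_se, slope=4.8):
    """Temperature standard error (degC) from an isotope standard
    error in %o, via the absolute calibration slope in degC per %o."""
    return abs(slope) * isotope_se

# A 0.1%o isotope standard error maps to ~0.5 degC, matching Lea's figure:
print(round(temp_standard_error(0.1), 2))  # 0.48
```

Whether the 0.1%o input is the right number for any given lab is the substance of points 8 and 9 above.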

April 7, 2012 2:53 pm

William M. Connolley, it’s very clear that you are unable or unwilling to mount a constructive argument in your own defense.
Smokey, please, Pat, not Dr. Frank. 🙂
Gail, I didn’t know you were a fellow experimental chemist. It’s good to be in the company of someone who understands what a struggle it is to get low-error experimental data. Geoff Sherrington is a chemist as well, an analytical chemist in fact, and like you has a total respect for experimental error and its implications about reliability.

April 8, 2012 3:35 pm

Frank, in reply and using your numbering:
1) You show no understanding of the meaning of systematic error.
2) The error is in the data itself. Into what units the data are converted is irrelevant. Taking differences does not necessarily remove error. Your method doesn’t work at all. You really need to learn about systematic error, and propagation of error.
3) No. Systematic error can either shrink or grow with repeated sampling.
4) I didn’t use the word “ridiculous.” I used ‘mistaken.’
5) OK for linear standards, not OK for exponential standards.
6a) I’ve looked at Lea’s 1997 article. Table 1 gives dO-18 standard error of (+/-)0.5 C, referencing Kim and O’Neil, 1997 under 6.14.3.3. I’ve now evaluated the CaCO3 calibration data in Figure 2 and Table 1 of Kim and O’Neil, 1997. The measurement uncertainty in their 5 mM T:dO-18 calibration is (+/-)2.2 C. In their 25 mM T:dO-18 calibration it’s (+/-)4.0 C.
The Kim & O’Neil, 1997 25 mM data set is large enough (24 points) to evaluate the error envelope. There are at least two methodological error modes operating simultaneously in their experiment. The error therefore behaves as systematic, not random.
So, the error quoted by Lea is 4x to 8x smaller than just the measurement error actually in the data of the paper he cited.
Therefore, changes of 0.5-1.0 C are well below the level of resolution. Your 6a) is now moot.
6b) The unknowns you admit concerning paleo-salinity and foraminiferal depth, not to mention photosynthetic disequilibrium, destroy your conclusions about any reliability in “LOCAL” paleo-temperature. The uncertainty in paleo-salinity alone is worth at least (+/-)2 C.
6c) Thank you.
7) The case in Figure 7 is about measurement error, not paleo-salinity. I have been explicit about that all along, including in the head-post article itself. I.e., I wrote, “The total measurement uncertainty in Keigwin’s dO-18 proxy temperature…” There’s nothing about paleo-salinity and no need to amend anything.
8) I pointed out in the head-post article that, “At the ftp site where Keigwin’s data are located, one reads ‘Data precision: ~1% for carbonate; ~0.1 permil for d18-O.’”
Lea’s standard error is uncritical and in any case is a general estimate not applicable to any specific case, such as Keigwin’s, where a true uncertainty can be calculated (which I did).
9) Figure 7: how many times need I repeat that the error bars on Keigwin’s data are from Keigwin’s own experiment? Those error bars have nothing to do with Bemis’s work.
Figure 7 caused no injury, nor does Figure 6 make an insult. You are merely continuing to make a false projection of Figure 6 onto Figure 7.
Figure 6 is stand-alone, Frank. It’s meant to show that different standard lines give different temperatures for the same dO-18%o (or vice-versa). This disparity among lines is due to uncontrolled variables. Those uncontrolled variables introduce uncertainty into any determination of T from dO-18.
Repeating: Figure 6 has nothing to do with Figure 7. The error bars in Figure 7 have nothing to do with Figure 6.
Are we clear on that now?
10) 95% confidence limits are the standard way of showing minimal surety in result. Most readers here at WUWT are very sophisticated in such matters and were certainly not confused. New readers will have to get used to it.
11) I haven’t looked at the other proxies in any detail, but would not be at all surprised if the measurement errors in Mg/Ca and alkenone proxies are 2x-4x larger than Lea’s entries.

Brian H
April 8, 2012 7:50 pm

Pat;
#10;
this “95%” disease has to be stomped. It is a squishy, slack standard, used in squishy social science because they can’t get any better. Tell me, what do chemists consider a minimum confidence level? How many sigma?