An Ocean of Overconfidence

Guest Post by Willis Eschenbach

I previously discussed the question of error bars in oceanic heat content measurements in “Decimals of Precision”. There’s a new study of changes in oceanic heat content, by Levitus et al., called “World Ocean Heat Content And Thermosteric Sea Level Change (0-2000 m), 1955-2010” (paywalled here). [UPDATE: Available here, h/t Leif Svalgaard] It’s highlighted over at Roger Pielke Senior’s excellent blog, where he shows this graph of the results:

Figure 1. From Levitus 2012. Upper graphs show changes in ocean heat content, in units of 10²² joules. Lower graphs show data coverage.

Now, there are some oddities in this graph. For one, the data starts at year 1957.5, presumably because each year’s value is actually a centered five-year average … which makes me nervous already, very nervous. Why not show the actual annual data? What are the averages hiding?

But what was of most interest to me were the error bars. To get the heat content figures, they actually measure the ocean temperature and then convert the change in temperature into a change in heat content. So to understand the underlying measurements, I’ve converted the graph of the 0-2000 metre ocean heat content shown in Figure 1 back into units of temperature. Figure 2 shows the result.

Figure 2. Graph of ocean heat anomaly, 0-2000 metres, from Figure 1, with the units converted to degrees Celsius. Note that the total change over the entire period is 0.09°C, which agrees with the total change reported in their paper.
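For anyone who wants to check that back-conversion, here is a minimal sketch of the arithmetic in Python. The heat-content change of roughly 24 x 10²² J is an approximate read off Figure 1 (not a number quoted from the paper), the specific heat is a nominal value for seawater, and the ocean mass is the 673 quadrillion tonnes discussed below.

```python
# Rough cross-check of the heat-content-to-temperature conversion behind Figure 2.
# Assumed inputs: the 1955-2010 heat change is an approximate read off Figure 1,
# the specific heat is a nominal value for seawater, and the mass is the
# 673 quadrillion tonnes of water quoted later in the post.

DELTA_HEAT_J = 24e22            # ~24 x 10^22 J (approximate, read off Figure 1)
OCEAN_MASS_KG = 6.7342e20       # 673 quadrillion tonnes of water, 0-2000 m
SPECIFIC_HEAT = 3990.0          # J/(kg K), nominal value for seawater

delta_T = DELTA_HEAT_J / (OCEAN_MASS_KG * SPECIFIC_HEAT)
print(f"Mean 0-2000 m temperature change: {delta_T:.3f} C")   # ~0.09 C, as in Figure 2
```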

Here’s the problem I have with this graph. It claims that we know the temperature of the top two kilometres (1.2 miles) of the ocean in 1955-60 with an error of plus or minus one and a half hundredths of a degree C.

It also claims that we currently know the temperature of the top two kilometres of the global ocean, which is some 673,423,330,000,000,000 tonnes (673 quadrillion tonnes) of water, with an error of plus or minus two thousandths of a degree C.

I’m sorry, but I’m not buying that. I don’t know how they are calculating their error bars, but that is just not possible. Ask any industrial process engineer. If you want to measure something as small as an Olympic-size swimming pool full of water to the nearest two thousandths of a degree C, you need a fistful of thermometers; one or two would be wildly inadequate for the job. And the top two kilometres of the global ocean is unimaginably huge, with as much volume as 260,700,000,000,000 Olympic-size swimming pools …
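That comparison is simple enough to check. Here is a minimal sketch, assuming a nominal Olympic pool of 50 m x 25 m x 2 m (about 2,500 tonnes of water); the small difference from the figure above comes down to the assumed pool volume.

```python
# Order-of-magnitude check of the swimming-pool comparison above.
# The pool volume is an assumption (50 m x 25 m x 2 m minimum depth).

OCEAN_MASS_TONNES = 673_423_330_000_000_000   # top 2 km of the global ocean (from the post)
POOL_VOLUME_M3 = 50 * 25 * 2                  # 2,500 m^3
POOL_MASS_TONNES = POOL_VOLUME_M3             # ~1 tonne of water per cubic metre

n_pools = OCEAN_MASS_TONNES / POOL_MASS_TONNES
print(f"{n_pools:.2e} Olympic pools")          # ~2.7e14, same order as the ~2.6e14 quoted
```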

So I don’t know where they got their error numbers … but I’m going on record to say that they have greatly underestimated the errors in their calculations.

w.

PS—One final oddity. If the ocean heating is driven by increasing CO2 and increasing surface temperatures as the authors claim, why didn’t the oceans warm in the slightest from about 1978 to 1990, while CO2 was rising and the surface temperature was increasing?

PPS—Bonus question. Suppose we have an Olympic-sized swimming pool, and one perfectly accurate thermometer mounted in one location in the pool. Suppose we take one measurement per day. How long will we have to take daily measurements before we know the temperature of the entire pool full of water to the nearest two thousandths of a degree C?

135 Comments
Chuck Nolan
April 23, 2012 10:38 am

John Endicott says:
April 23, 2012 at 9:07 am
Mark says:
And it’s a well-established fact that 76.38% of all statistics are made up on the spot
============================================
Those were the old studies; newer, more accurate studies put it at 82.3459% ±0.002% 🙂
———————-
The increase is because they are now including climate science in the statistics.
The old numbers were pre-CAGW.

April 23, 2012 10:40 am

PPS—Bonus question. Suppose we have an Olympic-sized swimming pool, and one perfectly accurate thermometer mounted in one location in the pool. Suppose we take one measurement per day. How long will we have to take daily measurements before we know the temperature of the entire pool full of water to the nearest two thousandths of a degree C?
Answer
You will never know the temperature of the pool to within two thousandths of a degree C. All you will know is the temperature at that particular spot, to within the accuracy of your thermometer. The pools that I have been in have warm and cold spots.

And to make matters worse, in any pool of water exposed to the sun’s radiation you will always have temperature differentials, where one area is slightly warmer or colder than another. I own a pool service and spa repair business. When I add a colored chemical to the water, I love to watch how the chemical disperses. On a nice warm sunny day, you can see the dye shifting this way and that as it disperses, carried along by currents created by slight temperature differences between different areas of the pool.

Chuck Nolan
April 23, 2012 10:52 am

So it’s the CO2 that makes the beer warm fast. Who knew?

April 23, 2012 10:59 am

Give a million monkeys each a rectal thermometer and you’d be surprised what you can measure, as any primatologist will tell you. For the heat reported–however accurately or not–I get a sea level rise of 14 mm in 50 years, without acceleration but clearly dangerous (quick, raise the dikes!). This is better explained as recuperation from the LIA than anything else. –AGF

April 23, 2012 11:19 am

Steamboat Jack, I was kind of thumbing my nose at them 🙂
Here’s a thought about the Consensus:- Climategate Email 3165.txt
Many in the solar terrestrial physics community seem totally convinced that solar output changes can explain most of the observed changes we are seeing … the solar terrestrial group are not going to go away

April 23, 2012 11:22 am

Rich Lambert says:
April 23, 2012 at 7:44 am
“Climate science needs something like the ISO 9000 quality control standards used by industry.
REPLY: Yep, I argued for this back in 2008:”
They need a disciplinary body created by a legislative act as is the case for professional engineers. They have this for engineers because when they are wrong, bridges can fall down, mines collapse, dams breach, buildings fall over…. We have reached a point where incompetence and advocacy in science is heading for doing real damage to civilization and risk of death for billions by taking away abundant, affordable energy and wasting scarce resources and wealth on unworkable alternatives. A few of the main scientist advocates, well known to all, would already have been barred from practicing their “professions” by now.

Nic Lewis
April 23, 2012 11:28 am

tallbloke wrote:
“James Annan found some gross maths errors in Levitus et al 2000. Once he pointed out the maths error, Levitus et al’s corresponding author stopped corresponding with him…”
Apparently, Levitus felt under pressure not to produce lower trends, since that would be even less consistent with AOGCM simulations. The errors really are gross – the trend line in the global heat content graph disagrees with the data in the table by nearly 25%, if I recall correctly.
Even worse, when James Annan reported the errors to the editor at Science, the editor refused to do anything about the matter.
Given this history, I find it impossible to assign a high level of confidence to any study authored by Levitus.

John T
April 23, 2012 11:39 am

“So I don’t know where they got their error numbers…”
Seems like it may be an accuracy vs. precision type of problem confounded by a lack of understanding of significant figures.
Kind of like thinking if one of my steps is ~1 yard, I can calculate the length of a field to the nearest 1/64th of an inch if I pace it off enough times.
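The analogy is easy to turn into numbers. Here is a minimal simulation sketch, with hypothetical values throughout (nothing taken from the paper): repeated pacing narrows the random scatter, but a slightly biased pace leaves an error that no amount of averaging removes.

```python
import random

# Hypothetical illustration of the pacing analogy: repetition improves precision
# (random scatter shrinks) but not accuracy (a biased pace stays biased).

TRUE_FIELD_YD = 100.0     # actual length of the field
TRUE_PACE_YD = 1.02       # the "one yard" pace is really 2% long (systematic bias)
PACE_WOBBLE_YD = 0.05     # random variation of each individual pace

def pace_field_once() -> float:
    """One pacing of the field, counting each pace as exactly 1 yard."""
    n_paces = round(TRUE_FIELD_YD / TRUE_PACE_YD)
    return sum(1.0 + random.gauss(0.0, PACE_WOBBLE_YD) for _ in range(n_paces))

for trials in (1, 100, 10_000):
    estimate = sum(pace_field_once() for _ in range(trials)) / trials
    print(f"{trials:>6} pacings: {estimate:8.3f} yd (error {estimate - TRUE_FIELD_YD:+.3f} yd)")

# The random part of the error shrinks like 1/sqrt(trials), but the roughly 2 yd
# shortfall from the biased pace never averages away.
```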

Larry Ledwick (hotrod)
April 23, 2012 11:42 am

If they really believe they can make measurements at that accuracy, I submit that they should be willing to wager a month’s salary that in a blind test two teams of their choice, measuring the ocean’s temperature on opposite sides of the same boat at the same time and depth, can get measurements that agree to within 2x their alleged accuracy.
They will use new temperature measuring devices of their specification supplied directly from the manufacturer.
Each team can make any number of measurements within a one hour time period.
All measurements must be made at the same depth. They will report their measurements in real time to an independent 3rd party as they take them, for official documentation of the readings.
Then they apply their pre-documented processing to that officially certified data to converge on what they assert is the true temperature of the ocean at that location at that time.
The teams will at no time be aware of or able to determine the temperatures being reported by the opposite team.
The final measurements of the two teams must agree to an accuracy of 4 thousandths of a degree C.
If they win, they get to keep their previous month’s salary. If they lose, all the team members donate a month’s salary to the charity of Anthony’s choice.
Myth busters where are you?
Larry

AndyL
April 23, 2012 11:53 am

My answer to the final question: 1 day.
The measurement will be accurate to 2 thousandths. No improvement over this measurement is possible. You can just define it as the “average”.

Jim G
April 23, 2012 12:00 pm

John T says:
April 23, 2012 at 11:39 am
“So I don’t know where they got their error numbers…”
“Seems like it may be an accuracy vs. precision type of problem confounded by a lack of understanding of significant figures.
Kind of like thinking if one of my steps is ~1 yard, I can calculate the length of a field to the nearest 1/64th of an inch if I pace it off enough times.”
Excellent analogy. Probabilities and the tails of a normal distribution curve deal only with sample size and say nothing about sample selection, sample location, accuracy of measurement devices, etc., etc., ad infinitum, ad nauseam.

jaschrumpf
April 23, 2012 12:03 pm

Refresh my memory on simple error bars, please. If I measure that an object moved 2.00 meters in 4.00 seconds, and my meter stick is in cm and my stopwatch in hundredths of a second, my error bars would be “it moved 2.00 m +/- 0.005 m in 4.00 s +/- 0.005 s”, correct?
Now if I use those numbers to determine velocity I would get 0.50 m / s +/- 0.005m/0.005s?
This is what happens when your physics classes were [cough] decades ago…
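For what it’s worth, the textbook propagation-of-uncertainty rule answers that: for a quotient, the relative errors add in quadrature (assuming the two measurement errors are independent). A minimal sketch with the numbers above:

```python
import math

# Standard error propagation for v = d / t when the errors in d and t are
# independent: the relative errors add in quadrature.

d, dd = 2.00, 0.005      # metres, half the 1 cm resolution of the metre stick
t, dt = 4.00, 0.005      # seconds, half the 0.01 s resolution of the stopwatch

v = d / t
rel_err = math.sqrt((dd / d) ** 2 + (dt / t) ** 2)
dv = v * rel_err

print(f"v = {v:.3f} +/- {dv:.4f} m/s")   # ~0.500 +/- 0.0014 m/s
```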

Gail Combs
April 23, 2012 12:04 pm

Rich Lambert says:
April 23, 2012 at 7:44 am
Climate science needs something like the ISO 9000 quality control standards used by industry.
_________________________
NO THANK YOU!
I used to believe that too, until the FDA’s HACCP requirements. ISO (and HACCP) substitute paperwork for actual testing and depend on faith in others who have a vested interest in lying.
This is what Quality Engineers like me think. (Quality Magazine)

“…Scott Dalgleish, [is] vice president of manufacturing at Spectra Logic Corp., a Boulder, CO, maker of robotic computer tape backup systems. Dalgleish, an ASQ certified quality manager who has worked in the quality profession since the late 1980s, is not happy with the direction that the quality movement has taken in recent years. And he sees the ISO 9000 family of standards as the primary negative influence.
Among other things, Dalgleish contends that ISO 9000 misdirects resources to an overabundance of paperwork that does almost nothing to make products better, while fostering complacency among top management and quality professionals alike. The recent conversion to the 2000 version of the standard has only made things worse, he says. While ISO 9000:2000 has almost no effect on how good companies operate, it requires huge amounts of time for document revision that could better be spent on real quality improvement, he believes.
http://www.qualitymag.com/Articles/Letters_From_the_Editor/e4100ee7f4c38010VgnVCM100000f932a8c0____
Probing the Limits: ISO 9001 Proves Ineffective
http://www.qualitymag.com/Articles/Column/17062620c7c38010VgnVCM100000f932a8c0____

I’m wondering if there might be a silent majority of Quality readers out there on the topic of ISO 9000. The response to my July editorial, “Eliminate ISO 9000?,” was the heaviest that we have received in some time. I got lots of e-mails from readers about the piece, which reported the views of Scott Dalgleish, a quality professional who has been publicly critical of the impact of ISO 9000 on manufacturers, and has suggested that companies eliminate ISO 9000 altogether from their quality management systems.
Many of the responses were quite articulate, and some were humorous and entertaining. You can read a sampling in this month’s Quality Mailbag department on p. 12.
One thing that struck me about the letters I received is that almost all expressed some level of agreement with Dalgleish, particularly on issues related to excessive ISO 9000 documentation requirements. As you’ll see in the Mailbag department, one reader even said that his company has already dropped its ISO 9001 certification with no apparent negative effects.
What surprised me is that the July editorial elicited no ardent rebuttals in defense of ISO 9000.
http://www.qualitymag.com/Articles/Letters_From_the_Editor/65730ee7f4c38010VgnVCM100000f932a8c0____

No amount of paperwork can substitute for honesty. A good documentation system is always useful but it should not be the area of focus. Decent honest testing and data gathering is the backbone of all science including Quality Engineering.

Somebody
April 23, 2012 12:05 pm

Check out the central limit theorem to find out what they’ll invoke for their spurious precision. If you divide the precision of one instrument by the square root of the number of measurements, you’ll find out how they obtained those error bars. So basically they decreased the size of the error bars over time by either increasing the number of measurements or by increasing the instrument’s precision. The problem is… they apply the central limit theorem in a pseudo-scientific way. It is proven for independent variables (the classical version also requires them to be identically distributed, which does not happen for temperatures at different points on Earth; that is very easy to verify experimentally, since both the median and expected values vary. The condition is weakened in some variants, but the theorem does not hold under just ANY conditions).
The problem is… temperatures are not really independent. There is a dependence both in space, and in time.
That is, temperature in a point is dependent on the temperatures nearby, and the temperature at time t is dependent on the temperature at a previous moment (both the local one, and the nearby ones). That’s why one can have something like heat equation in physics. Because the temperatures are not independent.
So, it’s a pseudoscientific way to apply the central limit theorem when it does not apply. Unless they have a proof for it to hold even if variables are dependent. I don’t think they ever presented such a proof.
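To put numbers on that objection, here is a minimal simulation sketch, with hypothetical values (not the paper’s method): it generates autocorrelated “measurements” and compares the naive sigma/sqrt(N) error bar with the actual scatter of the sample mean. With strong correlation, the naive formula understates the true uncertainty severalfold.

```python
import random
import statistics

# Hypothetical illustration: when measurements are correlated (AR(1) process),
# the naive sigma/sqrt(N) error bar understates the true error of the mean.

def correlated_series(n, rho=0.9, sigma=1.0):
    """AR(1) series: each value depends on the previous one (correlation rho)."""
    x, out = 0.0, []
    innov_sd = sigma * (1 - rho ** 2) ** 0.5
    for _ in range(n):
        x = rho * x + random.gauss(0.0, innov_sd)
        out.append(x)
    return out

N, trials = 1000, 2000
means = [statistics.fmean(correlated_series(N)) for _ in range(trials)]

naive_se = 1.0 / N ** 0.5                      # sigma/sqrt(N), with sigma = 1
actual_se = statistics.stdev(means)            # real scatter of the sample mean
print(f"naive SE  = {naive_se:.4f}")
print(f"actual SE = {actual_se:.4f}")          # several times larger when rho = 0.9
# The effective sample size is roughly N*(1-rho)/(1+rho), far smaller than N.
```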

jaschrumpf
April 23, 2012 12:06 pm

agfosterjr says:
April 23, 2012 at 10:59 am
Give a million monkeys each a rectal thermometer and you’d be surprised what you can measure, as any primatologist will tell you.

Now you’ve put me in mind of the old joke:
“How can one tell the difference between an oral and a rectal thermometer?”
“The taste.”

Somebody
April 23, 2012 12:07 pm

Slight correction: “both the median and the variance vary”

JK
April 23, 2012 12:08 pm

Let me try again. This is not intended as an actual error analysis. The paper has not been officially published yet, so the appendix containing the authors’ own analysis is not yet available. We should wait until that analysis is available before discussing it.
However, to get an idea of the power of averaging, suppose that there are 3,000 Argo floats, each reporting once every 10 days for 5 years. This will give 547,500 measurements. Each of those measurements will have an error associated with it. But to the extent that the error is random these will cancel out. From basic statistics we can expect that the error on the mean will be down by a factor of about the square root of 547,500, which is about 740. That means that if there is a random error in each individual float of about 1.5 degrees, the error on the 5-year global mean will be about 0.002 degrees.
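That back-of-envelope arithmetic is easy to reproduce. A minimal sketch using the same illustrative numbers (3,000 floats, one profile every 10 days, an assumed 1.5 degree random error per measurement); as discussed elsewhere in the thread, the result stands or falls on the errors being independent.

```python
import math

# Back-of-envelope version of the sqrt(N) argument above.
# These are illustrative numbers from the comment, not values from the paper.

floats = 3000
reports_per_float = 5 * 365 / 10          # one profile every 10 days for 5 years
n = floats * reports_per_float            # 547,500 measurements

single_error = 1.5                        # assumed random error per measurement, deg C
mean_error = single_error / math.sqrt(n)  # only valid if the errors are independent

print(f"N = {n:,.0f}, sqrt(N) = {math.sqrt(n):.0f}, error on mean = {mean_error:.4f} C")
# ~0.002 C; the catch is the independence assumption.
```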
What would this apply to? Well, each Argo measurement has to represent a large volume of ocean. Some floats will find themselves warmer than the region they need to represent and some will find themselves cooler. If those differences are random, then in the average they will, to an extent, cancel out. Similarly, some thermometers will read a bit warm and some a bit cool. Of course it may be that all floats are biased warm or cool. That wouldn’t matter for assessing changes. It would matter if there were a systematic change in the bias. That would certainly require additional analysis.
Certainly it is right that trying to make a single measurement to 0.002 degrees is absurd. But is trying to make an Argo measurement to 1.5 degrees plausible? I don’t know. Here’s the thing: I don’t believe that anyone here knows off the top of their head either, or could figure it out with a few minutes’ thought about some analogous experience. I’m not arrogantly dismissing anyone as stupid. I am sure that we could all figure out a reasonable estimate if we put the work in. Of course I would like to see how the authors justify their estimate. But I don’t think that the answer is obvious.
You still might ask why a five-year average is legitimate. Willis hints at this question. But we could choose any time period. The error on the one-year averages will be higher. But what is special about one year? The one-month, one-week, and one-day averages could be calculated with greater and greater error. The ten-year average could be calculated with smaller error.
Presumably the authors would be interested in the highest frequency information they can get, and presumably chose a 5 year period in part as a compromise between error and time resolution.
In a case like this averaging works by sacrificing some information (for example year to year variation) in order to gain precision in another measure, the mean. It seems to me that, at least in general terms, this is an essential use of statistics throughout science. Maybe it has been applied badly here. But even if it has I don’t see any evidence presented that averaging is a fundamentally wrong strategy in this case.
To be clear, I quite agree with this statement:
“If you want to measure something as small as an Olympic-size swimming pool full of water to the nearest two thousandths of a degree C, you need a fistful of thermometers, one or two would be wildly inadequate for the job. And the top two kilometres of the global ocean is unimaginably huge, with as much volume as 260,700,000,000,000 Olympic-size swimming pools …”
I also agree that the temperature measured by each Argo float will have a much, much larger error than 0.002 degrees when taken as a measure of the surrounding ocean. That is not what I’m disputing.
I won’t defend the author’s error bar of 0.002 degrees, as I have not seen their justification. But I just don’t see any evidence presented in this post that it is possible to measure the global mean with much greater accuracy than that.
PS Having written this I have just read Willis’ previous post on Decimals of Precision where I see he addresses my argument above with the point that by this logic 30, or even 3, buoys could do a passable job. But square root scaling is just a back-of-the-envelope estimate and to answer that I think we need to delve into details. For example, it may be that a large number of buoys is needed for spatial coverage, but we could do a passable job with one hundredth of the number of observations from each buoy.
But at that point I think that it makes more sense to go from temperature to heat content and consider other details anyway.

Somebody
April 23, 2012 12:24 pm

“But to the extent that the error is random these will cancel out.” – or they will all add up. Another one that invokes something without actually checking the proof.
[In checking the proof, I see that you have a phony email address. Please provide a legitimated one. Thanks. ~dbs, mod.]

Gail Combs
April 23, 2012 12:28 pm

Gary Pearse says:
April 23, 2012 at 11:22 am
They need a disciplinary body created by a legislative act as is the case for professional engineers. They have this for engineers because when they are wrong, bridges can fall down, mines collapse, dams breach, buildings fall over…. We have reached a point where incompetence and advocacy in science is heading for doing real damage to civilization and risk of death for billions by taking away abundant, affordable energy and wasting scarce resources and wealth on unworkable alternatives. A few of the main scientist advocates, well known to all, would already have been barred from practicing their “professions” by now.
___________________________________
That I can certainly agree with. I took the ASQ Quality Engineering tests to become certified, although that is not the same as a PE. Giving engineers/scientists LEGAL immunity when the company is cheating is also a big plus that comes with being a PE.
You actually need both, minimum competency and legal immunity.

Somebody
April 23, 2012 12:29 pm

Just one more comment about that: one does not measure errors (so those are not the random variables the theorem is all about), but temperatures (those are the random variables that the theorem would be about, if they were independent… the problem is, they are not).

Curiousgeorge
April 23, 2012 12:37 pm

Gail Combs says:
April 23, 2012 at 12:04 pm
Rich Lambert says:
April 23, 2012 at 7:44 am
Climate science needs something like the ISO 9000 quality control standards used by industry.
_________________________
NO THANK YOU!
*********************************************************************************************
Small world! 🙂 I knew Scott thru ASQ. He’s not the only one who had problems with the ISO concept, and related auditing industry. It was a godsend for consultant larvae. What isn’t often mentioned is that the ISO auditors or their parent organization are not contractually liable for anything. That remains with the audited company. So there is really no incentive for a company to contract with an auditing organization, such as ASME or any of the many others, especially since the costs of doing so are significant. Then there was that loophole – self-certification. And of course various regulatory agencies get very nervous about it. FAA in my case.
We had this discussion at Boeing in the ’90s. If you go back in the history of the 9000 series you’ll find that its primary raison d’être was to ease trade restrictions and such within the budding European Union, by imposing standards of various sorts on the participating countries. It was sold to US companies as a way to ease entry into Euro markets.
I’m a little surprised that it’s still around, actually. Anything a company needs to know for QA/process control is available in many books on the subject from Juran, Deming, etc., as well as professionals like Scott.