Guest Post by Willis Eschenbach
Over at Judith Curry’s excellent blog there’s a discussion of Trenberth’s missing heat. A new paper about oceanic temperatures says the heat’s not really missing; we just don’t have accurate enough information to tell where it is. The paper is called “Observed changes in top-of-the-atmosphere radiation and upper-ocean heating consistent within uncertainty”.
It’s paywalled, and I was only interested in one rough number, so I haven’t read it. The number I wanted was the error estimate for their oceanic heating rates. That error can be seen in Figures 1a and 3a on the abstract page, and it is on the order of plus or minus one watt per square metre (W/m2). This is consistent with other estimates of the measurement error in upper ocean heat content.
I think I can conclusively demonstrate that their claimed error is way too small. To understand why, let me take a detour through the art, science, and business of blackjack.
In a fit of misguided passion, some years back I decided to learn how to count cards at blackjack. I had money and time at the same moment, an unusual combination in my life, so I took a class from a guy I’ll call Jimmy Chan. Paid good money for the class, and I got good value. I’ve always been good with figures, and I came out good at counting cards. Not as good as Jimmy, though; he was a mad-keen player who had made a lot of money counting cards.
At the time they were still playing single deck in Reno. And I was young, single, and stupid. So I took twenty thousand dollars from my savings for my grubstake and went to Reno. It was an education about a curious business.
Here are the economics of the business of counting cards.
First, if you count using one of the usual systems as I did, and you are playing single deck, it gives you about a 1% edge on the house. Not much, to be sure, but it is a solid edge. And you can add to that by using a better counting system or a concurrent betting system, where better means more complex.
Second, if you play head-to-head (just you and the dealer) you can typically play about a hundred hands an hour.
Doesn’t take a math whiz to see that if you don’t blow the count, you will win about one extra hand an hour.
And therein is the catch. It means that in the card counting business, your average hourly wage is the amount of your average bet.
It’s a catch because of the other inexorable rule of counting blackjack. This regards surviving the swings and arrows of outrageous luck. If you don’t want to go home empty-handed, you need to have a grubstake that is a thousand times your average bet. Otherwise, you could go bust just from the natural ups and downs.
Now, twenty thousand dollars was all I could scrape together then. So that meant my average bet couldn’t be more than twenty dollars. I started out at the five dollar level.
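For those who like to see that arithmetic laid out, here is the whole business model in a few lines of Python. The 1% edge, the hundred hands an hour, and the thousand-times-your-average-bet bankroll rule are just the rough figures from above, nothing more precise than that.

```python
# The card-counting economics above, in a few lines. The 1% edge, the
# hundred hands an hour, and the thousand-times-your-bet bankroll rule are
# the rough figures from the post.
edge = 0.01               # extra hands won per hand played
hands_per_hour = 100
bankroll = 20_000         # dollars

max_average_bet = bankroll / 1000                      # survive-the-swings rule of thumb
hourly_wage = edge * hands_per_hour * max_average_bet

print(f"max average bet: ${max_average_bet:.0f}, expected wage: ${hourly_wage:.0f}/hour")
# max average bet: $20, expected wage: $20/hour
```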
I’d never spent any time in a casino up until then. I felt like the rube in every movie I ever saw. I played a while at the five dollar level. You never win or lose much there, so nobody paid any attention to me.
After a day or so making the princely sum of $5 per hour, I started betting larger. First at the ten-dollar level. Then at the twenty-dollar level. That was good money back in those days.
But when you start to make a bit of money, like say you hit a few blackjacks in a row and you’re doubling down, they start paying attention to you, and the trouble begins. First they use the casino holodeck to transport a somewhat malignant looking dwarf armed with a pad and a pencil to your table. He materializes at the shoulder of the dealer, and she starts to sweat. I say she because most dealers were women then and now. She starts to sweat because the casino doesn’t really care about card counters. I was making $20 an hour on average? Big deal, everyone in the casino management made that and more.
What scares casino owners is collusion between dealers and players. With the connivance of the dealer a guy can have a “string of luck” that can clean out a table in fifteen minutes and be out the door, meeting the dealer later to split the money. That’s what casino owners worry about, and that’s why the dealer started sweating, she knew she was being watched too. The dwarf peered through coke-bottle thick glasses, and wrote down the number of chips on each stack in the dealer’s rack, how much money I had, how much other players had. He gave the dealer a new deck. He wore a suit that cost as much as my grubstake. His wingtip shoes were shined to a rich luster. He looked at me as though I were a rich man with a loathsome disease. He watched my eyes, my hands. I started sweating like the dealer.
If I continued to win, the holodeck went into action again. This time what materialized were two large, vaguely anthropoid looking gentlemen, whose suits were specially tailored to conceal a bulge under the off-hand shoulder. They simply appeared, one at each shoulder of the aforementioned vertically challenged gentleman, who looked even dwarfier next to them, but clearly at ease in his natural element. They all three stared at me, and when that bored them, at the dealer. And then at me again.
And if the dealer was sweating, I was melting. I’m not made for that kind of game, I’m not good at that kind of pretence. I found out you can take the cowboy out of the country, but you can’t make him go mano-a-mano with the casinos for twenty bucks an hour.
I lasted a week. I logged my hours and my winnings. During that time, I worked well over forty hours. I only made enough money to pay for the flight and the hotel, and that’s about it. I was glad to put my twenty grand back in the bank.
I couldn’t take the constant strain and pressure of counting and not looking like I was counting and trying to stay invisible and feeling like a million eyes in the sky were watching my every eyeblink and having an inescapable feeling of being that guy in the movies who’s about to be squashed like a bug. But for those who can make it a game and keep it up, what an adventure! I’m glad I did it, wouldn’t do it again.
The part I liked the least, curiously, was something else entirely. It was that my every move was fixed. For every conceivable combination of my cards, the dealer’s card, and the count, there is one and only one right move. Not two. Not “player’s choice”. One move. I definitely didn’t like the feeling that I could be replaced by a vaguely humanoid 100% Turing-tested robot with a poor sense of dress and a really, really simple set of blackjack instructions.
But I was still interested in the math of it all. And I had my trusty Macintosh 512. And Jimmy Chan had an idea about how to improve the odds by changing his counting method. And so did some of Jimmy’s friends. And he had a guy who tested their new counting method for them, at some university, for five hundred bucks a run.
So I told Jimmy I’d do the analysis for a hundred bucks a run. He and his friends were interested. I wrote a program for my Mac to play blackjack against itself. I wrote it in Basic, because that was what was easy. But it was sloooow. So I taught myself to program in C, and I rewrote the entire program in C. It was still too slow, so I translated the critical sections into assembly language. Finally, it was fast enough. I would set up a run during the day, programming in the details of however the person wanted to do the count. Then I’d start it when I went to bed, and in the morning the run would be done and I’d have made a hundred bucks. I figured that I’d finally achieved what my computer was really for, which was to make me money while I slept.
The computer had to be fast because of the issue at the heart of this post: how many hands of blackjack does the computer have to play against itself to find out whether the new system beats the old system?
The answer turns out to be about a hundred times more hands for each additional decimal place of precision. In practice, this means at least a million hands, and more is better.
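Here is roughly what that sizing problem looks like in code. This is only a sketch with a coin-flip stand-in for the game, not the real blackjack engine I wrote back then, and the 1% edge is just the ballpark figure from earlier.

```python
# A minimal sketch (not a real blackjack engine): treat each hand as a
# random win or loss of one bet unit, with a small built-in player edge,
# and see how well the edge can be measured from a given number of hands.
import numpy as np

rng = np.random.default_rng(42)
true_edge = 0.01        # assumed 1% player edge, in bet units per hand

def run(n_hands):
    outcomes = rng.choice([1.0, -1.0], size=n_hands,
                          p=[0.5 + true_edge / 2, 0.5 - true_edge / 2])
    measured = outcomes.mean()
    scatter = outcomes.std(ddof=1) / np.sqrt(n_hands)   # uncertainty of that average
    return measured, scatter

for n in (10_000, 1_000_000):
    measured, scatter = run(n)
    print(f"{n:>9,} hands: measured edge {measured:+.4f} +/- {scatter:.4f}")
# At 10,000 hands the uncertainty is about as big as the edge itself; at a
# million hands it is ten times smaller, small enough to tell two counting
# systems apart in the third decimal place.
```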
What we are looking at is the error of the average. If I measure something many times, I can average my answers. Is the resulting mean value the true underlying mean of what I am measuring? No, of course not. If we flip a hundred coins, usually it won’t be exactly fifty/fifty.
But it will be close to the true average of the data. How close? Well, the measure of how close it is expected to be to the true underlying average is what is called the “standard error of the mean”. It is calculated as the standard deviation of the data divided by the square root of the number of observations.
It is the last fact that concerns us. It means that if we double the number of observations, we don’t cut the error in half, but only to 0.7 of the original value. One consequence of this is that if we need one more decimal of precision, we need a hundred times the number of observations. That is what I meant by a hundred times per decimal. If our precision is plus or minus a tenth (± 0.1) and we want to know the answer to one more decimal, plus or minus one hundredth (± 0.01), we need one hundred times the data to get that precision.
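You can see the rule directly from the formula. Here is a tiny sketch, assuming for illustration that the data have a standard deviation of 1:

```python
# The square-root rule written out directly: the standard error of the mean
# is sd / sqrt(N), so each extra decimal of precision costs a factor of 100
# in the number of observations. (The sd of 1 here is just for illustration.)
import math

sd = 1.0
for n in (100, 10_000, 1_000_000):
    print(f"N = {n:>9,}:  standard error of the mean = {sd / math.sqrt(n):.4f}")
# N =       100:  standard error of the mean = 0.1000
# N =    10,000:  standard error of the mean = 0.0100
# N = 1,000,000:  standard error of the mean = 0.0010
```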
That is the end of the detour; now let me return to my investigation of their error estimate for the ocean heating rate for the top 1800 metres of the ocean. If you recall, or even if you don’t, that was plus or minus 1 watt per square metre (W/m2).
Now, that is calculated from temperature readings from Argo floats, about 3,000 of them during the study period.
Let me run through the numbers to convert their error (in W/m2) into a temperature change (in °C/year). I’ve comma-separated them for easy import into a spreadsheet if you wish.
We start with the forcing error and the depth heated as our inputs, and one constant, the energy needed to heat seawater by one degree:
Energy to heat seawater:, 4.00E+06, joules/tonne/°C
Forcing error: plus or minus, 1, watts/m2
Depth heated:, 1800, metres
Then we calculate the mass of the water column under each square metre of ocean surface:
Seawater weight:, 1860, tonnes/m2
using a density of about 1.03333 tonnes per cubic metre.
We multiply watts by the seconds in a year to give
Joules from forcing:, 3.16E+07, joules/m2/yr
Finally, the joules available divided by (tonnes of water times the energy to heat a tonne by 1°C) gives us
Temperature error: plus or minus, 0.004, °C/yr
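For anyone who would rather check the arithmetic in code than in a spreadsheet, here is the same conversion again, using the same numbers as the list above:

```python
# The same forcing-to-temperature conversion as the list above.
SPECIFIC_HEAT = 4.00e6        # joules to warm one tonne of seawater by 1 degree C
SECONDS_PER_YEAR = 3.156e7

forcing_error = 1.0           # claimed error, watts per square metre
depth_heated = 1800.0         # metres
density = 1.03333             # tonnes of seawater per cubic metre

water_column = depth_heated * density                  # ~1860 tonnes under each square metre
joules_per_year = forcing_error * SECONDS_PER_YEAR     # ~3.16e7 joules per m2 per year

temperature_error = joules_per_year / (water_column * SPECIFIC_HEAT)
print(f"Temperature error: +/- {temperature_error:.4f} degrees C per year")
# Temperature error: +/- 0.0042 degrees C per year
```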
So, assuming there are no problems with my math, they are claiming that they can measure the temperature rise of the top mile of the global ocean to within 0.004°C per year. That seems way too small an error to me. But is it too small? If we have lots and lots of observations, surely we can get the error down to that small?
Here’s the problem with their claim that the error is that small. I’ve raised this question at Judith’s and elsewhere, and gotten no answer. So I am posing the question again, in the hope that someone can unravel the puzzle.
We know that to get a smaller error by one decimal, we need a hundred times more observations per decimal point. But the same is true in reverse. If we need less precision, we don’t need as many observations. If we need one less decimal point, we can do it with one-hundredth of the observations.
Currently, they claim an error of ± 0.004°C (four thousandths of a degree) for the annual average upper ocean temperature from the observations of the three thousand or so Argo buoys.
But that means that if we are satisfied with an error of ± 0.04°C (four hundredths of a degree), we could do it with a hundredth of the number of observations, or about 30 Argo buoys. And it also indicates that 3 Argo buoys could measure that same huge volume, the entire global ocean from pole to pole, to within a tenth of a degree.
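Here is that reverse-scaling argument in explicit numbers. To be clear, this is just the square-root rule applied to their claimed 0.004°C, not anything taken from the paper itself, and the three-buoy figure comes out nearer an eighth of a degree than a tenth, but the order of magnitude is the point:

```python
# The reverse-scaling argument in explicit numbers: if the error really does
# shrink as one over the square root of the number of buoys, then going the
# other way it must grow the same way.
import math

claimed_error = 0.004     # degrees C per year, claimed for the full Argo fleet
full_fleet = 3000

for n_buoys in (3000, 30, 3):
    implied_error = claimed_error * math.sqrt(full_fleet / n_buoys)
    print(f"{n_buoys:>5} buoys -> implied error of +/- {implied_error:.3f} degrees C")
#  3000 buoys -> implied error of +/- 0.004 degrees C
#    30 buoys -> implied error of +/- 0.040 degrees C
#     3 buoys -> implied error of +/- 0.126 degrees C
```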
And that is the problem I see. There’s no possible way that thirty buoys could measure the top mile of the whole ocean to that kind of accuracy, four hundredths of a degree C. The ocean is far too large and varied for thirty Argo floats to do that.
What am I missing here? Have I made some major math mistake? Their claimed error seems to be way out of line for the number of observations. I’ve not been able to find a good explanation of how they come up with these claims of extreme precision, but however they’re doing it, my math doesn’t support it.
And that’s the puzzle. Comments welcome.
Regards to everyone,
w.
“I’m saying if they are right that 3,000 floats can measure to 0.004°C error, then 30 floats should be able to measure it to 0.04°C error … and that’s not possible….w.”
Nice article, but are you complicating things unnecessarily?
Why not simply question how they can possibly calculate a 0.004°C error with the 3,000 floats?
“The part I liked the least, curiously, was something else entirely. It was that my every move was fixed. For every conceivable combination of my cards, the dealer’s card, and the count, there is one and only one right move. Not two. Not “player’s choice”. One move. I definitely didn’t like the feeling that I could be replaced by a vaguely humanoid 100% Turing-tested robot with a poor sense of dress and a really, really simple set of blackjack instructions … all of which I probably should add to the head post.”
This is what I liked most of all!! You could train someone to do it right and take all the decision making away from them. Then every so often, polygraph test them to make sure they are sticking to the rules etc!! Also you could simulate deviations from the perfect play, so you could calculate the cost of making plays that make you look like you are making mistakes.
In fact you could make yourself look like a total hopeless card counter to casino staff and other counters, yet have the smug knowledge you are better than the lot, while being still allowed to play, because you are soooo bad. Finally it is the weight of the money that brings you down. You just can’t keep hiding what you win.
Willis: after thinking some more I understand your point. I guess you have to assume that the measurements are far enough in time and space to be independent in order to land at 0.04 C accuracy with 30 floats. But if that assumption doesn’t hold the situation gets even worse, i.e. 30 floats would give an even better accuracy than 0.04 C, which is of course even less probable. I think I got it the other way around in my first comment.
The last ice age was about 90k years long and we have only been into the Holocene about 10k years. Considering the vast heat capacity of the oceans, there is no reason to think that heat flow to and from the oceans is in equilibrium. Under Holocene temperatures, there is still missing heat that is being stored in the oceans. Trenberth and company might imply that the “missing” heat is a threat to climate in the near future, but common sense says the heat will accumulate until the next ice age.
GaryW says:
January 27, 2012 at 8:17 am
You are correct, Gary. My understanding of those two terms is a bit different, although it may be semantic. Consider someone shooting at a target. This is accuracy versus precision.
[Figure 1: shots on a target, illustrating accuracy versus precision]
A measure of accuracy is the RMS average of the distance of the hits from the center of the target.
A measure of precision is the standard deviation of the locations hit.
So precision is not exactly the “resolution in units of measure a value can be read”. It is the repeatability of the measurements. Your digital tape measure might read to a resolution of tenths of an inch. But if you measure the same distance three times and get widely separated values, it is not precise.
One other important inequality. Your precision can be greater than your accuracy. But your accuracy can’t be greater than your precision.
Finally, an average can be more accurate than the underlying measurements. Consider the upper panel in figure 1. If the errors are symmetrical the average of a number of shots will be close to the center.
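If it helps, here is that distinction in a few lines of code, using made-up shots that are tightly grouped but centred off the bullseye:

```python
# Made-up shots: tightly grouped (spread of 0.2 units) but centred half a
# unit to the right of the bullseye.
import numpy as np

rng = np.random.default_rng(1)
shots = rng.normal(loc=[0.5, 0.0], scale=0.2, size=(100, 2))   # (x, y) hit positions

accuracy = np.sqrt(np.mean(np.sum(shots**2, axis=1)))   # RMS distance from the bullseye
precision = np.mean(np.std(shots, axis=0))              # scatter of the hits about their own centre

print(f"accuracy  (RMS distance from the centre): {accuracy:.2f}")
print(f"precision (spread of the group):          {precision:.2f}")
# The group is precise to about 0.2 units but sits about half a unit off
# centre, so the accuracy number is much worse than the precision number.
```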
w.
Tim Clark says:
January 27, 2012 at 11:50 am
The claimed error of 0.004°C is from Argo only.
w.
GaryW says:
January 27, 2012 at 12:09 pm
Thanks, Gary, particularly for quoting my words. It allows me to explain why your claim is not true. I made a general statement with no assumptions at all. It was an IF … THEN statement. In fact the requirement is even weaker, the error only has to be symmetrical, not gaussian normal.
You are correct, that is not true for all distributions. But for the kinds of distributions we’re talking about here it is generally true, certainly close enough to gain some precision through averaging.
Note that this does not include systematic error. But then, the quoted error of 1 W/m2 also does not include systematic error, so we are still comparing apples and apples.
w.
“The weaker the data available upon which to base one’s conclusion, the greater the precision which should be quoted in order to give the data authenticity.”–Norman Ralph Augustine
Willis wrote:
So, assuming there are no problems with my math, they are claiming that they can measure the temperature rise of the top mile of the global ocean to within 0.004°C per year.
Of course, they are not.
You are all tangled up, Willis. Their published error is statistical, not instrumental. It is the uncertainty of the trend, not of a temperature measurement.
Before accusing professional scientists of being completely incompetent, shouldn’t you at least give them the courtesy of responding to your notions — which in this case are completely off-base, yet put here only, it seems, to try to embarrass them regardless of the facts.
It’s like Drudge — doesn’t matter if he’s right — throw enough junk against the wall, and something is bound to stick somewhere.
I presume you will be writing a letter to Nature Geoscience with your finding?
Well I think I read every single post here so far (on this thread).
And there were two things I don’t believe I saw in ANY post.
#1 The sea is not like a piece of rock (a large rock); it is full of rivers, with water flowing along every which way, meandering if you will, and this meandering is aided and abetted by the twice-daily tidal bulge.
So the likelihood of a buoy, no matter how tethered or GPS-located, being in the same water for very long is pretty minuscule, so you might as well assume that every single observation is actually a single observation of a different piece of water.
Second, and far more important, this, like all climate recording regimens, is a sampled data system.
So before you can even begin to do your statistication on the observations, you have to have VALID data, as determined by the Nyquist theorem.
You have to take samples that are spaced no further apart than half the wavelength of the highest frequency component in the “signal”. The “signal” of course is the time- and space-varying temperature, or whatever other variable you want to observe. That of course means the signal must be band-limited, both in space and time. If the Temperature, shall we say, undergoes cyclic variations over a 24 hour period that look like a smooth sinusoid when you take a time-continuous record, then you must take a sample sooner than 12 hours after the previous one. If the time variation is not a pure sinusoid, then it has at least a second-harmonic overtone component, so you would need one sample every six hours.
And if the water is turbulent and has eddies with spatial cycles of say 100 km, then you would need to sample every 50 km. Ooops!! I believe that all of your spatial samples need to be taken at the same time, otherwise you are simply sampling noise.
Now if you do it correctly (fat chance), then in theory it is possible to perfectly reconstruct the original continuous function (of two variables in this case).
Well, you don’t really want (or need) the original signal, do you? What you want to do is statisticate the numbers and get the “average”, also known as the zero-frequency signal.
Well, the Nyquist theorem tells you that if you have a signal that you think is band-limited to a frequency B, then you need to sample at a frequency of at least 2B to be able to recover the signal.
If your signal actually has a frequency component at B + b, outside the band limit, then the reconstructed continuous function will contain a spurious signal at a frequency of B - b, which constitutes noise.
And note that B - b is less than B, which means it is within the signal passband, so it is inherently impossible, no matter what, to remove the spurious signal with any kind of filter without simultaneously removing actual real parts of the signal, which is just as bad a corruption as adding noise. This is known as “aliasing” noise. So you see why your statistics no longer work; not even the central limit theorem can save your hide, because your sampled data set is not a valid data set.
Now suppose that instead of sampling at a frequency of 2B like you are supposed to, you only sample at a rate of B, half the required minimum rate. This case arises if you sample twice a day, every 12 hours, but your 24 hour signal is not sinusoidal, so it contains at least a frequency of one cycle per 12 hours (or higher, if there is third-harmonic distortion).
So in that case your 12-hour component sits right at the sampling frequency, and after reconstruction it becomes a spurious signal at a frequency of B - B, or zero frequency. Now this, as we said, is the average of the signal.
So even if you don’t need to reconstruct the signal, but only want its average, it takes only a factor of 2 of undersampling and you can no longer recover even the average.
So forget all your fancy statistics; without a Nyquist valid set of samples you can’t do much.
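To put a number on that, here is a toy example with an invented daily temperature cycle, not real Argo data:

```python
# A toy daily temperature cycle with a second-harmonic component, sampled
# every 12 hours. The harmonic aliases down to zero frequency and biases
# the computed "average", and taking more of the same samples never fixes it.
import numpy as np

true_mean = 15.0                                             # invented mean temperature, deg C

def temperature(t_hours):
    daily = 4.0 * np.sin(2 * np.pi * t_hours / 24)           # 24 h fundamental
    harmonic = 1.0 * np.sin(4 * np.pi * t_hours / 24 + 1.0)  # 12 h overtone
    return true_mean + daily + harmonic

fine = np.arange(0, 24 * 365, 0.1)      # a well-sampled year
coarse = np.arange(0, 24 * 365, 12.0)   # two samples per day, every 12 hours

print(f"well-sampled mean:      {temperature(fine).mean():.3f}")
print(f"12-hourly sampled mean: {temperature(coarse).mean():.3f}")
# The undersampled mean comes out about 0.84 deg C high (the aliased
# harmonic), however long the record runs; the error is aliasing, not
# noise, so no amount of averaging the same samples can remove it.
```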
This is the pestilence that afflicts Hansen’s GISStemp. All that his collection of data records is GISStemp, and nothing else; it has no validity as a Temperature for the earth.
Well, it’s the same thing for the Argo buoys. They don’t tell you a thing about the global ocean Temperature, but they might be giving you interesting information about the general locations where each of the buoys happens to be; though don’t forget what the ocean river meanders are doing to that location.
The very first lecture in climatism 101 should be the general theory of sampled data systems.
Willis, you mention they are not discussing a decadal trend when in the abstract they say
“We combine satellite data with ocean measurements to depths of 1,800 m, and show that between January 2001 and December 2010, Earth has been steadily accumulating energy at a rate of 0.50±0.43 Wm−2 (uncertainties at the 90% confidence level).”
To me, it looks like their error bar applies to a decadal average rate. In other words, the error bar converted to temperature is about 0.04 degrees over the period from 2001-2010. The surface ocean temperature rises by typically 0.1 degrees per decade, so they can actually discern when a warming occurs with this dataset.
Willis,
“Thanks, Gary, particularly for quoting my words. It allows me to explain why your claim is not true. I made a general statement with no assumptions at all. It was an IF … THEN statement. In fact the requirement is even weaker, the error only has to be symmetrical, not gaussian normal.
You are correct, that is not true for all distributions. But for the kinds of distributions we’re talking about here it is generally true, certainly close enough to gain some precision through averaging.”
Perhaps the difference between our positions on this is semantic, but that is actually the point I am trying to get across. Your use of statistics and error analysis is certainly valid for many real world problems. Unfortunately, instrument accuracy is what we are discussing.
Your use of the graphics is interesting, but it has only a vague relationship to instrumentation concepts. The question is not how accurate or precise the gun was that shot the bullets, but how accurate and precise the target is at measuring the patterns the bullets produced. That is, I expect, a complete inversion of what you are thinking. Of course, it might be fun continuing with the target and gun example and I could probably even fit in a joke about the Texas Sharpshooter, but actually it is a poor example for discussing instrument accuracy and precision. From an instrumentation perspective, the target is the measurement instrument, not the gun, and certainly not the bullet patterns.
If you are still not on board with this, consider your accuracy example. Averaging the locations of the bullet holes relative to the center of the target might give you a more precise indication of where the rifle is actually sighted, but it will not improve the accuracy of either the rifle or the target. We are discussing instrument accuracy and averaging values.
The accuracy specifications of instruments are not simply made up by the manufacturer for advertising purposes (at least they are not supposed to be!). Each measurement technology has its own fundamental accuracy limits in terms of principle of operation, repeatability during manufacture, repeatability over time, linearity, and accuracy of error compensation features. Even those expensive RTDs in the Argo buoys require complex and careful linearity correction and range calibration. RTDs can be very good and are fairly linear in their response to temperature variations, but not 0.005 degree linear. Also, RTD resistance is measured by passing a current through it. That current causes a measurable heating of the sensor. The protective sheath it rides in has its own thermal resistance and heat capacity. Each technical issue and compensation has its own characteristic effect on a final temperature reading. The final result is that instrument errors, even when reduced to small values, are not random within the operating range of the instrument or from instrument to instrument. Assuming so is not valid.
I suspect the semantic problem exists because most folks do not look at an instrument from the perspective of someone trying to squeeze the best accuracy out of it, given the physical constraints. Users of that instrument do not see the tweaks necessary to correct for linearity and hysteresis quirks inherent in the design and still meet the required specs.
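A toy example of that last point, with invented numbers: averaging knocks down the random part of the error and does nothing at all to a shared systematic part.

```python
# Invented numbers: 3,000 instruments reading the same true value, each
# with independent random noise plus one shared systematic offset.
import numpy as np

rng = np.random.default_rng(7)
true_value = 10.0
n_instruments = 3000

random_noise = rng.normal(0.0, 0.05, n_instruments)   # independent, zero-mean
systematic_bias = 0.02                                 # e.g. self-heating or calibration drift

readings = true_value + random_noise + systematic_bias
error_of_mean = readings.mean() - true_value

print(f"error of the averaged reading: {error_of_mean:+.4f}")
# The random part shrinks to about 0.05/sqrt(3000) ~ 0.001, but the 0.02
# bias comes through the averaging untouched, no matter how many
# instruments you add.
```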
Phil. says:
January 27, 2012 at 3:09 pm
I’m still not seeing how 1,080 samples is a “small sample” in the climate science context. And I would be very surprised if 1,080 samples didn’t capture the standard deviation adequately.
w.
Michael J says:
January 27, 2012 at 3:58 pm
We’re talking about the “random” error, or “noise”. Systematic errors are a separate question.
w.
markx says:
January 27, 2012 at 4:05 pm
Actually, I’m simplifying things. Going your way you get into endless arguments about whether a particular assumption is justified.
I’m coming from the other end. I’m saying no matter how they calculated it, it can’t be that good.
w.
Erinome says:
January 27, 2012 at 8:12 pm
What do you think “measuring a temperature rise” is if not a trend, Erinome? It’s certainly not a datapoint. I’d lay odds you haven’t even looked at where I got my data.
All I’ve done is ask the question, Erinome. I’ve invited them or anyone to show me where my math is wrong. I freely admit it may be wrong, but you know what?
So far, no one has come by to set me straight and explain where I went wrong.
Oh, piss off with your bogus morality. I just asked a question.
I admit I could be wrong, I say I don’t know the answer, so what the heck are you bitching about?
If you don’t like it, Erinome, fine, you can go and ask the authors yourself.
I await your return with the answer from the authors. This is great, we can have dueling methods. Are you going to let us see your letter before you send it to them? Please? This should be fun.
Actually, I was thinking more of GRL. Nature publishes such tawdry piffle these days I’d rather not go there.
w.
Jim D says:
January 27, 2012 at 8:35 pm
Jim, have you looked at where I got my data? See those yearly values? See those yearly error bars? They are not decadal, no matter how many times you find the word “decadal” in the paper.
w.
GaryW says:
January 27, 2012 at 8:40 pm
Gary, I read the whole thing and it was fascinating, but I don’t have a clue what that means about my question. Let me give my question again and you can take another shot at it.
I say that if 3,000 Argo buoys give an error of 0.004°C, then other things being at least approximately equal, 30 buoys should give an error of 0.04°C.
1. Are there logical or math errors in that calculation?
2. Do you think 30 Argo buoys (1,080 observations per year) can measure the annual temperature rise of the global ocean to within 0.04°C?
Thanks for your response,
w.
Willis, you write:
“Presumably the scientists have calculated, not the theoretical limit of error, but the actual measurement error from the measurements. So my calculation is also about the actual error.”
This is the core of 10+ separate comments above, including mine.
Determining the precision, as you noted, only requires testing the equipment many times to see how well it agrees with itself and other co-located instruments. But accuracy requires more knowledge. To find the “the actual measurement error” for a single float (or surface station, whatever), one needs to know the actual average temperature of the gridcell the float is attempting to measure. This is decidedly not easy.
And we haven’t seen this attempted in more than a rough fashion to my knowledge. And repeated for each individual gridcell, because they aren’t all identical – they are all unique. This is the core difference between a “weather measurement” and a “climate measurement” IMNSHO – one is a point-source, and is judged based on how well it mimics a similar instrument measuring the same value. The ‘instrumental error’ is reasonable here. The plane landing at the airport wants the temperature at the airport, which is why the instruments were placed there in the first place. But… not when you’re extrapolating that same instrument to an entire gridcell.
And as far as I can tell, finding “actual measurement error when used for measuring the gridcell temperature” is not happening. The ‘weather error’ appears to be propagated instead.
Rephrasing:
Measuring the precision and accuracy of a point-source thermometer is not tough.
Measuring the precision of a gridcell-thermometer is also not tough.
Measuring the accuracy of a gridcell-thermometer is daunting. And should result in entire rafts of papers on pure instrumental methods alone.
Brilliant work, Willis. We need to think of a memorable name for your argument so that Warmists can more readily fill with dread every time they hear it. It’s too late for me to think clearly. I will get the ball rolling with the “Thirty Buoy Argument.”
Springing off of this comment, maybe one could think of it backwards. Pick a resolution. Take the equivalent of a digital picture of the entire temperature field of the oceans. Then, use a compression algorithm to reduce the size of the resulting file. Obviously, the fuzzier the reconstruction that you can tolerate, the more you can compress the file. The resulting compressed file is the sampling that you need to do to reconstruct a picture of the temperature field that you have never seen. Compression in this example would be affected by 3 things, roughly: the original chosen resolution, the amount of fuzziness or loss on compression that one could tolerate and the structure of the file. If the file is composed of many repeating elements (e.g. an even background), the compression is greater or the sampling needed is smaller.
Can this be estimated? I would think it could be. Simply take actual digital photographs of the world’s oceans from space (clouds and all, to simulate temperature variations), stitch them together into a mosaic, and then compress the resulting digital file to 3,000 pixels. After you do that, blow it back up and see if it resembles in any way the original photographic mosaic. I would think that the result would be more than a little fuzzy. The equivalent of 0.004°C resolution per reconstructed pixel? Not likely.
P.S. Your calculations appear to be correct, as usual.
Alan S. Blue says:
January 27, 2012 at 9:13 pm
Look, they are giving it as the measurement error. I don’t care what you call it. Call it the pizza error, it is the error that the scientists are saying that their results fall within.
My argument doesn’t depend on the name of their error. It doesn’t depend on the difference between precision and accuracy. It involves a statement and a couple questions, and I’d be interested in your answers.
The scientists say that 3,000 Argo buoys give an error of 0.004°C. Whatever kind of error it is, then other things being at least approximately equal, 30 buoys should give an error of 0.04°C.
1. Are there logical or math errors in that calculation?
2. Do you think 30 Argo buoys (1,080 observations per year) can measure the annual temperature rise of the global ocean to within 0.04°C?
Thanks,
w.
I’m reminded that, tucked away in Trenberth’s article on Missing Heat on SkS last year, was his own plot of sea surface temperatures showing a distinct curve which had just passed a maximum and was starting to decline. Yes, it appeared (without comment) on SkS of all places …
David L. Hagen says:
January 27, 2012 at 5:47 am
“The international quantitative standard for calculating the full uncertainty is the root mean square combination of all the errors.”
Catchy little slogan, David. Unfortunately, it does not apply to the real world, because of your use of the word, “all.” “All” includes METHOD ERRORS (MEs), IN ADDITION TO RANDOM ERRORS. In the real world, the MAGNITUDES of the MEs are usually unknown, and are often unknowable.
Example: Use Larch tree rings as proxies for temperature, as Keith Briffa did. His climate ‘science’ study extrapolated back to a time when there were no reasonably accurate thermometers, with which he could calibrate the ‘data’ from his (cherry-picked) tree rings. MEs are bloody inconvenient, wouldn’t you say?
Now I’ll try to explain the point that I attempted to make in my earlier post, but from a different angle.
Larry Fields says:
January 26, 2012 at 11:58 pm
Willis, you’re a truly extraordinary science writer, but in this case, you made a logic error that most of us could have made. An analogy will illustrate the point.
Case 1. When your son reaches his fourth birthday, start measuring his height every day. Use a state-of-the-art electronic height-measuring instrument, having sufficient readout resolution to give a slightly different value each time.
When his height stops growing at around 17 or 18 years of age, use this data to estimate the total growth after age 4. And for the sake of Science, throw in the usual uncertainty estimations.
Case 2. This is the fun part. Throw out the measurements from the 2nd, 4th, 6th, etc. days. Then do the calculations all over again. The second growth estimate will be the same as the first. Have you done an appreciably worse job (the 100*(1 – 1/SQRT(2)) ≈ 29% that the formula would suggest) of nailing down the uncertainty for the total height growth since age 4? No.
Why not? Because your son will have many barely measurable overnight growth spurts, and they do not happen every night. Measuring your son’s height every day does not contribute a whole lot to narrowing down the uncertainty estimate for the total height gain since age 4. In this case, the things that matter the most are the alpha and omega points.
The handy dandy Standard Error of the Mean formula (SEM) does not apply here. SEM only applies when you’re measuring THE SAME THING over and over again; and when that SAME THING does not monotonically increase over time, and it does not monotonically decrease over time. We’re definitely NOT doing that in my example.
And we’re not doing that with the Argo buoys either, unless they’re programmed for synchronous temperature measurements, and they never get knocked out of kilter. There are probably valid statistical methods for analyzing the Argo experiment, but I haven’t the foggiest idea what they are.
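Here is a quick toy version of the height example, with an invented growth curve and 2 mm of measurement noise:

```python
# Invented growth curve plus measurement noise: how much does throwing out
# every other daily reading change the estimate of total growth?
import numpy as np

rng = np.random.default_rng(3)
days = np.arange(0, 14 * 365)                            # age 4 to 18, one reading per day
true_height = 100 + 70 * (1 - np.exp(-days / 2000))      # an invented smooth growth curve, cm
measured = true_height + rng.normal(0, 0.2, days.size)   # plus 2 mm of measurement noise

daily_growth = measured[-1] - measured[0]
thinned = measured[::2]                                   # throw out every other day
thinned_growth = thinned[-1] - thinned[0]

print(f"total growth, daily readings:   {daily_growth:.2f} cm")
print(f"total growth, every other day:  {thinned_growth:.2f} cm")
# The two estimates differ only by the ~0.3 cm noise of the endpoint
# readings; the thousands of intermediate measurements added essentially
# nothing, which is why the standard-error-of-the-mean formula is the
# wrong tool here.
```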
What I do know is that stats methods are based upon mathematical theorems. A typical theorem has an if-part (hypothesis), and a then-part (conclusion).
Even though I’m no Statistician to the Stars, I am fairly skilled at recognizing when someone is applying the then-part of a theorem, when the if-part is not satisfied. Willis, color yourself busted. 🙂
Willis,
As a fisherman, you obviously have seen how a drifting object in the ocean attracts organic life, both plant and animal. While the effect of seaweed and schools of small fish may be minor, it increases with time.