Monthly Averages, Anomalies, and Uncertainties

Guest Post by Willis Eschenbach

I have long suspected a theoretical error in the way that some climate scientists estimate the uncertainty in anomaly data. I think that I’ve found clear evidence of the error in the Berkeley Earth Surface Temperature data. I say “I think”, because as always, there certainly may be something I’ve overlooked.

Figure 1 shows their graph of the Berkeley Earth data in question. The underlying data, including error estimates, can be downloaded from here.

Figure 1. Monthly temperature anomaly data graph from Berkeley Earth. It shows their results (black) and other datasets. ORIGINAL CAPTION: Land temperature with 1- and 10-year running averages. The shaded regions are the one- and two-standard-deviation uncertainties calculated including both statistical and spatial sampling errors. Prior land results from the other groups are also plotted. The NASA GISS record had a land mask applied; the HadCRU curve is the simple land average, not the hemispheric-weighted one. SOURCE

So let me see if I can explain the error I suspected. I think that the error involved in taking the anomalies is not included in their reported total errors. Here’s how the process of calculating an anomaly works.

First, you take the actual readings, month by month. Then you take the average for each month. Here’s an example, using the temperatures in Anchorage, Alaska from 1950 to 1980.

Figure 2. Anchorage temperatures, along with monthly averages.

To calculate the anomalies, from each monthly data point you subtract that month’s average. These monthly averages, called the “climatology”, are shown in the top row of Figure 2. After the month’s averages are subtracted from the actual data, whatever is left over is the “anomaly”, the difference between the actual data and the monthly average. For example, in January 1951 (top left in Figure 2) the Anchorage temperature is minus 14.9 degrees. The average for the month of January is minus 10.2 degrees. Thus the anomaly for January 1951 is -4.7 degrees—that month is 4.7 degrees colder than the average January.
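To make the procedure concrete, here is a minimal sketch of the anomaly calculation in Python. The temperatures are made up for illustration; they are not the actual Anchorage values from Figure 2.

```python
import numpy as np

# Illustrative only: three years of made-up "Anchorage-like" monthly temperatures (deg C).
# Rows are years, columns are calendar months (Jan..Dec).
temps = np.array([
    [-14.9, -11.0, -6.2, 1.5, 8.0, 12.5, 14.6, 13.2, 8.9, 1.2, -6.5, -12.0],
    [ -9.8, -12.3, -5.0, 2.1, 9.1, 13.0, 15.1, 13.9, 9.4, 0.5, -7.8, -10.4],
    [-12.1,  -8.7, -4.4, 0.9, 7.6, 12.1, 14.2, 12.8, 8.1, 1.8, -5.9, -11.3],
])

# The "climatology": the long-term average for each calendar month.
climatology = temps.mean(axis=0)

# The anomaly: each month's value minus that calendar month's average.
# (In the worked example above, -14.9 minus an average of -10.2 gives an anomaly of -4.7.)
anomalies = temps - climatology

print(np.round(climatology, 1))
print(np.round(anomalies, 1))
```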

What I have suspected for a while is that the error in the climatology itself is erroneously not taken into account when calculating the total error for a given month’s anomaly. Each of the numbers in the top row of Figure 2, the monthly averages that make up the climatology, has an associated error. That error has to be carried forwards when you subtract the monthly averages from the observational data. The final result, the anomaly of minus 4.7 degrees, contains two distinct sources of error.

One is the error associated with that individual January 1951 average, -14.9°C. For example, the person taking the measurements may have consistently misread the thermometer, or the electronics might have drifted during that month.

The other source of error is the error in the monthly averages (the “climatology”) which are being subtracted from each value. Assuming the errors are independent, which of course may not be the case but is usually assumed, these two errors add “in quadrature”. This means that the final error is the square root of the sum of the squares of the errors.

One important corollary of this is that the final error estimate for a given month’s anomaly cannot be smaller than the error in the climatology for that month.
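A small sketch of that combination, with made-up error values, shows why the combined error can never fall below either component:

```python
import numpy as np

def combine_in_quadrature(measurement_error, climatology_error):
    """Combine two independent 1-sigma errors as the square root of the sum of squares."""
    return np.sqrt(measurement_error**2 + climatology_error**2)

# Made-up error values, in degrees C, for illustration only.
error_in_months_average = 0.15   # error in the individual month's average
error_in_climatology    = 0.25   # error in the long-term monthly average being subtracted

total = combine_in_quadrature(error_in_months_average, error_in_climatology)
print(f"combined anomaly error = {total:.3f} C")   # ~0.29 C

# The combined error is at least as large as either component, so the anomaly's
# error can never be smaller than the error in the climatology.
assert total >= error_in_climatology and total >= error_in_months_average
```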

Now let me show you the Berkeley Earth results. To their credit, they have been very transparent and reported various details. Among the details in the data cited above is their estimate of the total, all-inclusive error for each month. And fortunately, their reported results also include the following information for each month:

Figure 3. Berkeley Earth estimated monthly land temperatures, along with their associated errors.

Since they are subtracting those values from each of the monthly temperatures to get the anomalies, the total Berkeley Earth monthly errors can never be smaller than those error values.

Here’s the problem. Figure 4 compares those monthly error values shown in Figure 3 to the actual reported total monthly errors for the 2012 monthly anomaly data from the dataset cited above:

Figure 4. Error associated with the monthly average (light and dark blue) compared to the 2012 reported total error. All data from the Berkeley Earth dataset linked above.

The light blue months are months where the reported error associated with the monthly average is larger than the reported 2012 monthly error … I don’t see how that’s possible.
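For what it’s worth, the check behind Figure 4 boils down to something like this; the numbers here are placeholders, not the actual Berkeley Earth values:

```python
# Placeholder numbers only; the real values come from the Berkeley Earth files.
climatology_error   = {"Jan": 0.16, "Feb": 0.15, "Mar": 0.12}   # error in the monthly average (C)
reported_total_2012 = {"Jan": 0.12, "Feb": 0.18, "Mar": 0.14}   # reported total anomaly error (C)

# If the climatology error were carried forward in quadrature, no month could show
# a reported total error smaller than its climatology error.
for month, clim_err in climatology_error.items():
    total_err = reported_total_2012[month]
    if total_err < clim_err:
        print(f"{month}: reported total {total_err} < climatology error {clim_err} (the 'light blue' case)")
```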

Where I first suspected the error (but have never been able to show it) is in the ocean data. The reported accuracy is far too great given the number of available observations, as I showed here. I suspect that the reason is that they have not carried forwards the error in the climatology, although that’s just a guess to try to explain the unbelievable reported errors in the ocean data.

Statistics gurus, what am I missing here? Has the Berkeley Earth analysis method somehow gotten around this roadblock? Am I misunderstanding their numbers? I’m self-taught in all this stuff and I’ve been wrong before, am I off the rails here? Always more to learn.

My best to all,

w.

266 Comments
Nick Stokes
August 18, 2013 12:58 pm

Willis,
“The point of your whole post seems to be that the error is small. First, we don’t know that overall, and it’s certainly not small in the Anchorage data. “
I agree that anomaly base error is not small relative to measurement error. My point is that it is inappropriate to make that comparison alone. They are different kinds of error.
We have two distinct uncertainties – one about the weather that was, and one about the weather that might have been. We actually calculate the anomaly using the weather that was. It’s a weighted sum of a whole lot of monthly averages, in turn sums of daily readings, and our uncertainty about that is mainly in how well those numbers were measured. So if May 2012 Anchorage is in that sum, how uncertain are we about that figure? That collected uncertainty is what Fig 3 expresses.
The anomaly error you are writing about is not concerned with the weather that was. We know the weather that was in 1951-80 (subject to measurement error).
The weather that might have been is what we need to think about when making deductions about climate. For anomalies, what if 1951-80 had by chance been a run of hot years? Now, instead of just a bunch of numbers that we were wondering about whether we’d measured them right, we’re thinking of them as an instance of a stochastic process. And we need a model for that process. Once we have that, and not before, we can calculate the anomaly error that you describe. That is the weather that might have been.
But once we do that, we are dealing with the other uncertainties about the weather that might have been – basically estimated by month-month variation. Eyeballing Anchorage, that looks like about 2°C. That’s when you need to worry about anomaly base error, and it’s a fairly fixed (and small) proportion. For any calc in which you are treating the temps as an instance of a stochastic process, the anomaly base error is about 3% added to the monthly anomaly instance. It’s quite large relative to measurement error, but put another way, measurement error is a small part of your uncertainty about climate.

August 18, 2013 1:02 pm

This is a test. Have I been banned?

Coldlynx
August 18, 2013 1:12 pm

To allow for future improvements in measuring technology, wouldn’t it be useful to use absolute degree-hours to calculate the monthly average: K x h summed for each month? More frequent readings would then show up as a more exact value, and old values would still be usable. Then add the monthly variance when presenting the resulting trends, to give the reader a sanity check on whether the trend is within normal climate variance.
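If I follow the suggestion correctly, a minimal sketch of the degree-hour approach might look like this; the hourly temperatures and the skewed diurnal shape are invented for illustration:

```python
import numpy as np

# Made-up hourly temperatures for one 30-day month, in kelvin, with a skewed diurnal
# shape (warm afternoons, flat nights) so that (Tmax+Tmin)/2 misses the true mean.
hours = np.arange(30 * 24)
temps_K = 263.0 + 4.0 * np.maximum(0.0, np.sin(2 * np.pi * hours / 24))

# Degree-hours for the month: sum of K x h (each reading represents one hour).
degree_hours = temps_K.sum()
mean_from_degree_hours = degree_hours / len(hours)

# The traditional two-readings-a-day style average from daily extremes, for comparison.
daily = temps_K.reshape(30, 24)
mean_from_extremes = ((daily.max(axis=1) + daily.min(axis=1)) / 2).mean()

print(f"degree-hour mean: {mean_from_degree_hours:.2f} K, (Tmax+Tmin)/2 mean: {mean_from_extremes:.2f} K")
```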

August 18, 2013 1:12 pm

richardscourtney said August 18, 2013 at 7:32 am

Hence, each determination of global temperature has no defined meaning: it is literally meaningless. And an anomaly obtained from a meaningless metric is meaningless.

It amuses me that those engaging in this numerological exercise so often claim that medieval philosophers spent their time debating how many angels could dance on the head of a pin, a claim for which there is absolutely no evidence.

Carrick
August 18, 2013 1:17 pm

Since it’s been brought up a few times, there was a bit of discussion on Jeff’s blog of Pat Frank’s E&E paper. This is a good place to start.
Lucia had a follow-up here.
This is another place where the basic issues surrounding measurement uncertainty in absolute versus relative temperature are explored.

Pamela Gray
August 18, 2013 1:42 pm

I think at issue here is also that an anomaly value taken from an average of data does not transfer the range of that raw data to the anomaly calculation. That range is not an error in the common understanding of the term; it is just the range of temperatures collected at all points at one time within a climate region (for example, NWS’ recently improved delineation of weather pattern regions in the US). To be sure, the error in calibration is relatively small compared to the range of temperatures collected for a day within a single weather pattern region. Maybe raw data needs to be grouped into weather pattern regions to calculate that range and go from there. Is there a way to transfer the raw data range into the anomaly?

Jeff Condon
August 18, 2013 2:19 pm

Pat,
My critical position hasn’t changed. One thing you did do, that a lot of your critics don’t, is put your work out there and I hope people understand that is no small thing. It would be good to see more effort put into the data but we are not government paid climate citizens.

Pamela Gray
August 18, 2013 2:28 pm

Note to self – USE THE PREVIEW BUTTON!

X Anomaly
August 18, 2013 3:18 pm

“Is there a way to transfer the raw data range into the anomaly?”
Bob Tisdale calculates weekly anomalies all the time. If you had daily data (and lots of it), you could calculate daily anomalies. If you had hourly data, say 10 years worth, or 87600 points of data, then yeah sure, you could do hourly anomalies as well.
I always find it fascinating that with just about all climate data anomalies there is an underlying cycle /trend (diurnal, seasonal) that is removed so that we can see the departures from what is considered ‘normal’. Sea Ice is an interesting anomaly recently because now even the anomalies themselves have a cycle embedded in them (recurring loss of summer sea ice). Maybe removing this new ‘recurring loss’ sea ice from the data might highlight something interesting (i.e. an anomaly of an anomaly!!!).
As for all this stuff about errors, it’s given me a headache! I will only add that when removing a recurring underlying cycle from the data, such as when calculating anomalies, the standard deviation for each month will be exactly the same whether it is an anomaly or not. So in essence the (monthly) data are unchanged.
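That last point is easy to verify numerically: subtracting a per-month constant shifts the values but leaves each month’s standard deviation untouched. A minimal sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
january_temps = -10.2 + 2.0 * rng.standard_normal(30)   # 30 made-up January averages (C)

# Subtract the (constant) January climatology to get January anomalies.
january_anomalies = january_temps - january_temps.mean()

# The standard deviation is unchanged by subtracting a constant.
print(january_temps.std(), january_anomalies.std())   # identical
```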

bw
August 18, 2013 3:42 pm

Nice responses by Proctor, Istvan, Riser and Frank. Most of the “data” were never intended to have any scientific validity for a global analysis. Hoffer’s conclusion seems apt.
Note also the substantial boundary layer effects on surface thermometers reported by Pielke, Sr.
http://pielkeclimatesci.files.wordpress.com/2009/10/r-321.pdf
The pielke paper has already examined most of this subject and more. He recommends rejecting most of the data, and only using Ocean Heat Content for analysis.
Another conclusion is that the daily Tmax would be more appropriate than the Tave for detecting global-scale temperature changes.

ZT
August 18, 2013 4:20 pm

In science (sorry to mention that largely taboo subject) you model known information and show that your treatment/theory/etc. can explain known observation. When you have convinced yourself (and others) of the validity of your treatment, you then make predictions. Applying this ‘scientific’ approach to the discussion here, one would have various options. For example, you could generate some synthetic temperature data for a set of geographically distributed measurement sites with known variations, trends, noise, etc. based on random numbers (aka Monte Carlo), and show that the error estimates and trends recovered by your analysis match those used in creating the synthetic input. Or you could divide your input information in half and show that the withheld half never falls outside the projected estimates obtained from the half fed into your procedure, etc. (This might in fact turn out to be a reasonable use of a GCM – see if the land temperature analysis can recover the inputs to the GCM…).
From what I’ve read here, the true believers (and funding recipients) are (as usual) trying to convince the unwashed masses by shouting rather than showing how it is that their analysis can magically reduce the errors in inputs. How about providing a demonstration? (And no – I don’t have to provide a demonstration – the climatologists are trying to ‘prove’ something – not me!).
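A minimal sketch of the kind of synthetic-data test being suggested, with invented trend and noise values: put a known trend into noisy synthetic anomalies, run an ordinary trend fit, and count how often the stated 95% interval actually covers the truth.

```python
import numpy as np

rng = np.random.default_rng(7)
n_months, true_trend, noise_sd = 360, 0.0015, 2.8   # made-up values (C/month, C)
t = np.arange(n_months)

hits, n_trials = 0, 1000
for _ in range(n_trials):
    series = true_trend * t + rng.normal(0.0, noise_sd, n_months)   # synthetic anomalies
    # Ordinary least-squares slope and its standard error.
    slope, intercept = np.polyfit(t, series, 1)
    resid = series - (slope * t + intercept)
    se = np.sqrt(resid.var(ddof=2) / ((t - t.mean())**2).sum())
    # Does the nominal 95% interval cover the trend we actually put in?
    if abs(slope - true_trend) < 1.96 * se:
        hits += 1

print(f"95% intervals covered the true trend in {hits}/{n_trials} trials")   # should be near 950
```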

August 18, 2013 5:01 pm

Interesting response.
Regarding the use of quadrature to reduce error in final (averaged) results: it seems a number agree with me that quadrature is appropriate only when there are multiple readings of the same fixed parameter using the same devices and the same procedures. Using quadrature when combining data from 1500+ land stations and 3500 mobile ARGO sea stations is therefore inappropriate: the error one should use in these is that of the average individual station, not the square root of the increase in data (1/10th the error with 100X the data?).
This is a fundamental consideration. Many commenters have not addressed this even as a response to a comment. I ask the learned statisticians: is this a correct position wrt statistical analyses of a parameter with varying values, using varying instruments?
I know that “adjustments” are made to address the varying instruments and procedures, but there are a limited number of issues here, and again the lack of “sameness” applies: thousands of stations that are independent, each having its own error potential.
It is only possible to use the quadrature procedure if you believe that every station, even if unique, is consistent in its error. If a station is sometimes high, sometimes low, even with the same actual temperature. If, instead, human or machine read, the +/- means that if the temperature event occurred again, it would be either reading as higher or lower within the band OR that a temperature event within the high/low would give the same reading, you are stuck with the minimum error reading of, at best, the P50 group.
What I see is a disconnect within the concept of error: we get an error-bar for temperature, sea-level or Mann’s dendrochronology, and then behave as if the center is the correct value down to hundredths of a part. We don’t respect that at any given time the actual event measurement may be at the top or at the bottom, that a strange, random walk may be occurring within the error bars that actually defines the true state of affairs. If this is so, then a small variation – like the ARGO deep sea numbers – would be statistically meaningless, because it would be smaller than the recognized random walk of the world/universe/temperature history of the Earth as a globe and in its parts.
Comments?
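One way to see the point about independence is a toy simulation with made-up station counts and error sizes: purely random, independent station errors do shrink as 1/sqrt(N) when averaged, while an error shared by every station does not shrink at all.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stations, n_trials = 1500, 1000
random_sigma, shared_sigma = 0.5, 0.2   # made-up per-station and shared error sizes (C)

# Case 1: every station has its own independent random error.
independent_only = rng.normal(0.0, random_sigma, (n_trials, n_stations)).mean(axis=1)

# Case 2: the same independent errors plus one systematic error common to all stations.
shared = rng.normal(0.0, shared_sigma, (n_trials, 1))
with_shared = (rng.normal(0.0, random_sigma, (n_trials, n_stations)) + shared).mean(axis=1)

print(f"1/sqrt(N) theory, independent errors : {random_sigma / np.sqrt(n_stations):.4f}")
print(f"simulated, independent errors        : {independent_only.std():.4f}")
print(f"simulated, with a shared error       : {with_shared.std():.4f}   # stays near the shared error size")
```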

Crispin in Waterloo
August 18, 2013 6:21 pm

P
Final paragraph is quite right. I see it all the time in crazily correct climate claims (C^4).
@matsibengtsson
“All rifles tend to go a little high to the right. You do not have a better clue of where the bullseye where, as long as there is a systematic error.”
That is an accuracy problem – the shooter should have compensated for a known issue – it should have shown up in calibration. That begs the question about who is doing the calibrations and how well are they doing it. Think about the usefulness of calibrating really well and getting a really precise measurement of a temperature that is strongly affected by UHI.
Measurement systems require system-level, rational analysis. I see great opportunities for mathematical sleight-of-hand when passing casually back and forth between anomalies. Willis is really poking the right dog here.
When I see someone in the lab measure something with a thermocouple that is precise to 0.02 degrees and then [black box, hum-haw, kerfuffle, kersproing] Presto! Out comes a reading to 6.5 digit precision! Then I know there is a carpet bag in the cloak room.

David L. Hagen
August 18, 2013 6:23 pm

Pat Frank notes August 17, 2013 at 3:08 pm:

However, sensor field calibration studies show large systematic air temperature measurement errors that cannot be decremented away. These errors do not appear in the analyses by CRU, GISS or BEST.

GlynnMhor August 17, 2013 at 3:34 pm
re: “systematic error is present and cancels out”
Pat Frank observes: Aug 18, 11:05 am

That is the case in air temperature measurements, for which the effects of insolation, wind speed, and albedo, among others, are all uncontrolled and variable. In these cases, systematic error can increase or decrease with N, but the direction is always unknown.

Watts et al. 2012 find

1) Class 1 & 2 stations 0.155 C/decade
2) Class 3, 4 & 5 stations 0.248 C/decade
3) NOAA final adjusted data 0.309 C/decade.

Further thoughts:
1) The Urban Heat Island effect (UHI) increases with population over time.
2) UHI increases with increasing energy use over time.
3) UHI for a station can change drastically with changing station microclimate.
E.g., by relocation nearer asphalt, walls, air conditioners etc.
Or by adding air conditioners, or changing ground cover to concrete, cinder or asphalt.
4) UHI overall decreases as NOAA withdraws class 5 stations.
5) The UHI input to Type B error shows up in rising Tmin greater than Tmax.
6) NOAA adjustments to data adds the equivalent of ~0.61 C/decade Type B error.
7) With increasing population and relocating stations etc, UHI varies and generally increases with time.
The majority of such UHI errors appear to be Type B errors, which do NOT decline as 1/sqrt(N) the way Type A errors do. It would help to distinguish and break out Type A and Type B errors and trends.
With nonlinearly increasing time varying Type B errors, all the Type B errors will NOT cancel out when using a single long term average.
Willis
For the formal official international definitions and full equations, see JCGM 100:2008 and NIST TN1297 (1994) references below.
From my cursory review, I think time-varying Type B variances need to be accounted for and distinguished from Type A variances to account for the varying UHI effects above, and for how these in turn affect the mean used when calculating the anomalies. I will let you more mathematically inclined types dig into the details.
Regards
David
Uncertainty Analysis
Evaluation of measurement data – Guide to the expression of uncertainty in measurement. JCGM 100: 2008 BIPM (GUM 1995 with minor corrections) Corrected version 2010
Note the two categories of uncertainty:
A. those which are evaluated by statistical methods,
B. those which are evaluated by other means.
See the diagram on p53 D-2 Graphical illustration of values, error, and uncertainty.
Type B errors are often overlooked. E.g.

3.3.2 In practice, there are many possible sources of uncertainty in a measurement, including:
a) incomplete definition of the measurand;
b) imperfect realization of the definition of the measurand;
c) nonrepresentative sampling — the sample measured may not represent the defined measurand;
d) inadequate knowledge of the effects of environmental conditions on the measurement or imperfect measurement of environmental conditions;
e) personal bias in reading analogue instruments;
f) finite instrument resolution or discrimination threshold;
g) inexact values of measurement standards and reference materials;
h) inexact values of constants and other parameters obtained from external sources and used in the data-reduction algorithm;
i) approximations and assumptions incorporated in the measurement method and procedure;
j) variations in repeated observations of the measurand under apparently identical conditions.
These sources are not necessarily independent, and some of sources a) to i) may contribute to source j). Of course, an unrecognized systematic effect cannot be taken into account in the evaluation of the uncertainty of the result of a measurement but contributes to its error.

Type B uncertainties including unknown unknowns could be comparable to Type A uncertainties.
Furthermore:

when all of the known or suspected components of error have been evaluated and the appropriate corrections have been applied, there still remains an uncertainty about the correctness of the stated result, that is, a doubt about how well the result of the measurement represents the value of the quantity being measured.

( See also Barry N. Taylor and Chris E. Kuyatt, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST TN1297 PDF)
The Type A standard uncertainty of the mean of n measurements is the square root of (1/(n*(n-1))) times the sum of the squared deviations from the mean.
See JCGM 100 2008 4.2.2 & 4.2.3, or TN1297 Appendix equation (A-5)
The Type B standard uncertainty of measurements depends on the distribution of the uncertainty but NOT on the number of measurements. See JCGM 100:2008 4.3.7 or TN1297 Appendix equation (A-7)
5.1 The combined standard uncertainty of a measurement result,

suggested symbol uc, is taken to represent the estimated standard deviation of the result. It is obtained by combining the individual standard uncertainties ui (and covariances as appropriate), whether arising from a Type A evaluation or a Type B evaluation, using the usual method for combining standard deviations. This method, which is summarized in Appendix A [Eq. (A-3)], is often called the law of propagation of uncertainty and in common parlance the “root-sum-of-squares” (square root of the sum-of-the-squares) or “RSS” method of combining uncertainty components estimated as standard deviations.
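In symbols, the two formulas referenced above amount to the following (GUM notation; the combined uncertainty is shown for the simple case of independent components with unit sensitivity coefficients):

```latex
% Type A standard uncertainty of the mean of n repeated observations (GUM 4.2.3, TN1297 Eq. A-5):
u_A = \sqrt{\frac{1}{n(n-1)} \sum_{k=1}^{n} \left(q_k - \bar{q}\right)^2}

% Combined standard uncertainty, simple case of independent components with unit
% sensitivity coefficients (the "root-sum-of-squares" of GUM 5.1, TN1297 Eq. A-3):
u_c = \sqrt{u_A^2 + u_B^2}
```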

PS JCGM 100:2008 0.7 “There is not always a simple correspondence between the classification into categories A or B and the previously used classification into “random” and “systematic” uncertainties. The term “systematic uncertainty” can be misleading and should be avoided.”

LdB
August 18, 2013 7:02 pm

@Pamela Gray says:
August 18, 2013 at 12:51 pm
Any kind of filter at all, removes the most important part of the data series in my opinion.
The problem is that you then miss a very important scientific fact: a signal, as opposed to noise, cannot be removed by just any filter; it needs a “specific filter”, because a signal is not at all random.
This problem has a very important historic overtone .. look at the Holmdel horn antenna.
http://en.wikipedia.org/wiki/Holmdel_Horn_Antenna
What Arno Penzias and Robert Wilson were trying to work out is why they couldn’t silence and filter the noise out of the Holmdel horn antenna. They realized it was impossible because it was a signal, and a signal is not random; you cannot filter it out without understanding how the signal is distorting your data. It led to one of the most important signals of all time, the cosmic background radiation, and the signal was everywhere.
To deal with a signal and filter it out, you need to understand the signal; then you can construct a filter to remove it, because it isn’t a random effect. It’s not +0.5 here and -0.5 there; it has a pattern, and that pattern is not random.
The only way you could ignore the effect is if you were absolutely sure that the interfering signals were very small, and you could even assign an error value to them, so you would say that the background chaotic signals had a value of +/- ? degree. For example, in a normal radio scenario the CMBR is unimportant because the signal is so large compared to it, but as you saw, once you start looking for signals from deep space it became a problem. The CMBR level varies over the frequency range, and that itself also became important, but that’s another story.
The same is probably true of the climate signal there will be different chaotic signals at different frequencies and so you will need to understand the background to remove them.
Eventually, Pamela, if climate science is right, and I suspect it is at least somewhat right, the climate signal will grow and you will obviously be able to separate it from the background (the radio equivalent here on earth). At the moment, however, you are in deep-space-signal mode, looking for a very small signal with other signals, possibly of equal size, mixed into the back of it.

Geoff Withnell
August 18, 2013 7:08 pm

Calibration is comparison to a standard. What is the standard? Remember, we are using the location of the bullet holes (the temperature readings from the instruments) to determine the location of the bullseye (the “true” temperature). So any calibration compensation would be based on what information? How do you know what direction to move the actual instrumental temperature readings to get them to center on the “correct temperature”?

August 18, 2013 7:15 pm

OK, so I want to retract my statement above that the errors in monthly averages would affect only the intercept. This was a clear case of having the wrong mental picture: since the errors in monthly averages are all different, they add a 12-month periodic error sequence to the anomalies.
However, I’ve done some experiments that convince me that any reasonable-sized error in the monthly averages will have negligible effect on the trends in anomalies.
Here are the experiments. Take the BEST Anchorage, AK data from 1900 on (no missing values from then on). Compute the trendline.
Now, generate a random set of 12 monthly errors. I used a normal distribution mu = 0, sigma = 0.1 on the theory that this is worst-case: since there is a 30-day sample period per month, even one year’s worth of data should have an error in the mean of less than 0.1 per month.
Now systematically add those random errors to the monthly anomalies and recompute trends. I did this for the following time periods:
1900 – 1902
1900 – 1910
1900 – 1920
1900 – 1930
for 100 runs for each time period.
Results:
1900 – 1902. Baseline trend: -0.0253. Average of trends with error: -0.0256. Std dev: 0.0020
1900 – 1910. Baseline trend: -0.0026. Average of trends with error: -0.0026. Std dev 8E-5
1900 – 1920. Baseline trend: 0.002859. Average of trends with error: 0.002861. Std dev 2E-5
1900 – 1930. Baseline trend: 0.00406. Average of trends with error: 0.00406. Std dev 9E-6.
I conclude from this that an error of size 0.1 in the monthly averages — and this is large given a 30-day sample period, I would imagine — is negligible in terms of error in trends.
Thoughts?
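For anyone who wants to reproduce the spirit of this experiment, here is a minimal sketch in Python. It uses a synthetic anomaly series with roughly Anchorage-like month-to-month variability rather than the actual BEST file, so the exact numbers above will not be reproduced.

```python
import numpy as np

rng = np.random.default_rng(42)

def trend(y):
    """OLS slope of y against its time index (units per month)."""
    t = np.arange(len(y))
    return np.polyfit(t, y, 1)[0]

# Synthetic stand-in for a monthly anomaly series, 31 years long, with a small trend
# and large month-to-month scatter (both values invented).
n_years = 31
anomalies = 0.002 * np.arange(12 * n_years) + rng.normal(0.0, 2.8, 12 * n_years)

baseline = trend(anomalies)

# Monte Carlo: perturb the 12-value climatology by N(0, 0.1) errors and recompute the trend.
perturbed_trends = []
for _ in range(100):
    monthly_errors = rng.normal(0.0, 0.1, 12)
    perturbed_trends.append(trend(anomalies + np.tile(monthly_errors, n_years)))

print(f"baseline trend              : {baseline:.5f} C/month")
print(f"mean of perturbed trends    : {np.mean(perturbed_trends):.5f}")
print(f"std dev of perturbed trends : {np.std(perturbed_trends):.6f}")
```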

LdB
August 18, 2013 7:43 pm

The problem is simple, Jeff, and it is the problem Pamela above is facing: [] your massive assumption is that the error is random.
I am going to introduce a new small error so the problem is your shooter is now shooting outdoors and the target is a long way away in the open outdoors.
Can you run your analysis in such a situation without also recording wind strength, which will affect the result of the bullet’s flight?
Think hard about what happens to your filtering of the data from the above.
Do you see what happens? The accuracy of the shot will vary with wind speed, and that isn’t random; you can’t filter it out, and you aren’t guaranteed to shoot on 50% windy days and 50% still days.
You do your technique and you end up with complete rubbish: if I shot on all windy days you get one result, and if I shot on all still days you get a totally different one. Your error term is totally dependent on another signal.
That’s the problem of interfering signal noise versus random noise. You are all assuming you only have random noise in the data, but you need to be damn sure of that given the level of accuracy you are trying to work at.

August 18, 2013 8:04 pm

@all: The time intervals above are “off-by-one.” They should read
1900 – 1901 (inclusive)
1900 – 1909
1900 – 1919
1900 – 1929
@LdB: I don’t know much about shooters and wind. However, a normal distribution of errors *for the monthly averages* is not a necessary assumption. The reason is simple, and has to do with why my experiment gives such consistent results.
Take the BEST data from 1900 to present and run a linear regression with your favorite stats package. I happen to be using JMP. Here is the output:
T(t)=β_1 t+β_0+ε_t
Results:
β_1=0.0013 95% CI:[0.0009,0.0017]
r^2=0.029 σ=2.8609
Notice two things about this. First, the variability is huge — 2.86 even with 1359 observations. This means that although the confidence interval for the mean is very tight, the confidence interval for individual estimates is very wide.
The second thing to notice is an obvious corollary: r^2 is tiny, laughably so.
The accompanying scatter plot tells the tale. The variability is humongous (and is approximately normally distributed).
In light of that large variability, a small error in the monthly averages is meaningless. That’s really the story here. If we add in quadrature the natural variability of the anomalies together with the error in the monthly averages, we essentially get the natural variability.
So I could change my distribution to be anything at all and get pretty much the same result. I bet you 10 quatloos.
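The quadrature point at the end is easy to put in numbers; the 2.86 figure is the residual standard deviation quoted above, and the 0.1 is the assumed worst-case climatology error:

```python
import numpy as np

natural_variability = 2.8609   # residual std dev of the monthly data, quoted above (C)
climatology_error   = 0.1      # assumed worst-case error in a monthly average (C)

combined = np.sqrt(natural_variability**2 + climatology_error**2)
print(f"{combined:.4f} C")   # ~2.862 C: essentially indistinguishable from the natural variability
```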

JFD
August 18, 2013 8:42 pm

Very interesting, Willis. Good work as usual. You stirred them up with this one. The first thing I learned as a process plant engineer is that you can’t average temperatures. The enthalpy changes with temperature. The earth’s temperature ranges from minus 75 F to 145F. One would have to make a substantial enthalpy correction as well when dealing with a gas, in this case mostly N2 +O2. Even with an enthalpy correction, the answer would still be subject to the two readings a day problem.
From time to time I look at the 29 temperature station readings within a 15-mile radius of my home. The wind is dead calm across the area. The elevation is 200 feet plus or minus 20 feet. The temperature stations are mostly Rapid Fire, with a few Madis and a few Normal. The area is pine trees with openings for many homes, a few light manufacturing sites, paved roads and a concrete interstate highway, one city, one small town, several shopping centers and some pasture land for cattle. This time the readings in F were, starting at the highest:
1. 88, 88 =2
2. 84.2, 84 = 2
3. 83.8, 84, 83.9, 83.7, 83.3 = 5
4. 82.4, 82.2, 82.9, 82.4 = 4
5. 81.9, 81.2 = 2
6. 80.6, 80.1, 80.1, 80.7, 80.6 = 5
7. 79 = 1
8. 78.1, 78.4 = 2
9. 77, 77.9 = 2
10. 76.7 = 1
11. 73.2 = 1
You could throw out the two high readings and the two low readings and the spread is still almost 7F. You could weight average the readings and narrow the spread but the difference would still be about 4F.
All of the palaver about errors is simply bull palaver. Using homogenization and averaging a series of temperature readings is for people who haven’t had to work out in the weather, in the hot, in the cold and sometimes in the nice. Those who have know that weather is variable from season to season, year to year and decade to decade.

LdB
August 18, 2013 9:12 pm

Cagle
@LdB: I don’t know much about shooters and wind. However, a normal distribution of errors *for the monthly averages* is not a necessary assumption. The reason is simple, and has to do with why my experiment gives such consistent results.
But you are missing the point: you are assuming a controlled environment, and this is real-world data; you may only be getting consistency based on a fallacy.
Let’s extend the problem: initially the shooters only shot on fine days, because windy days make their sights wobble, so they avoid shooting on windy days. Later, sights are improved and they shoot on windy days as well as sunny days, and suddenly your neat assumption goes to pieces.
Anthony’s own urban heat island argument is a classic example of this sort of problem: you need to understand the problem properly; you can’t just assume you can average it away.
This is why particle physics, radio communications and telescope observatories study their backgrounds in detail; they actually study them almost as much as they study the signals.
Some interesting stories on background noise controls recently and how they deal with them
Ethan Segal on why Earth telescopes fire lasers into space
http://scienceblogs.com/startswithabang/2013/07/24/why-observatories-shoot-lasers-at-the-universe/
Tommaso Doringo on which is more sensitive to the Higgs ATLAS or CMS
http://www.science20.com/quantum_diaries_survivor/atlas_vs_cms_higgs_results_which_experiment_has_more_sensitivity-113044

August 18, 2013 10:34 pm

Crispin in Waterloo says:
August 18, 2013 at 6:21 pm
“…That is an accuracy problem – the shooter should have compensated for a known issue – it should have shown up in calibration…”
Lots of dangerous assumptions there. They did not know; they had belief in CRUT. Thus the assumption you made about finding the bullseye is faulty. And all attempts to find the removed bullseye fail due to the unhandled, non-ignorable systematic errors behind it. The more systematically bad CRUT models that were used, the worse your error becomes, since you assume their models were correct. If you had realised some models were wrong, and known which bullet holes came from which rifle, and adjusted for which models were the worst, you would be better off than assuming they all were the same. But you would still have to know the systematic error to get it right.
The issue was not known at the time. All rifles had come from one factory, “CRUT” (Center of Rifles Used for Testing), and it was believed all their models were correct. But all tests were done at the internal shooting range, where the wind blew from up and to the right, so all models at CRUT tended to shoot up to the right.
— Mats —

August 18, 2013 11:22 pm

Willis: On target!!
 
I know this has been discussed ad nauseam before, to my mind without viable solution;
And no, I am not requesting that the issue needs resolution. I’m just adding my nervousness to the overall worry about temperature and anomaly errors.
Every record of temperature starts as a temperature recorded during a 24 hour period called a day.
These 24 hour periods are problematic in themselves as ‘sunlight hours’ are retimed to better correlate with ‘business hours’ within localities. (I just plain take my watch off in Arizona and use the local clocks or I ask locals what time it is.)
 
Every one of these temperature recordings has its own error possibilities, only some of which are ever exactly recorded as meta-data.
 
Now I understand Willis, that for this thread you’re focusing on the monthly anomalies, but as other process analysts, e.g. Gail Combs, have mentioned errors are carried forward. Contributing errors towards a monthly anomaly error range are; station, instrument, individual and daily errors.
Adding to the error morass are the mass ‘adjustments’ made to temperature records, supposedly correcting for some of these sample errors. Every adjustment adds to the error measurement; unless each adjustment is verified and validated against a known error in a specific measurement.
 

“Nick Stokes says: August 17, 2013 at 2:52 pm
Willis,
…The error you are discussing won’t affect trends. “

 
Trends? This hour? Today’s? This week? This month? This year? Or that magic 30 year trend?
If an error is significant enough to affect the trend within a day, it affects the monthly and yearly trend. Raising the view of the trend to the highest levels does not eliminate or correct errors; it only masks them.
Assuming that errors cancel out is an assumption requiring proof, for every instance. Dismissal of discussion fails to provide the proofs.
 
Or is the inference in this statement that errors do not matter unless the trend is affected?
Then why calculate an anomaly at all? Temperatures themselves will give similar trends, only there is no ‘average’ then, just the absolute of the temperature.
 
In a way, weather is presented that way already. e.g. Today’s high was 93°F (33.9°C). Our record high (97°F, 36°C) for this day was recorded in 1934. Frankly, I think this easily beats the alarmist statements that our ‘trend’ is xx°C higher and we are all going to die as it gets hotter. The latter statement informs me of nothing, while the former clearly tells me that today is within reason, normal. And that normal could be hotter, could be colder, could be, could be, could be; none of them disastrous.
 
Which leaves me with the suspicion that trends are the sneaky pie charts for climatology.
 

“@matsibengtsson
“All rifles tend to go a little high to the right. You do not have a better clue of where the bullseye where, as long as there is a systematic error.”

“That is an accuracy problem – the shooter should have compensated for a known issue – it should have shown up in calibration. That begs the question about who is doing the calibrations and how well are they doing it. Think about the usefulness of calibrating really well and getting a really precise measurement of a temperature that is strongly affected by UHI…”


 
From a shooter’s perspective, no.
 
Adjustment for shooting can be taken in two formats. Physical adjustment of the firearm, sights, optics, shooting stance, breathing, trigger squeeze… Or mental adjustment, as in noticing the range wind sock indicates the wind just shifted and ‘Kentucky windage’ in aiming is applied.
All of these ‘adjustments’ can be made ‘on the fly’ as so many do, or they can be noted in a journal for future reference in improving one’s ability to shoot well.
 
All of which ignores what ‘bench shooters’ do while shooting versus, say the Olympic competitions. The latter try and place their shots within the X ring (center, so to speak). The former avoid shooting out their aiming point, bullets in the X ring are immaterial; group size is everything. There are reasons why, but no reason to explain them here.
The point is there are different kinds of shooters, who shoot their targets entirely differently. What is a month’s worth of targets worth if different shooters are shooting, under different conditions on different days?
 
We’re back to that meta-data issue; without accurate detailed meta-data, the data itself is suspect. All adjustments, transforms, stats are just a form of mystic passes hoping that the end result is better than the beginning result.
 

“Tom in Florida says: August 18, 2013 at 5:57 am
This may not be relevant to this thread but I have always wondered why monthly temperature measurements are grouped by man made calendars. Wouldn’t it make more sense to compare daily temperatures over a period by using celestial starting and ending points so they are consistant over time?. The Earth is not at the same point relative to the Sun on January 1st every year, will this type of small adjustment make any difference? Perhaps full moon to full moon as a period to average?”

 
Time as mankind defines it into 24 hour periods, 365 days is a celestial calendar of sorts. Mankind marks their time using regular intervals now measured using a cesium fountain clock, e.g. NIST-F1. That doesn’t mean everything keeps the same measurements.
 
Your question echoes some other issues I have with anomalies. Neither climate nor weather is influenced by mankind’s calendar. Climate is affected by earth’s conditions while whirling around our sun.
Exactly why are summer’s hottest days or winter’s coldest days after the solstices? Well, the current theory is about the local presence of water.
 
What is missing from the whole daily/weekly/monthly temperature anomaly scenario is any allowance for climate as it relates to seasonal progression. If this year’s winter ended six weeks early, doesn’t that send a spike into six weeks of temperatures?
 
A rogue six weeks of hot spring weather is not the question though. The question is, given what little we know about climate; how do we identify, quantify and account for all cycles in climate?
 
Truth is, we can’t yet. Our measurements are minimal and crude, our records maintenance brutal and harsh, our information demands outrageous. We have distinct issues with identifying short weather cycles let alone understanding short climate cycles.
 
Yup, mystic passes.
Keep working the issues Willis!
PS It is also great to see so many gifted math folks threshing this issue out! Yes, this also means Steve Mosher and Nick Stokes. Terrific discussion!

August 18, 2013 11:31 pm

As long as “temperatures” and their “anomalies” continue to be used to demonstrate (or not) “global warming”, our knickers will remain in the proverbial twist on this whole subject. Heat energy can be measured directly by satellites, so why persevere with temperatures, a man-made proxy, devised to determine how relatively hot or cold it is?

richardscourtney
August 19, 2013 12:30 am

Crispin in Waterloo:
At August 18, 2013 at 6:21 pm you say

That begs the question about who is doing the calibrations and how well are they doing it.

YES! Nobody is “doing the calibrations” because no calibration is possible.
Please see my post at August 18, 2013 at 7:32 am. To save you needing to find it, I copy it here.

Willis:
Your observation is good.
However, there is a more basic problem; viz.
there is no possible calibration for global temperature because the metric is meaningless and not defined.
This problem is not overcome by use of anomalies. I explain this as follows.
Each team preparing a global temperature time series uses a different method (i.e. different selection of measurement sites, different weightings to measurements, different interpolations between measurement sites, etc.). And each team often alters the method it uses such that past data is changed;
see e.g. http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
Hence, each determination of global temperature has no defined meaning: it is literally meaningless. And an anomaly obtained from a meaningless metric is meaningless.
If global temperature were defined then a determination of it would have a meaning which could be assessed if it could be compared to a calibration standard. But global temperature is not a defined metric and so has no possible calibration standard.
A meaningless metric is meaningless, the errors of an undefined metric cannot be determined with known accuracy, and the errors of an uncalibrated measurement cannot be known.
The errors of a measurement are meaningless and undefinable when they are obtained for a meaningless, undefined metric with no possibility of calibration.
Richard

This thread is equivalent to discussion of the possible errors in claimed measurements of the length of Santa’s sleigh.
A rational discussion would be of why determinations of global temperature are not possible and ‘errors’ of such determinations are misleading.
Richard
