Errors in Estimating Temperatures Using the Average of Tmax and Tmin—Analysis of the USCRN Temperature Stations

Guest post by Lance Wallace

Abstract

The traditional estimate of temperature at measuring stations has been to average the highest (Tmax) and lowest (Tmin) daily measurements. This leads to error in estimating the true mean temperature. What is the magnitude of this error and how does it depend on geographic and climatic variables? The US Climate Reference Network (USCRN) of temperature measuring stations is employed to estimate the error for each station in the network. The 10th-90th percentile range of the errors extends from -0.5 to +0.5 C. Latitude and relative humidity (RH) are found to exert the largest influences on the error, explaining about 28% of the variance. A majority of stations have a consistent under- or over-estimate during all four seasons. The station behavior is also consistent across the years.

Introduction

Historically, temperature measurements used to estimate climate change have depended on thermometers that record the maximum and minimum temperatures over a day. The average of these two measurements, which we will call Tminmax, has been used to estimate a mean daily temperature. However, this simple approach will have some error in estimating the true mean (Tmean) temperature. What is the magnitude of this error? How does it vary by season, elevation, latitude or longitude, and other parameters? For a given station, is it random or consistently biased in one direction?

Multiple studies have considered this question. Many of these are found in food and agriculture journals, since a correct mean temperature is crucial for predicting the ripening of crops. For example, Ma and Guttorp (2012) report that Swedish researchers have been using a linear combination of five measurements (daily minimum, daily maximum, and measurements taken at 6, 12, and 18 hours UTC) since 1916 (Ekholm, 1916), although the weights were revised later (Modén, 1939; Nordli et al., 1996). Tuomenvirta et al. (2000) calculated the historical variation (1890-1995) of Tmean - Tminmax differences for three groups of Scandinavian and northern stations. For the continental stations (Finland, Iceland, Sweden, Norway, Denmark), average differences across all stations were small (+0.1 to +0.2 °C) beginning in 1890 and dropped close to 0 from about 1930 on. However, for two groups of mainly coastal stations in the Norwegian islands and West Greenland, they found strongly negative differences (-0.6 °C) in 1890, falling close to zero from 1965 on. Other studies have considered different ways to determine Tmean from Tmin, Tmax, and ancillary measurements (Weiss and Hays, 2005; Reicosky et al., 1989; McMaster and Wilhelm, 1997; Misra et al., 2012). Still other studies have considered Tmin and Tmax in global climate models (GCMs) (Thrasher et al., 2012; Lobell et al., 2007).

This short note examines these questions using the US Climate Reference Network (USCRN), a network of high-quality temperature measurement stations operated by NOAA. Begun around 2000 with a single station, the network reached a total of about 114 stations in the continental US (44 states) by 2008. There are also 4 stations in Alaska, 2 in Hawaii, and one in Canada meeting the USCRN criteria. Four more stations in Alaska have been established, bringing the total to 125 stations, but these have only 2-3 years of data at this writing. A regional (USRCRN) network of 17 stations has also been established in Alabama and has about 4 years of data. All 142 of these stations were used in the following analysis, although at times the 121- or 125-station dataset was used. The stations are located in fairly pristine areas meeting all criteria for weather stations. Temperature measurements are taken in triplicate, and other measurements at all stations include precipitation and solar radiation. Measurements of relative humidity (RH) began in 2007 at two stations, and by about 2009 RH was being collected at the 125 sites in the USCRN network, though not at the Alabama (USRCRN) network. A database of all measurements is publicly available at ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/. The database includes hourly, daily, and monthly results. This database, together with single-file compilations of multiple files kindly supplied by NOAA, was used for the following analysis.

Methods

The monthly data for the 142 stations were downloaded one station at a time and joined together in a single database. (Note: at present, the monthly data are available to the public only as separate files for each station; daily data are available as separate files for each year for each station. This requires 142 separate downloads for the monthly data and about 500 downloads for the daily data. Fortunately, a NOAA database manager was able to provide the daily data as a single file of about 373,000 records.)

The hourly data include the maximum and minimum 5-minute average temperatures recorded each hour, as well as the mean temperature averaged over the hour. The daily data include the highest 5-minute maximum and the lowest 5-minute minimum temperatures recorded in the hourly data that day (i.e., Tmax and Tmin), together with the mean daily temperature (Tmean). The average of Tmax and Tmin, (Tmax+Tmin)/2, is also included for comparison with the true mean. The monthly data include the maximum and minimum temperatures for the month, which are averages of the observed highest 5-minute average maximum and minimum daily temperatures, along with an estimate of the true mean monthly temperature and the monthly average temperature computed from the monthly Tmax and Tmin. The difference between the daily Tminmax and the true mean will be referred to as Delta T:

DeltaT = (Tmin+Tmax)/2 – Truemean
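For concreteness, the daily error can be computed directly from the three reported temperatures. A minimal Python sketch (the function name and example values are illustrative, not part of the NOAA data products):

```python
def delta_t(tmax, tmin, true_mean):
    """Error of the min/max estimator relative to the true daily mean (°C)."""
    return (tmax + tmin) / 2.0 - true_mean

# A day whose temperature lingers near the minimum is overestimated
# by the min/max method (positive Delta T).
print(delta_t(tmax=30.0, tmin=10.0, true_mean=18.5))  # 1.5
```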

Data were analyzed using Excel 2010 and Statistica v11. For each station, the entire length of the station's history was used; the number of months ranged from 47 to 132. Since the relationship between the true mean and Tminmax may vary over time, the two were also compared by season, where winter corresponds to January through March, and so on. The diurnal temperature range (DTR) was calculated for each day as Tmax - Tmin. For the two stations with the highest and lowest overall error, the hourly data were downloaded to investigate the diurnal pattern.

Results

As of Aug 11, 2012 there were 12,305 station-months and 373,975 station-days from 142 stations. The metadata for all stations are available at the Website http://www.ncdc.noaa.gov/crn/docs.html.

Delta T averaged over all daily measurements for each station ranged from -0.66 °C (Lewistown, MT) to +1.38 °C (Fallbrook, CA, near San Diego) (Figure 1). A negative sign means the minmax approach underestimated the true mean. Almost as many stations overestimated (58) as underestimated (63) the true mean.

Figure 1. DeltaT for 121 USCRN stations: 2000-August 5, 2012. Error bars are standard errors.

A histogram of these results is provided (Figure 2). The mean was 0.0 °C, with an interquartile range of -0.2 to +0.2 °C. The 10th-90th percentile range was -0.5 to +0.5 °C.

Figure 2. Histogram of Delta T for 121 USCRN stations.

Seasonal variability was surprisingly low: in more than half of the 121 stations with at least 47 months of complete data, Tminmax either underestimated (28 sites) or overestimated (39 sites) the true mean in all four seasons. Most of the remaining stations were also weighted in one direction or the other; only 20 stations (16.5%) were evenly balanced, with two seasons in each direction. Sixteen of these 20 were negative in winter and spring and positive in summer and fall. Over all 121 stations, there was a slight tendency toward underestimates in winter and spring and overestimates in summer and fall (Figure 3).

Figure 3. Variation of Delta T by season.

Since Delta T was determined by averaging all values over all years for each station, the possibility remains that stations varied across the years. This was tested by comparing each station's average Delta T for 2008-9 against its average for 2010-11. The stations proved very stable across the years, with a Spearman correlation of 0.974 (Figure 4).

Figure 4. Comparison of Delta T for each station across consecutive 2-year periods. N = 140 stations.
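The stability check is just a rank correlation between the two 2-year station averages. A sketch of that computation on synthetic data (the per-station biases and noise levels are assumed for illustration; the real inputs are the station averages behind Figure 4):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation for tie-free continuous data."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))

rng = np.random.default_rng(0)
true_bias = rng.normal(0.0, 0.3, 140)                  # stable per-station Delta T
avg_2008_09 = true_bias + rng.normal(0.0, 0.05, 140)   # small year-to-year noise
avg_2010_11 = true_bias + rng.normal(0.0, 0.05, 140)

rho = spearman(avg_2008_09, avg_2010_11)
print(round(rho, 3))  # close to 1 when station behavior is stable
```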

When Delta T is mapped, some quite clear patterns emerge (Figure 5). Overestimates (blue dots) are strongly clustered in the South and along the entire Pacific Coast from Sitka, Alaska to San Diego, also including Hawaii. Underestimates (red dots) are located along the extreme northern tier of states from Maine to Washington (excepting the two Washington stations west of the Cascades) and all noncoastal stations west of Colorado’s eastern border.

Figure 5. Delta T at 121 USCRN stations. Colors are quartiles. Red: -0.66 to -0.17 °C. Gold: -0.17 to 0 °C. Green: 0 to +0.25 °C. Blue: +0.25 to +1.39 °C.

Figure 5 suggests that the error has a latitude gradient, decreasing from positive to negative as one moves north. Indeed, a regression shows a highly significant (p < 0.000002) negative coefficient of -0.018 °C per degree of latitude (Table 1, Figure 6). However, other variables clearly affect Delta T: the adjusted R² value indicates that latitude explains only 21% of the observed variance.

Table 1. Regression of DeltaT (Tminmax-True mean) on latitude

N=142 stations. Regression summary for dependent variable DELTAT: R = .467, R² = .218, adjusted R² = .212, F(1,140) = 38.9, p < .00000, std. error of estimate: .278.

             b*        Std.Err. of b*    b         Std.Err. of b    t(140)    p-value
Intercept                                 0.75      0.11              6.6      0.000000
LATITUDE     -0.466    0.075            -0.018      0.002            -6.2      0.000000

* Standardized regression results (μ=0, σ=1)

Figure 6. Regression of DeltaT on Latitude.
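The Table 1 fit can be reproduced in outline with ordinary least squares. A sketch on synthetic data generated from the Table 1 coefficients (intercept 0.75, slope -0.018 °C per degree, residual SD 0.28), since the station file itself is not included here:

```python
import numpy as np

# Synthetic stand-in for the 142-station file, built from the Table 1 fit.
rng = np.random.default_rng(1)
lat = rng.uniform(25, 49, 142)                            # continental-US latitudes
delta_t = 0.75 - 0.018 * lat + rng.normal(0, 0.28, 142)   # Delta T with scatter

slope, intercept = np.polyfit(lat, delta_t, 1)
print(round(slope, 3))  # near -0.018 °C per degree of latitude
```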

Therefore a multiple regression was carried out on the measured variables in the monthly data file. The Spearman correlations of these variables with Delta T are provided in Table 2. The largest absolute Spearman coefficient was with latitude (-0.375), but relatively high correlations were also noted for Tmin (0.308) and RHmax (0.301). However, TMIN, TMAX, TRUEMEAN, and DTR could not be included in the multiple regression, since they (or their constituent variables, in the case of DTR) appear on the left-hand side as part of the definition of DELTAT. Also, the three RH variables were highly collinear, so only RHMEAN was included. Finally, because Alaska and Hawaii have such extreme latitude and longitude values, they were omitted. These restrictions left 3289 station-months (out of 3499 total) and 6 measured independent variables, of which 4 were significant. Together they explained about 30% of the measured variance (Table 3, Figure 7). Latitude and RH were the main explanatory variables, explaining 28% of the variance by themselves, with about equal contributions as judged from the t-values. When the multiple regression was repeated for each season, the four significant and two nonsignificant variables in fall and winter were identical to those in the annual regression, with adjusted R² values of 19-20%; in spring and summer, all six variables were significant, with R² values of 47-50%. In all seasons, however, the two dominant variables were latitude and RH.
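Reduced to the two dominant predictors, the multiple regression can be sketched as follows (synthetic data generated from the Table 3 coefficients for latitude and RHMEAN; the full model also includes longitude, elevation, precipitation, and solar radiation):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3289
lat = rng.uniform(25, 49, n)
rh = rng.uniform(20, 90, n)
# Delta T generated from the two dominant Table 3 effects plus residual noise.
y = -0.295 - 0.0324 * lat + 0.0154 * rh + rng.normal(0, 0.37, n)

# Ordinary least squares via lstsq on the design matrix [1, lat, rh].
X = np.column_stack([np.ones(n), lat, rh])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 3))  # recovers roughly [-0.295, -0.032, 0.015]
```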

Table 2. Spearman correlations of measured variables with DeltaT.

VARIABLE               DELTAT
LONGITUDE (degrees)     0.075
LATITUDE (degrees)     -0.375
ELEVATION (feet)       -0.169
TMAX (°C)               0.231
TMIN (°C)               0.308
TMINMAX (°C)            0.272
TRUEMEAN (°C)           0.239
DTR (°C)               -0.134
PRECIP (mm)             0.217
SOLRAD (MJ/m2)         -0.043
RHMAX (%)               0.301
RHMIN (%)               0.124
RHMEAN (%)              0.243

Table 3. Multiple regression on DeltaT of measured variables

N=3289 station-months. Regression summary for dependent variable DELTAT: R = .5522, R² = .3049, adjusted R² = .3037, F(6,3282) = 239.98, p < 0.0000, std. error of estimate: .3683. Exclude condition: state = 'AK' or state = 'HI'.

                 b*          Std.Err. of b*    b           Std.Err. of b    t(3282)     p-value
Intercept                                      -0.294812   0.085454          -3.4500    0.000568
LONG             -0.169595   0.018086          -0.005496   0.000586          -9.3772    0.000000
LAT              -0.407150   0.015910          -0.032380   0.001265         -25.5913    0.000000
ELEVATION         0.066710   0.018980           0.000013   0.000004           3.5147    0.000446
PRECIP (mm)      -0.008293   0.017129          -0.000055   0.000114          -0.4842    0.628291
SOLRAD (MJ/m2)    0.000193   0.016465           0.000013   0.001099           0.0117    0.990630
RHMEAN            0.552356   0.021529           0.015417   0.000601          25.6565    0.000000

* Standardized regression results (μ=0, σ=1)

Figure 7. Predicted vs observed values of DeltaT for the multiple regression model in Table 3.

Since RH had a strong effect on DeltaT, a map of RH was made for comparison with the DeltaT map above (Figure 8). The map again shows the clustering noted for DeltaT along the Pacific Coast, the Southeast, and the West. However, the effect of latitude along the northern tier is missing from the RH map.

Figure 8. Relative humidity for 125 USCRN stations: 2007-Aug 8, 2011. Colors are quartiles. Red: 19-56%. Gold: 56-70%. Green: 70-75%. Blue: 75-91%.

Fundamentally, the difference between the minmax approach and the true mean is a function of diurnal variation: stations where the temperature spends more time closer to the minimum than the maximum will have their mean temperatures overestimated by the minmax method, and vice versa. To show this graphically, the mean diurnal variation over all seasons and years is shown for the station with the largest overestimate (Fallbrook, CA) and the one with the largest underestimate (Lewistown, MT) (Figure 9). Although both graphs have a minimum at 6 AM and a maximum at about 2 PM, the Lewistown (lower) diurnal curve is broader. For example, 8 hours are within 2 °C of the Lewistown maximum, whereas only about 6 hours are within 2 °C of the Fallbrook maximum. Another indicator is that 12 hours are warmer than the true mean in Lewistown but only 9 in Fallbrook.

Figure 9. Diurnal variation and comparisons of the true mean to the estimate using the minmax method for the two stations with the most extreme over- and underestimates.
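The point about curve shape can be illustrated with two synthetic diurnal profiles sharing the same Tmin, Tmax, and timing (a sketch; the cubed profile stands in for a station that spends most of the day near its minimum, as at Fallbrook):

```python
import numpy as np

h = np.arange(24)
u = (np.cos(2 * np.pi * (h - 14) / 24) + 1) / 2   # 0 at 2 AM, 1 at 2 PM

t_sym = 5 + 20 * u        # symmetric cycle: equal time near min and max
t_skew = 5 + 20 * u**3    # skewed cycle: most hours spent near the minimum

def delta_t(t):
    """(Tmax + Tmin)/2 minus the true mean of the hourly profile."""
    return (t.max() + t.min()) / 2 - t.mean()

print(delta_t(t_sym))   # ~0: min/max average matches the true mean
print(delta_t(t_skew))  # positive: min/max average overestimates the mean
```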

Discussion

For a majority of US and global stations, at least until recent times, it is not possible to investigate the error involved in the Tminmax method, since insufficient measurements were made to determine the true mean. The USCRN provides one of the best datasets for investigating this question, not only because both the true mean temperatures and the daily Tmax and Tmin are provided, but also because the quality of the stations is high. Since there are more than 100 stations well distributed across the nation, now with at least 4 years of continuous data, the database seems adequate for this use, and the comparison of 2-year averages suggests the findings are robust.

The questions asked in the Introduction to this paper can now be answered, at least in a preliminary way.

“What is the magnitude of this error?” We see the range is from -0.66 °C to +1.38 °C, although the latter value appears to be unusual: the second-highest value is only +0.88 °C.

“How does it vary by season, elevation, latitude or longitude, and other parameters?” The direction of the error is surprisingly unaffected by season, with more than half the stations showing a consistent under- or overestimate in all four seasons. We have seen a strong effect of latitude and RH, with a weaker effect of elevation. Geographic considerations are clearly important: coastal and Southern sites show strong overestimates by the minmax method, while the northern and western stations mostly show strong underestimates. Although the Tuomenvirta et al. (2000) results mentioned above are averages across all stations in a region, their finding that the coastal stations in west Greenland and the Norwegian islands showed a strong Delta T in the same direction as the coastal USCRN stations supports the influence of RH, while the opposite sign they found for continental stations matches the dependence we find for the Western interior USCRN stations. (Note that their definition of Delta T has the opposite sign from ours.)

“For a given station, is it random or biased in a consistent direction?” For most stations, the direction and magnitude of the error are very consistent across time, as shown by the comparisons across seasons and across years.

Considering the larger number of stations in the US and in historical time, we may speculate that the error in the minmax method was at least as large as indicated here, and most probably somewhat larger, since many stations have been shown to be poorly sited (Fall et al., 2011). The roughly equal split between underestimates and overestimates in the USCRN dataset is simply accidental, reflecting the particular mix of coastal, noncoastal, Northern, and Southern sites. The same may apply to the larger number of sites in the continental US, but a bias in one direction or the other is likely in other countries, depending on their latitude range and RH levels.

This error could affect spatial averaging. For example, the Fallbrook CA site with the highest positive DeltaT value of 1.39 C is just 147 miles away from the Yuma site with one of the largest negative values of -0.58. If these two stations were reading the identical true mean temperature, they would appear to disagree by nearly 2 full degrees Celsius using the standard minmax method. Quite a few similar pairs of close-lying stations with opposite directions of DeltaT can be seen in the map (check for nearby red and blue pairs). However, if only anomalies were considered, the error in absolute temperature levels might not affect estimates of spatial correlation (Menne and Williams, 2008).

Although the errors documented here are true errors (that is, they cannot be removed by time-of-observation or other adjustments), they would nonetheless not be expected to have much direct effect on trends. After all, if one station is consistently overestimated across the years, it will show the same trend as if the values were replaced by the true values. And if the error varies cyclically by season, the variations would tend to cancel over a sufficiently long time, leaving the trend mostly unaffected. Of course, this cannot be checked with the USCRN database, since it covers at most 4-5 years with the full complement of stations, and normal year-to-year “weather” variations would likely overwhelm any climatic trend over such a short period.

Acknowledgement. Scott Ember of NOAA was extremely helpful in navigating the USCRN database and supplying files that would have required many hours to download from the individual files available.

References

Ekholm N. 1916. Beräkning av luftens månadsmedeltemperatur vid de svenska meteorologiska stationerna. Bihang till Meteorologiska iakttagelser i Sverige, Band 56, 1914, Almqvist & Wiksell, Stockholm, p. 110.

Fall, S., Watts, A., Nielsen-Gammon, J., Jones, E., Niyogi, D., Christy, J.R., and Pielke, R.A., Sr. 2011. Analysis of the impacts of station exposure on the U.S. Historical Climatology Network temperatures and temperature trends. J. Geophysical Research 116: D14120.

Lobell, D.B., Bonfils, C., and Duffy, P.B. Climate change uncertainty for daily minimum and maximum temperatures: A model inter-comparison. Geophysical Research Letters 34, L05715, doi:10.1029/2006GL028726, 2007.

Ma, Y. and Guttorp, P. Estimating daily mean temperature from synoptic climate observations. http://www.nrcse.washington.edu/NordicNetwork/reports/temp.pdf Accessed Aug 18, 2012.

Menne, M.J. and Williams, C.N. Jr. 2008. Homogenization of temperature series via pairwise comparisons. J Climate 22: 1700-1717.

McMaster, G.S. and Wilhelm, W.W. 1997. Growing degree-days: one equation, two interpretations. Publications from USDA-ARS/UNL Faculty, Paper 83. http://digitalcommons.unl.edu/usdaarsfacpub/83 Accessed Aug 18, 2012.

Misra, V., Michael, J-P., Boyles, R., Chassignet, E.P., Griffin, M., and O'Brien, J.J. 2012. Reconciling the spatial distribution of the surface temperature trends in the southeastern United States. J. Climate, 25, 3610-3618. doi: http://dx.doi.org/10.1175/JCLI-D-11-00170.1

Modén H. 1939. Beräkning av medeltemperaturen vid svenska stationer. Statens meteorologiskhydrografiska anstalt. Meddelanden, serien Uppsatser, no. 29.

Nordli PØ, Alexandersson H, Frisch P, Førland E, Heino R, Jónsson T, Steffensen P, Tuomenvirta H, Tveito OE. 1996. The effect of radiation screens on Nordic temperature measurements. DNMI Report 4/96 Klima.

Reicosky, D.C., Winkelman, L.J., Baker, J.M., and Baker, D.G. 1989. Accuracy of hourly air temperatures calculated from daily minima and maxima. Agric. For. Meteorol. 46, 193-209.

Weiss A, Hays CJ. 2005. Calculating daily mean air temperatures by different methods: implications from a non-linear algorithm. Agric. For. Meteorol. 128, 57-69.

Thrasher, B.L., Maurer, E.P., McKellar, C., and Duffy, P.B. 2012. Hydrol. Earth Syst. Sci. Discuss., 9, 5515-5529. www.hydrol-earth-syst-sci-discuss.net/9/5515/2012/ doi:10.5194/hessd-9-5515-2012. Accessed Aug 18, 2012.

Tuomenvirta, H., Alexandersson, H., Drebs, A., Frich, P., and Nordli, P.Ø. 2000. Trends in Nordic and Arctic temperature extremes. J. Climate, 13, 977-990.

APPENDIX

The main concern of this paper has been Delta T, and therefore almost all of the above analyses deal with that variable. However, another variable depending on the daily Tmax and Tmin is their difference, the diurnal temperature range (DTR), which is of interest in its own right. For example, the main finding of Fall et al. (2011) was that poorly sited stations tended to overestimate Tmin and underestimate Tmax, leading to a large underestimate of DTR. The USCRN stations, however, are all well sited, so their estimates of DTR should be unbiased. What can we learn from the USCRN about this variable? We can first map its variation (Figure A-1).

Figure A-1. Variation of daily DTR across the US CRN. Colors are quartiles. Red: 4.7-10.8 C. Gold: 10.8-12.0 C. Green: 12.0-13.8 C. Blue: 13.8-19.9 C.

Here we see that the coastal sites have the lowest daily variation, reflecting the well-known moderating effect of the oceans. The two sites near the Great Lakes in the lowest quartile of the DTR distribution may reflect a similar lake effect. The Western interior states have the highest DTRs.

A multiple regression shows that RH is by far the strongest explanatory variable (Table A-1). Solar radiation and precipitation have moderate effects, and latitude is weakly significant. The model explains about 46% of the variance, with RHMEAN accounting for most (42%) of that (Figure A-2).

 

Table A-1. Multiple regression on Diurnal Temperature Range.

N=3289. Regression summary for dependent variable DTR: R = .68257906, R² = .46591417, adjusted R² = .46493778, F(6,3282) = 477.18, p < 0.0000, std. error of estimate: 2.3620. Exclude condition: state = 'AK' or state = 'HI'.

                 b*          Std.Err. of b*    b          Std.Err. of b    t(3282)     p-value
Intercept                                      20.14306   0.548079          36.7521    0.000000
LONG              0.010872   0.015854           0.00258   0.003759           0.6858    0.492909
LAT              -0.068287   0.013946          -0.03974   0.008115          -4.8965    0.000001
ELEVATION        -0.008117   0.016638          -0.00001   0.000025          -0.4879    0.625687
PRECIP (mm)      -0.170325   0.015015          -0.00829   0.000730         -11.3438    0.000000
SOLRAD (MJ/m2)    0.183541   0.014433           0.08967   0.007051          12.7167    0.000000
RHMEAN           -0.484798   0.018872          -0.09900   0.003854         -25.6888    0.000000

* Standardized regression results (μ=0, σ=1)

Figure A-2. Diurnal Temperature Range vs. mean RH.

The figure suggests that a linear fit is not very good; for RH between about 60% and 95%, the effect on DTR (by eyeball estimate) is perhaps twice the overall slope of -0.138 °C per percentage point of RH.
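The eyeball comparison can be made concrete by fitting the overall slope and the slope restricted to high RH (a sketch on synthetic data with an assumed kink at 60% RH; the real curve is the one in Figure A-2):

```python
import numpy as np

rng = np.random.default_rng(3)
rh = rng.uniform(20, 95, 3289)
# Synthetic DTR whose response to RH steepens above 60% (shape assumed purely
# for illustration).
dtr = 14 - 0.06 * rh - 0.20 * np.clip(rh - 60, 0, None) + rng.normal(0, 1.5, rh.size)

slope_all, _ = np.polyfit(rh, dtr, 1)        # global linear fit
high = rh > 60
slope_high, _ = np.polyfit(rh[high], dtr[high], 1)  # fit over the high-RH range
print(round(slope_all, 3), round(slope_high, 3))    # the high-RH slope is steeper
```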

Finally, how does the true mean temperature depend on the variables measured at the USCRN sites? The multiple regression is provided in Table A-2. Although all six variables are significant and together explain about 79% of the variance, the relationship is largely driven (R² = 59%) by solar radiation (Figure A-3).

Table A-2. Multiple regression of true mean monthly temperatures vs. measured meteorological variables.

N=3289. Regression summary for dependent variable TRUEMEAN: R = .891, R² = .793, adjusted R² = .793, F(6,3282) = 2095.9, p < 0.0000, std. error of estimate: 4.58. Exclude condition: state = 'AK' or state = 'HI'.

                 b*          Std.Err. of b*    b           Std.Err. of b    t(3282)     p-value
Intercept                                       9.972524   1.062680           9.3843    0.000000
LONG             -0.037057   0.009869          -0.027366   0.007288          -3.7548    0.000177
LAT              -0.201479   0.008682          -0.365153   0.015735         -23.2071    0.000000
ELEVATION        -0.307433   0.010357          -0.001414   0.000048         -29.6825    0.000000
PRECIP (mm)       0.151732   0.009347           0.022991   0.001416          16.2333    0.000000
SOLRAD (MJ/m2)    0.752289   0.008985           1.144690   0.013671          83.7285    0.000000
RHMEAN           -0.076282   0.011748          -0.048521   0.007473          -6.4931    0.000000

* Standardized regression results (μ=0, σ=1)

Figure A-3. True mean temperature vs. solar radiation.

===============================================================

This document is available as a PDF file here:

Errors in Estimating Temperatures Using the Average of Tmax and Tmin


81 thoughts on “Errors in Estimating Temperatures Using the Average of Tmax and Tmin—Analysis of the USCRN Temperature Stations”

  1. Nice work.

    You can use the R package CRN to download the individual files and create a compilation.

    The real test of course is to test the trend in anomalies, which you can do by using hourly data from stations with longer records than CRN. Unless the bias, which is known to exist, changes over time, trends will not be affected. One factor that could change over time is RH; your other significant regressors, such as latitude and coastal location, don't change. Simply, if station A has a negative bias and station B has a positive bias, there will only be a bias in trend if that bias changes over time. Looking at a few long hourly stations or some long 5-minute data will provide some clues. I could dig that work out when time permits, or have a go at it yourself. Alternatively you could look at the longest CRN stations and see if the bias changes over time.

    Areal averaging may be affected if you don't use kriging.

  2. Fascinating read. First I read the Abstract, looked at the methods and graphs, then perused the text and then read it in full. Two things I like about this paper are:

    1.) It discusses actual geographical and weather effects upon temp readings.

    2.) The statistics are used correctly and with full disclosure.

    With a longer time period Lance do you have any thoughts on what might be found–will the delta T measurements where there are under and over estimation essentially cancel out, or might you see larger discrepancies? Just curious on your hypothesis/thoughts.

  3. Of course T_minmax=(T_max+T_min)/2 is not the same as T_mean. They both give reasonable measures of climate trends, and it is of interest that you find that T_minmax is a fairly unbiased estimator of T_mean. Using T_minmax is not an “error” in estimating temperature; it's just a different measure.

    The fact is that we have extensive records of T_minmax and mostly short records of T_mean. And we have to work with what we have. It would be useful to test whether the trends of each over time are significantly different, though again the short availability of T_mean is a problem.

  4. I can’t believe they do something so stoopid. Your data clearly shows the moderating influence that the sea has on coastal locations, whilst continental locations will have a much greater range. Surely latitude and time of year influence this too. Might as well measure ocean temperature.
    Thanks for bringing this up.

  5. A tremendous amount of work which I would applaud because it illustrates a real issue.

    My reservation in regard to this is that it tackles only half the problem. The other half is that P in w/m2 varies with T^4. To get an EFFECTIVE mean temperature, all the temperature data needs to be first converted to w/m2, THEN averaged, and the average converted back to T. That would give you the EFFECTIVE mean temperature which, for the purposes of the climate discussion, is the actual value that we want.

  6. Nice. This is something that one can build upon. Again, very good.

    I’ve long wondered how solar radiation changes the mean and the max measurements. Thank you.

  7. Steven Mosher:

    OK, partial answer to your question about the biases possibly changing in time. I took the four full years 2008-2011 to avoid including fractions of years that might affect the results because of seasonal variation. There were 134 stations, most of which had the full 48 months of data. Regressions against time showed only 20 with significant slopes. The full set of slopes was extremely small, with a 10th-90th percentile range from 0.00002 to 0.00036. For a typical bias on the order of 0.1 C, the change over a decade, say, would be invisible. The longest monitoring periods available in the USCRN dataset were 132 months (two stations), with a slowly increasing number of stations each year after that, so not many have monitoring periods longer than 6 years. Besides, I would have to check each one to get an integral number of years, and the slopes are so tiny it doesn't seem worth it. One interesting result was that the slopes were overwhelmingly (121 to 13) positive. Can't imagine why this would be so; perhaps just random yearly variation in the weather. This adds a bit of quantitative evidence supporting my conclusion in the article that the errors may not affect estimates of trends.

    However, the errors do of course affect estimates of absolute temperature. If the results apply outside the continental US (and they were confirmed when I added Alaska and Hawaii back in), then it would follow that Northern continental sites, say at Russian latitudes, would show larger underestimates than we saw for the Northern states, while tropical sites would show larger overestimates than the Southern and Pacific states, perhaps in the range of 0.5-1 C in each direction. The true latitudinal temperature gradient (for noncoastal sites) would then be smaller than presently estimated (Northern sites warmer, tropical sites cooler). This could perhaps have implications for estimating the energy of hurricanes (reduced because of less temperature difference between colliding air masses).

  8. I’m going to have to disagree with their conclusions on the impact of using min+max/2 on temperature trend.

    This is an article I wrote on the effect of using (min+max)/2 compared with fixed-time temperature measurements. It found the (min+max)/2 method overestimated the temperature trend by 47%. It uses Australian data, but I would expect the USA to be similar.

    http://www.bishop-hill.net/blog/2011/11/4/australian-temperatures.html

    And as for nearby stations showing large differences in DeltaT. I suspect irrigation is the problem.

    Otherwise, we need far more of these kinds of papers that look into the details of how, where, and when temperatures have changed.

  9. Another thought: I know nothing about GCMs. Do they input Tmin and Tmax in general to create their temperature fields? Then they may have these latitudinal and humidity biases affecting their calculations of winds and so on. If a butterfly’s flap can result in a hurricane, what about a small but widespread bias in the temperature field?

    Perhaps a better model than the one presented above could be built up using hundreds or thousands of global thermometers having both the Tmin Tmax and true mean data. Then run the GCM on the predicted true mean temperature rather than the observed (erroneous) Tminmax.

  10. Having read Lance Wallace’s comment above,

    It doesn’t surprise me that you didn’t find a bias in the min+max/2 method in recent years, as the temperature trend has been essentially flat over this period, and the bias results from how warming has occurred over the last 60 years. There is only a bias when there is a warming trend.

  11. Nick Stokes:
    “Of course T_maxmin=(T_max+T_min)/2 is not the same as T_mean. They both give reasonable measures of climate trends, and it is of interest that you find that T_minmax is a fairly unbiased estimator of T_mean.”

    Nick, I did find what you said I found, but I also commented that it was most likely an accidental reflection of the particular mix of continental and coastal sites in the US. Another country might have differences mostly in one direction. For example, suppose the Confederate States of America were around today–their Tminmax estimates would be almost uniformly biased high.

    Secondly, although we agree on the likelihood that these errors would not affect trends, there are more things than trends of interest. Can you comment on my speculation above that maybe the GCMs would be affected if we tried inputting true mean temperatures? As you say, we don’t know them for many stations, but maybe a better model than mine above, based on more data worldwide, would be useful.

  12. jcbmack^2: “With a longer time period Lance do you have any thoughts on what might be found–will the delta T measurements where there are under and over estimation essentially cancel out, or might you see larger discrepancies? Just curious on your hypothesis/thoughts.”

    See my speculation and answers to Mosher & Nick Stokes above. If the latitudinal and humidity relationships hold up, then latitudes higher than the 48 degrees or so in the continental US (e.g., Russia) would tend to have even more negative Delta T, while tropical sites would have higher positive DeltaT.

  13. Lance Wallace says:
    August 30, 2012 at 5:32 pm
    Another thought: I know nothing about GCMs. Do they input Tmin and Tmax in general to create their temperature fields?
    >>>>>>>>>>>>>>>>

    They only input “initial conditions”.

  14. David M. Hoffer says “They only input ‘initial conditions'”

    OK, but do their initial conditions include Tmin and Tmax? If so, then they are wrong much of the time.

    One of my references (Thrasher 2012) tries to clear up a small problem in 17 GCMs. It seems they “adjust” their inputs. The problem is that this sometimes results in Tmin>Tmax (!). Thrasher suggests a fix that will eliminate this problem. One wonders why they needed a whole ‘nother paper to point that out. Oh, wait, the more papers published the better, right?

  15. I am going to paint a big target on my forehead by calling the “Standard Error” bars on Figure 1 total baloney.

    First of all, the only numbers we should be considering are individual monthly Tave derived from 30 Tmin and Tmax values. As I wrote on 8/27/12 in Lies….

    Where I think there is an underappreciated element of error (7/31/12) is that we forget that Tave at each location is never a measured value. It is a calculated “average” of the min and max values. Forget about TOB issues. The simple fact that you “average” 34 and 20 to get Tave = 27K implies that the uncertainty of that average is up to 7 deg K.

    Take 30 of those daily Tave estimates to get a Tave for a month, and the uncertainty on that monthly Tave is 1.3 deg K. That is quite an error bar; the 80% confidence on the Tave is plus or minus 2.0 deg K, with its own weak simplifying assumptions about the shape of the curve between the min and max non-randomly sampled points. WUWT 8/27/12 Lies…..

    So, the real monthly Tave value, month by month, used in climate trend analysis does not have an error bar 0.02 deg C tall, but one about 200 times larger!

    Averaging 140 months of Tave to get some smaller error bars IS POINTLESS and deceptive! It implies the error bar on each month’s Tave used in climate studies has tiny uncertainty when in fact it is embarrassingly large.

    There! Target tattooed on my forehead. Tell me where I am mistaken.

  16. T_minmax shows nearly a degree C of warming over a 40 year period compared with 9am temperature, at Hamilton NZ. Data from NIWA

  17. Dude, you spelled Lewistown incorrectly. The people over there get very, very angry when this is done. They are Lewistown, MT.

    REPLY: other than that little irrelevancy, do you have anything USEFUL to contribute to the conversation? Or is this the only error you could find, so you had no choice but to run with it? – Anthony

  18. Minor typo correction to Rasey 6:59 pm. Replace “K” with “C”

    The simple fact that you “average” 34 C and 20 C to get Tave = 27 C implies that the uncertainty of that average is up to 7 deg C.

    Take 30 of those daily Tave estimates to get a Tave for a month, and the uncertainty on that monthly Tave is 1.3 deg C. That is quite an error bar; the 80% confidence on the Tave is plus or minus 2.0 deg C

  19. Philip Bradley says:
    “This is an article I wrote on the effect of using (min+max)/2 compared with fixed-time temperature measurements. It found the (min+max)/2 method overestimated the temperature trend by 47%. It uses Australian data, but I would expect the USA to be similar.”

    Thanks, glad to see other analyses of this. Your article is based on work done in 2009 by a Ph.D. candidate named Jonathan Lowe at Gust of Hot Air. He took 21 Australian stations with measurements every 3 hours over the last 60 years. However, as far as I can tell, he averaged anomalies of all 21 stations each year. We don’t know the bias of any individual station, nor how many were coastal, how many were interior, or what the variation of the bias over time was for any of the 21 stations. I would want to look at the raw data and run a similar regression before venturing an opinion here.

  20. Stephen Rasey: “There! Target tattooed on my forehead. Tell me where I am mistaken.”

    Let me first congratulate you on reaching the mathematical literacy achieved by numerous disadvantaged teens enrolled in failing Atlanta schools. You, like most children succeeding under the infamously low standards of US public education, have a firmer grasp of basic calculation than professionals slaving away in the scientific discipline of doom. And yes, that is a sincere compliment.

    But for the line of argument you’re attempting to chase, you want to flesh out your knowledge with such things as the Central Limit Theorem and pathological distributions. Specifically, on the latter, with respect to such fun (and common in physics) bits as the Cauchy and Lévy distributions.

    For the short of it, if you assume you have a non-pathological distribution and you assume that the errors are the result of the sum of independent and well-distributed random variables then: Yes, more samples give a better finger on the mean value of that distribution.

    But that’s a bit like a shipwrecked economist with a tin of beans stating: “Assume a can opener.”

    Anything approaching responsibility with the data requires actually showing that the distribution is well behaved in the infinite limit. But given the non-linearity involved and the systematic biases, this cannot be done without a site-by-site evaluation of the errors. (Such as in the OP graph: X overestimates as a roughly general consideration.) And this is all quite apart from the propagation of errors when passing uncertainties through a mathematical sausage mill, and from the notion that this less-basic (yet University-level) mathematics can be whistled away by bias adjustments and homogenization through a multiplicity of distributions, each with its own error.
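
    The commenter’s point about pathological distributions can be seen with a minimal Python sketch (purely illustrative): the running mean of normal samples settles down as the sample grows, while the running mean of Cauchy samples never does, because the Cauchy distribution has no finite mean for the Central Limit Theorem to converge to.

```python
import random
import statistics

random.seed(42)

# Draw samples from a well-behaved (normal) and a pathological (Cauchy)
# distribution. A standard Cauchy variate is the ratio of two independent
# standard normals.
n = 100_000
normal_draws = [random.gauss(0, 1) for _ in range(n)]
cauchy_draws = [random.gauss(0, 1) / random.gauss(0, 1) for _ in range(n)]

# Running means at increasing sample sizes: the normal mean tightens toward 0,
# while the Cauchy mean keeps jumping no matter how many samples accumulate.
for k in (100, 1_000, 10_000, 100_000):
    print(k, statistics.fmean(normal_draws[:k]), statistics.fmean(cauchy_draws[:k]))
```

    Whether station DeltaT errors behave like the first case or the second is exactly the site-by-site question the commenter raises.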

  21. Stephen Rasey says: “I am going to paint a big target on my forehead by calling the “Standard Error” bars on Figure 1 total baloney. First of all, the only numbers we should be considering are individual monthly Tave derived from 30 Tmin and Tmax values….Tell me where I am mistaken.”

    OK, well, first, Figure 1 is not your “monthly Tave”, it is the difference between the true mean and “monthly Tave”. That is, I am entirely with you on the fact that monthly Tave is an erroneous measure. That is the point! Look at the title of my piece! So I am taking “monthly Tave” as a precise figure, exactly as the CAGW people do. Then I look at the measured Tmean and calculate the difference between these two numbers. This gives us between 47 and 132 monthly values for each station. Then there is a standard approach for calculating the standard error–it is the standard deviation divided by the square root of N. Of course, this assumes no correlation between the successive values. Here is where your objection may be valid. There is probably some correlation between successive months and most or all of the independent variables Tmax, Tmin, SOLRAD, PRECIP, RHmean. In that case, the standard errors are probably too small. However, they have no further influence on the conclusions regarding the effect of latitude, RH, etc.
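
    Lance’s caveat about correlated months can be made concrete with a standard AR(1) rule of thumb for the effective sample size (this adjustment is my illustration, not something from the paper, and the DeltaT values below are invented):

```python
import math
import statistics

# Hypothetical monthly DeltaT series (Tminmax minus true mean) for one station;
# smooth and seasonal-looking, hence positively autocorrelated.
delta_t = [-0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.3, -0.4, -0.5, -0.6, -0.7, -0.8]

n = len(delta_t)
sd = statistics.stdev(delta_t)
se_naive = sd / math.sqrt(n)  # assumes independent months

# Lag-1 autocorrelation of the series
mean = statistics.fmean(delta_t)
num = sum((a - mean) * (b - mean) for a, b in zip(delta_t, delta_t[1:]))
den = sum((a - mean) ** 2 for a in delta_t)
r1 = num / den

# Effective sample size under an AR(1) assumption: n_eff = n * (1 - r1) / (1 + r1).
# Positive correlation shrinks n_eff, so the honest standard error is larger.
n_eff = n * (1 - r1) / (1 + r1)
se_adjusted = sd / math.sqrt(n_eff)

print(f"naive SE = {se_naive:.3f}, lag-1 r = {r1:.2f}, adjusted SE = {se_adjusted:.3f}")
```

    For strongly correlated series the adjusted standard error can be several times the naive sd/sqrt(N), which is the direction of the effect Lance concedes above.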

  22. Lance Wallace;
    OK, but do their initial conditions include Tmin and Tmax? If so, then they are wrong much of the time.
    >>>>>>>>>>>>>>>>>>>

    No, initial conditions from a temperature perspective would be the exact temperature at each and every location around the globe at the start time. So I suppose a tiny subset of those at any given time would be a max or a min but the vast majority would be some other value.

  23. The definition of the Predicted Value as used in Figure 7 seems to be missing from the paper. Is it the real monthly average from all 5-minute readings each month?

    The Monthly Observed value, I think, is the Tave from the Min, Max values of the day.

  24. Reblogged this on The GOLDEN RULE and commented:
    Here is some genuine science being used and the conclusion(s) do not support the need for concern about warming trends, if they in fact exist at all.
    I believe the article provides support for my general contention that however good the science is, the subject “climate and its factors” is of such complexity and variability as to make IPCC and political carbon control decisions untenable.

  25. “Lance Wallace says: August 30, 2012 at 5:32 pm

    “Another thought: I know nothing about GCMs. Do they input Tmin and Tmax in general to create their temperature fields? “

    GCMs wouldn’t use station readings, and hence not T_max or T_mean. They compute temperature fields from a complete flow model on a regular grid at about 30-min intervals.

  26. It looks like the example set by the “Watts et al” paper to put a paper up here to be “fire-proofed” is being followed.
    Lance, don’t take the singes personally or let them bother you.

  27. Stephen Rasey says: “The definition of the Predicted Value as used in Figure 7 seems to be missing from the paper. Is it the real monthly average from all 5 minute readings each month? The Monthly Observed value, I think, is the Tave from the Min, Max values of the day.”

    Both Predicted and Observed values are Delta T, as stated in the caption. Recall that Delta T is the difference between the true mean and the value calculated from Tmin & Tmax. The Predicted value is determined for each station and each month by multiplying each of the six regression coefficients from the multivariate model by the value of the corresponding variable for that month and that station. The sum of the six products plus the intercept gives the Predicted value.
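
    The arithmetic Lance describes is just intercept plus a coefficient-times-value sum over the six predictors. A minimal sketch (every coefficient and observed value below is made up for illustration; the real ones come from the paper’s fitted model):

```python
# Assemble the Predicted DeltaT for one station-month from a fitted
# multivariate regression: intercept plus coefficient * value for each
# of the six predictors named in the post.
coefficients = {  # hypothetical fitted coefficients
    "Tmax": 0.010, "Tmin": -0.012, "SOLRAD": 0.002,
    "PRECIP": -0.001, "RHmean": -0.004, "LATITUDE": -0.020,
}
intercept = 0.85  # hypothetical

station_month = {  # hypothetical observed values for one station and month
    "Tmax": 25.0, "Tmin": 10.0, "SOLRAD": 18.0,
    "PRECIP": 40.0, "RHmean": 65.0, "LATITUDE": 47.0,
}

predicted_delta_t = intercept + sum(
    coefficients[name] * station_month[name] for name in coefficients
)
print(f"Predicted DeltaT = {predicted_delta_t:.3f} C")
```

    Figure 7 then plots this Predicted value against the Observed DeltaT for each station-month.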

  28. I also started earlier this week loading USCRN hourly02 data into a SQL database to see what the raw data would show, without adjustments or arbitrary decisions (e.g. infilling from “nearby” stations, grid averaging, etc.)

    I successfully downloaded all the station hourly files using a FireFTP extension for Firefox. The download did take several hours, but it was automated; I just needed to kick it off.

    My first playing with the data concerned this very subject: is (Tmax+Tmin)/2 a valid alternative to the daily average calculated from the hourly averages? At this point I’m still trying to ensure the schema I have is usable, but an early query did show that there are significant differences. My histogram peaked at a positive 1-1.5C difference, but I had only loaded a small subset of the data (for testing queries & schema) at that point. Hopefully once I load all the data my results should replicate yours.

    My gut feel was that (Tmax+Tmin)/2 would not always represent a true average, based upon living in the SF Bay Area, where most of the day might be fog-bound in the 50s F with sun breaking through around 1pm, leading to an afternoon high of 80-90F before the fog returns at 7pm, i.e. a day shaped like Fallbrook but with a much narrower peak.
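
    That gut feel is easy to check numerically. A sketch with an invented foggy-Bay-Area day (cool and flat for most of the 24 hours, short sharp afternoon peak) shows how (Tmax+Tmin)/2 overshoots the hourly mean:

```python
# Compare the (Tmax+Tmin)/2 estimate against the mean of 24 hourly readings for
# a synthetic "fog until early afternoon" day: cool and flat most of the day
# with a short sharp afternoon peak. All values are invented.
hourly_temps_f = (
    [54, 53, 52, 52, 51, 51, 52, 53, 54, 55, 56, 57]   # midnight-11am: fog
    + [60, 78, 85, 84, 76, 66, 60]                     # noon-6pm: sun, sharp peak
    + [57, 56, 55, 54, 54]                             # 7pm-11pm: fog returns
)
assert len(hourly_temps_f) == 24

true_mean = sum(hourly_temps_f) / 24
minmax_mean = (max(hourly_temps_f) + min(hourly_temps_f)) / 2

print(f"hourly mean = {true_mean:.1f} F, (Tmax+Tmin)/2 = {minmax_mean:.1f} F, "
      f"difference = {minmax_mean - true_mean:+.1f} F")
```

    The narrower the afternoon spike, the further the two-point summary drifts above the true mean, which is the shape of day the commenter is describing.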

  29. 1. With high frequency observations from airport sites, you should be able to detect jet wash heat effects if they happen as an aircraft taxis past, once you can get hold of aircraft movement data. Best done if possible on the types of thermometers that would record a transient and show it as a Tmax or Tmin.

    2. The mere fact of Tmax and Tmin happening on the same day does not permit unbridled use of their Tminmax values. For example, people have often correlated Tminmax between stations up to 2,000 km apart. However, when you separate out Tmax and Tmin and correlate them separately, you get quite a different picture. You can get an idea of this by looking at the following related exercise using lagged data at one station.

    http://www.geoffstuff.com/Extended%20paper%20on%20chasing%20R.pdf

    The correlation coefficients of Tmax and Tmin are rather different for data lagged by only one day. This would seem to place some unstated limits on assumptions in your data analysis. In particular, one of either Tmax or Tmin might call the shots, placing different emphasis on day lengths as sites get nearer to the Poles. The concepts get confusing inside the Arctic and Antarctic circles.

  30. OK, just a non professional observer here, but I have some questions:
    How exactly were these measurements taken? Were they recorded by a human using an “Eyeball Mk I”, or were they recorded using some kind of automatic recording device?
    What is the magnitude of error introduced by each of these methods, and what is the effect on the reading?
    Also, how is the error introduced in the colder latitudes influenced by the weather? E.g., the human reader looks out the window and says: “the weather is too cold to read today, so I will do it tomorrow and double up”.

  31. This is very interesting, but at the same time it is somewhat irrelevant. We don’t care about the actual temperature to measure climate change. We care about anomalies, i.e. how those temperatures vary in time. What would be worrying is if the error of using this method varied greatly from year to year. We don’t care if it varies between stations, as long as it is relatively stable in time for any one given station. And this article seems to show that the error is indeed rather stable in time (although we would need much longer studies to really be able to conclude that).
    If a station uses the Tmin/Tmax method to conclude that the average temperature was 14C whereas the true average temperature was 13.5C, I don’t care! The actual temperature is irrelevant, firstly, because I am going to use that reading to represent the temperature of an area which is probably hundreds of square kilometres big. And one thing I know for sure is that the thermometer is NOT giving me the average temperature in that area, no matter which method I use. The best that the thermometer can do is give me the temperature in its exact position. That and only that. So what if the thermometer gave me the exact 13.5C for the average temperature in its position? The average temperature in the area could be 14.5C! So what did I gain?

  32. Nick Stokes says: “GCMs wouldn’t use station readings, and hence not T_max or T_mean. They compute temperature fields from a complete flow model on a regular grid at about 30-min intervals.”

    But Nick, they wouldn’t make up the temperature fields from whole cloth, would they? Or if they did, wouldn’t they have to check it at some point against the observed data? And as you point out, we go with what we have, so wouldn’t they go with the TminTmax data? And haven’t we shown that these don’t match the true means? In somewhat predictable ways, e.g. affected by latitude and RH? So what’s wrong with getting a nice big dataset of 1000 stations globally that have sufficient hourly data to give an idea of the delta compared to Tminmax, creating a model from those measurements that estimates the true mean temperature everywhere, and validating the GCM against those better estimates of temperature than the flawed Tminmax values?

  33. Climate Beagle says: “I successfully downloaded all the station hourly files…”

    Wow, that’s one humongous file–about 8,960,000 records by my calculation. I benefited from NOAA help to get the daily file, which I can share with you if you want to check the daily file that you can create from your hourly file. I also have the NOAA-created monthly file that I can share with you if you want. In return I wouldn’t mind taking a look at your hourly file on Dropbox or other mode. My email is lwallace73@gmail.com.

  34. It is indeed biased, and one of the further reasons I can think of is that mercury-based thermometers will not function below about -38C (-36F). This artificially “warms” the average of Tmin and Tmax as a result, although it’s only a problem that might affect older stations and temperature records. Have you noticed as well that the NOGAPS ground temperature is limited to -50F, which means most of the Antarctic continent might actually be colder? That’s not a problem at first sight, but if you compute an anomaly over this data, it will result in a flawed and unusually high value.

    Regards and thanks for the very interesting article again from you.
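
    The warm bias from a thermometer that cannot read below the mercury floor can be sketched as a simple clipping exercise (all the daily min/max pairs below are invented):

```python
MERCURY_FLOOR_C = -38.0  # roughly where mercury freezes and the thermometer fails

# Hypothetical true daily (Tmin, Tmax) pairs for a very cold stretch.
days = [(-45.0, -30.0), (-42.0, -28.0), (-39.0, -25.0), (-36.0, -22.0)]

def minmax_mean(tmin, tmax):
    return (tmin + tmax) / 2

true_means = [minmax_mean(tmin, tmax) for tmin, tmax in days]
# A mercury thermometer cannot record below the floor, so Tmin is clipped upward
# on the coldest days, pulling the (Tmin+Tmax)/2 estimate warm.
clipped_means = [minmax_mean(max(tmin, MERCURY_FLOOR_C), tmax) for tmin, tmax in days]

bias = sum(c - t for c, t in zip(clipped_means, true_means)) / len(days)
print(f"average warm bias from clipping: {bias:+.2f} C")
```

    The same clipping logic applies to the -50F floor the commenter mentions for NOGAPS ground temperatures: any censored lower tail shows up as a spurious warm anomaly.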

  35. For the past 3 hours, I have been trying to figure out how the error bar for each of the stations in Figure 1 can be so small.

    So I went to the first one, MT-Lewistown, monthly records.
    ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/monthly01/CRNM0101-MT_Lewistown_42_WSW.txt
    48 months of data.
    DeltaT = (Column 8 – column 9) = TMINMAX – Truemean
    Mean DeltaT = -0.656 (that checks)
    Std Dev DeltaT = 0.278.
    Mean Std Error = 0.041
    P10 – P90 Range: -1.0 to -0.37.

    Ok. I can see that you are plotting the Mean Std Error
    But I do not see why that is important. We are not using population mean temperatures. We are using individual monthly TMINMAX vs time.

    The key point in the data is that any given month’s TMINMAX appears to be an estimate of the TRUEMEAN with a std dev error bar of +/- 0.278 deg C.

    Excel Chart Pic: http://i45.tinypic.com/2lnxsi1.png
    There is no trend about whether the error gets bigger or smaller at higher temperatures. Possibly the uncertainty in DeltaT gets bigger at higher temps.
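
    For what it’s worth, the Lewistown numbers quoted above are internally consistent: the standard error of the mean is just the standard deviation over the square root of the number of months, and 0.278/sqrt(48) comes out at about 0.040, matching the quoted 0.041 to within rounding of the standard deviation.

```python
import math

# Check that the MT-Lewistown numbers quoted above hang together:
# standard error of the mean = standard deviation / sqrt(N).
n_months = 48
std_dev = 0.278

std_error = std_dev / math.sqrt(n_months)
print(f"standard error = {std_error:.3f}")
```

    Whether that standard error is the *relevant* uncertainty is of course the substance of the disagreement above.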

  36. Obviously the small number of stations is not enough to give accurate results for the larger area. Yes, it does show trends, yet how accurate are they? Before becoming too happy with the information we need to expand the number of stations with the USCRN pristine conditions. We have already seen, thanks to Anthony and others, that station siting is important. Even though the USCRN stations are sited in pristine places for data collection, I think they need to be even more widely distributed in order to take in more area.

    I am doubtful of results from too small a sampling, even though the results can be used to demonstrate the lack of AGW. Let me be clear in stating I have no faith in the systems already in use, as they have been so heavily manipulated that the information gleaned from them is meaningless. I would simply like to see more USCRN-type sites in order to be confident in the results. I would like to see them doubled with the appropriate spread, and if tripled, the more the merrier. When you think of it, maybe 450 sites is still a pretty small number, but a better degree of accuracy anyway. Placement without bias will be an issue.

  37. Here is MT-Lewistown with data split at 2010 July. (blue 200807 to 201007, brown after 201007)

    (Mean, stddev, mean std err) before (-0.86, 0.42, 0.089)
    (Mean, stddev, mean std err) after (-0.69, 0.33, 0.068)
    I note that these two means are far enough apart to test for significance.

    At the same time, I note that the three most negative DeltaT’s are the first three months that station was in operation. Isn’t that a coincidence! Was the grass growing back?

  38. Geoff Sherrington says: “The mere fact of Tmax and Tmin happening on the same day does not permit unbridled use of their Tminmax values. For example, people have often correlated Tminmax between stations up to 2,000 km apart. However, when you separate out Tmax and Tmin and correlate them separately, you get quite a different picture. You can get an idea of this by looking at the following related exercise using lagged data at one station.
    http://www.geoffstuff.com/Extended%20paper%20on%20chasing%20R.pdf

    Geoff, I’m unclear on exactly what you are suggesting here. I checked out your Melbourne example and have no idea why Tmax had such low autocorrelations compared to Tmin. I did try looking at Spearman correlations of Tmax, Tmin, Tminmax, Truemean, and DeltaT in both the monthly and daily datasets. In the monthly dataset, the correlations were 0.88, 0.88, 0.88, 0.88, and 0.66, respectively. In the daily dataset, they were 0.94, 0.94, 0.96, 0.96, and 0.30. Interesting results, but not particularly in agreement with your Melbourne example. Of course, these are average results across 125 stations, so an individual station could behave rather differently. However, I don’t at this point see how this affects the analysis. We still have daily and monthly delta T values for every station. I should state that the daily and monthly values averaged over the entire station history do result in very slightly different estimates of Delta T for each station due to the different weights (slightly different lengths of a month), but these differences were usually in the range of a few percent.

    Anyway, if you still feel that the autocorrelation will affect the analysis in some way, please let me know.

    By the way, none of the USCRN sites are at airports, so your jet-wash suggestion would have to be checked at some other station.
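
    The lag-1 Spearman check Geoff describes is straightforward to run: correlate each series against itself shifted one day. A self-contained sketch with invented daily data (Tmin persistent, Tmax choppy, mimicking his Melbourne observation):

```python
def rank(values):
    # Rank data (1 = smallest); ties get the average of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman correlation = Pearson correlation of the ranks.
    rx, ry = rank(x), rank(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Invented daily values: Tmin drifts smoothly, Tmax jumps day to day.
tmin = [10, 11, 11, 12, 13, 13, 14, 14, 15, 15, 14, 14]
tmax = [25, 31, 24, 33, 22, 30, 26, 35, 23, 32, 27, 21]

# Lag-1 autocorrelation: each series against itself shifted one day.
print("Tmin lag-1 Spearman:", round(spearman(tmin[:-1], tmin[1:]), 2))
print("Tmax lag-1 Spearman:", round(spearman(tmax[:-1], tmax[1:]), 2))
```

    If real Tmax series behave like the choppy one here while Tmin series persist, the two components carry rather different information, which is the point Geoff is pressing.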

  39. Nylo says:”This is very interesting, but at the same time it is somewhat irrelevant. We don’t care about the actual temperature to measure climate change. We care about anomalies, i.e. how those temperatures vary in time.”

    As I said in the text, and in agreement with Nick Stokes, I am fairly well convinced that the bias is indeed of little interest in calculating trends. But it is a true error (despite Nick’s comment) and therefore could affect our understanding of basic physical aspects of climate studies. For example, if the gradient between the tropics and northern latitudes is different than what we think now based on the Tminmax results, that would affect our calculations of energy transfer between tropics and higher latitudes. And in my view as a physicist, energy is the true fundamental driver here, not temperature.

  40. Willhelm says: “OK, just a non professional observer here, but I have some questions:
    How exactly were these measurements taken? Were they recorded by a human using an “Eyeball Mk I”, or were they recorded using some kind of automatic recording device?
    What is the magnitude of error introduced by each of these methods, and what is the effect on the reading?”

    The triplicate thermometers are platinum-resistance thermometers traceable to NIST standards. Each is in a separate enclosure at the site. To be acceptable, the values must be within 0.3C of each other. See the manual monitoring handbook at ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/documentation/program/

  41. Nylo says: August 30, 2012 at 10:48 pm “We don’t care about the actual temperature to measure climate change.”

    Oh yes we do, Nylo. Some examples.
    1. I know of a case where for decades the newspapers were given values different from those recorded for official use. How do you measure climate change from 2 starting points a degree apart?
    2. Estimation of other parameters from temperature, such as W/m^2. Not so easy with anomalies, is it?
    3. Variations in technique from year to year. The change from liquid-in-glass to thermistor/thermocouple devices, the change from one reading per day to one a minute or more frequently, satellite data: each of these methods requires CARE with temperature measurement, because each can give a different ‘anomaly’.
    4. Spike rejection – can arise and be filtered different ways, once one determines what is the ‘actual temperature’ and how to measure it.
    5. It is simply sloppy science to use non-standard units like “degrees F minus 30-year reference period” unless there are compelling reasons to vary from K. We’ve moved away from strange units like “furlongs per fortnight” for velocity, and most of the world now uses K or C, not F. Where do U stand?

  42. I’m wondering too if more modern AWS stations that employ solid-state sensors may have a different (lower) thermal inertia, which would create the illusion of a rising Tmax.

  43. Roger Pielke Sr has discussed this sort of thing:

    http://pielkeclimatesci.wordpress.com/2012/08/21/comments-on-the-shifting-probability-distribution-of-global-daytime-and-night-time-temperatures-by-donat-and-alexander-2012-a-not-ready-for-prime-time-study/

    However I think you are correct in saying “energy is the true fundamental driver here, not temperature” as Roger Pielke Sr has also discussed in some depth:

    http://pielkeclimatesci.wordpress.com/2012/05/07/a-summary-of-why-the-global-average-surface-temperature-is-a-poor-metric-to-diagnose-global-warming/

    On a very simplistic level one can appreciate that air at low temperature with high humidity could have the same energy content as higher temperature air under drier conditions. Thus I don’t think the temperature alone is a conclusive metric for understanding the climate.

  44. Lance Wallace says: August 30, 2012 at 11:19 pm

    But Nick, they wouldn’t make up the temperature fields from whole cloth, would they? Or if they did, wouldn’t they have to check it at some point against the observed data? And as you point out, we go with what we have, so wouldn’t they go with the TminTmax data?”

    No, of course they don’t make them up. They solve the equations of fluid flow with heat transport, along with half-hourly insolation and IR heat components (and latent heat release, vertical transport modelled etc). All of this on a regular grid, roughly 100 km and half hour intervals. There’s simply no role for any kind of daily mean, and station readings are not used anywhere.

  45. Seems to me if you want to detect AGW you should use Tmin (usually nighttime temps), since air temp at night is wholly dependent on the presence of greenhouse gases. Daytime temps will be higher the less greenhouse gas you have…

  46. The astonishing thing about this whole argument is that it ignores some very basic mathematical ideas in signal processing.

    If you measure an electrical signal that is a proxy for temperature (e.g. from a thermocouple), you have a continuous signal. The rate at which you have to sample that signal is determined by the Nyquist sampling theorem: you have to sample at at least twice the highest frequency in the signal. This can be ensured by filtering the signal so that very short-term variations (a bird flies over the temperature sensor, say) are eliminated.

    If you do not do this, and undersample the signal, all subsequent calculations will not reflect the behaviour of the signal.

    What then do we mean by mean temperature? The mean is the integral of the signal with respect to time, divided by the period over which it is integrated. In a stable system this will converge asymptotically to a stable value. However, this isn’t particularly useful, because we recognise that “the mean might change”, so what we do is low-pass filter the temperature signal so that only low-frequency fluctuations remain visible.

    It is easy to show that decimation of a signal by averaging an epoch, say a day and treating that average as a representation of the signal in question is incorrect and introduces errors into the derived time series.

    This is one of the most basic ideas in the manipulation of signals, is EE101 and seems to be robustly ignored by the temperature community, who frankly should know better.
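
    The decimation point can be illustrated with a synthetic diurnal signal (the temperature profile below is invented): a day that spends most of its time near the minimum with a short warm pulse. The two-point (min+max)/2 summary lands far above the dense-sample mean that approximates the integral.

```python
import math

# Decimation demo: a skewed diurnal cycle sits near its minimum most of the day
# with a short warm spike. Compare the dense-sample ("integral") mean with the
# two-point (min+max)/2 summary.
def temp(t_hours):
    # Narrow warm pulse centred at 14:00 on a 10 C baseline, peak +15 C.
    return 10.0 + 15.0 * math.exp(-((t_hours - 14.0) / 1.5) ** 2)

samples = [temp(t / 12.0) for t in range(24 * 12)]  # every 5 minutes
dense_mean = sum(samples) / len(samples)
two_point = (max(samples) + min(samples)) / 2

print(f"dense mean = {dense_mean:.2f} C, (min+max)/2 = {two_point:.2f} C")
```

    The narrower the pulse, the worse the two-point estimator does, while the dense-sample mean stays pinned to the integral; this is the signal-processing objection in concrete form.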

  47. Ryan says:
    August 31, 2012 at 3:23 am
    Seems to me if you want to detect AGW you should use Tmin (usually nighttime temps), since air temp at night is wholly dependent on the presence of greenhouse gases.
    ======================================================================
    And you factored in UHI for that?

  48. Perhaps I have completely missed the point of Nick Stokes’s comments, but I utterly fail to see how, in any discussion of global warming, there is no role for temperatures in GCMs. Isn’t the whole point of a global warming argument that temperatures are rising? Lance’s work shows, to me anyway, that there is a built-in bias in the way composite (for want of a better word) temperatures are calculated, and that any discussion of global temperatures and global warming must consider the bias and error in the way temperature data is handled. It may turn out to be irrelevant, or it may turn out to be highly relevant, but it must be considered before rejecting it out of hand.

  49. Nick Stokes says:
    August 31, 2012 at 3:01 am

    “No, of course they don’t make them up. They solve the equations of fluid flow with heat transport, along with half-hourly insolation and IR heat components (and latent heat release, vertical transport modelled etc). All of this on a regular grid, roughly 100 km and half hour intervals. There’s simply no role for any kind of daily mean, and station readings are not used anywhere.”

    Not really. They use specific approximations to the governing equations with lots of terms missing. Of course, with codes like GISS Model E, we really DON’T know what they’re solving (and they probably don’t either since they don’t document it anywhere).

    Another thing about the climate “models”. If they do indeed solve essentially the same non-linear equations as a typical CFD code, why do their solutions remain “stable” for 100 years of integration time when we know full well that numerical weather prediction models become chaotic after a week? Could it be crappy, highly diffusive numerical discretizations and “controlling” the solutions through ad hoc logic and unphysical filters and source terms? Well, no one from the warmist camp ever wants to talk about these things (can’t blame them…).

  50. I wonder if this work could be used to determine whether the big reduction in thermometers back in the 90s might have an impact on the trend.

  51. Gunga Din says:
    August 30, 2012 at 8:30 pm

    It looks like the example set by the “Watts et al” paper to put a paper up here to be “fire-proofed” is being followed….
    ________________________
    Willis did it with his Thermostat theory too.

    I think it is a great idea. The best run company that I have ever worked for did it with new products all the time before the product went from the pilot plant stage to production. We caught a heck of a lot of costly mistakes that way.

    It only works however if egos are left at the door. At another company, at a new product presentation, I did my usual critique based on the pilot plant findings and was roundly slammed by the project engineers for attacking their baby. Turned out I was correct and the company lost millions. To save face, I was fired shortly thereafter for not being a “Team Player” – go figure.

    I think we see the same type of attitude problem with “Team Players” in climastrology. Dr. Phil Jones of the UEA CRU sent this in reply to Warwick Hughes when he asked for data. “Why should I make the data available to you, when your aim is to try and find something wrong with it?” From: An Open Letter to Dr. Phil Jones

    This type of attitude in any scientist is absolutely deadly. Unfortunately it is all too common, especially at the PhD level. Max Planck stated (translated from the German): “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it”. Paraphrased: “Science advances one funeral at a time.”

  52. @Mathew W.

    Thing is Matthew, that without any greenhouse effect caused by the gases in our atmosphere, the heat caused by UHI would immediately radiate into space and the measured temperature at night would be close to absolute zero, just as it is on the moon. Of course, most greenhouse gases block incoming radiation during the day to the same extent, thus making daytime temps colder than they would be. Team AGW claims that for CO2 this is different, because CO2 doesn’t block the incoming radiation but does block the outgoing radiation. If this were true, the easiest way to detect it would be to look at the difference in trends between nighttime (usually Tmin) temps and daytime (usually Tmax) temps over time – if CO2 is really doing what Team AGW claim it is doing, then it should cause nighttime temps to be rising much faster than daytime temps. This measurement would remove a lot of the error in temperature measurements over time, since you would be comparing the delta between two measurements made using the same equipment on the same site at roughly the same time (well, during the same 24 hr period, which significantly reduces the chance of site changes). Not seen anybody do it yet though.
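    The delta-trend comparison proposed here reduces to the difference between two least-squares slopes at one site. A minimal sketch follows; the station series below are synthetic placeholders invented to illustrate the arithmetic, not real data:

```python
def slope(ys):
    # Ordinary least-squares slope of ys against its index (C per year here)
    n = len(ys)
    xm = (n - 1) / 2
    ym = sum(ys) / n
    num = sum((i - xm) * (y - ym) for i, y in enumerate(ys))
    den = sum((i - xm) ** 2 for i in range(n))
    return num / den

# 30 years of annual-mean Tmin and Tmax at one hypothetical site
tmin = [5.0 + 0.03 * yr for yr in range(30)]    # nights warming 0.03 C/yr
tmax = [20.0 + 0.01 * yr for yr in range(30)]   # days warming 0.01 C/yr

delta_trend = slope(tmin) - slope(tmax)   # positive under the CO2 hypothesis
```

    Because both series come from the same instrument at the same site, most site-specific biases subtract out of the delta, which is the commenter's point.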

    I don’t think UHI would make much difference to these measurements, since the UHI-sourced heat would be equally blocked by CO2 (or any other greenhouse gas) both during daytime and nighttime. Thus looking for the delta would, in theory, cancel out the impact of UHI when looking for the AGW impact.

  53. Excellent work, and yes, this is something that is perfectly obvious. (T_max + T_min)/2 is nothing but a first-order estimator of the mean of a nonlinear function that itself varies substantially by physical location, local conditions and season.
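    A minimal numerical sketch of that point: for a diurnal curve containing a second harmonic (the curve and its coefficients below are invented for illustration, not fitted to any station), (Tmax+Tmin)/2 systematically misses the true daily mean:

```python
import math

# Hypothetical diurnal curve: a 24 h fundamental plus a second harmonic,
# so the shape is asymmetric about its mean. Coefficients are arbitrary.
def temp(hour):
    w = 2 * math.pi / 24
    return 15 + 8 * math.sin(w * (hour - 9)) + 2 * math.sin(2 * w * hour)

temps = [temp(h / 10) for h in range(240)]   # 0.1 h steps over one day

t_mean = sum(temps) / len(temps)             # "true" daily mean: 15.0
t_minmax = (max(temps) + min(temps)) / 2     # traditional estimate: 17.0
bias = t_minmax - t_mean                     # +2.0 C for this curve
```

    The sign and size of the bias depend entirely on the phase and amplitude of the higher harmonics, which is presumably why the error varies by station and season.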

    I would point out, however, that as I’ve been building simple models of the GHE — ones sufficiently simple that they can convince even Latour and the “Slayers” that it exists and doesn’t violate the 2nd law of thermodynamics — I’ve become more and more convinced that the correct metric for global warming/cooling/climate is enthalpy, not temperature. Yes, enthalpy is LOCALLY directly proportional to temperature, mostly, except for that pesky but highly significant latent heat from water vapor, that sucks energy in and doles it out at constant temperature all of the time and over more than 70% of the globe (all open water surfaces, all wet surface soils).

    This is a critical contributor to the nonlinear temperature variation observed by this study. When the ground e.g. cools to the dew point of the atmosphere above it, dew forms. Dew rejects its latent heat into the ground at that point, blocking the further reduction of surface temperature even as the latent heat continues to be radiated away. The end result is that a patch of ground actually rejects more of the heat that was absorbed during the day by maintaining a warmer e.g. nighttime temperature to lose the daytime energy that was absorbed AS latent heat instead of being reradiated during the day. Latent heat is also extremely important in energy transport, both vertically through the GHG column and laterally from e.g. the oceans to the land. The continental US may have set records for mean warmth (or not, not my purpose to argue but parts of it were pretty hot compared to the historical record if not the hottest) but the Gulf of Mexico was actually abnormally cool.

    I have a theory that this means that this year was unusually efficient at LOSING energy. A hot, dry midwest and southwest radiates heat away from the Earth very efficiently (T^4, recall). The oceans, on the other hand, retain heat and their temperature is a nearly direct measure of the bulk enthalpy of at least the surface layer. The heat capacity of the oceans is enormous and the heat is mixed and stored at depth so it takes a long time to release, where the heat capacity of the ground surface layer is nearly irrelevant — one cold clear night and the ground surface is cold where the ocean remains warm(er) for weeks to months after the weather turns.

    A 2-3 C cooler Gulf going into autumn — which looks about where it is — represents a much lower base temperature from which winter cooling will proceed. The southwest and midwest will very likely cool quickly as fall proceeds, but will then be sandwiched between a normally cold arctic and a Gulf that cools to a significantly lower winter temperature than usual. This means that as a heat reservoir it will have less heat to give up to maintain the warmth of the southEAST US.

    The North Atlantic also appears to be warm, sure, but at least 2C cooler in the tropics than it has been, on average, over the last decade or so (26-29C compared to 30-31C) at this time of year. We’re already experiencing the first signs that fall weather patterns are kicking in in NC – well off of peak summertime temperatures, nights under 21C, less hazy skies. Hurricanes, recall, are HEAT ENGINES and hence actually cool the ocean from which they draw their energy, so even though they aren’t very common this year, Isaac actually left a visibly cooler SST wake of 27C water that is showing no signs of rapidly warming back up to 28-29C. It, too, is currently dumping enormous amounts of latent heat/enthalpy into space (radiated away at the top of the troposphere) and is going to have a strong net cooling effect on the ground it rains on, both by reflecting sunlight and by dumping liquid water that sooner or later will (partially) evaporate from the soils on which it falls.

    These could be signs that the NAO is thinking about changing, as they are definitely a change from the predominant pattern of warming in the Gulf and Atlantic north of the ITCZ. It would be most interesting to know what the Hadley cells are doing this year.

    rgb

  54. Very good work, Lance, indeed.
    I have been working on this problem for almost 4 years now. I fully agree with your findings regarding the bias caused by the method (or “algorithm”, as I call it) compared to the “true” mean, which you can obtain only from hourly, quarter-hourly, or even finer measurements. I found the same by using the papers of Aguilar et al 2003, “GUIDANCE ON METADATA AND HOMOGENIZATION”, and Alisow 1956, “Lehrbuch der Klimatologie” (“Textbook of Climatology”). There you find, for some stations in Puchberg, Austria (a 9-year run), and for a number of places in Russia, measured biases due to the algorithm used.
    One should keep in mind that worldwide about 100 different algorithms (see Griffith 1997, “Some problems of regionality in application of climate change”) have been in use, and several different ones are still in use; the max/min algorithm is only one of them, although a rather widely used one.

    I have written a paper about this “algorithm error” which is accepted by E & E, but not published yet.

    But I don’t agree with your conclusion (1) that the trend might not be influenced by this error, on the grounds that it will cancel out when constructing the anomaly (as you implicitly state).

    Because you will see:
    1. A variation of this error over the full year: it is different every single month. If you use monthly averages and deduct a 30-year station normal from them, the result will show the same fluctuation as before, perhaps somewhat smaller in magnitude.
    2. If the error in the station normal itself differs for various reasons (which it always does, because no station in the world has remained unchanged over such a period), then the anomaly propagates this error in part or in full.
    3. And if you mix all the various anomalies of the world, as is done, to obtain a global mean anomaly, then the various algorithms introduce an averaged systematic error of (cautiously estimated) at least ±0.2 K.

    Finally I agree completely with RCS who wrote:
    It is easy to show that decimation of a signal by averaging an epoch, say a day and treating that average as a representation of the signal in question is incorrect and introduces errors into the derived time series.
    Please excuse my poor English, but I hope I made myself understood.
    regards Michael

  55. Sorry, the cited passage was missing:
    (1) cite “Although the errors documented here are true errors (that is, they cannot be adjusted by time of observation or other adjustments), nonetheless it would not be expected that they have much of a direct effect on trends. After all, if one station is consistently overestimated across the years, it will have the same trend as if the values were replaced by the true values.”
    Michael

  56. Frank K. says: August 31, 2012 at 5:28 am

    Frank, I was explaining why T_minmax and T_mean, in my view, have no role in GCMs. That was the proposition that Lance was raising. Can you point to where GCMs rely on those quantities?

  57. Well, it looks like a lot of work for statisticians. I thought I had read somewhere that (Tmax+Tmin)/2 was NOT a good measure of diurnal variations and so gave the wrong average. So the first thing I looked for was a graph of ‘diurnal variations’; any place, any day. Why not July 4th, 2000 at 1600 Pennsylvania Avenue? Well, it’s not there; nor anywhere else, nor any other day.
    Maybe fig 9; it’s got at least 23 hours, but evidently doesn’t reflect any one day.

    But I’ll take it as gospel, that any place, any day, the diurnal temperature over 24 hours looks like fig 9.

    Well obviously, fig 9 is NOT a 24 hour sinusoid. Looks like it has at least a second harmonic component, and likely higher ones, both odd and even.

    So the diurnal temperature at any place is clearly not a one-cycle-per-24-hours band-limited signal; it has once-per-12, once-per-8, and higher frequency components.

    So TWO samples per 24 hours (Tmax and Tmin) is just barely enough for a perfect sinusoid, but for the second harmonic component the undersampling is a factor of two, and worse for the higher frequencies. So calculating an average, which is the zero-frequency component, is already corrupted by aliasing noise; your min/max stuff is not even real data, and all this statistics is being applied to essentially meaningless random numbers that do not comprise real data.

    You have to comply with the Nyquist criterion FIRST, before you even have data to statisticate on.
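    The Nyquist point can be demonstrated directly: for a pure 24 h sinusoid, (max+min)/2 recovers the mean exactly, but adding a second harmonic (with an arbitrary amplitude and phase chosen here purely for illustration) breaks it:

```python
import math

def mean_and_minmax(signal, n=2400):
    # Dense sampling over one 24 h period: "true" mean vs (max+min)/2
    temps = [signal(24 * i / n) for i in range(n)]
    return sum(temps) / n, (max(temps) + min(temps)) / 2

W = 2 * math.pi / 24

def pure(h):                 # band-limited to one cycle per day
    return 15 + 10 * math.sin(W * h)

def with_harmonic(h):        # second harmonic added, phase-shifted
    return 15 + 10 * math.sin(W * h) + 3 * math.cos(2 * W * h)

m1, mm1 = mean_and_minmax(pure)            # mm1 == m1: two samples suffice
m2, mm2 = mean_and_minmax(with_harmonic)   # mm2 != m2: bias of about -2.9 C
```

    Note that if the second harmonic happens to be in phase with the fundamental (an odd-symmetric curve), the bias vanishes, so the size of the aliasing error is entirely a matter of the local diurnal shape.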

    Well so much for temporal sampling; what about spatial sampling.

    Wow, I really love those four Alaska weather station temperature samples; how come so many? Didn’t Briffa find that he only needed one Yamal Charlie Brown Christmas tree to determine the global climate?

    What is it they say ; “A noisy noise annoys an oyster !” Saw a mathematical proof of that once.

    I’d like to see this level of analysis applied to the Manhattan Telephone directory, to see what one can learn about the history of Telephone numbers.

  58. Ryan says:
    August 31, 2012 at 6:52 am

    I don’t think UHI would make much difference to these measurements, since the UHI sourced heat would be equally blocked by CO2 (or any other greenhouse gas) both during daytime and nightime. Thus looking for the delta would, in theory, cancel out the impact of UHI when looking for the AGW impact.
    =======================================================================
    Of course I could be wrong, but I thought it had already been shown that areas with UHI had higher (warmer) Tmin temps and that was one reason for the alleged warming trend.

  59. Lance, I’ve spent a lot of time looking at DTR from the NCDC’s Global Summary of Days data, and find over a large number of stations it is very consistent. If you follow the link in my name, you can see what I’ve done.

  60. Nick Stokes says:
    August 31, 2012 at 3:01 am

    “No, of course they don’t make them up. They solve the equations of fluid flow with heat transport, along with half-hourly insolation and IR heat components (and latent heat release, vertical transport modelled etc). All of this on a regular grid, roughly 100 km and half hour intervals. There’s simply no role for any kind of daily mean, and station readings are not used anywhere.”

    and Nick Stokes says:
    August 31, 2012 at 10:10 am
    Frank K. says: August 31, 2012 at 5:28 am

    Frank, I was explaining why T_minmax and T_mean, in my view, have no role in GCMs. That was the proposition that Lance was raising. Can you point to where GCMs rely on those quantities?

    Nick, this is really maddening. Are you saying the GCM modelers never check their work? At some point, don’t they have to compare their results with measured temperatures? At least historical retrodictions. And that means they compare to temperatures determined by Tmin & Tmax. Which means they are tuning their models to a set of somewhat biased numbers, which are not just random but may vary by latitude and RH.

  61. Michael Limburg says:
    August 31, 2012 at 9:18 am

    “But I don’t agree with your conclusion (1) that the trend might not be influenced by this error, on the grounds that it will cancel out when constructing the anomaly (as you implicitly state).

    Because you will see:
    1. A variation of this error over the full year: it is different every single month. If you use monthly averages and deduct a 30-year station normal from them, the result will show the same fluctuation as before, perhaps somewhat smaller in magnitude.
    2. If the error in the station normal itself differs for various reasons (which it always does, because no station in the world has remained unchanged over such a period), then the anomaly propagates this error in part or in full.
    3. And if you mix all the various anomalies of the world, as is done, to obtain a global mean anomaly, then the various algorithms introduce an averaged systematic error of (cautiously estimated) at least ±0.2 K.”

    Michael, thanks for your comment. There may in fact be some effect on the trend; I just haven’t been able to think of an obvious way such an effect could arise. As I alluded to, the USCRN timescale is far too short to create a 30-year station normal and resulting anomalies. Also, as Mosher pointed out, the latitude effect on the error is constant for a given station, so no effect on the trend there. But on the other hand, the model presented explained only about 30% of the variation, so there are obviously other effects on deltaT that could conceivably affect a trend. You have apparently done the work and found an effect, which is very interesting. I look forward to your paper in Energy and Environment. If you care to share a preprint with me, which I would keep confidential, my email is lwallace73@gmail.com .

  62. My 12:28 am above was a late-night blunder. Here is a correction post.

    Here is MT-Lewistown with data split at 2010 July. (blue 200807 to 201007, brown after 201007)

    (Mean, stddev, mean std err) before (-0.62, 0.21, 0.05)
    (Mean, stddev, mean std err) after (-0.69, 0.33, 0.07) (unchanged)
    These two means are not significantly different.
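    For what it’s worth, the “not significantly different” claim checks out with a simple two-sample z-test on the reported means and standard errors (the numbers are copied from the comment; the test form is a standard approximation assuming independent periods):

```python
import math

m_before, se_before = -0.62, 0.05   # 200807 to 201007
m_after, se_after = -0.69, 0.07     # after 201007

# z-statistic for the difference of two independent means
z = (m_before - m_after) / math.sqrt(se_before ** 2 + se_after ** 2)
# |z| is about 0.81, well below 1.96, so not significant at the 5% level
```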

  63. Lance Wallace says: August 31, 2012 at 12:38 pm

    “At some point, don’t they have to compare their results with measured temperatures?”
    Not for the ongoing functioning of the GCM. But yes, for evaluation, which anyone with access to the output can do afterwards. And for that you can test any temperature statistic you have observations to match. T_max, T_mean, T_noon, anything. Because you have a stream of half-hourly GCM values. They would usually choose a regional measure rather than individual station values.

    Of course, some notion of observed temperature is used to initialize GCMs. But it is acknowledged that we can never get an initial state completely. That’s why GCMs are run for a settling in period before the time stretch to be modelled.

    • OK, I’m glad we agree that at some point the GCM output is matched against temperatures, as well as other climatic variables. Next question: does tuning occur? I imagine if the model doesn’t do well at retrodicting the past, it won’t be favored–so it might have to be tweaked. Now suppose it is trying to retrodict temperatures in say, a landlocked Northern country or perhaps a tropical one, where the Tminmax method may have been used. If the latitude effect applies, it will be trying to match temperatures that are biased low in one country and high in the other. Could this be a problem? Of course, the effect may be trivial, I don’t know.

      What I would like to see is a much larger database of global stations with both Tminmax and Truemean (perhaps every 3 or 6 hours could substitute for continuous measures here) measured over a long period. Then a model could test whether the latitude or RH effects noted for the USCRN database could be confirmed, or other significant parameters be identified. Then if the model was good enough to predict the true temperature field (a tall order), the GCMs could be tested against that rather than the suboptimal Tminmax field.

      • My understanding of the history, is that gcms did not match rising temps with increasing co2, and that Hansen added a climate sensitivity multiplier to “fix” them.

  64. I love it !! N = 3 ROFL and the error is already larger than the purported global warming signal.

    Between this study and the comments, the reality of how difficult it is to actually MEASURE temperature is laid bare and it’s about time.

    I have only 2 points to add to the discussion:
    1) daily temperature “population” is NOT normally distributed and therefore parametric statistics have no business being used in temperature studies like these.

    2) according to any and all sampling standards, 3 samples is a pitifully small fraction of the ACTUAL NEEDED number of samples to estimate the really TRUE mean daily temperature, given the variance.
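    As a rough illustration of point 2, the textbook sample-size formula n = (z·σ/E)² gives a feel for the shortfall. The within-day standard deviation σ below is an assumed value, not a measurement, and the formula assumes independent samples (hourly temperatures are autocorrelated), so treat this as an order-of-magnitude sketch:

```python
import math

sigma = 6.0   # C, assumed within-day spread of temperatures (illustrative)
E = 0.5       # C, desired half-width of the confidence interval
z = 1.96      # 95% confidence

# Independent-sample requirement; autocorrelation changes the effective n
n = math.ceil((z * sigma / E) ** 2)   # hundreds of samples, vs 2 in a min/max record
```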

    However, this study is vastly superior to the usual anecdotal attempts to describe the behavior of land temperatures. So thanks for that, very much !

    By the way, the photo says the station name is LEWISTON, MT, there is no town named Lewistowne here, and the site is not especially close to Lewistown, MT either. A closer town would be Hobson, MT but who knows why any name is selected. It is just like temperature perhaps – completely to somewhat arbitrary.

  65. Geoff Sherrington says:
    August 31, 2012 at 1:12 am

    1. I know of a case where for decades the newspapers were given values different to those recorded for official use. How do you measure climate change from 2 starting points a degree apart?
    You cannot use anomalies by combining records of two different stations unless you understand why those two stations give different readings and adjust accordingly. I don’t know how using the true averages would change that.

    2. Estimation of other parameters from temperature, such as W/m^2. Not so easy with anomalies, is it?
    Actually, it is impossible, either with true average temperatures or with anomalies. If you calculate the emitted power of the area surrounding a thermometer from the average temperature of that thermometer, you are calculating it wrong. What you really need in order to get the correct value is to integrate the emission function (dependent on T^4) over time. And doing so would only give you the emission of a tiny area surrounding the thermometer. It tells you nothing about the area 1 km away, as you don’t know the actual temperature there. So all you can do is guess. Given that it is an impossible problem, getting slightly closer does little to help. Error bars will still be huge.
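    The T^4 point is just Jensen’s inequality and is easy to demonstrate: averaging the temperature first and then applying the Stefan-Boltzmann law underestimates the time-averaged emission. The sample temperatures below are invented for illustration:

```python
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

temps_k = [268.0, 278.0, 288.0, 298.0, 308.0]   # hypothetical samples, K

mean_t = sum(temps_k) / len(temps_k)
power_from_mean_t = SIGMA * mean_t ** 4                            # wrong way
mean_power = sum(SIGMA * t ** 4 for t in temps_k) / len(temps_k)   # right way

# mean_power exceeds power_from_mean_t by roughly 1.4% for this spread;
# the gap grows with the size of the diurnal swing
```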

    3. Variations in technique from year to year. The change from liquid-in-glass to thermistor/thermocouple devices, the change from one reading per day to one a minute or more frequently, satellite data – each of these methods requires CARE with temperature measurement because each can give a different ‘anomaly’.
    Correct, and again, I don’t know how using the true average, which is calculated from the same thermometers, would make it any better. If because of changing technology you start to get Tmax/Tmin 1C higher, that will also be true regarding average temperatures.

    4. Spike rejection – can arise and be filtered different ways, once one determines what is the ‘actual temperature’ and how to measure it.
    A spike that appears in average temperature also appears as a spike in the anomalies, so again there is no advantage about this.

    5. It is simply sloppy science to use non-standard units like “degrees F minus 30-year reference period” unless there are compelling reasons to vary from K. We’ve moved away from strange units like “furlongs per fortnight” for velocity, and most of the world now uses K or C, not F. Where do you stand?
    I’ve lived in Spain all my life. Here nobody uses degrees F. But if we did, it would still make no difference regarding the benefits of using true averages vs Tmax/Tmin.

    Regards.

  66. Lance writes “Now suppose it is trying to retrodict temperatures in say, a landlocked Northern country or perhaps a tropical one, where the Tminmax method may have been used. If the latitude effect applies, it will be trying to match temperatures that are biased low in one country and high in the other. Could this be a problem? ”

    It’s well known GCMs do poorly at modelling regional temperatures, but the modellers don’t think this is a problem. When their temperatures are all summed up to a global level, the result is within cooee of the same summed-up, error-prone, smeared-out values from the “real world’s past”, and that’s enough for them to argue their model has value.

    Yes, they’re tuned. They don’t think this is a problem either.

    Well, duh, the Tmax and Tmin will always be affected by the presence of water, i.e. the availability of humidity. This is what happens when you use units of measure that only indicate partial heat. The bulk of the atmosphere’s heat is contained by humidity, and thus any change in specific humidity will greatly change the Tmax and Tmin. The proper measurement is in Enthalpy, NOT Temperature. Qmax and Qmin give the TOTAL heat and thus the true mean of the energy content of the air. Hence any temperature-anomaly-based claim of warming or cooling is meaningless.

    The idea that a 10 degree rise in Pt. Barrow, Alaska is equal to a 10 degree rise in Tampa, Florida is an absurdity that anyone with an education in the physical sciences should catch. Maybe Hansen and Schmidt should take a basic course in Thermodynamics to learn what the proper units of measure are for any given observation based claim.
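    The “partial heat” point can be made concrete with moist enthalpy per kilogram of dry air, h ≈ cp·T + L·w. The sketch below uses a standard Magnus-type saturation-pressure approximation; the two stations and their readings are invented for illustration:

```python
import math

CP = 1005.0    # J kg^-1 K^-1, specific heat of dry air
LV = 2.5e6     # J kg^-1, latent heat of vaporization
P = 101325.0   # Pa, surface pressure

def mixing_ratio(t_c, rh_pct):
    # Magnus approximation for saturation vapor pressure (Pa)
    e_sat = 611.2 * math.exp(17.62 * t_c / (243.12 + t_c))
    e = rh_pct / 100.0 * e_sat
    return 0.622 * e / (P - e)   # kg water vapor per kg dry air

def moist_enthalpy(t_c, rh_pct):
    return CP * t_c + LV * mixing_ratio(t_c, rh_pct)   # J/kg, relative to 0 C

humid = moist_enthalpy(30.0, 90.0)   # e.g. a Gulf-coast afternoon
arid = moist_enthalpy(30.0, 15.0)    # a desert afternoon, same thermometer reading
# Same 30 C reading, but the humid air carries far more heat per kilogram
```

    This is why a one-degree anomaly at a humid station and at an arid one represent very different changes in energy content.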

  68. The error due to the minmax filter on a daily basis would be greater than that shown here, as it is filtered by a monthly smooth. The smoothing by using monthly differences filters some of the high frequencies that are part of the daily minmax filter noise, thus reducing variance. The reduction in variance is not due to a cancellation of opposite errors. I would submit that the smoothing using months from the Julian calendar would add additional noise, as the smoothing window is not constant (it varies between 28, 29, 30 and 31 days), the smoothing is done at irregular intervals (i.e. between 28 and 31 days) and the smoothing is quantized at the same monthly intervals, as opposed to, for example, a straight 30-day smooth, where the window is 30-days wide and is shifted one day at a time. Also, only 80% of the variance (roughly) is contained within the “10-90 percentile range,” as opposed to showing a range containing 95% of the variance (2 sigmas assuming a normal probability distribution).

    The analysis of minmax bias is by far the most interesting part of this study. Showing the importance of relative humidity provides statistical support to those who have argued that moist enthalpy is a better measure than dry bulb temperature. It is hard to overemphasize the importance of this result.

    My only criticism involves the following statement:

    Although the errors documented here are true errors (that is, they cannot be adjusted by time of observation or other adjustments), nonetheless it would not be expected that they have much of a direct effect on trends. After all, if one station is consistently overestimated across the years, it will have the same trend as if the values were replaced by the true values.

    I would strongly disagree with the notion that there would be no effect on trends. Theoretically, the statement might be substantially accurate, but only if you use exactly the same stations during the entire century or so period of interest. As E.M. Smith has documented extensively, there have been numerous changes over the past century or so in the stations used to calculate trends, including the “great dying of thermometers” quite recently. The biases found in this study would seem to be significant enough that such stations changes should be assumed to significantly impact trend estimates unless proven otherwise.

    • Phil,
      I agree completely with you. That’s exactly what you will find out if you go deeper into this subject of station treatment and condition during the time they are on duty.
      dScott,
      You are also right. What Hansen and all the other statisticians overlooked is that temperature is a property of matter, and additionally an intensive quantity. They prefer to calculate just patterns within a bunch of figures, forgetting totally that these figures are just local indicators of physical processes which are not in thermodynamic equilibrium. So they do not represent temperatures at all.

  69. From an engineers perspective it is pretty clear that what is going on here is statistical fraud.

    What they should be aiming to present is the average temperature related to the presence of the greenhouse gas CO2.

    What they are actually measuring is the average temperature related to the presence of greenhouse gases, clouds and wind.

    In the UK a summer’s day without cloud could rise to about 36 Celsius, but would only be about 14 Celsius on a cloudy day, and even less if there is a strong wind from the north. In other words they are looking for a tiny CO2-related signal in the presence of a huge noise signal, but then presenting the information as if there were no error in the CO2-related signal due to the presence of so much noise. The only attempt they are making to reduce the noise is averaging, but averaging is a simple filtering process that will retain much of the input noise at the filter output. This summer in the UK is a fantastic example of this – it has been one of the cloudiest on record – almost every day has been cloudy, and so the output of the simple averaging filter will merely record the fact that this was a cloudy year with subsequently low daytime temperatures. CO2 didn’t really play a part in the UK summer – it was dominated by cloud.

    Put the real error bars into their charts due to this noise and you would see that their attempts to measure CO2-related trends in the presence of such a huge amount of noise are doomed to failure. They don’t put in such error bars. They don’t even put in error bars related to the inaccuracies of using mercury thermometers in Stevenson screens for measuring temperature absolutes. Their error bars are simply related to the way rounding errors in the calculations can occur.

    I don’t condemn Team AGW for doing this kind of thing so much as the scientists who handle statistical analysis of data every day and aren’t prepared to stand up and denounce this kind of schoolboy error.

  70. Lamentably late, but my condensed replies to some questions above are:
    Is Australian data DIFFERENT? Read http://www.bishop-hill.net/blog/2011/11/4/australian-temperatures.html

    My main objection to the anomaly method is that it involves an extra term derived from error-containing data, a needless step that adds noise.
    For Lance Wallace says: August 31, 2012 at 12:29 am: One can examine temperature correlations in the time domain or the space domain, because over time weather systems move over weather stations. While there are several papers showing correlations between stations in the space domain, I had not seen anything precisely like my analysis of one station over lagged times. The results are a shocker. If Tmax correlates so poorly from one day to the next, then we have an effect (which I will not try to guess) that more or less invalidates using space correlations as BEST did. We should be seeing far worse correlations than we do, for weather stations separated typically by the movement of a system during one day. There is possibly another factor at play. Given that Tmax jumps around so much from day to day, a calculation of Tminmax also has a problem.
    What are your results like when you don’t use rank correlations? Can you email me your calculations? sherro1 at optusnet com au
    For Nylo says: August 31, 2012 at 11:25 pm
    Yes, I mostly agree with you, but you have the added complication of both errors and changes happening in the reference period. 1. Newspaper reports – if the reference period is dated after the wrong data were given to newspapers, then it will give a different anomaly to the official anomaly. If the reference period is within the era of wrong reporting, you might get near-identical anomaly residuals, but it would be a case of wrong method + good luck = right answer. Using absolutes won’t remove the error, just the complexity of finding and understanding it.
    2. W/m^2 – It is wrong to make this conversion, as others have long noted, but it is still a widespread wrong. Again, the complications increase if you use a reference period derived from one instrument type and apply it to another. It’s again an unwanted complication to track down who used what and when.
    Paras 3-5: Much as before, the added complication over using (say) absolutes is error-prone and requires extra checking. Mistakes do happen. I’m pleading to go back to basics to reduce them.

    • Geoff Sherrington says:
      September 3, 2012 at 11:19 pm
      “What are your results like when you don’t use rank correlations? Can you email me your calculations? sherro1 at optusnet com au

      Geoff, I tried emailing the Pearson and Spearman correlations to the address listed, also to optus.net.au, also to optus.com.au but all 3 were bounced back.

      The correlations (1-day lag) for temperature (Tmax, Tmin, Tminmax, Truemean) were all high (0.86-0.94), whether Pearson or Spearman. The correlations were very low for DeltaT (0.1-0.3) and for the diurnal variation, as you indicated (-0.06 to 0.6).

  71. Phil says:
    September 2, 2012 at 10:57 pm

    “I would strongly disagree with the notion that there would be no effect on trends. Theoretically, the statement might be substantially accurate, but only if you use exactly the same stations during the entire century or so period of interest. As E.M. Smith has documented extensively, there have been numerous changes over the past century or so in the stations used to calculate trends, including the “great dying of thermometers” quite recently. The biases found in this study would seem to be significant enough that such stations changes should be assumed to significantly impact trend estimates unless proven otherwise.”

    OK, I was not thinking of the problem of calculating a trend when the stations are changing. To me, the way to calculate it would be on a station-to-station basis, taking into account any changes in the location or other metadata aspects of the station. We would then have 36,000 or so “trends” (using the BEST stations), rather than a single global trend. Indeed this was how Lubos Motl treated the BEST data when it first came out, showing that about 12,000 stations (IIRC) had declining trends over the full period of the station (which again IIRC was something like an average of 80 years). Lubos also created a Voronoi diagram showing where the declining trends occurred.

    However, a single “global” trend is apparently desired and so we have this situation of “homogenizing” stations across a large spatial swathe. The number of stations is certainly changing, sometimes by huge numbers in a short time, so it seems possible that Phil and Michael may be right. If there were sufficient global stations with long records of monitoring Tmax, Tmin, and at least 3 or 4 hours during the day, along with RH at least, there would be a chance of testing their hypothesis.

Comments are closed.