Guest post by Lance Wallace
Abstract
The traditional estimate of temperature at measuring stations has been to average the highest (Tmax) and lowest (Tmin) daily measurements. This leads to error in estimating the true mean temperature. What is the magnitude of this error and how does it depend on geographic and climatic variables? The US Climate Reference Network (USCRN) of temperature measuring stations is employed to estimate the error for each station in the network. The 10th-90th percentile range of the errors extends from -0.5 to +0.5 C. Latitude and relative humidity (RH) are found to exert the largest influences on the error, explaining about 28% of the variance. A majority of stations have a consistent under- or over-estimate during all four seasons. The station behavior is also consistent across the years.
Introduction
Historically, temperature measurements used to estimate climate change have depended on thermometers that record the maximum and minimum temperatures over a day. The average of these two measurements, which we will call Tminmax, has been used to estimate a mean daily temperature. However, this simple approach will have some error in estimating the true mean (Tmean) temperature. What is the magnitude of this error? How does it vary by season, elevation, latitude or longitude, and other parameters? For a given station, is it random or consistently biased in one direction?
Multiple studies have considered this question. Many of these are found in food and agriculture journals, since a correct mean temperature is crucial for predicting ripening of crops. For example, Ma and Guttorp (2012) report that Swedish researchers have used a linear combination of five measurements (daily minimum, daily maximum, and measurements taken at 6, 12, and 18 hours UTC) since 1916 (Ekholm, 1916), with later revisions (Modén, 1939; Nordli et al., 1996). Tuomenvirta et al. (2000) calculated the historical variation (1890-1995) of Tmean – Tminmax differences for three groups of Scandinavian and northern stations. For the continental stations (Finland, Iceland, Sweden, Norway, Denmark), average differences across all stations were small (+0.1 to +0.2 °C) beginning in 1890 and dropped close to 0 from about 1930 on. However, for two groups of mainly coastal stations in the Norwegian islands and West Greenland, they found strongly negative differences (-0.6 °C) in 1890, falling close to zero from 1965 on. Other studies have considered different ways to determine Tmean from Tmin, Tmax and ancillary measurements (Weiss and Hays, 2005; Reicosky et al., 1989; McMaster and Wilhelm, 1997; Misra et al., 2012). Still other studies have considered Tmin and Tmax in global climate models (GCMs) (Thrasher et al., 2012; Lobell et al., 2007).
This short note examines these questions using the US Climate Reference Network (USCRN), a network of high-quality temperature measurement stations operated by NOAA. It began around 2000 with a single station and reached a total of about 114 stations in the continental US (44 states) by 2008. There are also 4 stations in Alaska, 2 in Hawaii, and one in Canada meeting the USCRN criteria. Four more stations in Alaska have been established, bringing the total to 125 stations, but these have only 2-3 years of data at this writing. A regional (USRCRN) network of 17 stations has also been established in Alabama and has about 4 years of data. All 142 of these stations were used in the following analysis, although at times the 121- or 125-station dataset was used. The stations are located in fairly pristine areas meeting all criteria for weather stations. Temperature measurements are taken in triplicate, and other measures at all stations include precipitation and solar radiation. Measurements of relative humidity (RH) were instituted in 2007 at two stations and by about 2009 were being collected at the 125 sites in the USCRN network but not at the Alabama (USRCRN) network. A database of all measurements is publicly available at ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/. The database includes hourly, daily, and monthly results. This database, together with single compilations of multiple files kindly supplied by NOAA, was used for the following analysis.
Methods
The monthly data for the 142 stations were downloaded one station at a time and joined together in a single database. (Note: at present, the monthly data are only available to the public as separate files for each station. Daily data are available as separate files for each year for each station. This requires 142 separate downloads for the monthly data, and about 500 or so downloads for the daily data. Fortunately, a NOAA database manager was able to provide the daily data as a single file of about 373,000 records.)
The hourly data include the maximum and minimum 5-minute average temperatures recorded each hour as well as the mean temperature averaged over the hour. The daily data include the highest 5-minute maximum and the lowest 5-minute minimum temperatures recorded in the hourly data that day (i.e. Tmax and Tmin) together with the mean daily temperature (Tmean). The average of Tmax and Tmin ({Tmax+Tmin}/2) is also included for comparison with the true mean. The monthly data include the maximum and minimum temperatures for the month; these are averages of the daily Tmax and Tmin values (each the highest or lowest 5-minute average observed that day). The monthly data also include an estimate of the true mean monthly temperature and the monthly average temperature computed from the monthly Tmax and Tmin. The difference between the daily Tminmax and the true mean will be referred to as Delta T:
DeltaT = (Tmin+Tmax)/2 – Truemean
Data were analyzed using Excel 2010 and Statistica v11. For each station, the entire length of the station’s history was used; the number of months ranged from 47 to 132. Since the relationship between the true mean and Tminmax may vary over time, these were compared by season, where Winter corresponds to January through March and so on. The diurnal temperature range (DTR) was calculated for each day as Tmax-Tmin. For the two stations with the highest and lowest overall error, the hourly data were downloaded to investigate the diurnal pattern.
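The definitions above can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the analysis (which was done in Excel and Statistica). The function names are hypothetical, the true mean is assumed here to be the average of the 24 hourly means, and the seasons after Winter are assumed to follow in three-month blocks.

```python
from statistics import mean

def daily_summary(hourly_means, hourly_maxes, hourly_mins):
    """Summarize one day of hourly records (temperatures in deg C).

    hourly_means: mean temperature of each hour
    hourly_maxes: highest 5-minute average within each hour
    hourly_mins:  lowest 5-minute average within each hour
    """
    tmax = max(hourly_maxes)          # daily Tmax
    tmin = min(hourly_mins)           # daily Tmin
    true_mean = mean(hourly_means)    # assumption: true mean = mean of hourly means
    tminmax = (tmax + tmin) / 2
    return {
        "Tmax": tmax,
        "Tmin": tmin,
        "Truemean": true_mean,
        "Tminmax": tminmax,
        "DeltaT": tminmax - true_mean,  # positive = minmax overestimates
        "DTR": tmax - tmin,             # diurnal temperature range
    }

def season_of(month):
    """Map month 1-12 to a season, with Winter = January through March.

    Assumption: Spring = Apr-Jun, Summer = Jul-Sep, Fall = Oct-Dec.
    """
    return ("Winter", "Spring", "Summer", "Fall")[(month - 1) // 3]
```

A day that is cool for 18 hours and warm for 6 gives a positive DeltaT under these definitions, because the midpoint of the extremes sits above the time-weighted mean.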
Results
As of Aug 11, 2012 there were 12,305 station-months and 373,975 station-days from 142 stations. The metadata for all stations are available at the Website http://www.ncdc.noaa.gov/crn/docs.html.
Delta T averaged over all daily measurements for each station ranged from -0.66 °C (Lewistown, MT) to +1.38 °C (Fallbrook, CA, near San Diego) (Figure 1). A negative sign means the minmax approach underestimated the true mean. About as many stations overestimated (58) as underestimated (63) the true mean.
Figure 1. DeltaT for 121 USCRN stations: 2000-August 5, 2012. Error bars are standard errors.
A histogram of these results is provided (Figure 2). The mean was 0.0 °C, with an interquartile range of -0.2 to +0.2 °C. The 10th-90th percentile range was -0.5 to +0.5 °C.
Figure 2. Histogram of Delta T for 121 USCRN stations.
Seasonal variability was surprisingly low: in more than half of the 121 stations with at least 47 months of complete data, the Tminmax either underestimated (28 sites) or overestimated (39 sites) the true mean in all 4 seasons. Most of the remaining stations were also weighted in one direction or another; only 20 stations (16.5%) were evenly balanced at 2 seasons in each direction. 16 of these 20 were negative in winter and spring, positive in summer and fall. Over all 121 stations, there was a slight tendency for underestimates to be favored in winter and spring, with overestimates in summer and fall (Figure 3).
Figure 3. Variation of Delta T by season.
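The seasonal tally above (28 + 39 = 67 of 121 stations, about 55%, with the same sign in all four seasons) amounts to a simple sign count per station. A minimal sketch follows, assuming each station is represented by its four seasonal mean Delta T values; the function name and category labels are hypothetical.

```python
def classify_station(seasonal_delta_t):
    """Classify a station from its four seasonal mean DeltaT values (deg C).

    seasonal_delta_t: values for winter, spring, summer, fall.
    """
    positives = sum(1 for d in seasonal_delta_t if d > 0)
    negatives = sum(1 for d in seasonal_delta_t if d < 0)
    if positives == 4:
        return "overestimate in all seasons"
    if negatives == 4:
        return "underestimate in all seasons"
    if positives == 2 and negatives == 2:
        return "evenly split"
    return "mixed"  # 3-1 splits, or any season exactly at zero
```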
Since Delta T was determined by averaging all values over all years for each station, the possibility remains that stations may have varied across the years. This was tested by comparing the average Delta T for each station across the years 2008-9 against the average in 2010-11. The result showed that the stations were very stable across the years, with a Spearman correlation of 0.974 (Figure 4).
Figure 4. Comparison of Delta T for each station across consecutive 2-year periods. N = 140 stations.
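For readers who want to repeat the year-to-year stability check on their own extract of the data, a Spearman correlation is the Pearson correlation of the ranks. The sketch below uses only the standard library and assigns average ranks to ties; it is an illustration, not the Statistica routine used for the figure, and the function names are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Here `x` would hold each station's mean Delta T for 2008-9 and `y` the same stations' means for 2010-11, in the same station order.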
When Delta T is mapped, some quite clear patterns emerge (Figure 5). Overestimates (blue dots) are strongly clustered in the South and along the entire Pacific Coast from Sitka, Alaska to San Diego, also including Hawaii. Underestimates (red dots) are located along the extreme northern tier of states from Maine to Washington (excepting the two Washington stations west of the Cascades) and all noncoastal stations west of Colorado’s eastern border.
Figure 5. DeltaT at 121 USCRN stations. Colors are quartiles. Red: -0.66 to -0.17 C. Gold: -0.17 to 0 C. Green: 0 to +0.25 C. Blue: +0.25 to +1.39 C.
Figure 5 suggests that the error has a latitude gradient, decreasing from positive to negative as one goes north. Indeed, a regression shows a highly significant (p<0.000002) negative coefficient of -0.018 °C per degree of latitude (Table 1, Figure 6). However, other variables clearly affect DeltaT, as shown by the adjusted R² value indicating that latitude explains only 21% of the observed variance.
Table 1. Regression of DeltaT (Tminmax-True mean) on latitude
N = 142 stations. Regression summary for dependent variable DELTAT: R = .467, R² = .218, Adjusted R² = .212, F(1,140) = 38.9, p < .00000, Std. error of estimate: .278

| | b* | Std.Err. of b* | b | Std.Err. of b | t(140) | p-value |
|---|---|---|---|---|---|---|
| Intercept | | | 0.75 | 0.11 | 6.6 | 0.000000 |
| LATITUDE | -0.466 | 0.075 | -0.018 | 0.002 | -6.2 | 0.000000 |

* Standardized regression results (μ=0, σ=1)
Figure 6. Regression of DeltaT on Latitude.
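As a worked reading of Table 1, the fitted line is DeltaT = 0.75 - 0.018 × latitude. The short sketch below evaluates it using the rounded coefficients as printed, so results are approximate; the function names are hypothetical. With a standard error of estimate of 0.278 °C, any single station can sit well off this line.

```python
# Rounded coefficients as printed in Table 1 (deg C, and deg C per degree latitude).
INTERCEPT = 0.75
SLOPE = -0.018

def predicted_delta_t(latitude_deg):
    """DeltaT predicted from latitude alone by the Table 1 regression line."""
    return INTERCEPT + SLOPE * latitude_deg

def zero_crossing_latitude():
    """Latitude at which the fitted line changes sign."""
    return -INTERCEPT / SLOPE

for lat in (30, 40, 48):
    print(lat, round(predicted_delta_t(lat), 3))
print("sign change near latitude", round(zero_crossing_latitude(), 1))
```

With these rounded values the line crosses zero at roughly 42 degrees north: on average the minmax method overestimates south of that latitude and underestimates north of it.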
Therefore a multiple regression was carried out on the measured variables within the monthly datafile. The Spearman correlations of these variables with DeltaT are provided in Table 2. The largest absolute value of the Spearman coefficient was with latitude (-0.375), but other relatively high correlations were noted for Tmin (0.308) and RHmax (0.301). However, TMIN, TMAX, TRUEMEAN and DTR could not be included in the multiple regression, since they (or their constituent variables in the case of DTR) appear on the left-hand side as part of the definition of DELTAT. Also, the three RH variables were highly collinear, so only RHMEAN was included in the multiple regression. Finally, because Alaska and Hawaii have such extreme latitude and longitude values, they were omitted from the multiple regression. These actions left 3289 station-months (out of 3499 total) and 6 measured independent variables, of which 4 were significant. Together they explained about 30% of the measured variance (Table 3, Figure 7). Latitude and RH were the two main explanatory variables, together explaining 28% of the variance, with about equal contributions as judged from the t-values. When the multiple regression was repeated for each season, in fall and winter the four significant and two nonsignificant variables were identical to those in the annual regression, with adjusted R² values of 19-20%, but in spring and summer all six variables were significant, with R² values of 47-50%. In all seasons, however, the two dominant variables were latitude and RH.
Table 2. Spearman correlations of measured variables with DeltaT.
| VARIABLE | DELTAT |
|---|---|
| LONGITUDE (degrees) | 0.075 |
| LATITUDE (degrees) | -0.375 |
| ELEVATION (feet) | -0.169 |
| TMAX (°C) | 0.231 |
| TMIN (°C) | 0.308 |
| TMINMAX (°C) | 0.272 |
| TRUEMEAN (°C) | 0.239 |
| DTR (°C) | -0.134 |
| PRECIP (mm) | 0.217 |
| SOLRAD (MJ/m2) | -0.043 |
| RHMAX (%) | 0.301 |
| RHMIN (%) | 0.124 |
| RHMEAN (%) | 0.243 |

Table 3. Multiple regression on DeltaT of measured variables
N = 3289 station-months. Regression summary for dependent variable DELTAT: R = .5522, R² = .3049, Adjusted R² = .3037, F(6,3282) = 239.98, p < 0.0000, Std. error of estimate: .3683. Exclude condition: state=’ak’ or state=’hi’

| | b* | Std.Err. of b* | b | Std.Err. of b | t(3282) | p-value |
|---|---|---|---|---|---|---|
| Intercept | | | -0.294812 | 0.085454 | -3.4500 | 0.000568 |
| LONG | -0.169595 | 0.018086 | -0.005496 | 0.000586 | -9.3772 | 0.000000 |
| LAT | -0.407150 | 0.015910 | -0.032380 | 0.001265 | -25.5913 | 0.000000 |
| ELEVATION | 0.066710 | 0.018980 | 0.000013 | 0.000004 | 3.5147 | 0.000446 |
| PRECIP (mm) | -0.008293 | 0.017129 | -0.000055 | 0.000114 | -0.4842 | 0.628291 |
| SOLRAD (MJ/m2) | 0.000193 | 0.016465 | 0.000013 | 0.001099 | 0.0117 | 0.990630 |
| RHMEAN | 0.552356 | 0.021529 | 0.015417 | 0.000601 | 25.6565 | 0.000000 |

* Standardized regression results (μ=0, σ=1)
Figure 7. Predicted vs observed values of DeltaT for the multiple regression model in Table 3.
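The predicted values in Figure 7 come from applying the Table 3 model to each station-month: the intercept plus the sum of each coefficient (the b column) times that month's value of the variable. A minimal sketch of that arithmetic follows. The input values in the example are invented for illustration, and the sign convention for longitude (west negative) and the elevation unit (feet, as in Table 2) are assumptions.

```python
# Unstandardized coefficients (column b) from Table 3.
INTERCEPT = -0.294812
COEFFICIENTS = {
    "LONG": -0.005496,       # degrees; assumption: west longitudes are negative
    "LAT": -0.032380,        # degrees north
    "ELEVATION": 0.000013,   # assumption: feet, as listed in Table 2
    "PRECIP": -0.000055,     # mm
    "SOLRAD": 0.000013,      # MJ/m2
    "RHMEAN": 0.015417,      # percent
}

def predicted_delta_t(station_month):
    """Predicted DeltaT (deg C) for one station-month of input values."""
    return INTERCEPT + sum(
        coef * station_month[name] for name, coef in COEFFICIENTS.items()
    )

# Hypothetical station-month, not taken from the USCRN data.
example = {"LONG": -100.0, "LAT": 40.0, "ELEVATION": 1000.0,
           "PRECIP": 50.0, "SOLRAD": 15.0, "RHMEAN": 70.0}
print(round(predicted_delta_t(example), 3))
```

In this example the latitude and RH terms are each of order 1 °C with opposite signs, while the precipitation and solar radiation terms contribute less than 0.01 °C, which mirrors their non-significance in Table 3.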
Since RH had a strong effect on DeltaT, a map of RH was made for comparison with the DeltaT map above (Figure 8). The map again shows the clustering noted for DeltaT along the Pacific Coast, the Southeast, and the West. However, the effect of latitude along the northern tier is missing from the RH map.
Figure 8. Relative humidity for 125 USCRN stations: 2007-Aug 8, 2011. Colors are quartiles. Red: 19-56%. Gold: 56-70%. Green: 70-75%. Blue: 75-91%.
Fundamentally, the difference between the minmax approach and the true mean is a function of diurnal variation: stations where the temperature spends more time closer to the minimum than the maximum will have their mean temperatures overestimated by the minmax method, and vice versa. To show this graphically, the mean diurnal variation over all seasons and years is shown for the station with the largest overestimate (Fallbrook, CA) and the one with the largest underestimate (Lewistown, MT) (Figure 9). Although both graphs have a minimum at 6 AM and a maximum at about 2 PM, the Lewistown (lower) diurnal curve is broader. For example, 8 hours are within 2 °C of the Lewistown maximum, whereas only about 6 hours are within 2 °C of the Fallbrook maximum. Another indicator is that 12 hours are greater than the true mean in Lewistown but only 9 in Fallbrook.
Figure 9. Diurnal variation and comparisons of the true mean to the estimate using the minmax method for the two stations with the most extreme over- and underestimates.
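The dependence on the shape of the diurnal curve can be illustrated with purely synthetic data. The sketch below is not fitted to either station: it builds a 24-hour curve between 10 and 20 °C whose warm peak can be narrowed or broadened with a hypothetical `sharpness` parameter, and it uses hourly values rather than 5-minute averages for the extremes.

```python
import math

def diurnal_curve(sharpness):
    """Synthetic 24 hourly temperatures between 10 and 20 deg C.

    The maximum falls at hour 14 and the minimum at hour 2 (an idealized
    symmetric curve). sharpness > 1 narrows the warm peak; sharpness < 1
    broadens it.
    """
    temps = []
    for hour in range(24):
        s = (1 + math.cos(2 * math.pi * (hour - 14) / 24)) / 2  # 0..1
        temps.append(10 + 10 * s ** sharpness)
    return temps

def delta_t(temps):
    """(Tmin + Tmax)/2 minus the mean of the hourly values."""
    return (min(temps) + max(temps)) / 2 - sum(temps) / len(temps)

for sharpness in (0.5, 1, 2):
    print(sharpness, round(delta_t(diurnal_curve(sharpness)), 2))
```

A pure sinusoid gives a Delta T of essentially zero. Narrowing the warm peak (more hours near the minimum) makes Delta T positive, as at Fallbrook, and broadening it makes Delta T negative, as at Lewistown.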
Discussion
For a majority of US and global stations, at least until recent times, it is not possible to investigate the question of the error involved in using the Tminmax method, since insufficient measurements were made to determine the true mean. The USCRN provides one of the best datasets to investigate this question, not only since both the true mean temperatures and the daily Tmax and Tmin are provided, but also because the quality of the stations is high. Since there are >100 stations well distributed across the nation, which now have at least 4 years of continuous data, the database seems adequate for this use and the results comparing 2-year averages suggest the findings are robust.
The questions asked in the Introduction to this paper can now be answered, at least in a preliminary way.
“What is the magnitude of this error?” We see the range is from -0.66 °C to +1.38 °C, although the latter value appears to be unusual, with the second highest value only +0.88 °C.
“How does it vary by season, elevation, latitude or longitude, and other parameters?” The direction of the error is surprisingly unaffected by season, with more than half the stations showing consistent under- or over-estimates during all 4 seasons. We have seen a strong effect of latitude and RH, with a weaker effect of elevation. Geographic considerations are clearly important, with coastal and Southern sites showing strong overestimates while the northern and western stations mostly show strong underestimates by the minmax method. Although the Tuomenvirta et al. (2000) results mentioned above are averages across all stations in a region, their finding that the coastal stations in West Greenland and the Norwegian islands showed a strong Delta T in the same direction as the coastal stations in the USCRN supports the influence of RH. Their finding of the opposite sign for the continental stations shows the same dependence we find here for the Western interior USCRN stations. (Note that their definition of Delta T has the opposite sign from ours.)
“For a given station, is it random or biased in a consistent direction?” For most stations, the direction and magnitude of the error are very consistent across time, as shown by the comparison across seasons and across years.
Considering the larger number of stations in the US and in historical time, we may speculate that the error in the minmax method was at least as large as indicated here, and most probably somewhat larger, since many stations have been shown to be poorly sited (Fall et al., 2011). The tendency in the USCRN dataset to have about equal numbers of underestimates and overestimates is simply accidental, reflecting the particular mix of coastal, noncoastal, Northern and Southern sites. It may be that this applies as well to the larger number of sites in the continental US, but there is likely to be a bias in one direction or another in different countries, depending on their latitude extent and RH levels.
This error could affect spatial averaging. For example, the Fallbrook CA site with the highest positive DeltaT value of 1.39 C is just 147 miles away from the Yuma site with one of the largest negative values of -0.58. If these two stations were reading the identical true mean temperature, they would appear to disagree by nearly 2 full degrees Celsius using the standard minmax method. Quite a few similar pairs of close-lying stations with opposite directions of DeltaT can be seen in the map (check for nearby red and blue pairs). However, if only anomalies were considered, the error in absolute temperature levels might not affect estimates of spatial correlation (Menne and Williams, 2008).
Although the errors documented here are true errors (that is, they cannot be adjusted by time of observation or other adjustments), nonetheless it would not be expected that they have much of a direct effect on trends. After all, if one station is consistently overestimated across the years, it will have the same trend as if the values were replaced by the true values. Or if it varies cyclically by season, again after sufficient time the variations would tend to cancel and the trend be mostly unaffected. Of course, this cannot be checked with the USCRN database since it covers at most 4-5 years with the full complement of stations, and normal year-to-year “weather” variations would likely overwhelm any climatic trends over such a short period.
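The argument that a consistent offset leaves the trend untouched is easy to check numerically. The sketch below uses an invented 30-year series with a trend of 0.02 °C per year and a constant minmax bias of +0.5 °C; both numbers are illustrative assumptions, not USCRN results.

```python
def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxy = sum((a - mt) * (b - my) for a, b in zip(t, y))
    sxx = sum((a - mt) ** 2 for a in t)
    return sxy / sxx

years = list(range(30))
true_mean = [10.0 + 0.02 * yr for yr in years]   # hypothetical true annual means
tminmax = [v + 0.5 for v in true_mean]           # same series with a constant bias

print(ols_slope(years, true_mean), ols_slope(years, tminmax))
```

Both slopes come out the same to within floating-point rounding, because adding a constant shifts the mean of the series without changing any of the deviations from it. The same reasoning does not cover a bias that itself drifts over the years, which this sketch does not model.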
Acknowledgement. Scott Ember of NOAA was extremely helpful in navigating the USCRN database and supplying files that would have required many hours to download from the individual files available.
References
Ekholm N. 1916. Beräkning av luftens månadsmedeltemperatur vid de svenska meteorologiska stationerna. Bihang till Meteorologiska iakttagelser i Sverige, Band 56, 1914, Almqvist & Wiksell, Stockholm, p. 110.
Fall, S., Watts, A., Nielsen-Gammon, J., Jones, E., Niyogi, D., Christy, J.R., and Pielke, R.A., Sr. Analysis of the impacts of station exposure on the U.S. Historical Climatology Network temperature and temperature trends. J Geophysical Research 116: D14120. 2011.
Lobell, D.B., Bonfils, C., and Duffy, P.B. Climate change uncertainty for daily minimum and maximum temperatures: A model inter-comparison. Geophysical Research Letters 34, L05715, doi:10.1029/2006GL028726, 2007.
Ma, Y. and Guttorp, P. Estimating daily mean temperature from synoptic climate observations. http://www.nrcse.washington.edu/NordicNetwork/reports/temp.pdf downloaded Aug 18 2012
Menne, M.J. and Williams, C.N. Jr. 2008. Homogenization of temperature series via pairwise comparisons. J Climate 22: 1700-1717.
McMaster, G.S. and Wilhelm, W. 1997. Growing degree-days: one equation, two interpretations. Publications from USDA-ARS / UNL Faculty. Paper 83. http://digitalcommons.unl.edu/usdaarsfacpub/83 Accessed on Aug 18 2012.
Misra, V., Michael, J-P., Boyles, R., Chassignet, E.P., Griffin, M. and O’Brien, J.J. 2012. Reconciling the Spatial Distribution of the Surface Temperature Trends in the Southeastern United States. J. Climate, 25, 3610–3618. doi: http://dx.doi.org/10.1175/JCLI-D-11-00170.1
Modén H. 1939. Beräkning av medeltemperaturen vid svenska stationer. Statens meteorologisk-hydrografiska anstalt. Meddelanden, serien Uppsatser, no. 29.
Nordli PØ, Alexandersson H, Frich P, Førland E, Heino R, Jónsson T, Steffensen P, Tuomenvirta H, Tveito OE. 1996. The effect of radiation screens on Nordic temperature measurements. DNMI Report 4/96 Klima.
Reicosky DC, Winkelman LJ, Baker JM, Baker DG. 1989. Accuracy of hourly air temperatures calculated from daily minima and maxima. Agric. For. Meteorol. 46, 193–209.
Weiss A, Hays CJ. 2005. Calculating daily mean air temperatures by different methods: implications from a non-linear algorithm. Agric. For. Meteorol. 128, 57-69.
Thrasher, B.L., Maurer, E.P., McKellar, C. and Duffy, P.B. Hydrol. Earth Syst. Sci. Discuss., 9, 5515–5529, 2012. www.hydrol-earth-syst-sci-discuss.net/9/5515/2012/ doi:10.5194/hessd-9-5515-2012. Accessed Aug 18, 2012.
Tuomenvirta, R.H., Alexandersson, H., Drebs, A., Frich, P., and Nordli, P.O. 2000: Trends in Nordic and Arctic temperature extremes. J. Climate, 13, 977-990.
APPENDIX
The main concern of this paper has been with Delta T and therefore almost all of the above analyses deal with that variable. However, another variable depending on the daily Tmax and Tmin is their difference, the Diurnal Temperature Range (DTR), which has its own interest. For example, the main finding of Fall et al. (2011) was that the poorly sited stations tended to overestimate Tmin and underestimate Tmax, leading to a large underestimate of DTR. However, the USCRN stations are all well-sited and therefore the estimates of DTR should be unbiased. What can we learn from the USCRN about this variable? We can first of all map its variation (Figure A-1).
Figure A-1. Variation of daily DTR across the US CRN. Colors are quartiles. Red: 4.7-10.8 C. Gold: 10.8-12.0 C. Green: 12.0-13.8 C. Blue: 13.8-19.9 C.
Here we see that the coastal sites have the lowest daily variation, reflecting the well-known moderating effect of the oceans. Perhaps the two sites near the Great Lakes in the lowest quartile of the DTR distribution are also due to this lake effect. The Western interior states have the highest DTRs.
A multiple regression shows that RH is by far the strongest explanatory variable (Table A-1). Solar radiation and precipitation have moderate effects, and latitude is weakly significant. The model explains about 46% of the variance, with RHMEAN accounting for most (42%) of that (Figure A-2).
Table A-1. Multiple regression on Diurnal Temperature Range.
N = 3289. Regression summary for dependent variable DTR (CRNM0101_US_AL_AK_HI RH MERGED WITH METADATA NEW): R = .68257906, R² = .46591417, Adjusted R² = .46493778, F(6,3282) = 477.18, p < 0.0000, Std. error of estimate: 2.3620. Exclude condition: v3=’ak’ or v3=’hi’

| | b* | Std.Err. of b* | b | Std.Err. of b | t(3282) | p-value |
|---|---|---|---|---|---|---|
| Intercept | | | 20.14306 | 0.548079 | 36.7521 | 0.000000 |
| LONG | 0.010872 | 0.015854 | 0.00258 | 0.003759 | 0.6858 | 0.492909 |
| LAT | -0.068287 | 0.013946 | -0.03974 | 0.008115 | -4.8965 | 0.000001 |
| ELEVATION | -0.008117 | 0.016638 | -0.00001 | 0.000025 | -0.4879 | 0.625687 |
| PRECIP (mm) | -0.170325 | 0.015015 | -0.00829 | 0.000730 | -11.3438 | 0.000000 |
| SOLRAD (MJ/m2) | 0.183541 | 0.014433 | 0.08967 | 0.007051 | 12.7167 | 0.000000 |
| RHMEAN | -0.484798 | 0.018872 | -0.09900 | 0.003854 | -25.6888 | 0.000000 |

* Standardized regression results (μ=0, σ=1)
Figure A-2. Diurnal Temperature Range vs. mean RH.
The figure suggests that a linear fit is not very good; for RH between about 60-95% the effect on DTR (eyeball estimate) is perhaps twice the slope of -0.138 °C per % RH for all the data.
Finally, how does the true mean temperature depend on the variables measured at the USCRN sites? The multiple regression is provided in Table A-2. Although all six variables are significant and explain about 79% of the variance, the relationship is largely driven (R² = 59%) by solar radiation (Figure A-3).
Table A-2. Multiple regression of true mean monthly temperatures vs. measured meteorological variables.
N = 3289. Regression summary for dependent variable TRUEMEAN: R = .891, R² = .793, Adjusted R² = .793, F(6,3282) = 2095.9, p < 0.0000, Std. error of estimate: 4.58. Exclude condition: State=’AK’ or State=’HI’

| | b* | Std.Err. of b* | b | Std.Err. of b | t(3282) | p-value |
|---|---|---|---|---|---|---|
| Intercept | | | 9.972524 | 1.062680 | 9.3843 | 0.000000 |
| LONG | -0.037057 | 0.009869 | -0.027366 | 0.007288 | -3.7548 | 0.000177 |
| LAT | -0.201479 | 0.008682 | -0.365153 | 0.015735 | -23.2071 | 0.000000 |
| ELEVATION | -0.307433 | 0.010357 | -0.001414 | 0.000048 | -29.6825 | 0.000000 |
| PRECIP (mm) | 0.151732 | 0.009347 | 0.022991 | 0.001416 | 16.2333 | 0.000000 |
| SOLRAD (MJ/m2) | 0.752289 | 0.008985 | 1.144690 | 0.013671 | 83.7285 | 0.000000 |
| RHMEAN | -0.076282 | 0.011748 | -0.048521 | 0.007473 | -6.4931 | 0.000000 |

* Standardized regression results (μ=0, σ=1)
Figure A-3. True mean temperature vs. solar radiation.
===============================================================
This document is available as a PDF file here:
Errors in Estimating Temperatures Using the Average of Tmax and Tmin
The definition of the Predicted Value as used in Figure 7 seems to be missing from the paper. Is it the real monthly average from all 5-minute readings each month?
The Monthly Observed value, I think, is the Tave from the Min, Max values of the day.
Reblogged this on The GOLDEN RULE and commented:
Here is some genuine science being used and the conclusion(s) do not support the need for concern about warming trends, if they in fact exist at all.
I believe the article provides support for my general contention that however good the science is, the subject “climate and its factors” are of such complexity and variability as to make IPCC and political carbon control decisions untenable.
Lance Wallace says: August 30, 2012 at 5:32 pm
“Another thought: I know nothing about GCMs. Do they input Tmin and Tmax in general to create their temperature fields?”
GCMs wouldn’t use station readings, and hence not T_max or T_mean. They compute from a complete flow model temperature fields on a regular grid at about 30-min intervals.
It looks like the example set by the “Watts et al” paper to put a paper up here to be “fire-proofed” is being followed.
Lance, don’t take the singes personally or let them bother you.
Stephen Rasey says: “The definition of the Predicted Value as used Figure 7 seems to be missing from the paper. Is it the real monthly average from all 5 minute readings each month? The Monthly Observed value, I think, is the Tave from the Min, Max values of the day.”
Both Predicted and Observed values are Delta T, as stated in the caption. Recall that Delta T is the difference between the true mean and that calculated from Tmin & Tmax. The Predicted value is determined for each station and each month by inputting the coefficients of the multivariate regression model for each of the six variables multiplied by the values of the six variables for that month and that station. The sum of the six values plus the intercept gives the Predicted value.
I also started earlier this week loading USCRN hourly02 data into a SQL database to see what the raw data would show, without adjustments or arbitrary decisions (i.e. infilling from “nearby” stations, grid averaging etc.)
I successfully downloaded all the station hourly files using the FireFTP extension for Firefox. The download did take several hours, but it was automated; I just needed to kick it off.
My first playing with the data was on this very subject: is (Tmax+Tmin)/2 a valid alternative to the daily average calculated from the hourly averages? At this point I’m still trying to ensure the schema I have is usable, but an early query did show that there are significant differences. My histogram peaked at a positive 1-1.5C difference, but I had only loaded a small subset of the data (for testing queries & schema) at that point. Hopefully once I load all the data my results should replicate yours.
My gut feel was that (Tmax+Tmin)/2 would not always represent a true average, based upon living in the SF Bay Area, where most of the day might be fog bound in the 50s F with sun breaking through around 1pm, leading to an afternoon high of 80-90F before fog returns at 7pm, e.g. a day shaped like Fallbrook but with a much narrower peak.
1. With high frequency observations from airport sites, you should be able to detect jet wash heat effects if they happen as an aircraft taxis past, once you can get hold of aircraft movement data. Best done if possible on the types of thermometers that would record a transient and show it as a Tmax or Tmin.
2. The mere fact of Tmax and Tmin happening on the same day does not permit unbridled use of their Tminmax values. For example, people have often correlated Tminmax between stations up to 2,000 km apart. However, when you separate out Tmax and Tmin and correlate them separately, you get quite a different picture. You can get an idea of this by looking at the following related exercise using lagged data at one station.
http://www.geoffstuff.com/Extended%20paper%20on%20chasing%20R.pdf
The correlation coefficients of Tmax and Tmin are rather different for data lagged by only one day. This would seem to place some unstated limits on assumptions in your data analysis. Particularly, one of either Tmax or Tmin might call the shots, placing different emphasis on day lengths as sites are nearer to the Poles. I get confused about concepts inside the Arctic and Antarctic circles.
OK, just a non professional observer here, but I have some questions:
How exactly were these measurements taken? Were they recorded by a human using an “Eyeball Mk I”, or were they recorded using some kind of automatic recording device?
What is the magnitude of error introduced by each of these methods, and what is the effect on the reading?
Also, how is the error introduced in the colder latitudes influenced by the weather; ie: the human reader looks out the window and says: “the weather is too cold to read today, so I will do it tomorrow and double up”.
This is very interesting, but at the same time it is somewhat irrelevant. We don’t care about the actual temperature to measure climate change. We care about anomalies, i.e. how those temperatures vary in time. What would be worrying is that the error of using this method greatly varied from year to year. We don’t care if it varies between stations, as long as it is relatively stable in time for any one given station. And this article seems to show that the error is indeed rather stable in time (although we would need much longer studies to really be able to conclude that).
If a station uses the Tmin/Tmax method to conclude that the average temperature was 14C whereas the true average temperature was 13.5C, I don’t care! The actual temperature is irrelevant, firstly, because I am going to use that reading to represent the temperature of an area which is probably hundreds of square kilometres big. And one thing I know for sure is that the thermometer is NOT giving me the average temperature in that area, no matter which method I use. The best that the thermometer can do is give me the temperature in its exact position. That and only that. So what if the thermometer gave me the exact 13.5C for the average temperature in its position? The average temperature in the area could be 14.5C! So what did I gain?
Nick Stokes says: “GCM’s wouldn’t use station readings, and hence not T_max or T_mean. They compute from a complete flow model temperature fields on a regular grid at about 30-min intervals.”
But Nick, they wouldn’t make up the temperature fields from whole cloth, would they? Or if they did, wouldn’t they have to check it at some point against the observed data? And as you point out, we go with what we have, so wouldn’t they go with the TminTmax data? And haven’t we shown that these don’t match the true means? In somewhat predictable ways, e.g. affected by latitude and RH? So what’s wrong with getting a nice big dataset of 1000 stations globally that have sufficient hourly data to give an idea of the delta compared to Tminmax, creating a model from those measurements that estimates the true mean temperature everywhere, and validating the GCM against those better estimates of temperature than the flawed Tminmax values?
Climate Beagle says: “I successfully downloaded all the station hourly files…”
Wow, that’s one humongous file–about 8,960,000 records by my calculation. I benefited from NOAA help to get the daily file, which I can share with you if you want to check the daily file that you can create from your hourly file. I also have the NOAA-created monthly file that I can share with you if you want. In return I wouldn’t mind taking a look at your hourly file on Dropbox or other mode. My email is lwallace73@gmail.com.
It is indeed biased, and one of the further reasons I can think of is that mercury-based thermometers will not function below -36F or -38C. This artificially “warms” the average of Tmin and Tmax as a result, although it’s only a problem that might affect older stations and temperature records. Have you noticed as well that the NOGAPS ground temperature is limited to -50F, which means most of the Antarctic continent might actually be colder? That’s not a problem at first sight, but if you compute an anomaly over this data, it will result in a flawed and unusually high value.
Regards and thanks for the very interesting article again from you.
For the past 3 hours, I have been trying to figure out how the error bar for each of the stations in Figure 1 can be so small.
So I went to the first one, MT-Lewistown, monthly records.
ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/monthly01/CRNM0101-MT_Lewistown_42_WSW.txt
48 months of data.
DeltaT = (Column 8 – column 9) = TMINMAX – Truemean
Mean DeltaT = -0.656 (that checks)
Std Dev DeltaT = 0.278.
Mean Std Error = 0.041
P10 – P90 Range: -1.0 to -0.37.
OK. I can see that you are plotting the Mean Std Error.
But I do not see why that is important. We are not using population mean temperatures. We are using individual monthly TMINMAX vs time.
The key point in the data is that any given month’s TMINMAX appears to be an estimate of the TRUEMEAN with a std dev. error bar of +/- 0.278 deg C.
Excel Chart Pic: http://i45.tinypic.com/2lnxsi1.png
There is no trend about whether the error gets bigger or smaller at higher temperatures. Possibly the uncertainty in DeltaT gets bigger at higher temps.
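For anyone who wants to repeat these summary statistics, here is a minimal Python sketch. The function name `summarize_delta` and the five sample values are assumptions made up for illustration; they are not the Lewistown data. The last lines only check the relationship between the standard deviation and the standard error quoted above.

```python
import math
import statistics

def summarize_delta(deltas):
    """Return mean, sample std dev, std error of the mean, and (P10, P90)."""
    n = len(deltas)
    mean = statistics.mean(deltas)
    sd = statistics.stdev(deltas)                  # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)                        # standard error of the mean
    deciles = statistics.quantiles(deltas, n=10)   # 9 cut points: P10 ... P90
    return mean, sd, sem, (deciles[0], deciles[-1])

# Hypothetical monthly DeltaT values (Tminmax - Truemean), for illustration only.
sample = [-1.0, -0.8, -0.6, -0.4, -0.2]
mean, sd, sem, p10_p90 = summarize_delta(sample)

# Standard error implied by the figures quoted above: sd = 0.278 over 48 months.
sem_quoted = 0.278 / math.sqrt(48)
print(round(sem_quoted, 3))
```

The implied standard error comes out at about 0.040, close to the 0.041 reported. It shrinks with the square root of the number of months, which is why it is so much smaller than the month-to-month scatter of 0.278.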
Obviously the small number of stations is not enough to give accurate results for the larger area. Yes, it does show trends, yet how accurate are they? Before becoming too happy with the information, we need to expand the number of stations with the USCRN pristine conditions. We have already seen, thanks to Anthony and others, that station siting is important. Even though the USCRN stations are sited in pristine places for data collection, I think they need to be even more widely distributed in order to take in more area. I am doubtful of results from too small a sampling, even though the results can be used to demonstrate the lack of AGW. Let me be clear in stating I have no faith in the systems already in use, as they have been so heavily manipulated that the information gleaned from them is meaningless. I would simply like to see more USCRN-type sites in order to be confident in the results. I would like to see them doubled with the appropriate spread, and if tripled, the more the merrier. When you think of maybe 450 sites, that is still a pretty small number, but it would give a better degree of accuracy anyway. Placement without bias will be an issue.
Here is MT-Lewistown with data split at 2010 July. (blue 200807 to 201007, brown after 201007)
http://i47.tinypic.com/2vb9aj6.png
(Mean, stddev, mean std err) before (-0.86, 0.42, 0.089)
(Mean, stddev, mean std err) after (-0.69, 0.33, 0.068)
I note that these two means are far enough apart to test for significance.
At the same time, I note that the three most negative DeltaT’s are the first three months that station was in operation. Isn’t that a coincidence! Was the grass growing back?
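A rough way to run that significance check from the summary numbers alone is Welch's t-test. The sketch below is only an illustration: the function name `welch_t_from_summary` is made up, the sample sizes are back-calculated from the quoted standard deviations and standard errors, and the months are assumed to be independent, which autocorrelation in monthly data could undermine.

```python
import math

def welch_t_from_summary(mean1, sd1, sem1, mean2, sd2, sem2):
    """Welch's t statistic and approximate degrees of freedom from summary stats."""
    # Recover the (possibly non-integer) sample sizes from sd and standard error.
    n1 = (sd1 / sem1) ** 2
    n2 = (sd2 / sem2) ** 2
    v1, v2 = sem1 ** 2, sem2 ** 2          # variance of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# (mean, stddev, mean std err) before and after the 2010 July split, as quoted above.
t, df = welch_t_from_summary(-0.86, 0.42, 0.089, -0.69, 0.33, 0.068)
print(f"t = {t:.2f}, df = {df:.0f}")
```

Under these assumptions |t| is about 1.5 with roughly 40 degrees of freedom, which is below the two-sided 5% critical value of about 2.02, so the split would not reach conventional significance on these numbers alone.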
Geoff Sherrington says: “The mere fact of Tmax and Tmin happening on the same day does not permit unbridled use of their Tminmax values. For example, people have often correlated Tminmax between stations up to 2,000 km apart. However, when you separate out Tmax and Tmin and correlate them separately, you get quite a different picture. You can get an idea of this by looking at the following related exercise using lagged data at one station.
http://www.geoffstuff.com/Extended%20paper%20on%20chasing%20R.pdf ”
Geoff, I’m unclear on exactly what you are suggesting here. I checked out your Melbourne example and have no idea why Tmax had such low autocorrelations compared to Tmin. I did try looking at Spearman correlations of Tmax, Tmin, Tminmax, Truemean, and DeltaT in both the monthly and daily datasets. In the monthly dataset, the correlations were 0.88, 0.88, 0.88, 0.88, and 0.66, respectively. In the daily dataset, they were 0.94, 0.94, 0.96, 0.96, and 0.30. Interesting results, but not particularly in agreement with your Melbourne example. Of course, these are average results across 125 stations, so an individual station could behave rather differently. However, I don’t at this point see how this affects the analysis. We still have daily and monthly delta T values for every station. I should state that the daily and monthly values averaged over the entire station history do result in very slightly different estimates of Delta T for each station due to the different weights (slightly different lengths of a month), but these differences were usually in the range of a few percent.
Anyway, if you still feel that the autocorrelation will affect the analysis in some way, please let me know.
By the way, none of the USCRN sites are at airports, so your jetwash suggestion would have to be checked at some other station.
Nylo says: “This is very interesting, but at the same time it is somewhat irrelevant. We don’t care about the actual temperature to measure climate change. We care about anomalies, i.e. how those temperatures vary in time.”
As I said in the text, and in agreement with Nick Stokes, I am fairly well convinced that the bias is indeed of little interest in calculating trends. But it is a true error (despite Nick’s comment) and therefore could affect our understanding of basic physical aspects of climate studies. For example, if the gradient between the tropics and northern latitudes is different than what we think now based on the Tminmax results, that would affect our calculations of energy transfer between tropics and higher latitudes. And in my view as a physicist, energy is the true fundamental driver here, not temperature.
Willhelm says: “OK, just a non professional observer here, but I have some questions:
How exactly were these measurements taken? Were they recorded by a human using an “Eyeball Mk I”, or were they recorded using some kind of automatic recording device?
What is the magnitude of error introduced by each of these methods, and what is the effect on the reading?”
The triplicate thermometers are platinum-resistance thermometers traceable to NIST standards. Each is in a separate enclosure at the site. To be acceptable, the values must be within 0.3C of each other. See the manual monitoring handbook at ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/documentation/program/
Nylo says: August 30, 2012 at 10:48 pm “We don’t care about the actual temperature to measure climate change.”
Oh yes we do, Nylo. Some examples.
1. I know of a case where for decades the newspapers were given values different to those recorded for official use. How do you measure climate change from 2 starting points a degree apart?
2. Estimation of other parameters from temperature, such as W/m^2. Not so easy with anomalies, is it?
3. Variations in technique from year to year. The change from liquid-in-glass to thermistor/thermocouple devices, the change from one reading per day to one a minute or more frequently, satellite data – each of these methods calls for CARE with temperature measurement because each can give a different ‘anomaly’.
4. Spike rejection – can arise and be filtered different ways, once one determines what is the ‘actual temperature’ and how to measure it.
5. It is simply sloppy science to use non-standard units like “degrees F minus 30-year reference period” unless there are compelling reasons to vary from K. We’ve moved away from strange units like “furlongs per fortnight” for velocity and most of the world now uses K or C not F. Where do U stand?
I’m wondering too if more modern AWS stations that employ solid state sensors may have a different (lower) thermal inertia, which would create the illusion of a rising Tmax?
Roger Pielke Sr has discussed this sort of thing:
http://pielkeclimatesci.wordpress.com/2012/08/21/comments-on-the-shifting-probability-distribution-of-global-daytime-and-night-time-temperatures-by-donat-and-alexander-2012-a-not-ready-for-prime-time-study/
However I think you are correct in saying “energy is the true fundamental driver here, not temperature” as Roger Pielke Sr has also discussed in some depth:
http://pielkeclimatesci.wordpress.com/2012/05/07/a-summary-of-why-the-global-average-surface-temperature-is-a-poor-metric-to-diagnose-global-warming/
On a very simplistic level one can appreciate that air at low temperature with high humidity could have the same energy content as higher temperature air under drier conditions. Thus I don’t think the temperature alone is a conclusive metric for understanding the climate.
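To put rough numbers on that point, here is a small Python sketch using the standard psychrometric approximation for the enthalpy of moist air and the Magnus formula for saturation vapor pressure. The function names, the two example air states, and the sea-level pressure are assumptions chosen for illustration, not values from the article.

```python
import math

def saturation_vapor_pressure_hpa(t_c):
    """Magnus approximation for saturation vapor pressure over water (hPa)."""
    return 6.112 * math.exp(17.62 * t_c / (243.12 + t_c))

def moist_air_enthalpy(t_c, rh_percent, pressure_hpa=1013.25):
    """Specific enthalpy of moist air in kJ per kg of dry air."""
    e = rh_percent / 100.0 * saturation_vapor_pressure_hpa(t_c)
    w = 0.622 * e / (pressure_hpa - e)      # mixing ratio, kg water / kg dry air
    # Sensible heat of dry air plus latent and sensible heat of the vapor.
    return 1.006 * t_c + w * (2501.0 + 1.86 * t_c)

# Two hypothetical air states at sea-level pressure.
h_cool_humid = moist_air_enthalpy(25.0, 80.0)   # 25 C at 80% RH
h_warm_dry = moist_air_enthalpy(35.0, 20.0)     # 35 C at 20% RH
print(round(h_cool_humid, 1), round(h_warm_dry, 1))
```

In this example the cooler, humid air carries more energy per kilogram than the air that is 10 C warmer but dry, so temperature alone does not rank the two states by energy content.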
Lance Wallace says: August 30, 2012 at 11:19 pm
“But Nick, they wouldn’t make up the temperature fields from whole cloth, would they? Or if they did, wouldn’t they have to check it at some point against the observed data? And as you point out, we go with what we have, so wouldn’t they go with the TminTmax data?”
No, of course they don’t make them up. They solve the equations of fluid flow with heat transport, along with half-hourly insolation and IR heat components (and latent heat release, vertical transport modelled etc). All of this on a regular grid, roughly 100 km and half hour intervals. There’s simply no role for any kind of daily mean, and station readings are not used anywhere.
You certainly earned the grant money.
Seems to me if you want to detect AGW you should use Tmin (usually nighttime temps) since air temp at night is wholly dependent on the presence of greenhouse gases. Daytime temps will be higher the less greenhouse gas you have…..
The astonishing thing about this whole argument is that it ignores some very basic mathematical ideas in signal processing.
If you measure an electrical signal that is a proxy for temperature, e.g. from a thermocouple, one has a continuous signal. The rate at which you have to sample that signal is determined by the Nyquist sampling theorem: you have to sample at more than twice the highest frequency in the signal. This frequency can be limited by filtering the signal so that very short-term variations (a bird flies over the temperature sensor) are eliminated.
If you do not do this, and undersample the signal, all subsequent calculations will not reflect the behaviour of the signal.
What then do we mean by mean temperature? The mean is the integral of the signal with respect to time, divided by the period over which it is integrated. In a stable system this will converge asymptotically to a stable value. However, this isn’t particularly useful because we recognise that “the mean might change”, so what we do is to low pass filter the temperature signal so that only low frequency fluctuations will become visible.
It is easy to show that decimation of a signal by averaging an epoch, say a day and treating that average as a representation of the signal in question is incorrect and introduces errors into the derived time series.
This is one of the most basic ideas in the manipulation of signals. It is EE101, and it seems to be robustly ignored by the temperature community, who frankly should know better.
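A toy illustration of the difference between the integral mean described above and the min/max midpoint: the Python sketch below uses a made-up daily cycle with an added second harmonic so that the curve is asymmetric. The function `synthetic_temperature` and its coefficients are assumptions, not station data, and the example shows the midpoint bias rather than aliasing as such.

```python
import math

def synthetic_temperature(hour):
    """Hypothetical asymmetric daily cycle in deg C (not real station data)."""
    w = 2.0 * math.pi / 24.0
    return 10.0 + 5.0 * math.sin(w * hour) + 2.0 * math.cos(2.0 * w * hour)

# One day sampled every minute, which stands in for the continuous signal.
samples = [synthetic_temperature(m / 60.0) for m in range(24 * 60)]

true_mean = sum(samples) / len(samples)          # time-averaged temperature
t_minmax = (max(samples) + min(samples)) / 2.0   # traditional (Tmax + Tmin) / 2
print(f"true mean = {true_mean:.2f}, Tminmax = {t_minmax:.2f}")
```

Both harmonics average to zero over a full day, so the time-averaged mean is 10 C, while the midpoint of the extremes lands well below it. The size and sign of the gap depend entirely on the shape of the daily curve, which is chosen arbitrarily here.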