Date: 2020-11-03
By Andy May
While studying the NOAA USHCN (United States Historical Climatology Network) data I noticed the recent differences between the raw and final average annual temperatures were anomalous. The plots in this post are computed from the USHCN monthly averages. The most recent version of the data can be downloaded here. The data shown in this post was downloaded in October 2020 and was complete through September 2020.
There are two ways to compute the difference and they give different answers. One way is to subtract the raw from the final temperature month-by-month, ignoring missing values, then average the differences by year. When computed this way from the USHCN monthly values, the values are only subtracted when both the raw and final average temperature exist for a given month and station. Using this method, the numerous “estimated” final temperatures are ignored, because there is no matching raw temperature. This plot is shown in Figure 1.
Figure 1. Plot of USHCN final temperatures minus raw data. The difference between final and raw monthly average temperatures is computed when both exist for a specific month. The differences are then averaged. Data used is from NOAA.
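To make the first method concrete, here is a minimal Python sketch of the paired calculation. It is not the R code used to produce Figure 1. The dictionary layout keyed by (station, year, month), the function name `paired_difference_by_year`, and the toy values are all assumptions made for illustration, with `None` standing in for a missing monthly average.

```python
from collections import defaultdict

def paired_difference_by_year(raw, final):
    """Average final-minus-raw by year, using only months where both exist.

    raw, final: dicts mapping (station, year, month) -> temperature or None.
    This layout is hypothetical; it is not the USHCN file format.
    """
    diffs = defaultdict(list)
    for key, final_value in final.items():
        raw_value = raw.get(key)
        # Skip final values that have no matching raw measurement.
        if raw_value is None or final_value is None:
            continue
        _station, year, _month = key
        diffs[year].append(final_value - raw_value)
    return {year: sum(v) / len(v) for year, v in sorted(diffs.items())}

# Toy data: station B has no raw value, so its final value is an estimate.
raw = {("A", 2019, 1): 10.0, ("A", 2019, 2): 12.0, ("B", 2019, 1): None}
final = {("A", 2019, 1): 10.5, ("A", 2019, 2): 12.5, ("B", 2019, 1): 20.0}

paired = paired_difference_by_year(raw, final)
print(paired)
```

In this toy case the estimated value for station B never enters the result, because there is no raw value to subtract from it.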
In Figure 1 we can see two things. First, the number of raw data stations drops quickly from 2005 to 2019. As we can see in Figure 2, this is not a problem for the final temperatures. How is this so? It turns out that as the active weather stations disappear from the network, the final temperature for them is estimated from neighboring stations. The estimates are made from nearby active stations using the NOAA pairwise homogenization algorithm or “PHA” (Menne & Williams, 2009a). The USHCN is a high-quality subset of the larger COOP set of stations. The PHA estimates are not made with just the USHCN high-quality stations; the algorithm uses the full COOP set of stations (Menne, Williams, & Vose, 2009).
Figure 2. Final temperatures from 1900 to 2019. Notice 1218 values are present from 1917 to 2019. As shown in Figure 1, these are not all measurements; a significant number of the values are estimated. Data used is from NOAA.
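One way to see how many final values are estimates is to count, for each year, the final values that have no matching raw value. The sketch below does this in Python under assumptions of my own: a hypothetical dictionary layout keyed by (station, year, month), `None` for a missing value, and "estimated" taken to mean "final value present, raw value absent." It does not read the flags in the NOAA files.

```python
from collections import Counter

def count_final_values_by_year(raw, final):
    """Return (with_raw, without_raw) counts of final values per year.

    raw, final: dicts mapping (station, year, month) -> temperature or None.
    A final value without a raw match is treated here as an estimate.
    """
    with_raw = Counter()
    without_raw = Counter()
    for key, final_value in final.items():
        if final_value is None:
            continue
        _station, year, _month = key
        if raw.get(key) is None:
            without_raw[year] += 1
        else:
            with_raw[year] += 1
    return with_raw, without_raw

# Toy data: in 2019 station B reports no raw value but still has a final value.
raw = {("A", 2005, 1): 9.0, ("B", 2005, 1): 19.0,
       ("A", 2019, 1): 10.0, ("B", 2019, 1): None}
final = {("A", 2005, 1): 9.2, ("B", 2005, 1): 19.1,
         ("A", 2019, 1): 10.5, ("B", 2019, 1): 20.0}

with_raw, without_raw = count_final_values_by_year(raw, final)
print(dict(with_raw), dict(without_raw))
```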
So, what happens if we simply average the values for each month in both the raw dataset and the final dataset, ignoring nulls, then subtract the raw yearly average from the final yearly average? This is done in Figure 3. We realize that the raw values represent fewer stations and that the final values contain many estimated values. The number of estimated final values increases rapidly from 2005 to 2019.
Figure 3. USHCN final-raw temperatures computed year-by-year, regardless of the number of stations in each dataset. The sharp rise in the temperature difference is from 2015-2019.
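The second method can be sketched the same way. This is a minimal Python illustration, not the R code behind Figure 3. The (station, year, month) dictionary layout and the function names are assumptions, and so is the order of averaging used here (across stations within each month first, then across the months of the year), which the text above does not spell out.

```python
from collections import defaultdict

def yearly_mean(data):
    """Mean of the monthly network averages for each year, ignoring None."""
    monthly = defaultdict(list)
    for (_station, year, month), value in data.items():
        if value is not None:
            monthly[(year, month)].append(value)
    yearly = defaultdict(list)
    for (year, _month), values in monthly.items():
        yearly[year].append(sum(values) / len(values))
    return {year: sum(v) / len(v) for year, v in yearly.items()}

def unpaired_difference_by_year(raw, final):
    """Final yearly mean minus raw yearly mean, each computed on its own."""
    raw_mean = yearly_mean(raw)
    final_mean = yearly_mean(final)
    return {year: final_mean[year] - raw_mean[year]
            for year in sorted(final_mean) if year in raw_mean}

# Toy data: station B is warm and has only an estimated final value.
raw = {("A", 2019, 1): 10.0, ("A", 2019, 2): 12.0, ("B", 2019, 1): None}
final = {("A", 2019, 1): 10.5, ("A", 2019, 2): 12.5, ("B", 2019, 1): 20.0}

unpaired = unpaired_difference_by_year(raw, final)
print(unpaired)
```

With these toy numbers the station-by-station adjustment is 0.5 degrees, but the unpaired difference is much larger because the estimated value for station B is averaged into the final dataset and has no counterpart in the raw dataset. The two methods are measuring different things.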
Discussion
The above plots use all raw data and all final data in the USHCN datasets. Information about the data is available on the NOAA web site. In addition, John Goetz has written about the data and the missing values in some detail here.
The USHCN weather stations are a subset of the larger NOAA Cooperative Observer Program weather stations, the “COOP” mentioned above. USHCN stations are the stations with longer records and better data (Menne, Williams, & Vose, 2009). All the weather station measurements are quality checked, and if problems are found a flag is added to the measurement. To make the plots shown here, the flags were ignored, and all values were plotted and used in the calculations. Some plots made by NOAA and others with this data are made this way, and others reject some or all flagged data. Little data exists before 1900, so we chose to begin our plots at that date. There are fewer than 200 stations in 1890. All the weather stations in the USHCN network are plotted in Figure 4. Those with more than 50 missing monthly averages between January 2010 and the end of 2019 are noted with red boxes around the symbol.
Figure 4. All USHCN weather stations. Those with missing raw data monthly averages have red boxes around them. Data source: NOAA.
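The rule for boxing a station in red can be written down in a few lines. The Python sketch below is an illustration under assumptions, not the code used to draw Figure 4: the function name, the (station, year, month) dictionary layout, and the toy stations are hypothetical, and a month counts as missing when it is absent from the dictionary or set to `None`.

```python
def stations_with_missing_raw(raw, stations, start_year=2010,
                              end_year=2019, threshold=50):
    """Return stations with more than `threshold` missing raw monthly values.

    raw: dict mapping (station, year, month) -> temperature or None.
    The window runs from January of start_year to December of end_year.
    """
    flagged = []
    for station in stations:
        missing = sum(
            1
            for year in range(start_year, end_year + 1)
            for month in range(1, 13)
            if raw.get((station, year, month)) is None
        )
        if missing > threshold:
            flagged.append(station)
    return flagged

# Toy data over the 120 months from 2010 through 2019:
# A is complete, B stops after 2014 (60 missing),
# C stops after October 2015 (exactly 50 missing, so not flagged).
raw = {}
for year in range(2010, 2020):
    for month in range(1, 13):
        raw[("A", year, month)] = 15.0
        if year < 2015:
            raw[("B", year, month)] = 15.0
        if year < 2015 or (year == 2015 and month <= 10):
            raw[("C", year, month)] = 15.0

flagged = stations_with_missing_raw(raw, ["A", "B", "C"])
print(flagged)
```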
Conclusions
The plots above show that the overall effect of the estimated, or “infilled,” final monthly average temperatures is a rapid recent rise in average temperature, as is clearly seen in Figure 3. In Figure 3 the overall monthly averages from the final weather station values, including the estimated (“infilled”) ones, are averaged and then compared to the average of the real measurements, the raw data. This is not a station-by-station comparison. The station-by-station comparison is shown in Figure 1. In Figure 1 the monthly differences are computed only if a station has both a raw measurement and a final value. The values from 2010 to 2019 still look strange, but not as strange as in Figure 3.
Clearly the rapid drop-off of stations during this time, which averages more than 20 stations per year, is playing a role in the strange difference between Figures 1 and 3. But the extreme jump seen from 2015-2019 in Figure 3 is mostly in the estimated values in the final dataset. We might think the 2016 El Niño played a role in this anomaly, but it continues to 2019. The El Niño effect reversed in 2017 in the U.S., as seen in Figure 2. Besides, this anomaly is not in temperature; it is a difference between the final and raw temperature values in the USHCN dataset.
Figure 4 makes it clear that the dropped stations (boxed in red) are widely scattered. The areal coverage over the lower 48 states is similar in 2010 and 2019, except perhaps in Oklahoma; I am not sure what happened there. But in the final dataset, values were estimated for all the terminated weather stations, and those estimated values apparently caused the jump shown in Figure 3.
I don’t have an opinion about how the year-by-year Final-Raw anomaly in Figure 3 happened, only that it looks very strange. Reader opinions and additional information are welcome.
Some final points. I used R to read and process the data, although I used Excel to make a lot of the graphs. The USHCN data is complete and reasonably well documented for the most part, but hard to read and get into a usable form. For those who want to check what I’ve done and make sure these plots were made correctly, I’ve collected my R programs in a zip file that you can download and use to check my work.
I plan to do more with the USHCN data and its companion GHCN (Global Historical Climatology Network) dataset. I’ll publish more posts on them as issues come up.
Confusing tob
One point of confusion in the data, unrelated to this post. NOAA calls their time-of-observation corrected data “tob.” It stands for time-of-observation bias and accounts for minimum and maximum temperatures being read at different times of day at different stations. All the tob data supplied on their FTP site has 13 monthly values. I’ve read the papers and the documentation but cannot figure out why there are 13 monthly values for tob, but only 12 monthly values for all the other datasets. I emailed them to ask but have not received an answer to date. Does anyone know? If so, please put the answer in the comments.
Download the R code used to read the USHCN monthly raw and final data and compute the data plotted in this post here.
Nice essay Andy. I don’t do temperatures because I mistrust the apparently innumerable corrections. But lots of folks do. I’ll be looking forward to reading the comments.
I don’t know why they even bother starting with thermometer readings these days.
Why not just make up some numbers right from the get-go?
Thank you Andy for an open and documented article.
The 13 month issue might be the fixed solar calendar: https://en.wikipedia.org/wiki/International_Fixed_Calendar
Andy, Tony Heller has been talking about this for some time now. Check out Tony’s website, realclimatescience.com, where he posts his videos. NOAA is flagrantly filling in phony estimated values to account for these missing stations you referred to. The estimates, of course, are warmer. Tony explains this in detail in his videos; he’s checked the raw data, too.
Andy,
My graph of the disappearance of stations agrees with the orange line on your figure 1. 400 stations gradually vanished starting in 1989, the MannHansen year.
https://theearthintime.com/reporting.jpg
Did these stations stop recording? Most had been recording for 100 years, a heroic and admirable contribution to science. It is doubtful they ‘went out of business’ because NOAA withdrew funds. Their instruments were in place, nothing expensive about noting TMAX, TMIN, SNOW, PERC each day.
Has anyone contacted one or more of the blocked stations? Is there a blog post anywhere that reports the direct scoop?
Surely the data still exists. Mr. Trump tear down this wall. Get the data.
Is there a system in ‘which stations must disappear’? Someone must organize the process of loss of stations.
What is planned?
Do you follow Tony Heller’s work?
This article appears to be very similar to Tony’s analysis–that he’s been blogging about for the last 10+ years:
“Alterations To The US Temperature Record
NOAA also provides the raw (unadjusted) monthly temperature data. For average temperature, it is available at this link. The unadjusted raw monthly graph shows much less warming than the adjusted data, and has recent years cooler than the 1930s.”
https://realclimatescience.com/alterations-to-the-us-temperature-record/
Excellent article from ‘realclimatescience’
I recently compared the global atmospheric temperature record, GISS, with the ocean surface temperature records, ENSO and AMO, since 1880. GISS shows an increase of 1.1 C in the atmospheric temperatures over that period, while ENSO and AMO show little to no trend (only cyclic variations) in the ocean surface temperatures over the same period.
I would argue that it defies the laws of physics that the atmospheric temperature should decouple from the ocean surface temperatures by 1 C over the last 140 years.
It seems that the ‘temperature adjusters’ have only adjusted the atmospheric temperature databases, but have forgotten the sea surface temperature databases. Another ‘smoking gun’ in the evidence of data manipulation to fabricate the global warming/climate change narrative.
Gosh. Climate scientists are fiddling the statistics?
Are you kidding me?
I’m sure personal ethics & professional embarrassment will cause them to immediately explain & correct their work. If enough of this were to happen, untrained peasants would soon lose respect & trust in climate science.
For a real surprise, you should cross-plot the yearly mean temperature difference against atmospheric CO2 concentrations. The linear relationship will surprise you. Tony Heller has shown significant correlation between the data adjustments and rising atmospheric CO2 concentrations.
In other words, if you torture the data, it will confess!
NOAA has a time machine that can go to the past and collect data and a heat pump to modify the more recent ones. I was a data and database administrator and what I used anomalies for was to find data entry errors. Then I would go back to original documents or back to the original sources and do corrections. Infilling data was not allowed.
Thanks for digging into this Andy. It reminds me of Ross McKitrick’s graph of some years ago, also showing rising temps coinciding with the great decline in # of stations reported.
https://rclutz.files.wordpress.com/2016/08/ave.-t-vs.-no.-stations.png
I did an analysis of the stations given top ratings for siting by WUWT, and found that the flagging allows removal of recent year data so that values are infilled from elsewhere. For example, see San Antonio
https://rclutz.files.wordpress.com/2015/04/san-antonio-ghchm-noaa.png?w=1000
https://rclutz.wordpress.com/2015/04/26/temperature-data-review-project-my-submission/
It always seemed to me that missing data should be infilled by means of the trend observed at that particular station.
“I plan to do more with the USHCN data and its companion GHCN (Global Historical Climate Network) dataset. I’ll publish more posts on them as issues come up.”
1. USHCN IS NO LONGER USED. It ceased to be the data of record YEARS AGO.
2. One of the issues with USHCN is it is NOT a high quality record like many people think
A) many of the stations are spliced.
B) and many stations have estimated data ( as the author notes)
3. The ENTIRE adjustment approach was SCRAPPED
Basically NOBODY USES USHCN IN ANY GLOBAL RECORD. It has been replaced ENTIRELY
along with TOBS
GHCN is NOT A COMPANION to USHCN
So how is it that the decline of 1/3 in number of stations happen to be “warm” stations ? Since the trend of the remaining stations results in lower average temperature by a rather large whole degree C ?
Yet those expensive well sited USCRN stations, not part of USHCN, show…..NADA
https://www.ncdc.noaa.gov/temp-and-precip/national-temperature-index/time-series?datasets%5B%5D=uscrn&parameter=anom-tavg&time_scale=p12&begyear=2004&endyear=2020&month=12