Recent USHCN Final v Raw Temperature differences

By Andy May

While studying the NOAA USHCN (United States Historical Climatology Network) data I noticed that the recent differences between the raw and final average annual temperatures were anomalous. The plots in this post are computed from the USHCN monthly averages. The most recent version of the data can be downloaded here. The data shown in this post were downloaded in October 2020 and are complete through September 2020.

There are two ways to compute the difference, and they give different answers. One way is to subtract the raw from the final temperature month by month, ignoring missing values, and then average the differences by year. Computed this way from the USHCN monthly values, a difference is formed only when both the raw and final average temperatures exist for a given month and station. With this method, the numerous "estimated" final temperatures are ignored, because they have no matching raw temperature. This result is plotted in Figure 1.
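For readers who want to follow along in R, here is a minimal sketch of this paired-difference calculation. It is not the code in my downloadable scripts; the data-frame and column names (raw_df, final_df, station, year, month, temp) are illustrative only and assume the monthly files have already been read into long format:

```r
# Sketch of the paired (station-by-station) difference.
# Assumes long-format data frames 'raw_df' and 'final_df' with columns:
# station, year, month, temp. Names are illustrative only.
library(dplyr)

paired_diff <- inner_join(final_df, raw_df,
                          by = c("station", "year", "month"),
                          suffix = c("_final", "_raw")) %>%
  filter(!is.na(temp_final), !is.na(temp_raw)) %>%  # keep months where both exist
  mutate(diff = temp_final - temp_raw) %>%          # final minus raw, month by month
  group_by(year) %>%
  summarise(mean_diff  = mean(diff),                # average the differences by year
            n_stations = n_distinct(station))
```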

Figure 1. Plot of USHCN final temperatures minus raw data. The difference between final and raw monthly average temperatures is computed when both exist for a specific month. The differences are then averaged. Data used is from NOAA.

In Figure 1 we can see that the number of stations reporting raw data drops quickly from 2005 to 2019. As we can see in Figure 2, this is not a problem for the final temperatures. How is this so? It turns out that as active weather stations disappear from the network, their final temperatures are estimated from neighboring stations. The estimates are made from nearby active stations using the NOAA pairwise homogenization algorithm, or "PHA" (Menne & Williams, 2009a). The USHCN is a high-quality subset of the larger COOP set of stations. The PHA estimates are not made with just the USHCN high-quality stations; the algorithm utilizes the full COOP set of stations (Menne, Williams, & Vose, 2009).
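The PHA itself is far more involved than simple interpolation; it detects and corrects breakpoints by comparing each station with many neighbors across the full COOP network. Purely to illustrate the general idea of filling a closed station from nearby active ones, here is a toy inverse-distance-weighted estimate in R. This is not NOAA's algorithm, and the station values and distances are invented:

```r
# Toy illustration only: estimate a missing monthly value from neighboring
# stations by inverse-distance weighting. This is NOT NOAA's pairwise
# homogenization algorithm (PHA); all numbers are invented.
neighbors <- data.frame(
  station = c("A", "B", "C"),
  temp    = c(14.2, 13.8, 14.6),  # neighboring stations' monthly means (deg C)
  dist_km = c(25, 40, 60)         # distances to the closed station
)

w <- 1 / neighbors$dist_km                    # inverse-distance weights
estimate <- sum(w * neighbors$temp) / sum(w)  # weighted estimate for the closed station
estimate
```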

Figure 2. Final temperatures from 1900 to 2019. Notice that 1,218 station values are present from 1917 through 2019. As shown in Figure 1, these are not all measurements; a significant number of the values are estimated. Data used is from NOAA.

So, what happens if we simply average the values for each month in both the raw dataset and the final dataset, ignoring nulls, and then subtract the raw yearly average from the final yearly average? This is done in Figure 3. Keep in mind that the raw values represent fewer stations and that the final values contain many estimated values; the number of estimated final values increases rapidly from 2005 to 2019.
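A minimal R sketch of this second, "average everything, then difference" calculation, using the same illustrative data frames as above:

```r
# Sketch of the "average first, then difference" calculation behind Figure 3.
# Same illustrative long-format data frames: station, year, month, temp.
library(dplyr)

annual_mean <- function(df) {
  df %>%
    filter(!is.na(temp)) %>%                                  # ignore nulls
    group_by(year, month) %>%
    summarise(month_mean = mean(temp), .groups = "drop") %>%  # average across stations
    group_by(year) %>%
    summarise(annual = mean(month_mean))                      # then average the 12 months
}

annual_diff <- inner_join(annual_mean(final_df), annual_mean(raw_df),
                          by = "year", suffix = c("_final", "_raw")) %>%
  mutate(diff = annual_final - annual_raw)  # the series plotted in Figure 3
```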


Figure 3. USHCN final-raw temperatures computed year-by-year, regardless of the number of stations in each dataset. The sharp rise in the temperature difference is from 2015-2019.

Discussion

The above plots use all raw data and all final data in the USHCN datasets. Information about the data is available on the NOAA web site. In addition, John Goetz has written about the data and the missing values in some detail here.

The USHCN weather stations are a subset of the larger NOAA Cooperative Observer Program weather stations, the "COOP" mentioned above. USHCN stations are the stations with longer records and better data (Menne, Williams, & Vose, 2009). All the weather station measurements are quality checked, and if problems are found, a flag is added to the measurement. To make the plots shown here, the flags were ignored and all values were plotted and used in the calculations. Some plots made with these data, by NOAA and others, are made this way; other plots reject some or all flagged data. Little data exists before 1900, so we chose to begin our plots at that date; there are fewer than 200 stations in 1890. All the weather stations in the USHCN network are plotted in Figure 4; those with more than 50 missing monthly averages between January 2010 and the end of 2019 are marked with red boxes around their symbols.

Figure 4. All USHCN weather stations. Those with missing raw data monthly averages have red boxes around them. Data source: NOAA.

Conclusions

The plots above show that the overall effect of the estimated, or "infilled," final monthly average temperatures is a rapid recent rise in the final average temperature relative to the raw data, as is clearly seen in Figure 3. In Figure 3 the monthly averages of the final station values, which include the infilled estimates, are compared to the averages of the actual measurements, the raw data. This is not a station-by-station comparison; that comparison is shown in Figure 1, where the monthly differences are computed only when a station has both a raw measurement and a final value. The values from 2010 to 2019 still look strange, but not as strange as in Figure 3.

Clearly the rapid drop-off of stations during this time, which averages more than 20 stations per year, is playing a role in the strange difference between Figures 1 and 3. But the extreme jump seen from 2015 to 2019 in Figure 3 is mostly in the estimated values in the final dataset. We might think the 2016 El Nino played a role in this anomaly, but the anomaly continues to 2019, and the El Nino effect reversed in 2017 in the U.S., as seen in Figure 2. Besides, this anomaly is not in temperature itself; it is a difference between the final and raw temperature values in the USHCN dataset.

Figure 4 makes it clear that the dropped stations (boxed in red) are widely scattered. The areal coverage over the lower 48 states is similar in 2010 and 2019, except perhaps in Oklahoma; it is not clear what happened there. But in the final dataset, values were estimated for all the terminated weather stations, and those estimated values apparently caused the jump shown in Figure 3.

I don’t have an opinion about how the year-by-year Final-Raw anomaly in Figure 3 happened, only that it looks very strange. Reader opinions and additional information are welcome.

Some final points. I used R to read and process the data, although I used Excel to make many of the graphs. The USHCN data are complete and reasonably well documented for the most part, but hard to read and get into a usable form. For those who want to check what I've done and make sure these plots were made correctly, I've collected my R programs in a zip file that you can download and use to check my work.
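As an example of what "hard to read" means, here is a hedged sketch of reading one of the monthly files with read.fwf(). It assumes a GHCN-M-style fixed-width layout (an 11-character station ID, a 4-digit year, then twelve value-plus-flag fields, with -9999 marking missing values); the exact column widths, flag meanings, and units should be verified against the readme on the NOAA ftp site, and my zip file contains the programs actually used:

```r
# Hedged sketch of reading a USHCN monthly file. Assumed layout: 11-char station
# ID, 4-digit year, then twelve fields of a 6-char value plus 3 flag characters,
# with -9999 for missing. Check widths, flags, and units against NOAA's readme.
read_ushcn_monthly <- function(path) {
  widths <- c(11, 4, rep(c(6, 3), 12))
  cols   <- c("station", "year",
              as.vector(rbind(paste0("val", 1:12), paste0("flag", 1:12))))
  df <- read.fwf(path, widths = widths, col.names = cols,
                 colClasses = "character", strip.white = TRUE)
  for (m in 1:12) {
    v <- suppressWarnings(as.numeric(df[[paste0("val", m)]]))
    v[!is.na(v) & v == -9999] <- NA   # assumed missing-value code
    df[[paste0("val", m)]] <- v       # values may be in hundredths of a degree; see readme
  }
  df$year <- as.integer(df$year)
  df
}
```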

I plan to do more with the USHCN data and its companion GHCN (Global Historical Climatology Network) dataset. I'll publish more posts on them as issues come up.

Confusing tob

One point of confusion in the data, unrelated to this post. NOAA calls their time-of-observation corrected data "tob." It stands for time-of-observation bias and accounts for minimum and maximum temperatures being recorded at different times of day at different stations. All the tob data supplied on their ftp site have 13 values per year. I've read the papers and the documentation but cannot figure out why there are 13 values per year for tob, but only 12 monthly values for all the other datasets. I emailed them to ask but have not received an answer to date. Does anyone know? If so, please put the answer in the comments.

Download the R code used to read the USHCN monthly raw and final data and compute the data plotted in this post here.

You can purchase my latest book, Politics and Climate Change: A History, here. The content in this post is not from the book.

The bibliography can be downloaded here.

Comments
fred250
November 3, 2020 12:19 pm

USCRN has got the data tampering under control,

… so much so that Climdiv matches USCRN almost exactly, and has no effective warming since 2005

Just the bulge from the 2015 -> Big Blob/El Nino effect, now nearly disappeared.

[image]

Certainly, I would not trust anything before USCRN, it is so totally adjusted, infilled and tortured as to be worthless.

ron sterren
November 3, 2020 12:38 pm

I would love to see a plot of only data recorded compared to data with estimates included.
I would also like to see a year by year plot of stations that were removed and if they were rural or city stations.

Steven Mosher
Reply to  ron sterren
November 6, 2020 2:15 am

no difference

RickWill
November 3, 2020 12:47 pm

Any temperature measurement that shows global warming or cooling is suspect. Earth’s sea surface temperature is tightly controlled with 271.3K the lower limit and 305K the upper limit. Ocean circulations can shift averages slightly between these limits but the limits are firm; no surprise that the global surface temperature is close to the arithmetic mean of these limits 288K. The moored buoys in the tropical oceans have no trend:
https://1drv.ms/u/s!Aq1iAj8Yo7jNg3LJuByjstkozrzc
Unlike the rubbish that comes out of climate models.

Bindidon
Reply to  RickWill
November 3, 2020 4:11 pm

RickWill

Again nonsense, like your comment I replied to this afternoon:

https://wattsupwiththat.com/2020/11/02/uah-global-temperature-update-for-october-2020-0-54-deg-c/#comment-3117974
*
How can you compare a tiny region like the Tropical Pacific with the rest of the World?
That is disingenuous AND incompetent.

I tell you this: instead of endlessly criticizing people on the basis of vague assumptions, try to learn what they really do.

J.-P. D.

fred250
November 3, 2020 1:27 pm

Funny what you can find if you go looking

[image]

fred250
November 3, 2020 2:02 pm

Which data does the US public get to see via the MSM?

Climdiv, with no warming since 2005, or USHCN fake data ?

fred250
November 3, 2020 3:06 pm

We can see from NOAA data 2000 when the hottest period in the USA was

[image]

Brief gap then no more warming from 2005.

Keith Sketchley
November 3, 2020 3:46 pm

So another version of the huge problem of uneven data from surface stations and weather balloons.

Recall at one point that a big government database had only one reporting station north of 60 – the variable one at Alert, near 80N and Greenland. Location chosen long ago for a usable harbour to set up an establishment there. Subject to a Foehn wind.

Then there are the oceans, with reporting variable, much of it from ship engine coolant water flows, with measurements not intended as climate data and with questionable accuracy. (I ask if location is recorded with temperature, because local weather varies short term.) Certainly not in the days before SATNAV, except in coastal areas covered by Loran C.

Yes, satellites cover the globe but alarmists evade that data?

TheFinalNail
Reply to  Keith Sketchley
November 4, 2020 10:34 pm

Keith Sketchley

“Yes, satellites cover the globe but alarmists evade that data?”

The differences, in terms of both scale and trend, between the 2 main global satellite (lower troposphere) temperature producers are much greater than those between the surface data producers. Especially since 1998 when AMSU radiometers came into use. UAH v6 is consistently cooler than RSS v4 when both are set to the same anomaly base.

The overall warming trend since 1979 in UAH (+0.14 C/dec) is also substantially lower than that in RSS (+0.22 C/dec). By contrast, differences in the trends of the surface data sets over the same period are relatively small (+0.17; +0.19; +0.18 C/dec; HadCRUT, GISS and NCDC respectively).

Changes between versions of satellite data sets also tend to produce large fluctuations in trend. UAH v6 (2015) reduced the global warming trend in the previous version by -0.03C/dec. RSS v4 (2017) increased the trend in the previous version by +0.05C/dec. You often see the satellite data sets described as “pristine” or similar; but disagreements between producers over how to interpret AMSU data and the sometimes radical trend changes that can accompany updates suggests they are not quite there yet.
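For anyone who wants to reproduce trend figures like these, a minimal R sketch of an ordinary least-squares trend expressed in °C per decade, assuming a data frame of monthly anomalies with columns date (class Date) and anom (°C); the column names are illustrative and the result will depend on the dataset and version used:

```r
# Minimal sketch: OLS trend of a monthly anomaly series in deg C per decade.
# Assumes a data frame 'anoms' with columns 'date' (Date) and 'anom' (deg C).
trend_per_decade <- function(anoms) {
  t_years <- as.numeric(anoms$date - min(anoms$date)) / 365.25  # elapsed years
  fit <- lm(anoms$anom ~ t_years)
  unname(coef(fit)[2]) * 10  # slope per year times 10 = deg C per decade
}
```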

Reply to  TheFinalNail
November 5, 2020 4:07 am

Indeed there was an ‘overall warming trend’ beginning in ~1970s, but it ended, replaced by an overall cooling trend since then. How did the satellites miss it?

Steven Mosher
Reply to  TheFinalNail
November 6, 2020 7:03 am

UAH has a horrible bias over land, northern land, post 1997.
the reason is simple

Steven Mosher
November 6, 2020 3:04 am

jesus guys

https://www.ncdc.noaa.gov/temp-and-precip/national-temperature-index/background

NCEI now uses a new dataset, nClimDiv, to determine the contiguous United States (CONUS) temperature. This new dataset is derived from a gridded instance of the Global Historical Climatology Network (GHCN-Daily), known as nClimGrid.

Previously, NCEI used a 2.5° longitude by 3.5° latitude gridded analysis of monthly temperatures from the 1,218 stations in the US Historical Climatology Network (USHCN v2.5) for its CONUS temperature. The new dataset interpolates to a much finer-mesh grid (about 5km by 5km) and incorporates values from several thousand more stations available in GHCN-Daily. In addition, monthly temperatures from stations in adjacent parts of Canada and Mexico aid in the interpolation of U.S. anomalies near the borders.

The switch to nClimDiv has little effect on the average national temperature trend or on relative rankings for individual years, because the new dataset uses the same set of algorithms and corrections applied in the production of the USHCN v2.5 dataset. However, although both the USHCN v2.5 and nClimDiv yield comparable trends, the finer resolution dataset more explicitly accounts for variations in topography (e.g., mountainous areas). Therefore, the baseline temperature, to which the national temperature anomaly is applied, is cooler for nClimDiv than for USHCN v2.5. This new baseline affects anomalies for all years equally, and thus does not alter our understanding of trends.

Figure 1: Comparison of national nClimDiv normals and national USCRN estimated normals for each month of the year for maximum, mean and minimum temperatures.
So as not to compare apples and oranges, the departures of nClimDiv and the U.S. Climate Reference Network (USCRN) values from normal values for each network are compared rather than the absolute values. The 30 years from 1981 through 2010 provide the basis for the normal period for each month and network. Data exist for nClimDiv from 1895 to present, so a normal is simply the 30-year average of the gridded data. USCRN observations since commissioning to the present (4-9 years) were used to find relationships to nearby COOP stations and estimate normals at the USCRN sites using the normals at the surrounding COOP sites derived from full 1981-2010 records (Sun and Peterson, 2005). The normal values for each month are then subtracted from each monthly temperature average to create anomalies or departures from normal that can be compared between networks over time. In the final step, the station data are interpolated to 0.25° latitude by 0.25° longitude grids over the lower 48 states and then combined in an area weighted average for the whole U.S. The final results of this process are the monthly National Temperature Index (NTI) values for nClimDiv and USCRN. Figure 1 displays the differences between nClimDiv normals and USCRN estimated normals for national maximum, mean and minimum temperature. Since the normal values in each of the networks do not maintain the same relationship from month-to-month, comparing absolute values would be confusing.

Figure 2: Comparison of annual national anomalies for the nClimDiv (solid) and USCRN (dashed) maximum (red), mean (green) and minimum (blue) temperatures.
In addition to comparing NTI’s for nClimDiv and USCRN for 1 month, users may compare anomalies between nClimDiv and USCRN for the last 3, 6, 12 months, year-to-date (YTD), and annual periods. These are available on the time series graphing page. Close agreement between these two datasets demonstrates that nClimDiv and USCRN are both capable of accurately measuring the surface air temperature of the U.S. currently, in the past, and into the future. By verifying the USCRN result, the nClimDiv shows that a network with a relatively small number of carefully located and extremely accurate stations can represent the U.S. temperature record into the future. Having both is the best of all worlds, two records moving in lockstep into the future and providing independent verification of the temperature changes in the U.S. during the coming century. Close agreement between these two datasets demonstrates that nClimDiv and USCRN are both capable of accurately measuring the surface air temperature of the U.S. currently, in the past, and into the future (Figure 2).

A more detailed description of the nClimGrid dataset, which is used to compute nClimDiv, is available as Vose et al., 2014. Interested users are encouraged to read Vose et al., 2014 and its citations for further information.

References
Sun, B. and T. C. Peterson, 2005: Estimating temperature normals for USCRN stations. Int. J. Climatol., 25: 1809-1817. doi: 10.1002/joc.1220.
Vose, R.S., Applequist, S., Durre, I., Menne, M.J., Williams, C.N., Fenimore, C., Gleason, K., Arndt, D. 2014: Improved Historical Temperature and Precipitation Time Series For U.S. Climate Divisions Journal of Applied Meteorology and Climatology. DOI: http://dx.doi.org/10.1175/JAMC-D-13-0248.1

Steven Mosher
November 6, 2020 3:11 am

The transition away from USHCN was announced in 2011

Fennimore et al., Transitioning…, NOAA/NESDIS/NCDC (2011)

and completed in 2014

Steven Mosher
November 6, 2020 3:15 am

NASA GISS used to use USHCN for input into global data

they STOPPED USING IT IN 2011

They HAVE NOT USED USHCN SINCE 2011

FFS people read the manual

https://data.giss.nasa.gov/gistemp/updates_v3/

See the BOTTOM entry in the change log

January 18, 2017: The legacy code, a mixture of Fortran/Python/C/sh routines was replaced by a pure Python code, an adaptation of the work done by the Clear Climate Code team. The results of the two codes are not identical, but the differences are well within the accuracy of the whole method. Global mean estimates e.g. may differ by an occasional ±0.01°C.

The new version of the code is made available on this site. A slightly modified version was made available at 6 PM EST to deal with the possibility that the switch to “https” might prevent a successful download of the input files.

October 21, 2016: While restructuring the “Station Data” site in April 2016, a bug was introduced in the utility creating the text and CSV tables of the station data. It caused the skipping of some years. That bug was fixed. It had no impact on the graph or any other part of the analysis. Thanks to David Appell for bringing it to our attention.

September 12, 2016: The gridding process was slightly modified at the poles: Rather than treating the 40 sub-boxes ending at the North pole and at the South pole as individual boxes with centers at 84N and 84S respectively, they are now combined into a single box centered at the pole. The impact of this change on the global means is insignificant; however the maps look somewhat more realistic in the polar regions.

May 13, 2016: Some temperature series from GHCN and the various SCAR sources turned out to be identical, but we treated them as separate stations, giving too much weight to such a series compared to neighboring stations during the gridding process. Now we are only using one of these series; if possible we select the SCAR READER/surface data, else the GHCN data; we use the remaining two SCAR sources only if no other data are available for that location. The impact on the results was minimal (less than 0.01°C for the global monthly means). This was done by including these duplicate stations into the list of discarded records. The downloadable source package was updated correspondingly.

January 20, 2016: Some tables are now presented in human-readable as well as machine-readable CSV format. Unfortunately, the wrong files were used. That mishap was corrected at 5:30 PM; however, the replacement files were created by a newer version of the GISS analysis. On 1/21/2016 they were replaced by the proper files.

The currently used SCAR data are made available from the main GISTEMP site. The animations were updated to include the most recent data.

The maps page has been updated and uses a rewritten program to generate the map and the zonal line plot. Plots for the most recent available month are displayed when the page is first loaded. We anticipate making related updates to the station data and graphs pages in the coming weeks.

July 19, 2015: The data and results put on the public site on July 15 were affected by a bug in the ERSST v4 part of the automated incremental update procedure. The analysis was re-done after recreating the full version of SBBX.ERSSTv4 separately. We would like to acknowledge and thank Nick Stokes for noticing that there might be a problem with these data.

July 15, 2015: Starting with today’s update, the standard GISS analysis is no longer based on ERSST v3b but on the newer ERSST v4. Dr. Makiko Sato created some graphs and maps showing the effect of that change. More information may be obtained from NOAA/NCDC’s website. Furthermore, we eliminated GHCN’s Amundsen-Scott temperature series using just the SCAR reports for the South Pole.

June 13, 2015: NOAA’s NCEI (formerly NCDC) switched from v3.2.2 to the new release v3.3.0 of the adjusted GHCN, which is our basic source. This upgrade included filling some gaps in a few station records and fixing some small bugs in the homogenization procedure. NCEI’s description of those changes is available here. One of the impacts was removing some data that the GISS procedure had always eliminated and the list of GISS corrections was correspondingly reduced. Hence the (insignificant) impact on the GISS analysis was slightly different from the impact described in that document. The changes produced a decrease of 0.006°C/decade for the 1880 to 2014 trend of the annual mean land surface air temperature rather than the 0.003°C/decade increase reported by NCEI. Both are substantially less than the margin of error for that quantity (±0.016°C/decade). Impacts on the changes of the annual Land-Ocean temperature index (global surface air temperature) were about 5 to 10 times smaller than the margin of error for those estimates.

Please note that neither the land data nor the ocean data used in this analysis are the ones used in the NCEI paper “Possible artifacts of data biases in the recent global surface warming hiatus” that appeared on June 4, 2015. For the ocean data, GISS still uses ERSST v3b rather than the newer ERSST v4, but will switch to that file next month, when we add the June 2015 data; the collection of land station data used in that paper includes many more sources than GHCN v3.3.0 and will probably be incorporated into a future GHCN v4.

May 15, 2015: Due to an oversight several Antarctic stations were excluded from the analysis on May 13, 2015. The analysis was repeated today after including those stations.

February 14, 2015: UK Press reports in January 2015 erroneously claimed that differences between the raw GHCNv2 station data (archived here) and the current final GISTEMP adjusted data were due to unjustified positive adjustments made in the GISTEMP analysis. Rather, these differences are dominated by the inclusion of appropriate homogeneity corrections for non-climatic discontinuities made in GHCN v3.2 in 2011/2012. See the earlier notes from December 14, 2011 and September 26, 2012; more details are provided in the FAQ.

December 29, 2014: The title on the US temperature graph was corrected by replacing “Continental US” by “Contiguous US”. References to the corresponding graphs in the literature were updated.

September 15, 2014: Color maps using the Robinson projection or polar projection are now presented without contour smoothing, since that process occasionally results in skipping some color bands. It seems however to work fine for the equirectangular projection.

July 14, 2014: The missing China May 2014 reports became available and are now part of our analysis. That correction increased the global May 2014 anomaly by a statistically insignificant 0.002°C.

June 17, 2014: Analysis was delayed hoping the missing reports from China would become available. Unfortunately, this has not been the case yet. Please note, that the current May 2014 data are therefore not directly comparable to previous records.

February 14, 2014: Two January 2014 reports from Greenland (Godthab Nuuk and Angmagssalik) and one from Mongolia (Dauunmod) were disregarded since they seemed unusual and proved to be inconsistent with other reports.

January 21, 2014: The GISS analysis was repeated this morning based on today’s status of the GHCN data. The changes were well within the margin of error, e.g. the L-OTI mean for 2013 changed from 0.6048±0.02°C to 0.6065±0.02°C, a change of less than 0.002°C. However, rounding to 2 digits for the L-OTI table changed the 0.60°C used in some documents prepared last week to 0.61°C. This minuscule change also moved year 2013 from a tie for the 7th place to a tie for the 6th place in the GISS ranking of warmest years, demonstrating how non-robust these rankings are.

January 21, 2014: The GISTEMP maps webpage now defaults to using the Robinson map projection. The previous default “regular” projection is labeled as Equirectangular.

August 14, 2013: The July 2013 report from Jaskul (46.2N, 45.4E) is inconsistent with its June 2013 report unlike the reports from neighboring stations. In that region, the July mean has been consistently higher than the June mean and not 4.3°C colder as the current report would indicate. Hence that report was not used in our analysis.

May 24, 2013: The time series and seasonal cycle website plotting tools were restored, which completes the return of the interactive features disabled in January. A problem with porting graphics software between servers led to a longer delay than expected.

May 15, 2013: The 3/2013 report from Dushanbe was corrected and the 3/2013 report from Kuwait was deleted in GHCN v3, so that these two GISS deletions were dropped.

April 15, 2013: Two March 2013 reports, one from Kuwait International Airport and one from Dushanbe (38.5N, 68.8E), did not agree with neighboring reports or with Weather Underground data. Hence they were not used in our analysis. The faulty February 2013 report from Nema was replaced by a corrected report in GHCN v3.

April 1, 2013: A comparison of our global analysis using NOAA ERSST (our current approach) for ocean temperature as opposed to NOAA OISST concatenated with HadSST1 is available on Dr. Sato’s webpage.

March 21, 2013: This update was delayed by an investigation of some unrealistic looking reports from various stations in Mongolia. NCDC eliminated the reports today. In addition, the February 2013 report from Nema also seems unrealistic and has been eliminated. Finally, from now on we will incorporate into our analysis the reconstructed Byrd station data provided by Prof. David Bromwich.

February 24, 2013: The GISTEMP maps and station data website plotting tools were restored.

January 16, 2013: Starting with the January 2013 update, NCDC’s ERSST v3b data will be used to estimate the surface air temperature anomalies over the ocean instead of a combination of Reynold’s OISST (1982 to present) and data obtained from the Hadley Center (1880-1981).

January 14, 2013: Due to technical problems with the webserver onto which the GISTEMP webpages were recently migrated, interactive plotting tools such as making maps of the surface temperature anomaly and line plots of station data were disabled as the site was migrated onto newer hardware.

November 19, 2012: The machine which hosted the GISTEMP web pages will be decommissioned shortly, and all files and utilities have been moved to a new server. As the new machine uses a different architecture and OS, many utilities required some adjustment. Please send email to reto.a.ruedy@nasa.gov if you notice any problems.

September 26, 2012: NOAA/NCDC replaced GHCN v3.1 by GHCN v3.2. Hence the GISS analysis is based on that product starting 9/14/2012. Version v3.2 differs from v3.1 by minor changes in the homogenization of the unadjusted data. A description of the modifications in the adjustment scheme and their effects are available here.

February 17, 2012: The analysis was redone on Feb 17 after learning from NOAA/NCDC that the operational version of GHCN v3 was only made available that afternoon.

February 12, 2012: The reported December 2011 data for the stations LIEPAJA, ALEKSANDROVSK, and ST.PETERSBURG were replaced by corrected reports and the strange Dec 1991 report from MALAKAL is no longer part of the adjusted GHCN v3. The corresponding entries in the GISS list of suspicious data were removed.

January 18, 2012: The reported December 2011 data for the stations LIEPAJA, ALEKSANDROVSK, and ST.PETERSBURG were clearly incorrect and were discarded. Also, a likely artificial discontinuity for the station record of SHIQUANHE was eliminated by disregarding the data for 2005-present.

December 14, 2011: GHCN v2 and USHCN data were replaced by the adjusted GHCN v3 data. This simplified the combination procedure since some steps became redundant (combining different station records for the same location, adjusting for the station move in the St. Helena record, etc). See related figures.

Steven Mosher
November 6, 2020 4:06 am

Andy

it is important to distinguish SOURCE data from data collections

SOURCE data is data from which collections are made.

USHCN is not source data. It is a collection made from various SOURCES.

source datasets are things like RAWS, SNOTEL, ASOS, SOD, GCOS, ISH, USCRN, USRCRN, GHCND,
COOP, EC, NIFC, and SNM
there are many many source datasets
these can be minute by minute, hour by hour, synoptic, or once daily

USHCN is a collection of a TINY NUMBER of stations that is constructed from these underlying sources.
Then adjustments are made to USHCN: TOB and PHA.

TOB is SPECIFIC to TWO datasets

USHCN and nCLIMDIV

The TOB adjustment is used to set all observations to a consistent recording time of midnight.

GHCN, the Global Historical Climatology Network, has about 16 different sources
it does not use TOB. it cannot. it uses PHA exclusively

For global climate studies the suppliers of data products (take GISS as an example)
don't use USHCN; they use GHCN-M (version 4)

later I will upload a bunch of TOB papers for you.

Bottom line: if you read Heller, he doesn't understand

A) USHCN is not used by GISS
B) TOB is NOT the result of “double counting” TMAX

it’s way more complex than that.

PS, I don't have the 13-month problem; maybe you read the data in wrong. Let me double check

Steven Mosher
November 6, 2020 7:00 am

So Andy on TOB.

The 13th column is the yearly average, as best I can recall. You can double check this, but I think
they do a yearly average and then round up and stuff it in column 13. I can write a quick program to verify, but I recall this from many years ago.
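A quick check of that recollection is straightforward (a sketch, assuming the tob file has been read into a data frame 'tob' with numeric columns val1 through val13 and missing values as NA; the names are illustrative):

```r
# Compare the 13th value with the mean of the 12 monthly values, row by row.
# If val13 is the yearly average, the discrepancy should be ~0 up to rounding.
monthly     <- as.matrix(tob[, paste0("val", 1:12)])
annual      <- rowMeans(monthly)   # NA for incomplete years
discrepancy <- annual - tob$val13
summary(discrepancy)
```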

So on to TOB.

It is NOT due SOLELY to the double counting of TMAX as some people think. It is more complicated than that. ONE part of the bias comes from counting the previous month's days as the current month's days.

Some explanation is required here. The official way to measure temperature is from midnight to
midnight. Stations that do this are called FIRST ORDER stations because they are the best. All over the
world this is the standard. You measure midnight to midnight. That’s the day. So Oct 31 ends midnight
Oct 31. The USA does have FIRST ORDER stations. These are NWS stations, FAA, Airforce, and navy.
A lot of these are in DSI-3210 as a primary source. First order stations will have minute by minute data,
or sometimes hourly, sometimes synoptic, and sometimes min max taken at midnight.

In the USA however, we decided to also let volunteers and citizens collect temperatures. No one could expect them to get up at night and read the thermometer. So they were allowed to record the temp at whatever time they pleased. 7AM, noon, evening. In this scheme they would record from noon Oct 31 to noon Nov 1
and call that period a day. When in truth it spreads over 2 dates. So comes the problem. Is the data
used as Nov 1 or Oct 31? Well, for monthly averages it matters.

Now in months where there isn’t a huge monthly swing ( say summer in the midwest) the “end of month
effect” ( also called “drift” in the literature) will be small. In winter months it is bigger.

Next comes the “cold front” effect. In months where a cold front comes in before your measurement taking
a morning cold event will replace the previous evening's cold temp. When you check the USHCN TOB data you will see that adjustments are made to TMIN as well as TMAX. Finally, in SOME months in SOME regions you will also get a bias to TMAX.

The bias comes about NOT by double counting TMAX solely (like some think). The bias comes from end of month shifting, from cold fronts plowing through and from the timing of TMAX relative to the time you collect. Both Tmin and Tmax are impacted and the cause isn’t simply double counting.
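A toy simulation makes the mechanism concrete. This is an illustration only, not Karl (1986) and not NOAA code: it builds hourly temperatures from a fixed diurnal cycle plus random day-to-day weather, then averages the daily extremes over 24-hour windows ending at different observation hours. The afternoon observer comes out warmer than the midnight observer:

```r
# Toy time-of-observation bias simulation (illustration only).
set.seed(42)
n_days  <- 365
hours   <- 0:(n_days * 24 - 1)
diurnal <- 5 * cos(2 * pi * (hours %% 24 - 15) / 24)  # warmest ~15:00, coldest ~03:00
weather <- rep(rnorm(n_days, sd = 4), each = 24)      # day-to-day swings (cold fronts, warm spells)
temp    <- 10 + diurnal + weather

mean_of_extremes <- function(obs_hour) {
  day_id <- floor((hours - obs_hour) / 24)            # 24-h "days" ending at obs_hour
  tmax   <- tapply(temp, day_id, max)
  tmin   <- tapply(temp, day_id, min)
  mean((tmax + tmin) / 2)
}

mean_of_extremes(17) - mean_of_extremes(24)  # positive: the 5 PM observer runs warm vs. midnight
```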

Also, if you want to see the effect, understand that it changes based on the month and the region

IF you want to see little effect from TOB, then look at the summer because the Time of Observation bias is small in the summer. If the temperature gradient throughout the month is low
and if there are no cold fronts and if the hottest part of the day occurs later then you’ll have small bias.
If you want to see bigger effects then look in the other seasons, like spring and winter.

The other thing you want to do to find little bias is check out particular states. Certain parts of the country have small TOB. Karl 86 details this in some hard to read charts. Other areas will have larger TOB. It is all geography dependent.

So if you want to pick cherries (like some) you can pick different seasons and different regions and show anything you want to: no change from TOB, negative changes, and positive changes. you can cause
a lot of confusion if that’s your thing.

Now on to the history. Short version: yes, TOB is real. It mattered to INDUSTRY before it mattered to
climate scientists. It's a systematic bias that can be removed. It is a USA problem. Global datasets
do not EXPLICITLY correct for it.

History

#1 Ellis 1890
https://drive.google.com/file/d/1iAY-6y7bFVDja6lAWhoUQMa8e4-0xqaU/view?usp=sharing
Ellis details the BIAS that comes from changing the time at which a min max reading is taken

“Examining the various columns of Table I. it is seen how persistent is the tendency to difference in one direction. Especially is this so as regards the means of the minimum readings, which not only differ more than do those of the maximum readings, but differ also by amounts that vary considerably with the time of year, being apparently greatest in spring and autumn, less in summer, and least, being indeed reversed in direction, in winter. The difference between the means of the minimum readings in the month of September is especially remarkable, particularly in the year 1886.”

$$$$$$$$$$$

Next up is Mitchell circa 1958

https://drive.google.com/file/d/1IMAVEdgjAcCVoYj83woXYlAxFB8M7FhU/view?usp=sharing

“In the United States, mean temperatures are customarily derived from half the sum of the daily maximum and minimum temperatures occurring in the 24-hr period ending at observation time. In the case of first-order Weather Bureau stations, this observation time nearly coincides with midnight. In the case of the thousands of cooperative stations throughout the country, however, each voluntary observer is granted wide latitude in selecting an observation time compatible with his personal routine and with his own use of the climatological data. Most observers take their observations either in the early morning (near 0800), in the late afternoon (near 1700), or at the seasonally varying hour of sunset. Nearly all of them share an understandable reluctance to read their extreme thermometers at midnight.

The typical long-record cooperative station has been manned during its history by a succession of observers, sometimes succeeding generations of a single family. Many changes of observer have coincided with changes in station location, but others evidently have not [9]. In the latter instances, as in the former, it is reasonable to expect a discontinuity of record, inasmuch as no two observers are likely to have identical observing schedules. There is also the possibility, of course, that an observer finds it necessary to vary his observation time at regular or irregular intervals, however slightly, and not be disposed to note the fact on his observation forms. The essential question to be answered is: what is the typical range of error introduced into climatological temperature records by these events?”

$$$$$$$$$$$$$$$$$

next up is Schall agronomy department at Purdue, circa 1970s

https://drive.google.com/file/d/1dlhxW2AelcBQRC3IaDHyuQkHH5ygBadA/view?usp=sharing

I include Schall because you can see the impact this bias has on agriculture and on predicting fossil fuel
use for industry.

Next up is Baker, soil science

https://drive.google.com/file/d/1csFut2iL9E9rZLZOuyzugWlNLSbbs1tO/view?usp=sharing

If you want to understand TOB you have to read baker.

next up is Head. Phd thesis so it’s clear and easy to understand and has a good bibliography

https://drive.google.com/file/d/15aB5Qy4U3bMWFve3V5DE-HjvrF2BHFby/view?usp=sharing

Since TOB is a SYSTEMATIC BIAS, the bias is amenable to correction. Head's PhD is one of the first
attempts to estimate this systematic bias for a large USA area.

next is Karl 86

https://drive.google.com/file/d/1bPxk6OXT-AYTMxH9ykChKe_2M9Yb0tkw/view?usp=sharing

A few points about karl

1. It forms the basis of the code used to correct for the systematic BIAS that comes when you CHANGE
your time of observation.
2. The Correction is USA specific. It only works with USA data. This is why it isn't used on global
data. Also, countries outside the USA generally don't have this problem. The USA does because
we mixed FIRST ORDER stations (midnight to midnight) with CIVILIAN VOLUNTEER (COOP) data.

3. TOB adjustments are POSITIVE and NEGATIVE
4. TOB adjustment are different MONTH TO MONTH
5. TOB adjustments are different for different parts of the USA.

lastly is Vose

https://drive.google.com/file/d/1PCzXXAhi8z9PvlDkbN6XHvI0J-1ydjmZ/view?usp=sharing

Vose is a verification of the adjustment code.

now, everyone has a cow about TOB. but

1. the bias is real, documented in the literature back to 1890
2. the skeptic John Daly CONFIRMED the existence of this bias.
3. The Bias is systematic but geographically and seasonally complex
4. The bias can be positive, negative, big, small, and seasonally and geographically dependent.
5. IF you never CHANGE your time of observation there will be NO TREND ERROR
6. In the USA volunteers changed their time of observation. This introduced TREND errors
7. the trend errors happen in Tmin, Tmax, and Tavg.
8. Volunteers often changed their time of observation multiple times!
9. the bias does not come SOLELY from "double counting" of TMAX; TMIN can also be
double counted, and there is an end-of-month drift problem.
10. In the USA summer months are least impacted.
11. TOB is used exclusively in USHCN and nCLIMDIV
12.USHCN and nCLIMDIV run TOB first, and then Pha
13. USHCN is not an official data collection anymore; nCLIMDIV is the official record of the USA
14. USHCN uses 1218 stations derived from 11 different original sources.
15. nCLIMDIV uses 10000 stations derived from even more sources than USHCN
16. Most global climate records use GHCN M (monthly) as their source
17. GHCN M does not use TOB adjustments. it uses PHA.

the simple fact is PHA can also “find” and correct any TOB so the TOB program is redundant.

PHA is here
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/software/52i/

it has been open and available for test forever.