The first press release announcement thread is getting big and unwieldy, and some commenters can’t finish loading the thread, so I’m providing this one with some updates.
1. Thanks to everyone who has provided widespread review of our draft paper. There have been hundreds of suggestions and corrections, and for that I am very grateful. That’s exactly what we hoped for, and it can only make the paper better.
Edits are being made based on many of those suggestions. I’ll post up a revised draft in the next day.
2. Some valid criticisms have been made related to the issue of the TOBS data. This is a preliminary set of data, with corrections added for the “Time of Observation” which can in some cases result in double max-min readings being counted if not corrected for. It makes up a significant portion of adjustments prior to homogenization adjustments as seen below in this older USHCN1 graphic. TOBS is the black dotted line.
TOBS is a controversial adjustment. Proponents of the TOBS adjustment (created by NCDC director Tom Karl) say it is a necessary adjustment that fixes a known problem; others suggest it is an overkill adjustment that solves small problems but creates an even larger one. For example, from a recent post on Lucia’s by Zeke Hausfather, you can see how much adjustment goes into the final product.
The question is: are these valid adjustments? Zeke seems to think so, but others do not. Personally I think TOBS is a sledgehammer used to pound in a tack. This looks like a good time to settle the question once and for all.
Steve McIntyre is working through the TOBS entanglement with the station siting issue, saying “There is a confounding interaction with TOBS that needs to be allowed for…”, which is what Judith Curry might describe as a “wicked problem”. Steve has an older post on it here which can be a primer for learning about it.
The TOBS issue is one that may or may not make a difference in the final outcome of the Watts et al 2012 draft paper and its conclusions, but we asked for input, and this was one of the issues that stood out as a valid concern. We have to work through it to find out for sure. Dr. John Christy dealt with TOBS issues in his paper covered on WUWT: Christy on irrigation and regional temperature effects
Irrigation most likely to blame for Central California warming
A two-year study of San Joaquin Valley nights found that summer nighttime low temperatures in six counties of California’s Central Valley climbed about 5.5 degrees Fahrenheit (approximately 3.0 C) between 1910 and 2003. The study’s results will be published in the “Journal of Climate.”
Most interestingly, John Christy tells me that he had quite a time with having to “de-bias” data for his study, requiring looking at original observer reports and hand keying in data.
We have some other ideas. And of course new ideas on the TOBS issue are welcome too.
In other news, Dr. John Christy will be presenting at the Senate EPW hearing tomorrow, for which we hope to provide a live feed. Word is that Dr. Richard Muller will not be presenting.
Again, my thanks to everyone for all the ideas, help, and support!
=============================================================
UPDATE: elevated from a comment I made on the thread – Anthony
Why I don’t think much of TOBS adjustments
Nick Stokes’s explanation follows the official explanation, but from my travels to COOP stations, I met a lot of volunteers who mentioned that with the advent of the MMTS, which has a memory, they tended not to worry much about the reading time, since being at the station at a specific time every day was often inconvenient. With the advent of the successor display to the MMTS unit, the LCD-display-based Nimbus, which has memory for up to 35 days (see spec sheet here http://www.srh.noaa.gov/srh/dad/coop/nimbus-spec.pdf), they stopped worrying about daily readings and simply filled them in at the end of the month by stepping through the display.
From the manual http://www.srh.noaa.gov/srh/dad/coop/nimbusmanual.pdf
Daily maximum and minimum temperatures:
· Memory switch and [Max/Min Recall] button give daily highs and lows and their times
The Nimbus thermometer remembers the highs and lows for the last 35 days and also records the times they occurred. This information is retrieved sequentially day by day. The reading of the 35 daily max/min values and the times of occurrence (as opposed to the “global” max/min) are initiated by moving the [Memory] switch to the left [On].
So, people being people, rather than being tied to the device, they tend to do it at their leisure if given the opportunity. One fellow (who had a Winnebago parked in his driveway), when I asked if he traveled much, told me he “travels a lot more now.” He had both the CRS and MMTS/Nimbus in his back yard, and said he traveled more now thanks to the memory on the Nimbus unit. I asked what he did before that, when all he had was the CRS, and he said, “I’d get the temperatures out of the newspaper for each day.”
Granted, not all COOP volunteers were like this, and some were pretty tight-lipped. Many were dedicated to the job. But human nature being what it is, what would you rather do: stay at home and wait for temperature readings, or take the car/Winnebago and visit the grand-kids? Who needs the MMTS ball and chain now that it has a memory?
I also noticed many observers now have consumer-grade weather stations with indoor readouts. A few of them put the weather station sensors on the CRS or very near it. Why go out in the rain/cold/snow to read the mercury thermometer when the memory of the weather station can do it for you?
My point is that actual times of observation may very well be all over the map. There’s no incentive for the COOP observer to do it at exactly the same time every day when they can just as easily do it however they want. They aren’t paid, and often don’t get any support from the local NWS office for months or years at a time. One woman begged me to talk to the local NWS office to see about getting a new thermometer mount for her max/min thermometer, since it wouldn’t lock into position properly and often would screw up the daily readings when it spun loose and reset the little iron pegs in the capillary tube.
Some local NWS personnel I talked to called the MMTS the “Mickey Mouse Temperature System,” obviously a term of derision. Wonder why?
So my point in all this is that NWS/NOAA/NCDC is getting exactly what they paid for. And my view of the network is that it is filled with such randomness.
Nick Stokes and people like him, who preach to us from on high without ever leaving their government offices to actually get out and talk to the people doing the measurements, seem to think the algorithms devised and implemented from behind a desk overcome human urges to sleep in, visit the grand-kids, go out to dinner and get the reading later, or take a trip.
Reality is far different. I didn’t record these things on my survey forms when I did many of the surveys in 2007/2008/2009 because I didn’t want to embarrass observers. We already had NOAA going behind me and closing obscenely sited stations that appeared on WUWT, and the NCDC had already shut down the MMS database once, citing “privacy concerns.” I ripped them a new one over that, pointing out that they themselves published pictures of observers standing in front of their stations at their homes, with their names attached. For example: http://www.nws.noaa.gov/om/coop/newsletters/07may-coop.pdf
So I think the USHCN network is a mess, and TOBS adjustments are a big hammer that misses the mark, given human behavior around filling out forms at times no one can predict. There’s no “enforcer” who will show up from NOAA/NWS if you fudge the form. None of these people at NCDC get out in the field; they prefer to create algorithms from behind the desk. My view is that you can’t model reality if you don’t experience it, and they have no hands-on experience and, in my view, no clue.
More to come…
![USHCN-adjustments[1]](http://wattsupwiththat.files.wordpress.com/2012/06/ushcn-adjustments1.png?resize=640%2C465&quality=75)
Sure, but if the data (records of times temperatures were taken) exist and introduced biases can be removed mathematically following recognized, sound procedures, why not do so?
Christoph Dollis: “Now that I understand it better, I don’t agree — not at all:”
Fair enough. But in the quoted example you gave that would be true of every cold day that followed every warm day. And every warm day that followed a cold day. And so on. No matter what single specific time we choose for the reset of the Tmin/Tmax we are always taking a single hard mark over the previous 24 hour frame we’re dragging along. And then erasing the entire frame of data. We cannot do otherwise as we only have a single recording device.
This is similar to, but not synonymous with, Nyquist frequencies. If we wish to avoid any issues arising from a specific time at which to reset the device, then we need a pair of devices, each recording the extrema over a 24-hour period but with their resets staggered by 12 hours. Any and every other solution is going to fall prey to the specific case you quoted. (Though this only allows us some continuity over daily periods; it does not allow us to speak intraday, for which we’d need 4 devices with resets staggered by 6 hours. Which only allows us to … ad nauseam.)
For all else regarding human issues, which are certainly part of the instrumentation method, see Mr. Watts’ reply.
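The hard-mark-and-erase behaviour this comment describes is easy to see in a toy simulation. A minimal sketch (the function name and the synthetic two-day series are invented for illustration, not taken from any real station):

```python
import math

def recorded_maxima(temps, reset_hour):
    """Walk an hourly temperature series through a max register that an
    observer reads and resets once a day at reset_hour (0-23). The
    reading seeds the next window, as a real max/min device does."""
    readings, current = [], -float("inf")
    for hour, t in enumerate(temps):
        current = max(current, t)
        if hour % 24 == reset_hour and hour > 0:
            readings.append(current)
            current = t  # register reset: today's reading seeds tomorrow
    return readings

# A warm day (peak 30) followed by a cool day (peak 20), smooth diurnal cycle.
temps = [(30 if h < 24 else 20) - 10 + 10 * math.sin(math.pi * (h % 24) / 24)
         for h in range(49)]

midnight = recorded_maxima(temps, 0)   # [30.0, 20.0] -- both days correct
evening = recorded_maxima(temps, 17)   # second "day" inherits ~27.9 from day 1
```

With a midnight reset both maxima are recovered; with a 5 pm reset the warm late afternoon of day 1 is carried into day 2’s window, exactly the single-register carryover described above.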
As with others, I find the TOBS issue somewhat perplexing and, to be honest, I do think it goes looking for and creates unnecessary complications.
My impression is that TOBS is the result of having the modern continuous temp instruments and the realisation that the (Tmax-Tmin)/2 values do not actually represent anything like an average ‘temperature’ of any given day? Even manually recorded TOBS temps (the actual temp at time of obs) are essentially useless except for comparison to the Tmax and Tmin recorded at the same time, to note whether the TOBS was ‘between’ Tmax and Tmin? I can’t imagine there are many stations where the manual TOBS temp coincided with either Tmin or Tmax on a regular basis?
Hence, there seems little point in trying to adjust old manual data – it is like putting lipstick on a pig, and it is much better to separate the old and new datasets and use them independently.
The new datalogging type temp records can always be made to look like the old records (by a simple algorithm to ‘extract’ standard Tmax and Tmin?) if we need to make a general comparison to older data – but old data cannot be made up to modern standard without introducing some bias via various assumptions, etc, etc.
My general impression is that the older RAW data should be able to show obvious step changes from TOBS changes (if they are there), when plotted – kind of in a similar way to station relocations? But overall, I would expect them to be very very small effects? – I dunno, perhaps changing TOBS could cause a local dip or rise in the ‘average’ for a few days – but after that it would settle down again? Does anyone seriously think this would have a massive effect on long term trends? Has anyone ever done such an analysis? I presume it would require the manual records to be available, with details of the TOBS changes too? I would certainly prefer this kind of approach rather than some estimated/assumed guesswork on the actual effect of TOBS being ‘added’ en masse to millions of data points!
Kev
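The kind of first-look step-change scan Kev asks about can be sketched in a few lines. This is only a naive difference-of-means scan (the function name and all details are mine), not any published changepoint method; a real analysis would test significance and compare against neighbouring stations:

```python
def step_change(series, min_seg=3):
    """Naive changepoint scan: return the split index and the largest
    absolute difference between the mean before and the mean after.
    Only a first look, not a statistical test."""
    best_idx, best_diff = None, 0.0
    for i in range(min_seg, len(series) - min_seg + 1):
        left, right = series[:i], series[i:]
        diff = abs(sum(right) / len(right) - sum(left) / len(left))
        if diff > best_diff:
            best_idx, best_diff = i, diff
    return best_idx, best_diff

# Five flat "annual means", then a step of about +1:
annual = [10.1, 9.9, 10.0, 10.2, 9.8, 11.1, 10.9, 11.0, 11.2, 10.8]
print(step_change(annual))  # splits at index 5 with a jump of about 1.0
```

Run against annual means around a documented TObs change date: a step that coincides with the metadata would support the adjustment, while the absence of one would argue the other way.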
Anthony,
“TOBS adjustments are a big hammer that misses the mark based on human behavior for filling out forms and times they can’t predict.”
There’s nothing that can be done to improve data where observers were writing incorrect information on the forms. But if you believe what they wrote on the forms is worth taking account of, then it is there on the record, and can be analyzed by people sitting in offices as well as anyone.
The 2009 BAMS paper of Menne et al has a Fig 3 which shows the trend of observation times that observers actually reported. And Fig 4 shows the resulting effect of a TOBS adjustment on trends, based on the Fig 3 data and the known diurnal and day-to-day variability.
REPLY: Noooo… Fig. 3, “Changes in the documented time of observation in the U.S. HCN,” is about the times they assigned the observers. There’s no proof the observers adhered to it. – Anthony
While MMTS TOBs is more complicated, there are also a lot of CRS stations that experienced TOBs changes in the 1979-2010 period, and that definitely introduces bias. It should also be reasonably easy to test the MMTS data (coupled with the TOBs recorded in that station metadata) and see if there are any detectable correlations between TOBs changes and temp changes. I might play around with the data once I get bandwidth; anyone know if the TOBs metadata is available in an easily digestible form?
REPLY: What Zeke illustrates is what I see as the biased nature of the whole network. There are so many problems that all sorts of issues have to be invented to deal with them, and all the corrections they invent go UP. And like NCDC, Zeke’s trying to find the answers in the data rather than looking at the reality of the measurement environment. Nobody wants to deal with that side of it. Color me unimpressed with the whole adjustment crusade. – Anthony
Anthony’s update certainly hits the mark with my experience of human behaviour. You get exactly what you pay for and ‘police’. And if that is ‘nothing’ and ‘not at all’ respectively then some folks will react accordingly. You will get poor quality work. We all do it to a greater or lesser extent. To pretend otherwise is naive and simplistic.
The more I read and understand of how this basic ‘climate’ data has been collected and processed and massaged and adjusted and interpreted, the more certain I am that it is just not fit for the purposes to which it is put.
And when your base data is so fundamentally flawed, any theories or predictions that rely on it are pretty much useless also. Which seems to cover a huge swathe of climatology.
John Trigge (in Oz) says:
July 31, 2012 at 8:52 pm
Anthony,
I seem to recall but cannot find it now, that your original interest was to investigate the effect of different paint surfaces on Stevenson Screens. From that study you determined that the surface coating did have some effect on the temperature readings.
Where in the ‘adjustments’ is this taken into account and, if not, why not?
================================================================
Seems to be an element of the smart Alec in your post….. But you do hit on a valid point, the point being of course, that you can’t do “adjustments” for all the confounding elements intrinsic to the surface temp data…. The real fact is, these temperature measurements were never meant to be used at the resolution and exactitude to which they are now being subjected. I think this is Anthony’s opinion also… That’s why he said that even the condition of the Stevenson screen paint will have an effect on the temperature recorded inside, which then morphed into the realization that the neglect, condition and siting at these sites were even worse than could ever have been imagined.
So basically, the surface record should be scrapped and the Satellite record adopted. The surface observation record is not really scientific data beyond the purpose that it was originally intended, which was simply Meteorological observations for the purpose of weather forecasting and general information…. not data collected for determining anthropogenic CO2’s effects on temperature and global climate down to a resolution of a 1000th of a degree F.
…. The “adjustments” end up creating bias and compounding errors…. which is the subject of this paper that Anthony is doing.
I hope Anthony agrees with my summation of his views and motives… Pretentious poppet that I be… 😉
John Trigge (in Oz):
” … effect of different paint surfaces on Stevenson Screens … ”
Sure does. I have an RH/Temp logger set up in a semi-compliant Stevenson Screen. Siting probably not as bad as a “5” under Leroy 2010, but maybe not far off 🙁
Here at 19°S NQ late July I was getting ‘spikes’ from the low angle morning sun, one of which was higher than the maximum at 3pm. Additional (3rd) coat of paint made a difference. I might try cenospheric paint if I can find a sample pot, or the spheres themselves – preferably a metric handful rather than a 500kg bag 🙂
johanna says:
” … I’m guessing that one of the reasons for [StevM’s] frustration is that the significance of TOBS varies a great deal by location. ”
At least there is now scope for further study of TOBS, Leroy Classification etc. Data loggers starting around $41 with an accuracy of +/- 0.5 C, 12 hour sampling. My rig set up for less than $200: original purpose was to learn about data logging for assessing thermal performance of building materials.
I have a horrid feeling that if my comment is read, I shall reveal to those readers my lack of knowledge. With regard to TOBS: with the max/min thermometers I’ve used, the maximum and minimum temperatures are clearly identifiable. I can see that if the thermometer isn’t read until very late the next day, then “today’s” temperatures may well obscure “yesterday’s.” Possibly advances in technology have overcome that, but for earlier readings, human nature being what it is, how can the TOB be accurately known? On another point, which I suspect has been done to death, knowing the SD or SEM of the decadal trends would seem very necessary to see whether differences among the different groups of surface stations are or are not statistically significant.
From Nick Stokes on July 31, 2012 at 11:36 pm:
“Not Found”
File can be found here:
ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/
And why are you messing with Goddard’s old “hacked” site anyway, that Steve said had been taken over by someone he thought he could trust but was proven wrong?
Gee Nick, you’re not verifying your links. On the predecessor thread you couldn’t calculate temperature trends correctly. Is something bothering you to distraction, buddy?
Nick Stokes says:
July 31, 2012 at 11:36 pm
There’s nothing that can be done to improve data where observers were writing incorrect information on the forms.
—————————————
I would have 2 suggestions to deal with that:
1. Compare COOP stations with other stations with “really” known observation time.
2. Delete ALL entries where either tmin or tmax is duplicated on consecutive days, and fill the gaps with the gap-filling algorithms that are already available.
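Suggestion 2 might look something like this minimal sketch (the function name is mine, and a real screen would need a tolerance rather than exact equality, since genuine exact repeats do happen in nature):

```python
def flag_carryover(values):
    """Return a copy with a value replaced by None when it exactly
    equals the previous day's value -- the signature of a possibly
    double-counted max/min from a late reading."""
    cleaned = list(values)
    for i in range(1, len(values)):
        if values[i] is not None and values[i] == values[i - 1]:
            cleaned[i] = None  # leave the gap for a gap-filling algorithm
    return cleaned

print(flag_carryover([30.0, 30.0, 20.0, 20.0, 21.0]))
# [30.0, None, 20.0, None, 21.0]
```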
Nick Stokes
Why wouldn’t the TOB adjustment simply be a single, identifiable, “step” change, easily discerned in the data?
And as Anthony notes, why would there be ANY need for a TOB adjustment for the Max-Min recording stations that have been in use for a long time now?
If I recall, MMTS systems are well over 50% of the stations?
Konrad: “To make a valid TOB adjustment you would need to know whether an individual station was making evening or morning readings of a mercury thermometer, and if and when the reading time for that station changed …”
Excellent point.
Given that Anthony is only referencing the period from 1978-2008, then something like 50% of the stations would have ALREADY changed TObs (Time of Observation) from evening to morning. So these cannot be an issue. See DeGaetano 2000
http://journals.ametsoc.org/doi/pdf/10.1175/1520-0477(2000)081%3C0049%3AASCSOT%3E2.3.CO%3B2
Note: the above paper links to a dataset for all the HCN stations that identifies the TObs changepoint. So if push comes to shove, it should not be a great amount of work to use this, or even to go through each site’s COOP B-91 forms (as they have the TObs on them), and confirm EXACTLY when and IF a change in TObs occurred at each of Anthony’s sites, and if necessary make an appropriate adjustment to the site record.
As to the size of the TOB likely overall adjustment… Menne et al 2009 say: “The net effect of the TOB adjustments is to increase the overall trend in maximum temperatures by about 0.015°C decade−1 (±0.002) and in minimum temperatures by about 0.022°C decade−1 (±0.002) during the period 1895–2007.” And in a pre-print here:
https://ams.confex.com/ams/pdfpapers/141108.pdf
In that pre-print they say the increase is about 0.012°C dec-1 in maximum temperatures and about 0.018°C dec-1 in minimum temperatures over the period 1985-2006.
This is a tenth of the trend Anthony has identified [i.e. “The Raw Tmean trend for well sited stations is 0.14°C per decade lower than adjusted Tmean trend for poorly sited stations”] – and given that the TOB issue affects at most some 30-40% of the stations used – the impact should be even less!
Time of Observation Bias may not really be a major issue and in no way invalidates Anthony’s paper’s main thrust that: good stations are adjusted by bad.
Anthony … can I suggest making the new paper a small linked “badge” and place it at top of right hand column for easy reference?
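A back-of-envelope check of the arithmetic in the comment above, using the quoted Menne et al. pre-print figures (the 35% affected fraction is just the midpoint of the comment’s 30-40% estimate, not a measured number):

```python
tobs_effect_tmean = (0.012 + 0.018) / 2  # °C/decade, mean of the quoted Tmax/Tmin effects
identified_gap = 0.14                    # °C/decade, the trend difference quoted from the draft
affected_fraction = 0.35                 # midpoint of the 30-40% estimate above

print(tobs_effect_tmean / identified_gap)                      # roughly 0.11, i.e. "a tenth"
print(tobs_effect_tmean * affected_fraction / identified_gap)  # roughly 0.04
```

So on these numbers the quoted TOBS effect is indeed about a tenth of the identified gap, and roughly 4% of it once scaled by the affected fraction, consistent with the comment’s “the impact should be even less.”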
So if I understand correctly, the TOBS issue means that if I take a high reading at, say, 3 PM each day, and tomorrow the high doesn’t reach the temp recorded at 3:01 PM today, then tomorrow’s high reading will actually be today’s 3:01 PM temp, so there would tend to be a warm bias in the data for daily highs. Similarly and oppositely, if I took low readings at 3 AM and the next night was warmer, I’d get an artificially low reading for tomorrow.
So I’d suggest that if most readings were taken during the day there would be a tendency toward higher-than-real Tmax on average, and if most readings were taken during the night a tendency toward lower-than-real Tmin.
But that wouldn’t affect the trend of Tmax readings or the trend of Tmin readings as the bias is built in all along the time series unless practice changes at some point.
And wouldn’t most readings be done at start of day or end of day so lower risk of an issue.
So I need to understand why TOBS is considered a significant issue for the trend results of Anthony’s study.
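The level-versus-trend point in this comment can be checked with a toy simulation; the diurnal sine, the day-to-day weather noise, and the trend size here are all invented for illustration:

```python
import math
import random

def observed_daily_max(days, reset_hour, trend_per_day=0.002, seed=0):
    """Synthetic hourly temps (diurnal sine + day-to-day weather noise
    + linear trend) run through a max register that is read and reset
    once a day at reset_hour, carryover included."""
    rng = random.Random(seed)
    base = [15 + rng.gauss(0, 5) for _ in range(days)]
    temps = [base[h // 24] + 10 * math.sin(math.pi * (h % 24) / 24)
             + trend_per_day * h / 24 for h in range(days * 24)]
    maxima, current = [], -float("inf")
    for h, t in enumerate(temps):
        current = max(current, t)
        if h % 24 == reset_hour and h > 0:
            maxima.append(current)
            current = t
    return maxima[1:]  # drop the first, partial window

def slope(series):
    """Ordinary least-squares slope per step."""
    n = len(series)
    xbar, ybar = (n - 1) / 2, sum(series) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
    return num / sum((i - xbar) ** 2 for i in range(n))

afternoon = observed_daily_max(2000, 17)
midnight = observed_daily_max(2000, 0)

level_bias = sum(afternoon) / len(afternoon) - sum(midnight) / len(midnight)
trend_shift = slope(afternoon) - slope(midnight)
# level_bias comes out clearly positive (warm carryover inflates the
# afternoon-read maxima); trend_shift stays near zero, because the
# carryover statistics are the same all along the series.
```

The recorded level is biased warm by the 5 pm reading, but the fitted trends are nearly identical, as the comment argues: the bias matters for trends only if the observation practice changes partway through the record.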
Damn good point, Anthony.
But for those who would like to see the difference in reading-time monthly means, I took the hourly data from the airport near me via Wunderground and computed the midnight, 7am, and 5pm monthly averages. All AM readings were adjusted to the 7am criterion and all PM readings to the 5pm criterion; that’s what I remember from the Karl paper. They also make a slight adjustment for the end-of-month discrepancy.
March 2012
Midnight: Max = 47.3F(8.5C), Min = 29.0F(-1.7C), Avg = 38.2F(3.4C)
7am: Max = 46.8F(8.2C), Min = 27.3F(-2.6C), Avg = 37.1F(2.8C)
5pm: Max = 49.4F(9.7C), Min = 29.3F(-1.5C), Avg = 39.4F(4.1C)
Guess NOAA should do a survey on those issues you brought up, Anthony. They might be surprised at what is happening out there in the network.
The TOBS question is very simple. If your paper is discussing station quality bias, you could compare two different methods of combining TOBS adjusted data. That would answer the question which the paper superficially seems to address.
If you wish to address ALL of the adjustments, you need to include them all in your analysis. This is clearly a much more complex problem, so performing the first (quality-based) analysis is the obvious first step. I guess this will be as inconclusive as Fall et al. 2011.
In changing 2 variables at once, you lose the ability to identify which variable is responsible for the result you observe. Post-TOBS has a higher trend than pre-TOBS; well, yes… and?
Having commented on this paper elsewhere already, I feel slightly cheated.
The USHCN Version 2 Serial Monthly Dataset page:
http://www.ncdc.noaa.gov/oa/climate/research/ushcn/
Simple question but it jumps out at me.
Does the stepwise graph above show that observers were a lot less diligent from 1970 onward than previously? If so, why?
I don’t believe they were but isn’t that an implication of the adjustments?
The TOBS adjustments are not there to correct any mathematical or observational failures; they are there primarily to allow yet another round of temperature nudging toward the great holy goal of Global Warming. The observations in the field, shown in the article, reveal just how human nature really works and how temperature recording will be done in practice. Such is life!
Anthony,
From reading this thread and from gleaning comments on the warmist blogosphere, I think TOB is the main issue that this paper’s critics will use to attack it, and has got to be comprehensively addressed.
Regardless of how and whether you finally use TOB adjustments I think the paper needs a lengthy section describing the issue, like your …
“Why I don’t think much of TOBS adjustments”
…obviously with a more scholarly tone.
However, anecdotal evidence doesn’t look good in a scientific paper, no matter how compelling.
Weather station technology has changed over time, and I think there is no-one more informed about the details and implications of that for the United States dataset than you.
You mention the “advent of MMTS, which has a memory.” Are these “advents” sudden enough to demonstrate a step change in recordings? Is it known when the equipment changed at each station? Failing that, even the dates of introduction of new MMTS devices and subsequent sales figures might be instructive.
I think a clear before and after (introduction of MMTS and similar) comparison of data quality (both at individual stations and across the network) would be useful.
”REPLY: Noooo… Fig. 3, ‘Changes in the documented time of observation in the U.S. HCN,’ is about the times they assigned the observers. There’s no proof the observers adhered to it. – Anthony”
Precisely! – but this demonstrates a couple of important points:
1) the (old) data we have has to be taken at FACE value only – massaging it for supposed/expected errors could and would likely only compound any real errors – especially if the supposed errors aren’t actually there at all, or are perhaps intermittent!
2) apart from clearly obvious and well-recorded ‘introduced’ errors, such as station relocations, instrument/thermometer changes, etc., actual data ‘alteration’ is rather silly IMO. In any data series, it would be better to always produce and maintain the data as separate sections (as required) of the series, such that a detailed graph of the alterations alongside (or on top of) a graph of the raw data could always be referred to, in a suitable scale so that the alterations can be clearly seen and visually judged to see whether they appear appropriate. And there’s the other rub: climate variation isn’t a smooth graph, so how can you assume that a sudden temp change did not actually occur in the historical record? You have to be very careful in making those assumptions, presumably aided by comparing with nearby stations, etc.
3) Maintaining the data as individual sections, rather than trying to meld them together as a complete series must be considered paramount – that way, all and any subsequent adjustments can be reviewed and assessed as potentially valid or not, especially if they are applied as a ‘blanket’ type adjustment to a melded series. Constantly updating and joining data and calling it ‘good’ – is not the way to work.
4) Homogenising data between stations must surely be seen as potentially highly flawed if the base data is itself flawed? Gridding data and averaging, etc – even more so!
As an old-school scientist and engineer, it seems to me that the original data is always to be preserved, even if just to hold it up as a pile of cr*p! Manipulation and reproduction ALWAYS introduce further errors – even if just from simple typos.
If one produces a piece of work saying I’ve adjusted this to get that – it should always present the before and after case and detailed reasoning. It strikes me, that in the historical climate data sense, this is not adhered to – and we have different versions of the data, and presumably adjustments upon adjustments, etc, etc. The question now is – what and where the flip is the ‘real’ data?
In my recent post on my blog, Anthony, I have conceded that there may be some validity to your criticisms (but so that you cannot accuse me of blatant self-promotion, I will not even attempt to include a link to it). However, the fact remains that the data you have examined relates to 2% of the Earth’s surface (within which you accept 50% of warming is real). Despite this, you seem to want the World to believe that you and your colleagues have uncovered the real story; and that the vast majority of climate scientists are insidious, incompetent, or simply imbeciles. But, be honest, how likely is that? Indeed, is it more or less likely that the WTC was brought down by a team of controlled demolition experts?
Also, if global warming stopped in 1998, perhaps you can also explain to me why May 2012 in the USA was… “the 327th consecutive month in which the temperature of the entire globe exceeded the 20th-century average, the odds of which occurring by simple chance were 3.7 x 10^-99, a number considerably larger than the number of stars in the universe.” — Bill McKibben (Rolling Stone magazine)
So what is the audit regime of the data collection?
How do the organisers check that the data is accurately recorded at the right times?
I’m asking about the quality system that lies behind the data collection. There must be a written system. Does anyone know if it’s any good?
Christoph Dollis says:
July 31, 2012 at 6:00 pm: “You are correct, the pre-posting of this has been a remarkable success ….”
I think he jumped the gun by a week myself.
Thereby accruing seven days of outside review for error-spotting in data, methodology, descriptions and graphs; correction/explanation of ambiguous sentences, spelling, and grammatical usage; and addressing areas which may warrant additional scrutiny.
I think that was a pretty good idea, myself.