"Gross" Data Errors in GHCN V2. for Australia

UPDATE 11/11/10: Errata have been posted; see the end of this essay. – Anthony

Port Hedland, WA BoM office

Guest post by Ed Thurstan of Sydney, Australia

Synopsis

This study shows that the NOAA-maintained GHCN V2 database contains errors in calculating a Mean temperature from a Maximum and a Minimum. 144 years of data from 36 Australian stations are affected.

Means are published when the underlying Maximums and/or Minimums have been rejected.

Analysis

The Australian Bureau of Meteorology (BOM) provides NOAA with “entirely raw instrumental data via the Global Telecommunications System”. In the process of comparing BOM Max and Min outputs with NOAA “Raw” inputs, some oddities were noticed.

A database of Australian data (Country 501) was set up for each of GHCN V2.Max, V2.Mean, V2.Min. Each record consists of WMO Station ID, Modifier, Dup, Year, then 12 months of data Jan-Dec.

“Modifier” and “Dup” are codes which allow the inclusion of multiple sets of data for the same station, or what appears to be the same station. This data is retained, rather than lost, in case it may be useful to someone. For this exercise, records with Modifier=0 and Dup=0 were selected.

Only those stations and years where all 12 months of data are present were selected. This results in about 14,000 station-years of monthly data being compared.

A compound key of Station ID concatenated with year was set up.

From Max and Min, an arithmetic mean was calculated to compare with V2.Mean.
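For concreteness, here is a minimal Python sketch of that comparison. Only the record layout comes from the description above (station ID, Modifier, Dup, year, then twelve monthly values in tenths of a degree, with -9999 meaning missing); the exact column positions and file names are illustrative assumptions, not NOAA's code.

```python
# Sketch only: column widths and file names are assumptions for illustration.
def read_v2(path):
    """Return {station+year: [12 monthly values]} for Modifier=0, Dup=0
    records where all 12 months are present (as selected in this study)."""
    records = {}
    with open(path) as f:
        for line in f:
            if line[8:11] != "000" or line[11] != "0":    # Modifier=0, Dup=0
                continue
            months = [int(line[16 + 5 * m: 21 + 5 * m]) for m in range(12)]
            if -9999 in months:                           # complete years only
                continue
            records[line[0:8] + line[12:16]] = months     # compound key: station + year
    return records

v2max = read_v2("v2.max.australia")
v2min = read_v2("v2.min.australia")
v2mean = read_v2("v2.mean.australia")

# Compare the reported mean with (Max + Min) / 2, in tenths of a degree.
for key in v2max.keys() & v2min.keys() & v2mean.keys():
    for m in range(12):
        calc = (v2max[key][m] + v2min[key][m]) / 2.0
        diff = v2mean[key][m] - calc
        if abs(diff) > 0.5:           # anything beyond a rounding difference
            print(key, m + 1, v2mean[key][m], calc, diff)
```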

Observation 1.

In calculating V2.Mean, NOAA always rounds halves up to the next tenth of a degree.

Calculating (Reported V2.Mean – Calculated Mean) mostly gives a result of zero or 0.5 tenths of a degree, as shown in this example:

This appears to be poor practice, when the usual approach to neutralising bias is to round halves to the nearest even (or odd) digit. However, the bias is small, as units are tenths of a degree.
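The two conventions are easy to contrast in code. This sketch is purely illustrative of the rounding behaviour described, not of NOAA's actual routine:

```python
import math

def round_half_up(tenths):
    """The behaviour observed in V2.Mean: halves always go up (316.5 -> 317)."""
    return math.floor(tenths + 0.5)

def round_half_even(tenths):
    """Banker's rounding, the usual bias-neutral convention (316.5 -> 316).
    Python 3's built-in round() rounds halves to the nearest even integer."""
    return round(tenths)

for value in (316.5, 315.5):
    print(value, round_half_up(value), round_half_even(value))
# 316.5 -> 317 vs 316;  315.5 -> 316 vs 316
```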

This observation led to the discovery of larger errors.

Observation 2.

The difference between reported V2.mean and the calculated mean can be substantial.

Here is a cluster of (Reported V2.Mean – Calculated Mean) differences:

For example, take Station 94312 (Note: Port Hedland, Western Australia – Photo added: AW).

Port Hedland, WA BoM instrument enclosure - Source: http://www.bom.gov.au/wa/port_hedland/
Port Hedland BoM station from the air

The entry for March 1996 shows that the reported GHCN V2.Mean figure is 1.15°C lower than the mean calculated from V2.Max and V2.Min.

There is no obvious pattern in these errors.

As a spot check, the raw data from GHCN V2 for station 94312 in 1996 is as follows:

The arithmetic mean for March should be (377+256)/2 = 316.5

But NOAA has calculated it as 305, an error of 11.5 tenths of a degree (the 1.15°C noted above).

WMO Station 50194312 is BOM Station 04032.

Here are the monthly averages calculated from BOM daily data:

With one exception, they are within 0.1°C of the NOAA figures. The exception is 0.2°C.

There are 144 years of data from 36 Australian stations affected.

GISS carries NOAA's version of V2.Mean, so GISS will be propagating the error.

Full Error List

The full error list of stations is available on request. It comprises 144 years of data from 36 Stations.

Observation 3.

Unless there is a severe problem in transmitting BOM data to NOAA, NOAA's quality control procedures appear to reject a lot of superficially good BOM data.

When this happens, NOAA replaces the suspect data with “-9999” and writes a QC.failed record.

GHCN V2.mean now contains many instances where a mean is reported, but the underlying V2.max and/or V2.min are flagged -9999. That is, they are not shown.
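A sketch of this check, reusing the parsing assumptions from the earlier sketch but keeping incomplete years so the -9999 flags survive:

```python
# Sketch only: same assumed column layout and illustrative file names as above.
def read_v2_keep_missing(path):
    records = {}
    with open(path) as f:
        for line in f:
            if line[8:11] != "000" or line[11] != "0":   # Modifier=0, Dup=0
                continue
            key = line[0:8] + line[12:16]                # station ID + year
            records[key] = [int(line[16 + 5 * m: 21 + 5 * m]) for m in range(12)]
    return records

v2max = read_v2_keep_missing("v2.max.australia")
v2min = read_v2_keep_missing("v2.min.australia")
v2mean = read_v2_keep_missing("v2.mean.australia")

# Find "orphan" means: a mean is reported, but max and/or min is flagged -9999.
for key, means in v2mean.items():
    for m, mean in enumerate(means):
        mx = v2max.get(key, [-9999] * 12)[m]
        mn = v2min.get(key, [-9999] * 12)[m]
        if mean != -9999 and -9999 in (mx, mn):
            print(key, m + 1, "mean =", mean, "but max/min missing")
```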

For example, station 50194312 (BOM 04032) shows:

Spot check: following is the matching raw data from GHCN V2:

Note that Means are published even though the corresponding Max and Min values are absent in Jan, Feb and April.

The corresponding BOM raw daily data for 1991, 1995 and 2005 was checked. It is complete, with the exception of three days of minimums in May 1991. Two of these days have missing results. The third is flagged with a QC doubt. Note that this BOM data comes from the present BOM database, and may not be what went to NOAA in earlier years.

Here is the BOM data corresponding to the NOAA product:

And here are the differences, BOM – GHCN

Here we can see substantial corrections to input data, especially in 2005.

V2.Max.failed was checked for data from this station; there is only one entry, for 1951. V2.Mean.failed refers to the same 1951 QC failure. V2.Min.failed also has a single entry, for October 2004.

Summary

There is a lot of published criticism of the quality of NOAA’s GHCN V2. I now add some more.

In my profession, errors of this sort would cause the whole dataset to be rejected. I am astonished that the much vaunted NOAA quality control procedures did not pick up such gross errors.

The error is compounded in the sense that it propagates via V2 into the GISS database and to other users of GHCN V2.

Appendix – Source Data

The GHCN V2 database, giving Annual and Monthly data, is available at ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2. The file create date of the set used in this study was October 15, 2010.

The Australian Bureau of Meteorology (BOM) supplies raw instrument data to NOAA electronically. This data is accessible on the interactive BOM site at:

http://www.bom.gov.au/climate/data/

This is daily max and min data, and should be the data supplied to NOAA.

Ed Thurstan

thurstan@bigpond.net.au

October 20, 2010

=================================================================

UPDATE VIA EMAIL:

Hi Anthony,
I made an error in comparing GHCN data against Aust. BOM data. A Graeme W spotted it, and I have just posted a correction in the comments. I have offered to email anyone a corrected report.
I chose 1991, 1995 and 2005 data to compare GHCN and BOM. 1991 and 1995 comparisons are correct. But I inadvertently compared 2005 GHCN data against 2007 BOM data. (2007 also exhibits the GHCN error at issue in the report.)

ERRATA

In the section where I compare BOM data against GHCN data to highlight corrections made to GHCN input data, I inadvertently compared 2005 GHCN to 2007 BOM data. The offending data for 2005 should read:

BOM MAX Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 39.4 38.2 38.4 38.2 33.3 26.4 27.9 28.6 31.4 33.6 37.2 36.3 

BOM MIN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 26.7 27.3 26.4 23.7 18.5 14.7 14 13.8 15.6 18 20.8 25

BOM MEAN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 33.05 32.75 32.4 30.95 25.9 20.55 20.95 21.2 23.5 25.8 29 30.65

DIFFERENCES

MAX Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 0.0 0.0 0.0 – 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

MIN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 0.0 0.0 0.0 – 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

MEAN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2005 1.2 0.9 0.9 1.0 0.8 0.7 0.8 0.7 0.7 0.7 0.7 0.8


The correction does not diminish my argument in any way. The same type of effect would be apparent if 2007 GHCN were compared against 2007 BOM data.

Apologies to all for the error.

Ed
96 Comments
amicus curiae
November 11, 2010 7:42 am

James Sexton says:
November 10, 2010 at 12:26 pm
Wouldn’t it be nice, if just once, someone in the field would engage in self audits to a point where they can find and admit and correct mistakes?
========
now why do something that sensible and honest??
after all, the TRUTH won't scare anyone into submission for an ETS Carbon Tax.
as an aussie I am purely disgusted with BoM, CSIRO and our govt and the ABC.
all in cahoots, and all an epic FAIL!

dixon
November 11, 2010 7:48 am

Thanks steven mosher for explaining about the average temp being an unbiased estimate of the mean. I’d been fretting about that when the daily cycle of hourly data from Perth, WA is so unsymmetrical. But I confess to being too lazy/inept to figure out how such a major flaw could have survived (I like to assume some degree of competence on both sides).

amicus curiae
November 11, 2010 7:57 am

Enneagram says:
November 10, 2010 at 2:39 pm
Evidently Post Modern Science is more lost than Adam on Mother’s Day 🙂
===
now that!! is funny!
the BoM…is NOT!

November 11, 2010 8:22 am

Chris Gillham says:
November 10, 2010 at 10:55 pm
NOAA is still putting 999.9 error codes into the GHCN database, which is then being used by GISS, even though BoM data is available for the relevant month.
########
One thing that would be helpful is to put the problem into perspective for people.
If you write a program you can count the number of times the BOM has data that NOAA does not pick up. Then you can do a summary and show rather easily, in one number, the impact this has on the total record for Australia.

November 11, 2010 1:28 pm

dixon says:
November 11, 2010 at 7:48 am
Thanks steven mosher for explaining about the average temp being an unbiased estimate of the mean. I’d been fretting about that when the daily cycle of hourly data from Perth, WA is so unsymmetrical. But I confess to being too lazy/inept to figure out how such a major flaw could have survived (I like to assume some degree of competence on both sides).
########
when I first started looking at temperatures (back in 2007) these are the things that immediately got my attention, and which I set to work on trying to understand by looking at the data and running tests for myself.
1. The accuracy issue. How can we measure something to 1/10th when the instrument is not good to 1/10th? That was easy: the Law of Large Numbers.
2. The rounding issue. For the US we do this: we take the min in F (round it), we take the max in F (round it), we average the two (round it), then take a monthly average and report out to many decimal places. By creating series of random numbers where I knew the GROUND TRUTH, I could simulate this process and show that rounding didn't bias the answer.
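A toy version of that simulation, with made-up temperature ranges rather than the original code:

```python
# Generate daily min/max pairs with known ground truth, push them through the
# round-then-average pipeline, and measure the resulting bias in monthly means.
import random

random.seed(42)
total_bias, months = 0.0, 20_000
for _ in range(months):
    true_sum, reported_sum = 0.0, 0.0
    for _ in range(30):                        # a 30-day month
        tmin = random.uniform(40.0, 60.0)      # degrees F, invented range
        tmax = tmin + random.uniform(5.0, 25.0)
        true_sum += (tmin + tmax) / 2
        # round min, round max, average, round again (Python's round() sends
        # halves to the nearest even integer, which keeps this step unbiased)
        reported_sum += round((round(tmin) + round(tmax)) / 2)
    total_bias += (reported_sum - true_sum) / 30
print(f"average monthly-mean bias from rounding: {total_bias / months:+.4f} F")
# prints a value very close to zero
```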
3. The min/max problem. Since I have worked in an industry built on Shannon and Nyquist, the idea that you could get the “average” by sampling the min and the max made no sense whatsoever. But after spending time with the real problem I understood what they were doing. From a historical perspective we have several different forms of temperature collection:
A. min/max for the day.
B. 4-6 measures per day.
C. hourly measures.
D. sub-hourly measures (today we have over 7 years of temperature every 5 minutes from 3 different sensors at the same location; see the CRN network).
IF the problem you want to solve is figuring out whether or not the temperature has gone up, you are stuck with the historical min/max approach. That's what they did in 1900. We have no time machine. So that is the metric you have. A physicist can look at this and say “if you integrate the function you will get a different answer than if you sample it twice.” Well, that's not the question. The questions are:
A. if you want to compare today to yesterday, should you use the same method?
B. does min/max bias high, bias low, or does it give a random error?
C. does the trend derived from min/max differ from the trend derived from integrating the function? And is that difference biased high or low?
Once you realize that, you can easily find the solution. You could prove it mathematically (I suppose), but I just did it experimentally, with real data and with simulated data. Again, you create a temperature series that has the statistical properties of the typical series (down to the second if you like), then apply a random distribution of trend components. You calculate ground truth. Then you test your sampling approach and only sample min/max. You then compute trends and see if the trend is biased or not. And in the end you find out that it just doesn't matter. So physically the integrated mean does not equal the average of min and max. But the average of min/max ESTIMATES the mean, and that estimate is unbiased. That's the best we can do. The perfect should not be the enemy of the good.
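Here is a sketch of that experiment (synthetic series; all parameters invented for illustration):

```python
# Build hourly temperatures with an asymmetric diurnal cycle, weather noise and
# a known linear trend; compare the trend recovered from the full hourly mean
# ("integrating the function") with the trend from (min + max) / 2.
import math
import random

random.seed(0)
TREND_PER_HOUR = 0.00005                  # ground-truth warming, invented
DAYS = 365 * 30

integrated, minmax = [], []
for d in range(DAYS):
    hours = []
    for h in range(24):
        t = d * 24 + h
        diurnal = 10 * math.sin(2 * math.pi * h / 24) + 3 * math.sin(4 * math.pi * h / 24)
        hours.append(15 + diurnal + random.gauss(0, 2) + TREND_PER_HOUR * t)
    integrated.append(sum(hours) / 24)            # the "integrated" daily mean
    minmax.append((min(hours) + max(hours)) / 2)  # the historical estimator

def slope_per_day(y):
    """Ordinary least-squares slope of y against day number."""
    n = len(y)
    xbar, ybar = (n - 1) / 2, sum(y) / n
    num = sum((i - xbar) * (v - ybar) for i, v in enumerate(y))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

print("true trend/day      :", TREND_PER_HOUR * 24)
print("integrated trend/day:", slope_per_day(integrated))
print("min/max trend/day   :", slope_per_day(minmax))
# The two estimators differ in level but agree on the trend, which is the point.
```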
4. The station drop problem. Mathematically it does not matter. But again, I proved it doesn't matter by several methods of looking at the data.
There are a bunch of other problems that might matter. But getting people to pay attention to those is hard. Anthony gets these problems. SteveMc gets these problems. People writing papers today (in submission) get these problems, but most skeptics are missing the key issues and focusing on other issues that don't matter.

Graeme W
November 11, 2010 1:35 pm

JamesS says:
November 10, 2010 at 11:04 pm
What caught my eye first off is that for some reason, the raw temperatures are stored in the database as integers, with what I presume is a one significant digit decimal. 273 becomes 27.3, 315 becomes 31.5, etc.
I’ve been a database developer for 25 years now, and I’m stumped to think of a reason for those temperatures to be stored in that manner….

On the other hand, I’ve also been working with databases for 25 years now, and I can think of a couple of reasons. The first and simplest is disk space. If you go back 25 years, disk space was much more expensive than it is today. Programmers would try to minimise the amount of disk space large datasets would take up, and integers would commonly take up half the disk space of a float. Given that almost all the data in the dataset is numbers, that means storing the number data as integers would have resulted in a significant reduction in disk space required, with a considerable saving in money for the project at the time.
The other reason I could think of was processing time. Again, with lots of number crunching being potentially required, integer arithmetic was much faster than floating point arithmetic. Having the data as integers would have offered a decrease in processing time. Whether that was significant would be dependent on what power machines they had.
Neither reason is particularly valid today, but I remember in my early days in IT of always running out of disk space on the machines we had available to us (we weren’t a large organisation with almost unlimited resources available to us), so a trick like this to reduce the amount of disk space needed would have been seriously considered.
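A small illustration of the size difference (using NumPy for concreteness; the GHCN files themselves are plain text, so this only illustrates the general point):

```python
import numpy as np

temps_float = np.array([27.3, 31.5, 25.6], dtype=np.float64)   # degrees C
temps_tenths = np.array([273, 315, 256], dtype=np.int16)       # 27.3 stored as 273

print(temps_float.nbytes, "bytes as float64")   # 24
print(temps_tenths.nbytes, "bytes as int16")    # 6
print(temps_tenths / 10.0)                      # decode: [27.3 31.5 25.6]
```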

dixon says:
November 11, 2010 at 7:48 am
Thanks steven mosher for explaining about the average temp being an unbiased estimate of the mean. I’d been fretting about that when the daily cycle of hourly data from Perth, WA is so unsymmetrical. But I confess to being too lazy/inept to figure out how such a major flaw could have survived (I like to assume some degree of competence on both sides).

When you only have two figures (max and min), the mean (commonly called the average) is about the best calculation you can get, but, as you point out, it doesn't allow for asymmetrical temperature patterns (especially if they're not consistent). Given that BOM takes, I believe, half-hourly readings, much better approaches are possible.
Personally, I would have thought that a median value of the half-hourly readings would be a better judge of the ‘average’ temperature for a day, rather than a mean. That is, if all the readings were sorted into increasing order, the one in the middle of that sort would be the median. That would eliminate any effects from temporary temperature spikes.
Alas, we don’t have historical half-hourly readings going back far enough, and you can’t mix median and mean averages and do any sort of fair comparison, so we’ll just have to stick with the mean, and the lowest-common-denominator approach of simply averaging the max and min for the day.

Tim Folkerts
November 11, 2010 1:57 pm

There seem to be legitimate concerns about the data presented here. As far as I am concerned, until the original author comes back to address these issues, it is not even worth worrying about this whole blog entry.
In effect, a peer review has been done and significant questions have been raised. If the author doesn’t address these issues, why should the rest of us have more interest in it than the author does?
Unfortunately, the unreviewed version has already been “published” and casual readers will get no more than the “Gross” Data Errors in GHCN V2 for Australia headline. If the author cannot address the questions sufficiently in a short period of time, then a retraction would be in order; perhaps “Gross Data Errors spotted in ‘Gross’ Data Errors in GHCN V2 for Australia.”

Ed Thurstan
November 11, 2010 3:41 pm

Mike
To equate to the daily temperature cycle, I think your integral should have been over [0,2pi]. But no matter, I agree that an integrated temperature is more satisfying to an engineer than a min/max average. But only AMO stations give this, and that would limit the data supply to maybe 30 years.
I invested in some of Anthony’s very excellent temperature dataloggers to investigate this. I am doing a very simple rectangular integration on hourly data to compare it with the min/max average. I don’t have summer data yet, but so far I am surprised. Plotting integrated temperature against min/max average, I expected to see the slope of the correlation change between seasons, but I can’t see it yet. But I am at 33South. So I checked hourly data from Canada, from 85North. Again, not much difference between summer and winter. I plan to do some more work.
RuhRoh
I don’t agree that “application of ‘judgement'” has a place in any science that relies on recorded numerical data. It is little different from consensus scientific decisions.
Rational Debate
The one-way half adjustments do create a tiny bias, but otherwise, you are correct. The errors are simply indicative of sloppiness, lack of thought and inadequate quality control of the output of the GHCN principal product: a mean temperature. Would you accept the same sloppiness in a financial institution's calculation of your mortgage repayment?
MikeA, Nick Stokes, David Evans
You may well be right, but you are surmising. You don't know, and nor do I. And NOAA does not tell us their method. Just how many days are you prepared to miss in a month and still believe the monthly average of those remaining days? How about we reject any suspicious low temperatures, especially recent ones? Can we apply this short-month principle to annual data, so that we can report an annual temperature when a couple of months' data are missing? To paraphrase a well known person: NO WE CAN'T!
NOAA simply adjusts suspicious (in their minds) data which they receive from BOM, and deletes some. I can't find out who calculates V2.Mean. NOAA? Or was it supplied by the Aust. BOM? But I am a simple engineer, and when someone puts up two numbers and their mean then, unless qualified, I expect that mean to be calculated the way it always has been.
I once worked for a Govt. research establishment. If the objective of an experiment had been to calculate a daily average based on hourly recordings, and I missed one, the whole day’s work was lost. If I had gone along to my section leader and said “Boss, I missed a reading while on a coffee break, but it does not matter. But if you want, I can figure out what it should have been from the surrounding data”, then I would have been booted out of that organisation.
Scott
Data appears on that site daily. The Aust BOM have no basis on which to adjust it at that frequency. It matches (with rare exceptions) the GHCN raw data, which is purported to be raw as received from the supplying country (at least for Australia). The BOM Daily High Quality data IS adjusted.
Graeme W
I am grateful to you for spotting that error, then going further to identify what I had done wrong.
2005 and 2007 both exhibited errors. I chose 2005 from GHCN, but unfortunately chose 2007 BOM data, then calculated the differences. The 1991 and 1995 comparisons are correct.
The offending data for 2005 should read
BOM MAX Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 39.4 38.2 38.4 38.2 33.3 26.4 27.9 28.6 31.4 33.6 37.2 36.3
BOM MIN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 26.7 27.3 26.4 23.7 18.5 14.7 14 13.8 15.6 18 20.8 25
BOM MEAN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 33.05 32.75 32.4 30.95 25.9 20.55 20.95 21.2 23.5 25.8 29 30.65
DIFFERENCES
MAX Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 0.0 0.0 0.0 – 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
MIN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 0.0 0.0 0.0 – 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
MEAN Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2005 1.2 0.9 0.9 1.0 0.8 0.7 0.8 0.7 0.7 0.7 0.7 0.8
I will give Anthony an Errata to be appended to my paper. Anyone who wishes a copy of the corrected paper may email me for one at thurstan@bigpond.net.au.
Ed Thurstan

Kev-in-UK
November 11, 2010 4:09 pm

The various comments about means and calculations are indeed interesting. But the most important calculation is the one that's been used by GHCN (or GISS or CRU, etc.). I have a simple question, which I believe has been partly alluded to in other comments, and that is: what data do they have, and what calculations do they use to arrive at the so-called ‘mean’?
As someone who, as a schoolboy, took part in the daily met observations which were supposedly submitted to the Met Office in the '70s, I remember that we had to take the obs at a certain time each day (9am, as I vaguely remember), and I remember also the max/min thermometer, etc. From this recollection, and the knowledge of the more modern equipment taking continuous readings, how can anyone reconcile older observations with modern ones? For certain, the old min/max thermometer just gave the ‘actual’ min/max value and (ignoring any device errors) this was an absolute value. I presume (but don't know) that modern electrical devices simply store/record the min/max the same as the old manual method? If so, comparison to older records should at least be reasonable. But what if the new electronic devices (or the software that collects the readings) already do some kind of temperature averaging over a 24hr period (i.e. by totting up all the readings and dividing by the number of readings that day)?
Someone mentioned half-hour readings, for example: could these miss a real min/max value? And more importantly, are we comparing chalk with cheese when trying to look at old and modern data?

Graeme W
November 11, 2010 6:57 pm

Okay, based on the new data, I can confirm that there is a problem with the calculation of the mean in the GHCN data file.
It doesn’t matter if you take the mean of the daily (max+min)/2 or if you take the monthly mean max and min and average them (the calculation results in the exact same number — an exercise for the mathematicians to prove if you’re bored, but the proof isn’t difficult, I just don’t know how to do the subscripting required to put it here), the numbers don’t add up.
I took the Jan 2005 figures from the BOM site for Port Hedland (one of the links I gave earlier gives the daily data for 2005), and from that calculated the mean max, mean min and the average of the two. The final answer doesn’t agree with the V2.mean data from GHCN.
My result was 33.06 (to two decimal places) because I didn’t do any rounding until the end, but that’s a long way from the 31.9 that the GHCN data reports. I can’t see how they got their figure (the BOM site doesn’t report mean daily temperatures, only max/min temperatures).
Could someone who is more knowledgeable about GHCN answer the question, or email an appropriate person at NCDC to ask the question? Everywhere I’ve looked people have assumed that the v2.mean file is the ‘raw’ mean temperature, but whatever it is, for Jan 2005 for Port Hedland, it’s not the ‘mean’ as per standard mathematical calculations.
[REPLY – So far as I know, v.2 is not raw, but the “new” adjusted number for GHCN (USHCN went to v.2 about a year back). ~ Evan]

Tim Folkerts
November 11, 2010 9:13 pm

Ed,
I appreciate your efforts to correct the errors. It would be wonderful if all such issues were addressed as rapidly. I haven't had time to look at the specific changes to see exactly how it affects your original post or conclusions, but I'm sure other readers will continue to examine your results.
Tim

November 12, 2010 12:07 am

Kev-in-UK
read the observers' handbook. we've discussed it many times in the past three years

November 12, 2010 2:47 am

if you want to understand why the GHCN numbers don't match, I think you need to understand this:
“The reason why GHCN mean temperature data have duplicates while
mean maximum and minimum temperature data do not ……
“DUPL : Duplicate number. One digit (0-9). The duplicate order is
based on length of data. Maximum and minimum temperature
files have duplicate numbers but only one time series (because
there is only one way to calculate the mean monthly maximum
temperature). The duplicate numbers in max/min refer back to
the mean temperature duplicate time series created by
(Max+Min)/2.

very simply: the v2.min and v2.max are CALCULATED from all the duplicate records. So when Ed picks a duplicate record (Dup=0) from v2.mean, he is picking one record: the longest record.
When you look at a record in V2.Max you don't have duplicate records. You have ONE record. It has a duplicate number, but there are no dups; there is only one record.
That one record has the MAX found in all duplicate records.
For V2.Mean, NOAA just records all the duplicate records. If a duplicate record has a min and a max, then a mean is calculated from the min and max of that dup record.
For V2.Max, NOAA looks at all the duplicate records and picks the highest value for the max (it could come from dup record 1) and the lowest value for the min (it could come from dup 2).
So V2.min and V2.max do NOT get used to create V2.mean.
So: there is a source file that has all duplicate records (we don't have access to this).
Three files get created:
V2.mean: reduces all duplicates to means.
V2.Max: looks at all duplicates and picks the max of all duplicates.
V2.Min: picks the min.
V2.max and V2.min DON'T get used to create V2.mean. They can't, because they only have one record per station, while V2.mean has the duplicates.
(Duplicates are not duplicates… they differ.) That's a whole 'nother story.
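A sketch of that file-generation logic as described above (this reading of the process, and the duplicate values used, are illustrative assumptions, not NOAA's documented code):

```python
# One invented station-month with two duplicate (max, min) records, in tenths.
source = {("50194312", 1996, 3): [(377, 256), (375, 235)]}

v2mean, v2max, v2min = {}, {}, {}
for key, dups in source.items():
    v2mean[key] = [(mx + mn) / 2 for mx, mn in dups]   # one mean per duplicate
    v2max[key] = max(mx for mx, _ in dups)             # extreme across duplicates
    v2min[key] = min(mn for _, mn in dups)

key = ("50194312", 1996, 3)
print(v2mean[key])                        # [316.5, 305.0] - per-duplicate means
print((v2max[key] + v2min[key]) / 2)      # 306.0 - equals neither mean
# So recomputing a mean from the published V2.max/V2.min need not reproduce
# any of the V2.mean duplicate series.
```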

Al Cooper
November 12, 2010 11:27 am

The high temp for a day might persist for one hour and the low temp might persist for several hours. Using these for a daily “mean/average” would be very inaccurate and misleading.
I would like to see a RMS (root-mean-square) result of temps taken at one hour or less intervals over a one day/week/month/year period.

Steven Mosher
November 12, 2010 1:43 pm

Al Cooper says:
November 12, 2010 at 11:27 am
The high temp for a day might persist for one hour and the low temp might persist for several hours. Using these for a daily “mean/average” would be very inaccurate and misleading.
I would like to see a RMS (root-mean-square) result of temps taken at one hour or less intervals over a one day/week/month/year period.
#######
read my comments. You don't understand the terminology being employed. If you would like to understand it, go look at the data I pointed people at and calculate your own numbers.
The min/max is an ESTIMATOR of the “average”, an unbiased estimator. It is not meant to capture an average that is the integral of the function. It is meant to ESTIMATE that value. Sometimes it's high, sometimes it's low. You can go look at years of data collected every 5 minutes and see for yourself. Nobody will do your work for you.

Al Cooper
November 12, 2010 2:13 pm

Steven Mosher says:
November 12, 2010 at 1:43 pm
“read my comments. You don't understand the terminology being employed. If you would like to understand it, go look at the data I pointed people at and calculate your own numbers.”
Steve, I am not interested in an ESTIMATOR of temps that are costing me $$$.
I want the ACCURATE truth.

Al Cooper
November 12, 2010 2:32 pm

Steven Mosher says:
November 12, 2010 at 1:43 pm
“read my comments. You don't understand the terminology being employed. If you would like to understand it, go look at the data I pointed people at and calculate your own numbers.”
You are the professed expert; I expected no reply that was not cordial.
You assume I do not follow your math.
You say I don’t understand the terminology, you do not know that.
If this is your best answer to my post, you have lost my respect.

November 13, 2010 12:47 am

Al Cooper says:
November 12, 2010 at 2:13 pm
Steven Mosher says:
November 12, 2010 at 1:43 pm
“read my comments. You don't understand the terminology being employed. If you would like to understand it, go look at the data I pointed people at and calculate your own numbers.”
Steve, I am not interested in an ESTIMATOR of temps that are costing me $$$.
I want the ACCURATE truth.
#######
There is no such thing. Every measurement is an estimate. I can explain that, but I cannot make you understand.
In 1850 in Anytown USA they looked at a thermometer (a min/max thermometer) once a day. They recorded 10C for the low and 20C for the high. I cannot change that history. In the year 2010, we have a thermometer in the same place. We record a min temp of 11C and a high temp of 20. We would estimate that it is warmer now. We would not estimate that it is cooler. We might bemoan the fact that we don't have temperature measures every nanosecond. But we don't have them. So we have to estimate.
We could throw a tantrum, but that's easy. The tough thing is to do the best you can with the data you have and characterize your uncertainty. There is ALWAYS uncertainty.
If you want accurate truth, try math or logic; they come closer.

November 13, 2010 12:48 am

Al:
“I would like to see a RMS (root-mean-square) result of temps taken at one hour or less intervals over a one day/week/month/year period.”
Well, what's stopping you? I pointed you at data sources. Get to it.