Australia and ACORN-SAT

Guest Post by Willis Eschenbach

As Anthony discussed here, some Australian climate scientists think that there was an “angry summer” in 2012. Inspired by the necromantic incantations in support of the Aussie claims coming from the irrepressible Racehorse Nick Stokes, I went to take a look at the Australian temperature data. I found out that in response to hosts of complaints about their prior work, in March of 2012 the Australian Bureau of Meteorology (BoM) released a new temperature database called ACORN-SAT. This clumsy acronym stands for the Australian Climate Observations Reference Network – Surface Air Temperature (overview here, data here).

acorn-sat overview

It’s a daily dataset, which I like. And they seem to have learned something from Anthony Watts and the Surfacestations project: they have photos, descriptions, and metadata for each individual station. Plus the data is well error-checked and vetted. The site says:

Expert review

All scientific work at the Bureau is subject to expert peer review. Recognising public interest in ACORN-SAT as the basis for climate change analysis, the Bureau initiated an additional international peer review of its processes and methodologies.

A panel of world-leading experts convened in Melbourne in 2011 to review the methods used in developing ACORN-SAT. It ranked the Bureau’s procedures and data analysis as amongst the best in the world.

and

Methods and development

Creating a modern homogenised Australian temperature record requires extensive scientific knowledge – such as understanding how changes in technology and station moves affect data consistency over time.

The Bureau of Meteorology’s climate data experts have carefully analysed the digitised data to create a consistent – or homogeneous – record of daily temperatures over the last 100 years.

As a result, I was stoked to find the collection of temperature records. So I wrote an R program and downloaded the data so I could investigate it. But when I had just gotten all the data downloaded and started my investigation, in the finest climate science tradition, everything suddenly went pear-shaped.

What happened was that while researching the ACORN-SAT dataset, I chanced across a website with a post from July 2012, about four months after the ACORN-SAT dataset was released. The author made the surprising claim that on a number of days in various records in the ACORN-SAT dataset, the minimum temperature for the day was HIGHER than the maximum temperature for the day … oooogh. Not pretty, no.

Well, I figured that new datasets have teething problems, and since this post was from almost a year ago and was from just after the release of the dataset, I reckoned that the issue must’ve been fixed …

… but then I came to my senses, and I remembered that this was the Australian Bureau of Meteorology (BoM), and I knew I’d be a fool not to check. Their reputation is not sterling, in fact it is pewter … so I wrote a program to search through all the stations to find all of the days with that particular error. Here’s what I found:

Out of the 112 ACORN-SAT stations, no less than 69 of them have at least one day in the record with a minimum temperature greater than the maximum temperature for the same day. In the entire dataset, there are 917 days where the min exceeds the max temperature …

I absolutely hate findings like this. By itself the finding likely makes almost no difference for most applications. These are daily datasets, with each station having around 100 years of data, 365 days per year; that means the whole dataset has about 4 million records, so the 917 errors are 0.02% of the data … but it means that I simply can’t trust the results when I use the data. It means whoever put the dataset out there didn’t do their homework.
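For reference, the check involved is a one-liner. My program was in R; the following is a minimal Python sketch of the same idea, using hypothetical field names and made-up sample values rather than the actual ACORN-SAT file layout:

```python
# Minimal sketch of the min > max check. The (station, date, tmin, tmax)
# tuple layout and the sample values are hypothetical, not the real
# ACORN-SAT format.
def find_bad_days(records):
    """Return the records where the daily minimum exceeds the daily maximum."""
    return [r for r in records if r[2] > r[3]]

sample = [
    ("Alice Springs", "1953-01-10", 21.4, 36.0),  # normal day
    ("Cabramurra",    "1962-07-02",  1.2, -0.5),  # min > max: gets flagged
]

for station, date, tmin, tmax in find_bad_days(sample):
    print(station, date, tmin, tmax)
```

Any dataset that passes even this trivial filter cleanly has had at least a cursory sanity check; ACORN-SAT does not.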

And sadly, that means that we don’t know what else they might not have done.

Once again, the issue is not that the ACORN-SAT dataset had these problems. All new datasets have things wrong with them.

The issue is that the authors and curators of the dataset have abdicated their responsibilities. They have had a year to fix this most simple of all the possible problems, and near as I can tell, they’ve done nothing about it. They’re not paying attention, so we don’t know whether their data is valid or not. Bad Australians, no Vegemite for them …

I must confess … this kind of shabby, “phone it in” climate science is getting kinda old …

w.

THE RESULTS

Station, bad days in record (days where the min temperature exceeds the max temperature)

Adelaide, 1

Albany, 2

Alice Springs, 36

Birdsville, 1

Bourke, 12

Burketown, 6

Cabramurra, 212

Cairns, 2

Canberra, 4

Cape Borda, 4

Cape Leeuwin, 2

Cape Otway Lighthouse, 63

Charleville, 30

Charters Towers, 8

Dubbo, 8

Esperance, 1

Eucla, 5

Forrest, 1

Gabo Island, 1

Gayndah, 3

Georgetown, 15

Giles, 3

Grove, 1

Halls Creek, 21

Hobart, 7

Inverell, 11

Kalgoorlie-Boulder, 11

Kalumburu, 1

Katanning, 1

Kerang, 1

Kyancutta, 2

Larapuna (Eddystone Point), 4

Longreach, 24

Low Head, 39

Mackay, 61

Marble Bar, 11

Marree, 2

Meekatharra, 12

Melbourne Regional Office, 7

Merredin, 1

Mildura, 1

Miles, 5

Morawa, 7

Moree, 3

Mount Gambier, 12

Nhill, 4

Normanton, 3

Nowra, 2

Orbost, 48

Palmerville, 1

Port Hedland, 2

Port Lincoln, 8

Rabbit Flat, 3

Richmond (NSW), 1

Richmond (Qld), 9

Robe, 2

St George, 2

Sydney, 12

Tarcoola, 4

Tennant Creek, 40

Thargomindah, 5

Tibooburra, 15

Wagga Wagga, 1

Walgett, 3

Wilcannia, 1

Wilsons Promontory, 79

Wittenoom, 4

Wyalong, 2

Yamba, 1

150 Comments
July 1, 2013 9:58 am

“My issue was not I hadn’t heard their excuses for the problem. It was that they had not fixed the problem despite having a year to do so.”
Because there is nothing to fix if the temps are taken at different times. The only way to know for sure is to look at the hourly data for Feb 28, Mar 1, and Mar 2 to see what the actual readings were and how they changed.
It doesn’t make sense that they would have one time to read TMin and another to read TMax. Here in Canada, Environment Canada takes hourly measurements, then the TMin and TMax are taken from that dataset. So TMin can’t be higher than TMax.
This is why I suspect it is the timing of when they are taking the temps, assuming night is cooler than midday.
Only the hourly measurements will tell us. Is that available?

July 1, 2013 3:14 pm

Willis, you have highlighted what we have been saying for a year: if there are so many mistakes of this magnitude, and they escaped any quality checking before publication, and haven’t been fixed a year later, what confidence can we have in the whole dataset? As I mentioned previously, Acorn is riddled with errors: 10 degree mistakes are easy to find, as are adjustments of over 8 degrees. We only have Blair Trewin’s word for it that there is no bias, and I’m not convinced; that’s what David Jones assured me about the previous “High Quality” (sic) dataset before I proved it to be comprehensively biased by at least 40%. ACORN should have been called A CON. Having said that, the record since 1979 reflects UAH for Australia quite well.
Thanks for your work.
Ken Stewart

barry
July 1, 2013 7:17 pm

Obviously, The Scientist In Charge of ACORN-SAT didn’t take the trouble to do what I did, look at each individual error. He made an incorrect assumption, and has now hurried off to implement an incorrect solution.

What was the assumption you allege he made? Seems to me like he explained how such anomalies occur.

PS—I did love The Scientist’s explanation of how it will all come right …
“Clearly in these cases either the estimate of the max is too low or the min is too high; however, providing the adjustment process is unbiased, these cases will be offset by cases where the max is too high/min is too low, and there is no overall bias.”
A real scientist would, you know, actually determine if “the adjustment process is unbiased” before making such an unsupported claim, rather than simply assuming that it is unbiased …

He didn’t state an assumption, he gave a conditional caveat. Aren’t you making an assumption yourself? A real scientist would test the proposition. A real scientist would have asked questions and investigated. Looks to me like you threw your hands up when you came across some anomalies and made a grand statement about the quality of ACORN-SAT. That is blog-standard science. At any time you could have contacted ACORN and asked about the anomalies, but it took a commenter to go through the painstaking task of discovering the email address of the ACORN director, and laboriously constructing some sentences to discover more about the issue. That commenter followed a reasonable procedure in investigating the issue.

And that is why, as an accountant, I become concerned when I see such small “trivial” errors. Because it makes me wonder—what other errors are hidden in there?

You discovered a 0.02% ‘error’ based on the 917 inverted min/max in the data out of several million data points. The BOM say they have an error rate of a few tenths of a percent, an order of magnitude greater than you discovered. How is that hiding the errors? They even discuss some of the errors that crop up. If you want to find out what kinds of errors there have been, you can read their reference material, and, if you are a real investigator, contact them for further details. ‘Real scientists’ go the distance.
But if you can’t be bothered doing that, how about calculating what difference the ‘errors’ you discovered would make to the claim that, based on BOM surface data, summer 2012/2013 was the warmest on record. If the record was broken by 0.2C, then how much impact could 917 data errors out of 4 million have? For instance – summer data is 10416 data points per annum for a network of 112 stations. If all 917 errors occurred in the 2012/2013 summer, and all were biased high by 3.9C, what impact would that have on the record if you removed those anomalies altogether? But you know when these anomalies occurred, so you can simply take them out of the record and see what impact that has. Then you would be applying a statistical test to the issue that kicked off your investigation.
Either work with the information you have and do some statistical analysis with appropriate caveats, or find out more about the information you don’t have and do a thorough job. So far you’ve made sweeping criticisms with little effort to explore them.
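The worst-case arithmetic suggested here can be sketched directly. This is only a back-of-the-envelope bound using the figures quoted in the thread (917 errors, a 3.9 °C bias, 10,416 summer averages per year, doubled to count min and max readings separately), not an analysis of the actual data:

```python
# Worst-case bound: if all 917 inverted min/max days fell in a single
# summer, and every one biased a reading high by 3.9 C, how far could
# they move the summer-mean figure? All numbers are from the thread.
n_errors = 917
bias = 3.9                    # C per bad reading, worst case
summer_readings = 10416 * 2   # min plus max readings per summer

max_shift = n_errors * bias / summer_readings
print(round(max_shift, 2))    # about 0.17 C, versus the 0.2 C margin cited
```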

barry
July 1, 2013 7:21 pm

summer data is 10416 data points per annum for a network of 112 stations

That would be the number of averages, the min/max data points would be twice as many, obviously.

July 1, 2013 8:48 pm

Barry, the commenter who contacted BOM and got such a rapid response was indeed fortunate- normal response time from Webclimate is 3 days, and it took 3 months and a complaint to the minister before I got a reply to my queries re HQ data, which didn’t really answer my questions even with continued pushing. The point remains- this is one example of the many errors in Acorn which have not been fixed 15 months after its first release.
Ken Stewart

barry
July 1, 2013 9:23 pm

Ken, can you describe what the error is, exactly, and how it leads to a bias in the records sufficient to undermine the assertion that Australian summer 2012/13 is the warmest on record? I can’t see how this ‘error’ would make a substantial difference.

barry
July 1, 2013 10:07 pm

He did explain how they occur. He said they happened because of the adjustments made to the min and max datasets. That was an assumption, as the data shows.

I still don’t see an assumption. He neither assumes the anomalies are correct (he says the opposite), nor makes an assumption about the adjustment process. What are you referring to? Explanation =/= assumption in my dictionary.

And The Scientist In Charge, obviously, didn’t actually look at the errors. If he had, he wouldn’t have tried a bullshit excuse like the one about the data adjustments causing the problem to handwave away a 3.9°C error.

Now, that is an assumption. Why not just ask them if they noticed these anomalies, and why they didn’t do something about it if they did?

Next, you claim that the tiny size of the error should make it immune from discussion.

On the contrary. I said:

how about calculating what difference the ‘errors’ you discovered would make to the claim that, based on BOM surface data, summer 2012/2013 was the warmest on record

Rather than making the ACORN data set immune to analysis, which seems to be the point you are pursuing, I urged you to work with the information you have. Analyse what you know – don’t throw out the data just because it’s problematic. And for what you don’t know, investigate more deeply. Talk about projection!
“Check out the links in Jo Nova’s post upstream.”
None, that I could see, refer to this particular issue. When was the BOM made aware of the min/max anomalies a year ago? Do you have a link?
I am reminded of Fall et al, which actually did the hard yards, made a reference network and compared trends and data. That was ‘real science’, and they analysed and documented problems with the min/max trends (while finding the average values seemed to be ok).
I have no problem with, indeed I encourage you (or anyone else) to investigate problems you perceive with the BOM data. What I think is outlandish and ironic is BOM being scolded for not doing their ‘homework’ when you’ve clearly done very little on this particular issue (min/max anomalies) yourself. Handwaving? That’s when one is dismissive without doing much analysis, isn’t it?

In short, your assumption of good will on their part is touching…

That is incorrect and irrelevant. Analysis should take place without assumption of good-will or bad. ‘Real science’ is neutral. Think there’s not enough information? Then take steps to rectify that. You have speculated that min/max adjustments may be biased. Follow your inquiry, test data randomly, and also consider the validity or not of adjustment methods. Anyone can make a graph from selected data to make a point as Jo Nova does, but that’s not neutral analysis.
If you could formulate your scientific criticisms into questions for the BOM, what would they be? These would clarify your concerns and focus your investigation. Maybe you could politely email the BOM for further information.

July 2, 2013 1:19 am

Sorry Barry, not with you- I’m not sure we’re talking about the same things. There were obvious errors causing bias in the HQ dataset. There are numerous errors in Acorn- I’ve mentioned a couple above. Many stations have past winter maxima cooled, but there is no evidence of deliberate bias. I don’t know whether the errors cause bias, but they certainly cause lack of confidence in the record. And I haven’t mentioned the Angry Summer in this thread. And Willis- Tennant Creek in the hot interior- I can’t imagine minima ever exceeding maxima for any reason,
Ken

barry
July 2, 2013 3:21 am

although if your adjustments lead to physically impossible situations, wouldn’t you question the adjustments?

Sure. But are the anomalies physically impossible? “It is not possible that the warmest time in a 24 hour period could be when the thermometers are reset” (9:00 am in the case of BOM practice, mostly) – I would also test that assertion. I did some googling to see if there were other places in the world where this has happened, and indeed it does, as far as weather watchers have posted. Of billions of data points worldwide, surely this could happen a few times. But I would not assume that this was the case for the BOM data either. I’d make no assumptions.

Barry, perhaps you can explain to the class how The Scientist In Charge was NOT making assumptions, but was actually correct when he said that these errors are from “adjustments” …

Short of doing the work, I could not explain that. The same should apply equally to anyone else who has not investigated the matter.

I can explain it to you, and I have, several times

Really? I may have missed it, but it seems to me you have made assertions (eg, “it’s physically impossible”). But have you tried to replicate the process? Can you explain the adjustment process to begin with, and what is wrong with it? To the point that initiated this branch of the conversation, would they make a difference? Would ‘errors’ of the kind you’ve pointed out lead to a biased temperature record sufficient to discredit the notion of a record-breaking summer or not?

Ken didn’t say anything about an error being “sufficient to undermine the assertion” about the Australian summer being so hot.

It appears Ken has done some work on BOM data, so I wondered if he wanted to weigh in on the summer temps issue that provoked your interest.
And it’s my interest, too, because I live in Australia, and followed the weather reports around the country during the summer. My ‘experience’, limited as it was to anecdotes and an array of data points, was that summer nationally was a particularly warm one, and there were clear records broken across the country. That doesn’t ‘prove’ that the national average was a record-breaker, of course, but that’s why I’m curious about the issue as raised here.

July 2, 2013 6:38 am

barry says:
July 1, 2013 at 9:23 pm
Ken, can you describe what the error is, exactly, and how it leads to a bias in the records sufficient to undermine the assertion that Australian summer 2012/13 is the warmest on record? I can’t see how this ‘error’ would make a substantial difference.
———-
I’ve only looked at Alice Springs, but TMax for the year has gone up since 1910. However, it was a faster rise from 1910 to 1960. 1960 had the highest TMax; since then the rise overall is very shallow. But what is interesting is that every 9 to 11 years Alice Springs has an abnormally cool summer. Those years are quite prominent.

July 2, 2013 6:57 am

Barry, here is Alice Springs record Jan TMax:
Day – Temp – Year
02 – 45 – 1960
03 – 45.2 – 1960
04 – 44.2 – 1972
05 – 44.6 – 2004
08 – 43.3 – 1932
09 – 43.4 – 1932
10 – 42.9 – 1915
11 – 42.8 – 1935
12 – 42.7 – 1928
13 – 43.5 – 1981
14 – 43.9 – 1936
15 – 44 – 1944
16 – 43.3 – 1932
17 – 43.2 – 1939
18 – 44.4 – 2001
19 – 43 – 1928
20 – 43 – 1928
21 – 42.8 – 1935
22 – 43.2 – 1939
23 – 42.9 – 1915
24 – 43 – 1928
26 – 43 – 1928
27 – 43 – 1928
28 – 43.9 – 1936
29 – 42.6 – 1938
30 – 44.7 – 1990
You can see it is dominated by years before 1950. Even the two recent records, the 5th and 18th of Jan, were below the highest of 45C in 1960. Record-breaking years have nothing to do with getting warmer. It has to do with accounting. In the first year of records, every day is a record breaker. As the years of data accumulate, the number of record-breaking days drops off in a decay curve. The reason is the number of possible slots is huge. If the range for any given temp for any day in Jan is between 20 and 45C, measured in 1/10C, then there are 250 possible slots per day; for the full year, multiply that by 366.
To fill all those possible slots would take somewhere around 3000 years.

johanna
July 2, 2013 7:25 am

Barry said:
But if you can’t be bothered doing that, how about calculating what difference the ‘errors’ you discovered would make to the claim that, based on BOM surface data, summer 2012/2013 was the warmest on record. If the record was broken by 0.2C, then how much impact could 917 data errors out of 4 million have? For instance – summer data is 10416 data points per annum for a network of 112 stations. If all 917 errors occurred in the 2012/2013 summer, and all were biased high by 3.9C, what impact would that have on the record if you removed those anomalies altogether?
——————————–
Barry, I am not a scientist nor a mathematician. But I did spend the best years of my life feeding numbers to very senior politicians to spout in Parliament. In the early years, I had people up the line checking everything I did, but later, as I got better at it, not so much. To the best of my knowledge, nobody ever gave wrong information to Parliament based on my briefs.
What I learned is (i) always do a back-of-the-envelope check about where the decimal point should be; and (ii) small errors often conceal, or flag, large ones.
The point that Willis has very patiently been trying to make to you is that the thing the BOM consistently avoids is transparency about errors and weird results. Nobody is jumping up and down yelling “gotcha” because errors occur. Of course they do. Nobody is instantly claiming conspiracy theories because of errors or weird results – they happen, sometimes for valid reasons.
The problem is that they refuse to be transparent about what they are doing. They make adjustments, expunge records from their working datasets, use new algorithms, invent new metrics – without leaving a visible trail or providing more than a bunch of platitudes to explain what they are doing.
When preparing briefs for Premiers and Prime Ministers, when confronted with this kind of bullshit, I was always careful to insert the words “I am advised by the BOM (or whoever) that …”. No way was I going to drop some uninformed politician into saying that he/she actually believed it. Some of them chose to drop the qualifier – that was their call.
But “I am advised that …” doesn’t work so well on WUWT. What possible justification is there for not laying on the table exactly what is happening with publicly funded weather statistics, when, and why?

barry
July 2, 2013 7:36 am

That’s interesting, jr, but incidental. No state broke the record, and plenty of places locally did not. BOM did not claim that Alice Springs broke its record. The BoM claim is about the national average.
There will always be more record-breakers early in the record, and by rights there should be fewer and fewer in an unchanging climate as the years accumulate. You’re right that record breakers by themselves do not describe a warming trend. Trend analysis is a different, only slightly related facet of my query.

Reply to  barry
July 2, 2013 8:13 am

Barry, that depends on what is being used for the trend. If the average is being used, it is largely irrelevant and grossly misleading. The “average” is simply (TMax+TMin)/2. But that average is not the median temp. Those two extreme ends of the day may each have lasted less than the hour interval between measurements. If you add up all the hourly temps, then divide by 24, you get a different number, more often than not below the average.
An increase in the average isn’t telling us what is physically going on. For example, in Canada, the average temp has been going up since 1900. But that average increase is because winters are not getting as cold. Milder, shorter winters. In fact, max temps in the summer in Canada have been dropping since the mid 1930’s. It is just that the winter increase is faster than the summer decrease, hence the increase in average.
Thus the only way to see what is physically going on, beyond the claim of increasing average, is to look at each station’s daily temps. If that data is corrupted in AU, then one can’t make any claims one way or the other. Scientifically, that’s unfortunate.
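The midrange-versus-mean point can be illustrated with a toy example (the hourly values below are invented for illustration, not station data):

```python
# The (TMin + TMax) / 2 "average" is the daily midrange; it generally
# differs from the mean of all 24 hourly readings, because the extremes
# may persist for less than an hour each.
hourly = [12, 11, 10, 10, 9, 9, 10, 12, 15, 18, 21, 24,
          26, 27, 28, 27, 25, 22, 19, 17, 15, 14, 13, 12]

midrange = (min(hourly) + max(hourly)) / 2
true_mean = sum(hourly) / len(hourly)

print(midrange)               # 18.5
print(round(true_mean, 2))    # 16.92; here the midrange overstates the mean
```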

barry
July 2, 2013 8:22 am

The point that Willis has very patiently been trying to make to you is that the thing the BOM consistently avoids is transparency about errors and weird results.

I’ve read several documents on their methods, which mention errors that crop up and – to some degree – how they deal with them. I don’t see Willis trying to explain that, but he has linked me to Jo Nova pages, advising me that I must be unaware of the flaws in BOM. I flicked through those to see if what he has lit upon is covered, but it isn’t.

What possible justification is there for not laying on the table exactly what is happening with publicly funded weather statistics, when, and why?

There is data, discussion of methods, uncertainty and problems with the data easily accessible at the BOM website. Commenters who are regulars here have posted links to them, and so have I. A commenter emailed a question regarding the issue Willis brought up here and they responded. This could be the beginning of a dialog, but neither Willis nor anyone else seems to want to make it so.
I have tried to focus on the issue Willis brought up, particularly with regard to the matter that initiated his investigation (the record-breaking summer), but people keep talking about how awful BOM is. They may or may not be right, but reading complaints and seeing a lack of willingness to investigate very deeply, or engage with BoM, can you understand why this might not be persuasive?
I read at Jo Nova’s that after much lobbying, BoM addressed its data (with little change in general results).
Rather than work with the data that’s available, the default is to rail against BoM. It seems like a distraction; a talking point to avoid number-crunching. I have asked what the upshot is of throwing out the anomalies. Nothing. I have asked for Ken and Willis to describe what they think has happened to the inverted anomalies. Nothing. I have asked what steps Willis has taken to understand the adjustment processes. Nothing. Did Willis contact BoM to enquire about the issue he discovered? Nothing.
This pattern does not encourage me to take on the general blandishments. No one is obliged to do any of this, of course, but as sweeping statements are made, I wonder what steps could be taken to address the questions they raise. “BoM have not done their homework”? Statements like that make me skeptical. I’ve read about their methods, and that is not the impression I get. Are their methods invalid? Are their adjustments unreasonable? I don’t know, but the answers are not here yet, or at Jo Nova’s, as far as I’ve read.
http://cawcr.gov.au/publications/technicalreports/CTR_049.pdf
BoM provides raw and adjusted data (Jo Nova et al made use of it). The above links to an overview of their methods. What is missing in it that you think is needed?

July 2, 2013 8:25 am

Wrote a little program which would count the number of record breaking days for each year as if each year was new from 1910 to 2012 for Alice Springs. Interesting, more of an asymptotic drop:
http://cdnsurfacetemps.files.wordpress.com/2013/07/recordbreakperyear.jpg?w=689
(Not sure how to embed images in a post here).
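The decay described here is what one would expect in a stationary climate: in year n, each day-of-year slot has roughly a 1/n chance of setting a new record, so about 365/n new records per year. A sketch of that kind of count, using synthetic (invented) data rather than the Alice Springs record:

```python
import random

# Count, per year, how many day-of-year slots set a new all-time high.
# Synthetic stationary data; the program described above ran on the
# actual Alice Springs daily record.
random.seed(1)
highs = {}          # day-of-year -> highest temp seen so far
new_records = {}    # year -> number of new daily records set that year

for year in range(1910, 2013):
    count = 0
    for day in range(365):
        temp = random.gauss(30, 5)    # same distribution every year
        if day not in highs or temp > highs[day]:
            highs[day] = temp
            count += 1
    new_records[year] = count

print(new_records[1910])   # 365: in the first year every day is a record
print(new_records[2012])   # a small number (expect roughly 365/103, about 3.5)
```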

barry
July 2, 2013 8:39 am

jrwakefield,

If that data is corrupted in AU, then one can’t make any claims one way or the other. Scientifically, that’s unfortunate.

If the problems with the data are not understood, then no one can say anything. That’s my point.
‘Corrupted’ is loaded language. We need less of that if we want to illuminate issues rather than use them as talking points. As Willis said;

All datasets contain errors

barry
July 2, 2013 8:48 am

Willis, I missed a question of yours upthread.

But since you are the first person in this thread to use the term “substantial difference”, I’m clueless what error you might be referring to.

The one you pointed out in your article above, and which I’ve mentioned consistently since we started communicating.

In the entire dataset, there are 917 days where the min exceeds the max temperature…

You kind of answered the question in your article.

By itself the finding likely makes almost no difference for most applications.

But I wondered if you’d put that to any testing.
You also said;
“…it means that I simply can’t trust the results when I use the data. It means whoever put the dataset out there didn’t do their homework.”
And sadly, that means that we don’t know what else they might not have done.
Have you read this?
http://www.cawcr.gov.au/publications/technicalreports/CTR_049.pdf

6. Data quality control within the ACORN-SAT data set (p. 30)
6.1 Quality control checks used for the ACORN-SAT data set (p. 31)
6.2 Follow-up investigations of flagged data (p. 39)
6.3 Common errors and data quality problems (p. 41)

johanna
July 2, 2013 8:56 am

barry says:
Rather than work with the data that’s available, the default is to rail against BoM. It seems like a distraction; a talking point to avoid number-crunching.
—————————————————–
barry, if Willis never responds to you again, I could understand it. The BOM controls the data “that is available”. They alter it, delete it and reframe it with explanations (if any) that the Tax Commissioner would not accept – like – I just thought up a better way of recording my expenses, and have therefore thrown out my receipts.
I am reminded of a passage in Spike Milligan’s war memoirs, where they found themselves stationed next to an old cemetery. Mrs so-and-so’s marble slab was being used as a washboard by one of his colleagues. Her inscription said: “Not dead, only sleeping”.
“She’s not fooling anyone but her bloody self”, muttered Spike’s pal, as he wrung out his socks on her.

barry
July 2, 2013 3:41 pm

johanna,
is the take-home message “there’s no use, you simply can’t do anything good with BoM data?”
Do they not make the raw data available? I believe they are in the process of making the codes and methods, the programs that are particular to the computer system that they use for ACORN data, available via the internet.
I’ve just started posting at Ken’s blog, where they’ve been working with the data. No ill will intended to anyone, by the way.
