Guest essay by Ed Thurstan
The Australian Bureau of Meteorology (BoM) released a new temperature dataset – the “Australian Climate Observations Reference Network – Surface Air Temperature” (ACORN-SAT) in early 2012, with data to the end of 2011. It is supposedly a ground-breaking daily homogenised dataset.
Ground-breaking it certainly is. The BoM has brought new meaning to the term “temperature inversion”. In “refining” the data, the homogenisation process creates, from apparently normal reported daily raw Max/Min data, homogenised data where the Maximum daily temperature is less than the Minimum temperature for that same day. This error is visible in 70 of the 112 stations included in the ACORN-SAT dataset.
In February 2013 the BoM released an update to ACORN-SAT, adding 2012 data to the dataset. A year after the first release, the errors it contained remain uncorrected in the latest release.
I find that:
· The methods used to create ACORN-SAT are wrong.
· Maintenance of ACORN-SAT as a production database is not practicable and not reproducible – Not by the BoM, and not by its author. It is one person’s view of Australian temperature history.
· The original Raw temperature data used to create the ACORN-SAT dataset has been corrupted by data merging and homogenising, and in particular by the Percentile Matching method employed.
· At least 60% of the entire ACORN-SAT dataset could be corrupted, and is of questionable value.
· BoM’s use of ACORN-SAT as their principal temporal temperature database is unsound. It should not be a regular BoM data product.
· ACORN-SAT in its present form cannot become the output of a “production” BoM system.
In 2012 the Australian Bureau of Meteorology (BoM) released a new Temperature dataset, known as the “Australian Climate Observations Reference Network – Surface Air Temperature” (ACORN-SAT). It replaced the “High Quality” dataset, and the “Reference” dataset that were developed to emulate the NOAA administered US Climate Reference Network (USCRN). The USCRN consists of a number of purpose-built surface weather stations designed to be protected from local effects that might introduce human induced temperature errors. It has only about 10 years of history. Faced with the threat of an audit by the Australian National Audit Office, the BoM has established a 100 year history for 112 stations by synthesising, merging and homogenising station data on a daily basis using a variety of mathematical and statistical methods. This ACORN-SAT database was announced in 2012 with much fanfare. The BoM documentation covering development and implementation of ACORN-SAT is here.
An Expert Panel was appointed to review the product. It was chaired by Ken Matthews AO, and comprised:
· Dr Thomas Peterson, Chief Scientist, National Climatic Data Center, National Oceanic and Atmospheric Administration, United States
· Dr David Wratt, Chief Scientist (Climate), National Institute of Water and Atmospheric Research, New Zealand
· Dr Xiaolan Wang, Research Scientist, Climate Research Division, Environment Canada.
That Panel provided this report to Dr. R Vertessy, then Deputy Director of the BoM. It concluded, among other favourable comments, that
“ ‘The Panel is convinced that, as the world’s first national-scale homogenised dataset of daily temperatures, the ACORN-SAT dataset will be of great national and international value. We encourage the Bureau to consider the dataset an important long-term national asset.’
ACORN-SAT International Peer Review Panel Report, 2011”
In July 2012 I published a note concerning the quality of the first release of ACORN-SAT data. It appeared in Jo Nova in July 2012 and later in Andrew Bolt. (I think this might be the article Willis Eschenbach referenced in his recent note on ACORN-SAT). John McLean discussed the same subject in the January 27, 2013 edition of Quadrant Online, although incorrectly attributing authorship.
I reported in July 2012 that BoM data merging and homogenisation methods were introducing errors in temperature series that were not apparent in their “raw” data. I said that I had found 954 such errors, but that this was a low estimate of the magnitude of the problem. These adjustments were made to large blocks of data, spanning many years. Thus if some were clearly wrong, then whole blocks of data must be suspect. I said in that report that ACORN-SAT should be withdrawn until these errors were corrected.
The second release of ACORN-SAT was in January 2013, to Y/E Dec 31, 2012. My data was downloaded Jan 21, 2013. I performed a detailed comparison between Releases 1 and 2.
This time I reported on April 28, 2013 that
1. All the Release 1 errors to Y/E Dec 2011 are repeated in Release 2.
2. The errors (in a 0900-0900 reporting regime) where today’s maximum is reported as less than tomorrow’s minimum are repeated in Release 2.
3. There was missing data in some Release 1 series. I did not comment on it at the time because it was all at the end of 2011 data, and I thought the problem might be simply late delivery of data. But Release 2 appends a further year of data, and those missing dates are still there, now embedded in the series.
4. Every recorded temperature in the Release 1 dataset was compared with its matching one in the Release 2 dataset to the end of 2011. There were no differences.
I suggested that no effort had been made to review the Release 1 product before appending 2012 data to form the Release 2 database. I said that it appears that the 2012 data is simply raw BoM data, with only rudimentary quality control checks applied.
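The release-to-release comparison described above can be sketched in a few lines. This is a minimal illustration, not the code used for this report: the record layout (station, date, element) and every temperature value are invented, and only the logic of "compare every matching record, then list the gaps" is taken from the text.

```python
from datetime import date, timedelta

# Hypothetical layout: (station_id, date, element) -> temperature.
# All values below are invented for illustration.
release1 = {
    ("90015", date(2011, 12, 29), "max"): 18.2,
    ("90015", date(2011, 12, 30), "max"): 19.0,
    # 31 Dec 2011 is absent from Release 1
}
release2 = {
    ("90015", date(2011, 12, 29), "max"): 18.2,
    ("90015", date(2011, 12, 30), "max"): 19.0,
    # 31 Dec 2011 is still absent after a further year was appended
    ("90015", date(2012, 1, 1), "max"): 21.4,
}

def compare_releases(old, new):
    """Return (changed, dropped): keys whose value differs, and keys missing from new."""
    changed = sorted(k for k in old if k in new and old[k] != new[k])
    dropped = sorted(k for k in old if k not in new)
    return changed, dropped

def missing_dates(data, station, element, start, end):
    """List the dates in [start, end] with no record for this station and element."""
    days = (end - start).days + 1
    wanted = (start + timedelta(days=i) for i in range(days))
    return [d for d in wanted if (station, d, element) not in data]

changed, dropped = compare_releases(release1, release2)
gaps = missing_dates(release2, "90015", "max", date(2011, 12, 29), date(2012, 1, 1))
print("changed:", changed)
print("dropped:", dropped)
print("gaps:", gaps)
```

An empty `changed` list alongside a gap that survives into the later release is the pattern reported in points 3 and 4 above.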
Purpose of this report
This report attempts to quantify the extent that observed errors have propagated through the entire ACORN-SAT database, given that the observed errors appear in large blocks of data that have the same “adjustments” made by BoM, and which must therefore all be suspect.
Max < Min Error
The data runs from 1910 to 2012 – 103 years. The 954 errors appear in 95 of those years. The distribution of errors by year ranges from 2.6% occurring in 1916, to 0.1% in 2006, the last year in which the error appears.
The distribution and age of Australian stations until the mid 20th Century was weighted towards cooler higher latitude and coastal locations. As additional stations were added in the hot north of the continent, a gridded average continental temperature was artificially biased towards a warming trend. ACORN-SAT corrected that problem by creating long term time series by “compositing” (merging/homogenising) neighbouring stations to synthesize an early temperature history of the selected stations. This mostly occurred in the 1910-1960 period, and may account for the apparent trend in error rate to 1960.
The improvement in error rate shown after 1990 may well be due to the progressive introduction of automated stations from about 1990.
The errors appear to be scattered over the whole Australian continent, appearing in 70 of the 112 stations in the dataset.
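The Max < Min test itself is simple to state in code. The sketch below assumes a hypothetical record layout of (station, date, Max, Min) with `None` for missing values. The sample temperatures are invented, and the station numbers are borrowed from the stations reviewed later only as labels.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (station_id, date, max_temp, min_temp); None marks missing data.
# Temperatures are invented for illustration.
records = [
    ("66062", date(1916, 7, 3), 11.2, 11.9),   # inverted: Max < Min
    ("66062", date(1916, 7, 4), 15.0, 9.1),
    ("90015", date(1960, 6, 1), 12.4, 12.4),   # Max equal to Min is allowed
    ("90015", date(1960, 6, 2), 10.8, 11.0),   # inverted
    ("72161", date(2006, 8, 9), None, 1.5),    # missing Max: cannot be tested
]

def find_inversions(rows):
    """Return the rows whose daily maximum is below the same day's minimum."""
    return [r for r in rows
            if r[2] is not None and r[3] is not None and r[2] < r[3]]

inversions = find_inversions(records)
by_year = Counter(r[1].year for r in inversions)
stations_affected = sorted({r[0] for r in inversions})

total = len(inversions)
for year in sorted(by_year):
    share = 100 * by_year[year] / total
    print(f"{year}: {by_year[year]} ({share:.1f}% of all errors)")
print("stations affected:", stations_affected)
```

Run over the full dataset, the same grouping would give the per-year shares and the count of affected stations quoted in this section.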
Max today < Min Tomorrow Error
These 353 errors (of which 118 also show the Max < Min error) are distributed differently.
The error occurs in substantially different sets of stations.
Errors virtually cease at the end of the period where there is synthesis of data to fill the 1910-1960 gap.
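The second test compares each day's Max with the following day's Min. The sketch below assumes that, in a 0900-0900 regime, those two readings cover the same 24-hour window, so the Max should not be the lower of the two. The series layout and values are invented, and pairs of days are tested only when both dates are present.

```python
from datetime import date, timedelta

# Hypothetical single-station series: date -> (max_temp, min_temp). Values invented.
series = {
    date(1935, 1, 1): (19.0, 12.0),
    date(1935, 1, 2): (24.0, 20.1),  # this Min exceeds the previous day's Max
    date(1935, 1, 3): (22.0, 15.0),
    date(1935, 1, 5): (21.0, 23.0),  # 4 Jan is missing, so no consecutive pair to test
}

def max_below_next_min(data):
    """Return the dates whose Max is below the Min recorded for the following day."""
    flagged = []
    for day in sorted(data):
        following = data.get(day + timedelta(days=1))
        if following is None:
            continue  # no record for the next day
        today_max, tomorrow_min = data[day][0], following[1]
        if today_max is None or tomorrow_min is None:
            continue  # a missing value cannot be tested
        if today_max < tomorrow_min:
            flagged.append(day)
    return flagged

flagged = max_below_next_min(series)
print(flagged)
```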
Analysis of selected stations
Three stations were reviewed in more detail:
· Sydney Observatory – A long history station with major UHI influences
· Cabramurra – A remote high altitude station with a single move
· Cape Otway Lighthouse – A remote single long history station
Two other stations – Wilson’s Promontory and Mackay – were considered for analysis, but rejected as matching raw data was not available for comparison.
The BoM supplied an “Adjustments.xls” workbook to describe the changes made to create the ACORN-SAT database. It is in a diary format. The sections corresponding to the three stations are shown below. Note that in these examples, the worksheets are truncated to delete 8 columns of “stations used”, and occasional one line text notes. The full Adjustments workbook is available here.
The “Raw” data cited comes from here. This data is updated daily by the BoM and has undergone only basic quality control checks before publication.
Sydney Observatory – BoM 66062
This station was established in 1859. It had multiple screen types until the early 1900s when the first Stevenson screen was installed. It has had one physical move of about 200 metres. It is one of Australia’s worst sited stations, with large UHI influences. The BoM notes on adjustments made to create the ACORN-SAT database are:
Looking at the Max < Min errors in this station by date:
The blank Raw temps indicate that these entries in the raw data files were blank. That is, the ACORN data shown for those dates comes from an unknown source, or has been synthesized.
Calculating the differences between ACORN and Raw on an annual basis gives:
The changes specified in the worksheet are visible in the graph:
· Min was raised 0.2 from 1964 back to 1910.
· Max was raised 0.3 from 1984 to 1978.
· Max was raised 0.5 from 1946 to 1938.
· Max was raised 0.9 prior to 1917.
The BoM “Adjustments” workbook, when it says things like “1 Jan 1938 Max -0.53; 1 Jan 1946 +0.53”, implies to me that the intention is (going back in time) to raise the Max data in that period by 0.53. The result is evident in the above graph of the annualised difference between ACORN Max and Raw Max.
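An annualised difference of this kind can be computed as the yearly mean of (ACORN minus Raw) over the days present in both series. The sketch below shows that calculation only. The layout (date mapped to temperature) and the sample values are invented and are not taken from the Sydney record.

```python
from collections import defaultdict
from datetime import date

def annual_mean_difference(acorn, raw):
    """Mean of (ACORN - Raw) per year, using only days present in both series."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, acorn_value in acorn.items():
        raw_value = raw.get(day)
        if acorn_value is None or raw_value is None:
            continue  # skip days that cannot be paired
        sums[day.year] += acorn_value - raw_value
        counts[day.year] += 1
    return {year: sums[year] / counts[year] for year in sorted(sums)}

# Invented daily maxima for two years.
raw_max = {
    date(1940, 1, 1): 20.0, date(1940, 1, 2): 22.0,
    date(1990, 1, 1): 21.0, date(1990, 1, 2): 23.0,
}
acorn_max = {
    date(1940, 1, 1): 20.5, date(1940, 1, 2): 22.6,
    date(1990, 1, 1): 21.0, date(1990, 1, 2): 23.0,
}

diff = annual_mean_difference(acorn_max, raw_max)
for year, value in diff.items():
    print(year, f"{value:+.2f}")
```

A step in these yearly means at the dates listed in the workbook is what the graph of annualised differences would show.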
Similarly, for the period between 1920 and 1935 (which encompasses 10 of the 12 errors) the intention appears to be that ACORN Maximums should be raised by about 0.1, and the Minimums raised about 0.2. The distribution of these daily adjustments was examined.
And the distributions of those 16 years of daily adjustments are:
So the ACORN Maximum data has moved by a mean of about 0.1, but far from being a constant adjustment, it has a substantial skewed distribution.
Correspondingly, the Minimums have been shifted up about 0.2, but have been given another skewed distribution.
The adjustments have also had an effect on the DTR, as shown. It gives the mean DTR by month for the period 1910-1982. After 1982 Raw and ACORN data are equal. That is a substantial change in DTR, not discussed by the BoM.
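The monthly DTR comparison amounts to averaging (Max minus Min) by calendar month for each dataset and subtracting. A minimal sketch follows, assuming a hypothetical layout of date mapped to (Max, Min), with invented values.

```python
from collections import defaultdict
from datetime import date

def mean_dtr_by_month(series):
    """Return month -> mean diurnal temperature range (Max - Min)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, (t_max, t_min) in series.items():
        if t_max is None or t_min is None:
            continue  # DTR is undefined without both values
        sums[day.month] += t_max - t_min
        counts[day.month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}

# Invented January days in two different years.
raw = {date(1950, 1, 1): (26.0, 18.0), date(1951, 1, 1): (25.0, 19.0)}
acorn = {date(1950, 1, 1): (26.2, 18.3), date(1951, 1, 1): (25.1, 19.2)}

raw_dtr = mean_dtr_by_month(raw)
acorn_dtr = mean_dtr_by_month(acorn)
change = {m: acorn_dtr[m] - raw_dtr[m] for m in raw_dtr if m in acorn_dtr}
for month, delta in change.items():
    print(f"month {month}: DTR change {delta:+.1f}")
```

In this toy case the Max rises slightly and the Min rises more, so the mean DTR narrows even though both series moved in the same direction.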
Cape Otway – BoM 90015
This site is at Cape Otway Lighthouse on the Victorian coast at 38.86S. It has moved only slightly since the 1800s. Adjustments have only been required for screen and thermometer issues.
The spike at 1995 is reported in the Metadata as a failure in the AWS unit.
This station has 63 reported cases where ACORN Max < ACORN Min. Of these, 23 appear in the period 1959-1987, which seems to be a stable period from this graph, and will therefore be examined. BoM adjustments are noted to be:
The distribution of the daily adjustments made to the 1959-87 period is as follows:
The annualised adjustment intention for the period 1959-87 appears to have been to
· Lower Maximums by 0.5
· Raise Minimums by 0.1
Yet some maximums have been lowered by as much as 1.8 when the intended annual mean was 0.5, and some minimums have been raised by as much as 1.1 when the intended annual mean increase was 0.1.
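The arithmetic behind an inversion is worth spelling out. If a Max is lowered by a and the same day's Min is raised by b, the adjusted Max falls below the adjusted Min whenever the raw DTR is smaller than a + b. The sketch below applies the Cape Otway figures quoted above to a hypothetical raw day. It assumes, purely for illustration, that the largest Max and Min adjustments could fall on the same day, which the source does not establish.

```python
def inverted_after(raw_max, raw_min, max_lowered_by, min_raised_by):
    """True if the adjusted maximum ends up below the adjusted minimum."""
    return (raw_max - max_lowered_by) < (raw_min + min_raised_by)

# Hypothetical raw day with a DTR of 1.0 (values invented).
raw_max, raw_min = 12.0, 11.0

# Intended mean adjustments: Max lowered 0.5, Min raised 0.1.
mean_case = inverted_after(raw_max, raw_min, 0.5, 0.1)
# Largest adjustments reported: Max lowered 1.8, Min raised 1.1.
extreme_case = inverted_after(raw_max, raw_min, 1.8, 1.1)

print(f"DTR needed to survive mean adjustments:    {0.5 + 0.1:.1f}")
print(f"DTR needed to survive largest adjustments: {1.8 + 1.1:.1f}")
print("inverted under mean adjustments:", mean_case)
print("inverted under largest adjustments:", extreme_case)
```

Under the intended mean adjustments only days with a raw DTR below 0.6 could invert. The wide spread of the daily adjustments is what puts ordinary low-DTR days at risk.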
An odd artefact was noticed in 90015 Cape Otway. The ACORN data since 2006 is exactly the same as the Raw data, with one exception. This occurs on Dec 24, 2012 when you see
· Raw data: Max 18.6 Min 16.9
· ACORN data: Max 31.8 Min 16.9
That is a puzzling adjustment.
Cabramurra – BoM 72161
Cabramurra (72161) is an alpine (by Australian standards) station at 1482m. It started as Station 72091 in 1962, which was open until April 1999. The replacement station, 72161 is 400m from the earlier one, and opened in 1997. In the following analysis, 1997, 98 and 1999 data are omitted in analysing ACORN adjustments to avoid issues concerning how the merge was handled. So we can compare the “adjusted”, “refined” ACORN-SAT data with a single source of Raw data, either 72091 or 72161 – about 48 years of data.
In total, Cabramurra exhibits 209 cases of Max < Min. This excludes 3 cases where the error occurs in the 1997-1999 period.
A plot of all data, ACORN and Raw, annualised and including the overlap period shows:
The break between the two contributing stations is evident. Looking at the distribution of adjustments made to the Raw data, excluding the 3 year overlap period, we see the same type of distribution seen in earlier Sydney and Cape Otway cases:
The Max distribution is skewed with the tail on the low side. Rising to meet it, the Min distribution is skewed to the high side, in effect colliding with the tail of the Maximums. That collision creates the errors where adjusted Maximums are less than adjusted Minimums. The adjustments, even if appropriate, are excessive. But by how much? We don’t know.
The effect of the ACORN adjustments is to lower the DTR over the period 1962-1996 by about 0.9°C.
Wilson’s Promontory – BoM 85096
This station was supposedly opened in 1872, although publicly available Raw data starts in 1957.
All 79 cases of Max < Min in the ACORN-SAT data occur before 1949, so an analysis of the distribution of adjustments is not possible.
Mackay – BoM 33119/33297/33047/33046
This station has a complex history, stable only since about 1959. See here for detail.
It exhibits 61 errors where Max < Min. All of these appear in the period 1913 to 1958 – that is, in the period where data compositing/homogenization has occurred. An analysis of distribution of adjustments is therefore not practicable.
The “Adjustments” workbook supplied by the BoM purports to record all the adjustments made to Raw data to create ACORN-SAT. It lists, by ACORN Station ID, the changes made to each time series, whether it was Max or Min, a reason, and up to 10 other stations which have been used in some way in the adjustment.
There are 625 line entries in that table. Of these, 100 stations are named where an adjustment was made to the Maximum. There are 103 stations named whose minimums were adjusted. (Remember, there are 112 ACORN-SAT stations). In total there are 109 stations listed as having an adjustment of some sort. Four of these are not ACORN stations. They are 43034, 70014, 84030 and 94069. That leaves 7 ACORN stations that are not listed as having adjustments made. These are:
Four have Max < Min errors as shown, implying that they have been adjusted by some process. The other three listed appear to have not had adjustments, although in each there are ACORN Temps where the corresponding Raw is null, and vice versa.
So the “Adjustments” workbook is an incomplete record of the changes made.
There are 5186 stations listed as contributing to all adjustments. 930 of those are unique, and represent all the stations since 1910 that have contributed in some way to ACORN-SAT.
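The station bookkeeping above is a matter of set arithmetic. The sketch below uses a toy ACORN list and toy workbook rows. The row layout and the contributor labels `N1` to `N3` are hypothetical, and only station 43034 is taken from the text as a listed non-ACORN station. The final line checks the counts quoted above.

```python
# Toy ACORN station list (the real list has 112 entries).
acorn_stations = {"66062", "90015", "72161", "85096"}

# Hypothetical workbook rows: (station_id, element, contributing_stations).
workbook_rows = [
    ("66062", "max", ["N1", "N2"]),
    ("66062", "min", ["N1"]),
    ("90015", "max", ["N3"]),
    ("43034", "max", []),   # listed in the workbook but not an ACORN station
]

listed = {row[0] for row in workbook_rows}
max_adjusted = {row[0] for row in workbook_rows if row[1] == "max"}
non_acorn = listed - acorn_stations   # listed but not in ACORN-SAT
unlisted = acorn_stations - listed    # ACORN stations with no recorded adjustment

contributors = [c for row in workbook_rows for c in row[2]]
print("listed:", sorted(listed))
print("non-ACORN entries:", sorted(non_acorn))
print("ACORN stations not listed:", sorted(unlisted))
print("contributor mentions:", len(contributors), "unique:", len(set(contributors)))

# Counts quoted in the text: 109 listed, 4 of them non-ACORN, 112 ACORN stations.
print("unlisted ACORN stations implied by the text:", 112 - (109 - 4))
```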
1. The ACORN-SAT database method of creation is wrong
Paraphrasing the well-known adage:
No matter how elegant and ground breaking the ACORN-SAT daily homogenised database might be, if it produces absurd results, it is wrong.
The current ACORN-SAT database is wrong. It creates absurd data – in particular, records that say on many occasions that the daily Maximum temperature was less than the Minimum for the same day.
2. The ACORN-SAT creation method is impractical for production use
The construction of the ACORN-SAT dataset is very labour intensive. It was built essentially by one person. The methods employed are not a formally designed sequence of procedures. Rather, they are a description of the possible procedures that might be applied to the data in order to achieve the desired daily homogenised dataset. The author acknowledges this.
The Expert Review Panel politely alluded to this in saying
“4. The Panel also encourages the Bureau to more systematically document the processes used, and to be used, in the development and operations of ACORN-SAT. Some aspects of current arrangements for measurement, curation and analysis are non-transparent even internally, and are therefore subject to significant “key persons risk”, as well as a risk of inconsistency over time.”
That seems to imply that the process of creating and maintaining the ACORN-SAT dataset is known to few people in the BoM (possibly only one person), and that his methods are at least in part, subjective. In my view the current ACORN-SAT dataset is not reproducible – not by the BoM, and not by Blair Trewin, the author.
Trewin mentions this in a very expansive report concerning the construction of ACORN-SAT.
“8. Spatial intercomparison of daily data – first iteration
Data were flagged if the temperature anomaly at the candidate site, T, varied from Tint by more than a specified value L. L was set at either 4°C, 5°C or 6°C, based on a subjective assessment of network density and local climatological gradients”
Subjectivity is also evident in his varying choice of neighbouring stations used to check and homogenise data. That is most undesirable, if the dataset is to be used for anything other than academic interest.
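The quoted rule flags a day when the candidate anomaly T differs from the interpolated value Tint by more than a limit L. The sketch below implements only that comparison, with invented anomalies. How Tint is derived from neighbouring stations is not reproduced here and is taken as given.

```python
def flag_suspect(candidate, interpolated, limit):
    """Return the indices of days where |T - Tint| exceeds the limit L."""
    return [i for i, (t, t_int) in enumerate(zip(candidate, interpolated))
            if abs(t - t_int) > limit]

# Invented daily anomalies (degrees C) for a candidate site and its interpolated estimate.
candidate_anomaly = [1.0, -5.5, 2.0]
interpolated_anomaly = [0.5, 0.0, -2.5]

# The three values of L mentioned in the quoted passage.
flags = {limit: flag_suspect(candidate_anomaly, interpolated_anomaly, limit)
         for limit in (4.0, 5.0, 6.0)}
for limit, days in flags.items():
    print(f"L = {limit}: flagged days {days}")
```

The same three days produce two, one or no flags depending on which L is chosen, which is why a subjective choice of L matters for reproducibility.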
In talking about the extensive checks applied to the ACORN-SAT data, Trewin says
“These checks are being used in preference to processing through QMS, as the checks for ACORN-SAT were carried out on all the necessary temperature data by one individual. The combination of these two factors requires specifically designed tools, that allowed the user to make well-informed decisions by using their detailed knowledge of the observing network and local influences contributing to temperature at particular locations, as discussed in section 6.2. This is in contrast with QMS, which has been primarily designed for data managers (and not necessarily climate scientists) to make use of more labour-intensive interactive tools that cover additional observation quantities such as air pressure and wind speed, and would have been too time-consuming to apply to the volumes of data involved in ACORN-SAT.”
It is curious that Trewin, while implying that much of the Raw BoM temperature data (more than 700 stations) is of a less than desirable quality for ACORN-SAT purposes (because it has only been subjected to BoM “QMS” checks), uses mostly those non ACORN-SAT stations to synthesise, adjust and homogenise components of the ACORN-SAT dataset. So the errors in those 700 stations will become ingrained in the 112 “superior” ACORN-SAT 100 year temporal records.
3. It is the compositing and homogenisation applied to daily data that creates absurdities in the output
Trewin fails to consider that the percentile matching method creates a distribution in both Maximum and Minimum data such that the lower tail of the maximums approaches the upper tail of the minimums. On days when the DTR is small, the two tails effectively collide, creating the absurdity that the “refined, homogenised” Maximum is less than the Minimum for that same day.
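The collision of tails can be illustrated with a toy model. The sketch below is not the BoM's percentile-matching algorithm. It assumes only that an adjustment depends on where a value sits in its own distribution, and that Max and Min are adjusted independently. The reference samples, the linear adjustment rule and its magnitudes are all invented.

```python
from bisect import bisect_right

def percentile_rank(value, sorted_sample):
    """Fraction of the reference sample at or below value (0 to 1)."""
    return bisect_right(sorted_sample, value) / len(sorted_sample)

def adjust(value, sorted_sample, adj_at_bottom, adj_at_top):
    """Apply an adjustment that varies linearly with the value's percentile rank."""
    p = percentile_rank(value, sorted_sample)
    return value + adj_at_bottom + p * (adj_at_top - adj_at_bottom)

# Invented reference samples for one station and month (degrees C), sorted ascending.
maxima = [12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
minima = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

def homogenise(day_max, day_min):
    # Toy rule: the coolest maxima are lowered by 1.0 and the warmest left alone;
    # the coolest minima are left alone and the warmest raised by 0.5.
    new_max = adjust(day_max, maxima, -1.0, 0.0)
    new_min = adjust(day_min, minima, 0.0, 0.5)
    return round(new_max, 1), round(new_min, 1)

typical = homogenise(24.0, 10.0)    # large DTR
overcast = homogenise(12.5, 12.0)   # small DTR: cool maximum, mild minimum

for label, (t_max, t_min) in (("typical", typical), ("overcast", overcast)):
    print(label, t_max, t_min, "INVERTED" if t_max < t_min else "ok")
```

The raw overcast day is valid (Max 12.5, Min 12.0). It inverts because its Max sits in the low tail of the maxima while its Min sits in the upper half of the minima, so the two adjustments pull towards each other.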
CAWCR Report 050 (Fawcett, Trewin, Braganza, Smalley, Jovanovic and Jones – March 2012), reviewing the output of ACORN-SAT, also fails to consider that possibility. Their focus appeared to be on comparing ACORN-SAT output with other published datasets concerning AGW.
No one in the BoM appears to have noticed the bad data in the ACORN-SAT output, or else they have ignored it. The BoM has treated the ACORN-SAT database as two independent sets of temperatures. That is, they have ignored the necessary relationship that must exist between daily Max and Min – that Max must be greater than, or equal to Min. They check the inputs for adherence to this rule, but not the outputs.
The University of East Anglia now incorporate ACORN-SAT data in CRUTEM 4.2.0.0. I’m sure it won’t affect their calculation of global temperatures much, but I wonder if they realise they have included data that is constructed by the BoM to be patently wrong? I wonder if they would care if it is wrong?
4. How much of the ACORN-SAT might be corrupted?
Alice Springs (a hot, dry town in the centre of Australia) is typical of all stations that have been “adjusted”. The following graph shows the annual average DTR for 1941 to 2012, and the count of the days in each year where ACORN and Raw DTRs differed. That count is mostly in the 340 to 350 range. The 15 to 25 balance were mostly where there was missing data.
It is evident that there has been massive adjustment of Max, Min or both in almost every daily record, all a consequence of homogenisation and Percentile Matching. These types of adjustment pervade the whole ACORN-SAT dataset up until about 1990.
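A count of this kind can be produced by comparing the daily DTR in the two datasets and tallying, per year, the days that differ, match, or cannot be compared. The sketch below assumes a hypothetical layout of date mapped to (Max, Min), with invented values. DTR is compared in whole tenths of a degree to avoid floating-point noise.

```python
from collections import defaultdict
from datetime import date

def dtr_tenths(pair):
    """Diurnal range in tenths of a degree, or None if either value is missing."""
    if pair is None or pair[0] is None or pair[1] is None:
        return None
    return round((pair[0] - pair[1]) * 10)

def count_dtr_changes(acorn, raw):
    """Per year, count days where ACORN and Raw DTR differ, match, or are missing."""
    counts = defaultdict(lambda: {"differs": 0, "same": 0, "missing": 0})
    for day in sorted(set(acorn) | set(raw)):
        a = dtr_tenths(acorn.get(day))
        r = dtr_tenths(raw.get(day))
        if a is None or r is None:
            key = "missing"
        elif a != r:
            key = "differs"
        else:
            key = "same"
        counts[day.year][key] += 1
    return {year: dict(c) for year, c in counts.items()}

# Invented values for four days in one year.
raw = {
    date(1950, 1, 1): (30.0, 15.0),
    date(1950, 1, 2): (31.0, 16.0),
    date(1950, 1, 3): (29.5, 14.1),
    date(1950, 1, 4): (None, 13.0),   # raw Max missing
}
acorn = {
    date(1950, 1, 1): (30.4, 15.2),
    date(1950, 1, 2): (31.0, 16.0),
    date(1950, 1, 3): (29.1, 14.3),
    date(1950, 1, 4): (28.0, 13.5),
}

result = count_dtr_changes(acorn, raw)
print(result)
```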
I estimate that 60% of the ACORN-SAT data is corrupted by the homogenising and Percentile Matching methods employed. That is, most of the data from 1910 to 1960, and much of the data between 1960 and 1990.
Look at Blair Trewin’s statistics:
“All observations flagged by the checks described in the previous section were subject to follow-up investigations in order to make a final decision as to whether to accept or reject the value. This was the most time-consuming part of the project as several hundred thousand observations were involved (out of a total of about 7 million observations in the ACORN-SAT data set).
As a result of these follow-up investigations, 18,400 individual observations and 515 blocks of observations of three or more days were flagged as suspect and excluded from further analysis or amended, while 50 blocks of observations were shifted in time (mostly maximum temperatures brought forward by one day, but also including a few cases of months that had been swapped). The bulk of these issues were between 1957 and the early 1970s. Relatively few errors were identified after the early 1970s (and particularly after the mid-1990s), presumably because of improved quality-control procedures over time, whilst most pre-1957 data were only digitised in the last 10 years and therefore also underwent relatively effective quality control.”
5. BoM use of ACORN-SAT for their stated purposes is unsound.
The BoM has recently developed a pre-occupation with “Extreme Events”, reporting these events in “Special Climate Statements” such as this one.
“Special Climate Statement 43 – extreme heat in January 2013
Fourteen of the 112 stations in the Bureau’s long-term high-quality temperature observation network (ACORN-SAT) set all-time record high maximum temperatures during the 2013 event (Table 1), with a fifteenth (Mount Gambier) equalling its record.
No previous event has resulted in so many records at ACORN-SAT stations, the previous benchmark being set in the January 1939 heatwave, in which eleven ACORNSAT stations set records and three equal records.”
The BoM states the purpose of ACORN-SAT to be:
“… to provide the best possible data set to underlie analyses of variability and change of temperature in Australia, including both analyses of annual and seasonal mean temperatures, and of extremes of temperature and other information derived from daily temperatures.
Documentation and traceability of the data and adjustments at all stages, an increasing priority as described in Thorne et al. (2011), are also a high priority in the ACORN-SAT data set.”
Warwick Hughes, in correspondence with the BoM (Comment 5 here) was told that:
“…. the (ACORN-SAT) data could be considered ‘official’ at the time of publication, but subject to the qualification that the data are subject to change. In other words, the data are kept current. That is, they are subject to retrospective changes as required. This includes changes to account for additional digitised historical data, additional quality control, and changes, corrections and updates to methodologies. All these occur operationally as required.”
That is, the BoM will use the 100 year ACORN-SAT dataset to highlight record-breaking extremes, while reserving the right to go back in time to adjust earlier data “as required”. That does not sound like good scientific practice.
But perhaps the BoM is having second thoughts about the merits of ACORN-SAT. When Warwick asked:
“Is it your intention to update ACORN-SAT regularly, and if ‘Yes’, at what frequency will those releases occur, and how long after the end of the reporting period will they appear ?”
The reply was:
The Bureau has no official reporting period for ACORN-SAT. The Bureau produces monthly, seasonal and annual summaries, but these are not coupled to specific data set development. The AWAP data are updated daily, including real-time spatial homogenisation, and published publicly on the web the next day.
The ACORN-SAT data set is updated in real-time each day, internally, by the Bureau, and that data is used in reporting as required. The ACORN-SAT data set will be updated publicly online around once a year, though this is subject to various considerations. Complete revisions of the data will be required from time to time, and tracked via version control, to account for changes such as those mentioned in point 1 above. It is impossible to temporally homogenise data in real-time, as opposed to the spatial homogenisation that is performed for AWAP. Only limited temporal homogenisation can be applied after gathering an additional year of data. A full analysis of required temporal homogenisation will be applied to new data at five to ten year intervals. This could be shorter in the event of a significant systematic change affecting the underlying temperature network (e.g. a change in observing practice causing significant data inhomogeneities).
The latest 2013 update to ACORN-SAT shows very minor corrections, possibly applied by BoM’s QMS system, rather than ACORN.
Three West Australian stations, Dalwallinu, Bridgetown and Katanning, were flagged to be replaced in the first ACORN-SAT station catalog. They appear to have closed in August 2012, but no replacements have appeared.
The commitment to a five to ten year update interval suggests a lack of enthusiasm for the ACORN-SAT product. Even UEA does better, with a lot more data.
6. ACORN-SAT can cause headaches
I started looking at ACORN-SAT when Blair Trewin, the architect of the product, provided this very thorough report (CAWCR Report No. 049), this set of Python code to describe the “system” that creates the product, and this description of the adjustments he actually made to the raw data.
My intention was to treat the report as a programming specification, then define an automated system that would create the product.
I came to a halt at this point of Report 049:
6. Data quality control within the ACORN-SAT data set (p. 30)
6.1 Quality control checks used for the ACORN-SAT data set (p. 31)
6.2 Follow-up investigations of flagged data (p. 39)
6.3 Common errors and data quality problems (p. 41)
6.4 Treatment of accumulated data (p. 42)
7. Development of homogenised data sets (p. 42)
7.1 What particular issues exist for climate data homogenisation in Australia? (p. 43)
7.2 The detection of inhomogeneities (p. 44)
7.3 Adjustment of data to remove inhomogeneities – an overview (p. 47)
7.4 The percentile-matching (PM) algorithm (p. 49)
7.4.1 The overlap case (p. 49)
7.4.2 The non-overlap case (p. 50)
7.5 Monthly adjustment method (p. 53)
7.6 Evaluation of different adjustment methods (p. 53)
7.7 Implementation of data adjustment in the ACORN-SAT data set (p. 60)
7.8 Identification of locations whose extremes were not homogenisable (p. 64)
I invite you to study these sections of the report where I bogged down, and failed. You too can share my headache.