Should We Worry About the Earth's Calculated Warming of 0.7C Over the Last 100 Years When the Observed Daily Variations Over the Last 161 Years Can Be as High as 24C?
Guest post by Dr. Darko Butina
In Part 1 of my contribution I discussed the part of the paper that describes the first step of data analysis, known as the 'get-to-know-your-data' step. The key features of that step are to establish the accuracy of the instrument used to generate the data and the range of the data itself. The importance of knowing those two parameters cannot be emphasised enough, since they pre-determine what information and knowledge one can gain from the data. In the case of a calibrated thermometer with an accuracy of +/-0.5C, this means that any difference in the data smaller than 1C has to be treated as 'no information', since it is within the instrumental error, while every variation in the data larger than 1C can be treated as real. As I showed in that report, the daily fluctuations in the Armagh dataset vary between 10C and 24C, and those variations are therefore real. In total contrast, all fluctuations in the theoretical space of annual averages are within the error of the thermometer, and it is therefore impossible to extract any knowledge from those numbers.
So let me start this second part, in which I will quantify the differences between the annual temperature patterns, with a scheme that explains how a thermometer works. Please note that this comes from NASA's engineers, specialists who actually know what they are doing, in contrast to their colleagues in the modelling sections. What a thermometer detects is the kinetic energy of the molecules that surround it, and therefore a thermometer reflects the physical reality around it. In other words, the data generated by a thermometer reflect the physical property called temperature of the molecules (99% N2 and O2, plus water) that surround it:
Let us now plot all of the Armagh data as annual fingerprints and compare them with the annual averages that are obtained from them:
Graph 1. All years (1844-2004) in the Armagh dataset, as original daily recordings, displayed on a single graph with a total range between -16C and +32C
Graph 2. All years (1844-2004) in the Armagh dataset, in the (calculated) annual-averages space, with the trend line in red
Please note that I am not using any of 'Mike's tricks' in Graph 2: the Y-axis range is identical to the Y-axis range in Graph 1. Since Graph 2 is created by averaging the data in Graph 1, it has to be displayed using the same temperature range to demonstrate what happens when a 730-dimensional space is reduced to a single number by the 'averaging-to-death' approach. By the way, I am not sure whether anyone has realised that not only has no paper analysing thermometer data been written by the AGW community, but not a single paper has been written that validates the conversion of Graph 1 into Graph 2 – NOT A SINGLE PAPER! I have a good idea, actually I am certain, why that is the case, but I will let the reader make up his or her own mind about that most unusual approach of inventing a new proxy-thermometer without bothering to explain the validity of the whole process to the wider scientific community.
The main reason for displaying the two graphs above is to help me explain the main objective of my paper, which is to test whether the hockey stick scenario of global warming, which was detected in the theoretical space of annual averages, can be found in the physical reality of the Earth's atmosphere, i.e. in the thermometer data. The whole AGW hypothesis is based on the idea that the calculated numbers are real and the thermometer data are not, while the opposite is true. Graph 1 is reality, and Graph 2 is a failed attempt to use averages to represent reality.
The hockey stick scenario can be represented as a two-line graph consisting of a baseline and an up-line:
The main problem we now have is to 'translate' the 730-dimensional problem, as in Graph 1, into a two-line problem without losing the resolution of our 730-datapoint fingerprints. The solution can be found in the scientific field of pattern recognition, which deals with finding patterns in complex data without simplifying the original data. One of the standard approaches is to calculate the distance between two patterns, and one of the gold standards is the Euclidean distance, let us call it EucDist:
There are three steps involved in calculating it: square the difference between each pair of corresponding datapoints, sum those squares, and take the square root of that sum. The range of EucDist is anywhere between '0', when two patterns are identical, and a very large positive number – the larger the number, the more distant the two patterns are. One useful feature of EucDist in our case is that the distance can be translated back into temperature terms by 'back-calculating'. For example, when EucDist = 80.0 it means that the average difference between any two daily temperatures is 3.14C (a minimal sketch of this arithmetic in C follows the list below):
1. 80 is the square root of 6400
2. 6400 is the sum of the squared differences across 649 datapoints: 6400/649 = 9.86
3. 9.86 is the average squared difference between any two datapoints, and the square root of 9.86 is 3.14
4. Therefore, when two annual temperature patterns are 80 apart in EucDist space, their baseline or normal daily 'noise' is 3.14C
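To make the arithmetic above concrete, here is a minimal sketch in C (the language the author says both algorithms are written in) of the EucDist calculation and its back-calculation to an average daily difference. The function names are illustrative assumptions for this sketch only; the 649-datapoint length comes from the worked example above.

#include <stdio.h>
#include <math.h>

#define NDAYS 649  /* number of daily readings shared by the two years being compared */

/* Euclidean distance between two annual temperature fingerprints */
double euc_dist(const double *year_a, const double *year_b, int n)
{
    double sum_sq = 0.0;
    for (int i = 0; i < n; i++) {
        double diff = year_a[i] - year_b[i];
        sum_sq += diff * diff;
    }
    return sqrt(sum_sq);
}

/* Back-calculate the average daily temperature difference from a EucDist value */
double avg_daily_diff(double euc, int n)
{
    return sqrt((euc * euc) / n);   /* e.g. sqrt(80*80/649) = 3.14 */
}

int main(void)
{
    printf("EucDist 80  over %d days -> %.2fC average daily difference\n",
           NDAYS, avg_daily_diff(80.0, NDAYS));
    printf("EucDist 110 over %d days -> %.2fC average daily difference\n",
           NDAYS, avg_daily_diff(110.0, NDAYS));
    return 0;
}

Compiled and run, this prints 3.14C for a EucDist of 80 and 4.32C for a EucDist of 110, the two values used later in the clustering analysis.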
Let me now very briefly introduce the two algorithms that will be used: the clustering algorithm dbclus, my own algorithm, which I published in 1999 and which has since become one of the standards in the field of similarity and diversity of chemical structures, and k Nearest Neighbours, or kNN, which is a standard in the fields of datamining and machine learning.
The basic principle of dbclus is to partition a given dataset into clusters and singletons using an 'exclusion circles' approach, in which the user gives the algorithm a single instruction: the radius of that circle. The smaller the radius, the tighter the clusters. Let me give you a simple example to help you visualise how dbclus works. Let us build a matrix of distances between every body in our solar system, where each body's fingerprint contains its distance to all the other bodies. If we start a clustering run at EucDist=0, all the planets will be labelled as singletons, since they all occupy different grid points in space. If we keep increasing the radius of the (similarity) circle, at some stage we will detect the formation of the first clusters and would find a cluster that has the Earth as its centroid and only one member – the Moon. And if we keep increasing that radius to some very large number, all the planets of our solar system will eventually merge into a single cluster, with the Sun as the cluster centroid and all the planets as cluster members. By the way, due to a copyright agreement with the publisher, I can only link my papers on my own website, which will go live by mid-May, where free PDF files will be available. My clustering algorithm has been published as pseudo-code, so any of you with programming skills can implement the algorithm in any language of your choice. Also, all the work involving dbclus and kNN was done on a Linux-based laptop, and both algorithms are written in C.
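Since the published dbclus pseudo-code is not reproduced here, the following is only a rough sketch in C of the general exclusion-circle idea described above, not the author's program: repeatedly take the unassigned pattern with the most unassigned neighbours inside the chosen radius, make it a cluster centroid, assign those neighbours to it, and report whatever ends up alone as a singleton. The toy fingerprints and all names are illustrative assumptions.

#include <stdio.h>
#include <math.h>

#define NPATTERNS 4   /* illustrative only; Armagh has 161 annual fingerprints */
#define NDAYS     3   /* illustrative only; the real fingerprints are far longer */

/* hypothetical toy data: each row is one annual fingerprint */
static const double patterns[NPATTERNS][NDAYS] = {
    { 5.0,  6.0,  7.0},
    { 5.1,  6.2,  6.9},
    {15.0, 16.0, 17.0},
    { 9.0, 10.0, 11.0},
};

static double euc_dist(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) { double d = a[i] - b[i]; s += d * d; }
    return sqrt(s);
}

/* Greedy exclusion-circle clustering at a user-supplied radius */
static void cluster(double radius)
{
    int label[NPATTERNS] = {0};       /* 0 = not yet assigned */
    int next_cluster = 1;

    for (;;) {
        /* find the unassigned pattern with the most unassigned neighbours */
        int best = -1, best_count = -1;
        for (int i = 0; i < NPATTERNS; i++) {
            if (label[i]) continue;
            int count = 0;
            for (int j = 0; j < NPATTERNS; j++)
                if (!label[j] && j != i &&
                    euc_dist(patterns[i], patterns[j], NDAYS) <= radius)
                    count++;
            if (count > best_count) { best_count = count; best = i; }
        }
        if (best < 0) break;          /* everything has been assigned */

        /* make it a centroid and pull its unassigned neighbours into the cluster */
        label[best] = next_cluster;
        for (int j = 0; j < NPATTERNS; j++)
            if (!label[j] && euc_dist(patterns[best], patterns[j], NDAYS) <= radius)
                label[j] = next_cluster;
        next_cluster++;
    }

    printf("radius %.1f:", radius);
    for (int i = 0; i < NPATTERNS; i++)
        printf("  pattern %d -> cluster %d", i, label[i]);
    printf("\n");
}

int main(void)
{
    cluster(1.0);    /* small radius: mostly singletons */
    cluster(20.0);   /* large radius: everything merges into one cluster */
    return 0;
}

With the small radius the two near-identical toy patterns form the only cluster and the rest remain singletons; with the large radius everything merges into a single cluster, mirroring the behaviour described in the planets analogy.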
Let us now go back to the hockey stick and work out how to test that hypothesis using a similarity-based clustering approach. For the hockey stick scenario to work, you need two different sets of annual temperature patterns: one set of almost identical patterns, which forms the horizontal baseline, and one set that is very different and forms the up-line. So if we run a clustering run at EucDist=0, or very close to it, all the years from 1844 up to, say, 1990 should be part of a single cluster, while the 15 years between 1990 and 2004 should either form their own cluster(s) or, most likely, be detected as singletons. If the hockey stick scenario is real, the youngest years MUST NOT be mixed with the oldest years:
The very first thing that becomes clear from Table 4 is that there are no two identical annual patterns in the Armagh dataset. The next thing to notice is that up to a EucDist of 80 all the annual patterns still remain singletons, i.e. all the years are perceived as unique, with the minimum distance between any pair being at least 80. The first cluster is formed at EucDist=81 (d-81) and consists of only two years, 1844 and 1875. At EucDist 110, all the years have merged into a single cluster. Therefore, the overall profile of the dataset can be summarised as follows:
· All the years are unique up to a EucDist of 80
· All the years are part of a single cluster, and therefore 'similar', at a EucDist of 110
Now we are in a position to quantify differences and similarities within the Armagh historical data.
The fact that any two years are at least 80 apart in EucDist space while remaining singletons translates into a minimum average variation in daily readings of 3.14C between any two years in the database.
At the other extreme, all the years merge into a single cluster at a EucDist of 110, and using the same back-calculation as was done earlier for a EucDist of 80, an average variation between daily readings of 4.32C is obtained.
The first place to look for the hockey stick's signal is the run at EucDist=100, which partitioned the Armagh data into 6 clusters and 16 singletons; the check is whether those 16 singletons come from the youngest 16 years:
As we can see, those 16 singletons come from three different periods: 3 from the 1844-1900 period, 5 from the 1900-1949 period and 8 from the 1950-1989 period. So the hockey stick scenario cannot be detected in the singletons.
What about the clusters – are there any 'clean' clusters, containing only the youngest years in the dataset?
No hockey stick could be found in the clusters either! The years from the 1990-2004 period are partitioned between 4 different clusters, and each of those clusters is mixed with the oldest years in the set. Therefore the hockey stick hypothesis has to be rejected on the basis of the clustering results.
Let me now introduce the kNN algorithm, which will give us even more information about the youngest years in the dataset. The basic principle of kNN is very similar to that of my clustering algorithm, but with one difference: dbclus can be seen as an un-biased view of your dataset, where only the similarity within a cluster drives the algorithm, while the kNN approach allows the user to specify which datapoints are to be compared with which dataset. For example, to run the algorithm the following command is issued:
"kNN target.csv dataset.csv 100.00 3", which translates as: run kNN on every datapoint in the target.csv file against the dataset.csv file at EucDist=100.00 and find the 3 nearest neighbours for each datapoint in the target.csv file. So in our case, we will find the 3 most similar annual patterns in the entire Armagh dataset for each of the 15 youngest years in the dataset:
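The kNN step itself is simple enough to sketch. The C fragment below is not the author's kNN program (which reads the target.csv and dataset.csv files from the command line); it is a self-contained illustration, on hypothetical toy fingerprints, of finding the 3 nearest neighbours of a single target pattern using the same EucDist measure.

#include <stdio.h>
#include <math.h>

#define K     3
#define NDAYS 3   /* illustrative fingerprint length */

static double euc_dist(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) { double d = a[i] - b[i]; s += d * d; }
    return sqrt(s);
}

/* Find the K nearest neighbours of one target fingerprint in the dataset
 * by simple insertion into a small sorted list. */
static void knn(const double *target, const double dataset[][NDAYS], int nrows)
{
    int    best_idx[K];
    double best_dist[K];
    for (int k = 0; k < K; k++) { best_idx[k] = -1; best_dist[k] = 1e30; }

    for (int r = 0; r < nrows; r++) {
        double d = euc_dist(target, dataset[r], NDAYS);
        for (int k = 0; k < K; k++) {
            if (d < best_dist[k]) {
                for (int m = K - 1; m > k; m--) {   /* shift worse hits down */
                    best_dist[m] = best_dist[m - 1];
                    best_idx[m]  = best_idx[m - 1];
                }
                best_dist[k] = d;
                best_idx[k]  = r;
                break;
            }
        }
    }

    for (int k = 0; k < K; k++)
        printf("neighbour %d: dataset row %d at EucDist %.2f\n",
               k + 1, best_idx[k], best_dist[k]);
}

int main(void)
{
    /* hypothetical toy fingerprints standing in for the Armagh years */
    const double dataset[][NDAYS] = {
        { 5.0,  6.0,  7.0}, { 5.2,  6.1,  7.3},
        {15.0, 16.0, 17.0}, { 9.0, 10.0, 11.0},
    };
    const double target[NDAYS] = { 5.1, 6.0, 7.1 };

    knn(target, dataset, 4);
    return 0;
}

In the real analysis the same search is simply repeated for each of the 15 target years against all 161 annual fingerprints.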
Let me pick a few examples from Figure 8: the year 1990 has its most similar annual patterns in the years 1930, 1850 and 1880; supposedly the hottest year, 1998, is most similar to the years 1850, 1848 and 1855; while 2004 is most similar to 1855, 2000 and 1998. So the kNN approach not only confirms the clustering results, which it should since it uses the same distance calculation as dbclus, but it also identifies the 3 most similar years for each of the 15 youngest years in Armagh. So, any way you look at the Armagh data, the same picture emerges: every single annual fingerprint is unique and different from every other; the similarity between the years is very low; it is impossible to separate the oldest years from the youngest years; and the magnitude of those differences, in terms of temperatures, is way outside the error levels of the thermometer and therefore real. To put this into the context of the hockey stick hypothesis: since we cannot separate the oldest years from the youngest ones in the thermometer data, it follows that whatever was causing the daily variations in 1844 is causing the same variations today. And that is not the CO2 molecule.
Let us now ask a very valid question: is the methodology I am using sensitive enough to detect extreme events? The first thing to bear in mind is that all dbclus and kNN do is calculate the distance between two patterns made of the original readings; there is nothing inside those two pieces of software that modifies or adjusts the thermometer readings. Anyone can take two years from the Armagh data, calculate EucDist in Excel and arrive at the same number that is reported in the paper, i.e. I am neither creating nor destroying hockey sticks inside the program, unlike some scientists whose names cannot be mentioned. While the primary objective of the cluster analysis, and the main objective of the paper, was to see whether a hockey stick signal can be found in the instrumental data, I have also looked into the results to see whether any other unusual pattern can be found. One year that 'stubbornly' refused to merge into the final cluster was 1947, the same year that has been identified as 'very unique' at 6 different weather stations in the UK, all at lower resolution than Armagh, either as monthly averages or as Tmax/Tmin monthly averages. So what is so unusual about 1947? For this analysis I created the two boundaries that define the 'normal' range, known in statistical terms as the 2-sigma region, which covers approximately 95% of the dataset, and placed 1947 inside those two boundaries. The top of the 2-sigma region is defined by adding 2 standard deviations to the mean, and the bottom by subtracting 2 standard deviations from the mean. Any datapoint that ventures outside the 2-sigma boundaries is considered 'extreme':
As we can see, 1947 has most of February in the 3-sigma cold region and most of August in the 3-sigma hot region, illustrating the problem with using abstract terms like an abnormally hot or cold year. So is 1947 extremely hot, extremely cold, or an overall average year?
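The 2-sigma boundaries described above are straightforward to reproduce. Here is a minimal C sketch, on hypothetical toy data, that computes the day-by-day mean and standard deviation across all years, builds the mean plus/minus 2 standard deviation boundaries, and flags which days of a test year (standing in for 1947) fall outside them; the array sizes and values are illustrative assumptions only.

#include <stdio.h>
#include <math.h>

#define NYEARS 5   /* illustrative; the real dataset has 161 years */
#define NDAYS  4   /* illustrative; the real fingerprints cover the whole year */

/* hypothetical toy data: NYEARS annual fingerprints */
static const double years[NYEARS][NDAYS] = {
    { 4.0, 10.0, 20.0, 11.0},
    { 5.0, 11.0, 21.0, 12.0},
    { 6.0,  9.0, 19.0, 10.0},
    { 5.5, 10.5, 20.5, 11.5},
    { 4.5,  9.5, 19.5, 10.5},
};

/* the year being tested against the 2-sigma boundaries */
static const double test_year[NDAYS] = { 1.0, 10.0, 24.0, 11.0 };

int main(void)
{
    for (int d = 0; d < NDAYS; d++) {
        /* day-by-day mean across all years */
        double mean = 0.0;
        for (int y = 0; y < NYEARS; y++) mean += years[y][d];
        mean /= NYEARS;

        /* day-by-day standard deviation across all years */
        double var = 0.0;
        for (int y = 0; y < NYEARS; y++) {
            double diff = years[y][d] - mean;
            var += diff * diff;
        }
        double sd = sqrt(var / NYEARS);

        double top    = mean + 2.0 * sd;   /* upper 2-sigma boundary */
        double bottom = mean - 2.0 * sd;   /* lower 2-sigma boundary */

        const char *flag = "normal";
        if (test_year[d] > top)    flag = "EXTREME hot";
        if (test_year[d] < bottom) flag = "EXTREME cold";

        printf("day %d: mean %.2f  2-sigma [%.2f, %.2f]  test %.2f  %s\n",
               d, mean, bottom, top, test_year[d], flag);
    }
    return 0;
}

A year like 1947 can then sit well below the lower boundary for one part of the year and well above the upper boundary for another, which is exactly the point made above about labelling a whole year 'hot' or 'cold'.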
Let me finish this report with a simple computational experiment to further demonstrate what is so horribly wrong with the man-made global warming hypothesis. Let us take a single day-fingerprint, in this case Tmax207, and use the last year, 2004, as the artificial point where the global (local) warming starts: add 0.1C to the 2004 value, then another 0.1C to the previous value, and continue that for ten years. So the last artificial year is 1C hotter than its starting point, 2004. When you now display the daily patterns for 2004 plus the 10 artificial years that have been continuously warming at a rate of 0.1C, you can immediately see a drastic change in the overall profile of the day-fingerprints:
What would be worrying, if Figure 10 were based on real data, is that a very small but continuous warming trend of only 0.1C per annum would completely change the whole system from being chaotic, with large fluctuations, into a very ordered linear system with no fluctuations at all.
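For completeness, the artificial series described above is trivial to generate. A minimal C sketch, using an illustrative starting value rather than the actual Tmax207 reading for 2004:

#include <stdio.h>

int main(void)
{
    /* illustrative starting value standing in for the real Tmax207 reading in 2004 */
    double tmax207 = 20.0;

    printf("year 2004: %.1fC (starting point)\n", tmax207);
    for (int i = 1; i <= 10; i++) {
        tmax207 += 0.1;   /* each artificial year is 0.1C warmer than the last */
        printf("year %d: %.1fC\n", 2004 + i, tmax207);
    }
    /* after ten artificial years the series sits exactly 1.0C above its starting point */
    return 0;
}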
So let me now summarise the whole paper: there is not a single piece of experimental evidence of any alarming warming or cooling in the Armagh data, or in the sampled data from two other continents, North America and Australia, since not a single paper has been published, before this one, that analyses the only instrumental data that do exist – the thermometer data; we do not understand the temperature patterns of the past or the present and therefore we cannot predict the temperature patterns of the future; all temperature patterns across the globe are unique and local, and everything presented in this paper confirms those facts. Every single aspect of man-made global warming is wrong and is based on a large number of assumptions that cannot be made and arguments that cannot be validated: the alarming trends are all within the thermometer's error levels and therefore have no statistical meaning; not a single paper has been published that has found alarming trends in thermometer data; and not a single paper has been published validating the reduction of a 730-dimensional, time-dependent space into a single number.
Let me finish this report on a lighter note and suggest a very cheap way of detecting the arrival of global warming, if it ever does come to visit the Earth: let us stop funding any future work on global warming and instead simply monitor and record the accuracy of next-day temperature predictions! If you look at the above graph, it becomes obvious that once the next-day temperature predictions become 100% accurate, that will be a clear and unequivocal sign that global warming has finally arrived, using the following logic:
· chaotic system=no warming or cooling=0% next day prediction accuracy
· ordered-linear system=global warming=100% next day prediction accuracy
And let me leave you with two take-home messages:
· All knowledge is in instrumental data, which can be validated, and none is in calculated data, which can be validated only by yet another calculation
· We must listen to data and not force data to listen to us. As they say, if you torture data enough it will admit anything.
===============================================================================
Dr Darko Butina is a retired scientist with 20 years of experience on the experimental side of carbon-based chemistry and 20 years in pattern recognition and datamining of experimental data. He was part of the team that designed the first effective drug for the treatment of migraine, for which the UK-based company received The Queen's Award. Twenty years on, the drug molecule sumatriptan has improved the quality of life of millions of migraine sufferers worldwide. On the computational side of drug discovery, he developed the clustering algorithm dbclus, which is now a de facto standard for quantifying diversity in the world of molecular structures and has recently been applied to the thermometer-based data archived at weather stations in the UK, Canada and Australia. The forthcoming paper clearly shows what is so very wrong with the use of invented and non-existent global temperatures and why it is impossible to declare one year either warmer or colder than any other year. He is also one of the co-authors of a paper that was awarded the prestigious Ebert Prize as best paper of 2002 by the American Pharmaceutical Association. He is a peer reviewer for several international journals dealing with the modelling of experimental data and a member of an EU grants committee in Brussels.
He makes such elegant sense.
I have to agree AC. One of the recurring problems has been the naive approach to data.
How is it possible that the same brain that devised dbclus, which sounds like a useful procedure, can also believe that Figure 10 is a reasonable expectation of what will happen under global warming? Why should weather stop when global warming starts? Annual fluctuations should be superimposed on the trend, not replaced by the trend.
Part one of this series was silly, this is even sillier.
“1047” should be “1947” just above Fig. 9
[Reply: Fixed, thanks. -ModE ]
Quote from paper: "Every single aspect of man-made global warming is wrong and is based on a large number of assumptions that cannot be made and arguments that cannot be validated: the alarming trends are all within the thermometer's error levels and therefore have no statistical meaning; not a single paper has been published that has found alarming trends in thermometer data; and not a single paper has been published validating the reduction of a 730-dimensional, time-dependent space into a single number."
There you are, MSM, “front-page” news and lead-off stories all done for ya.
(Don’t anyone hold their breath….)
Bravo, Dr. Butina! Thank you for so generously sharing (and so patiently explaining) the fruits of your labor with us. That you are a master of your subject is proven, I believe, by the fact that a non-science major like I understood (well, I THINK I understood!) most of what you wrote. You are not only a fine scientist, but a teacher par excellence. A bright scientist can discover truth, but only a true master (and I consider “master” in this context to include both males and females) can teach it.
Q. Would it be more accurate to change: “…once the next day temperature predictions become 100% accurate it will be clear and unequivocal sign that the global warming has finally arrived…” (from “lighter note” at end of paper) TO READ: “… once the [predictions of] next day’s [temperature relative to the previous day’s temperature] become 100% accurate it will be clear… .” ? That is to 100% accurately state: either “Tomorrow, it will be warmer,” or “Tomorrow it will be cooler.” I AM TRYING TO SAY WHAT I MEAN (I know I mean what I say… arrgh) #:)
If the absolute temperature value needs to be predicted, if you would be so kind, please explain. Thanks!
Elegant, simply presented and absolutely brilliant! Answers a lot of questions that I have had about the silliness that passes for science in some quarters.
Besides the need to recognise the instrument's limitations, some sort of allowance needs to be made for reading inaccuracies. I have no opinion as to where that allowance should be set, but any allowance at all would have the effect of flattening any trend line.
Dr. Darko Butina,
I found your web site, and the PDF describing dbclus, but didn't see a reference to pseudo-code or C code; could you kindly point me in the right direction?
Only 24C? I have personally witnessed 105F temperatures in the day time in the Black Rock Desert drop to 18F overnight with all water not in coolers frozen solid at dawn. A drop of 48C in 12 hours. This is not super uncommon.
Fascinating.
Robert Wycoff:
In case you missed Part 1, the author is working with the Armagh data set, not the Black Rock Desert data set.
Robert Wykoff at 2:34 pm
Dr Butina's work relates to a single weather station daily dataset collected at Armagh Observatory (UK) between 1844 and 2004.
I’ve just never understood how such a tiny rise in temperature could be converted into the most horrible event in the history of mankind. How could so many be so thoroughly fooled? Of course, it should never be forgotten that half of the people on Earth are below average in intelligence.
The crock gets smellier the more you think about it from a real-world point-of-view.
I suppose the witchcraft mania got worse the more the skeptic thought about it, too, and the obvious insanity of it must have boggled them the same way CAGW is boggling us.
If you did the same thing with the ocean heat content and temperature, you would find this: that the reality is just a statistical artefact without any meaning other than to academics pursuing ideology and grants.
I am reminded of the arguments about how many angels can dance on the head of a pin.
It would also be useful for an experimentalist to analyze the appropriateness of propagating the 0.5C ‘instrumental error’ when the thermometer is used to provide the annual average gridcell temperature.
People prominent in the debate feel a mere 67 thermometers is sufficient coverage.
Not in disrespect, but I had a chuckle when the Dr. says that he took his calibration and how a thermometer works from NASA engineers that "actually know what they are doing"
must have been after Hansen left
richard telford says:
April 17, 2013 at 1:47 pm
How is it possible that the same brain that devised dbclus, which sounds like a useful procedure, can also believe that Figure 10 is a reasonable expectation of what will happen under global warming? Why should weather stop when global warming starts? Annual fluctuations should be superimposed on the trend, not replaced by the trend.
Part one of this series was silly, this is even sillier.
######################
ya I didn't think it was possible to outdo the howlers in the first part.
From Darko Butina
My website darkobutina@l4patterns.com should be live this weekend (it only has a test-run page at the moment) with the full paper (free) and the dbclus paper (as pseudo-code) as well. May I express my gratitude to Anthony for his vision and dedication to science in allowing my paper to be presented on this website, knowing that it will upset all those readers who use global averages as reference points for everything and have forgotten to look into the instrumental data. I salute you, Anthony.
Darko Butina
Dr Butina,
What is the effect of mixing multiple data sets, from the pole to the equator, mixing summer and winter temperatures, where 30% of the data sets show cooling trends, to look for a hockey stick in the fictitious "average global temperature"?
I would recommend to Dr. Butina that he consider that on any typical northern midsummer day, the hottest surface Temperature on earth could be as high as +60 deg C (140 deg F), with an air Temperature of +55 deg C. At the other end of the earth, it is Antarctic winter midnight, and Temperatures could be as low as about -90 deg C (-130 deg F) at places like Vostok Station.
So that's a possible extreme daily range of 150 deg C, and due to an argument by Galileo Galilei, there must be some spot somewhere where any Temperature in that range can be found. In fact, there are an infinite number of such points.
So your 24 deg C is just the tip of the iceberg.
0.7 deg C in 150 years is peanuts.
Well, the first article didn't make sense to me, and this one doesn't either. Even the analogy isn't apt. Planets are physical bodies that exist at any given time in a specific location in 3-dimensional space. Likening that to temperature data is nonsensical. Telford's points are also valid. I think I see the value of your approach for certain applications; this just isn't one of them.
I have to agree with the incredulous responses. This paper reminds me of Zeno's paradox, the argument that in order to move from one point to another, we must first cross the half-way mark… and since any number can be halved, then halved again to infinity, it is clearly impossible to move from one place to another and the world therefore IS NOT REAL.
Dr Butina is of course also refuting the concept of seasons. It follows from this paper that no living person has ever truly witnessed summer follow spring.
You are right on. You dun good.
"Should We Worry About the Earth's Calculated Warming of 0.7C Over the Last 100 Years When the Observed Daily Variations Over the Last 161 Years Can Be as High as 24C?"
wow. let the clueless inspire. And, I am sure some will find this inspiring.