Guest Post by Kip Hansen
This post does not attempt to answer questions – instead, it asks them. I hope to draw on the expertise and training of the readers here, many of whom are climate scientists (both professional and amateur), statisticians, researchers in various scientific and medical fields, engineers, and members of many other highly trained and educated professions.
The NY Times, and thousands of other news outlets, covered both the loud proclamations that 2014 was “the warmest year ever” and the denouncements of those proclamations. Some, like the NY Times Opinion blog, Dot Earth, unashamedly covered both.
Dr. David Whitehouse, via The GWPF, counters in his post at WUWT – UK Met Office says 2014 was NOT the hottest year ever due to ‘uncertainty ranges’ of the data — with the information from the UK Met Office:
“The HadCRUT4 dataset (compiled by the Met Office and the University of East Anglia’s Climatic Research Unit) shows last year was 0.56C (±0.1C*) above the long-term (1961-1990) average. Nominally this ranks 2014 as the joint warmest year in the record, tied with 2010, but the uncertainty ranges mean it’s not possible to definitively say which of several recent years was the warmest.” And at the bottom of the page: “*0.1° C is the 95% uncertainty range.”
The David Whitehouse essay included this image – HADCRUT4 Annual Averages with bars representing the +/-0.1°C uncertainty range:
The journal Nature has long had a policy of insisting that papers containing figures with error bars describe what the error bars represent, so I thought it would be good in this case to see exactly what the Met Office means by “uncertainty range”.
In its FAQ, the Met Office says:
“It is not possible to calculate the global average temperature anomaly with perfect accuracy because the underlying data contain measurement errors and because the measurements do not cover the whole globe. However, it is possible to quantify the accuracy with which we can measure the global temperature and that forms an important part of the creation of the HadCRUT4 data set. The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius. The difference between the median estimates for 1998 and 2010 is around one hundredth of a degree, which is much less than the accuracy with which either value can be calculated. This means that we can’t know for certain – based on this information alone – which was warmer. However, the difference between 2010 and 1989 is around four tenths of a degree, so we can say with a good deal of confidence that 2010 was warmer than 1989, or indeed any year prior to 1996.” (emphasis mine)
I applaud the Met Office for its openness and frankness in this simple statement.
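To make that logic concrete, here is a minimal sketch of the kind of comparison the FAQ describes, assuming the 95% ranges of two independent annual values combine in quadrature. The numbers are illustrative only, not the official HadCRUT4 figures, and the Met Office’s own procedure is more involved than this.

```python
import math

def distinguishable(anom_a, anom_b, u_a=0.1, u_b=0.1):
    """True if two anomalies differ by more than the combined 95% range
    of their difference (root-sum-square of the individual ranges)."""
    return abs(anom_a - anom_b) > math.sqrt(u_a**2 + u_b**2)

# Illustrative values only: two years ~0.01 C apart cannot be separated,
# while two years ~0.4 C apart clearly can.
print(distinguishable(0.52, 0.51))  # False -- "can't know for certain which was warmer"
print(distinguishable(0.50, 0.10))  # True  -- "a good deal of confidence"
```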
Now, to the question, which derives from this illustration:
(Right-click on the image and select “View Image” if you need to see more clearly.)
This graph is created from data directly from the UK Met Office, “untouched by human hands” (no numbers were hand-copied, re-typed, rounded-off, kriged, or otherwise modified). I have greyed-out the CRUTEM4 land-only values, leaving them barely visible for reference. Links to the publicly available datasets are given on the graph. I have added some text and two graphic elements:
a. In light blue, Uncertainty Range bars for the 2014 value, extending back over the whole time period.
b. A ribbon of light peachy yellow, the width of the Uncertainty Range for this metric, overlaid in such a way as to cover the maximum number of values on the graph.
Here is the question:
What does this illustration mean scientifically?
More precisely — If the numbers were in your specialty – engineering, medicine, geology, chemistry, statistics, mathematics, physics – and were results of a series of measurements over time, what would it mean to you that:
a. Eleven of the 18 mean values lie within the Uncertainty Range bars of the most current mean value, 2014?
b. All but three values (1996, 1999, 2000) can be overlaid by a ribbon the width of the Uncertainty Range for the metric being measured?
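For readers who want to check those two counts for themselves, here is a rough sketch of the tallies. The anomaly values in the dictionary are placeholders standing in for the 18 annual means on the illustration, not the actual HadCRUT4 numbers; substitute the real values from the Met Office datasets linked on the graph.

```python
# Placeholder annual anomalies (deg C) for the 18 years on the illustration.
anomalies = {
    1997: 0.39, 1998: 0.53, 1999: 0.31, 2000: 0.29, 2001: 0.44, 2002: 0.50,
    2003: 0.51, 2004: 0.45, 2005: 0.54, 2006: 0.50, 2007: 0.48, 2008: 0.39,
    2009: 0.49, 2010: 0.56, 2011: 0.42, 2012: 0.47, 2013: 0.50, 2014: 0.56,
}
U = 0.1  # half-width of the 95% uncertainty range

# (a) How many annual means lie within the 2014 value's uncertainty bars?
within = [y for y, a in anomalies.items() if abs(a - anomalies[2014]) <= U]
print(len(within), "of", len(anomalies), "years lie within +/-0.1 C of 2014")

# (b) What is the largest number of points a ribbon 2*U wide can cover?
best = max(sum(lo <= a <= lo + 2 * U for a in anomalies.values())
           for lo in anomalies.values())
print("best coverage by a 0.2 C-wide ribbon:", best, "of", len(anomalies))
```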
Let’s have answers and observations from as many different fields of endeavor as possible.
# # # # #
Author’s Comment Policy: I have no vested opinion on this matter – and no particular expertise myself. (Oh, I do have an opinion, but it is not very well informed.) I’d like to hear yours, particularly those of you with research experience in other fields.
This is not a discussion of “Was 2014 the warmest year?” or any of its derivatives. Simple repetitions of the various Articles of Faith from either of the two opposing Churches of Global Warming (for and against) will not add much to this discussion and are best left for elsewhere.
As Judith Curry would say: This is a technical thread — it is meant to be a discussion about scientific methods of recognizing what uncertainty ranges, error bars, and CIs can and do tell us about the results of research. Please try to restrict your comments to this issue, thank you.
# # # # #


Go back a step for a more fundamental question. To calculate the average height of persons in a room, do we measure only the tallest and shortest? No. So why is the midpoint between the maximum and minimum outliers for a day called an “average”? Next step – anything in modern climate science that’s called an average for a month, a year, a grid region, etc. is compiled from these “averages” that would not be called averages in any other field.
Reply to Gary in Erko ==> See Zeke Hausfather’s explanation of adjustments at Climate Etc.
The question you ask is “answered” by adjustments — I do not wish to discuss that can of worms (or is it a barrel of monkeys?) here.
A general question about error margins. Suppose we want the average of A +/-a and B +/-b. The average of A & B is (A+B)/n where n=2, but how do we calculate the error margin for this average, or extending that, where there are more than two terms with different error margins?
Isn’t it something like (A+B)/2 +/- sqrt(a^2 +b^2) ??
Gary, in some engineering fields the problem of disparate error sources (i.e. one part of the system contributes MOST of the error) is analyzed with “RSS” math. This is “Root Sum Squared” math: each error source is assigned an error value, each value is squared and a sum taken, then the root of the sum is calculated.
When the error sources are not correlated this technique works quite well. For example if you build a complex optical lens (like the Hubble Telescope) you can measure the “wavefront error” of each element of the lens. This is a measure of how far away each optical surface is from the desired “perfect shape” and is usually measured in microns. If you “RSS” all those error terms together you get a very representative value for the wavefront error of the entire optical system.
This has worked well for subsystem errors that are not correlated. For example, if you order 100 pieces of lumber cut into 1 foot long pieces (from 100 different sawmills) and then stack them together you will find that the final height is very close to 100 feet.
Of course, if you order 100 pieces of lumber from the same sawmill the chances are that there is a systemic bias and your final “stack up” will be off by many feet.
Cheers, KevinK
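As a rough sketch of the RSS arithmetic KevinK describes, applied to Gary’s question about the average of A +/- a and B +/- b: note that for an average of n independent values the root-sum-square of the errors is also divided by n, a detail the question above leaves open. This assumes the individual errors are random and uncorrelated.

```python
import math

def rss(errors):
    """Root-sum-square of independent, uncorrelated error terms."""
    return math.sqrt(sum(e**2 for e in errors))

a, b = 0.1, 0.1
print(round(rss([a, b]), 3))        # ~0.141 -> error of the SUM  A + B

n = 2
print(round(rss([a, b]) / n, 3))    # ~0.071 -> error of the AVERAGE (A + B) / 2
```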
Kevin & George – thanks. I’d forgotten this from high school maths. From various graphs I reckon the error margins quoted for ‘world temperature anomalies’ are simply a product of an estimate of the number of gauges in use in various periods. They quote a fixed error margin for maybe 20 years, then a slightly larger margin for the preceding decade or two. They don’t look like a legitimate calculation or even an approximate estimate based on the likely errors of each gauge.
A few typical graphs with error margins constant across consecutive years, then sudden changes for another bunch of years.
http://www.metoffice.gov.uk/media/image/r/8/Graph-300×2501.jpg
http://www.metoffice.gov.uk/hadobs/hadcrut3/diagnostics/comparison.html
Reply to Gary in Erko and KevinK ==> The Uncertainty Range used by the Met Office is a VRG (very rough guess) based on two exhaustive papers, whose links I’ve given a couple of times above. See the FAQ link given in the essay, find the “warmest year” answer, the papers are linked there as well.
“The accuracy with which we can measure the global average temperature of 2010 is around one tenth of a degree Celsius.”
Delusional.
Reply to jorgekafkazar ==> Credit should be given to Met Office UK for making any reasonable, clear-cut statement about Uncertainty Range, even if many think that it is “way too small”.
For example, BEST shows much smaller CIs for current averages.
“Liars figure and figures lie” is an old but accurate statement that could be applied to both sides of the AGW argument. However there should be no “argument” or “consensus” or “belief” when it comes to science…..No? Once again we enter the realm of politics.
Mostly it is all fantasy.
Proclaiming a significant signal smaller than the noise.
The uncertainty range offered by these “experts” who claim they can MEASURE the average global temperature, appears to be only that of the statistical massaging of their chosen input.
The claims of 0.01C differences in these created anomalies is comedy.
Very low comedy.
The error range of the recorded temperatures from weather stations, is another matter.
A 1/10 of a degree error (uncertainty) range is wildly optimistic.
Or is the choice of words revealing: “0.1C is the 95% uncertainty range”?
Then there is the change of instrumentation. I read recently that a mercury-in-glass thermometer can take 2 or 3 minutes to stabilize on a temperature rise and up to 10 minutes to stabilize on a drop in temperature.
The electronic resistance thermometer stabilizes either way in 2.5 to 10 seconds.
If these response times are accurate, we can expect multiple “record” high temperatures since the transition to automatic stations.
The state of the past data is such that we do not know what trends are afoot, if any.
Probably the broad indicators like global sea ice are the best we can do.
Not only that John, that’s all we have to do. If the job at hand is to determine if we are heading for a significant and dangerous warming or cooling, or an inundation of sea water, there is no need for all these homogenizations and fudge factors and crustal rebound calculations. If the signal is significant, half a dozen thermometers around the equatorial zone and the North and South Poles would be enough. We could measure sea level rise with an ax handle once every 10 years at high tide. The way they go about it! 0.1C error, 2.5mm a year…..after all the kriging and friging, it is a total expensive farce.
OT I fear but is icecap blog dead?
http://icecap.us/
Why? Last post was Jan. 28.
Global temperature data is heterogeneous. Overall errors cannot be less than measurement errors. Most data until recently had a recording accuracy of +/-0.5 deg C. Overall errors cannot be less than this. When UHI and a host of other influences are taken into account, I’d be surprised if most data over the past century was better than +/- 1 deg C.
Reply to Tony ==> I believe you are quite correct — “Overall errors cannot be less than measurement errors.” Original measurement error does not magically disappear, nor does averaging reduce it (unless one is making multiple measurements of the same thing at the same time with multiple measuring devices.)
Kip wrote;
” Original measurement error does not magically disappear, nor does averaging reduce it (unless one is making multiple measurements of the same thing at the same time with multiple measuring devices.)”
Actually the way it works is: IF you make multiple measurements with a SINGLE instrument at a SINGLE location at a frequency that is significantly different from the frequency content of the noise source, you can REDUCE THE NOISE by averaging.
BUT averaging never increases the accuracy, NEVER, NEVER, NEVER. Try making one hundred measurements with my proverbial “wooden yardstick” and then average them all together; they will still likely be less accurate than one measurement with my Invar yardstick.
Oh, the Invar yardstick is not an imaginary item; surveyors used to use them for very accurate work. It turns out surveyors work outdoors, where temperatures and humidity vary a bit depending on the weather, whoops, of course I meant climate. Perhaps climate change is making surveying less accurate? (add it to the list).
See here where they discuss “leveling rods” (like a vertical yardstick) made of Invar;
http://en.wikipedia.org/wiki/Invar
Cheers, KevinK
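A toy simulation of KevinK’s yardstick point, with made-up bias and noise figures: averaging many readings from one biased instrument shrinks the random scatter but leaves the systematic error untouched.

```python
import random

random.seed(1)
true_length = 36.00   # the "true" length being measured (inches)
bias = 0.25           # hypothetical systematic error of the wooden yardstick
noise = 0.10          # hypothetical random reading error (1 sigma)

readings = [true_length + bias + random.gauss(0, noise) for _ in range(100)]
mean = sum(readings) / len(readings)

print(f"mean of 100 readings: {mean:.3f}")
print(f"error vs. truth:      {mean - true_length:+.3f}")
# The random part shrinks roughly as noise / sqrt(100) = 0.01,
# but the 0.25 bias remains: averaging improves precision, not accuracy.
```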
But that is exactly what they claim: As sample size => infinity, error => 0
Not so long ago the AGW’ers were all in a rage about Venice sinking below the Adriatic. A check indicates Venice is alive and well as the Acqua Alta.
I would hazard that the internet-enabled mercury thermometers of the NCDC and their “High Resolution High Fangled Thermometer Network” will be alerting us to impending calamity for decades, or at least until soon after the next Presidential Election. Ha Ha.
In a year or so, the University of Arizona “Graduate Committee” of the UA graduate student whose “research” paper neglected more than 90% of the long-standing (about 30 years) GPS stations and focused only on those stations deployed from 2005 to the present (lacking calibration or even error checking) will not be kind.
Kip, this is a good question, but asked at an inconvenient time.
Please, ask again.
In philosophical terms, the 18 data points are “specific instances” of a “general rule” aka “scientific theory.” By itself the illustration provides one without the means for getting from the specific instances of it plus the related “background information” to the general rule. To get to the general rule is, however, the objective of a scientific study.
To describe the events underlying the general rule is a requirement for getting to it from the specific instances of it plus the background information. Climatologists persistently fail to describe these events. Thus, climatological research persistently fails to produce general rules. In the absence of these rules the climate remains uncontrollable. Misunderstandings fostered by interest groups have led the makers of public policy to think they can control the climate when they are incapable of doing so.
How can error bars be talked about when there is no log of the adjustments that have been made to the temperatures that will be used in calculating the averages? Who made the adjustments? What formula was used to make the adjustment? When was the adjustment made? Maybe a Harry read-me file should be made (Oh, I just did the change). Every temperature measurement should be open to absolute scrutiny. Every point that is seen to be an outlier needs to be checked and calibrated. We are spending billions of dollars on some estimates made by a few “programmers”. Without the detailed logs none of the data would be acceptable in any professionally run lab.
I can hear the bleats of these “programmers” as well: they did not have the budget or scope to do professional work. If not, then publicly say so. Oh, I forgot: this is a political agenda.
“The David Whitehouse essay included this image – HADCRUT4 Annual Averages with bars representing the +/-1°C uncertainty range:”
It should read as: “The David Whitehouse essay included this image – HADCRUT4 Annual Averages with bars representing the +/-0.1°C uncertainty range:”
reply to ashok patel ==> Boy, you are absolutely right — my big big typo! Thank you! -kh
Moderator : Please make this correction in the original essay. It should read:
“The David Whitehouse essay included this image – HADCRUT4 Annual Averages with bars representing the +/-0.1°C uncertainty range:””
done, -Anthony
Error bars are usually an estimate of the spread of data due to imprecise measurements and the random error. You could randomly remove 10-20% of the data and redo the calculations. The spread will give you an idea of the random error.
Systematic errors are different and aren’t usually represented in error bars. You identify them and fix them. You also include the uncertainty in the amounts needed to adjust the data through to your final result.
On top of that, the average of surface temperature measurements is just that, not the global temperature. There can be degree differences between means of hourly readings and the mean of max and min measurements if a cold front moves in through the late afternoon or evening. There can be a few degrees difference in the minimum temperature overnight due to the dew point. There can be a few degrees difference between the ground and the station on a cold and dry night. If you want an indicator of how the climate is changing, treat the max and min separately (and the months).
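A rough sketch of the resampling suggestion above (randomly drop a fraction of the data and redo the calculation), using made-up station values; the spread of the recomputed means gives a feel for the random component only, not for systematic errors.

```python
import random
import statistics

random.seed(0)
# Made-up station anomalies for a single year, for illustration only.
values = [random.gauss(0.45, 0.3) for _ in range(500)]

def resampled_means(vals, drop_frac=0.15, trials=1000):
    """Recompute the mean many times with ~15% of the data randomly removed."""
    keep = int(len(vals) * (1 - drop_frac))
    return [statistics.mean(random.sample(vals, keep)) for _ in range(trials)]

means = resampled_means(values)
print("spread (std dev) of the resampled means:", round(statistics.pstdev(means), 4))
```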
There are multiple things going on here. First of all, modern meteorological thermometers have a measurement error of +/- 0.1°C. That is a fact. This means that any averaging of thermometer readings can never have a better accuracy than +/- 0.1 degrees. From time to time I see someone claiming that because there are thousands of readings every day this measurement error can be reduced by dividing the error by the square root of the number of measurements, but that is just wrong. In order to use that method you need multiple measurements at each site at the same time, and that does not happen.
Now as I said, if the global mean temperature was just an average of all available stations then the error would be +/- 0.1 degree, and it could be higher if some stations had higher measurement error. I believe it is William Briggs who has repeatedly stated that when you have the measured temperatures there is no uncertainty involved – the “95% uncertainty range” does not apply here. You know what you have measured, within the stated accuracy of your equipment.
BUT, the global mean is not an average of all readings available. It is an average of something else. This something else is the infilling that is going on to create the notion that we “know” the temperature in each grid cell on the entire surface of the globe. This procedure must increase the uncertainty, both physically AND mathematically. My claim is that once they start the infilling there is no meaningful way of performing statistics on the results. Some stations will have a much higher impact on the overall “average” in this way and that will affect the result.
Others have mentioned the fact that the relationship between the radiative balance of the earth and temperature is non-linear, which of course makes the change in the global average less than meaningful in that respect.
Basically they should state the average of the station values. There are enough problems with that, what with all the station changes and such. But at least that would give an average of the real measurements. I am not saying that it would give any more meaning, I am just saying that it would at least be something.
Valland says:
“From time to time I see someone claiming that because there are thousands of readings every day this measurement error can be reduced by dividing the error by the square root of the number of measurements, but that is just wrong. In order to use that method you need multiple measurements at each site at the same time, and that does not happen.”.
So every thermometer is only used for one measurement then.
I did not know that. Now I do. And I did not know that measurements from different thermometers with the same measurement error will not do anything to the measurement error.
Your reply is really not very enlightening. You do know that most meteorological stations have one or two thermometers, don’t you? And you do realize that two simultaneous measurements really do not reduce measurement error? It is true that each thermometer of the same type and make has the same accuracy. If you make measurements of the same parameter simultaneously with several thermometers you will be able to reduce the measurement error. It is important to realize that since temperature is changing both spatially and temporally the measurements have to be simultaneous in both space and time. Meteorological measurements are neither. That is why measurement error prevails.
Anders:
Just one quibble:
“…It is true that each thermometer of the same type and make has the same accuracy…”
I would agree that each thermometer of the same type and make has the potential for the same accuracy.
Without proper calibration procedures and tracking data that potential is never achieved. Nor are the climastrologist’s accuracy assumptions valid or relevant whenever they announce their silly calculations.
As far as rooter and his loathsome buds, they are here to distract, irritate and just plain waste commenter time and space. Just ignore rooter; nothing irritates time wasters more than not getting a response from the commenters they are trying to impede.
” If you make measurements of the same parameter simultaneously with several thermometers you will be able to reduce the measurement error. It is important to realize that since temperature is changing both spatially and temporally the measurements have to be simultaneous in both space and time. Meteorological measurements are neither. That is why measurement error prevails.”
Seems like Anders Valland is saying that measurement error changes with temperature and location. And therefore the only way to reduce measurement error is to have lots of simultaneous measurement at the same location.
To AtheOK: I agree with you, I deliberately kept the issue of calibration etc out of this. Doing real life measurements is a messy job 🙂
To rooter, February 2, 2015 at 9:05 am: Either you are deliberately misunderstanding (your “style” of writing implies this), or you really are in the dark when it comes to measurements. Or both at the same time. So instead of me trying to explain this to you, could you please tell us just how the measurement error should be handled in these types of measurements? How does it propagate in the calculations, rooter?
Valland says:
“To rooter, February 2, 2015 at 9:05 am: Either you are deliberately misunderstanding (your “style” of writing implies this), or you really are in the dark when it comes to measurements. Or both at the same time. So instead of me trying to explain this to you, could you please tell us just how the measurement error should be handled in these types of measurements? How does it propagate in the calculations, rooter?”
This is very basic, Valland. Take one measurement. The accuracy of the instrument describes how well that instrument can give the correct reading. This error is +/- and random, not a systematic bias. Take two or more measurements and the randomness of the errors will tend to cancel them out, because it is not a systematic bias. That error in measurement has nothing to do with different times or locations. That has nothing to do with how the effect of that measurement error is reduced with many measurements.
To rooter February 3, 2015 at 4:34 am :
The measurement error contains a systematic and a random element. You have no way of knowing which is which. That is why the level of accuracy, or measurement error, is stated for the instrument. Since we are talking about measurement of air, which is an ever changing fluid with varying thermodynamic properties (mostly due to the amount of water vapour and liquid), you need multiple simultaneous measurements in time and space to be able to reduce measurement error. You need to read up on error handling and especially the preconditions for the special rule you are applying.
Tell me, rooter. In my current work we do pressure and temperature measurements inside an engine cylinder during combustion. This amounts to thousands of measurements during a few seconds. Do you believe we would pass peer review if we said that this alone made our measurements close to infinitely accurate as you claim that we could (remember, this is not climate science)?
And do you seriously believe that my home thermometer is infinitely accurate since I have read it several thousands of times since I bought it? Because that is really what you are saying. The more I look out the window and see a number, the better the accuracy because somehow the error the instrument had last time is cancelled out by the error it has now. That is bollocks.
Anders Valland says:
“Tell me, rooter. In my current work we do pressure and temperature measurements inside an engine cylinder during combustion. This amounts to thousands of measurements during a few seconds. Do you believe we would pass peer review if we said that this alone made our measurements close to infinitely accurate as you claim that we could (remember, this is not climate science)?”
That is very interesting. Why do you need thousands of measurements during a few seconds?
Anders Valland says:
“And do you seriously believe that my home thermometer is infinitely accurate since I have read it several thousands of times since I bought it? Because that is really what you are saying. The more I look out the window and see a number, the better the accuracy because somehow the error the instrument had last time is cancelled out by the error it has now. That is bollocks.”
Of course the accuracy of the instrument will be better with repeated readings. But repeated readings will improve the accuracy of the mean of the readings. Incredibly basic. You could check that yourself. Do daily readings of your thermometer for one month, as accurately as the thermometer can give. Make another series by rounding those readings to the nearest degree; that is, decrease the accuracy (+/-1). Then compare the means of the two series. Check whether the mean of the less accurate series can deviate by one degree from the mean of the more accurate series.
You will probably be surprised by the answer.
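rooter’s proposed experiment is easy to simulate. This sketch uses made-up daily readings and simply compares the mean of the precise readings with the mean of the same readings rounded to the nearest degree; whether the result says anything about systematic instrument error is exactly what is disputed in this exchange.

```python
import random
import statistics

random.seed(42)
# Thirty made-up daily readings (deg C) from one thermometer.
readings = [random.uniform(-5.0, 5.0) for _ in range(30)]

precise_mean = statistics.mean(readings)
rounded_mean = statistics.mean(round(r) for r in readings)  # accuracy degraded by rounding

print(f"mean of precise readings: {precise_mean:+.3f}")
print(f"mean of rounded readings: {rounded_mean:+.3f}")
print(f"difference:               {rounded_mean - precise_mean:+.3f}")
# The two means typically agree far better than the rounding step would suggest,
# because the rounding errors here are random and partly cancel in the mean.
```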
Again, it will not decrease the error of the mean of the readings. You are stating that if I read -3.5°C +/- 0.1°C today and then I read -1.6°C +/- 0.1°C tomorrow, the mean would be 2.6°C +/- 0.07°C. That is silly, and contrary to all current knowledge in this area.
I think we can now say that we have enough samples of rooter to say that his precision is low, and his accuracy even worse. It does not really matter what you believe, rooter. Your belief does not change facts in this matter. Read up, or go on believing.
rooter says: “That is very interesting. Why do you need thousands of measurements during a few seconds?”
Seriously?
To my own comment, it should state “…that the mean would be -2.6°C +/- 0.07°C.”
Anders Valland says:
“Again, it will not decrease the error of the mean of the readings. You are stating that if I read -3.5°C +/- 0.1°C today and then I read -1.6°C +/- 0.1°C tomorrow, the mean would be 2.6°C +/- 0.07°C. That is silly, and contrary to all current knowledge in this area.”
You’re getting there Valland. Random errors cancel out.
Anders Valland asks:
“rooter says: “That is very interesting. Why do you need thousands of measurements during a few seconds?”
Seriously?”
Yes, seriously. Why thousands of measurements?
As I said, we are measuring temperature during combustion in an IC engine. That is why we get some thousands of points in a few seconds.
No, rooter, the mean cannot be more accurate than the individual measurements when you are measuring different things. If I had two simultaneous measurements each day it could do that.
But I only have one each day. You really should read up on this.
If this is the way you handle knowledge I guess you will be getting into trouble quite often. Nature really does not care about your beliefs as anyone versed in experiment will tell you.
“BUT, the global mean is not an average of all readings available. It is an average of something else. This something else is the infilling that is going on to create the notion that we “know” the temperature in each grid cell on the entire surface of the globe. This procedure must increase the uncertainty, both physically AND mathematically. My claim is that once they start the infilling there is no meaningful way of performing statistics on the results. Some stations will have a much higher impact on the overall “average” in this way and that will affect the result.”
Averaging is infilling.
“Averaging is infilling”.
How?
Consider temperature indexes with gridcells and not interpolation between cells. Take a grid cell. Compute the average of the temperature stations. The whole gridcell will get that average. Two stations or 50 stations. Areas inside that gridcell without measurements will be infilled with the average of the stations.
Do the same with a hemisphere. The average for that hemisphere will consist of the average from the gridcells with temperature measurements. The gridcells without measurements will be infilled with the average of the gridcells with measurements.
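A toy sketch, with made-up numbers, of the point rooter is making: a gridded index implicitly assigns unsampled cells the average of the sampled ones, while a plain mean of all stations implicitly lets the densely sampled regions dominate. Neither the grid layout nor the weighting here is any index’s actual method.

```python
import math

# (latitude of cell centre, station anomalies in that cell); empty = no data.
cells = [
    ( 75.0, []),
    ( 45.0, [0.8, 0.6, 0.7, 0.9, 0.7]),   # densely sampled, e.g. Europe / continental US
    ( 15.0, [0.3]),
    (-15.0, [0.2, 0.4]),
    (-45.0, []),
    (-75.0, [0.1]),
]

# Simple mean of every station: dominated by the densely sampled cell.
stations = [a for _, anoms in cells for a in anoms]
print("simple station mean:    ", round(sum(stations) / len(stations), 3))

# Area-weighted (cos latitude) mean over cells that have data; the empty
# cells are in effect "infilled" with the average of the sampled cells.
num = den = 0.0
for lat, anoms in cells:
    if anoms:
        w = math.cos(math.radians(lat))
        num += w * (sum(anoms) / len(anoms))
        den += w
print("area-weighted grid mean:", round(num / den, 3))
```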
Comment to rooter, February 2, 2015 at 8:54 am: Averaging is infilling ONLY if you consider gridcells. The way I described it, it is not infilling.
Anders Valland says:
“Averaging is infilling ONLY if you consider gridcells. The way I described it, it is not infilling.”
Well. You have not described your kind of averaging. Averaging without some kind of area weighting is infilling in the same way as making the average of a gridcell is infilling the gridcell’s value with the average of the measurements from that gridcell. The only difference is the size of the gridcell. A simple average of all the measurements is using the whole globe as one big gridcell. And some areas with many measurements will be given bigger weight than they should have.
Perhaps this is the time for Valland to formulate his alternative?
rooter commented
One could look at this differently: first, more measurements actually mean there is less uncertainty in those oversampled areas, so maybe they deserve more weight.
When I create averages of larger areas I don’t adjust for weighting, but I’m not really trying to generate a field value (and I have 1×1 averages if it’s important); I’m trying to describe the response of a large number of sensors whose locations I have no control over.
My “alternative” was given further up: calculate the mean of the station values. I qualified that by stating something about its meaningfulness, I could probably say that it would make just as much sense as the current methods used for infilling.
Anders Valland says:
“My “alternative” was given further up: calculate the mean of the station values. I qualified that by stating something about its meaningfulness, I could probably say that it would make just as much sense as the current methods used for infilling.”
Then Valland says he is OK with infilling. That is the same as infilling one grid with the mean of the stations in that grid, except that he uses a very big grid. In his case, the whole world.
And he adds one big error. The mean of those station values will be strongly affected by the fact that there are many more stations in the continental US and Europe. Those areas will be given too much weight. That is the worst infilling method.
Hehehe, rooter, is that the best you’ve got?
Since we are looking for a global mean of measured temperatures, it does not constitute an error to take the simple average of all available stations. It is just one other method, not an error. People seem to be preoccupied with the changes in the mean value, and that can also be accomplished here. We are not looking for an accurate absolute value, thus the area weighting is meaningless. It makes just as much sense to use the simple mean as to construct any fancy infilling and weighting.
“Basically they should state the average of the station values. There are enough problems with that, what with all the station changes and such. But at least that would give an average of the real measurements. I am not saying that it would give any more meaning, I am just saying that it would at least be something.”
That is what the temperature indexes are, with some kind of area weighting, which is of course necessary because of the different numbers of stations worldwide. A simple average of stations would therefore give too much weight to the continental US.
That is why I explicitly stated that there are problems with it, rooter. And why I stated that it would not necessarily give any meaning.
Anders Valland says that temperature indexes don’t give any meaning. What does Anders Valland prefer then? Nothing? A simple average of stations and no area weighting?
If we agree that we cannot increase the precision beyond a single decimal place, which with surface data is +/- 0.1F, then the collective change in temp for 95 million samples from 1940 to 2013 is effectively 0.0 for the change in min temp and 0.0 for the change in max temp (calculated min temp change = -0.097392206, max = 0.001034302).
This is what the stations actually measured.
Now, because the surface has an annual temperature cycle, you need as close to a full year as possible for that cycle to cancel. I select each station by year with a minimum of 240 daily samples for that year; if a year has fewer than 240, that year for that station is excluded.
Over that same period the average daily rising temp is 17.5F, and the following night’s average falling temp is 17.6F.
Global warming is entirely a product of the processing methods used on the data.
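A minimal sketch of the 240-sample screen described above, assuming daily records arrive as (station, year, value) tuples; the names and data layout are illustrative, not the commenter’s actual code.

```python
from collections import defaultdict

def station_years_with_enough_days(daily_records, min_days=240):
    """Return the (station, year) pairs that have at least min_days samples."""
    counts = defaultdict(int)
    for station, year, _value in daily_records:
        counts[(station, year)] += 1
    return {key for key, n in counts.items() if n >= min_days}

# A station with 300 samples in 1940 passes the screen; one with 100 does not.
records = [("A", 1940, 10.0)] * 300 + [("B", 1940, 12.0)] * 100
print(station_years_with_enough_days(records))   # {('A', 1940)}
```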
Roughly 70% of the globe is ocean. I wonder what the errors in sea surface temperatures are?
I wonder what it was in 1880!
Reply to Jeremy Shiers ==> An attempt at answering this is in this paper.
10 years as a croupier in the Casino Industry. Probability mathematics is a wonderful thing. There was a good reason for Douglas Adams to use the idea to build the 2nd most advanced space ship ever created or to be created in his Hitchhiker series (restaurant mathematics is behind the most advanced one!). In short, the graph means diddly squat. It doesn’t have enough “rolls of the dice”, enough “spins of the wheel”, “hands of cards”, etc. to discern anything with confidence. I have personally witnessed 15+ Red, Black, Odd, Even, etc. (take your pick) combinations in a row on multiple occasions, and though statistically unlikely, you have know way of knowing if this graph is showing each point in its statistically most likely, least likely, or somewhere-in-between position, because you don’t have enough data on your graph.
Please excuse the use of “know” instead of “no”. Was originally thinking “no way of knowing” but mistyped whilst daydreaming ahead
Reply to wicked… ==> You aren’t the first to make homonymic typos here….no need to apologize. Even my fingers sometimes type the wrong “there” or “its”.
It is simply “the graph”, not mine however. Its significance is much in doubt, whether it means anything at all given its number of data points.
Thank you for the Gamblers-view of it!
The first thing I would do is to ask where the graph came from. Look at the raw data first. Check the method used to process the raw data into the display “data”. Look to see if features were added that weren’t in the raw data. My understanding is that the methods used are extremely unreliable, involving subjective measures such as comparing “adjacent” sites and making adjustments to make sure they show a similar trend. Odd choice of adjacent sites (as seen for Albury), creating climbing slopes out of negative or neutral slopes. Anyone looked at Paraguay, lately?
Only then is it worth asking questions about the graph. My first observation is that the question of what year is highest is pointless. They are all much the same. 300 degrees K, give or take a small amount of noise. It is hard to go past that.
You can’t calculate the error bars without understanding the errors. UKMO are playing games as all the climate people are. It is possible/likely that the real error bars would show no significant warming for 150 yrs.
Just a point on your yellow band, though. You should really use yellow blocks rather than a flat band to create your graph. The blocks would show the margin of error for each point individually. You could then plot the “worst case scenario” for each side of the debate. You could manufacture a plausible (but unlikely) graph showing a rising trend or a declining trend within those blocks to fit whichever view of the world best suits.
reply to wicked…. ==> If I were graphing the data for general use, you would be absolutely correct. The above is NOT the proper way to show an Uncertainty Range.
However, I wanted to show how many points of the 18 year period would “fit inside of” the Uncertainty Range — thus this illustration. I purposefully called it an “illustration” and not a graph for this very reason.
The Met Office TV weather forecasts have recently started telling us that the night-time temps it shows are in the country and that in towns the temperatures will be several degrees higher. They don’t say that it’s UHI but I can’t think what else it could be. So even the Met Office is now admitting that its recorded/predicted temperatures vary widely depending on whether they’re in town or country. What possibility is there that they could calculate an accurate average temperature for one day for a district, let alone an average temperature for a year for the world?
And just looking at the weather map for the UK ought to have convinced Lord Stern (who compiled a report on the disastrous consequences of global warming) that a change of 2 or 3 degC would be of no concern whatsoever for the UK.
Temps in Scotland would be more like those in Northern England, those in Northern England more like the Midlands, the Midlands more like the South West, the South West more like the South East, the South East more like the Channel Isles, and the Channel Isles more like Brittany/Normandy. What’s not to like about that?
Wouldn’t all parts of the UK greatly benefit from such a temperature rise? Global Warming (there being no such thing since climate is regional) would be a godsend for Northern Latitude countries such as the UK, Holland, Germany, Scandinavia, Canada etc.
‘What does this illustration mean scientifically?’
My answer would be that this does not add anything to what we already know in the shape of the time series and simple summaries like slope of linear regression.
Record setting does not occur in any serious science. It belongs to the world of entertainment, sport, advertisement, and hype. Record-setting events are a statistical disaster because they are by definition severely dependent upon each other. They depend on the trivial start of record keeping, and they do not contain much objective information because what is a record-breaking event now ceases to be one when a new record is set. Perhaps their only useful function is to inform us about the range of a variable, like encountering somewhere a person of 120 years old with a valid birth certificate.
For the statisticians wasting their time on these non-scientific issues: where do the error bars come from and do these also apply at a series of outliers? You can be sure that the bars must be huge and must also contain bias.
It’s worse than they thought! A brief look at any undergraduate text on propagation of experimental errors shows that if independent experimental values of T1 and T2 have errors of dT1 and dT2, then T1-T2 has an error of sqrt (dT1**2 + dT2**2). That is, if dT1 = dT2 = dT, then T1 – T2 has an error of 1.4*dT. If the wildly optimistic assumption of dT = 0.1 is accepted, then the error in their difference is +/-0.14. Also, numbers should never be quoted with more significant figures than their experimental error, so to say T1 – T2 =0.01 is completely invalid.
Well, as a non-scientist, but truly pondering matters with what I call “logical reasoning”, I would answer the questions you state as follows:
“2014 has a slightly higher possibility of being the warmest year than 2010 and 1998, and this possibility goes in diminishing order through all the other years on the graph that are near the error margin area you colored.”
I would thus say that statistically, according to the values, 2014 has the most chance of being just a tiny teenie weenie bit warmer than 2010 if you were betting on the “hottest year”, but not with any significance.
On question 2 it is a bit harder, but pure logic tells me that if 18 values are in the domain of the uncertainty range, then again you speak of possibilities. Again, it’s a game of what I call “chances”.
I see the error bars in a somewhat unconventional way: I take into account what this error region means by thinking of the “best guess with the most chance of being correct” if the value were error-free.
So let’s assume we set up a betting game for this HadCRUT series, as if tomorrow we would have error-free temperature results.
In that way:
2014 would be the bookmaker’s choice as favorite (like the favorite horse in a horserace),
then second would be 2010,
third 1998,
and so on.
However, being able to cover 18 readings in an uncertainty field does not say it all; it also depends on where the dots in that field are. It does say one thing, though: there is a possibility, no matter how small, that they all may have the same value.
As a result I reach this conclusion: compared to 1982-1998, 1998 to 2014 did not make a significant rise outside the error-bar region, while the episode 1982-1998 clearly did. Therefore, as the current trend is within the error-bar range, there is logically no significant change.
Reply to Frederik Michiels ==> The critical point is that the Uncertainty Range calculated by the Met Office is not your run-of-the-mill CI. Confidence Intervals are statistical animals and have (IMHO) very little to do with measured values (Steven Mosher tells us that his project, Berkeley Earth, doesn’t produce averages but rather predictions — so I don’t know what their 95% uncertainty means at all).
Read the Met Office FAQ statement in the original essay.
While interesting, the probabilities about the data points are not, well, the point.
NOAA states that the annual mean global surface temperature for 1907 is 16°C, and 1907 has the lowest annual mean global surface temperature of all the years from 1900 to 1997.
http://www.ncdc.noaa.gov/sotc/global/1997/13
NOAA also states that the annual mean global surface temperature for 2014 is 14.59°C, and 2014 has the highest annual mean global surface temperature of all the years from 1880 to 2014.
http://www.ncdc.noaa.gov/sotc/global/2014/13
Why would anyone believe anything, that NOAA publishes?
In 1995, NASA claimed that the current mean global surface temperature is 281 K (8°C).
https://pds.jpl.nasa.gov/planets/special/earth.htm
In 1998, NASA claimed that the current mean global surface temperature is 15°C.
http://www.giss.nasa.gov/research/briefs/ma_01/
If we put these two together, the mean global surface temperature of the earth during the 90s is 11.5(+/-3.5)°C, or 285(+/-3.5) K.
According to Carl Sagan and George Mullen, the mean global surface temperature of the earth in circa 1972 could have been either 289(+/-3) K (16(+/-3)°C), or 281(+/-3) K (8(+/-3)°C).
http://courses.washington.edu/bangblue/Sagan-Faint_Young_Sun_Paradox-Sci72.pdf
It does not seem unreasonable to suppose that all estimates of mean global surface temperatures come with a caveat of +/- (not less than 3)°C.
Carl Sagan and George Mullen also mention 286degK to 288degK for the mean surface temperature.
But the upshot of all of the above is that no one has any real idea as to the average surface temperature of the globe within about +/- 5 degC, i.e., about 12 degC +/- 5 degC.
Plenty of ideas how to do it, but rightfully recognized as problematic to do. Hence anomalies.
Perhaps you would be interested in the work of Pat Frank, highlighted by Chiefio.
http://meteo.lcd.lu/globalwarming/Frank/uncertainty_in%20global_average_temperature_2010.pdf
ABSTRACT: Sensor measurement uncertainty has never been fully considered in prior appraisals of global average surface air temperature. The estimated average ±0.2 C station error has been incorrectly assessed as random, and the systematic error from uncontrolled variables has been invariably neglected. The systematic errors in measurements from three ideally sited and maintained temperature sensors are calculated herein. Combined with the ±0.2 C average station error, a representative lower-limit uncertainty of ±0.46 C was found for any global annual surface air temperature anomaly.
or this one
http://multi-science.metapress.com/content/t8x847248t411126/fulltext.pdf
Reply to A C Osborn ==> Thanks for the links!