Analysis of Met Office data back to mid 1800's

John Graham-Cumming has posted an interesting analysis; he could benefit from some reader input at his blog.

See here and below: http://www.jgc.org/blog/

Adjusting for coverage bias and smoothing the Met Office data

As I’ve worked through “Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850” to reproduce the work done by the Met Office, I’ve come up against something I don’t understand. I’ve written to the Met Office about it, but until I get a reply this blog post is to ask for opinions from any of my dear readers.

In section 6.1 Brohan et al. talk about the problem of coverage bias. If you read this blog post you’ll see that in the 1800s there weren’t many temperature stations operating and so only a small fraction of the Earth’s surface was being observed. There was a very big jump in the number of stations operating in the 1950s.

That means that when using data to estimate the global (or hemispheric) temperature anomaly you need to take into account some error based on how well a small number of stations act as a proxy for the actual temperature over the whole globe. I’m calling this the coverage bias.

To estimate that Brohan et al. use the NCEP/NCAR 40-Year Reanalysis Project data to get an estimate of the error for the groups of stations operating in any year. Using that data it’s possible on a year by year basis to calculate the mean error caused by limited coverage and its standard deviation (assuming a normal distribution).
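
A minimal sketch of that calculation in Python, for readers who want to check the arithmetic. The numbers are made up for illustration and are not NCEP/NCAR data, and the function name is hypothetical. For each reanalysis year, the hemispheric mean computed from only the grid cells that had stations is compared with the mean over the full field; the mean and standard deviation of those differences summarise the coverage error for that station set.

```python
import statistics

def coverage_error(full_means, subsampled_means):
    """Return (mean, standard deviation) of the sub-sampled minus full-field means."""
    errors = [s - f for f, s in zip(full_means, subsampled_means)]
    return statistics.mean(errors), statistics.stdev(errors)

# Hypothetical values, one entry per reanalysis year (deg C anomaly).
full = [0.10, -0.05, 0.20, 0.00, 0.15]   # mean over the whole hemisphere
sub = [0.50, 0.30, 0.45, 0.55, 0.40]     # mean over cells with stations only

mean_err, sd_err = coverage_error(full, sub)
print(f"mean error {mean_err:+.2f} C, standard deviation {sd_err:.2f} C")
```

A positive mean error here means the limited station set reads warmer than the full field, which is the situation described for 1860 further down.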

I’ve now done the same analysis and I have two problems:

1. I get a much wider error range for the 1800s than is seen in the paper.

2. I don’t understand why the mean error isn’t taken into account.

Note that in the rest of this entry I am using smoothed data as described by the Met Office here. I am applying the same 21 point filter to the data to smooth it. My data starts at 1860 because the first 10 years are being used to ‘prime’ the filter. I extend the data as described on that page.
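
For anyone wanting to reproduce the smoothing, here is a rough sketch in Python. It assumes the 21 points are binomial weights and that the series is extended at the end by repeating the final value; both are assumptions in this sketch, and the Met Office page remains the authority on the exact procedure. The function names are hypothetical. The first 10 values are used only to prime the filter, so the output starts 10 points into the input.

```python
from math import comb

def binomial_weights(points=21):
    """Normalised binomial coefficients C(points-1, k); they sum to 1."""
    n = points - 1
    return [comb(n, k) / 2 ** n for k in range(points)]

def smooth(series, points=21):
    """Centred binomial filter.

    The first `points // 2` values only prime the filter (no output for them);
    the end is padded by repeating the last value (an assumption of this sketch).
    """
    half = points // 2
    padded = list(series) + [series[-1]] * half
    weights = binomial_weights(points)
    return [
        sum(w * padded[i - half + k] for k, w in enumerate(weights))
        for i in range(half, len(series))
    ]

annual = [0.1 * (i % 5) for i in range(40)]  # made-up annual anomalies
smoothed = smooth(annual)
print(len(annual), len(smoothed))
```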

First here’s the smooth trend line for the northern hemisphere temperature anomaly derived from the Met Office data as I have done in other blog posts and without taking into account the coverage bias.

And here’s the chart showing the number of stations reporting temperatures by year (again this is smoothed using the same process).

Just looking at that chart you can see that there were very few stations reporting temperature in the mid-1800s and so you’d expect a large error when trying to extrapolate to the entire northern hemisphere.

This chart shows the number of stations by year (the green line, as in the previous chart) and the mean error because of the coverage bias (red line). For example, in 1860 the coverage bias error is just under 0.4C (meaning that if you use the 1860 stations to get to the northern hemisphere anomaly you’ll be too hot by about 0.4C). You can see that as the number of stations increases and global coverage improves the error drops.

And more interesting still is the coverage bias error with error bars showing one standard deviation. As you might expect the error is much greater when there are fewer stations and settles down as the number increases. With lots of stations you get a mean error near 0 with very little variation: i.e. it’s a good sample.

Now, to put all this together I take the mean coverage bias error for each year and use it to adjust the values from the Met Office data. This causes a small downward change which emphasizes that warming appears to have started around 1900. The adjusted data is the green line.

Now if you plot just the adjusted data but put back in the error bars (and this time the error bars are 1.96 standard deviations since the published literature uses a 95% confidence) you get the following picture:

And now I’m worried because something’s wrong, or at least something’s different.

1. The published paper on HadCRUT3 doesn’t show error bars anything like this for the 1800s. In fact the picture (below) shows almost no difference in the error range (green area) when the coverage is very, very small.

2. The paper doesn’t talk about adjusting using the mean.

So I think there are two possibilities:

A. There’s an error in the paper and I’ve managed to find it. I consider this a remote possibility and I’d be astonished if I’m actually right and the peer reviewed paper is wrong.

B. There’s something wrong in my program in calculating the error range from the sub-sampling data.

If I am right and the paper is wrong there’s a scary conclusion… take a look at the error bars for 1860 and scan your eyes right to the present day. The current temperature is within the error range for 1860 making it difficult to say that we know that it’s hotter today than 150 years ago. The trend is clearly upwards but the limited coverage appears to say that we can’t be sure.

So, dear readers, is there someone else out there who can double check my work? Go do the sub-sampling yourself and see if you can reproduce the published data. Read the paper and tell me the error of my ways.

UPDATE It suddenly occurred to me that the adjustment that they are probably using isn’t the standard deviation but the standard error. I’ll need to rerun the numbers to see what the shape looks like, but it should reduce the error bounds a lot.
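
To show how much difference that makes, here is a small sketch in Python. The standard deviation and the sample count are made-up illustration values, not results from the analysis, and the function name is hypothetical. The 95% band is 1.96 times sigma, where sigma is either the standard deviation itself or the standard error (the standard deviation divided by the square root of the number of samples).

```python
import math

def ci95_half_width(sd, n=None):
    """Half-width of a 95% band: 1.96*sd, or 1.96*sd/sqrt(n) when n is given."""
    sigma = sd if n is None else sd / math.sqrt(n)
    return 1.96 * sigma

sd = 0.4   # hypothetical standard deviation of the coverage error (deg C)
n = 16     # hypothetical number of sub-sampling trials

print("using standard deviation:", round(ci95_half_width(sd), 3))
print("using standard error:    ", round(ci95_half_width(sd, n), 3))
```

With these illustrative numbers the band shrinks by a factor of sqrt(16) = 4, which is the kind of reduction that could bring the 1800s error bars closer to those in the paper.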

WUWT readers please go to http://www.jgc.org/ to discuss and see the latest updates.


132 thoughts on “Analysis of Met Office data back to mid 1800's”

  1. Cherry picking error reports?
    This is a form of experimental bias.
    Having used Fortran and done research, the sampling methods they use lack randomness and do not represent fairly the temps of the planet.
    A simple example: how did they ever gather data by means of equally spaced weather stations in vast areas without roads?

  2. The number of stations is not the correct parameter to use. Imagine that a small country [e.g. Denmark] suddenly gets the idea that more stations are good, so installs a million sensors throughout the smallish land area. Clearly that will do nothing for the error in the global average.
    The proper parameter must include the distribution of those stations. Something like the mean area around a station where there are no other stations. This problem has been studied at length in the literature [I can’t remember a good reference, off hand, but they exist].

  3. “So I think there are two possibilities:
    A. There’s an error in the paper and I’ve managed to find it. I consider this a remote possibility and I’d be astonished if I’m actually right and the peer reviewed paper is wrong.
    B. There’s something wrong in my program in calculating the error range from the sub-sampling data.”
    Don’t give up on A. Confirmation bias is a powerful thing. If it turns out to be B that’s OK. But if it is A…… then Wow. It is worth checking.

  4. Even with reduced error bounds, wouldn’t correcting for UHI effect still place today’s temp readings within the past error boundary?

  5. great work here sir … way outside my experience vis-à-vis statistics … my background is in software design and development with a diploma in Ocean Engineering so I’m not completely out of my comfort zone.
    Your work appears to be inside the programming, i.e. data in –> scary cloud of data massaging –> Hockey Stick …
    My concern would be that the data in, which up until now everyone assumed to be clean, is anything but clean, accurate and consistent …
    so while you may be able to show that the methods used inside the cloud are invalid, it’s becoming all too likely that we are also experiencing a garbage in situation …
    garbage in = garbage out assumes the code in between is exactly right, which you appear to have found out may not be the case …
    garbage in –> garbage code –> garbaged (squared) out …
    G2O the new greenhouse gas …

  6. Well I’ve said it time and again, we simply cannot trust surface data – it’s far too woolly. All we can say (with any degree of certainty) is that the satellite data is about as good as we’ll get. That makes it very difficult to know if we’re warming or not, and if we could have anything to do with it anyway. But there it is. Recorded data from ‘way back when’ are just too unreliable. As John says, the error bars mean it could be as warm now as it was 150 years ago – who knows? Let’s stick to satellite http://discover.itsc.uah.edu/amsutemps/execute.csh?amsutemps+002

  7. I haven’t read the report but I remember a History Channel program about the Royal Navy logging temperature and barometric readings with every navigational sighting (at least twice a day). So here is my question: does the data include those readings, and if not, why not? Considering the number of ships at sea at any given moment there should be thousands of readings per day, all available in the Royal Navy archives. Second thought: if they are used you should be able to extend the series further back in time with near-global coverage, since the sun never set on the British Empire. Just my 2 cents.
    Medic 1532

  8. Medic1532 (08:26:21) :
    That thought has been brought up before. It is my understanding that the data exists but has not been studied or used anywhere. It would be a HUGE effort, but with computers it is feasible.

  9. Why is there a coverage bias instead of just a plain old coverage error (which like many types of random measurement errors would be as likely to be positive as negative, giving a mean close to zero)? In other words, do we know that when there is only a small number of regional temperature measurements and they are put into an algorithm to get a regional average temperature, that the calculated average will always tend to be off in one direction — that is, it will be biased off the true value instead of just equally likely to be in error in either a plus or minus way from the true value? I’m sorry if I missed your explanation of this point in the original blog, but was not satisfied that I understood this aspect of what you did after reading it twice.

  10. Dear Anthony-
    I am sure you will crack this statistical nut. When you do I think you’ll find that the UK and Northwest Europe have had material temp increases during the last 150 years. The UK was industrialized and heavily populated in 1860 so the UHI effect is limited. The UK also had localized human impact in 1860, the height of industrialization in the UK, with massive coal burning causing soot and sulphur dioxide that created low-hanging smog clouds. Those are cooling agents, and when coupled with the possibility that natural Atlantic Ocean currents are bringing warmer Gulf Stream water farther north, the result is a material change to UK temps. Climate change. Is that AGW/CO2 caused? NO. But it is local climate change.

  11. I’m thankful for the link to my blog, but was it necessary to rip off the entire content of my blog post? In doing so you’ve missed out on the latest updates I’ve made.
    REPLY: Actually John, I’m trying to elevate your work, as I think it is relevant. I wrestled with a partial post but I just didn’t think it would do it for you.
    If you wish, I’ll be happy to remove it. Please advise.
    In the meantime, readers please go to http://www.jgc.org/ to discuss and see the latest updates. – Anthony

  12. On first look, the trend looks quite a bit like those produced with “value added” data, which is not surprising.
    What I noticed is that the use of standard deviation as the standard error bands is incorrect. I’m thinking you should be calculating for the standard error of the mean using the population model (n) as opposed to a sample model (n-1) because you are calculating for the entire available data set. Any sites that would be excluded (for whatever reason) don’t meet the inclusion criteria and therefore would not contribute to n (they do not represent available samples).
    Also, I don’t think you should use a fixed standard deviation of 1.96 as the numerator when calculating for standard error across the entire timeline (it is not clear if that is what you are doing) as each point on the timeline will have a different variance. You should re-calculate StdDev at each point and obtain your standard error from the re-calculated StdDev.

  13. Quite a bit OT, but I’ve read John Graham-Cumming’s book “The Geek Atlas: 128 Places Where Science and Technology Come Alive” and I quite enjoyed it. 🙂

  14. NK (08:52:59)
    You bring up some good points on early industrial UHI. But while I agree that sulfur dioxide is a cooling gas, the latest we’ve been hearing on the scientific front is that soot (black carbon) is a net warming agent and has been underestimated in its forcing capabilities. And with the Industrial Revolution relying heavily on coal- and some petroleum-based fuels, this could have been a significant contributor to global temperatures. Whether the soot balanced out the SO2 or not, I have no idea. But it does complicate the interpretation of early measurements.

  15. John, having your post copied here on WUWT is like having a free one-page advertisement in the New York Times….

  16. If you have a set of long-term stations that have not been biased by urbanization or other changes, look at how well those stations represent the modern record when analyzed separately. That should give you some idea of the range of error you will see when looking at the historical record when those are the only stations. Not precisely what you are looking for, but something to find out how big the ball park is.
    The distribution of those stations is critical, as 10 stations outside London don’t tell you nearly as much as one station in each of 10 countries. They might as well be one station.

  17. Leif Svalgaard (08:18:19) :
    “The number of stations is not the correct parameter to use. […] The proper parameter must include the distribution of those stations. Something like the mean area around a station where there are no other stations.”
    Of course, the number of stations put a limit on the mean area around each station or any other similar parameter. You should be able to estimate a lower bound of the error based on only the number of stations (i.e. the error assuming they were optimally placed).

  18. Oops, I forgot to include wood burning as a possibly significant soot source during the Industrial Revolution.
    The net result of the soot/SO2 forcings is that readings of temps from 1860-1920 or so may have been artificially high due to pollution. If so, the immediate post-LIA temperatures would have been lower than the data suggest, steepening the slope of the 150-year rise. Since CO2 didn’t budge much for 1850-1900 or so, this would indicate that CO2 as the primary forcing agent isn’t true.

  19. Leif Svalgaard (08:18:19) : Says:
    “The number of stations is not the correct parameter to use. Imagine that a small country [e.g. Denmark] suddenly gets the idea that more stations are good, so installs a million sensors throughout the smallish land area. Clearly that will do nothing for the error in the global average.The proper parameter must include the distribution of those stations…”
    The use of confidence intervals in regards to population proportions assumes that the sample is representative of the entire population. John Graham-Cumming’s use of sample size to arrive at a margin of error is correct. What you and others speak of refers to sampling error (i.e. that the sampling is not representative or is biased in other ways). Sampling error would only add additional uncertainty to the estimate derived by Graham-Cumming and in no way invalidate his base margin of error estimate.

  20. As I understand it, the Navy records are of seminal interest because they are basically travelling weather-stations covering the whole globe. And, it is likely they may contain data that is inconvenient to the Warmists. I also gather that the data is now being ‘looked-after’ by the Tyndall Centre.
    Hmmmmm. Now where is that institution based? Stand-by for them to be gently ‘rubbished’ as inaccurate, via PR releases. And, I guess you’ll have to wait a looong time before any proper studies are done on them. [/cynicism]
    And OT but I reckon that surface-station weather records can only really be used in one specific way, as they are records of air temperatures measured at one geographic spot on the earth, at one time.
    So for each site, pick a decade where things have not changed ( i.e. nothing moved, no instruments changed, or building work built-up nearby, or vegetation growth, etc.. you get the picture ) …and then differentiate the daily readings to get rid of standing offsets/biases. You should then have a measure of whether or not that physical spot on the earth got warmer or cooler over that decade.
    Then do it for as many stations as you have got, and you should be able to do a world map showing decadal warming/cooling. And that is about as good a picture of past climate-change as you will ever get from these records.
    Oh, and while you are at it, do the same with the rainfall, windspeed/direction, and barometric pressure. Now that WOULD be interesting.
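    The differencing step suggested above can be sketched in a few lines of Python. The readings and the size of the offset are made up for illustration, and the function name is hypothetical. A constant instrument or siting offset cancels when consecutive readings are subtracted, so the differences (and the net change they add up to) are the same with or without it.

```python
def first_differences(readings):
    """Change from each reading to the next; any constant offset cancels out."""
    return [b - a for a, b in zip(readings, readings[1:])]

true_temps = [10.0, 10.5, 9.8, 11.2, 10.9]   # hypothetical readings (deg C)
offset = 1.5                                  # hypothetical standing bias
biased = [t + offset for t in true_temps]

diff_true = first_differences(true_temps)
diff_biased = first_differences(biased)
net_change = sum(diff_biased)                 # equals last minus first reading

print(diff_biased, net_change)
```

    Note that this only removes an offset that stays fixed over the period; a bias that drifts (for example slow urban growth) survives the differencing.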

  21. @Hank
    If you read my blog posting (not this copy of it) you’ll see that I’ve updated and I’m pretty sure that it’s the standard error they are using in the paper, not the standard deviation. They never mention the standard error, but it makes more sense.
    As for the 1.96, that’s just the multiplier on whatever sigma is to get the 95% confidence interval.

  22. The word “Denier” is thrown around a lot in an ad hominem attack calling the AGW skeptics “Deniers”, so you think of yourselves as “Holocaust Deniers”.
    The word “Holocaust” comes from an ancient Greek word that when translated in English is “Burnt Offering”. The Jewish people don’t have an exclusive claim to “Holocausts”. The Chinese Burnt Offering Holocaust killed 100 million people. The Russian Burnt Offering Holocaust killed 50 million people. The Jewish Burnt Offering Holocaust killed 5 million people. I do not “Deny” that any of these “Holocausts” happened. Whenever I am accused of being an “AGW Denier” I respond by saying, I am not a Jewish Burnt Offering Holocaust Denier, and whenever I mention the word Holocaust, I always pre-qualify the word Holocaust using its English translation with the words “Burnt Offering”.

  23. To Jerrym (09:21:59) :
    Agreed, the effects of UK industrial age deforestation, low clouds, SO2, soot etc. (although soot is primarily linked to arctic climate) are probably too complex to ever untangle. BUT, clouds and ocean currents drive planetary climate and local climates. As a maritime climate, the UK is particularly affected by the North Atlantic ocean currents. My main point is, if a statistically valid review of Met temp. records shows a material increase, it is a function of currents first, local cloud formation patterns (possible human impact) second, AGW/CO2, not so much.

  24. Well I tried to leave a post over there at John’s site. It wouldn’t take the long one and it wouldn’t take the short one; I only have so much patience.
    The problem with the data set is non-recoverable so why bother. It is nothing more than a violation of the Nyquist Sampling Theorem. You can’t use statistics to recover information from corrupt data. The data is likely a reasonably good record of that set of thermometers. In no way does that set of thermometers represent the temperature of the earth, or even the northern hemisphere, nor can it be used to.
    But don’t worry, neither does GISStemp or HadCRUT

  25. Having some free time, I invested in 100 thermometers, just to see for myself. I placed them all on my fenced property, plus or minus 1 hectare: 3 on each of 30 wooden poles 2 meters high, evenly spaced at 50 cm, 1.00 m and 1.50 m, with 15 poles in the shade and 15 in direct light (10 thermometers kept as spares). After reading 2 times in daylight and 2 times at night (same time each day) for 8 weeks, the 3 thermometers on a pole always showed different temperatures, the highest always cooler. From east to west, the temperature in my garden shows an average difference of plus or minus 1.03 C in sunlight and 1.8 C in shade; north to south shows 0.97 C and 1.67 C respectively. And my science is settled!

  26. @medic1532
    I’ve sailed across the Atlantic twice on the beautiful clipper Stad Amsterdam, and I’ve taken quite a few water and air temperature and other measurements and uploaded these via satellite to the Dutch KNMI (Royal Netherlands Meteorological Institute), but I must confess, from a scientific point of view, these were possibly quite inaccurate and subjective.
    To take the water temperature you dunk an insulated bucket overboard, haul it up, and stick a thermometer in it. Leave it for a few minutes (how long? well, that varies on how long it takes to smoke your fag). Next you read the temp, and note it. This wasn’t a digital one, but a mercury thermometer, so you could be off by half a degree depending on how good your eyesight is.
    Air temp was taken by swinging a wet bulb thermometer around for, well, how long you felt like, and reading it again depends on your eyesight.
    Cloud cover (percentage, type of cloud, height of cloud, etc.) was again very subjective (stand on deck and look around), the same with ocean swell (period, length of wave, wave height, etc.).
    So, if these types of measurements are done like this on the oceans worldwide, there is no telling how they were done, by whom, etc., so I personally would most certainly not consider these measurements to be anywhere near accurate.

  27. Whether this analysis is correct or not it provides this EE with a better understanding of what the temperature plots (or their creators) are trying to say.
    Also, while conceding Leif’s distribution argument, it is not at all obvious to me that quantity is irrelevant. As an example it would seem that the temperature of Antarctica would be better represented with 3 (reasonably distributed) measurements than with 2 or 1.

  28. What curve do you get if you just use the stations that have been reporting the full time period? If such stations exist, and if they don’t suffer from urban heat island effects, it would be an interesting check. In other words, don’t try to make it a global or hemispherical average, just track trends at that admittedly small sample of sites.

  29. “”” Leif Svalgaard (08:18:19) :
    The number of stations is not the correct parameter to use. Imagine that a small country [e.g. Denmark] suddenly gets the idea that more stations are good, so installs a million sensors throughout the smallish land area. Clearly that will do nothing for the error in the global average.
    The proper parameter must include the distribution of those stations. Something like the mean area around a station where there are no other stations. This problem has been studied at length in the literature [I can’t remember a good reference, off hand, but they exist]. “””
    Well one good place to start is;- “Digital and Sampled Data Control Systems” by Julius Tou (Purdue University).
    Or you can just google “nyquist sampling theorem”
    The problem is NOT a problem of statistics, and it cannot be solved by statistical methods; it is a problem of sampling theory; and there is no known solution (post facto).
    GISStemp and HadCRUT are accounts of GISStemp and HadCRUT respectively, which are data from a small number of thermometers. They do not represent the Earth’s surface or lower atmosphere temperature.

  30. Isn’t any analysis done using the “value added” data (riiiiight!) suspect? What we really need are the raw data and even then I doubt the coverage and accuracy are good enough that one could extract a meaningful “climate temperature signal” from it.

  31. Theodore de Macedo Soares (09:29:02) :
    John Graham-Cumming’s use of sample size to arrive at a margin of error is correct.
    I don’t think so, as values inside the sample are correlated and adding more stations does not decrease the error.

  32. Here’s an interesting experiment for some of you computer nerds to try when you have an evening free; maybe I’ll get my musician-savvy son to do this.
    Take your favorite CD recording of Beethoven’s Fifth Symphony; or even that shrieker Celine Dion, or Madonna.
    Play the thing through and store it on your hard drive. These days terabyte drives cost next to nothing.
    Then have your computer (you write the code) go through the data, and pull out every 200th digital sample of that piece of music. So maybe the disc is recorded at 88 kHz sampling rate or something like that, so you are going to end up with about 4.4 kHz rate of selected samples.
    Now have your computer play back those samples at the correct rate, about 4.4 kHz, and run it into your hi-fi system, and see how you like the result.
    Now if you are also a climatologist or statistician; get to work with your statistical maths and try to fix that sub-sampled music piece.
    Good luck !
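    A small Python sketch of what that experiment does to a single tone, assuming a pure 3 kHz test tone (a made-up choice) and the standard CD rate of 44.1 kHz, with every 20th sample kept so the new rate is about 2.2 kHz. The function name is hypothetical. After decimation the samples are indistinguishable from those of a much lower tone, which is why no amount of statistics applied afterwards can tell the two apart.

```python
import math

def alias_frequency(f, rate):
    """Apparent frequency of a pure tone f when sampled at `rate` (folded into 0..rate/2)."""
    f = f % rate
    return min(f, rate - f)

cd_rate = 44100            # CD audio sampling rate in Hz
step = 20                  # keep every 20th sample
new_rate = cd_rate / step  # 2205 Hz
tone = 3000.0              # hypothetical test tone, above new_rate / 2

# One second of the tone at the CD rate, then the sub-sampled version.
original = [math.sin(2 * math.pi * tone * n / cd_rate) for n in range(cd_rate)]
decimated = original[::step]

# A genuinely lower tone sampled directly at the new rate.
alias = alias_frequency(tone, new_rate)
impostor = [math.sin(2 * math.pi * alias * m / new_rate) for m in range(len(decimated))]

max_diff = max(abs(a - b) for a, b in zip(decimated, impostor))
print(new_rate, alias, max_diff)
```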

  33. John Graham-Cumming has posted an interesting analysis, he could benefit from some reader input at his blog.
    See here and below: http://www.jgc.org/blog/
    Adjusting for coverage bias and smoothing the Met Office data
    As I’ve worked through Uncertainty estimates in…

    Anthony: Your attribution is in good faith (not “ripped off”), but, as others have noted here before, it is sometimes difficult to tell where guest posts begin and end. It would be nice to see Mr. Graham-Cumming’s work clearly set off from the introduction. Blockquotes are a familiar convention in paper text, but with column constraints on the internet, maybe that uses up too much vertical space. And nobody (either you or readers) would want to deal with quotes within quotes. Italics raise other issues. Perhaps you’ve thought about (and discarded?) the idea of different text colors for guest posts.

  34. Maybe you should take every 20th sample instead of every 200th. Well I don’t think it makes much difference.

  35. Dr. Leif Svalgaard is right here. Take it to the limit where there are 2 temperature sensors. One is rural where rural is 98% of the land area. The other is urban where urban is the remaining 2% of the land area. The rural sensor shows no rise in day or night (max and min) temperature. The urban sensor shows no rise in the daytime but a 2 degC rise in the nighttime temperatures.
    If you average min and max you get 0 deg change for rural and 1 deg for urban. The average of both is then 0.5 deg which is true for urban but not true for 98% of the (rural) land area. The temperatures need to be weighted by the area they cover.
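    The same two-sensor example worked through in Python, using only the numbers given above (the function name is hypothetical). The simple average of the two sensors is 0.5 deg, while weighting each sensor by the share of land area it represents gives about 0.02 deg.

```python
def weighted_mean(values, weights):
    """Average of values, each weighted by the area it represents."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

changes = [0.0, 1.0]           # change in (min+max)/2: rural, urban (deg C)
area_fraction = [0.98, 0.02]   # share of land area: rural, urban

simple = sum(changes) / len(changes)
weighted = weighted_mean(changes, area_fraction)
print(simple, round(weighted, 4))
```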

  36. Leif Svalgaard (10:10:20) :
    I don’t think so, as values inside the sample are correlated and adding more stations does not decrease the error.
    Yes. I agree that if the samples are correlated, adding more such samples would not decrease the error.
    Sampling error can occur by chance or by bias in the collection of the samples. A confidence interval (CI) for a population proportion accommodates the first type of sampling error but not the second type. My point was that there is nothing wrong with deriving a CI solely on the basis of sampling error due to a small number of samples increasing the possibility of error due to chance. A resulting wide CI can, by itself, undermine possible conclusions.
    A sampling error due to bias such as correlated samples or non-sampling measurement type errors is another story and if not taken into account, as you suggest, can be much more problematic than chance type errors.
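    To put rough numbers on the correlation point, here is a sketch in Python. It assumes every pair of stations shares the same correlation rho, which is a simplification chosen for illustration and not a property of the real station network; the function name is hypothetical. Under that assumption the standard error of the mean of n readings with standard deviation sigma is sigma * sqrt((1 + (n - 1) * rho) / n).

```python
import math

def se_of_mean(sigma, n, rho=0.0):
    """Standard error of the mean of n equally correlated readings."""
    return sigma * math.sqrt((1 + (n - 1) * rho) / n)

for n in (10, 100, 1000):
    independent = se_of_mean(1.0, n)
    correlated = se_of_mean(1.0, n, rho=0.5)
    print(n, round(independent, 3), round(correlated, 3))
```

    With independent readings the error keeps falling as 1/sqrt(n); with rho = 0.5 it levels off near sigma * sqrt(0.5), however many stations are added.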

  37. Quote: “Having a free time I invested in 100 thermometers, just wanted to see for myself, I placed all in my fenced property, plus minus 1 hectare…”
    Fred Lightfoot, am just curious, were you able to purchase mercury thermometers where you live? I became frustrated several years ago in my engineering job at the university, when I found that I could not replace broken mercury thermometers with an identical model, as the mercury filled models had been banned by the EH&S Dept at the university. I was required to purchase the red dye alcohol filled types – I found they were not worth a damn for accuracy, and suffered separation problems. (For separated fluid, T meniscus > T actual. There’s a warming source, i.e. corrupted data.)
    I’ve never performed outdoor measurements to the extent that you did, but can tell you that a laboratory oven set to +37 deg C having an internal height of about 60cm will have a temperature gradient of about 8 deg C over the 60cm height, in the absence of forced convection. I was rather surprised the first time I discovered that, and ended up retrofitting all of our natural convection lab ovens to forced convection. Even with forced convection, the gradient was 2 to 3 deg C under the same conditions.

  38. Marc:
    “So, if these types of measurements are done like this on the oceans worldwide, there is no telling how they were done, by whom, etc., so I personally would most certainly not consider these measurements to be anywhere near accurate.”

    Maybe the errors would cancel out and some “signal” could be extracted statistically? Maybe statistical corrections could be made to compensate for the variable locations. It would be better than nothing, if it at least indicated long-term trends.

  39. I think JGC needs to take a few deep breaths before hitting the return key. Anthony has been very generous in the way he has shared this blog with other people who are trying to get the message out.
    Resolving problems with the temperature datasets is not going to happen overnight, and we really need to stick together on this. I wouldn’t have known about JGC’s work if it weren’t for this blog.
    That said I liked what he did with putting the grids on Google Earth using KML files, and it got me thinking if it would be worthwhile doing something like Leif is suggesting, a sort of polycell distribution rather than a gridcell distribution.
    Temp stations aren’t evenly distributed, but if each was allocated a polygonal area based on how close its neighbouring stations are, i.e. more stations equals smaller polygons, would that lead to a more accurate picture about temps?
    It would get over the land-sea problem with gridcells and it would also mean that temps could be calculated by country, then just divide the country area by the total land area to get its contribution to the overall temperature record. Then do the same with sea temps. It could then be compared to satellite records, and if it matched them closely, then it could be a valid way of analysing the temp dataset.
    I’ve glanced at the maths that would be involved in doing something like that, and it’s not pretty, but it would only have to be done once initially for all existing stations and then as new stations are added or subtracted, it would have to be redone for the neighbouring polygons.
    Would this be a valid way of providing an alternative dataset, or would it be too computationally heavy for what it added?

  40. So the Royal navy logged temperatures twice a DAY.
    WHEN? Take a look at ANY 24 hour temperature data on Weather Underground.
    Unless temperature is taken continuously, the error bounds on diary and ship log temperatures HAS TO BE THE ERROR BOUND of the timing of the taking of the temperature.
    This could be HUGE. (I.e., plus or minus say 5 to 10 C depending on the location and the time of the year.)
    Yet another “error” to put into the mix.
    This historical stuff is so much STUFF AND NONSENSE.
    Hugoson

  41. Marc:
    Sea Temperature Records Obtained From Bucket Collected Water Samples
    Having made sea temperature records from the contents of a bucket that had been thrown over the side of a ship, I would support Marc’s opinion that such records are of dubious accuracy.
    There are many things one wants to do in a force 9 storm at sea; making an accurate sea temperature measurement isn’t one of them.

  42. Theodore de Macedo Soares (11:34:30) :
    “A sampling error due to bias such as correlated samples or non-sampling measurement type errors is another story and if not taken into account, as you suggest, can be much more problematic than chance type errors.”
    It seems that when you compare their very small computed error with the wide natural variation that exists, a good deal of the problem comes from the 2nd type. And, from what we’ve learned from climategate, we can add deliberate bias for good measure.

  43. Anthony, if I were you (and to stop the author posting any more petulant comments) I’d just remove the entire thread and any links that go with it. Sometimes (I know from experience myself) you just can’t help some people – and they can’t see it! Amazing.

  44. The analysis claims an error of +/- 0.8 degrees, compared to the IPCC’s +/- 0.1 degree, back in 1850.
    Here’s a description of what actual temperature measurement was like back in 1850, long before the days of arguing the effects of dirt and degradation on a Stevenson screen with automatic recording:
    http://climate.umn.edu/doc/twin_cities/Ft%20snelling/1850sum.htm
    Even modern recording accuracy is generally +/-0.5 degrees.
    http://www.srh.noaa.gov/ohx/dad/coop/EQUIPMENT.pdf
    Might I suggest that both these estimates are extremely optimistic ?

  45. Looking at the Met’s graph it seems that increasing the number of stations from around 50 (in 1850) to over 1400 (in 1960) has negligible effect on their error bars.
    That seems ridiculous at first glance. But if it’s true let’s cut the number of stations back to 50 and make sure that they are really well sited.

  46. That JGC seems to be a surly and ungrateful SOB. We should allow him to luxuriate in his well-deserved obscurity.

  47. “”” jknapp (12:52:15) :
    Looking at the Met’s graph it seems that increasing the number of stations from around 50 (in 1850) to over 1400 (in 1960) has negligible effect on their error bars. “””
    Well your observation is correct; and it wouldn’t matter if there were 14,000 stations, it’s still not nearly enough.
    From Leif’s posting, I get the impression that in fact it is not the normal practice to associate each of these reporting stations with a certain land area around them that is presumed to have the same temperature as the thermometer (at all times).
    If that is the case, then there is no way to obtain a global average that has any meaning.
    I would put Tgl = S[T.A]/S[A] where my S is sigma, and A is the area associated with each sensor, and S[A] is the total surface area of the earth.
    If they are NOT doing that, then it fits in the GIGO folder.
    And as Anthony’s Station study has discovered; for the US stations at least, a very large fraction are on airport runways. Well of course they are there because the airport could give a rip about the climate; they want to know the weather; and specifically, the REAL TEMPERATURE on the runway, as that is of interest to a pilot trying to land or take-off on that runway. They want to know the temperature at the time they want to land, not what it might have been 24 hours ago.
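
The area-weighted average proposed in the comment above, Tgl = S[T.A]/S[A], can be written out as a short sketch. The function name and the numbers are made up for illustration; nothing here is taken from the actual station data.

```python
def area_weighted_mean(temps, areas):
    """Compute sum(T_i * A_i) / sum(A_i) for paired temperatures and areas."""
    if len(temps) != len(areas):
        raise ValueError("temps and areas must have the same length")
    total_area = sum(areas)
    return sum(t * a for t, a in zip(temps, areas)) / total_area

# Hypothetical example: one sensor stands for a small area, the other
# for an area three times larger.
temps = [10.0, 20.0]
areas = [1.0, 3.0]
simple_mean = sum(temps) / len(temps)            # 15.0
weighted_mean = area_weighted_mean(temps, areas)  # 17.5
```

The gap between the two results grows with how unevenly the sensors are spread, which is the commenter's point about associating each thermometer with an area.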

  48. Good points from George E. Smith.
    Discrete measurement systems need to take into account signal bandwidth. Get it wrong, and results might not just be slightly wrong, they can be downright misleading.
    Imagine the silhouette of a mountainous landscape. An aeroplane passes overhead with radar which measures altitude at discrete distances. Imagine the radar takes three samples as it crosses the mountains, but (unlucky) each sample just happens to fall into one of three valleys. Join the points together, and we have an image of a flat landscape. Totally wrong.
    The answer to “aliasing” is to reduce the distance between samples by 5 or 10 times in the above example, if this is what we need to have absolute confidence in our ability to spot the mountains and create a reasonable reconstruction of the landscape.
    But how do we know this at the outset? We need to look at the properties of the signal and make it part of the measuring system design. This is an issue of careful design.
    It is worth noting that the theoretical minimum sampling rate is 2 times the signal bandwidth. But in practical systems, it really needs to be 5 to 10 times.
    What about the spacing for sampling the putative global average temperature?
    We have seen how Darwin may sit more than a thousand km away from neighbouring measuring sites. What about the spatial distribution of measurements near the poles? Or the Pacific Ocean?
    I have no idea what is the spatial bandwidth of the climate system, and what distribution is necessary to avoid aliasing. I wonder what there is in the literature to give us comfort that this is not an issue with the temperature data. I haven’t seen anything.
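
The aliasing effect described in the comment above can be shown with a few lines of code. This is a generic signal-processing sketch with arbitrary frequencies chosen for illustration; it says nothing about the actual spatial bandwidth of temperature fields.

```python
import math

def sample_sine(freq_hz, rate_hz, n):
    """Return n samples of sin(2*pi*freq*t) taken at rate_hz samples per second."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

# A 9 Hz tone would need more than 18 samples per second; here it gets 10.
fast = sample_sine(9.0, 10.0, 20)
# A 1 Hz tone sampled at the same rate is comfortably within the limit.
slow = sample_sine(1.0, 10.0, 20)

# The undersampled 9 Hz tone yields the same numbers as an inverted 1 Hz
# tone, so the two cannot be told apart from the samples alone.
max_difference = max(abs(a + b) for a, b in zip(fast, slow))
```

Here `max_difference` comes out at the level of floating-point rounding, meaning the fast signal has been folded down to a slow one that was never present.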

  49. Excuse the typos in my last post. But I’d like to add another comment in support of some of the earlier posts on this thread.
    When I first looked at the above plots and saw the shaded areas reducing to near-zero, I just thought to myself “no way”.
    There is a huge difference between having 1000 measurements, and 1000 properly sampled data points with statistically independent noise terms.
    If you have the latter, you have the basic inputs to statistical analysis.
    If you have the former, you might not have as much real information as you think. And if you are wrong, statistical analysis will give you a misleading measurement of properties such as variance and standard error. True variance will be greater.
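
The difference between 1000 measurements and 1000 independent data points can be put into numbers with a small sketch. It assumes, purely for illustration, that every pair of measurements shares the same correlation `rho` (an equicorrelated model); real station data will not be that tidy, and the function names are hypothetical.

```python
import math

def effective_sample_size(n, rho):
    """Number of independent samples carrying the same information as n
    equally correlated samples with pairwise correlation rho."""
    return n / (1 + (n - 1) * rho)

def standard_error_of_mean(sigma, n, rho):
    """Standard error of the mean of n samples with standard deviation sigma
    and common pairwise correlation rho."""
    return sigma * math.sqrt((1 + (n - 1) * rho) / n)

independent = standard_error_of_mean(1.0, 1000, 0.0)  # about 0.032
correlated = standard_error_of_mean(1.0, 1000, 0.1)   # about 0.318
n_eff = effective_sample_size(1000, 0.1)              # about 9.9
```

Under this assumption, a modest correlation of 0.1 makes 1000 readings worth roughly 10 independent ones, so an error bar computed as if the readings were independent would be about ten times too narrow.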

  50. John Graham-Cumming…
    Also consider that having your post on a heavy duty site like WUWT is likely to result in a very nice traffic boost to your site and better discussion.

  51. Reading through all the comments here and on my blog, I still have two nagging concerns:
    1. It isn’t intuitive that with so few stations in the 1800s that the error in Brohan et al. appears to stay the same across the years. Although there’s another diagram (Figure 12: top) which shows just the land anomaly trend and that does have wider earlier errors although they do not appear to be caused by the number of stations. But even that’s a bit hard to tell because I can’t see clearly how the limited coverage error is combined with the other errors. It would be good to be able to see the underlying data, but http://hadobs.org/ doesn’t appear to have the limited coverage error data.
    2. I’m not sure how accurate the sub-sampling can be given that the samples are correlated (spatially).
    Here’s hoping my email to the Met Office gets answered. It’s frustrating not to fully understand this paper.

  52. George E. Smith (10:20:39) :
    Here’s an interesting experiment for some of you computer nerds to try when you have an evening free;
    have your computer (you write the code) go through the data, and pull out every 200th [20th] digital sample of that piece of music. So maybe the disc is recorded at 88 kHz sampling rate or something like that, so you are going to end up with about 4.4 kHz rate of selected samples.
    Now have your computer play back those samples at the correct rate, about 4.4 kHz, and run it into your hi-fi system, and see how you like the result.
    Ever heard of MP3 format Mr. Smith?
    or
    PASC (Precision Adaptive Sub-band Coding)
    or
    ATRAC etc.
    Compression for MP3 is 34MB to 3.4MB, 10:1; even at this compression the quality is difficult to tell apart from the original.
    CD 44.1kHz 16 bits – to – MP3 44.1kHz – to – 44.1kHz 16 bits is not lossless but it’s damn good quality.

  53. “”” Jordan (14:00:30) :
    Good points from George E. Smith.
    Discrete measurement systems need to take into account signal bandwidth. Get it wrong, and results might not just be slightly wrong, they can be downright misleading. “””
    In this case Jordan, we have a two variable system, space and time. We can be sure that the time variable contains a very solid 24 hour cyclic signal. Now a min/max thermometer is going to give you two samples a day, and that only satisfies Nyquist if the daily cyclic signal is purely sinusoidal; but it isn’t. Daily temperature graphs tend to show a fairly rapid warm up in the morning, and a slower cooldown in the evening, so there is at least a second harmonic component or 12 hour signal component. So a min/max thermometer already violates Nyquist by a factor of 2, which is all you need to fold the spectrum all the way back to zero frequency; so the correct daily average temperature is not recoverable from a min/max thermometer. And that doesn’t allow for varying cloud cover that will introduce higher signal frequencies beyond even a 12 hour sample time.
    So the climate data records are already aliased before you even consider the spatial sampling. Here in the bay area, we get temperature cycles over distances of 10 km or less; yet climatologists believe they can use a temperature reading to represent places 1200 km away.
    The problem is that most of them seem to be statisticians, and not signal processing experts, or even Physicists.
    Well no central limit theorem is going to buy you a reprieve from a Nyquist criterion violation, and no amount of linear or non-linear regression analysis is ever going to recover the true signal which has been permanently and irretrievably corrupted by in band aliasing noise.
    Other than that slight inconvenience, there isn’t any good data even improperly sampled data before about 1980, when the Argo buoys, and the polar orbit Satellites were first deployed.
    So HadCRUT and GISStemp belong on the ash heap of history; they are not even worth fixing.
    UHIs cause no problems unless the sampling regimen is improper.
    So to me it hardly matters if the playstation video games are any good; the data that goes into them is true garbage anyway. You would think that somebody like Gavin Schmidt would have heard of sampled data system theory, and the Nyquist Sampling theorem; but he sure doesn’t act as if he has.
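
The min/max point made in the comment above can be checked on a synthetic daily cycle. The curve below is invented for illustration (a 24 hour sine plus a 12 hour harmonic, in degrees C); it is not fitted to any real station, and the variable names are assumptions of the sketch.

```python
import math

MINUTES_PER_DAY = 1440

def temperature(minute):
    """Synthetic daily cycle: a 24 hour fundamental plus a 12 hour harmonic."""
    theta = 2 * math.pi * minute / MINUTES_PER_DAY
    return 15.0 + 4.0 * (math.sin(theta) + 0.5 * math.cos(2 * theta))

# One reading per minute stands in for a continuous record.
readings = [temperature(m) for m in range(MINUTES_PER_DAY)]

true_mean = sum(readings) / len(readings)
# What a min/max thermometer would report as the daily "average".
min_max_midpoint = (min(readings) + max(readings)) / 2
```

For this particular curve the true daily mean is 15 while the min/max midpoint is 13.5, so the harmonic alone shifts the reported average by 1.5 degrees. Whether such a bias matters for trends, rather than absolute values, depends on whether the shape of the daily cycle changes over time.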

  54. Slightly off topic but John G-C’s blog on the “hide the decline code” is the most thoughtful and credible analysis I have seen. You can read it on his blog archive for November. Thanks John 🙂

  55. John Graham-Cumming (14:38:26) :
    To be honest, it is not only not intuitive for the error to remain the same but downright incorrect.
    Consider where the stations were & how it has changed over time. E.M. Smith has done a lot of work on this.
    To be honest, I don’t even see satellites being a big improvement as I can’t see how they can take more than two (2) measurements at any given location per day.
    DaveE.

  56. Thanks for the reference steven mosher.
    I have had a quick look over the paper, but I’m not sure we’re on the same page.
    The paper talks about estimation error for two distinct directions: (1) find the best gauge locations, or (2) work with the fixed gauge locations we have. The paper examines optimal filtering from the starting point of (2).
    The question (from George and myself) has more to do with (1): if we were starting with a clean sheet, what design criteria would we use to determine the spatial distribution of measuring stations for the putative global T? The sampling theorem would have a central role to play in determining the “gap” we could suffer between stations.
    So the paper looks like an interesting analysis of optimal filtering, assuming we can live with T(r,t). But T(r,t) could lack meaning due to spatial aliasing.
    Looking at it another way – those who have questioned the MWP by talking about a “local phenomenon” are basically asserting that spatial aliasing distorts our view of the past. It’s a fair point, but works both ways.

  57. NickB. (08:43:15) :
    Medic1532 (08:26:21) :
    Naval data – logged every watch change – ie 4 hours. Proper obs (weather observations) done by a percentage of the British merchant fleet every 6 hours – on the synoptic hour.
    When I’ve asked the question before I have been pretty much brushed off with ‘it’s amateur observers and therefore rubbish’, which in itself is rubbish. Yes, some obs were flogged, but in my personal experience at least 90% were good to excellent. (Basically as the Radio Officer I refused to send rubbish, which occasionally caused some interesting relationships, and there were other ‘sparks’ around with the same attitude. There were of course also those who just made it all up – perhaps they got jobs in East Anglia ?)
    So I still want to know WHY THE HELL HAS NO ONE ABSTRACTED THE DATA ? Perhaps they are too scared of finding some inconvenient truths.

  58. bill (15:15:22) :
    Ever heard of Variable bit-rate bill? That’s where the sample rate is changed by the Nyquist theorem according to the amount of relevant data in the samples
    DaveE.

  59. Fred Lightfoot (09:47:37) :
    Having free time I invested in 100 thermometers, just wanted to see for myself, I placed all in my fenced property, plus minus 1 hectare, I placed 3 on 30 wooden poles 2 meters high, … garden in sunlight shows plus minus 1.03C average difference in shade 1.8 C difference from east to west, north to south shows 0.97C and 1.67C respectively, and my science is settled!

    Assuming you calibrated the thermometers, melting distilled water, boiling distilled water corrected for baro pressure, water triple point cell, etc., to what do you attribute the variation ?
    As these measurements were all taken after the industrial revolution, we might suspect increasing carbon dioxide, but siting problems should be taken into account. In the garden, are there snow plants or firethorns ? Sunflowers ?
    UHI is unlikely, discounted even on larger scales by such authorities as Hadley CRU and GISS. Another consideration is instrument precision. Fahrenheit thermometers are more precise than Celsius, and for spatial distribution of your sites, acres are more precise than hectares.
    You do make a good point about changes in “global” temperatures. Given the range of your data set (Lightfoot 09), it’s hard to get concerned about 0.6 °C over a century.

  60. AlanG (11:09:06) :
    …” Take it to the limit where there are 2 temperature sensors. One is rural where rural is 98% of the land area. The other is urban where urban is the remaining 2% of the land area.”
    The problem is more basic. Take 2 sensors near the equator. Monitor temperature trends for a couple of decades. Then add a sensor at the north pole. The average temperature drops dramatically when the 3rd sensor is added to the average.
    While not so dramatic as the above example, the real history of thermometer measurements is that the Arctic and Antarctic regions were undersampled until relatively recently.
    That’s why anomalies are used. While not perfect, using changes from baselines does reduce the effect of changes in locations of thermometers.
    The same problem happens when there are missing readings. Just doing simple averages of all available temperature measurements can lead to confusing, erroneous results.
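
The three-sensor example in the comment above can be worked through in code. All numbers are invented: two equatorial stations and one polar station that starts reporting in year 10, each warming at the same assumed rate. The baseline years are likewise an assumption of the sketch, chosen so that every station has data in them.

```python
from statistics import mean

TREND = 0.02  # assumed warming per year, identical at every station

# Hypothetical records: year -> temperature in deg C.
stations = {
    "equator_a": {y: 27.0 + TREND * y for y in range(0, 20)},
    "equator_b": {y: 28.0 + TREND * y for y in range(0, 20)},
    "pole": {y: -20.0 + TREND * y for y in range(10, 20)},  # joins in year 10
}
BASELINE_YEARS = range(10, 15)  # period in which all three stations report

def yearly_mean_absolute(year):
    """Plain average of whatever absolute temperatures exist that year."""
    return mean(rec[year] for rec in stations.values() if year in rec)

def yearly_mean_anomaly(year):
    """Average of each station's departure from its own baseline mean."""
    anomalies = []
    for rec in stations.values():
        if year in rec:
            baseline = mean(rec[y] for y in BASELINE_YEARS)
            anomalies.append(rec[year] - baseline)
    return mean(anomalies)

absolute_jump = yearly_mean_absolute(10) - yearly_mean_absolute(9)
anomaly_jump = yearly_mean_anomaly(10) - yearly_mean_anomaly(9)
```

With these made-up records the plain average falls by almost 16 degrees when the polar station appears, while the anomaly average moves by 0.02, the assumed yearly trend. Real stations do not share one trend, so anomalies reduce the problem rather than remove it, as the comment says.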

  61. George E. Smith
    We agree on so much.
    There are huge issues in the time dimension. I still take the view that where arbitrary “homogenisation” turns a negative trend into a positive, we have nothing more than an elaborate way of saying the historic data is junk.
    I agree with your point that the spatial dimension may not have had adequate attention from a sampling perspective. Who has done the analysis to satisfy us that there is no distortion of the signal due to spatial aliasing?
    I can appreciate many of the more detailed points you make. But let’s leave that as a challenge to the literature – show us the papers which have addressed and answered these issues. If there are none, it has to be a matter for further research. But it would leave little reason to pay attention to aggregation of the surface network.
    “Well no central limit theorem is going to buy you a reprieve from a Nyquist criterion violation, and no amount of linear or non-linear regression analysis is ever going to recover the true signal which has been permanently and irretrievably corrupted by in band aliasing noise.”
    Yep. Worth repeating.

  62. There is a tendency in many of these comments to conflate the requirements for obtaining an accurate mean global temperature with those for obtaining a mean temperature *trend*. The latter does not depend on the former, and it is the latter which is of interest for AGW theory. Even a fairly small number of globally well-distributed stations (in terms of lat/long and altitude) which have long-term records could shed useful light on the trend question. Or as supercritical suggested, stations with shorter-term records could reveal decadal trends, which could be combined to give a long-term trend.
    Obtaining an “accurate” global mean is probably insuperably difficult. Obtaining a plausible global trend is probably doable.

  63. George E. Smith (10:20:39) :
    Better still..
    Try to reconstruct Mozart’s 19th piano concerto by taking 1 out of every 40 notes in succession from the manuscript…

  64. George E. Smith (10:20:39) :
    . . . have your computer (you write the code) go through the data, and pull out every 200th [20th] digital sample of that piece of music. So maybe the disc is recorded at 88 kHz sampling rate or something like that, so you are going to end up with about 4.4 kHz rate of selected samples.

    Given my tinnitus from thousands of hours piloting noisy airplanes, anything above 4.4 kHz is wasted on me. I think ordinary telephone (copper wire) bandwidth is around 3 to 4 kHz. All the data in conversations is carried well below 1 kHz, so does your point have to do with sample size or station distribution?

  65. There are many means of computing statistical uncertainties.
    1- The most common way assumes that noise (or error, etc.) follows a normal (Gaussian) law. The more data you get, the more (overall noise / number of data) averages toward 0. Of course this is a pure bet, since in many cases noise does not follow a normal law … but classical statisticians often simply forget this caveat! For example, unknown phenomena affect all sensors the same way at related times: thermometer boxes are repainted in the same way, box shapes change and react differently depending on winds, urban heat islands grow more or less simultaneously, etc.
    2- Another way to address uncertainties related to data inhibition (withholding a sensor’s data) is to guess (extrapolate, model, etc.) the data of one sensor (e.g. a thermometer) from the data of other sensors (e.g. thermometers). This way you can compute an uncertainty related to the inhibition of one sensor, then a second one, etc. This more empirical way is much more costly and should be preferred only when you have the time and money to design the cross-models and data from many sensors.
    3- An intermediate way consists of computing the delta series of the (geographically averaged) mean temperatures when you inhibit the data of one or another sensor. You can do it with 3a- simple means or preferably 3b- geographic-surface-weighted means. Either way you get a fast evaluation of robustness against sensor inhibition.
    Method 3 is much simpler, easier, faster, more neutral and more approximate than method 2. So it is sub-optimal from a mathematical point of view but may prove more reliable from a management point of view when facing human biases (guess what I mean following CRU, Climategate etc.).
    In cases 2 and 3b, strong robustness is plausible if the remaining sensors are sufficient for guessing/extrapolating (case 2) or locally averaging/interpolating (case 3b) the data of each inhibited sensor.
    Although I have not read the article you mention, I guess (?) from some Climategate reading that the way CRU works relies on geographic-surface-weighted means (case 3b), using Voronoi/Delaunay diagrams for computing the relevant surfaces. Such diagrams associate a given location with the 3 sensors drawing the smallest triangle around it. Then some weighted average is computed (possibly with 1/distance to each of these 3 sensors).
    cf. CBS article:
    http://www.cbsnews.com/blogs/2009/11/24/taking_liberties/entry5761180.shtml
    They do not seem to be happy with their Delaunay diagram(s) because it has turned out to be too loose, at least in some World regions.
    Regards,
    Xavier DRIANCOURT
    PhD machine learning, etc.
    On a more general basis, there is a general statistical theory of robustness evaluation and its use for tuning statistical systems. See Vladimir VAPNIK, Léon BOTTOU, et al. (US) on this. They follow some previous USSR work on regularisation (i.e. stabilisation) of unstable systems; see TIKHONOV et al. on this.
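
Method 3a from the comment above (recompute the simple mean with one sensor withheld at a time) can be sketched in a few lines. The function name and readings are hypothetical, and this is not claimed to be what CRU or Brohan et al. do.

```python
def leave_one_out_deltas(values):
    """For each sensor, return (mean without that sensor) - (mean of all)."""
    if len(values) < 2:
        raise ValueError("need at least two sensors")
    full_mean = sum(values) / len(values)
    deltas = []
    for i in range(len(values)):
        rest = values[:i] + values[i + 1:]
        deltas.append(sum(rest) / len(rest) - full_mean)
    return deltas

# Hypothetical readings from three sensors in one year.
deltas = leave_one_out_deltas([10.0, 12.0, 20.0])
# Largest absolute shift: a quick measure of sensitivity to any one sensor.
worst_case = max(abs(d) for d in deltas)
```

Repeating this for every year gives the "delta series" the comment describes. Method 3b would replace the simple mean with a surface-weighted one.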

  66. “”” P Wilson (16:15:14) :
    George E. Smith (10:20:39) :
    Better still..
    Try to reconstruct Mozart’s 19th piano concerto by taking 1 out of every 40 notes in succession from the manuscript… “”
    Hey that works for me PW; as I recall at least from “Amadeus”, Mozart reportedly said his works had just the right number of notes.
    I can assert that it is OK to play the Symphony #41 in C Major (the Jupiter), and leave out the second clarinet part, and nobody will notice; well they won’t even notice if you leave out the first clarinet part.
    But your example may be even better than mine. If anybody can hum the clarinet part in the Jupiter symphony, give me a buzz.
    And for Mike McMillan, my experiment was to point out the folly of sampling at too low a rate. The example that everybody is familiar with is the movie or TV Horse Opera with the damsel in distress in a runaway chuck wagon, with the wheels wildly turning backwards. At 24 samples per second (movie) or 60 (50 in Europe) for standard TV, the moving wheel spokes represent a time varying signal that has a frequency higher than half the sample rate, bearing in mind that the replacement of one spoke by its neighbor constitutes one cycle. If you sample at exactly the spoke frequency, which is half of the Nyquist rate, the spokes appear stationary, and the stationary spokes (the average condition) could appear in any phase, so the error in spoke position could be anywhere in the amplitude of the out of band signal, in this case the spokes moving too fast for the frame rate.
    The sampling theorem does not require equal spacing of samples, so random sampling is ok, so long as the maximum sample spacing is no larger than a half cycle of the highest signal frequency. So random sampling is less efficient (you need more sample points) but it has some advantages, and can eliminate the degenerate case of sampling at exactly twice the signal frequency.
    For example, if you have a pure sinusoidal signal at a frequency f, and you sample at exactly 2f, that satisfies the Nyquist criterion, but it could happen that all the samples happened at the zero crossing points, in which case the reconstruction would be zero signal. But if you happened to sample at the positive and negative peaks, you would recover the correct signal amplitude (but you wouldn’t know it was correct). Random sampling would slew through the whole cycle and eventually reproduce at least a repetitive signal, so random sampling has been used to advantage in sampling oscilloscopes for many years.
    With climate data gathering from ground stations you automatically end up with random spatial sampling, but unfortunately you don’t have enough samples by orders of magnitude to correctly recover the complete continuous global temperature map or even its average, over time and space.
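
The degenerate case described in the comment above, sampling a pure sinusoid at exactly twice its frequency, is easy to reproduce. The sketch is generic; the helper name is an assumption, and the signal frequency drops out because only the phase of the sample points matters.

```python
import math

def sample_at_twice_frequency(phase, n=16):
    """Sample sin(2*pi*f*t + phase) at a rate of exactly 2f.

    The k-th sample falls at t = k / (2f), so its value is
    sin(pi*k + phase) whatever f is.
    """
    return [math.sin(math.pi * k + phase) for k in range(n)]

# Samples land on the zero crossings: the signal seems to vanish.
at_zero_crossings = sample_at_twice_frequency(0.0)
# Samples land on the peaks: the full amplitude is recovered.
at_peaks = sample_at_twice_frequency(math.pi / 2)

apparent_amplitude_zero = max(abs(x) for x in at_zero_crossings)
apparent_amplitude_peak = max(abs(x) for x in at_peaks)
```

The same signal yields an apparent amplitude of essentially 0 or 1 depending only on where the samples happen to fall, which is why practical systems sample well above the theoretical minimum.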

  67. “”” bill (15:15:22) :
    George E. Smith (10:20:39) :
    Here’s an interesting experiment for some of you computer nerds to try when you have an evening free;
    Ever heard of MP3 format Mr. Smith?
    or
    PASC (Precision Adaptive Sub-band Coding)
    or
    ATRAC etc.
    Yes I have; does the algorithm I suggested perform Precision Adaptive Sub-band Coding ?
    The human ear is well known for being able to find intelligence in the most garbled sounds. Early encoding using things like audio spectrum inversions and such were found to still leave voice messages intelligible to a trained ear.
    What we are interested in with climate data, is in being able to recover the correct signal; not one that is similar in some respects.
    Can MP-3 encoding be done live in real time, or does it require fore-knowledge of what information is coming next? How would we disperse global temperature sampling stations spatially so as to be able to MP-3 encode them to reduce the amount of data ?
    If we can do that it would be a good idea.

  68. Sony were criticized for deliberately falsifying error correction bits as a form of copy protection on their audio CDs.
    If an audio CD player saw the error it would just repeat the previous sample or maybe interpolate but a computer needs to have the exact number so would make a few attempts at reading and then crash out.
    Very few people, if any, noticed any degradation in the music which is just as well because audio CDs would exhibit this sort of behaviour quite frequently due to misreads even with the correct error correction.

  69. what data can you trust anymore?
    take any real (unadjusted) data from the time thermometers were accurate for a baseline, and work to the present time.
    we don’t need 0.1 accuracy to spot trends. trends don’t mean anything anyway. we’re just chasing our tail aren’t we?
    if we’re trying to predict the future we’re fooling ourselves, if we’re trying to understand the climate let’s collect data.

  70. About the Met: surprisingly, the errors (errare humanum est) go only the “good” way, cooler for the Medieval Optimum and hotter in this case.
    The Met, which as we all know is in business with the barbecue industry, should remember
    Perseverare diabolicum

  71. The update posted by John Graham-Cumming rings distant bells from my question to the BoM a couple of weeks ago:
    “While analysing monthly temperatures in Western Australia for the past 12 months based upon data on the BoM website, I noticed that according to my records the August 2009 data for all observation sites in WA has been adjusted at some time since that month (November 17 I believe). The adjustment has resulted in the mean min and mean max increasing by an average .5 degrees C at all sites for August 2009. The adjustment at almost all sites was a uniform increase for both min and max… i.e. if the min went up by .4, so too did the max. If the min went up by .5, so too did the max. Could you please let me know what caused the adjustment?”
    Reply:
    “Thanks for pointing this problem out to us. Yes, there was a bug in the Daily Weather Observations (DWO) on the web, when the updated version replaced the old one around mid November. The program rounded temperatures to the nearest degree, resulting in mean maximum/minimum temperature being higher. The bug has been fixed since and the means for August 2009 on the web are corrected.”
    There seem to be so many bugs in Australian data that it needs insect spray.
    I’ve just uploaded linear graphs and source data for 24 Western Australia surface stations mostly dating to pre-1900 showing the official trendlines according to the historic BoM data, the High Quality data homogenised by the BoM, the GISS records and the HadCRUT3 data, where available:
    Albany
    Bridgetown
    Broome
    Busselton
    Cape Leeuwin
    Cape Naturaliste
    Carnarvon
    Derby
    Donnybrook
    Esperance
    Eucla
    Eyre
    Geraldton
    Halls Creek
    Kalgoorlie
    Katanning
    Kellerberrin
    Marble Bar
    Merredin
    Perth
    Rottnest Island
    Southern Cross
    Wandering
    York

  72. It would appear that different people recording temperature readings on different thermometers in different kinds of places in different seasonal conditions would provide a varied report on local temperature conditions (as much as 8 degrees F).
    Now how in the world can anyone tease 0.6 degrees F of long term climate change from this record? And temperature is only one leg of the total energy in the environment.
    I believe that it has already been demonstrated that when you add a starting data set to a to-be-averaged string and a set to the end to complete the averaging, you get an uptick (hockey stick) at the end.
    I think Roy Spencer and George E. Smith may have it right: temperature is a local result of all the energy conditions in the system.

  73. steven mosher
    Thanks again for the further reference. I did notice the cross-reference to North, but focused my attention on the paper you suggested. I’ll have a look at the North paper today.
    George E Smith
    “For example, if you have a pure sinusoidal signal at a frequency f, and you sample at exactly 2f, that satisfies the Nyquist criterion, but it could happen that all the samples happened at the zero crossing points, in which case the reconstruction would be zero signal. But if you happened to sample at the positive and negative peaks, you would recover the correct signal amplitude (but you wouldn’t know it was correct).”
    Again, fair points. However we should be much more comfortable with the mechanical min/max thermometers as they have an underlying continuous measurement. No?

  74. The data for Broome station shows no trend at all. This must be excluded from the official global dataset.

  75. steven
    I have had a look at two papers by North. I get the gist of his analysis, where it is coming from and where it ends. But it is still in the realms of measuring statistical aggregates, and not issues around how to reconstruct a signal from sampled data – as that’s what the above posts are about.
    I’m sure we can all agree that knowledge of the statistical properties of a signal does not allow us to reconstruct the signal.
    Thinking about it, failure to observe the Nyquist sampling period (in time or in space) can have an impact on how we measure statistical properties. Injudicious sampling interval could result in a need to increase the sampling interval, since more samples would be needed to counter the effects of aliasing. But other than that, I would be confident that the statistical properties should emerge from sample data, eventually.
    That is not the case when we seek to reconstruct a signal from sampled data. Randomly scattering a minimum number of point measuring stations around the globe is not a sufficient condition to enable us to reconstruct a global signal.
    It is essential to assess and then comply with a minimum distance between measuring points in order to avoid spatial aliasing. Failure could result in a seriously distorted impression of a trend in the global signal.
    To repeat, those who claim the MWP was a “local effect” are making exactly the same point.
    And – as somebody mentioned above – siting measuring stations at airports and other built-up areas could be another example of distortions being introduced by spatial aliasing. A way to address this kind of issue would be to design into the measuring system a decent number of point-samples between the airports and cities. With that, we would have a chance of being able to re-construct a picture of the true temperature field.
    And that’s before we even think about polar ice caps and the Pacific Ocean.
    In the absence of the matter being addressed in the literature, this looks like a serious gap in how we have approached the reconstruction of the putative global temperature trend.
    George makes a good point – can we be satisfied that the global temperature trends have not been reduced to junk due to spatial aliasing?

  76. I don’t see how the sampling theorem is relevant. We are not interested in exactly recreating daily temperature variations, but estimation of variations over decadal time frames (besides, if you have a priori information that there is a 24-hour signal with harmonics, the sampling theorem doesn’t really apply, does it). That said, I certainly agree that there are large errors due in part to poor sampling methods, and the HadCRUT3 error estimates seem to be off by an order of magnitude or so.

  77. For Peter dtm
    When I was Apprentice, 3rd. Mate and 2nd. Mate, I served on several Weather Reporting Ships and agree with your post. All the Mates tried very hard to get it right and Sparks always tried to get the message away timeously.
    Although it might have been hard to take readings in a force 9 or more, those days were few and I would say that 98% of our reports were as good as the instruments would let them be.
    The raw data must be somewhere; Portishead might know where their log records are now.

  78. Jordan, I think the sampling of an audio signal is very misleading.
    The spatial field is highly correlated. It’s boring music.

  79. Toho (08:26:18) :
    “I don’t see how the sampling theorem is relevant. We are not interested in exactly recreating daily temperature variations, but estimation of variations over decadal time frames”
    It is important to get the temporal sampling regime right, although we also need to consider spatial sampling and risk of aliasing.
    Take an event like ENSO – a phenomenon which evolves over several months, shifting thermal energy over wide regions. It is not felt equally over all parts of the globe.
    Spatial distribution of the measurement network is surely a crucial factor in our attempts to create an accurate picture. Get it wrong and the picture could be completely wrecked by spatial aliasing. (Something I often wonder, when looking at 1998.)
    The measurement network seems to change almost continuously (above plot of number of stations). If there was an identical “1998” at a different time, to greater or lesser extent, the temperature reconstruction would produce a different impression – solely due to changes in the network.

  80. It is neither standard deviation nor standard error, it is standard fraud. Eliminate the fraud and the data will make more sense.

  81. Jordan:
    I agree with most of what you write. But my point is that it doesn’t follow from the sampling theorem. The sampling theorem is about recreating exactly. Sure, you are going to lose information when sampling, and that will cause errors in the temperature estimates. I certainly agree with that. But it has nothing to do with the sampling theorem. I don’t think aliasing is a big deal by the way, because localized atmospheric energy will not stay localized for long.

  82. I’ve found another interesting analysis. Apols if it has been noted before:
    http://strata-sphere.com/blog/index.php/archives/11932
    The best bit is the final graph:
    http://cdiac.ornl.gov/epubs/ndp/ushcn/ts.ushcn_anom25_diffs_urb-raw_pg.gif
    If you look at it, it looks very much like the increase in temps we have been told is going on. But it’s not. It is the changes made to the raw data sets to produce the ‘adjusted’ data sets.
    Conspiracy? Where’s my working? Well, this is their graph, not an independent one! It comes from http://cdiac.ornl.gov/ itself.
    So this tells us, in no uncertain terms, and completely unambiguously as far as I can see, that the increase of 0.6C is entirely fabricated. True I have not searched for a justification of that fabrication – it may be valid, but it should be reported as such!

  83. “”” Toho (08:26:18) :
    I don’t see how the sampling theorem is relevant. We are not interested in exactly recreating daily temperature variations, but estimation of variations over decadal time frames (besides, if you have a priori information that there is a 24-hour signal with harmonics, the sampling theorem doesn’t really apply, does it). That said, I certainly agree that there are large errors due in part to poor sampling methods, and the HadCRUT3 error estimates seem to be off by an order of magnitude or so. “””
    Well then you don’t understand the sampling theorem. Standard Sampled Data theory shows that an out of band signal at a frequency B + b sampled at a rate 2B results in an in band error signal at a frequency B-b, which cannot be removed by any filter without also removing valid in band signals at the same frequency. Violation of the Nyquist criterion by a factor of only 2, as in sampling an out of band signal at a frequency of B + B or higher, results in an aliased signal at B-B, which is the average value of the function.
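    The folding described above can be checked numerically. The following is a minimal sketch, not anything taken from the HadCRUT processing: the band limit B = 4 Hz, the offset b = 1 Hz and the helper name sample_cosine are all illustrative assumptions. It samples a tone at B + b and a tone at B - b at the rate 2B and compares the two sample sequences, then samples a tone at exactly 2B to show it landing on zero frequency.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Return n_samples of cos(2*pi*freq*t) taken at t = n / sample_rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

B = 4.0        # assumed band limit in Hz (illustrative only)
b = 1.0        # how far the out-of-band tone sits above B
rate = 2 * B   # sampling at exactly twice the band limit

out_of_band = sample_cosine(B + b, rate, 32)   # 5 Hz tone, violates Nyquist
in_band = sample_cosine(B - b, rate, 32)       # 3 Hz tone, legitimately in band

# The two sample sequences coincide, so nothing applied after sampling
# can tell the alias apart from a genuine 3 Hz signal.
max_diff = max(abs(x - y) for x, y in zip(out_of_band, in_band))

# A tone at 2B (here 8 Hz) sampled at 2B lands on zero frequency:
# every sample is the same constant, which shifts any computed average.
dc_alias = sample_cosine(2 * B, rate, 32)
```

    Here max_diff is zero to within rounding, and every entry of dc_alias is the same constant, which is the "aliased signal at B-B" case.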
    That means that if the data are undersampled, either temporally or spatially, over the baseline period (30 years or whatever) that is used as the reference value for the “anomaly” calculation, that baseline average is corrupted by zero frequency aliased noise components. It doesn’t matter that one may only be interested in trends; the trends are illusionary anyway. Anybody can see that plots of temperature anomalies over various time scales exhibit fractal like properties; longer and longer plotting intervals show “trends” of greater and greater extent over longer and longer time frames.
    I don’t understand your comment that a priori knowledge of a 24 hour cycle with harmonics somehow dismisses the sampling theorem. Current methodology records a daily min/max temperature. That is a twice daily sampling rate, which only satisfies Nyquist in the event that the daily cycle is a pure sinusoidal function. If it is not sinusoidal in waveform and is still periodic, there must be at least a second harmonic 12 hour periodic signal present, and 12 hour sampling of that violates Nyquist by a factor of two, rendering the average contaminated by aliasing noise.
    Now as it turns out, min/max recording rather than simple 12 hour sampling eliminates the degenerate case of sampling at exactly twice the signal frequency, which can identify the presence of the signal but not its amplitude; for example, when samples are taken exactly at the zero crossings, so recording zero. The min/max strategy does at least record roughly the correct amplitude, but not the correct time average, due to the presence of the 2F or higher signal components. So no, that foreknowledge does not erase the need for the sampling theorem.
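    The point about min/max averaging can be illustrated with a made-up daily cycle. This is only a sketch under stated assumptions: the waveform (mean 10, a 24 hour fundamental of amplitude 5 and an optional 12 hour harmonic of amplitude 2) and the function names are hypothetical, not measured data. It compares (Tmax + Tmin)/2 with the true time average, with and without the second harmonic.

```python
import math

def daily_temperature(hour, second_harmonic_amp):
    """Hypothetical daily cycle: mean 10, a 24 h fundamental, plus a 12 h harmonic."""
    theta = 2 * math.pi * hour / 24.0
    return 10.0 + 5.0 * math.cos(theta) + second_harmonic_amp * math.cos(2 * theta)

def midrange_and_mean(second_harmonic_amp, steps_per_day=1440):
    """Return ((Tmax + Tmin) / 2, true time average) over one day at 1-minute steps."""
    temps = [daily_temperature(24.0 * k / steps_per_day, second_harmonic_amp)
             for k in range(steps_per_day)]
    midrange = (max(temps) + min(temps)) / 2
    mean = sum(temps) / len(temps)
    return midrange, mean

pure_mid, pure_mean = midrange_and_mean(0.0)   # pure sinusoid
harm_mid, harm_mean = midrange_and_mean(2.0)   # with a 12 h component added
```

    For the pure sinusoid the two numbers agree. With the 12 hour component present the true average is unchanged, but the min/max midpoint sits roughly 1.7 degrees above it for these particular made-up amplitudes.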
    Pachauri’s silly plot that Viscount Monckton called him out on in Copenhagen is a great demonstration of that fact. Statisticians are kidding themselves when they claim to be extracting additional information by applying regressions and filterings like running five year averages. We are talking about a chaotic function that never ever repeats; it only happens once; so what statistical significance is there in anything that only happens once?
    Yes it is true that the common mathematics of statistics can be applied to sets of totally unrelated numbers; and averages, medians, standard deviations, or any other buzzword of standard statistics can be applied to number sets with no relational significance whatsoever.
    That does not mean that the mere mechanics of doing that somehow will reveal “information” which was never in there in the first place.
    It is often stated that White Noise contains more “information” than any other signal; it is totally unpredictable, and no matter how long a string of white noise values you collect and process, you can never learn anything about the very next value to come along. In that sense the incoming signal is 100% information about itself.
    The anomaly concept may seem advantageous, and it certainly does have some merits as to incorporating new stations into an existing network. But that is about the only merit.
    Consider a solid sphere that is enclosed in a close fitting rubber (latex) skin so the skin is in contact with the sphere everywhere, but is just barely stretched, so it touches the sphere at every point.
    Now take a hold of the skin at any point, and pull it away from the sphere by some small distance. The skin can then be stretched in any direction to move the point you have a hold of over to some other point on the sphere, and then the skin can be released at that point, so the skin is distorted, and points around the moved point are all moved to new locations.
    It can be shown that no matter how the skin is stretched and moved, there must always be at least two points somewhere on the skin that have not moved at all, and are still in their original locations. No matter what contortions are applied there will always be at least two stationary points. Of course those points change for every different application of the stretch operation.
    Now that simple problem in topology is not unlike the mapping of temperature around the globe and noting anomalies at each point. A zero anomaly report (at any point) is like one of the stationary points on the latex skin. The fact that some points did not move, and therefore record zero anomaly, does not grant a licence to assume that neighboring points also remained stationary, and would yield a zero anomaly, in the event that they too were actually sampled.
    The complete global temperature continuous function (at any time epoch) is exactly analogous to the latex skin on the sphere. Adjacent points can be stretched away from some stationary point, and nothing can be known about their locations (in the temperature anomaly realm) without actually taking a sample there.
    In the case of temperatures the greater difference between adjacent points is like a greater stretching of the rubber skin. In weather terms, such differences lead to the development of winds; but nothing can be learned about that unless all of those points are sampled. Anomalies yield no information that can be used to map wind patterns or virtually any other weather phenomenon, and after all, climate is supposed to be the long term average of weather.
    On another issue, Bill had raised the concept of MP3 encoding, presumably as an argument counter to my suggestion to sub sample a digital data stream such as a music piece for example. Here Bill is confusing “data compression” with “data acquisition”.
    MP3 is an adaptive encoding of data THAT HAS ALREADY BEEN RECORDED.
    The encoding algorithm reacts to prior knowledge of data that has yet to arrive to be processed. Similar concepts were already being applied in the 1960s in the recording of long playing 33 1/3 RPM phonograph (gramophone) records. During low level passages being cut on the master disc, the recording groove spacing was reduced to place the grooves closer together. When a louder passage was about to be recorded, the cutting lathe increased the groove spacing prior to the arrival of the loud signal, and that allowed the recording of a larger dynamic range than earlier constant spacing recording, and it also allowed different frequency compensation curves that gave improved signal to noise ratio, such as the standard RIAA recording and playback curves.
    As Bill pointed out, MP3 encoding allows data compression by factors of ten or more, which permits storing a whole lot of rather boring music on small players with play back quality that passes for hi-fi to the unsuspecting purchasers of such music.
    Just to be sure that some new technology hadn’t somehow snuck past this old fogey, I contacted a department manager at Creative; Soundblaster to some of us. They know about as much about MP3 and lookalikes as anybody.
    He confirmed that there is no such thing as live real time MP3. The adaptive processing relies on prior knowledge of data yet to be processed, so some form of “buffering” is absolutely mandatory. In simple terms, the DATA must already be gathered, BEFORE you can process it, and compress it to store the pertinent information in less storage space.
    So it is not a method of acquiring more information with less resources; just imagine running from place to place with a Stevenson screen to be in the right place at just the right time to record an important temperature anomaly. Somehow basic information theory does not permit gathering more information with less resources. It’s somewhere in that whole signal to noise ratio, data rate, and channel bandwidth relationship. There’s that Claude Shannon and his theorem in there, another Bell Telephone Laboratories product, like Nyquist. How sad it is that we have lost that great National Treasure.
    Signal recovery and signal processing, are somewhat different tasks than efficient data storage.
    So sorry Bill, but no cigar.

  84. George E. Smith (23:59:30) :
    Really, that hits the nail right on the head. There is no way to zero out the noise and say with any sort of replicable or predictive calculations, “This is the contribution to warming because where x^2/y/(z*x) there is B warming.” And that is the part that I think that should be demonstrable before we drive our societies off of the economic cliff.
    2 notes:
    1) Made up the calculation above to illustrate, hopefully obviously.
    2) If this kind of evidence exists, no one has ever shown it to me when I asked.

  85. Toho.
    The sampling theorem is definitely not about recreating a signal exactly.
    As you correctly say, sampling loses information. There is generally no prospect of exact recovery of the original signal in practical systems. A well sampled system (more-than-observing the limit of the sampling theorem) might be able to recreate a very good approximation to the original signal by interpolation between samples.
    Rather than exactness, it would be better to talk about the adequacy of sampling. For a bandlimited signal, the sampling theorem tells us to absolutely avoid aliasing as that will destroy our ability to reconstruct a reasonably faithful representation of the original signal.
    Is aliasing a big deal in temperature reconstructions?
    Perhaps the point is that we don’t know. We can share opinions on this and compare examples. That’s all good fun and helps to spread knowledge and experiences. But where is the formal analysis in the literature?
    I haven’t seen it. So right now, it looks like there could be a potentially serious gap in the whole approach to recreating historic temperature series on a global scale. And that means they could be junk.
    I do tend to agree with you that localized atmospheric energy will not stay localized for long. But people who claim that the MWP was “localised” would appear to have a contrary view. To repeat my example above, an El Nino event is relatively localised, and (without comfort on spatial aliasing) I wouldn’t be too ready to accept the 1998 spike was anything more than an artefact of an inadequate network.

  86. Sorry to jump tracks …… the following has been posted on a warmist blog. Can somebody please let me know how Anthony responded?
    “Oh dear, Anthony Watts?. Probably new to you, but that UHI crock has long been debunked: http://www.ncdc.noaa.gov/oa/about/response-v2.pdf
    “One analysis was for the full USHCN version 2 data set. The other used only USHCN version 2 data from the 70 stations that surfacestations.org classified as good or best. We would expect some differences simply due to the different area covered: the 70 stations only covered 43% of the country with no stations in, for example, New Mexico, Kansas, Nebraska, Iowa, Illinois, Ohio, West Virginia, Kentucky, Tennessee or North Carolina. Yet the two time series, shown below as both annual data and smooth data, are remarkably similar. Clearly there is no indication from this analysis that poor station exposure has imparted a bias in the U.S. temperature trends”
    Reply: Covered here. I would also like to point out that you can tell the writers at that blog that the Talking Points memo is not a part of the peer-reviewed climate literature and therefore by their own standards something which should probably be given little weight. ~ charles the moderator

  87. I think it would be useful at this juncture to review the difference between standard deviation (SD) and standard error of the mean (SEM).
    For example, if you want to know how tall the human race is on average, you can measure the height of N people chosen at random, and then calculate the mean. If you want to know how confident you can be in your mean, that will depend on how variable the population is (standard deviation, or SD) and how many samples you have (N). In the extreme case of a uniform population, it would be sufficient to measure a single person. It turns out that SEM = SD/sqrt(N). It bears repeating that SD is a measure of how variable your measurement is *within* the population, and SEM is how confident you can be in the estimate of the true mean across the entire population.
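    As a small worked illustration of SEM = SD/sqrt(N), here is a sketch using only the Python standard library. The height values are invented for the example and the function name standard_error_of_mean is just a label chosen here.

```python
import math
import statistics

def standard_error_of_mean(samples):
    """SEM = sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))

# Made-up heights in cm, purely for illustration.
heights = [158.0, 162.5, 171.0, 175.5, 180.0, 166.0, 169.5, 173.0]
sd = statistics.stdev(heights)
sem = standard_error_of_mean(heights)

# Repeating the same spread four times over leaves the SD nearly unchanged
# but cuts the SEM roughly in half, since sqrt(4) = 2.
sem_4x = standard_error_of_mean(heights * 4)
```

    The SD describes the spread of the individuals, while the SEM shrinks as N grows. Neither number says anything about whether the N people were a fair draw from the population.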
    All of this depends on the randomness of the N samples. If Japanese or Norwegians were overrepresented, your estimate would contain an error that is not expressed by SEM. The same is true for temperature. There are separate techniques that attempt to correct for sampling bias, which in the case of global temperature, is expected to be the driving source of uncertainty, and difficult to adjust for.
    As an earlier person posted, rather than the number of measurements, it is the geographic distribution of measurements that does more to determine the reliability of the estimate of the mean. Reasonable people may also question the validity or relevance of global mean temperature as a rather meaningless concept, akin to the average human, with one breast and one testicle. Indeed, the last ice age was characterized less as a drop in the global average temperature, and more as a large increase in the difference between the tropics and the northern latitudes. In the terms of this discussion, SD increased more than the mean decreased.

  88. George E. Smith (23:59:30) :
    “I don’t understand your comment that a priori knowledge of a 24 hour cycle with harmonics somehow dismisses the sampling theorem.”
    It doesn’t dismiss the sampling theorem (I probably did not communicate what I meant very clearly). The sampling theorem provides a sufficient condition to be able to exactly recreate the signal using samples. But it is only a sufficient condition. With the sampling theorem comes a method of reconstruction. (But it is not the only method of reconstruction conceivable, given an arbitrary set of samples and an arbitrary sampling methodology.)
    If you happen to have a-priori knowledge that the signal is a pure sine wave of a certain frequency you only need a single sample to be able to perfectly recreate it to perpetuity (i.e. the only missing piece of information is the amplitude). That is a lot less than what the sampling theorem asks for. If you know that the signal is a 24-hour sine with a certain number of harmonics you are missing some more information, but you can still recreate the signal (to perpetuity) with a finite number of samples.
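    The single-sample case can be sketched as follows, with the assumptions made explicit: the frequency AND the phase are taken as known, there is no constant offset, and the only unknown is the amplitude. The function name and the numbers are hypothetical. The sketch also shows the failure case of a sample taken at a zero crossing.

```python
import math

def amplitude_from_one_sample(sample_value, t, freq, phase_rad):
    """Recover A in A*sin(2*pi*f*t + phase) from one sample, given f and phase.

    Raises ValueError near a zero crossing, where the sample carries no
    amplitude information.
    """
    basis = math.sin(2 * math.pi * freq * t + phase_rad)
    if abs(basis) < 1e-9:
        raise ValueError("sample taken at a zero crossing; amplitude is undetermined")
    return sample_value / basis

# Illustrative values only: a 24 hour cycle, time measured in hours.
true_amp, freq, phase = 7.5, 1.0 / 24.0, 0.3
t_obs = 5.0   # one observation, 5 hours in
obs = true_amp * math.sin(2 * math.pi * freq * t_obs + phase)
recovered = amplitude_from_one_sample(obs, t_obs, freq, phase)
```

    If the phase or the mean level is also unknown, one sample is no longer enough and further samples at distinct phases are needed, one for each additional unknown.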
    Besides, knowing the daily max and min temps give you more information than two equally spaced temperature samples.
    I still stand by my post above.

  89. In the time domain it’s been shown that (Tmax+Tmin)/2 is a good estimate. If you like, go get CRN 5 minute data and see for yourself. In the spatial dimension the field is coherent over large distances. Think about it. The field can be sparsely sampled and long term trends can still be captured. That’s all you care about. Look at the records of the 4 longest temperature stations and compare them to the global average (IPCC AR4 ch06).
    There are more important issues. Put your brain power there.

  90. Toho
    “If you happen to have a-priori knowledge that the signal is a pure sine wave of a certain frequency you only need a single sample …”
    It’s a highly idealised example, and I’m not sure it gets us very far with the question of whether there is aliasing in the temperature data.
    Looking at a more recent post on WUWT, I see even more evidence to warn us that there is likely to be aliasing in the temperature data.
    Before that, pause to imagine a sequence of samples of a signal (in time or in space) as a sequence of point values. If we join up the points, we can turn this into a sequence of trapezia. We’re interested to know whether the resulting stepwise-linear curve is a decent reflection of the original signal. (More sophisticated interpolation using curves doesn’t really add to this).
    If a smoothly changing sequence of trapezia is a “good” representation of the original signal, it must follow that the closest neighbouring sample data points are highly autocorrelated.
    We can say this because erratic changes in neighbouring sample points would be a good indicator of overlapping bands in the “frequency” domain, and therefore aliasing. (Frequency in quotes because this point also holds for spatial data sampling.)
    And as discussed, the consequence of aliasing is that we cannot rely on the sampled data to reconstruct the original signal.
    Now consider the latest WUWT post about Darwin. It is claimed that a “neighbouring” station (for a homogenising algorithm) was some 1500 miles away. There are closer stations, but they are not sufficiently correlated for a homogenising algorithm. In fact some of the closer stations are negatively correlated to Darwin.
    What better hint do we need that spatial aliasing is a potential problem?

  91. Toho (13:11:31) :
    You are referencing only one point in the system and Mr. Smith is referencing the entire system. This may be why you are not seeing things eye to eye?
    If you are looking to make a mean temperature for the entire planet, you need to be thinking in terms of the entire system, and not just the particular sampling point in the system. It seems pretty silly (to me anyway, no disrespect) to argue against the idea of taking samples at evenly spaced intervals for reasons other than actual physical difficulty.
    Of course, I feel that this is all a moot point and the range of the entire climate system is the important bit of knowledge. If the radiative warming theory is true the whole system should read warmer and the best way to show that, or its absence, is to show the bounds increasing. Of course, that would not prove CO2 involvement, but it would eliminate a lot of data issues.

  92. Hey guys, come on now. Don’t put words in my mouth. I was talking about the sampling theorem specifically, and I was trying to make a point with my simple examples. I don’t really disagree with the sentiment here that the error bounds on instrumental temperature records are probably significantly understated (I would guess by an order of magnitude, i.e. a big deal). But to make that case we need a better argument than the sampling theorem. It is a mathematical theorem which does not say what some people here try to argue it does.
    Also, I don’t believe that there are large errors (in multidecadal trend line regressions) caused by aliasing (but I would be happy to be proven wrong about that). However, I do agree there are serious problems with the homogenisation algorithms that seem to be employed.
    Jordan:
    “If we join up the points, we can turn this into a sequence of trapezia. We’re interested to know whether the resulting stepwise-linear curve is a decent reflection of the original signal.”
    No, I think you go wrong here. We are not interested in a decent reconstruction of the daily temperature variations. The thermometer readings are just viewed as a statistic.

  93. “”” Toho (13:11:31) :
    George E. Smith (23:59:30) :
    “I don’t understand your comment that a priori knowledge of a 24 hour cycle with harmonics somehow dismisses the sampling theorem.”
    If you happen to have a-priori knowledge that the signal is a pure sine wave of a certain frequency you only need a single sample to be able to perfectly recreate it to perpetuity “””
    This is worse than pulling teeth. If I have a sinusoidally varying signal with an exactly known frequency, and I take one sample per cycle, I get zero information about whether the amplitude of the cycle is 20 degrees C or 20 millidegrees C. I get exactly the same measurement every sample; and any process for obtaining an average will naturally give exactly the value of that sample. Now remember I did say that the case of sampling at exactly 2.B (you are talking only 1.B) is a degenerate case with an indeterminate result. That is one of the reasons for random sampling; but that only works for a truly sinusoidal signal, which by definition has exactly the same amplitude each and every cycle (if the amplitude changes from cycle to cycle it isn’t a sinusoidal function).
    Some are saying that the min/max average is good enough. How good is good enough when we are talking hundredths of a degree.
    I have looked at a whole bunch of daily “weather” maps; I do it every day for the SF bay area, and daily min-max differences of 30-40 deg F are very common, with tens of degrees differences over distances of a handful of km separation.
    When it comes to shorter than daily cyclic temperature changes, (we actually have clouds in California) there is no prior frequency knowledge.
    One can argue that “it all comes out in the wash” and the averages are good enough. Well the climate does not depend totally on average temperature.
    At the global mean temperature there is no “Weather”, so there can’t be any climate either since that is defined as the average of weather.
    Unfortunately, the operating Physics doesn’t pay any attention to averages. The surface emitted thermal radiation that is a major earth cooling process, follows a more 4th power of temperature law, and the spectral peak of the thermal radiation which is what is of interest to GHG capture, varies as the 5th power of the temperature.
    So if you take the integral of the 4th or 5th power of any cyclic temperature curve over a complete cycle, the result is always higher than simply integrating the average; so cycles do matter, and in the case of climate temperatures there are notable annual and daily cycles that result in an always positive enhancement of the total radiant emission from the earth, and hence estimates of whether we are warming or cooling.
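    The inequality in the paragraph above (the cycle average of T^4 exceeds the 4th power of the average T) can be checked with a quick sketch. The numbers are illustrative assumptions only: a 288 K average with a sinusoidal swing of plus or minus 10 K, and the function name cycle_means is made up for this example.

```python
import math

def cycle_means(mean_temp_k, swing_k, steps=3600):
    """Return (mean of T^4, (mean T)^4) over one sinusoidal cycle."""
    temps = [mean_temp_k + swing_k * math.sin(2 * math.pi * k / steps)
             for k in range(steps)]
    mean_t = sum(temps) / steps
    mean_t4 = sum(t ** 4 for t in temps) / steps
    return mean_t4, mean_t ** 4

# Illustrative numbers only: 288 K average with a +/-10 K cycle.
mean_of_fourth, fourth_of_mean = cycle_means(288.0, 10.0)
excess_fraction = mean_of_fourth / fourth_of_mean - 1.0
```

    For these values the excess is a little under 0.4 percent. To leading order it is 3(a/T0)^2 for a swing of amplitude a about a mean T0, so it grows with the square of the swing.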
    It is ironic that all you statisticians think that a tenth of a degree change in the average of a variable that has a 150 degree C possible maximum range, of values on any given day is somehow significant; but you can then dismiss similar errors in your homogenised data, that result from plain and simple experimental errors; because well you aren’t really interested in the data, just the imagined “trends”.
    Just don’t call it science if that is what you truly believe.
    Take a look in Al Gore’s book “An Inconvenient Truth”; pages 66-67 specifically, where we have Al’s impression of atmospheric CO2 and Temperature from some ice cores, over a long period of time.
    So tell us what “TREND” is depicted in those graphs of CO2 and Temperature? I’m only interested in Climate not weather so just give me one number for the whole 600,000 years. That should be long enough for you to get some good statistical average of the trend, and I would expect a pretty small standard deviation after all that time.
    Just try telling your cell phone service provider, that the sampling theorem doesn’t matter, and he should be able to give you pretty good; though “Average” service without paying any attention to the Nyquist theorem.
    The CRU supporters complain that we are taking the whistle blown e-mails out of context; in other words, we aren’t getting the full story from just a few e-mail snippets. Funny how that works with e-mails but we can get perfectly good climate data from out of context “sampling”.

  94. “”” Toho (14:15:51) :
    Jordan:
    I agree with most of what you write. But my point is that it doesn’t follow from the sampling theorem. The sampling theorem is about recreating exactly. Sure, you are going to lose information when sampling, and that will cause errors in the temperature estimates. I certainly agree with that. But it has nothing to do the sampling theorem. I don’t think aliasing is a big deal by the way, because localized atmospheric energy will not stay localized for long. “””
    Here’s a Quote From Professor Julius T. Tou’s book on sampled systems.
    ” Sampling Theorems. Fundamental Theorem of Sampling. If a signal f(t) has a frequency spectrum extending from zero to B cps, it is completely determined by the values of the signal (i.e. the samples) taken at a series of instants separated by T = 1/2B sec, where T is the sampling period.
    This theorem implies that if a signal is sampled instantaneously, at a constant rate equal to twice the highest signal frequency, the samples contain all of the information in the original signal. ”
    Now I changed some of his symbols which don’t replicate well here; but otherwise it is verbatim.
    And I would point out that the theorem says the signal is COMPLETELY determined by the samples, which contain ALL the information.
    That statement, is immediately followed by a mathematical proof of the theorem, based on Fourier Integrals, and the Fourier Transform.
    That proof is then followed by a proof that the original signal can be COMPLETELY reconstructed from the samples, and how to do that.
    So please don’t try to tell me that a properly sampled signal can’t be correctly reconstructed; the reconstruction is only limited by practical technological questions; not by theoretical mathematical limitations.
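    One standard way of writing that reconstruction is the Whittaker-Shannon interpolation sum, x(t) = sum over n of x(nT) sinc((t - nT)/T), with T = 1/(2B). The sketch below is an illustration under assumptions, not Tou's own worked proof: the test signal, the band limit B = 4 Hz and the truncation to 2000 samples either side are all choices made here, and truncating the infinite sum is exactly the kind of practical limitation that leaves a small residual error.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def signal(t):
    """Illustrative band-limited signal: 1 Hz and 3 Hz components (below B = 4 Hz)."""
    return math.sin(2 * math.pi * 1.0 * t) + 0.5 * math.cos(2 * math.pi * 3.0 * t)

B = 4.0              # assumed band limit in Hz
T = 1.0 / (2 * B)    # sampling period from the theorem, T = 1/(2B)
N = 2000             # samples kept on each side of t = 0 (the ideal sum is infinite)
samples = {n: signal(n * T) for n in range(-N, N + 1)}

def reconstruct(t):
    """Truncated interpolation sum: x(t) ~ sum_n x(nT) * sinc((t - nT) / T)."""
    return sum(x_n * sinc(t / T - n) for n, x_n in samples.items())

# Compare at instants that fall between the sample times.
check_times = [0.0312, 0.2071, 0.55, 0.9]
max_error = max(abs(reconstruct(t) - signal(t)) for t in check_times)
```

    The values between the sample instants come back to within the truncation error, and that error shrinks as more samples are kept. None of this applies once the signal has content above B, which is the aliasing case.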
    Besides that my point is not that the signal needs to be reconstructed. A consequence of the sampling theorem is that THE AVERAGE cannot be recovered, given only a modest violation of the Nyquist Criterion (factor of two), and min/max daily sampling already is at or beyond that limit of error.
    And it shouldn’t be necessary for this audience to point out that “frequency” can be applied to any cyclic variable, not just electrical signals. In the case of global temperature, it applies to two different variables, namely time and space; both of which are subject to the limitations of sampling.
    But I’m not here to try and free anybody from their delusion that statistical machinations can produce information out of nothing. Your local telephone directory would be a good place to start a new science of statistical manipulations. Or if you are also a greenie, or a WWF enthusiast, why not start a global project to determine the average number of animals per hectare all over the earth; animal meaning anything in the biological animal kingdom. Don’t pay any attention to what sort of animals; yes they range from smaller than ants to whale size; but that shouldn’t matter any more than the kind of terrain, or the significance of the temperature (or anomaly) recorded there, matters to global climate. If you’ve seen one thermometer reading, you’ve seen them all.

  95. Toho:
    I agree global reconstructions are not expressed at daily resolution, but most are monthly. Do we have the well behaved “thermal surface”? (Note, spatial resolution is the question, not time.)
    My doubts remain. We know of significant regional fluctuations which prevail for many months and even years (like ENSO, AMO, PDO). What picture do we get from sparse and irregular spatial samples?
    I agree with your comment that thermometer readings are just viewed as a statistic, but aliasing can introduce a nasty systematic distortion. Not necessarily a zero-mean random variable which will “melt away” in the calculation of a mean. If the network gives us a distorted picture of (say) ENSO in 1998, the same network might give a similarly distorted picture of other ENSO events.
    Another example to illustrate concerns about loss of information in sampling, and why statistical analysis does not get us “out of jail”:
    You’re travelling at constant speed on a bicycle, and compelled to keep your eyes closed. You are permitted to blink your eyes (open) at regular intervals to capture some information about the road ahead. This is all you have to make decisions about direction. You’re fine at (say) 10 blinks per second, but then you are required to reduce your blink rate. There will come a point when the blink rate is so low that you are unable to gather enough information about the road ahead to avoid a crash. Loss of information is too great beyond that point. The lost information is permanently lost; there is no statistical analysis or modelling from the samples to get it back. So when you are at risk of crashing, the only practical option is to increase the blink rate to something that meets your requirements.
    To George:
    My encounters with the sampling theorem are all founded in discrete control systems design. It is easy to show mathematically how a stable closed loop system can be driven to instability by extending the sampling interval.
    I think your familiarity with the sampling theorem will be greater than mine. However I understand the mathematical analysis usually starts with an absolutely band-limited spectrum (dropping to zero amplitude beyond an assumed maximum frequency). The sample sequence “repeat spectra” can then be completely isolated and removed in theory. Do I take it that’s the reason why you suggest the original signal can be perfectly recovered from the samples?
    I asserted that perfect reconstruction is not possible – without explaining that practical systems cannot totally bandlimit a signal in the way the theory assumes. We can realistically reduce frequency-folding to an immaterial level (for a purpose, such as control), but that also means the samples can only ever produce an imperfect reconstruction of the continuous signal.
    (OK – perhaps there is an exception for purely cyclic signals, but that doesn’t add much to discussion of sampling the climate system.)
    To Steven
    You could well be right. Or maybe wrong – I don’t know and feel we’re not really in a position to say either way. As you’ll see from above, I feel inclined to hang onto my scepticism. The issue of aliasing needs to be formally investigated and submitted for review in the usual way. Until then, could we put an asterisk next to the global reconstructions, just to remind us of this loose end?
    (Good discussion about an interesting topic.)
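A minimal sketch of the frequency folding mentioned above: it computes the apparent (aliased) frequency of a pure tone sampled at a fixed rate. The function name `alias_frequency` and the example numbers are assumptions chosen for illustration; nothing here is derived from the temperature record.

```python
import math

def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a pure tone at f_signal when sampled at f_sample.

    Folds f_signal into the baseband [0, f_sample / 2].
    """
    if f_sample <= 0:
        raise ValueError("f_sample must be positive")
    # Distance from f_signal to the nearest integer multiple of f_sample.
    remainder = math.fmod(abs(f_signal), f_sample)
    return min(remainder, f_sample - remainder)

# A tone below the Nyquist frequency (f_sample / 2) stays where it is.
print(alias_frequency(3.0, 10.0))
# A tone above it folds back into the baseband.
print(alias_frequency(7.0, 10.0))
# A tone exactly at the sample rate folds onto zero frequency (the mean).
print(alias_frequency(10.0, 10.0))
```

The last case is the one relevant to averages: a component at the sample rate is indistinguishable, in the samples, from a constant offset.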

  96. “”” Jordan (15:54:27) : “””
    Well Jordan, bear in mind that what I just related above, extracted from a sampled data control system text book, is a purely mathematical construction. But it does assert that in the mathematical sense the band limited continuous function is in fact perfectly represented by the set of properly spaced discrete samples, and in the mathematical sense it can be completely reconstructed. Now it is true that often we can’t be sure the signal really is band limited; with global temperature data, for example, we really can’t know the resolution limits of actual spatial temperature variations, or even temporal ones.
    In the signal processing realm, this problem is usually dealt with by running the raw signal through an anti-aliasing filter to be sure that it is truly band limited before sampling. Generally that means reducing out of band signal information to below the LSB of the A-D converter.
    Aliasing noise is a real problem, for example, with optical mice. Your basic optical mouse is a digital camera that may take 2000 frames per second of whatever surface the mouse is sitting on. It cares not what that surface variation is, just that there is some. Cross-correlation of incoming images with the previous stored image is converted into cursor move information, after deducing the movement between the two successive images.
    Now the high frame rate (sample rate) is only possible because the image has only a small number of pixels, and quite large ones. So the camera may only have between 15 x 15 and perhaps 30 x 30 pixels, somewhere in the 30-60 micron pixel size range. This is not like your favorite Nikon Digital SLR.
    Now I can easily design single element camera lenses (1:1 relay) that can resolve way below 30 microns, since the lenses are molded and we are not constrained to spherical surfaces. As a result the camera lens in proper focus can provide an image on the silicon sensor that the sensor cannot properly sample without aliasing noise. This problem will be set off by surfaces containing repetitive patterns, so the dot patterns of color printed images, and even things like Tatami mats, can result in erratic cursor movement if you try mousing on them.
    We have eliminated that problem in at least LED mice, since I can build a completely optical anti-aliasing filter, and put it right on the actual camera lens itself. In effect I can design a deliberately fuzzy lens that won’t resolve below the pixel size, with an accurately manufacturable cutoff frequency of the Modulation Transfer Function. In fact I have several patents on the method.
    It is much harder to implement that in laser mice because of the effects of the beam coherence.
    So yes I am up to my elbows every day in Sampling theorem realities.
    And yes, in practice it is practical limitations that prevent exact reconstruction; but not mathematical theoretical reasons.
    But the whole modern communication technology is intimately tied to sampled data. The sampling theorem says I can completely represent, say, a 4 kHz band limited voice signal by discrete samples taken at 125 microsecond intervals. Well, those samples can themselves take less time than, say, one microsecond. So I can have 124 microseconds of silence between the adjacent samples of a typical voice message. Or alternatively, I can fill that empty space with another 124 sets of one microsecond samples, for a total of 125 voice messages all happening at the same time. And at the receiving end, I can sort those samples into 125 channels, and then reconstruct each of them at its end destination.
    Well of course you need some time for synchronisation, and management overhead. Now you aren’t getting something for nothing, because in order to transmit that pulse train of one microsecond pulses, with sharp transition steps between channel samples, you need a transmission channel that can handle one microsecond pulses, instead of a slow 4 kHz audio signal; well, all of that is dictated by the Shannon theorem.
    So sufficiently accurate reconstruction of properly sampled data is a well developed technology, and the telephone companies spent mucho bucks and time making sure that the theory behind such transmission methodologies is sound.
    Now it is a lot more complex than I have indicated here, because they also go out of their way to compress the data to its minimum intelligible size as Bill alluded to in his posts; and many years of study have gone into the development of digital data encoding and transmission techniques, to improve capacity and signal to noise ratios while containing total channel bandwidth.
    But all of that magic is post processing of already snared information, and the recipient knows exactly what surgery was carried out on his stuff and how to unbury it, and recover his data.
    That luxury is not present in the climate data gathering field. Satellite systems that can scan offer a big improvement, but have their own difficulties, in terms of understanding just what the blazes your remote sensors are really responding to.
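The time-division arithmetic in the comment above (a 4 kHz voice band, 125 microsecond sample spacing, one microsecond samples) can be checked with a short sketch. The function name `tdm_slots` is hypothetical, and the calculation deliberately ignores the synchronisation and management overhead the comment mentions.

```python
def tdm_slots(bandwidth_hz, sample_duration_s):
    """Return (sampling interval, number of sample slots per interval)
    for a signal band limited to bandwidth_hz, sampled at the Nyquist rate."""
    sample_rate_hz = 2 * bandwidth_hz      # Nyquist rate for the band limit
    interval_s = 1.0 / sample_rate_hz      # spacing between samples of one channel
    # The small tolerance guards against floating-point error in the ratio.
    slots = int(interval_s / sample_duration_s + 1e-9)
    return interval_s, slots

interval, slots = tdm_slots(4000, 1e-6)
print(interval)   # sampling interval in seconds
print(slots)      # one-microsecond slots available per interval
```

With these inputs the interval is 125 microseconds and 125 slots fit into it, matching the figures quoted in the comment.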

  97. I forgot to add above that it is not any need to reconstruct a global temperature map that concerns me. It is that aliasing noise can swamp even the average, with just modest undersampling, and the base time data sets used to compute temperature anomalies are just such long time averaged data, which completely ignore the fact that the base average temperature is itself corrupted by noise, no matter how long the base interval is.
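A minimal sketch of how undersampling can bias an average, using a deliberately simple case: a pure 24-hour sinusoid with a hypothetical mean of 15 deg C and amplitude of 10 deg C (both values are assumptions for illustration). It says nothing about how large the effect is in real temperature data.

```python
import math

PERIOD = 24.0        # hours in one cycle
TRUE_MEAN = 15.0     # hypothetical mean temperature, deg C
AMPLITUDE = 10.0     # hypothetical cycle amplitude, deg C

def temperature(t_hours):
    """Hypothetical pure daily cycle around TRUE_MEAN."""
    return TRUE_MEAN + AMPLITUDE * math.sin(2 * math.pi * t_hours / PERIOD)

def sample_mean(interval_hours, start_hours, n_days=30):
    """Average of samples taken every interval_hours over n_days."""
    n = int(round(n_days * PERIOD / interval_hours))
    samples = [temperature(start_hours + k * interval_hours) for k in range(n)]
    return sum(samples) / n

# 24 samples per cycle: well above the Nyquist rate, the mean is recovered.
print(round(sample_mean(1.0, start_hours=3.0), 3))
# One sample per cycle: the cycle aliases onto zero frequency, so the
# "average" depends entirely on the time of day the sample is taken.
print(round(sample_mean(24.0, start_hours=3.0), 3))
print(round(sample_mean(24.0, start_hours=18.0), 3))
```

Averaging over more days does not help in the one-sample-per-cycle case, because every sample lands on the same point of the cycle.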

  98. Jordan:
    “I agree global reconstructions are not expressed at daily resolution, but most are monthly. Do we have the well behaved “thermal surface”? (Note, spatial resolution is the question, not time.) ”
    No, I don’t think we have. I think the thermal surface is a lot noisier than what is perceived in AGW circles.
    “I agree with your comment that thermometer readings are just viewed as a statistic, but aliasing can introduce a nasty systematic distortion. Not necessarily a zero-mean random variable which will “melt away” in the calculation of a mean. If the network gives us a distorted picture of (say) ENSO in 1998, the same network might give a similarly distorted picture of other ENSO events.”
    Maybe, but my gut feeling as a physicist is that aliasing specifically is a non-issue compared to a lot of other potential error sources. The reason for my feeling is that energy will tend to move pretty quickly in the atmosphere. However, this is something that probably could (and should) be statistically tested. I have a feeling that most of the testing that has been done and published in this area relies on homogenized data, which will tend to hide systematic errors by making station records appear more correlated than they are in reality.
    George:
    “And I would point out that the theorem says the signal is COMPLETELY determined by the samples, which contain ALL the information.”
    True, but the theorem does not say the inverse, that you can’t have complete information of the signal without all the samples, or with a smaller number of sample points. All your arguments seem to be based on this inverse of the theorem, which isn’t generally true. In particular it isn’t true when you have additional information about the system dynamics. The theorem provides a sufficient condition for reconstruction, not a necessary condition.
    “That proof is then followed by a proof that the original signal can be COMPLETELY reconstructed from the samples, and how to do that.
    So please don’t try to tell me that a properly sampled signal can’t be correctly reconstructed; the reconstruction is only limited by practical technological questions; not by theoretical mathematical limitations.”
    That’s not what I said. My comment was made in response to something that Jordan wrote above, and I think I was pretty clear that it wasn’t regarding the sampling theorem, but regarding temp records. To recreate exactly as per the sampling theorem you need an infinite number of samples, even if the bandwidth is limited. In real life you don’t have that, so in real life sampling an arbitrary signal will lose information.
    “This is worse than pulling teeth. If I have a sinusoidally varying signal with an exactly known frequency, and I take one sample per cycle; I get zero information about whether the amplitude of the cycle is 20 degrees C or 20 millidegrees C. I get exactly the same measurment every sample;”
    If you have a sine signal you by definition know the phase. I explicitly said the only missing piece of information was the amplitude. If that’s the case you only need one sample (not one per period) to reconstruct the entire signal. If you are missing phase and frequency as well you need three carefully chosen samples instead (not per period, three total). This is a simple counter-example to the inverse of the sampling theorem if you will. It shows that you can get by with a lot less than two samples per period if you have knowledge about the system dynamics and carefully select where you sample your signal.
    The example was in response to one of your posts above where you didn’t understand my (probably poorly worded) statement about a priori information. But the main point is, the inverse of the sampling theorem is not generally true.
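A hedged sketch of the three-sample case described above, assuming a zero-mean sinusoid A*sin(w*t + phase) sampled at three equally spaced times. The function name `fit_sinusoid` is hypothetical, and the restriction 0 < w*dt < pi is itself a piece of prior knowledge about the frequency; an unknown mean would add a fourth unknown and need a fourth sample.

```python
import math

def fit_sinusoid(x0, x1, x2, dt):
    """Recover (amplitude, angular frequency, phase) of A*sin(w*t + phase)
    from samples at t = 0, dt, 2*dt, assuming 0 < w*dt < pi and x1 != 0."""
    # Identity for equally spaced samples: x0 + x2 = 2*cos(w*dt)*x1
    c = (x0 + x2) / (2.0 * x1)
    c = max(-1.0, min(1.0, c))                  # clamp against rounding error
    w_dt = math.acos(c)
    a_sin = x0                                  # A*sin(phase)
    a_cos = (x1 - x0 * c) / math.sin(w_dt)      # A*cos(phase)
    return math.hypot(a_sin, a_cos), w_dt / dt, math.atan2(a_sin, a_cos)

# Illustrative values only: amplitude 10, period 24 (so w = 2*pi/24), phase 0.5.
A, w, phase, dt = 10.0, 2 * math.pi / 24.0, 0.5, 2.0
samples = [A * math.sin(w * k * dt + phase) for k in range(3)]
print(fit_sinusoid(*samples, dt))
```

The three samples here span only four time units of a 24-unit period, far fewer than two per period, which is possible only because the functional form is assumed known in advance.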

  99. Toho (03:32:41) : “ … homogenized data .. will tend to hide systematic errors by making station records appear more correlated than they are in reality.”
    Yes, that’s a good point.

  100. “”” Toho (03:32:41) :
    “” If you have a sine signal you by definition know the phase. “””
    OK Toho; I give up; you win.
    So yes I know it’s a pure sinusoidal signal with absolutely no harmonic content.
    I can also tell you that the period is exactly 86,400 seconds.
    So I just read my thermometer and got my one sample. The thermometer reads 59 deg F/15 deg C. It is 10:30 AM PST.
    Please reconstruct the complete signal and give me the following information;
    1/ Minimum temperature
    2/ Maximum temperature
    3/ Time of either minimum temperature or maximum temperature.
    Or if for some reason you are unable to provide any or all of those numbers; please give me instead;
    4/ The Average temperature for the Cycle.
    So that should be fairly straightforward, Toho;
    So it’s your move now.
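A minimal sketch of the challenge as posed: one reading of 15 deg C at 10:30 AM and a known period of 86,400 seconds, with mean, amplitude and phase all treated as unknown. The candidate means and amplitudes below are arbitrary assumptions; the sketch only shows that several different sinusoids pass through the same single sample, and it does not settle how much should be taken as known in advance.

```python
import math

PERIOD_S = 86_400.0
SAMPLE_TIME_S = 10.5 * 3600      # 10:30 AM as seconds after midnight
SAMPLE_VALUE_C = 15.0

def candidate(mean_c, amplitude_c):
    """Return a sinusoid with the given mean and amplitude whose phase is
    chosen so that it passes through the single observed sample."""
    ratio = (SAMPLE_VALUE_C - mean_c) / amplitude_c
    if abs(ratio) > 1.0:
        raise ValueError("sample lies outside this sinusoid's range")
    phase = math.asin(ratio) - 2 * math.pi * SAMPLE_TIME_S / PERIOD_S

    def temp(t_s):
        return mean_c + amplitude_c * math.sin(2 * math.pi * t_s / PERIOD_S + phase)

    return temp

# Three hypothetical (mean, amplitude) pairs, all consistent with the one reading.
curves = {(m, a): candidate(m, a) for m, a in [(15.0, 5.0), (10.0, 8.0), (20.0, 12.0)]}
for (m, a), f in curves.items():
    print(m, a, round(f(SAMPLE_TIME_S), 6))
```

Each curve reproduces the reading but has a different cycle average, minimum and maximum, so the single sample cannot choose between them without further information.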

  101. Is anyone else interested in the airport heat island effect and the correlation with jetfuel consumption on takeoff?

  102. George
    There were no doubt some misunderstandings and dead ends in the above discussion, and the sine wave might be one of them. But I wouldn’t be too dismissive of Toho’s position.
    Toho also makes this very good point (for a non-periodic signal):
    ” the theorem does not say the inverse, that you can’t have complete information of the signal without all the samples, or with a smaller number of sample points. All your arguments seem to be based on this inverse of the theorem, which isn’t generally true.”
    Seems to make sense to me.
    Toho- you have said the following on a number of occasions: “The theorem provides a sufficient condition for reconstruction, not a necessary condition.” I don’t follow – could you please expand?
    One of the things I take from the above discussion is the apparently conflicting requirements of statistical analysis (sampling error) versus the sampling theorem (errors due to aliasing) when it comes to sample autocorrelation.
    If the objective is to measure statistical aggregates, autocorrelation in the samples tends to be a problem – making life generally more difficult and perhaps even obstructing an analysis. If the objective is to recreate a faithful representation of the continuous signal from a finite sample, autocorrelation between the samples would appear to be an absolute necessity.
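One way to put a number on the statistical side of this trade-off is the usual large-sample approximation for the effective sample size of a mean under first-order autoregressive, AR(1), correlation. The AR(1) model is an assumption introduced here for illustration, not something stated in the discussion, and the function name is hypothetical.

```python
def effective_sample_size(n, rho):
    """Approximate number of independent samples in n AR(1)-correlated values
    with lag-one autocorrelation rho (large-n approximation for the mean)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)

# 120 values (for example ten years of monthly data) at three correlation levels.
for rho in (0.0, 0.5, 0.9):
    print(rho, effective_sample_size(120, rho))
```

Strong positive autocorrelation shrinks the effective sample size sharply, which is the sense in which it obstructs estimation of aggregates even though it is what makes interpolation between samples possible.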

  103. Jordan:
    “If the objective is to measure statistical aggregates, autocorrelation in the samples tends to be a problem – making life generally more difficult and perhaps even obstructing an analysis. If the objective is to recreate a faithful representation of the continuous signal from a finite sample, autocorrelation between the samples would appear to be an absolute necessity.”
    Yes, I think that is a pretty good simple summary.
    “Toho- you have said the following on a number of occasions: “The theorem provides a sufficient condition for reconstruction, not a necessary condition.” I don’t follow – could you please expand.”
    That is just an attempt to express my point about the sampling theorem in a different way. I suppose the above is a formulation that would appeal to a mathematician.
    The theorem essentially says that IF you have a bandwidth limited signal and sample it often enough and in a specific way, THEN you can recreate the signal (that is the sufficient part). However, it does not say that IF you DON’T sample it that often or IF you DON’T sample it in that specific way, THEN you CAN’T recreate the signal (that would be the necessary part).
    I.e. there may be other ways to sample that would require a smaller average sample rate for reconstruction.
    Related is the second point I am trying to make, that if you know something about the dynamics of the system, you can use that knowledge along with a much sparser set of samples in order to get much more information out than you would get from the samples alone. I also gave a very simplistic example of such a case. Another more realistic example is meteorologists making short term weather predictions from (samples of) initial value conditions.
    George:
    I have never claimed that real temps are a pure sine signal. However, there are published statistical relationships between the max/min temp and the cycle average. It seems pretty clear to me that if you are looking at the anomalies at a decadal scale, then such statistics should be pretty good (for local temperatures). If you have a good argument as to why they are incorrect I am more than willing to listen, but it simply does not follow from the sampling theorem. Errors from UHI effects, changes in instrumentation, location changes etc should be larger by orders of magnitude.
    And George, you do have a number of good points above that I tend to agree with.

  104. Gerlich and Tscheuschner (in their debunking of the CO2 greenhouse effect) argue that “there are no calculations to determine an average surface temperature of a planet” because there are too many localized random temperature variations to know which are and aren’t accounted.
    I note the NASA GIStemp procedure includes “elimination of outliers”. G&T would probably argue there are no outliers that should be eliminated, since they are all part of local variations over the globe.

  105. “”” Jordan (00:49:01) :
    George
    There were no doubt some misunderstandings and dead ends in the above discussion, and the sine wave might be one of them. But I wouldn’t be too dismissive of Toho’s position.
    Toho also makes this very good point (for a non-periodic signal):
    ” the theorem does not say the inverse, that you can’t have complete information of the signal without all the samples, or with a smaller number of sample points. All your arguments seem to be based on this inverse of the theorem, which isn’t generally true.”
    Well Jordan, I’m not dismissive of Toho’s position; maybe I just don’t understand it; so I’m here to learn.
    But he asserted that one sample suffices to completely define an unknown sinusoidal signal; I’d like to see how that is done.
    The text books point out that the case of exactly two samples per cycle of a single frequency signal is degenerate, and even though it fully complies with Nyquist, the signal can’t be recovered in that case. It’s not of practical importance, since any lack of phase lock of the sampling allows for slewing through the complete waveform (which by definition is repetitive since it is a sine wave, so it has to be infinite in extent).
    Now if Toho wants to add other information to the sample that is a different situation. The min/max strategy for the daily cycle at least establishes much of the information, and it would be complete for the simple case of the sinusoid, and yield the correct average; but that is not the real case.
    My signal processing colleagues who do sampled data processing all day long assure me that, absent additional special information (which is only applicable in special cases), the sampling theorem is both necessary and sufficient. Yes you can construct special cases that permit undersampled signals to be recovered because of other information available in those cases.
    Weather and climate are basically chaotic; there is no way they are likely to conform to any special case that can eschew full Nyquist compliance.
    And I reiterate that I am unconcerned about the lack of ability to reconstruct the original continuous signal; but I am concerned when the Nyquist violation is serious enough to corrupt even the average with aliasing noise.
    And when the time sampling strategy clearly excludes consideration of cloud variations, then nobody is going to convince me that any GCMs which also don’t properly model clouds can be made to track observational data that does likewise.
    I’m not suggesting that the network of land based weather stations simply be abandoned, but it needs to be recognized that many of those stations exist for the benefit of pilots who have a real pressing need for up to the minute data on real runway weather conditions, principally temperature, atmospheric pressure, and humidity, as well as the obvious like wind speed and direction.
    You haven’t lived on the edge (as a pilot) if you have never made that mistake, of landing a plane on a short runway in the wrong downwind direction. I did it precisely once and on a quite long runway for the plane I was flying. Believe you me I got religion before the plane rolled to a stop.
    But when that network of “weather” stations is conscripted to try and observe the mean surface temperature of the entire earth, where over 70% of the surface has no long term observational stations; I get less than impressed with the methodology.
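The remark in the comment above, that the min/max strategy gives the correct average for a pure sinusoid but not in the real case, can be illustrated with a sketch. The daily curve below is hypothetical: a fundamental plus a second harmonic with made-up amplitudes, evaluated once a minute.

```python
import math

def daily_curve(mean_c, a1, a2, steps=1440):
    """Hypothetical daily cycle: fundamental plus a second harmonic,
    evaluated once a minute over one day."""
    values = []
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        values.append(mean_c + a1 * math.sin(theta) + a2 * math.cos(2 * theta))
    return values

def midrange(values):
    """The (min + max) / 2 estimate used with min/max thermometers."""
    return (min(values) + max(values)) / 2.0

def mean(values):
    return sum(values) / len(values)

pure = daily_curve(15.0, 8.0, 0.0)     # pure sinusoid
skewed = daily_curve(15.0, 8.0, 3.0)   # same mean, with harmonic content
print(round(midrange(pure), 3), round(mean(pure), 3))
print(round(midrange(skewed), 3), round(mean(skewed), 3))
```

For the pure sinusoid the two estimates agree; with the assumed harmonic the midrange falls roughly 2.7 degrees below the true mean of 15, even though both curves have the same average.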

  106. George
    I see more agreement in our various discussions than disagreement. Particularly on the question of whether we have an adequate sample of the climate system to support the single line which is supposed to represent the global trend in temperature.
    (I might add that the concept of a global temperature is about as meaningful to me as the average one-breasted, one-testicled person mentioned in a previous comment here).
    Toho makes a fair point that other information can reduce the demands we would otherwise have to make on data sampling. The generality of this point should not be understated.
    His (her?) first example took the point to an extreme – perhaps unhelpfully. However … if we know the signal is a simple sinusoid, and we also know the amplitude (or phase), it would only take one sample to give us the last unknown to fully define the signal for an indefinite period.
    That’s an extreme example of Toho’s (fair) point that there is a “sufficiency” versus “necessary” angle to sampling.
    In response to further comment, Toho then dealt with a situation where we have less knowledge. If we only know that the signal is a sinusoid, it would only take three samples to fully define the signal. (OK, we also need to know that the samples are all within one cycle – so we would need to have at least some idea of the frequency or phase).
    This is a leap that deserves acknowledgement.
    I mentioned my background in control engineering, where we frequently have the luxury of a framework of “a priori” knowledge. One of my first posts on this thread talks about sampling as a design problem – the issue being to have enough initial knowledge of the system/signal to design the sampling methodology. Well that’s the sort of approach that comes naturally to tackling a control problem.
    I think Toho makes much the same point from a different direction – if we know something about what we are trying to sample, we can then make decisions about how to sample it. (Toho – I hope that’s fair to what you were saying.)
    At the end of all of that, a question: can I get comfortable that the sparse and erratic sampling of temperatures over the last 150 years gives us the information to support the analysis at the top of this thread? Frankly? No!

  107. Well Jordan, you won’t get any argument from me as to the benefits of a priori knowledge other than the samples. But all of the cases I am familiar with apply only to certain special situations. In the case of a perfectly general signal, it is not clear to me that there are any a priori snippets of information that can substitute for a proper set of samples.
    But I am of a like mind, in that I think the whole concept of a “global mean temperature” is quite fallacious, even though one can define such a thing and in principle, can measure it, and I do mean in principle, since it is quite impractical in practice.
    But after you have determined that, you still have exactly no knowledge of the direction of net energy flow into or out of planet earth, which is what will really determine the long term outcome; and the lack of any differential information (you only have the average) means you can’t even discuss the weather which depends on temperature differences (at the same time).
    Temperature alone; without knowledge of the nature of the terrain, tells you nothing about energy flux, since the processes happening over the oceans, are quite different from those occurring over tropical deserts or arboreal forests, or snow covered landscapes, and are quite differently related to the local temperature.
    I have no problem with GISStemp as a historical record of GISStemp, although it has many problems; but extending that to global significance doesn’t cut it with me.

  108. George
    The point about a priori information appeals to me because it has parallels in the procedure of “identification” in control system design. In reasonably well defined situations, we can understand the linkages between different parts of the controlled process to determine where measurements are required. We can often use knowledge of dynamic parameters to determine sample rate for discrete controllers, and therefore to design an observable and controllable solution. Where things are not so well defined in advance, we may need to set up some form of test to determine the required parameters empirically. I know that these are luxuries which are not generally available elsewhere – including the climate.
    A situation where we have absolutely no a priori snippets of information would lead me to question how we could even start to work out the hows and wheres of sampling. Not least, what information would we use to choose a sample frequency/distribution?
    This unhappy situation does not appear to be too far from what I can see in the assessment of global average temperature. I do not think the underlying system is well enough understood to come to decisions about how best to sample it. Perhaps the greater spatial coverage of the satellite systems will help to resolve that in time.
    But, IMO, the historic “instrumental” temperature record is quite another thing. Analysis of trend lines and error regions takes a remarkable degree of faith in underlying assumptions about: (i) the behaviour of the spatial field in different time scales; and (ii) the signal we are getting from an inherited and changing measurement system originally set up for all sorts of other purposes.
    I could take issue with those who suggest there are bigger fish to fry than concerns about sampling, spatial and perhaps even temporal aliasing. What do we have to give us the comfort that this data is more than just a pile of badly sampled and misleading junk? The trends may have about as much meaning as tracing the path of a drunk man staggering around in the dark.
    If talking about priorities, is there anything more important than such a fundamental question about the quality of the data?
    There is no question that the historic data has some value. It is better that we have it than not. But are we allowing ourselves to be impressed by the sheer mass of data? Are we trying to convince ourselves that the more of the data we use, the more meaning we can yield?
    Dan makes a perfectly good suggestion in this thread: “don’t try to make it a global or hemispherical average, just track trends at those admittedly small sample of sites.”
    But that kind of suggestion will mobilise an army of opinion, arguing that point measurements are not representative of the full spatial field. Those arguments are basically alluding to aliasing – an admission that the field does not behave in a way which can rely on sparse sampling.
    So where does this get us? JGC acknowledges a problem in what he calls “coverage bias”. Would it be better for JGC to put the analysis of historical data onto the back burner until we have some pretty convincing analysis and methods which will allow us to extract meaningful information from this historical data?
    Right now (I say it again) there is no convincing reason to pay any attention to those trend lines. I think Dan’s suggestion is just as convincing.

  109. Oh dear … spoke too soon. Latest from the Met:
    http://www.metoffice.gov.uk/corporate/pressoffice/2009/pr20091218b.html
    “New analysis released today has shown the global temperature rise calculated by the Met Office’s HadCRUT record is at the lower end of likely warming. … This independent analysis … uses all available surface temperature measurements, together with data from sources such as satellites, radiosondes, ships and buoys. …. The new analysis estimates the warming to be higher than that shown from HadCRUT’s more limited direct observations. ”
    If anybody wonders how this could be, here’s the MET’s latest excuse:
    “This is because HadCRUT is sampling regions that have exhibited less change, on average, than the entire globe over this particular period.”
    That’s right, you chooses your data and you gets your answer. And whaddayaknow – the latest analysis shows we woz right all along:
    Further:
    “This provides strong evidence that recent temperature change is at least as large as estimated by HadCRUT.”
    NO IT DOESN’T. All it shows is that the data is not robust. Look at the data differently and you get a different result. (That’s almost a way to explain what we mean by statistical insignificance.)
    Even the MET acknowledges this (although they probably don’t realise it). Look at the legend under the graphical presentation which refers to sparseness of the sampling:
    “The ECMWF analysis shows that in data-sparse regions such as Russia, Africa and Canada, warming over land is more extreme than in regions sampled by HadCRUT …. We therefore infer with high confidence that the HadCRUT record is at the lower end of likely warming.”
    As there appears to be no formal study of the characteristics of the sampling problem (reported in the literature) there is no “identification” of the problem which would support a decision on sampling methodology. Without that, this kind of analysis cannot rise above junk status.

  110. I think Gerlich and Tscheuschner talked about random variations in temperature in local regions across the globe due to cloud effects making it rather difficult to figure a global mean from a limited geographic sample. It would be interesting to compare data from home weather stations with the Hansen smoothing across geographic areas.
