The View from Down Here Isn’t Much Different

James A. Schrumpf

If there’s one belief in climate science that probably does have a 97% — or more — consensus, it’s that raw temperature data needs massaging, least-squaring, weighting, gridding, and who knows what else before it can be used to make a proper data set or anomaly record. I’ve been told several times that without some adjustments or balancing, my calculations will be badly skewed by the uneven distribution of stations.

So the null hypothesis of temperature data all along has been, “Just how bad/far off/wrong are the calculations one gets by using raw, unadjusted data?”

NOAA provides access to unadjusted monthly summaries for over 100,000 stations from all over the world, so that seemed a good place to begin the investigation. My plan: find anomaly charts from accredited sources and attempt to duplicate them using this pristine data source. My method very simple. Find stations that have at least 345 of the 360 records needed for a 30-year baseline. Filter those to get stations with at least 26 of the 30 records needed for each month. Finally, keep only those stations that have all 12 months of the year. That’s stricter filtering than BEST uses, as they allow fully 25% of the 360 total records to be missing. I didn’t see any indication that they applied the other criteria I used.
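
The three screens can be sketched roughly like this (a hypothetical layout where each station is just a set of (year, month) tuples with a valid monthly mean; the real NOAA files have their own format):

```python
def passes_baseline_filter(records, base_years=range(1951, 1981)):
    """records: set of (year, month) tuples with a valid monthly mean.
    Applies the three screens for a 30-year baseline:
      1. at least 345 of the 360 baseline months present overall,
      2. at least 26 of the 30 years present for each month,
      3. all 12 months must survive screen 2.
    """
    base = {(y, m) for (y, m) in records if y in base_years}
    if len(base) < 345:                        # screen 1
        return False
    for month in range(1, 13):                 # screens 2 and 3
        years_present = sum(1 for (y, m) in base if m == month)
        if years_present < 26:
            return False
    return True

# A complete station passes; one missing an entire month's worth fails.
full = {(y, m) for y in range(1951, 1981) for m in range(1, 13)}
no_july = {(y, m) for (y, m) in full if m != 7}
print(passes_baseline_filter(full), passes_baseline_filter(no_july))  # True False
```

The tuple-set representation is only a convenience for the sketch; the same screens apply however the records are stored.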

Let’s see what I ended up with.

Browsing through the Berkeley Earth site, I ran across this chart, and decided to give it a go.


I ran my queries on the database, used Excel to graph the results, and this was my version.


BEST has more data than I do, so I can’t go back as far as they did. But when I superimposed my version over the same timeline on their chart, I thought it was a pretty close match. Mine is the green line; you can see the blue lines peeking from behind here and there.


Encouraged, I tried again with the contiguous 48 US states.  Berkeley’s version:


My version, superimposed:


That’s a bit of a train wreck. My version seems to be running about 2 degrees warmer. Why? Luckily, this graphic has the data set included. Here are BEST’s average baseline temperatures for the period:

% Estimated Jan 1951-Dec 1980 monthly absolute temperature (C): %
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
% -4.07 -1.75 2.17 8.06 13.90 18.77 21.50 20.50 16.27 9.74 2.79 -2.36
% +/- 0.10 0.09 0.09 0.09 0.09 0.09 0.10 0.09 0.09 0.09 0.09 0.09

Here are mine:

-1.26 1.02 5.20 11.13 16.22 20.75 23.39 22.58 18.68 12.90 6.09 0.98
+/- 0.03 0.03 0.02 0.02 0.02 0.02 0.01 0.01 0.02 0.02 0.02 0.02

It’s obvious at a glance that mine are around two degrees warmer. I assume it’s due to the adjustments made by BEST, since NOAA says these are unadjusted data. Still, it’s also obvious that the line on the chart matches up pretty well again. Since the data set for the BEST version is provided, this time I compared the anomalies rather than the absolute temperatures.

The result was interesting. The graphs were very similar, but there was an offset again. I created a GIF animation to show the “adjustments” needed to bring them into line. The red line is BEST; the black is mine.
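
Comparing anomalies rather than absolutes removes any constant offset between two records. A minimal sketch of the baseline-and-subtract arithmetic, with invented values rather than the actual NOAA or BEST series:

```python
def monthly_baseline(series, base_years=range(1951, 1981)):
    """series: dict mapping (year, month) -> mean temperature (C).
    Returns the 12 baseline means used to form anomalies."""
    means = {}
    for month in range(1, 13):
        vals = [t for (y, m), t in series.items() if m == month and y in base_years]
        means[month] = sum(vals) / len(vals)
    return means

def anomalies(series, base_years=range(1951, 1981)):
    """Each reading minus its month's baseline mean."""
    base = monthly_baseline(series, base_years)
    return {(y, m): t - base[m] for (y, m), t in series.items()}

# Two series offset by a constant 2 C produce identical anomalies.
warm = {(y, m): 10.0 + m for y in range(1951, 1981) for m in range(1, 13)}
cool = {k: v - 2.0 for k, v in warm.items()}
a1, a2 = anomalies(warm), anomalies(cool)
print(all(abs(a1[k] - a2[k]) < 1e-9 for k in a1))  # True
```

This is why an absolute offset between two baselines can coexist with closely matching anomaly curves.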


At the moment, the GISS global anomaly is up around 1.2, while BEST’s is at 0.8. The simple averaging-and-subtracting method I used comes closer to BEST than does another highly complex algorithm that uses the same data.

My final comparison was of the results for Australia. As with the US, the curve matched pretty well, but there was an offset.


As with the US comparison, the BEST version had a steeper trend, showing more warming than the NOAA unadjusted data did. For this experiment, I generated a histogram of the differences between the BEST result for each month and the simple method’s.


The standard deviation was 0.39, and the vertical lines mark the first and second standard deviations. Fully 77.9% of the differences fell within one standard deviation of the BEST data, and 98.0% within two.
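
The coverage figures can be computed mechanically from the monthly differences. A sketch using synthetic, normally distributed differences (not the actual BEST-minus-NOAA residuals):

```python
import random
import statistics

def sd_coverage(diffs):
    """Fraction of values within 1 and 2 standard deviations of the mean."""
    mu = statistics.fmean(diffs)
    sd = statistics.pstdev(diffs)
    def within(k):
        return sum(abs(d - mu) <= k * sd for d in diffs) / len(diffs)
    return within(1), within(2)

random.seed(0)
diffs = [random.gauss(0.0, 0.39) for _ in range(5000)]
one, two = sd_coverage(diffs)
print(round(one, 2), round(two, 2))  # roughly 0.68 and 0.95 for normal data
```

For a normal distribution the expected fractions are about 68.3% and 95.4%, so the 77.9%/98.0% figures reported above suggest the differences are somewhat more concentrated than a normal curve.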

What this means in the grand scope of climate science, I don’t know. I’m certainly no statistician, but this is an experiment I thought worth doing, simply because I’d never seen it done. I’ve read many times that calculations without weighting, gridding, and homogenization would produce incorrect results. But if the answer is not known in advance (the way the “heads” share of a million coin tosses is known to come very close to 1 in 2), then how can it be said that this simple method is wrong, when it produces results so close to those from more complex methods that adjust the raw data?

May 7, 2019 2:13 pm

We’ve seen this over and over where the official records have cooled the past, and heated the present. This is not a surprise to skeptics who have been following the data sets.

Reply to  Jeff in Calgary
May 7, 2019 3:19 pm

Progressive (no pun intended) cooling of the past, as has been shown with both US and Aussie data manipulation, is very suspicious. However, what does this article add to the discussion?

So the null hypothesis of temperature data all along has been, “Just how bad/far off /wrong are the calculations one gets by using raw, unadjusted data?”

That is not an hypothesis, it is a question! Best not to use terms if you don’t know what they mean.

Not gridding data is obviously going to bias results one way or the other as the geographic density of sites moves over time. Apparently it has been relatively stable in Europe.

James Schrumpf
Reply to  Greg
May 7, 2019 5:47 pm


I did write poorly there, I admit. The dictionary defines the null hypothesis as “the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.”

I was attempting to make the point that there’s no significant difference between the simply-calculated anomalies and the gridded/homogenized/weighted/etc. version, except for the adjustments that may or may not be justified.

If there’s no true “expected value” for the mean to move toward, who’s to say which is the “correct” mean? In the case of sampling a population, which is what I was doing, the standard error is a calculation of how close to the mean you’d get 67% of the time if you repeated the entire experiment over with all new samples. It’s not telling one how close to the “true” mean one is.

Reply to  James Schrumpf
May 7, 2019 11:15 pm

Thanks for the reply James. The problem is you are lumping everything together: gridding/weighting/homogenisation etc. and some more questionable adjustments.

If you blindly take the simple arithmetic mean of everything, what does the answer indicate? It is a mindless statistic. What is the “true mean” you think you are aiming for?

You need to state what you are seeking to find, then develop a method to get your best approximation to it. If you are seeking to find out whether the world or country as a whole is getting warmer, you need to look at temperature changes ( called “anomalies” ) at each site. You do not want to bias towards a location because there is a high concentration of stations in that region, nor ignore others because they have few recording sites. That is the point of gridding.

Compare the east coast of Australia to the outback. If you take the blind average in Oz you just get the climate of the east coast, since it swamps the rest. There are some very dubious manipulations being done by BoM, but just doing a dumb average is not going to expose the malfeasance. You actually need to try a bit harder to show anything.
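
The gridding idea can be illustrated in a few lines: average stations within each grid cell first, then average the cells, so a dense coastal cluster counts once. A toy sketch with invented station positions (a fuller treatment would also weight cells by area, i.e. by cos(latitude)):

```python
import math
from collections import defaultdict

def gridded_mean(stations, cell_deg=5.0):
    """stations: list of (lat, lon, value).  Average within each cell
    first, then across cells, so dense clusters don't dominate."""
    cells = defaultdict(list)
    for lat, lon, v in stations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(v)
    cell_means = [sum(vs) / len(vs) for vs in cells.values()]
    return sum(cell_means) / len(cell_means)

# Ten clustered coastal stations reading 1.0 vs one inland station reading 0.0.
stations = [(-33.0 - i * 0.1, 151.0, 1.0) for i in range(10)] + [(-23.0, 134.0, 0.0)]
naive = sum(v for _, _, v in stations) / len(stations)
print(round(naive, 2), gridded_mean(stations))  # 0.91 vs 0.5
```

The blind average is dragged toward the dense cluster; the gridded average gives each region one vote.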

James Schrumpf
Reply to  Greg
May 8, 2019 3:02 am


I’m still not explaining myself well. I think all one can do with this data is get a “blind average.” Besides, we are doing anomalies, not absolute temperatures. Here’s a graph of the anomalies from 1900-2019 for Alice Springs and Sydney as 10-year running averages. Can you tell them apart?

comment image

If there were ten more of one or the other, how much would the mean of them be changed?

Reply to  Greg
May 9, 2019 8:46 pm

“Not gridding data is obviously going to bias results one way or the other as geographic density of sites moves over time.”

The biggest bias comes from the adjustments themselves. If you want to look at temperatures from decades back, the US has the best record around, yet it gets the most adjustments.

That’s why the only land temperature data record I trust at this point is the USCRN, with high quality sites and no adjustments. It only goes back to 2005, but since then temps have gone down not up.

comment image

That’s a couple of years out of date now, and temps are down even further recently. The last 6 months are averaging 2.7 degrees below the starting point of the record in Jan 2005.

To me, the fact that the historical adjustments introduce warming, and that when you eliminate the adjustments the warming stops, is strong evidence that the record being reported is heavily biased.

Izaak Walton
Reply to  Jeff in Calgary
May 7, 2019 3:20 pm

What this post shows is the exact opposite. James is saying that a simple average of NOAA records is virtually identical to more complicated methods for calculating global temperatures. So basically there is no evidence that official records are cooling the past.

Robert B
Reply to  Izaak Walton
May 7, 2019 4:25 pm

14 years ago, the sceptical argument was that it had only warmed 0.6° in the past 100 years and most of it was before 1940. Even Tom Karl said something similar 17 years ago. That has gone up by a third of a degree for the same period, mostly after 1940 now. The post shows that most of that was due to a little offsetting of sections. Hardly what you would expect if they actually did what they claimed.

Reminds me of the Kinnard reconstruction of sea ice extent in the Arctic. They just offset the middle of a 60-year oscillation in an older reconstruction – and cropped one year that suggested you could do the Northwest Passage in a pair of budgie smugglers.

Michael Jankowski
Reply to  Izaak Walton
May 7, 2019 5:05 pm

“…So basically there is no evidence that official records are cooling the past…”

Yeah…no evidence aside from the mountains of evidence that official records are cooling the past. What planet have you been living on?

Here’s just one example of NOAA

And past one here

And here

Try again.

Anthony Banton
Reply to  Michael Jankowski
May 7, 2019 11:15 pm

“Try again.”

The conspiracy theory that “they” are tampering with the record to make AGW look worse (or even to create it entirely, in the extreme Naysayer’s mind) has failed miserably if you look at the big picture…

comment image

The warming trend has been reduced pre-1970.
So yes, the past has been warmed, and considerably so.
Spot the difference since then.
It seems silly to tinker with the odd station’s record just so you sceptics can “smell a rat” and conjure the crutch of “tampering” and “fraud”, yet manage to make not one jot of difference to the overall story.

Matthew Drobnick
Reply to  Anthony Banton
May 8, 2019 7:47 am

Skeptical Science as the source.
What’s next Batman? CNN?
The Guardian?

Hahahaha, I love how the intellectually inferior continue to ingest information from known liars.

Reply to  Anthony Banton
May 8, 2019 5:23 pm

That’s only when you include the ocean temp observations. They cool past land temp observations dramatically. It is very noticeable in the US station data, which is by far the best set of land surface temp observations available for any ‘local’ global region.

Michael Jankowski
Reply to  Anthony Banton
May 8, 2019 5:35 pm

It’s really simple.

Someone said, “So basically there is no evidence that official records are cooling the past.” I presented evidence otherwise.

I didn’t say anything about a “conspiracy theory.” I didn’t say that I “smell a rat” or that there was “tampering” or “fraud.” Nor did any of the posts leading to my reply.

So after making all of that ish up and inventing an argument that didn’t exist, your evidence to oppose it was a graph hosted at Skeptical Science and lacking any attribution as to which official record and source it represented (with NOAA adjustments being the ones relevant to the author’s post and particularly in Europe, the US, and Australia).

I’d say “try again” to you as well, but your failures are weighing on society enough as it is.

Gary Pearse
May 7, 2019 2:49 pm

This is prima facie evidence that the adjusters haven’t actually adjusted each station based on quality considerations. Instead, they have undertaken a gross adjustment of the intact series. It’s as if a thumb tack were used at the pivot point to rotate the curve counterclockwise: the closer you are to the thumb tack, the smaller the adjustment, and there is a smooth transition from positive adjustments to the right of the pivot to negative adjustments to the left.

This shouts BS at the quality-control argument. Ask yourself: why would the “little bridge” between 1965 and 1980 be perfect, needing no adjustment, while the earlier record is shifted down and the later jerked up? In a court of law this would be indictable on the evidence.

Whenever I’ve queried some apparent flaw, Nick Stokes or some other is quick to say, yeah, but there was a station move! It stops you in your tracks. But hear that enough, couple it with the climate revelations of willful altering that they discussed openly, even corrupting history to support an agenda, or such things as Santer’s rewrite of the IPCC uncertainties into certainties or the Karlization of the Pause, and it dawns on you that station moves would be a good way to stop some decline in temperatures. It is likely one of the tools in their fudging toolbox. Why didn’t they move bad stations away from A/C exhausts and asphalt parking lots if they were so concerned about quality control?

Reply to  Gary Pearse
May 9, 2019 8:32 pm

Yes, there is a thumb tack in the middle, and then they use the Control Knob to rotate the curve counter clockwise.

What is the Control Knob, you ask? Why, it’s CO2 of course!

comment image?w=720

May 7, 2019 2:59 pm

The past is always cooled, the present is always warmed – statistical chicanery in the service of maximum FEAR, how surprising….

Jennifer Marohasy
May 7, 2019 3:21 pm

An important new report has just come out that explains these types of reconstructions are hopeless until we address issues of how the temperatures are measured:
It is by a really clever guy called Jaco Vlok, soon to be out of a job.

In Australia, the changeover to automatic weather stations (AWS), combined with no averaging of the one-second readings, makes for more warming for the same weather. To be clear, the Bureau is now taking spot one-second readings, and the highest for any one day is recorded as TMax. Also, they have an increasing number of these at airports within reach of the exhaust from large Boeing jets.

I’ve written lots on this, something recent that perhaps has relevant links:

Reply to  Jennifer Marohasy
May 7, 2019 4:19 pm

Rufus Black (UT) and Andrew Johnson (BOM) are currently discussing how to destroy Jaco Vlok without creating another Peter Ridd.
I’ll donate to Vlok’s defence . . .

a happy little debunker
Reply to  Jennifer Marohasy
May 7, 2019 6:26 pm

Hi Jennifer (luv your work, btw),

In addition to issues with how the BOM actually measures data (AWS vs analogue readings) – the other impact on their records is their homogenisation, using non-ACORN sites to adjust ACORN sites.

Most notably, using the Sydney Observatory readings to adjust nearby ACORN stations.

That they cannot, will not, or are unable to allow outsiders to review these processes suggests a corruption of those processes.

If it can’t be reproduced, it ain’t science!

Komrade Kuma
Reply to  Jennifer Marohasy
May 7, 2019 8:44 pm

No reflection on all those honest scientists out there whose intellectual integrity is a core part of their being, but I note that Jaco Vlok has a PhD in Electronic Engineering, which I assume means he has not had to take part in the ‘climate science’ groupthink and group talk at UTas.

UTas has a very solid engineering section, with civil, mechanical, and electrical departments, associated with a state that has been 90% renewable-energy driven via hydro for decades, and a world-class maritime section (the Australian Maritime College) with Naval Architecture, Ocean Engineering, and Marine & Offshore Systems degrees feeding into a range of serious engineering-based industries.

It will be interesting to see how Jaco Vlok’s career path continues.

May 7, 2019 3:22 pm

Someday perhaps some climate scientist will write a paper that explains why thermometers always showed too high temperatures before 1960 and too low after.

Of course I’m just joking.

May 7, 2019 3:25 pm

James A. Schrumpf

“My method (is?) very simple. ”

Great post…

May 7, 2019 3:28 pm

I think that the KISS method is the correct one. Whenever I see the words “A Anagram was used” my BS detector lights up.

A very interesting article, thank you.


Jeff Alberts
Reply to  Michael
May 7, 2019 6:33 pm

”A Anagram was used”

Especially since it should be “an anagram”. 🙂

May 7, 2019 3:31 pm

Goldilocks adjustments..

The data before 1960 was just too warm

The data after 1980 was just too cold

The data from 1960-1980 was jusssst right

steve case
May 7, 2019 3:55 pm

GISTEMP changes to their Land Ocean Temperature Index (LOTI) since 2002:

Here’s how GISTEMP’s LOTI adjustments have altered the slope since 1997:

GISTEMP has probably made nearly 100,000 changes to their LOTI report over the last two decades.

That these changes have been made cannot be disputed. Why they have been made is a matter of opinion.

Robert B
May 7, 2019 4:12 pm

I produced this 5 years ago to point out why there was a pause. Better than fitting straight lines and then arguing about the choice of end points.
comment image
The component consistent with an exponential increase due to increasing human emissions is one third of the whole degree of warming. The rest is an assumed constant rate of warming since the LIA and an oscillation with a 60-year period. Such an opinion shouldn’t make you a heretic of AGW, or even CAGW, but the pause made even this smaller contribution dubious. When human emissions should have been most noticeable is where the plot strays from the fit the most. It’s relevant to the post because it shows that only small changes are required to make a stronger case for human rather than natural causes for the changes that have occurred.

Gary Pearse
May 7, 2019 4:23 pm

Mods, I had a perfectly fine, apropos, ad-hominem-free comment that was initially up and has now been taken down. Is there an explanation? I know there were changes made, but to me it seems almost arbitrary sometimes. For me, some of these are a lot of work.

Rhys Jaggar
Reply to  Gary Pearse
May 8, 2019 1:23 am

This happens regularly here. You should always copy and paste into Word or some other WP programme so you have a copy before pressing the Comment button.

You should also assume that the more valuable your comment, the more likely it will be passed around before publication and quite possibly deliberately ditched to prevent claims of prior art…….

The Queensbury Rules do not work in the big bad world of science politics…..

May 7, 2019 4:25 pm

The thing to understand is not whether there are adjustments but why. Logic tells me that over the last 100 years older raw temperatures should be raised and/or later temperatures lowered to take account of the urban heat island effect, yet the adjustments almost always go the other way. The lack of verifiable and logical reasons for the adjustments confirms that, whether it’s deliberate or just confirmation bias, the data provided is misleading. The crazy thing is that if we can’t trust scientists with historical data, how can we expect any accuracy in their future predictions?

Steve O
May 7, 2019 5:27 pm

Part of the challenge is there are so many stations to validate. It seems that researchers all try to apply algorithms or other mass review methods to filter through them, when in fact they each need an individual evaluation.

I would rather see wider error bands due to having fewer stations, and only use pristine data. When adjusting data, the error bands are mathematically unknowable. How do researchers expand confidence limits based on possible adjustment error?

Steve Richards
Reply to  Steve O
May 8, 2019 11:21 am


May 7, 2019 5:46 pm

Again, when the range is a degree or two, I just don’t see how we can take anomalies in temperature data seriously.

Suppose a temperature-measuring station consisted of a larger area, where, say 12 different thermometers were distributed throughout the area. I wonder what sort of agreement there would be between the 12 thermometers just for one day, for one station. Would all the readings agree within one degree? Two? A fraction of a degree? [I don’t know]

All the focus, all the calculations, all the discussion around a degree or two still seems bizarre to me.

Gary Pearse
May 7, 2019 7:00 pm

My take on this is that they tell us the adjustments are for data-quality issues and give the impression that changes are applied to individual station data, some made warmer, some cooler. If that were actually the case, the result wouldn’t look like they stuck a thumb tack in between 1965 and 1980 and rotated the graph counterclockwise about 5 degrees, making it steeper.

How could it be a coincidence that no adjustment was necessary on the stretch between 1965 and 1980, but increasing amounts are subtracted from the readings going back to the 19th century and increasing amounts added coming up to the present? This can only be done by adjusting the trace of the record as a whole (i.e., the thumbtack method on the raw data) rather than by quality control on the readings themselves. Moreover, it means the entire record at each station was individually “thumb tacked”, i.e. a bad reading was made by an observer in Cape Town on the same date as one was being made in Paraguay, Ecuador, Nuuk and Malmo. WUWT?

Steve Sceptica
May 7, 2019 7:29 pm

Average is an abused measure of centrality. They have to process the data in order to get rid of the obvious sampling biases. The more you process, the more difficult it is to interpret and the more likely that human confirmation biases will be introduced to make the data ‘right.’

Finding the ‘average’ is simply trying to put a square peg in a round hole. But it does lead to a lot of papers discussing “proper” treatment and give enough wiggle room to get the numbers “right.”

Ordering-based statistics like the median (as opposed to sum-based statistics like the mean) make more sense to me for this type of data.
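
A quick illustration of that point with invented numbers: one corrupted reading shifts the mean noticeably but barely moves the median.

```python
import statistics

readings = [11.2, 11.5, 11.4, 11.3, 11.6, 11.5, 11.4]   # plausible daily means
corrupted = readings + [25.0]                           # one bad sensor value

# The mean jumps by about 1.7 C; the median moves by only 0.05 C.
print(round(statistics.mean(readings), 2), statistics.median(readings))
print(round(statistics.mean(corrupted), 2), statistics.median(corrupted))
```

This robustness to outliers is exactly what order statistics buy; the trade-off is that the median discards magnitude information that the mean retains.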

May 7, 2019 7:30 pm

“I’ve read many times that results of calculations not using weighting and gridding and homogenization would produce incorrect results”
They don’t (or shouldn’t) say that it would produce incorrect results, rather that it could. If you have a biased sample, it only causes trouble if there is a complementary trend in the data. The unweighted dataset will lean heavily to results for ConUS, for example. If ConUS behaves differently to global, this is a problem; if not, not.

I did a recent post here which compared various proper methods with just unweighted averaging. The latter was not particularly biased, but just all over the place. I had to graph it in pale yellow so the proper results would show through. The interest there was that if you take out the main variation that clashes with the sample bias, by subtracting a fit of spherical harmonics which have known (zero) integral, then the unweighted performs quite well.

A major practical problem with sampling without area weighting is that the result depends a lot on what turns up in the sample. It happens that GHCN V4 include a lot of ConUS stations. Someone might compile a set with a different emphasis and get a different result. Proper weighting counters such discrepancies.

Reply to  Nick Stokes
May 7, 2019 8:48 pm

To add to what Nick said, simply averaging station values can work IF the following things are true:

1) Stations are well-distributed across the area of interest. For example, if you are averaging all 32,000 or so global land stations with records of meaningful length, you need to account for the fact that around 10,000 of those are in the US. Otherwise you are giving 5% of the world’s land area 33% of the weight in your reconstruction. Thankfully if you are looking at a particular region, such as the US or Europe, stations tend to be well distributed within those regions, so the bias isn’t too large.

2) Station records are continuous through present – or, if they are not, the stations that “drop out” don’t have any climatological bias. In the USHCN dataset, the number of stations available has declined over the last decade. The network was created in the late 1980s and purposefully included stations that were currently reporting at that point in time. As much of USHCN represents a volunteer effort in the form of co-op stations, these volunteers will sometimes quit or pass away, and NCDC has had difficulty finding new volunteers to take over. It turns out that those stations that stopped reporting had an annual average temperature of around 12C, while those that continued reporting were closer to 11.5C. This causes a modest cool bias in the combined record if you average together the absolute temperatures, though you can easily get around this by converting them to anomalies relative to the baseline period. See this post for more details:

3) You can’t assign too much meaning to the resulting average absolute temperature values. Stations are located in specific spots, and not every mountaintop or valley in the country is blanketed with stations. Thus their average simply reflects the average of those locations, rather than the average of a wider region. Berkeley attempts to get a more accurate estimate of absolute temperature based on an approach that uses spatial interpolation (kriging) to estimate spatially complete temperature fields, taking into account elevation not just at the station locations but in all the areas in between. That’s why the climatology differs from a simple average of station values. There are other more sophisticated approaches like reanalysis that give even more accurate climatologies. See this discussion at RealClimate for details:
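
Point 2 above can be demonstrated with two synthetic stations that share the same trend but differ in absolute temperature, one of which stops reporting halfway through. Averaging absolutes fabricates a jump at the dropout; anomalies do not (invented numbers throughout):

```python
def avg_series(stations, use_anomaly, base=range(0, 10)):
    """stations: list of dicts mapping year -> temperature (C).
    Averages whatever stations report in each year, optionally as
    anomalies from each station's own mean over the base years."""
    out = {}
    for year in range(20):
        vals = []
        for s in stations:
            if year in s:
                ref = sum(s[y] for y in base) / len(base) if use_anomaly else 0.0
                vals.append(s[year] - ref)
        out[year] = sum(vals) / len(vals)
    return out

def trend(year):
    return 0.05 * year                                # identical warming everywhere

warm_site = {y: 12.0 + trend(y) for y in range(20)}   # reports throughout
cool_site = {y: 11.5 + trend(y) for y in range(10)}   # stops reporting at year 10

absolute = avg_series([warm_site, cool_site], use_anomaly=False)
anomaly = avg_series([warm_site, cool_site], use_anomaly=True)
print(round(absolute[10] - absolute[9], 2))   # 0.3: a 0.25 dropout artifact plus 0.05 trend
print(round(anomaly[10] - anomaly[9], 2))     # 0.05: just the real trend
```

The cool station’s disappearance warms the absolute average by a quarter degree even though nothing warmed; the anomaly version only sees the shared trend.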

James Schrumpf
May 7, 2019 9:14 pm

Hi Nick,

The way I view it, the important point is “What are you trying to show?” It seems to me that you believe an average global temperature, or anomaly, can be produced from this data if you weight it, interpolate it, grid it, and subtract from it a fit of spherical harmonics with known integral.

But here’s the thing: you can’t prove you have the “right” answer. My calculated average global anomaly was closer to BEST’s than NASA’s was. Does that mean I’m “more right” than NASA? Or is NASA correct, and BEST and I are way off?

There’s really no way to tell. When sampling a population, a standard error is just a statement of how close your new calculated mean would be to the original mean if you ran the entire experiment over with all new measurements. It tells you nothing about the “correct” mean.

My calculations, on the other hand, are a true representation of the mean of the stations I’ve decided to include in my sample. In my universe, this was the average anomaly for March. UHI doesn’t matter; the city isn’t going to disappear any time soon, so its contribution will be the same every month. My simple calculations will show what’s happening with each station, if I so desire, or for an entire continent.

The bottom line is that with only simple calculations, my anomalies for three different regions of the earth had a standard deviation within ±0.4°C of a “proper” method, and there’s no way to prove which method gave the “right” answer for the Earth.

However, I can show that my methods give the right answer for the stations.

Reply to  James Schrumpf
May 7, 2019 10:01 pm

“My calculations, on the other hand, are a true representation of the mean of the stations I’ve decided to include in my sample.”
That summarizes the problem. If you decided to include different stations, you might well get a different answer.

The purpose of calculating an average is usually to estimate a population mean – something that doesn’t depend on what you’ve decided to included in the sample. That is the point of “weighting, gridding, interpolating” etc.

James Schrumpf
Reply to  Nick Stokes
May 10, 2019 12:01 am

Nick, the standard error — aka the error in the mean — is one standard deviation from the value of the mean. It means that if you took the same number of random samples and performed the entire experiment over again, you would have a 67% chance of coming that close to the same mean value from the new samples.

That being the case, it stands to reason that if I got a standard error of 0.11 from a sample of 3500 out of 5200 stations, I’d get pretty much the same results if I decided to take a different sample.
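
That claim can be checked by resampling: draw many different samples of 3,500 from a synthetic “population” of 5,200 station anomalies and see how much the sample means scatter. A sketch with synthetic values, not the real station set:

```python
import random
import statistics

random.seed(42)
# A synthetic "population" of 5,200 station anomalies (invented values).
population = [random.gauss(0.8, 2.0) for _ in range(5200)]

# Draw 200 independent samples of 3,500 stations and record each sample mean.
means = []
for _ in range(200):
    sample = random.sample(population, 3500)
    means.append(statistics.fmean(sample))

# The sample means cluster tightly around the population mean.
print(round(statistics.fmean(means), 2), round(statistics.pstdev(means), 3))
```

With a sample that is two thirds of the population, the spread of the sample means is small; note this only speaks to sampling variability within this fixed station set, not to whether the set itself represents the Earth.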

Jeff Alberts
May 7, 2019 10:02 pm

I’ve mentioned before the tremendous temperature differences found just a few miles apart.

For example, where I work in Mt Vernon, WA, in high summer it may get to 90. By the time I get home, 13 miles away in a straight line, it’s usually 10-20 degrees cooler. The highest disparity I’ve seen was 27 F. This is why infilling and averaging readings from different stations is bad, very bad.

Reply to  Jeff Alberts
May 8, 2019 10:45 am


The NOAA GLERL temperature sensor for Chicago sits on the Water Intake Crib, about three miles offshore. I am on their site all the time checking the weather, as I can see it out my window here on the Lake. O’Hare and Midway are both about 9-10 miles the other way.

One day three years ago O’Hare was reporting 71 degrees while NOAA GLERL Chicago was reporting 41 degrees AT THE SAME TIME. Two stations 14 miles apart, 30 degrees different.

Steven Mosher
May 7, 2019 11:29 pm

“then how can it be said that this simple method is wrong, if it produces results so close to those from to more complex methods that adjust the raw data?”

By testing it James. Thoroughly and completely.

To test it you use the SAME data.

To do a proper test you change ONE variable.

You are testing a METHOD… call it Method B.
To test it against Method A, Method C, and Method D,
you use ONE data set with the different methods.

Here is one example how to do it

Simple averaging will always perform WORSE when the following occur

A) there is a change in average latitude over time
B) there is a change in average altitude over time.

The other way to test your method is to randomly remove data from your series
and see if the answer changes.

take your European dataset.

Now remove 10%, 20%, 30%, 40%, 50%, etc.

See if your curve changes.

if it does, then you know your method is not robust

simple averaging is not robust

What does this mean?

Excel is the sign of an amateur at work.
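
The removal test described above is straightforward to script: drop a random fraction of stations, recompute the averaged series, and repeat at several fractions. A sketch on synthetic stations that share one underlying signal:

```python
import random
import statistics

random.seed(1)
signal = [0.02 * y for y in range(40)]   # one shared underlying trend
# 500 synthetic stations: the shared signal plus independent noise.
stations = [[s + random.gauss(0, 0.5) for s in signal] for _ in range(500)]

def series_mean(subset):
    """Plain average series over a subset of stations."""
    return [statistics.fmean(st[y] for st in subset) for y in range(40)]

full = series_mean(stations)
for frac in (0.1, 0.3, 0.5):
    keep = random.sample(stations, int(len(stations) * (1 - frac)))
    worst = max(abs(a - b) for a, b in zip(full, series_mean(keep)))
    print(frac, round(worst, 3))   # worst-case deviation from the full curve
```

With a homogeneous synthetic network like this one, thinning barely moves the curve; the test only bites when the removed stations differ systematically, e.g. in latitude or altitude, as noted above.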

James Schrumpf
Reply to  Steven Mosher
May 8, 2019 3:59 am

Why Steven, how mean of you to call an amateur an amateur. I’ve never presented myself as a professional statistician, either on your site or mine. I’m doing all my work on an Oracle Enterprise database; I’ve been a professional DBA and software developer on various versions of it for over 30 years. However, Oracle’s graphing packages are not something I’ve ever had to use, so I dump the data and use Excel for graphing. I could have done it with R — would that have made it better for you?

Anyway, here’s a replot of the Europe averages with 60% and 40% of my stations used.

comment image

I don’t know, looks pretty robust to me.

May 7, 2019 11:33 pm

“if the answer is not known in advance, ”

But we do know the answer. We’re doomed. So we just have to make sure that the data supports that.

May 8, 2019 12:51 am

James, nice work.
Since you have the numbers try this.
Instead of correlating station ids correlate on lat/lon rounded to 2 decimals. The few times you get overlaps take the average.
Include both v3 and v4 in this process.
Now map the different versions inventory to those same lat/lon readings. Compare!
What do you see?

Reply to  MrZ
May 8, 2019 8:34 am

Sorry, I see now this was not clearly expressed.
The purpose of mapping temp readings to coordinates is that you can more easily spot trends between different releases at their shared locations. I used 2-decimal lat/lon because coordinate resolution differs between GHCNM versions.

Comparing GHCNMv3 and GHCNMv4 QCU January files
If you do MIN(tavg) and MAX(tavg) on every shared location for every year and plot them, the corrections should be random, but instead there is a trend. Average MAX(tavg)-MIN(tavg) starts at 0.057 °C in 1900 and peaks at 0.256 °C in 2010. (This is excluding -9999s and QCFLAGS.)

I can also see that the station set has moved north in NH and slightly south in SH. To check what that does with trends I need to use anomaly and gridding and it takes a bit more time.
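MrZ's version-comparison recipe (round coordinates to 2 decimals, average the rare overlaps, then compare values at shared locations) might look roughly like this in Python. The record layout here is a simplified stand-in for the real GHCNM files, and with only two versions the MAX-MIN spread per location reduces to an absolute difference:

```python
from collections import defaultdict

def by_location(records):
    """Group (lat, lon, year, tavg) records by lat/lon rounded to 2
    decimals, averaging overlapping stations at the same coordinate."""
    buckets = defaultdict(list)
    for lat, lon, year, tavg in records:
        buckets[(round(lat, 2), round(lon, 2), year)].append(tavg)
    return {k: sum(v) / len(v) for k, v in buckets.items()}

def version_spread(v3_records, v4_records):
    """Absolute v3/v4 difference at every location/year present in both
    versions; if the corrections were random these should show no trend."""
    a, b = by_location(v3_records), by_location(v4_records)
    return {k: abs(a[k] - b[k]) for k in a.keys() & b.keys()}

# Made-up sample: the same spot reported at slightly different precision
# in the two versions still lands in the same rounded 0.01-degree bucket.
v3 = [(51.479, -0.001, 2010, 11.2)]
v4 = [(51.481, 0.004, 2010, 11.5)]
print(version_spread(v3, v4))  # one shared location, spread of about 0.3
```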

This was

James Schrumpf
Reply to  MrZ
May 9, 2019 11:50 pm

Now, that’s more work than I was ever planning to do. I’m not looking to find the global average temperature of the Earth — that’s impossible. What I want to do is find the best possible picture of what temps at the stations are doing.

That’s a thing that can be done.

May 8, 2019 4:14 am

Thanks, James. Interesting to note that part of your work replicates some of that produced by Heller and presented on a youtube here.
(Hopefully WordPress will not truncate the address, or search youtube for Heller Corruption Of The US Temperature Record.)
Heller has done a lot of good work in this field. I realise that some scholars of maths and statistics might not like his presentational methods, but this is a war of minds and sometimes simple (but correct, of course) is best. Do not be overly discouraged by those critiques yourself.

The Reverend Badger
May 8, 2019 5:58 am

The fundamental point is this:
You are trying to find evidence of a change in the heat input/output balance in the system comprising the sun/atmosphere/rocky planet.
To do so by measuring the air temperature near the surface daily, taking the max/min and dividing by two isn’t really anything like what is needed considering basic physics.
It’s not fit for purpose.
It’s no good arguing that that is all you have got. I’m well aware of that. Arguing doesn’t make it any more fit for purpose, in fact it makes it worse because you are likely to delude yourself.
Try considering what you would ideally like to measure and how, in detail, you would do it if, as a thought experiment, you had another rocky planet with an atmosphere and you wanted to monitor the heat input/output over, say, 200 years with current technology.
Where would be the best place to put your temperature probes, how often would you take readings, etc.
Once you have that sorted and agreed then you could start such monitoring here on Earth, let it run and compare with the traditional crock of shite method currently in use.
I’ll wager the results would be quite illuminating.
I’ll stick my neck out and assert they won’t be telling you the same about heat input/output balance.
What you have with the CARW “debate” on here, as well as pretty well everywhere else, is arguments about the relative merits of the architectural design of the upper stories of the castle. I’m standing a little further away than most of you, and all I see is the whole edifice, built upon dodgy foundations, wobbling from left to right as the wind blows in different directions. The group over here by me is tiny, and NEITHER side near the edifice likes arguing or talking with us.
Trouble is, BOTH camps are going to get squashed when it all comes down, and who do you think will have to pick up the pieces? Well, I’m getting bored with that. It’s gone on tooooooooo long.
I’m getting my RPG out soon.
You have been warned.
Step away.
Or risk getting squashed.
If you are not sure you are welcome to come over here and look at it from afar, free tea and biscuit and a nice chat too.
You can even handle my RPG before you decide whether to defect or not!

brad tittle
Reply to  The Reverend Badger
May 8, 2019 9:07 am

@Badger — Nicely stated. I don’t quite know how the RPG will change anything, but the entire process is highly suspect.

God help you if you try and get someone to explain allocation of resources IF climate change is a dire thing vs IF climate change is not a dire thing. For me, those resources pretty much stay the same in either mode. Help increase transportation, energy availability, shelter, water and sanitation. Maybe not in that order, but those 5 things. Top of the priority list.

CO2… Not on the list of worries for a long long time

kevin kilty
May 8, 2019 8:43 am

Arguing over mean temperature time series has become very, very stale. I am old enough to recall Gilbert Plass’s many publications about warming trends in northern Europe pre 1950. His graphs were probably sufficient to demonstrate a warming planet, but we know now that the trends these portrayed reversed after the early 1950s for the next several decades. So, time series were sufficient to show how CO2 was warming the planet until they weren’t. Of course, there is a ready explanation, actually several, for why they reversed that preserves “appearances”. The climate system is complicated enough that one can always find an explanation for why a trend does whatever it does.

People making statistical adjustments to time series never mention, perhaps because they aren’t aware, that adjustments come with uncertainties of their own. Without carefully examining each station’s time series, and data about station moves, encroachment, and so forth, one can’t be certain that statistical adjustments are helping matters. Many criteria by which we judge the correctness of time series and our corrections are ad hoc. There is no basis for even believing that corrections resulting in a residual “anomaly” that is a series of random, normally distributed numbers, means the corrections are valid. Weighting criteria are ad hoc, too.

One thing that climate scientists (I am not one) are trying to accomplish with regard to mean temperature is to demonstrate that the Earth is out of equilibrium with regard to energy balance, and hence that some agent like CO2 or feedback is warming the planet. But this seems a pretty pointless goal when one considers that heat is being carried constantly toward the poles and toward the top of the troposphere; transport means we are always out of equilibrium, perhaps quite far from equilibrium, and we don’t fully understand and cannot model this transport accurately.

The thing activists are trying to accomplish is change to the political and economic structure of the world using a “climate crisis” as justification. Temperatures are in no way fit for this grand and frankly scary purpose.

What one thinks of the mean temperature record is dictated largely by one’s prior beliefs. If someone is predisposed toward belief in climate troubles, then a temperature time series means quite a lot. If one is skeptically inclined, it means much less. Looks much like belief in the paranormal doesn’t it?

Reply to  kevin kilty
May 8, 2019 3:36 pm

People making statistical adjustments to time series never mention, perhaps because they aren’t aware, that adjustments come with uncertainties of their own. Without carefully examining each station’s time series, and data about station moves, encroachment, and so forth, one can’t be certain that statistical adjustments are helping matters. Many criteria by which we judge the correctness of time series and our corrections are ad hoc. There is no basis for even believing that corrections resulting in a residual “anomaly” that is a series of random, normally distributed numbers, means the corrections are valid. Weighting criteria are ad hoc, too.

Indeed, there is scant understanding of the pressing need–or the rigorously proven methods–for VETTING highly adulterated climate data. The abject, ad hoc aspect of much of “climate science” derives from the fact that geographers and representatives of other “soft” sciences have dominated the field. Analytic aptitude along with bona fide physical insight are very scarce commodities.

May 8, 2019 9:43 am

This is all a crock. There can be averages of temperatures from various locations. There is no way to measure Global Average Temperature. The attempt is subject to manipulation such that no objective observer would ever trust the “results.” One single thermometer in the Arctic used to show temps across 1200 kilometers? Please.

Any dozen widely spaced long continuous records of temperature would show an effect from increasing CO2, if there is one. There is no evidence of one so far. The endless pontification about the global average, whatever they are calling it now, is completely unscientific.

I did not spell “pontification” wrong; what is your SpellCheck doing?

Much Ado About Nothing…

Reply to  Michael Moon
May 8, 2019 2:04 pm

Michael Moon

“One single thermometer in the Arctic used to show temps across 1200 kilometers? Please.”

I’m not interested in this boring CO2 discussion.

You might look at a comparison of 3 (!) GHCN daily stations above 80N (1 in Canada’s Nunavut, 1 in Greenland, one in Siberia) with UAH6.0’s 2.5 ° LT grid data in the 80N-82.5N latitude band (144 grid cells):

3 extreme anomalies out of 483 (below -10 or above +10 °C) were filtered out of the surface time series.

We should be honest: this is really impressive. Even after 2009, when the running means differ more strongly, a correlation is still visible.

The linear estimates of course do not match:
– GHCN: 0.76 °C / decade
– UAH: 0.42 °C / decade.

Three stations! It’s hard to imagine, but that’s what the data shows.

Reply to  Bindidon
May 10, 2019 10:00 am


Those are all more than 1200 kilometers apart. The gridding/kriging in all Global Average Temperature schemes are non-scientific.

My point is, if CO2 is changing temps on this planet, any dozen reliable long records of temperature in any dozen widely spaced locations would show it. Anyone who changes the data recorded by a sensor and publishes the changes must MUST be suspected of advocacy, not Science.

You cannot make a silk purse from a sow’s ear.

Reply to  Michael Moon
May 10, 2019 5:36 pm

I see.

You are manifestly not at all interested in the comparison of the surface station data (without any kriging) with that recorded in the lower troposphere by satellites.

“The gridding […] in all Global Average Temperature schemes are non-scientific.”


What is non-scientific is, for example, to compare 18000 stations located within 6 % of the Globe with 18000 stations located within 94 % of the Globe.

And again: why are you talking about CO2?

Dr Francis Manns
Reply to  Michael Moon
May 10, 2019 6:14 pm

NO NO NO! When you start linear regression, or any other regression, from a cold volcanic period like 1875 to 1934, the models are all bogus.

Dr Francis Manns
May 8, 2019 12:40 pm

The temperatures at Berkeley record a number of volcanic cool periods and a number of ENSO cycles. The most obvious on your graphs is Krakatoa in 1883. The temperature prior to 1875 had already reached the current range. There were 21 serious (cooling) eruptions between 1875 (Askja, Iceland) and 1934 (Robock, 2000) that are recorded on NOAA thermometers. I suggest you look at a more remote station like Key West, FL, or Death Valley, CA.

The modelers are misinterpreting regression toward the norm after the Little Ice Age. Regression toward the norm explains the hiatus. Climate models ought to be deleted.

May 8, 2019 4:37 pm

James Schrumpf

You seem, if I understand you well, to doubt the usefulness and correctness of latitude and longitude gridding of absolute or anomaly-based station data before averaging it into time series.

Maybe you should look at various graphs showing why you are wrong.

Here is a graph showing for CONUS the yearly average of daily maxima per station in the GHCN daily data set:

And here is the same for the Globe:

Maybe you think: „Aha. The Globe is like CONUS“.

But… here is the Globe without CONUS:

What happens in (2) is that you have in GHCN daily worldwide about 18000 US stations competing with 18000 non-US stations: this means that about 6 % of the Globe’s land surface has the same weight as the remaining 94 %.

When you exclude the US stations, you irremediably get another picture of the Globe.

If you now average, in a preliminary step, all stations worldwide into grid cells of e.g. 2.5 degree (about 70000 km² each) you obtain the following graph:

Yes: it looks like the preceding graph (3) in which all US stations had been excluded.

This is due to the fact that now, about 200 CONUS grid cells compete with 2000 grid cells outside of CONUS. Thus CONUS’ weight is reduced to about its surface ratio wrt the Globe. Fair enough!

Ironically, the gridding method gives, when applied to CONUS itself, a graph

showing that the number of daily maxima per station per year above 35 °C even decreases. The reason is again that the gridding gives more weight to isolated stations whose ‘voice’ was diluted within the yearly average of the entire CONUS area.

Just as an ungridded Globe looks like CONUS, an ungridded CONUS looks like the sum of its West and East coasts.

J.-P. D.

Reply to  Bindidon
May 9, 2019 9:11 pm

Please explain the physical mechanism by which CO2 in the atmosphere is expected to disproportionately affect non-US stations in such a way that the US trend should mismatch the rest of the globe.

Since I doubt you can describe such a mechanism, I will assume that the differences you are seeing are related to comparing a higher quality record in the US to a lower quality record from the rest of the globe.

Africa is huge, if you take garbage data from a handful of stations and smear it across the entire African continent it probably isn’t going to match the US record, ever.

But hey, if you really believe that CO2 is impacting the US differently than non-US then feel free to explain.

Reply to  KTM
May 10, 2019 1:47 am

The Explanation:
Americans are bigger than everybody else
American cars are bigger than everybody else’s
American trains are bigger than everybody else’s
American planes are bigger than everybody else’s etc
By simple extrapolation of the known facts, as listed above, the simple deduction is that American CO2 molecules are bigger than everybody else’s thus retaining and transferring more vibrational energy and necessitating a downward adjustment of American records. Simples.

Reply to  PeterGB
May 10, 2019 3:40 am


Thanks a lot for this amazing sarcasm!

Reply to  KTM
May 10, 2019 3:38 am


Please explain why you are so tremendously fixated on the CO2 story that you even manage to reply with such CO2-based blah blah to a comment completely lacking it.

And by the way, your comment perfectly shows that you don’t understand anything about station quality evaluation, let alone about time series computations.

May 9, 2019 4:04 am

James Schrumpf

I’m wondering why you cherry-pick some small parts of the Globe, like Europe, CONUS or Australia, instead of checking your simple [?|!] method against BEST’s entire land-only data:

Please download and enter it into Excel (you seem to be an excellent Excel user), enter the data you processed (I suppose you use GHCN daily), and proceed to a fair, complete comparison.

Maybe you see then that you possibly made more errors than BEST did adjust.

Your goal: to come as near to BEST as possible:

and compare your result with the little layman work I made, and with a more complex approach performed by Clive Best:

You will easily see there that Clive Best started the BEST/GHCNd comparison much earlier.

But my lack of knowledge and experience in statistics made the comparison with the professional work in earlier times simply hopeless because of the paucity of station data.

J.-P. D.

James Schrumpf
Reply to  Bindidon
May 9, 2019 11:28 pm

Hi Bindidon,

Sorry it took me so long to get back here. Lots to do with a new puppy.

Anyway, I cherry-picked those three graphs because I wanted to try to reproduce a graph from a reputable source, and they included their data sets with the graphic, so I was able to plug their numbers right in along with mine to really see how they matched up. The world graphic I found on BEST included ocean data, which I didn’t have. I really appreciate you providing the link to BEST’s global average data, using TAVG to boot, the same as I’m using.

The following is a chart I made using BEST’s global data set from the link you provided. It’s a 10 year rolling average, and for my data I did one with US data and one without.

comment image

The red line is BEST, the black line is mine with US data included, and the green is mine with the US data left out. I don’t know how many stations were included in the BEST chart, but mine used 5436 stations with US data, and 2659 stations without.

That’s just about half the stations out of the data, and the line doesn’t look all that different to me — does it to you? There’s about a 40-year period from about 1950 to 1990 where all three lines were lying right on top of one another. I don’t think the three lines ever get more than maybe a half-degree apart over the whole 120-year period. I don’t think GISTEMP, HADCRUT, UAH, or any other data sets agree with one another that well. I could be misremembering, though.

I think you might have missed from my explanation of the simple method I use that I’m not ever averaging temperatures anywhere but in the scope of one individual station. I think that makes the method more impervious to data dropout.

What I do is get a baseline for each station that has sufficient good data (no -99.99 values or QFLAGs) to be included. It also has to have 345 of the 360 records that make up a 30-year by 12-month baseline range. After those have been rolled up into monthly series, I filter out any station that doesn’t have at least 26 of the 30 records needed for each monthly series, e.g., if a station only has 25 Januaries of the 30-January series from 1951 to 1980, that station is gone. Finally, if a station is missing a month from the annual series of Januaries, Februaries, etc, it gets left right out, too.

When the entire process is done (takes about three minutes from start to end on my Oracle database) I have a huge collection of anomalies for each individual station. Then I average the anomalies across each month of the year to get each value for Jan, Feb, Mar, etc. Since the range of one station’s anomaly is usually similar to the range of another’s, it doesn’t matter so much when data drops out.
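As a sketch, the filtering and averaging steps described above could be written like this in Python. The data layout ({(year, month): tavg} per station) is an assumption made for illustration; the actual work was done on an Oracle database:

```python
from collections import defaultdict

BASE_YEARS = range(1951, 1981)  # the 30-year baseline period described above

def station_baseline(monthly):
    """monthly: {(year, month): tavg}. Returns {month: baseline mean}, or
    None if the station fails the filters from the comment: at least 345
    of 360 baseline records, at least 26 of 30 per month, all 12 months."""
    base = {(y, m): t for (y, m), t in monthly.items() if y in BASE_YEARS}
    if len(base) < 345:
        return None
    per_month = defaultdict(list)
    for (_, m), t in base.items():
        per_month[m].append(t)
    if len(per_month) < 12 or any(len(v) < 26 for v in per_month.values()):
        return None
    return {m: sum(v) / len(v) for m, v in per_month.items()}

def regional_anomalies(stations):
    """Average per-station anomalies (value minus that station's own
    monthly baseline) across all qualifying stations, per (year, month)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for monthly in stations:
        baseline = station_baseline(monthly)
        if baseline is None:
            continue  # station rejected: never contributes anywhere
        for (y, m), t in monthly.items():
            sums[(y, m)] += t - baseline[m]
            counts[(y, m)] += 1
    return {ym: sums[ym] / counts[ym] for ym in sums}
```

Note that temperatures are only ever averaged within one station (its own baseline); everything averaged across stations is already an anomaly.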

Finally, here’s the chart of my global average data, with and without the US data. It changes some, but I don’t think it’s very much. What do you think?

comment image

Reply to  James Schrumpf
May 10, 2019 5:28 pm

James Schrumpf

Thanks for the reply.

Looks interesting at first glance, but I have some more specific comments; I’ll come here again soon.

May 9, 2019 1:19 pm

James: In many scientific fields, we design experiments to determine what we want to know. Let’s imagine we are time travelers and could go back in time and measure temperature anywhere on the planet. For GHG-mediated global warming, we believe that GHGs slow radiative cooling to space, leaving more energy in the environment and therefore a higher temperature. So perhaps we would design an experiment to measure the temperature and average local heat capacity per unit area, and then see how much energy has built up over time in terms of W/m2. The ARGO buoys are already doing a pretty good job of measuring how much energy is accumulating in the ocean, so let’s give ourselves the job of measuring energy accumulation over land. Since there is the same weight of air over every square meter, the heat capacity of the atmosphere is the same over every square meter. If we assume the lapse rate remains constant (a reasonable approximation), then warming everywhere in the atmosphere will be the same as warming 2 m above the surface. So, maybe we would design an experiment where we placed one temperature station in every 1, 10, 100, 1000 or 10,000 km^2. If we intended to compare with a climate model, we might place one temperature station at the center of every land grid cell in the bottom layer of the climate model. However, I doubt we would scatter temperature stations at random, or put them mostly in cities, or near the coast, or near airports.

Unfortunately, we aren’t time travelers. We must work with the data we have, despite the fact that the thermometers aren’t ideally distributed. Nevertheless, we can still make a grid and average all of the thermometers in each grid cell, so we can have one temperature for every (say) 10,000 km^2 grid cell. Then we can average all of the grid cells. BEST has a better way of doing this: kriging, which is a way of using the data to determine a temperature “field”. That field is continuous: T(x,y) is the temperature at point x,y. Gridded temperatures are discontinuous.
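A minimal sketch of the grid-average idea described here: average the stations within each cell, then average the cells. Real implementations would also weight cells by area, e.g. by cos(latitude), which this toy version omits:

```python
from collections import defaultdict
import math

def simple_mean(stations):
    """Plain average over station values, ignoring where they are."""
    return sum(v for _, _, v in stations) / len(stations)

def gridded_mean(stations, cell_deg=2.5):
    """Average within each cell_deg x cell_deg lat/lon cell first, then
    average the cells, so a dense cluster counts no more than a lone
    station elsewhere. (Area weighting by cos(latitude) is omitted.)"""
    cells = defaultdict(list)
    for lat, lon, value in stations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(value)
    means = [sum(v) / len(v) for v in cells.values()]
    return sum(means) / len(means)

# Hypothetical: 18 clustered stations reading 1.0 vs 2 remote ones at 0.2.
stations = [(40.0 + i * 0.1, -100.0, 1.0) for i in range(18)]
stations += [(-20.0, 130.0, 0.2), (60.0, 10.0, 0.2)]
print(simple_mean(stations))   # about 0.92: dominated by the cluster
print(gridded_mean(stations))  # about 0.47: cluster collapses to one cell
```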

All of the temperature indices calculate temperature per unit area and average, because that is what climate scientists most want to know from an energy accumulation perspective.

You may have a different perspective. You may care about how much warming is hurting people and design an experiment with one thermometer in the center of every 100,000 people. Or put the thermometers where both people and agriculture are. However, whatever you want to know, thermometers probably have not been placed at the right locations to give you a MEANINGFUL answer, so simply averaging is unlikely to provide the best answer. As best I can tell, simple averaging serves no purpose.

With the advent of satellite altimetry, we can measure rising sea level everywhere in the ocean. However, I only care about SLR at the coast, where people are living. That happens to be where tide gauges are located. In this case, simple averaging would be more useful than the global average we get from satellite altimetry. However, a better answer would be to weight the tide gauge data by the value of the land at risk around the tide gauge. (Climate scientists need the global value to prove thermal expansion, ice cap melting, glacier melting, ground water depletion, and reservoir-filling explain the rise we see.)

James Schrumpf
Reply to  Frank
May 9, 2019 11:34 pm

Actually, what I was doing was an experiment to see how far off a plain averaging scheme would be from the complicated systems used by others.

It seems to be working out pretty well so far.

May 12, 2019 3:28 pm

James Schrumpf

I lacked the time to go back into the center of our discussion about GHCN daily. But in the meantime, here is some related info concerning similar things when processing sea level data in about the same manner as temperature time series:

Below you see a graph comparing two evaluations, the one without and the other with gridding:

How important it is to do this gridding you see here:

In this graph you have 1513 tide gauge stations. The smallest points show grid cells with up to 4 gauges.
The bigger points show grid cells with up to about 20 of them.

12 % of the grids totalise 40 % of the stations!

But… this is not the end of the story, because our simple-minded baselining eliminates lots of stations that provided data in earlier times; their data is then absent because they lack sufficient data within the baseline period.

This leads to a strange bulge in the time series which would very probably not appear with correct baselining.

A very first trial to solve the problem was to generate 4 time series baselined wrt different reference periods, each about 30 years long, and to rebaseline 3 of them wrt the main one (1993-2013):

This motivates an automated solution which, if successful, could be brought back to the temperature data processing.
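The rebaselining step described here (aligning series built on different reference periods to the main one) usually comes down to a constant offset matched over the overlap. A sketch of that one step only, not of the actual procedure used for the chart:

```python
def rebaseline(series, target, overlap_years):
    """Shift `series` ({year: anomaly}) by a constant so that its mean
    over the overlap matches `target`'s mean over the same years."""
    common = [y for y in overlap_years if y in series and y in target]
    offset = (sum(target[y] for y in common)
              - sum(series[y] for y in common)) / len(common)
    return {y: v + offset for y, v in series.items()}

# Toy example: a series baselined on another reference sits 1.0 too high.
old = {2000: 1.0, 2001: 1.0, 2002: 1.5}
main = {2000: 0.0, 2001: 0.0}
print(rebaseline(old, main, range(2000, 2002)))  # {2000: 0.0, 2001: 0.0, 2002: 0.5}
```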

It seems that Church and White, whose work has been subject to harsh criticism on this blog, can’t have been that wrong after all, can they?

In a few days, I’ll come back again and present similar data concerning GHCN daily.

J.-P. D.

James Schrumpf
Reply to  Bindidon
May 13, 2019 10:41 pm

Hi Bindidon,

I think we should keep the focus of our conversation on the data set I’m using, rather than one I’m not. I’m using the GHCN Monthly summaries — not the daily measurements, and not tide tables — because I want to compare my results with those of reputable sources such as NOAA, NASA, Hadley, and Steven and Nick.

Yes, I’ve been told many times how important gridding is, but the point of my experiment is to see the results of my relatively simple method — which does not involve averaging temperatures — compared to other, more complicated methods.

But, since you’ve gone to the trouble of posting these charts, by all means we should discuss them in relation to my results.

Looking at the first chart, it’s apparent that since 1983 or so — the last 36 years — gridding has very little effect on the line. Why is that? I presume it means that the number of stations in the grids has evened up somehow. Can you think of other reasons for the lines to have merged?

If you look at my charts, you’ll see that there’s no “bulge” from eliminating station data, and all three lines are very close over the entire 120-year series. My line is noisier, but that’s because I don’t do any smoothing beyond what a 12-month or 10-year running average does. Remember, I’m not attempting to find the “true” value of the global anomaly, I’m just presenting what the station data tells us.

You haven’t seemed very interested in trying to understand the significance of a “simple-minded” method that matches up to the complicated models so well. I’m not trying to make conclusions about the validity of the complicated methods; I’m just observing that a much simpler and direct method seems to give results just as good.

I don’t know anything about Church and White, so I can’t comment on their work.

I would like to hear some commentary from you on my method — I can’t really call it “my” method, as it’s about as simple as can be — rather than hear more about the importance of gridding to achieve proper results. It seems that calculating an anomaly for each station individually, then averaging those results, has the same effect as gridding: reducing the values to anomalies first, and then averaging those, reduces — or even eliminates — any benefit of gridded data.

May 14, 2019 6:06 pm

James Schrumpf

Apologies for this reply, but… to write “my method” about what has very certainly been designed and implemented 40 years ago is a bit (too) much. Indeed, it is “about as simple as can be”; I tested it 5 years ago, but quickly understood that it is by far too simple.

And no, Mr Schrumpf: ” calculating an anomaly for each station individually, then averaging those results”, does NOT have “the same effect as gridding.”

I thought you would understand that when looking at sea level data processing. You didn’t.

So let us come back to temperatures.

I generated yesterday two time series with absolute data (out of GHCN daily, but I could have done the same job out of GHCN monthly V3 or V4):
– one without gridding;
– one with gridding.

The average temperature over all months from Jan 1880 till Dec 2018:
– no gridding: 10.2 °C
– gridding: 15.2 °C.

How is that possible?

The GHCN daily data processing told me this:
– 35537 stations contributed during the period above;
– these stations were/are located within 2567 grid cells.

This gives in average about 14 stations per grid.

The 413 grid cells containing more than 14 stations made a total of 29116 stations. This means that no more than 16 % of the cells in the world contain 83 % of the stations.

And that’s how it looks:

As you easily can see, the 413 grid cells are nearly all located in the US and in Southern Canada, in Northern Europe and in South East Australia.

Thus if you don’t grid, your global average temperature will look like an average of these three corners, and not like an average of the Globe. How could 17 % of the data manage to beat 83 % of it?

This you can see in this graph below, in which no gridding was applied:

As you can see, the blue line looks very flat. No wonder: its data shows a trend of 0.077 °C / decade.

The red line (whose data does not contain any US station data) has a much steeper slope at 0.15 °C / decade, i.e. nearly twice as much.

And now have a look at the chart below, which compares gridded data for the Globe with and without the US:

Now, due to the fact that we have 413 grid cells competing with 2154 cells, instead of having 29564 stations competing with 5973 stations, the comparison becomes quite different: the two plots are nearly identical, and their trends are equal at 0.10 °C / decade.

Your words: „It seems that by reducing the values to be averaged to an anomaly first, and then averaging those reduces– or even eliminates — any benefits of gridded data.“

Sorry, Mr Schrumpf: this is simply wrong.

James Schrumpf
May 15, 2019 10:42 am

Hi Bindidon,

You might have got different results if you had reduced the values to anomalies first, and then averaged those. From the description of what you did, you again averaged absolute temperatures in some manner and then turned those into anomalies. I’m guessing you took the stations within one grid and averaged those temps to get a grid average temperature. Then you used those grid averages in some way to produce anomalies to average into your global average anomalies — perhaps with some extra processing as well.

You should take the blue line from your non-gridded chart (globe with US data) and lay it along the lines from your gridded data chart. They match almost perfectly back to around 1953 or so — around 66 years.

Why should non-gridded data chart so perfectly over gridded data for two-thirds of a century?

I’ll post some more when I’ve had more time to look your and my data over.

Reply to  James Schrumpf
May 15, 2019 5:31 pm

James Schrumpf

“You might have got different results if you had reduced the values to anomalies first, and then averaged those. ”

But… this is exactly what I do, Mr Schrumpf.

How else would it be possible to build correct anomalies of e.g. nearby stations having different characteristics (rural vs. urban, sea level vs. altitude)?

” From the description of what you did, you again averaged absolute temperatures in some manner and then turned those into anomalies. I’m guessing you took the stations within one grid and averaged those temps to get a grid average temperature. Then you used those grid averages in some way to produce anomalies to average into your global average anomalies — perhaps with some extra processing as well.”

But… this is exactly what I did not, Mr Schrumpf.

The difference between what you do and what I do might be found in how you handle data provided by those stations you reject because they do not have enough data within the reference period you chose.

“You should take the blue line from your non-gridded chart (globe with US data) and lay it along the lines from your gridded data chart. They match almost perfectly back to around 1953 or so — around 66 years.”

This is exactly the problem you completely misunderstand: the similarity you are referring to is irrelevant.

What is relevant is:
– the difference between the two non-gridded time series showing how much US warming in the past influences the global average;
– the similarity between the gridded time series showing how gridding prevents this bias.

The discussion is useless, and becomes really boring because you manifestly prefer to ignore the need to dampen the preponderance of American stations, e.g. by gridding.

The urge in reinventing the wheel is as old as is the wheel itself. I wish you all the best, Sir.

James Schrumpf
Reply to  Bindidon
May 15, 2019 10:17 pm

You might have also found it interesting that the “simple-minded reinvention of the wheel” method produced a trend of 0.087, or more properly for this data, 0.09 or even 0.1 °C/decade over the 120-year series from 1900 to now.

May 15, 2019 6:44 pm

Seems to me you are ignoring the fact that the non-gridded, US-included line falls right on top of your gridded lines for two-thirds of a century. You seem unable to explain it and prefer to handwave it away as irrelevant.

It also appears to be lost on you that your blue, ungridded, US data had a trend of 0.077C/ decade, while your gridded data had a trend of 0.10C/decade.

The difference in those trends is only 0.023C. What was your standard error? It only needs to be 0.012 for that difference to be statistically insignificant. At any rate, it’s hardly the vast gulf you make it out to be.
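The significance argument here can be made concrete: with two independently estimated trends, the gap is conventionally called significant only if it exceeds roughly twice the combined standard error. A sketch using ordinary least squares, ignoring the autocorrelation that real temperature series have (which inflates the true uncertainty):

```python
import math

def trend_with_se(xs, ys):
    """OLS slope and its standard error for a simple time series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(resid_ss / (n - 2) / sxx)

def trends_differ(slope1, se1, slope2, se2, z=2.0):
    """Two independent trend estimates are distinguishable (at roughly the
    95% level) only if their gap exceeds z times the combined error."""
    return abs(slope1 - slope2) > z * math.sqrt(se1 ** 2 + se2 ** 2)

# With the trends from the thread (0.077 vs 0.10 C/decade), per-trend
# standard errors of about 0.01 already make the gap indistinguishable:
print(trends_differ(0.077, 0.01, 0.10, 0.01))  # False
```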

It’s too bad you decided to give up and go home. It could have been interesting if you didn’t believe what you couldn’t explain was unimportant.
