
Image Credit: Walter Dnes
By Walter Dnes – Edited by WUWT Regular Just The Facts
I have developed a methodology and spreadsheet to capture and chart HadCRUT3, HadCRUT4, GISS, UAH, RSS, and NOAA monthly global temperature anomaly data. Calculations are also done to determine the slope of the anomaly data from any given month to the most recent month of data. Your help is needed to validate the methodology I’ve created and to leverage the expertise and resources of WUWT to keep this data reasonably up to date.
In order to dispense with any potential legal/copyright questions:
1) I, Walter Dnes, hereby declare that the end-user programming in the spreadsheet and this article are entirely my work product.
2) I, Walter Dnes, grant you the royalty-free, perpetual, irrevocable, non-exclusive, transferable license to use, reproduce, modify, adapt, publish, translate, create derivative works from, distribute, perform, and display the aforementioned end-user programming and spreadsheet (in whole or part) worldwide and/or to incorporate it in other works in any form, media, or technology now known or later developed.
The spreadsheet is on Google Docs at this URL:
https://docs.google.com/spreadsheet/ccc?key=0AnTohu4oFUbcdEgzTkpEYTAwN1BiXzJXMXZ5RVJiOUE&usp=sharing
Some of the graphs can take several seconds to render, because of the complexity and sheer amount of data. I acknowledge that implementing a spreadsheet via a web interface is an amazing feat. I do not wish to detract from that, or complain about it. However, there are some limitations that require workarounds. I will note them as necessary.
I’ve sized all graphs to a 1920×1080 screen. My apologies to users with smaller screens. The details of the graph would be difficult to see if the graph was reduced.
The monthly date convention used in this spreadsheet is to refer to monthly data by the end of the month in question. So January 2008 would be 2008.083 (year 2008, offset by 1/12th of a year, i.e. 1 month). This continues through November 2008 (2008.917) and December 2008 (2009.000). This may seem a bit odd, but in computing we often start at zero rather than 1, and it works out better in many cases. Here is a sample set of dates to familiarize you with the idea (a short code sketch after the list shows the same conversion)…
2007/12 == 2008.000 where .000 = Dec data for previous year (2007)
2008/01 == 2008.083 where .083 = Jan data for current year (2008)
2008/02 == 2008.167 where .167 = Feb data for current year (2008)
2008/03 == 2008.250 where .250 = Mar data for current year (2008)
2008/04 == 2008.333 where .333 = Apr data for current year (2008)
2008/05 == 2008.417 where .417 = May data for current year (2008)
2008/06 == 2008.500 where .500 = Jun data for current year (2008)
2008/07 == 2008.583 where .583 = Jul data for current year (2008)
2008/08 == 2008.667 where .667 = Aug data for current year (2008)
2008/09 == 2008.750 where .750 = Sep data for current year (2008)
2008/10 == 2008.833 where .833 = Oct data for current year (2008)
2008/11 == 2008.917 where .917 = Nov data for current year (2008)
2008/12 == 2009.000 where .000 = Dec data for previous year (2008)
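If it helps, here is a minimal Python sketch (purely an illustration, not part of the spreadsheet) of the same end-of-month conversion:

def end_of_month_decimal(year, month):
    """Map a calendar year/month to the end-of-month decimal date used above.
    December of one year deliberately maps to .000 of the following year."""
    return round(year + month / 12.0, 3)

# Examples matching the table above:
# end_of_month_decimal(2008, 1)  -> 2008.083  (January 2008)
# end_of_month_decimal(2008, 11) -> 2008.917  (November 2008)
# end_of_month_decimal(2008, 12) -> 2009.0    (December 2008)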
And now for an overview of the spreadsheet…
Tab “temp_data”:
Anomaly data
Column A is date in decimal years in the manner noted above.
Column B is HadCRUT3 anomaly data
Column C is HadCRUT4 anomaly data
Column D is GISS anomaly data
Column E is UAH anomaly data
Column F is RSS anomaly data
Column G is NOAA anomaly data
____________________________________________________________________
Slope data
Column I has the slope for each corresponding cell in column B (HadCRUT3) from that cell’s date (Column A) to the most recent month with data for that dataset.
Column J slope data for Column C (HadCRUT4)
Column K slope data for Column D (GISS)
Column L slope data for Column E (UAH)
Column M slope data for Column F (RSS)
Column N slope data for Column G (NOAA)
For columns I through N, the earliest cell with a negative value indicates how far back one can go in a temperature series while still having a negative slope. The slope data is plotted in the tabs named after the datasets. This allows one to see where the slope value crosses zero.
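For anyone who wants to check the slope columns outside the spreadsheet, here is a rough Python sketch of the same calculation (the variable names are mine, not anything in the spreadsheet). It assumes dates and anoms are parallel lists of decimal dates and anomalies, and uses an ordinary least-squares fit, which is what the spreadsheet SLOPE() function computes:

from statistics import linear_regression  # requires Python 3.10+

def trailing_slopes(dates, anoms):
    """For each starting month, return the least-squares slope (anomaly units
    per year) from that month through the most recent month of data."""
    slopes = []
    for i in range(len(dates) - 1):          # need at least two points
        fit = linear_regression(dates[i:], anoms[i:])
        slopes.append(fit.slope)
    return slopes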
____________________________________________________________________
12 month running means
Column P is HadCRUT3 12-month running mean anomaly.
Column Q is HadCRUT4 12-month running mean anomaly.
Column R is GISS 12-month running mean anomaly.
Column S is UAH 12-month running mean anomaly.
Column T is RSS 12-month running mean anomaly.
Column U is NOAA 12-month running mean anomaly.
Column V is left blank for data-import when updating data.
A couple of notes about limitations of Google’s online spreadsheet
1) You cannot enter text manually in graph legends. The spreadsheet can, however, use text from the first row of the series, i.e. the “header row”. Cells P11 through U11 have the series’ names in them, for use in the legend.
2) Scatter graphs will not work properly with nulls/blanks in a series. In order to get series of varying lengths to plot properly, dummy values have to be inserted to fill in the shorter series. I use -9 as the filler value.
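As a cross-check on the running-mean columns and the -9 filler convention, here is a rough Python sketch of the same idea (again my own variable names, not the spreadsheet’s): a trailing 12-month mean, with the -9 dummy value used wherever a full 12 months of real data is not available:

def running_mean_12(anoms, filler=-9):
    """Trailing 12-month mean of a monthly anomaly series. Months without a
    full 12 months of real data get the filler value, mirroring the -9
    dummy-value convention described in note 2 above."""
    out = []
    for i in range(len(anoms)):
        window = anoms[i - 11:i + 1] if i >= 11 else []
        if len(window) == 12 and all(x is not None for x in window):
            out.append(sum(window) / 12.0)
        else:
            out.append(filler)
    return out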
Tab “HadCRUT3”:
Is a graph of slope values for each month for the HadCRUT3 series. The slope is from the month of the cell (given in Column A) to the most recent month of data. It uses data from Column I. Note that due to complexity limits in the Google spreadsheet, the values are only calculated for part of the data series.
Tab “HadCRUT4”:
Is a graph of slope values for each month for the HadCRUT4 series, using data from Column J.
Tab “GISS”:
Is a graph of slope values for each month for the GISS series, using data from Column K.
Tab “UAH”:
Is a graph of slope values for each month for the UAH series, using data from Column L.
Tab “RSS”:
Is a graph of slope values for each month for the RSS series, using data from Column M.
Tab “NOAA”:
Is a graph of slope values for each month for the NOAA series, using data from Column N.
Tab “12mo1850”:
Is a graph of 12-month running means of anomalies from January 1850 to present.
Tab “12mo1979”:
Is a graph of 12-month running means of anomalies from 1979 to present. This covers the satellite data era.
Navigating Through The Spreadsheet:
Spreadsheet navigation is similar to Excel, the biggest difference being that pressing the {END} key immediately takes you to the far right-hand side of the page. Similarly, pressing the {HOME} key immediately takes you to the far left-hand side of the page. The equivalent of the {END}{UP}, {END}{DOWN}, {END}{LEFTARROW}, and {END}{RIGHTARROW} combinations is to hold down the {CTRL} key while pressing the arrow in the direction you wish to jump.
Interpreting The Slope Graphs:
The slope graphs in the tabs “HadCRUT3”, “HadCRUT4”, “GISS”, “UAH”, “RSS”, and “NOAA” represent the slope from a given month to the latest available data. Note that the graphs are a guide to narrow down the earliest month with a negative slope. The authoritative numbers are in columns I through N of tab “temp_data”, which list the slopes. Two examples follow. ***IMPORTANT*** As additional months of data come in, the slope numbers and graphs will change each month. The numbers and graphs used in these examples were generated in mid-May 2013, using data to the end of April 2013, so do not expect to see the same numbers as in the screenshots. To find the longest period of negative slope, find the leftmost (i.e. earliest) point in the HadCRUT3/HadCRUT4/GISS/UAH/RSS/NOAA graphs which has a value below zero.
The easy example is shown by image rssmag.png, which is a zoom of part of the graph in tab “RSS”. It is obvious that 1997.0 (i.e. December 1996) is negative. The screenshot rssdata1.png confirms that 1997.0 has a negative value in column M, which contains the RSS slopes.


UAH is a more difficult case. Data including April 2013 shows that the first negative slope value is sometime in 2008. See the zoomed image uahmag.png. The graph at least narrows down the month to somewhere in 2008. Image uahdata1.png shows that the first negative value in column L (UAH) is for 2008.583, i.e. July 2008.
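The lookup described in these two examples, i.e. scanning a slope column for the earliest negative value, is easy to automate. A minimal Python sketch (illustrative only, using my own variable names):

def earliest_negative(dates, slopes):
    """Return the earliest decimal date whose trailing slope is negative,
    or None if no slope is negative."""
    for d, s in zip(dates, slopes):
        if s < 0:
            return d
    return None

# In the April 2013 example above, the earliest negative UAH slope
# was at 2008.583, i.e. July 2008.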


“Coming Soon” Part 2: Instructions for Updating the Global Temperature Records in Google Docs:

This spreadsheet is intended as a proof-of-concept and a starting point for people who may want to extend it further. A followup post will deal with updating your own copy of this spreadsheet on Google Docs, or downloading and maintaining a local copy on your home machine.
Please let us know if you see any issues or errors within the methodology or spreadsheet. Also, please let us know your thoughts and recommendations on how we can best keep this spreadsheet reasonably up to date. Ideally we would like to automate this process or spread the work among a few of WUWT’s readers. We also need to figure out how to make the resultant data readily available on WUWT, which could be accomplished through monthly/quarterly WUWT threads, a WUWT Reference Page or other communication method. Please let us know your thoughts below.
> crosspatch says: May 23, 2013 at 3:42 pm
> No CRN data?
I’m concentrating on global data sets in this spreadsheet. In a followup post, I’ll explain how to download the spreadsheet to your PC, or copy it to your Google account. Once you have your own copy, you can customize it to your heart’s content. Note that the Google online spreadsheet is already rather slow, so I hesitate to add more data sets to this spreadsheet.
> Zeke Hausfather says: May 23, 2013 at 3:53 pm
> You could also just use this… http://www.woodfortrees.org/
It’s already listed on the WUWT sidebar. The spreadsheet in this article allows people to “roll their own”, add additional data sets, go into greater detail, and whatever other customizations they want. This goes beyond what you can do on WFT.
Zeke Hausfather says:
May 23, 2013 at 3:53 pm
You could also just use this…
I really like WFT; however, GISS and HadCRUT3 have not been updated since November, so I had to rely on Walter and other sources for the latest times where the slope was 0.
> Lance Wallace says: May 23, 2013 at 4:20 pm
> Which 30-year period are you using for a base?
I download the data from the respective websites, so it depends on what base period they use. For purposes of calculating slope, it doesn’t matter. We’re looking at the relative change over a period of time.
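A quick way to see why the base period does not matter for the slope: shifting a whole series by a constant changes only the intercept of a least-squares fit, never its slope. A minimal Python check (illustrative, made-up numbers):

from statistics import linear_regression  # requires Python 3.10+

dates = [2008.083, 2008.167, 2008.250, 2008.333]
anoms = [0.20, 0.25, 0.15, 0.30]
shifted = [a + 0.5 for a in anoms]   # same data expressed against a different base period

print(linear_regression(dates, anoms).slope)
print(linear_regression(dates, shifted).slope)   # same slope, up to floating-point rounding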
It seems to me that JavaScript with HTML5 is more flexible, and also quicker to download. You can also do more interactive graphics (example here). Or here is an example with graph mobility and the ability to add regression curves (and add your own data, and output numerics).
On updating, I’ve been using wget and cURL to automatically download data files each month – I run a script every night. They allow you to mirror – ie just download when there are updates. I keep the updated data here. It also automatically updates the graphs.
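(Nick’s setup uses wget/cURL mirroring. For anyone who prefers a single self-contained script, a rough Python sketch of the same “only rewrite the local file when the remote data change” idea might look like the following; the URL is the HadCRUT4 monthly series mentioned later in this thread, so substitute whichever file you want to mirror.)

import urllib.request
from pathlib import Path

# Example target: the HadCRUT4 monthly series referenced elsewhere in this thread.
URL = "http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.2.0.0.monthly_ns_avg.txt"
LOCAL = Path("HadCRUT.4.2.0.0.monthly_ns_avg.txt")

def mirror():
    """Fetch the remote file and overwrite the local copy only if it changed."""
    remote = urllib.request.urlopen(URL).read()
    if not LOCAL.exists() or LOCAL.read_bytes() != remote:
        LOCAL.write_bytes(remote)
        print("updated", LOCAL)
    else:
        print("no change")

if __name__ == "__main__":
    mirror()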
I read only half-way down but it seems as if you’re trying to reproduce something that could be done quite easily using JavaScript or – I suspect – WordPress’ own plotting widget.
You could even have a processing tool for regression fits (linear, pwr law, polynomial etc.), convolution/FFT (+processing), even moving FFT or FWT, etc.
But the short answer is that you need time and people to do it. Even I could write the code in C (a library of a few thousand lines of code would do all the maths) to do all of the above, but you’d need some type of PHP to interface, although I doubt WordPress would like this very much (a security risk that exposes too much native code). The thing is, you need to sit down with someone at WUWT and work through aims, needs => design (maintainability) => output.
I’ll ask around and get a few experts to read the post and see what they suggest.
> Nick Stokes says: May 23, 2013 at 5:12 pm
> On updating, I’ve been using wget and cURL to automatically download
> data files each month – I run a script every night. They allow you to
> mirror – ie just download when there are updates. I keep the updated
> data here. It also automatically updates the graphs.
One advantage of doing it manually is that you find out right away when the version number is bumped, and the source URL changes. E.g. HadCRUT4 is now at http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.2.0.0.monthly_ns_avg.txt
The whole point of this exercise is to allow a competent spreadsheet user to do it all on their home PC. A followup post in the next few days will get into the details of doing monthly updates.
Walter,
“HadCRUT4 is now at…”
Yes, that was a nuisance. But there’s a way. On this page, which doesn’t change its URL, just locate the link in the HTML.
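(Assuming the stable page Nick means is the HadCRUT4 download page linked later in this thread, “locate the link in the HTML” can be scripted. A rough Python sketch, with the filename pattern taken from the current monthly_ns_avg URL:)

import re
import urllib.request

# Assumed stable page; the versioned file names change, this URL does not.
PAGE = "http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/download.html"

def find_monthly_link():
    """Return the first link on the page that looks like the monthly
    global (monthly_ns_avg) time series, whatever version it carries."""
    html = urllib.request.urlopen(PAGE).read().decode("utf-8", "replace")
    match = re.search(r'href="([^"]*monthly_ns_avg\.txt)"', html)
    return match.group(1) if match else None

print(find_monthly_link())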
> cd says: May 23, 2013 at 5:16 pm
> But the short answer is that you need time and people to do it. Even
> I could write the code in C (library of a few 1000 lines of code would
> do all the maths) to do all above but you’re need some type of php
> to interface although I doubt WordPress would like this very much
> (security risk that exposes too much native code). Thing is you need
> to sit down with someone at WUWT and work through aims,
> needs => design (maintainability) => output.
That’s exactly what I’m trying to avoid. A followup post will discuss downloading the spreadsheet and fixing up the graphs (because the translation of graphs during the download has problems). I want something that a competent spreadsheet user can update and customize on their own.
Is it possible to graph the JMA temperatures with the data above?
http://wattsupwiththat.com/2013/01/31/japans-cool-hand-luke-moment-for-surface-temperature/
http://wattsupwiththat.files.wordpress.com/2013/01/surfacetemps_japan.png?w=640&h=507
“This goes beyond what you can do on WFT.”
Indeed, WTF is too limited.
Echoing some comments above, spend time defining the objectives before doing too much work.
On what you have so far.
If you are plotting slopes, you are interested in the rate of change, so why not plot the rate of change directly (i.e. the monthly incremental changes, the “first difference”)?
It’s easier to visualise what you are actually interested in. I’m not sure why the cumulative slope; it has changing sensitivity and frequency response as it progresses, and I’m not sure it would be informative.
Argh, don’t use running averages. Here’s why not and a better option:
http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/
December 2008 = 2008 is crazy confusing and pointless. I would suggest logging data that is a monthly average at the middle of the month: Jan = 1/24; Feb = 3/24, etc.
That way you are not introducing a phase shift into the data and not calling 2007 data 2008.
Hope that helps.
> Snowlover123 says: May 23, 2013 at 5:49 pm
> Is it possible to graph the JMA temperatures with the data above?
Any monthly data. Wait for my next post about downloading the spreadsheet and updating/customizing it.
Lance Wallace says: May 23, 2013 at 4:20 pm
Which 30-year period are you using for a base?
Good question. We should be using 1981–2010 as our base period; however, there is significant variation across the data sets, i.e.:
UAH
GISS
HadCRUT4
http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/download.html
RSS
NOAA NCDC
NOAA CPC
NOAA Climate Prediction Center’s CAMS station temperature anomaly dataset. “CAMS” is an acronym for the “Climate Anomaly Monitoring System” in use at the Climate Prediction Center (CPC).
Please post base periods and source links to any data sets I missed.
> Greg Goodman says: May 23, 2013 at 5:53 pm
> If you are plotting slopes you are interested in rate of change,
> why not plot rate of change directly (ie the monthly incremental
> changes, = “first difference”)
> It’s easier to visualise what you are actually interested in. Not
> sure why the cumulative slope. This has changing changing
> sensitivity and frequency response as it progresses. Not sure
> this would be informative.
Like it or not, one of the major arguments right now is “no global warming for X years”. The CAGW crowd has drawn the line at 17 years as the threshold for proof that global warming has stopped. That’s why people are fixated on how far back we can go with a negative slope.
> Argh, don’t use running averages. Here’s why not and a better option:
> http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/
Again, we’re debating the CAGW crowd. The “annual temperature anomaly” is what the public is fixated on. This is the 12-month running mean from January through December. My plot includes all 12 points in the year. Can you persuade GISS et al. to use the result of a triple running mean filter, rather than a straight 12-month average?
In reply to:
Zeke Hausfather says:
May 23, 2013 at 3:53 pm
You could also just use this…
http://www.woodfortrees.org/
Wood for Trees has, for some unexplained reason, dropped the UAH global temperature anomaly.
It is interesting that the satellite UAH global temperature anomaly is less than HADCRUT4 and GISS in the 1980s, while HADCRUT4 and GISS are almost 0.4 higher than UAH in 2013.
It appears the science is not settled in the measurement and the manipulation of planetary temperature.
https://docs.google.com/spreadsheet/ccc?key=0AnTohu4oFUbcdEgzTkpEYTAwN1BiXzJXMXZ5RVJiOUE#gid=15
Nick Stokes says: May 23, 2013 at 5:12 pm
I keep the updated data here. It also automatically updates the graphs.
Nick, why do you choose to use the base period of 1979–2000 for the plots on your site?
JTF,
I start in 1979 because that is when satellites start. I actually started collecting the data in 2010, so couldn’t use that decade, though I maybe should have gone to 2009.
But it’s just a matter of subtracting off an average – a single number. It doesn’t matter very much which period, as long as it’s the same for each. My main concern was to have them all on the same base.
Your numbering convention (2000.000 = Dec 1999) is different than most similar implementations. Not wrong, but not standard and counterintuitive.
Your slope appears to be a two-point calculation. Much less useful than the standard method of calculating the slope of a temperature series, which would be the slope of the linear regression of all of the data between the endpoints…
> JJ says: May 23, 2013 at 7:08 pm
> Your slope appears to be a two point calculation. Much less useful than
> the standard method of calcing slope of temp series, which would be the
> slope of the linear regression of all of the data between the endpoints…
??? The slope uses the spreadsheet “slope” function, which uses all the data points. E.g. cell M1764 is
=slope(F1764:F$1960,$A1764:$A$1960)
JMI says: May 23, 2013 at 6:37 pm
You guys certainly have a good handle on things crowd-wise!
Though, I’ve also come across some new research that you may find interesting in this regard…
It’s called “The Theory of Crowd Capital” and you can download it here if you’re interested: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2193115
Enjoy!
Interesting. They definitely seem to understand the process and value. In the current context, their theory is very organizationally focused, whereas WUWT is much less structured, more of an organism with a multitude of heads and hands.
So we accurately “know” the temperature to within 0.3000 between datasets, while the trends diverge.