A new tool for climate data visualization is showcased
David Walton writes to tell us about a satirical piece by Mark Steyn – titled Virginia is for Warmers
I was blessedly out of the country on Tuesday, so I’m belatedly catching up on post-election analysis. But I see that self-proclaimed Nobel laureate Michael Mann is hailing Virginia’s gubernatorial race as a referendum on climate science. To scroll his Twitter feed, you’d get the impression “climate change” was the No. 1 electoral issue in the state. Which would tend to suggest that the subject is political speech. Which is protected by the First Amendment, isn’t it?
Dr. Mann is a rare bird: a political activist whose politics require insulation from the Constitution.
About the same time, WUWT reader “James” tipped us to this story about a data analyst who decided to run some software called SAP HANA and a visualization tool called Lumira on climate data in Virginia in response to a reader challenge on their forum. Both of these tools are for the business community where getting it right has big rewards and getting it wrong has financial consequences.
He’s using data from NOAA’s Integrated Surface Database which has data back to 1901. It has good station coverage in the last 40 years, so should be able to detect a global warming signal during the period. Apparently, there is none in this data, which is absolute temperature, not anomaly data. Below is a screencap of Virginia time series of average temperature from the Lumira program:
You can read all about the ISH data in this PDF document here. Note that this is hourly data, mainly from airport sources. It doesn’t have the same sort of adjustments applied to it that daily data Tmax/min and Tmean have in the favorite surface climate data indices used by HadCRUT and GISS. For example, since it is hourly data, there is no need for a TOBs (Time of Observation) correction. Readers may recall that Dr. Roy Spencer has used ISH data for some of his analyses in the past.
Part 1, where he converts and ingests the data, is linked in Part 2 of the story below, and a video of the entire analysis process follows:
=============================================================
Big Data Geek – Is it getting warmer in Virginia – NOAA Hourly Climate Data – Part 2
By John Appleby
So I discussed loading the data from NOAA’s Hourly Climate Data FTP archive into SAP HANA and SAP Lumira in Big Data Geek – Finding and Loading NOAA Hourly Climate Data – Part 1. Since then, a few days have passed and the rest of the data got downloaded.
Here are the facts!
– 500,000 uncompressed sensor files and 500GB
– 335GB of CSV files, once processed
– 2.5bn sensor readings since 1901
– 82GB of HANA data
– 31,000 sensor locations in 288 countries
Wow. Well Tammy Powlas asked me about Global Warming, and so I used SAP Lumira to find out whether temperatures have been increasing in Virginia, where she lives, since 1901. You will see in this video, just how fast SAP HANA is to ask complex questions. Here are a few facts about the data model:
– We aggregate all information on the fly. There are no caches, indexes, aggregates and there is no cheating. The video you see is all live data [edit: yes, all 2.5bn sensor readings are loaded!].
– I haven’t done any data cleansing. You can see this early on because we have to do a bit of cleansing in Lumira. This is real-world, dirty data.
– HANA has a very clever time hierarchy which means we can easily turn timestamps into aggregated dates like Year, Month, Hour.
– SAP Lumira has clever geographic enrichments which means we can load Country and Region hierarchies from SAP HANA really easily and quickly.
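The time hierarchy mentioned in the list above (turning timestamps into Year, Month, Hour) can be illustrated outside HANA. What follows is a minimal Python sketch, not SAP's implementation: the function name `yearly_mean`, the timestamp format, and the sample readings are all assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def yearly_mean(readings):
    """Average (timestamp, temp_c) readings by calendar year.

    The "YYYY-MM-DD HH:MM" timestamp format is an assumption for this
    sketch; the real NOAA files use their own layout.
    """
    sums = defaultdict(lambda: [0.0, 0])  # year -> [running total, count]
    for stamp, temp_c in readings:
        year = datetime.strptime(stamp, "%Y-%m-%d %H:%M").year
        sums[year][0] += temp_c
        sums[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(sums.items())}


# Made-up sample readings, not NOAA data.
sample = [
    ("1901-08-01 12:00", 30.0),
    ("1901-08-01 13:00", 32.0),
    ("2012-08-01 12:00", 29.0),
]
print(yearly_mean(sample))  # {1901: 31.0, 2012: 29.0}
```

Swapping `.year` for `.month` or `.hour` gives the other levels of the hierarchy.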
I was going to do this as a set of screenshots, but David Hull told me that it was much more powerful as a video, because you can see just how blazingly fast SAP HANA is with Lumira. I hope you enjoy it!
Let me know in the comments what you would like to see in Part 3.
Update: between the various tables, I have pretty good latitude and longitude data for the NOAA weather stations. However, NOAA did a really bad job of enriching this data and it has Country (FIPS) and US States only. There are 31k total stations, and I’d love to enrich these with global Country/Region/City information. Does anyone know of an efficient and free way of doing this? Please comment below! Thanks!
Update: in a conversation with Oliver Rogers, we discussed using HANA XS to enrich latitude and longitude data with Country/Region/City from the Google Reverse Geocoding API. This has a limit of 15k requests a day so we would have to throttle XS whilst it updates the most popular geocodings directly. This could be neat and reusable code for any HANA scenario!
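The throttling idea in the update above can be sketched without touching any real geocoding service. This is a rough Python illustration under stated assumptions: the function name `plan_daily_batches` and the stand-in station IDs are hypothetical, "most popular" is taken to mean "most readings", and the only figures from the post are the 15k daily limit and the 31k station count. The actual lookup call is deliberately left out.

```python
def plan_daily_batches(reading_counts, daily_limit=15_000):
    """Order stations by reading count (busiest first) and split them
    into batches that each fit inside one day's request quota."""
    if daily_limit <= 0:
        raise ValueError("daily_limit must be positive")
    ordered = sorted(reading_counts, key=reading_counts.get, reverse=True)
    return [ordered[i:i + daily_limit] for i in range(0, len(ordered), daily_limit)]


# Stand-in data: 31,000 fake stations, each with a made-up reading count.
counts = {f"station-{n}": n for n in range(31_000)}
batches = plan_daily_batches(counts)

print([len(batch) for batch in batches])  # [15000, 15000, 1000]
print(batches[0][0])                      # station-30999
```

With these numbers the whole station list would take three days of quota, with the busiest stations handled on day one.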
=============================================================
Here is the fun part. Data and software are available for free.
The global ISH data converted into a usable format (from part 1 above) is here
The Lumira visualization tool is available free here
Happy data sleuthing, folks.

Nick Stokes says:
November 9, 2013 at 2:41 pm
Explain please why “adjustments” for the urban heat island effect make temperatures warmer rather than colder. Also please state why adjustments to temperature records from before WWII are colder rather than warmer.
Thanks.
===============================================================
I regret that I am not more skilled in the ins and outs of PC programs.
I don’t know how this will appear after I “paste it” but in the previous comment there should be enough information for you to check me.
I welcome specific corrections but I would suggest you look at your own local records, past and present, first.
Again, the 2012 list is from April and the 2007 list is from March.
(Even more “adjustments” were made in July of 2012.)
Newer-April ’12 Older-’07 (did not include ties)
6-Jan 68 1946 Jan-06 69 1946 Same year but “new” record 1*F lower
9-Jan 62 1946 Jan-09 65 1946 Same year but “new” record 3*F lower
31-Jan 66 2002 Jan-31 62 1917 “New” record 4*F higher but not in ’07 list
4-Feb 61 1962 Feb-04 66 1946 “New” tied records 5*F lower
4-Feb 61 1991
23-Mar 81 1907 Mar-23 76 1966 “New” record 5*F higher but not in ’07 list
25-Mar 84 1929 Mar-25 85 1945 “New” record 1*F lower
5-Apr 82 1947 Apr-05 83 1947 “New” tied records 1*F lower
5-Apr 82 1988
6-Apr 83 1929 Apr-06 82 1929 Same year but “new” record 1*F higher
19-Apr 85 1958 Apr-19 86 1941 “New” tied records 1*F lower
19-Apr 85 2002
16-May 91 1900 May-16 96 1900 Same year but “new” record 5*F lower
30-May 93 1953 May-30 95 1915 “New” record 2*F lower
31-Jul 100 1999 Jul-31 96 1954 “New” record 4*F higher but not in ’07 list
11-Aug 96 1926 Aug-11 98 1944 “New” tied records 2*F lower
11-Aug 96 1944
18-Aug 94 1916 Aug-18 96 1940 “New” tied records 2*F lower
18-Aug 94 1922
18-Aug 94 1940
23-Sep 90 1941 Sep-23 91 1945 “New” tied records 1*F lower
23-Sep 90 1945
23-Sep 90 1961
9-Oct 88 1939 Oct-09 89 1939 Same year but “new” record 1*F lower
10-Nov 72 1949 Nov-10 71 1998 “New” record 1*F higher but not in ’07 list
12-Nov 75 1849 Nov-12 74 1879 “New” record 1*F higher but not in ’07 list
12-Dec 65 1949 Dec-12 64 1949 Same year but “new” record 1*F higher
22-Dec 62 1941 Dec-22 63 1941 Same year but “new” record 1*F lower
29-Dec 64 1984 Dec-29 67 1889 “New” record 3*F lower
Newer-’12 Older-’07 (did not include ties)
7-Jan -5 1884 Jan-07 -6 1942 New record 1 warmer and 58 years earlier
8-Jan -9 1968 Jan-08 -12 1942 New record 3 warmer and 37 years later
3-Mar 1 1980 Mar-03 0 1943 New record 3 warmer and 26 years later
13-Mar 5 1960 Mar-13 7 1896 New record 2 cooler and 64 years later
8-May 31 1954 May-08 29 1947 New record 3 warmer and 26 years later
9-May 30 1983 May-09 28 1947 New tied record 2 warmer same year and 19 and 36 years later
30 1966
30 1947
12-May 35 1976 May-12 34 1941 New record 1 warmer and 45 years later
30-Jun 47 1988 Jun-30 46 1943 New record 1 warmer and 35 years later
12-Jul 51 1973 Jul-12 47 1940 New record 4 warmer and 33 years later
13-Jul 50 1940 Jul-13 44 1940 New record 6 warmer and same year
17-Jul 52 1896 Jul-17 53 1989 New record 1 cooler and 93 years earlier
20-Jul 50 1929 Jul-20 49 1947 New record 1 warmer and 18 years earlier
23-Jul 51 1981 Jul-23 47 1947 New record 4 warmer and 34 years later
24-Jul 53 1985 Jul-24 52 1947 New record 1 warmer and 38 years later
26-Jul 52 1911 Jul-26 50 1946 New record 2 warmer and 35 years later
31-Jul 54 1966 Jul-31 47 1967 New record 7 warmer and 1 years later
19-Aug 49 1977 Aug-19 48 1943 New record 1 warmer and 10, 21 and 34 years later
49 1964
49 1953
21-Aug 44 1950 Aug-21 43 1940 New record 1 warmer and 10 years later
26-Aug 48 1958 Aug-26 47 1945 New record 1 warmer and 13 years later
27-Aug 46 1968 Aug-27 45 1945 New record 1 warmer and 23 years later
12-Sep 44 1985 Sep-12 42 1940 New record 2 warmer and 15, 27 and 45 years later
44 1967
44 1955
26-Sep 35 1950 Sep-26 33 1940 New record 2 warmer and 12 earlier and 10 years later
35 1928
27-Sep 36 1991 Sep-27 32 1947 New record 4 warmer and 44 years later
29-Sep 32 1961 Sep-29 31 1942 New record 1 warmer and 19 years later
2-Oct 32 1974 Oct-02 31 1946 New record 1 warmer and 38 years earlier and 19 years later
32 1908
15-Oct 31 1969 Oct-15 24 1939 New tied record same year but 7 warmer and 22 and 30 years later
31 1961
31 1939
16-Oct 31 1970 Oct-16 30 1944 New record 1 warmer and 26 years later
24-Nov 8 1950 Nov-24 7 1950 New tied record same year but 1 warmer
29-Nov 3 1887 Nov-29 2 1887 New tied record same year but 1 warmer
4-Dec 8 1976 Dec-04 3 1966 New record 5 warmer and 10 years later
21-Dec -10 1989 Dec-21 -11 1942 New tied record same year but 1 warmer and 47 years later
http://wattsupwiththat.com/2013/11/09/virginia-is-for-warmers-data-says-no/#comment-1470490
OOPS! I forgot to say that the first list is record highs and the second list is record lows.
Virginia has been “colder than a well digger’s behind in MN” all Fall. Climate change could not be an issue in the election.
Gunga Din says: November 9, 2013 at 3:02 pm
Yes, I did check, though I could only get 2005, not 2007. And it did change as you report. I couldn’t find what dataset the data was coming from.
But but but but…
Warmistas have trouble arguing with honest clean data; so they twist the logic, insisting on discussing the mutilated anomalies they’re so fond of.
Stick with the hourly stuff so simply put above. Don’t like the hourly as presented, pull down the data and show us!
No infilling, no broad grid range averages, no fat finger adjustments, no fudging, no hockey spit sticks.
Hahahahaha!
Watch NOAA try to hide this data now.
David Walton: Excellent visual of data and accurate use of the metadata!
If you average temperatures you are guaranteed to get the wrong answer
========================================================================
Thank you.
An honest look is all I asked.
I don’t know what the dataset was either.
theyouk says: November 9, 2013 at 10:17 am
————–
So you’re the SAP bloke who made my life miserable for a few years ? 😉
I live in Virginia (came in 1981) and now live near Richmond. The past three months I’ve kept records of the temperature and rainfall here at the house (rural area, 12 miles from the airport) and compared them with readings from RIC, Richmond International Airport. I know that my readings are purely amateurish, using a single thermometer bought for maybe $10 at a local box store. I do have a nice big rain gauge. According to my figures, our property was 2 degrees (F) cooler than the airport in August, 2.7 degrees cooler in September, and 3 degrees cooler in October. Rainfall was 2.05″ more at home than at the airport in August, .09″ more in September, and .44″ less in October. We are definitely cooler on average than the airport, at least for these three months, likely due to the UHI effect. As for the political race: I never heard global warming or climate change mentioned by either candidate, and Michael Mann’s name never came up. This past summer was one of the coolest and wettest I can recall in over 30 years in Virginia; reservoirs that had been down a year ago have filled up again (though September had less rainfall than usual, and October’s came all in a 3-day span; almost nothing yet for November, so we could use some rain). Climate change as an issue was a non-starter in the political campaign, and remains so.
MacAwful won DESPITE Mann. Not because of him.
Steven Mosher says:
November 9, 2013 at 4:11 pm
If you average temperatures you are guaranteed to get the wrong answer
~~~
And if you average Climate Models?
Ahhh… SQL, better. David, nice post! Since it seems you have the data down and the power at hand, have you considered roughly hitting maybe fifteen other states spread across the US? I would be very curious what such a spread would show. Most of the central states seem to be flat too, except for the large cities, since 1895. Another possible separate run might be to drop two or three UHI-contaminated large city stations per state to get a more rural reading.
Lumira now downloaded but 3 gigs of memory may not be enough. What kind of equipment did you find is necessary to run this?
Nick Stokes…..sort of like all of those temp stations that used to be in open fields….keeping them right next to runways!! Or, keeping them in fields of…..asphalt!!! WOW WOW WOW.
Wow, there were a whole lotta 737’s and 767’s taking off in the early 1900’s, weren’t there?
Steven Mosher:
Your post at November 9, 2013 at 4:11 pm says in total
Your statement is meaningless in the form you have presented and requires much explanation. At very least you need to provide answers to the following questions if your assertion is to be more than meaningless nonsense.
1.
Please define what you mean by “answer” so it can be understood what you mean by “the wrong answer” and what you think would constitute the right answer?
2.
Whatever you mean by the “answer”, there is no possibility of a calibration reference for the “answer” so how could you know that it is ‘right’ or ‘wrong’ and what would be its accuracy, precision and reliability?
3.
Are you saying temperature measurements cannot be averaged because temperature is an intrinsic property and, therefore, it is not possible to assess temperature changes for regions, hemispheres and the globe?
4.
Are you saying the various compilations of global temperature anomaly (i.e. HadCRUTx, GISS, RSS, and etc.) are each compilations of “wrong answers” because temperature measurements and temperature anomalies indicate averages of an intrinsic property which cannot be averaged (a temperature anomaly is a measured average temperature which is offset by removing an ‘average’ value, and “average temperatures” are temperature anomalies offset by zero)?
Thanking you in anticipation of your clarifications
Richard
The screencap you show under the text “Below is a screencap of Virginia time series of average temperature from the Lumira program” has the August filter in place, according to the YouTube presentation.
The August filter is removed at 7’57”, so you might want to update the screencap.
Apart from this, thanks for an interesting and informative post!
Here’s why you can’t just average absolute temperatures, as Mosh says. There are 172 VA ISH stations listed, but some report for quite short periods. I’ve plotted here the average altitude of stations reporting in each year, along with the number (red, same numeric scale). As you can see, altitude is quite variable, with some increasing tendency. There is a range of 140 m, which is almost 1°C at 6 °C/km lapse rate. That variation goes into the temperature average, unless you use anomalies.
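The arithmetic in the comment above, and the station-mix effect it describes, can be checked with a toy example. This is only a sketch of the argument as stated: the two stations and their temperatures are invented, and the anomaly baseline here is simply each station's own mean over the years it reports, which is a simplification of what the published indices do.

```python
LAPSE_RATE_C_PER_KM = 6.0  # the rate quoted in the comment


def altitude_offset_c(delta_altitude_m, lapse_rate=LAPSE_RATE_C_PER_KM):
    """Temperature difference implied by a change in mean station altitude."""
    return lapse_rate * delta_altitude_m / 1000.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def absolute_average(records, year):
    """Plain average of whatever stations report in the given year."""
    return mean(temps[year] for temps in records.values() if year in temps)


def anomaly_average(records, year):
    """Average of each reporting station's departure from its own mean."""
    return mean(
        temps[year] - mean(temps.values())
        for temps in records.values()
        if year in temps
    )


# Invented stations with no trend at all; the higher one drops out in 2002.
records = {
    "valley": {2000: 15.0, 2001: 15.0, 2002: 15.0},
    "ridge": {2000: 14.0, 2001: 14.0},
}

print(altitude_offset_c(140))                                            # 0.84
print(absolute_average(records, 2001), absolute_average(records, 2002))  # 14.5 15.0
print(anomaly_average(records, 2001), anomaly_average(records, 2002))    # 0.0 0.0
```

In this toy case neither station warms, yet the absolute average rises by half a degree when the higher station stops reporting, while the anomaly average stays at zero.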
Steven Mosher says:
November 9, 2013 at 4:11 pm
If you average temperatures you are guaranteed to get the wrong answer
—————————————————–
Obviously some type of perfectly maintained small grid series appropriately adjusted for station moves and equipment changes is better.
But we have seen this methodology is subject to abuse. NCDC is dropping out and adding in stations whenever they need the trend to go the other way. Instead of adjusting out the UHI, they adjust it into the rural stations not affected by it. The time of observation bias adjustment appears to be simple bias. We have new sensors measuring a temperature decline and they adjust the old records down instead of assuming the new equipment is actually measuring declining trends. BEST takes 6,000 temperature stations and cuts them into 48,000 stations without understanding that this takes any declining temperature trends and turns them into rising temperature trends.
Just use ALL the stations over all the timeframe. Grid them and assume station moves, equipment changes, observation times balance out over time. That is more likely to provide the best estimate of the real temperature over time than the subjective biased methods used by the NCDC, GISS, HadCrut and BEST.
6,000 station averages are more than good enough on their own.
And leave the historic temperature records alone.
Nick Stokes:
Your post at November 10, 2013 at 5:20 am says in total
Sorry, but that answer is plain wrong and it does NOT clarify the post from Steven Mosher at November 9, 2013 at 4:11 pm which says in total
The clarification requires at least answers to the questions I posed in my response at November 10, 2013 at 2:46 am
http://wattsupwiththat.com/2013/11/09/virginia-is-for-warmers-data-says-no/#comment-1470885
Importantly, your post is plain wrong.
You are saying variation from different measurement sites is corrected by using anomalies. It is not.
As I said in my post,
So, using anomalies changes nothing because a temperature IS an “anomaly” but with zero offset.
Correction for dissimilar measurement sites is another subject altogether.
If you had said that adjustment for variable measurement sites is required then you may – n.b. only may – have had a point. But that point is the same whether one uses temperature anomalies or temperatures.
Richard
Berkeley Earth has the global land average temperature as the highest on record in September 2013 at +1.95C (yes that’s right).
http://berkeleyearth.lbl.gov/auto/Regional/TAVG/Text/global-land-TAVG-Trend.txt
But this is what it looked like from high resolution satellite. Pretty average and impossible to see how this would be hottest ever.
http://s7.postimg.org/sq8tae5x7/September_2013_Land.jpg
I don’t have the intellectual horsepower, the time, the opportunity to do it. But, I understand from direct conversations with Fernec Miskolski that the size of the data base he used with his FORTRAN programming to do his analysis, based on the layer by layer radiative response of the atmosphere, is equivalent to this problem.
IF THIS DATA can be made available (i.e. the spectroscopic data), if we can get some one to C++ transform Fernec’s programs, then it could be made public.
His CONCEPT is to develop an “atmospheric transfer function”, such as a Laplace or Fourier.
From this concept, and deriving a net input/output number…(which by rational has to be near 2.0, he calculates 1.87…) he can account for CO2 PROPERLY (as his emission geometries consider the full 4Pi steradians, and the curvature of the earth AND therefore the IR leakage below the tangent plane.)
Accordingly, he claims he can take the Radiosonde measurements of 80 years and see if there is a NET CHANGE in the radiative balance. (He finds none.)
Now I will have to admit, that maybe I’m under playing my “intellectual horse power”, because in the 10+ hours of personal, one on one visits I’ve had with Fernec, we have discussed his initial training as a nuclear engineer/physicist. In that field, because of the neutron target spectrum change (particularly for Uranium and high, moderating to thermal neutrons) the calculations for a nuclear reaction, whether a weapon or a power reactor, become exceedingly complex. Enough to dwarf the complexity of atm. radiative calcs. My comment to Fernec: “When you compare atmospheric radiation calcs to using the “Point Kernel” methods of neutron moderation/matter interaction/target size work, you are dealing with something which becomes much more trivial, correct?”
Thus, I have great faith that Fernec IS a Laplace, a Newton, a Fourier, and or an Elsasser. And it’s the relative “lack of intellectual horse power” on his CRITICS, not on HIS END which are the root problem. (And the fact that perhaps a couple years…with people on his level and a few hundred thousand…but now perhaps BLESSEDLY just a few PCs, and his method could be PEER PERFORMED and studied for validity.)
“Thus, I have great faith that Fernec IS a Laplace, a Newton, a Fourier, and or an Elsasser.”
He’s not even a Fernec!
“… in the 10+ hours of personal, one on one visits I’ve had with Fernec …”
Max,
I envy you being able to sit down with Ferenc. I have spent uncounted hours wading through Ferenc’s papers and I also do see great insight there, huge in fact. He is exactly correct to home in on the window portion for all else at each individual level is isotropic radiation from the various ghgs in each atmosphere. It is what can reach space at each altitude (mass level) that really matters. Just like a comment I made on another thread moments ago (here) before I found your comment here much of what I said there has some of its roots directly from Miskolczi’s papers, well, the thoughts from his papers. He took his parallels to Mars atmosphere, I am attempting to carry that very roughly to Venus’s atmosphere and I find it quite amazing what you find even looking at the radiation field even when very roughly differentiated, the Earth and Venus are basically the same even though their compositions are miles apart.
I downloaded Ferenc’s HARTCODE two years ago with the intention to do just what you have said, convert the Fortran code to c/cpp, but I got sidetracked along the way and never got further than just a beginning. My forty years as a programmer has left me with the ability to translate one language to another and I see no reason why that should not be possible. I wonder if besides the hartcode he could share what code was gathering the radiosonde data (I know the TIGR2 data itself is proprietary without a fax legal application for use) but I sure would love to read the code. I get so much from just the act of reading the applications of the equations.
So if they had to homogenize the old data because of inconsistencies in collection etc. why do they still need to adjust the data in the last few decades?
Nick Stokes says:
November 10, 2013 at 5:20 am
I built adjustable filters around which stations to include based on how many station sample days/year and how many years they provide data.
Even when you do that Mosh doesn’t like it. Apparently you have to krige it too.
I think we need a view of what the stations measured, not something that’s an abstraction of the measurements. When you do look at the actual measurements there’s a large difference in when the warming in North America and Eurasia happened.
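The kind of adjustable station filter described in the comment above might look something like this in Python. It is a guess at the general shape only, not the commenter's code: the function name `select_stations`, the coverage layout, and the sample numbers are all hypothetical.

```python
def select_stations(coverage, min_days_per_year, min_years):
    """Keep stations with enough well-sampled years.

    coverage maps station -> {year: number of days with samples}.
    A year counts only if it has at least min_days_per_year sample days;
    a station is kept if at least min_years of its years count.
    """
    selected = []
    for station, days_by_year in coverage.items():
        good_years = [
            year for year, days in days_by_year.items()
            if days >= min_days_per_year
        ]
        if len(good_years) >= min_years:
            selected.append(station)
    return sorted(selected)


# Made-up coverage figures for three imaginary stations.
coverage = {
    "long_record": {1950: 365, 1951: 360, 1952: 364},
    "patchy": {1950: 120, 1951: 365, 1952: 90},
    "short_record": {1952: 365},
}

print(select_stations(coverage, min_days_per_year=350, min_years=2))
# ['long_record']
print(select_stations(coverage, min_days_per_year=350, min_years=1))
# ['long_record', 'patchy', 'short_record']
```

Loosening or tightening the two thresholds shows how sensitive any resulting average is to which stations get in.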
Mosh,
Has anyone at BEST calculated a daily temp for an area with one station left out, then compared the calculated temp against the actual measurement?
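The check being asked about is usually called leave-one-out validation. Here is a bare-bones Python sketch of the idea, with loud caveats: the estimate is just the plain mean of the remaining stations, which is an assumption for illustration and not how BEST or any other group builds its field, and the station names and temperatures are invented.

```python
def leave_one_out_errors(day_temps):
    """For each station, estimate its reading from the other stations
    and return estimate minus actual.

    day_temps maps station -> measured temperature for a single day.
    The estimate is the unweighted mean of all other stations.
    """
    errors = {}
    for held_out, actual in day_temps.items():
        others = [temp for station, temp in day_temps.items() if station != held_out]
        estimate = sum(others) / len(others)
        errors[held_out] = estimate - actual
    return errors


# Invented readings for one day at three imaginary stations.
day_temps = {"A": 20.0, "B": 22.0, "C": 24.0}
print(leave_one_out_errors(day_temps))  # {'A': 3.0, 'B': 0.0, 'C': -3.0}
```

A real test would replace the plain mean with whatever interpolation the method under test uses and look at the spread of these errors over many days.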