Metadata fail: 230 GHCN land stations actually in the water

Why is this important? Well, if you are calculating UHI for stations by looking at satellite images of nightlights, as GISS does (see my post on it at CA), you’ll find that there are generally no city lights in the water, leading you to think you’ve got no urbanization around the station. Using only 10 lines of code, Steve Mosher finds 230 errors in NCDC’s Global Historical Climatology Network (GHCN) data that place a station over water when it should be on land. Does this affect the calculation of Earth’s surface temperature? Steve Mosher investigates. – Anthony

Wetbulb Temperature

by Steven Mosher

This Google Maps display shows just one of 230 GHCN stations that are located in the water. After finding instances of this phenomenon over and over, it seemed an easy thing to find and analyze all such cases in GHCN. The issue matters for two reasons:

In my temperature analysis program I use a land/water mask to isolate land temperatures from sea temperatures and to weight the temperatures by the land area, which is of course zero in the ocean.
Hansen2010 uses nightlights based on station location, and in most cases the lights at a coastal location are brighter than those offshore, although I have seen “blooming” even in radiance-calibrated lights, such that “water pixels” do on occasion have lights on them.
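
As a rough illustration of the first point, here is a minimal sketch of land-area weighting in base R. The three grid cells, their land fractions and their anomalies are invented values, and the cos(latitude) area factor assumes a regular lat/lon grid; it is not taken from the actual analysis program.

```r
# Hypothetical grid cells: centre latitude, fraction of the cell that is land,
# and a temperature anomaly for the cell (all values invented for illustration).
cells <- data.frame(
  lat      = c(0, 45, 60),
  landFrac = c(1.0, 0.5, 0.0),   # 0 means the cell is entirely ocean
  anomaly  = c(0.2, 0.6, 5.0)
)

# On a regular lat/lon grid, cell area shrinks with cos(latitude).
cellArea <- cos(cells$lat * pi / 180)

# Weight = land area in the cell; an all-ocean cell gets weight zero.
w <- cellArea * cells$landFrac

# Land-area-weighted mean anomaly.
landMean <- sum(w * cells$anomaly) / sum(w)
print(round(landMean, 4))
```

The third cell has a large anomaly but a weight of zero, so it drops out of the average entirely. A land station whose coordinates fall in such a cell is effectively lost in the same way.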

The process of finding “wet stations” is trivial in the “raster” package of R. All that is needed is a high-resolution land/sea mask. In my previous work, I used a ¼ degree base map. ¼ degree is roughly 28km at the equator. I was able to find a 1km land mask used by satellites. That data is read in one line of code, and then it is a simple matter to determine which stations are “wet”. Since NCDC is updating the GHCN V3 inventory, I have alerted them to the problem and will, of course, provide the code. I have yet to write to NASA GISS. Since H2010 is already in the publishing process, I’m unsure of the correct path forward.
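
The cell sizes mentioned above are easy to check with a few lines of base R. This is only a back-of-the-envelope sketch: the helper name `cellWidthKm` is made up, and the Earth is treated as a sphere with an equatorial circumference of about 40075 km.

```r
# Approximate length of one degree of longitude at the equator, in km
# (equatorial circumference of about 40075 km divided by 360 degrees).
kmPerDegree <- 40075 / 360

# East-west width of a grid cell of resDeg degrees at a given latitude.
# Hypothetical helper for illustration only.
cellWidthKm <- function(resDeg, lat = 0) {
  resDeg * kmPerDegree * cos(lat * pi / 180)
}

quarterDeg <- cellWidthKm(0.25)   # width of a 1/4 degree cell at the equator
oneKmDeg   <- 1 / kmPerDegree     # degrees spanned by about 1 km at the equator

print(round(quarterDeg, 1))       # about 27.8 km
print(signif(oneKmDeg, 3))        # about 0.00898 degrees
```

A ¼ degree cell is therefore nearly 28 times wider than a 1km cell at the equator, which is why a coastal station that looks dry on the coarse map can turn out to be wet on the fine one.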

Looking through the 230 cases is not that difficult. It’s just time consuming. We can identify several types of case: atolls, islands, and coastal locations. It’s also possible to put the correct locations in for some stations by referencing either WMO publications or other inventories which have better accuracy than either GHCN or GISS. We can also note that in some cases the “mislocation” may not matter to nightlights. These are cases where you see no lights whatsoever within the 1/2 degree grid that I show. In the Google Maps images presented below, I’ll show a sampling of all 230. The blue cross shows the GHCN station location and the contour lines show the contour of the nightlights raster. Pitch black locations have no contour.

I will also update this with a newer version of Nightlights. A Google tour is available for folks who want it. The code is trivial and I can cover that if folks find it interesting. With the exception of the graphing, it is as simple as this:

library(raster) # provides raster(), extent() and extract()

Ghcn <- readV2Inv() # read in the inventory (my own helper function)

lonLat <- data.frame(Ghcn$Lon, Ghcn$Lat) # x = longitude, y = latitude

Nlight <- raster(hiResNightlights) # hiResNightlights is the path to the nightlights file

extent(Nlight) <- c(-180, 180, -90, 90) # fix the metadata error in nightlights

Ghcn <- cbind(Ghcn, Lights = extract(Nlight, lonLat)) # extract the lights using "points"

distCoast <- raster(coastDistanceFile, varname = "dst") # get the special land mask

Ghcn <- cbind(Ghcn, CoastDistance = extract(distCoast, lonLat))

# for this mask, water pixels are coded by their distance from land. All land pixels are 0

# make an inventory of just those land stations that appear in the water.

wetBulb <- Ghcn[which(Ghcn$CoastDistance > 0), ]

writeKml(wetBulb, outfile = "wetBulb", tourname = "Wetstations") # my own helper; writes the Google tour
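
For readers without the data files, the core test can be tried on a toy example in base R. This is only a sketch under stated assumptions: the 3x3 mask, the grid origin (`lonMin`, `latMax`), the 1 degree resolution and the three stations are all invented, and the hand-rolled index arithmetic stands in for the georeferenced lookup that `extract()` performs in the real code.

```r
# Toy coast-distance mask: rows are latitude bands (north to south), columns
# are longitude bands (west to east). 0 = land, >0 = km from the nearest land.
mask <- matrix(c(0, 0, 3,
                 0, 2, 5,
                 1, 4, 8), nrow = 3, byrow = TRUE)

# Hypothetical georeferencing: north-west corner and cell size in degrees.
lonMin <- 10; latMax <- 50; res <- 1

# Hypothetical inventory with three stations.
inv <- data.frame(Id  = c("A", "B", "C"),
                  Lon = c(10.5, 11.5, 12.2),
                  Lat = c(49.5, 48.5, 49.9))

# Convert each coordinate to a row/column index in the mask.
colIdx <- floor((inv$Lon - lonMin) / res) + 1
rowIdx <- floor((latMax - inv$Lat) / res) + 1

# Indexing a matrix with a two-column matrix picks one cell per station.
inv$CoastDistance <- mask[cbind(rowIdx, colIdx)]

# Same filter as in the real code: keep stations that land on water pixels.
wet <- inv[which(inv$CoastDistance > 0), ]
print(wet)
```

Station A falls on a land pixel and is dropped, while B and C fall on pixels with a non-zero distance to land and are kept as “wet”.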

Some shots from the gallery. The 1km land/water mask is very accurate. You might notice one or two stations actually on land. Nightlights is less accurate, something H2010 does not recognize. Its pixels can be over 1km off true position. The small sample below should show the various cases. No attempt is made to ascertain if this causes an issue for identification of rural/urban categories. As it stands, the inaccuracies in Nightlights and station locations suggest more work is needed before that effort is taken up.

96 Comments

Roger Knights

November 8, 2010 12:19 am

Let me say it first!
“All Wet!”

kim

November 8, 2010 12:54 am

Can we say they are lazy? Maybe they were processing their data in a dark alley.
==============

Pingo

November 8, 2010 1:02 am

They are all at sea!

a jones

November 8, 2010 1:03 am

Mr. Mosher, you progress sir, you progress. In time we shall learn much I think from your worthy efforts. Still, clearly there is much more to do, but it is a start.
The only thing that puzzles me is how the professionals, if you can call them that, did not appear to be aware of this. It may of course be that they are, but believe that their statistical analysis copes with these problems. Perhaps, but if so why have they never even referred to this problem? If they knew it existed one might expect they would explain their methods of dealing with it.
Is that too much to ask? They have been doing this kind of thing for decades yet it takes one amateur, I hope you do not take exception to that description Mr. Mosher, Natural Philosophy has a long and honourable tradition of relying on the well informed amateur, where indeed would our astronomical colleagues be without their army of amateur star gazers? but just one such amateur working by himself to discover these flaws in a relatively short time.
Does that not strike you as strange? Perhaps not in the Topsy Turvy world of climatology, where it seems it is the amateurs who are doing the serious work that the professionals have omitted to do.
As for the answers you will get from the experts Mr. Mosher I keenly await them, but somehow suspect we may have to wait for a very long time indeed.
Kindest Regards

David Jones

November 8, 2010 1:16 am

Nightlights over water can be fishing boats or aerosols dispersing city light vertically. The instrument is sensitive enough to detect both. Or indeed, they can be calibration and/or registration errors, or blooming/bleeding.

juanslayton

November 8, 2010 1:54 am

Perhaps I may be forgiven for repeating a comment I left earlier over at Moshtemp:
Mr Mosher,
I had a great vacation last year prowling the west and documenting USHCN stations. Occurred to me belatedly that I was driving right past GHCN stations without getting them. So I started to include them just in case Anthony ever makes good on his threat to extend the Surface Stations gallery. I only have a couple so far, but I’ll be glad to share the on-site numbers now and in the future if it would be useful.
Oregon’s Sexton Summit shows typical creativity in coordinates. GISS has the station about a mile and a half to the north, MMS is much closer, maybe 300 feet to the east, and my Garmin (at 42.600131, -123.365567) is about right on the satellite photo.
GISS site descriptions can be just as creative as their coordinates. I got a chuckle this afternoon to see that they put El Centro, California, in an area of ‘highland shrub.’ Locals know that El Centro is in the Salton Sink, 39 feet BELOW SEA LEVEL. Ah, well….

Mike Haseler

November 8, 2010 1:57 am

A trillion pounds is being spent based on data which so lacks any quality control that it makes infant school artwork look “precise”.
And then they have the gall to suggest that being a sceptic is anti-scientific!

chu

November 8, 2010 2:20 am

It’s worse than we thought, undetected massive sea level rises.

orkneygal

November 8, 2010 2:24 am

230 Stations.
Is that a lot?

November 8, 2010 2:38 am

My respect for Hansen grows and grows.
Oops I used a trick there to hide the decline.

John Kehr

November 8, 2010 2:39 am

These are good results. Improvements in the surface measurements are needed. While metadata mining is not my specialty, I am glad someone works that side of it. I think there is a benefit from the surface data, but it is limited in scope and has pesky problems like the UHI and the even simpler fact that there are thousands of separate thermometers in different locations. Calibration and location are just two issues.
A better standard for what the temperature is will greatly benefit everyone. My workaround for using a single set of temperature data is to blend 4 sources of data together. This blended data has some nice advantages. It gives coverage of the entire instrumental period, but also includes the superior satellite data. I have even had some AGW folk agree that this is a good method.
Cleaning up the modern data is good, but the real battle is with CO2. This is where I really focus my effort. My current bit of trouble is to discredit climate sensitivity calculations based on CO2 changes.
We will win as the actual science is on our side, but a solid understanding of the science will be needed.
John Kehr
The Inconvenient Skeptic

John Marshall

November 8, 2010 2:53 am

This may all be meaningless because, as some physicists say, temperature can only be taken if the system is at equilibrium. The atmosphere is never at equilibrium so any temperature taken is meaningless.
What we are arguing about is a few tenths of a degree divergence from what someone has calculated as an average, so giving the anomaly that the graphs show, but this so called average may not be correct because it is calculated over too short a period of time.
Climates change, get used to it.

Roger Carr

November 8, 2010 2:58 am

chu says: (November 8, 2010 at 2:20 am) It’s worse than we thought, undetected massive sea level rises.
Beautiful, Chu! A classic.

Patrick Davis

November 8, 2010 3:00 am

“orkneygal says:
November 8, 2010 at 2:24 am”
It’s a lot of bad data, that’s for sure. But they thought they could get away with sloppy/shonky work, and to a large extent, they have. I know of no other industry in which such sloppiness is tolerated. If my engineering work hadn’t turned out within the +/- 2 micron specifications, I’d have been sacked.

husten

November 8, 2010 3:02 am

I suppose it is not clear in all cases what the coordinates in the metadata are referenced to. NASA GISS et al. might not have cared much about that. Google Earth uses WGS84, which is also the default setting on many hobby GPS devices. I have no idea if there is a worldwide standard for stations, possibly for airports???
This issue can account for a few 1000’s of feet in extreme cases.

Latimer Alder

November 8, 2010 3:33 am

The more I learn about the way that temperature data is collected, the less I am convinced that AGW theory is built on a sound experimental footing at all.
Somebody please correct my broad brush understanding if it is wrong.
1. The whole subject relies 100% on the observation of daily temperatures around the world and the computation of an average temperature based on just the maxima and minima of the daily readings.
Q: This seems like a very kindergarten way of computing a meaningful number. I can understand the arithmetic, (a+b)/2, but is it the best statistical technique available? And does the mean so computed actually have any meaning to plant growth, sea levels, migration of birds etc etc. Has there been any experimental work to show that it does?
2. There is no universally accepted and adhered to way of observing these temperatures. Stations are built to different standards, placed in non-comparable sites, move, disappear, are modified, use different equipment, use different recalibration routines (if any) and are not subject to any regular systematic check of their validity or accuracy. The data is a complete hodge podge and the histories are largely unknown or unrecorded.
3. The purpose of the measurements is to record the daily maxima and minima as noted above. It is easy to imagine circumstances where the recorded temperature will be artificially high…our world is full of portable heat sources…but very difficult to imagine those that make it artificially low. Our world is not full of portable cooling fans. To my mind there is an inherent bias for the recordings to be higher overall than the truth.
4. Many have written eloquently as well about the heat island effect where the progress of urbanisation over the last hundred years or so has meant that previously undisturbed instruments have become progressively more subject to extraneous heat sources ..and so show higher readings. As urbanisation proceeds, the number of undisturbed stations decreases and the heat affected ones increase…once more biasing the record towards higher recordings over time.
5. Once all the data has been collected in the inconsistent and error prone manner shown above, it is sent to, for example, our friends at CRU in Norwich. They then apply ‘adjustments’ to the data based on their ‘skill and knowledge’. The methodology for the adjustments is not published… and may not be kept at all because of the lack of any professional data archiving mechanisms at this place.
The director of CRU does not believe that the UHI effect is of any significance, because a colleague in China produced some data sometime that showed that in China .. in the middle of great political upheavals.. it was only small. And cannot now produce that data once again…maybe he lost it in an office move.
He is (was) also a highly active and influential member of the community whose careers have been made by ‘finding’ global warming and actively proselytising its supposedly damaging effects. Any guesses as to which directions his ‘adjustments’ are likely to be in?
CRU is institutionally averse to any outside scrutiny. Resisting such examination by any and all means is in its DNA. Their self-declared philosophy is to destroy data rather than reveal it.
(Personal note – As an IT manager, losing any data is one of the greatest sins that I could commit…actively destroying it is a mortal sin, and under FOI may well be legally criminal as well)
6. The data so processed is then released to the world as the definitive record of the temperature series. And used as the basis for all sorts of predictions, models, scare stories, cap’n’trade bills and all that malarkey. Hmmmmmm!
Forgive me if I think that this overall process has not been ‘designed’ to get at the absolute truth. The data is collected in an inconsistent manner, and there is no reliable record that describes the exact circumstances of that collection. It is subject to many outside influences that tend to increase the readings over time, not to decrease them.
Once collected it is subject to adjustments according to no agreed or public methodology. There is a clear potential conflict of interest in the institutions making the adjustments.
I set as a task to first year science undergraduates…take this model of raw data collection and processing to the homogenised state, and suggest how it could be better designed to get at the truth of the temperatures. Do not feel shy about writing at length. There is plenty of scope.

H.R.

November 8, 2010 3:46 am

Hmm… so those 230 stations should be reclassified as wet bulb readings, eh?

Robin Guenier

November 8, 2010 4:11 am

O/T (slightly). In an article in American Thinker (here), Fred Singer says:

The global climate … warmed between 1910 and 1940 but due to natural causes, and at a time when the level of atmospheric greenhouse gases was relatively low. There is little dispute about the reality of this rise in temperature and about the subsequent cooling from 1940 to 1975, which was also seen in proxy records (such as ice cores, tree rings, etc.) independent of thermometers. The … IPCC, then reports a sudden climate jump around 1977-1978, followed by a steady increase in temperature until at least 1997. It is this steady increase that is in doubt; it cannot be seen in the proxy records.

Is that claim (re proxy records) correct? Has the evidence for it been published?
He goes on to say:

Even more important, weather satellite data, which furnish the best global temperature data for the atmosphere, show essentially no warming between 1979 and 1997.

Again, is that correct?

R. de Haan

November 8, 2010 4:42 am

Is there no way to hold these people accountable for their ill performed work?

jimmi

November 8, 2010 4:59 am

“Even more important, weather satellite data, which furnish the best global temperature data for the atmosphere, show essentially no warming between 1979 and 1997.
Again, is that correct?”
Well the RSS data was up in a post a few days ago – so, no it is not correct.

John S

November 8, 2010 5:27 am

Every one of these stations gets serviced at least occasionally. How hard would it be to visit every single site over the course of, say, a year, get a GPS based location, and update the database? It could be done with little or no additional cost above the basic maintenance.

John Peter

November 8, 2010 5:34 am

Robin Guenier (November 8, 2010 at 4:11 am) should look at http://www.drroyspencer.com/latest-global-temperatures/
There he can see that global satellite temperatures are essentially “flat” until 1997 and then rise as a result of the 1998 El Nino. We then have another elevated “flat” period from 1998 to present with another peak in summer 2010 but declining temperatures again due to La Nina. I have flat in “” as there are natural variations from year to year. As can be seen, the recent peak did not exceed 1998. Since 1998 CO2 has increased by more than 20ppm without essentially sending global temperatures upwards. Sea ice N/S is within natural variation and global sea levels are heading down again.

Mike Odin

November 8, 2010 5:36 am

Amazing (65 N)Canada arctic
automatic weather station
still hourly records
temperatures warmer than
(45 N)NYC JFK station–
http://www.ogimet.com/cgi-bin/gsynres?ind=71981&ano=2010&mes=11&day=8&hora=6&min=0&ndays=30
http://www.ogimet.com/cgi-bin/gsynres?ind=74486&ano=2010&mes=11&day=8&hora=6&min=0&ndays=30
This amazing station also
is consistently 10-30 degrees C
hotter than all its surrounding
Canada arctic stations–
Too hot to be water effects
–it must sit atop
an erupting volcano.

redneck

November 8, 2010 5:38 am

Latimer Alder: November 8, 2010 at 3:33 am
Latimer you make many good points.
Your statement:
“Q: This seems like a very kindergarten way of computing a meaningful number. I can understand the arithmetic, (a+b)/2, but is it the best statistical technique available?”
Well it got me thinking about a quote I read years ago:
“Given enough data with statistics you can prove anything.”
Sorry I don’t know the source.
Now I’m no statistician but from my reading of events it was choosing the best statistic, the one that will give me the answer I want, some unusual variant of PCA that may yet get Mann in some hot water.

Wade

November 8, 2010 5:42 am

What are these people doing while under the employ of my tax money? If it is your job to make sure temperature data is accurate, why aren’t you catching these outrageous errors? The only thing I can figure is that the data gives the output desired and they are using their time figuring out how to keep their easy income, such as going on personal crusades decrying how bad global warming is.
Why do we keep calling people who are too lazy to do required quality control “scientists”? They are more like fat cats.
