Why Reanalysis "Data" Isn't

Guest Post by Willis Eschenbach

There is a new paper out by Xu and Powell, “Uncertainty of the stratospheric/tropospheric temperature trends in 1979–2008: multiple satellite MSU, radiosonde, and reanalysis datasets” (PDF, hereinafter XP2011). It shows the large differences between the satellite, balloon (radiosonde), and reanalysis temperatures for the troposphere and the stratosphere. The paper is well worth a read, and is not paywalled. Figure 1, from their paper, shows their tropospheric temperature trends by latitudinal band from each of the sources.

Figure 1. From XP2011 Fig. 3a. Original caption says: Inter-comparison of tropospheric temperature (TCH2) trends (K decade−1) for the MSU (RSS, UAH, STAR), Radiosonde (RATPAC, HADAT2, UK, RAOBCORE, RICH) and Reanalysis (JRA25, MERRA, NCEP- CFSR, NCEP-NCAR, NCEP-DOE) products for the period of 1979–2008. (a) Trend changes with latitude for each individual dataset;

In Figure 1, the three groups are divided by color. The satellite observations are in blue. The balloon-borne observations are in green. And the climate reanalysis model results are in orange. Now, bear in mind that these various results are all purporting to be measuring the same thing—which way and how much the temperature of the lower troposphere is trending. The paper closes with the following statement (emphasis mine):

In general, greater consistency is needed between the various data sets before a climate trend can be established in any region that would provide the reliability expected of a trusted authoritative source.

I can only heartily agree with that. However, there are a few conclusions that we can draw in the interim.

First, despite the fact that these are all plotted together as though they were equals, in fact only two of the groups represent observational data. The results shown in orange are all computer model outputs. Unfortunately, these model outputs are usually referred to as “reanalysis data”. They are not data. They are the output of a special kind of computer climate model. This kind of climate model attempts to match its output to the known datapoints at a given instant (temperatures, pressures, winds, etc.). It is fed a stream of historical data, including satellite MSU and other data as well as station reports from around the world. It then gives its best estimate of what is happening where we have no data, in between the stations and the observation times.

Given that the five different reanalysis products were all fed a very similar diet of temperatures, pressures, and the like, I had expected them to be much, much closer together. Instead, they are all over the map. So my first conclusion is that the outputs of reanalysis models are not only not data; as a group, they are not even accurate. They don’t even agree with each other. To see what the rest of the data shows, I have removed the reanalysis model outputs in Figure 2.

Figure 2. Same as in Figure 1, but with the computer reanalysis model results removed, leaving satellite (blue) and balloon-borne (green) observations.

The agreement among the balloon datasets is not as good as that among the satellite datasets, as might be expected given the difference in coverage: the satellite data are essentially global, while the balloon data come from scattered locations.

Once the computer model results are removed, we find much better agreement between the actual observations. Figure 3 shows the correlation between the various datasets:

Figure 3. Correlations between the various observations (Satellite and Balloon) and computer model (Reanalysis) data. Red indicates the lowest correlation, blue the highest. The bottom row shows the correlation of each dataset with the average of all datasets. HadAT is somewhat affected by incomplete coverage (only to 50°S; see Fig. 2), as is RAOBCORE to a lesser degree (coverage to 70°S).

Numerically, this supports the overall conclusion of Figure 1, which is that as a group the reanalysis model results do not agree well with each other. This certainly does not give confidence in the idea of blindly treating such model output as “data”.

Finally, Figure 4 shows the three satellite records, along with the MERRA reanalysis model output.

Figure 4. Same as in Figure 1, but with the balloon data and all but one of the reanalysis model results removed, leaving the satellite records (blue) and the MERRA reanalysis model (violet).

In general the three satellite records are in good agreement. The STAR and RSS datasets are extremely similar, somewhat disturbingly so, in fact; their correlation is 1.00. It makes me wonder whether they share large portions of their underlying analysis mathematics. If so, one might hope that they would resolve whatever small differences remain between them.

I have read, but cannot now lay my hands upon, a document which said that the RSS team uses climate model output as input to part of their calculation of the temperature. In contrast, the UAH team does not use a climate model for that aspect of their analysis, but does a more direct calculation. (I’m sure someone will be able to verify or falsify that.) [UPDATE: Stephen Singer points to the document here, which supports my memory. The RSS team uses the output of the CCSM3 climate model as input to their analysis.] If so, that could explain the similarity between MERRA and the RSS/STAR pair. On the other hand, the causation may run the other way: the reanalysis model may be overweighting the RSS/STAR input, because remember, some dataset from among the satellite data, perhaps the RSS data, is used as input to the reanalysis models.

This leads to the interesting situation where the output of the CCSM3 is used as input to the RSS temperature estimate. Then the RSS temperature estimate is used as input to a reanalysis climate model … recursion, anyone?

Finally, this points to the difficulty in resolving the question of tropical tropospheric amplification. I have written about this question here. The various datasets give various answers regarding how much amplification exists in the tropics.

CONCLUSIONS? No strong ones. Reanalysis models are not ready for prime time. There is still a lot of variation in the different measurements of the global tropospheric temperature. This is sadly typical of the problems with a number of the other observational datasets. In this case, it affects the measurement of tropical tropospheric amplification. Further funding is required …

Regards to all,

w.

DATA:

The data from Figure 1 are given below in comma-separated format:

Latitude, STAR, UAH, RSS, RATPAC, HADAT, IUK, RAOBCORE, RICH, JRA25, MERRA, NCEP-CFSR, NCEP-DOE, NCEP-NCAR
-80, -0.104, -0.244, -0.134, 0.085, , 0.023, , 0.023, -0.243, -0.154, 0.028, 0.294, 0.304
-70, -0.074, -0.086, -0.094, 0.09, , -0.035, 0.071, -0.034, -0.218, -0.115, -0.045, 0.147, 0.148
-60, -0.055, -0.142, -0.074, 0.09, , -0.088, 0.1, -0.148, -0.285, -0.051, -0.094, 0.059, 0.104
-50, 0.005, -0.069, -0.043, -0.006, 0.138, 0.022, 0.081, 0.01, -0.232, 0.032, 0.029, 0.03, 0.114
-40, 0.07, -0.076, 0.026, -0.01, 0.118, -0.107, 0.08, 0.074, -0.081, 0.115, 0.116, -0.005, 0.077
-30, 0.143, 0.082, 0.087, 0.114, 0.123, 0.122, 0.127, 0.126, 0.047, 0.178, 0.22, 0.047, 0.108
-20, 0.182, 0.08, 0.13, 0.12, 0.085, 0.087, 0.143, 0.125, 0.116, 0.213, 0.289, 0.071, 0.097
-10, 0.199, 0.056, 0.153, 0.114, -0.02, 0.082, 0.116, 0.098, 0.069, 0.226, 0.313, -0.003, 0.053
0, 0.195, 0.038, 0.154, 0.089, 0.038, 0.028, 0.136, 0.089, 0.063, 0.284, 0.324, -0.007, 0.061
10, 0.179, 0.034, 0.144, 0.09, 0.064, 0.192, 0.162, 0.137, 0.087, 0.273, 0.328, 0.027, 0.065
20, 0.21, 0.093, 0.166, 0.09, 0.18, 0.16, 0.194, 0.207, 0.115, 0.245, 0.307, 0.115, 0.114
30, 0.23, 0.133, 0.162, 0.247, 0.239, 0.137, 0.238, 0.291, 0.152, 0.257, 0.307, 0.154, 0.153
40, 0.238, 0.164, 0.161, 0.237, 0.213, 0.189, 0.246, 0.3, 0.153, 0.244, 0.268, 0.161, 0.194
50, 0.241, 0.125, 0.161, 0.24, 0.314, 0.213, 0.247, 0.283, 0.166, 0.236, 0.238, 0.161, 0.201
60, 0.299, 0.167, 0.222, 0.283, 0.289, 0.207, 0.335, 0.324, 0.224, 0.288, 0.266, 0.202, 0.239
70, 0.317, 0.177, 0.245, 0.288, 0.289, 0.237, 0.427, 0.393, 0.254, 0.304, 0.269, 0.232, 0.254
80, 0.357, 0.276, 0.301, 0.278, 0.438, 0.384, 0.501, 0.323, 0.226, 0.328, 0.326, 0.235, 0.26
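For readers who want to check the numbers, here is a minimal sketch (Python, standard library only; the helper name is mine) that computes the Pearson correlation between two of the trend-by-latitude series above, STAR versus RSS, the pair whose similarity is discussed in the post:

```python
import math

# Trend (K/decade) by latitude band, -80 to 80, copied from the table above.
star = [-0.104, -0.074, -0.055, 0.005, 0.07, 0.143, 0.182, 0.199, 0.195,
        0.179, 0.21, 0.23, 0.238, 0.241, 0.299, 0.317, 0.357]
rss = [-0.134, -0.094, -0.074, -0.043, 0.026, 0.087, 0.13, 0.153, 0.154,
       0.144, 0.166, 0.162, 0.161, 0.161, 0.222, 0.245, 0.301]

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson(star, rss)
print(f"STAR vs RSS correlation: {r:.2f}")
```

The same helper can be run over any pair of columns to reproduce the rest of the Figure 3 correlation matrix.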


96 Comments
richard verney
November 7, 2011 4:01 am

What is the point of ‘reanalysis data’ when you have empirical observational data?
Would it not be better to work from the empirical observational data?
I agree that the model creations (i.e., the ‘reanalysis data’) appear way off target (although if they were averaged they might provide a better fit with reality) and are another example of how poor the models are. I guess nothing new there.

jens raunsø jensen
November 7, 2011 4:51 am

Hi Willis,
thanks for sharing this with us. I have not worked with the satellite or reanalysis data, but having just completed a step-change analysis of nearly all the complete, long-term records in the GHCN (except North America and Australia), I find strong support for the likely existence of real step changes in temperature records (hopefully I can post this later). If that is the case, the basic assumption of linearity underlying the widespread use of linear trend analysis (as also used in XP2011) is violated and the resulting trend is misleading. Given your insight and experience with the temperature records and analyses, would you have any thoughts on this fundamental issue, eg. should we discourage the use of linear regression as a method to identify “the pattern” in temperature records ?
Thanks … jens

Bloke down the pub
November 7, 2011 4:55 am

I suspect that error bars on the reanalysis data would be wider than any anomaly.

November 7, 2011 5:18 am

Data are readings taken from instruments. Period.
Missing data are forever missing.
Nothing produced by a model is “data”.
Rumplestiltskin is not a climate modeler.

RACookPE1978
Editor
November 7, 2011 5:21 am

Gee. That’s funny.
Last time I heard of temperature “data” for the Arctic, NASA-GISS’s Hansen was claiming +4 degrees across the whole Arctic landmass…… (Seems like he is plotting a huge “red” mass of hot air based on some 1200 km “averaging” and extrapolation scheme he came up with in some 1988 paper with a r^2 of 0.42 ….)
Now, is it +4.0 degrees? Or +0.4 degrees?

November 7, 2011 5:45 am

Ha! “Further funding is required …”
Here’s a simple idea: just scrap the idea of ‘average global temperature’, and the problem is solved! Then people like yourself can get on with real science.
All this averaging of large amounts of data seems to produce irrelevant results anyway. It does not relate well to temperature, nor can the result be used for any study of causality, which AGW enthusiasts attempt all the time. An environment with a temperature range between 0°C and 20°C can have any average temperature in between at any given time, and it isn’t hard to get the result you want just by mixing up different methods or by adding/subtracting favorable data.
Here’s an 8 day average temperature
1°C + 19°C + 3°C + 20°C + 2°C + 18°C + 0°C + 20°C :=83 /8 :=10.3 °C
Here’s another 8 day average temperature
1°C + 1°C + 1°C + 20°C + 1°C + 1°C + 0°C + 20°C :=44 /8 :=5.5 °C
Here’s a 16 day day average temperature from the above data calculated in two different ways
First 8 days + second 8 days 10.3 °C + 5.5 °C / 2 := 15.8°C
Complete 16 day
1°C + 19°C + 3°C + 20°C + 2°C + 18°C + 0°C + 20°C+1°C + 1°C + 1°C + 20°C + 1°C + 1°C + 0°C + 20°C :=128/16 :=8°C
Not a very scientific result to use, nor an accurate representation of the temperature of an environment over time. The margin of error on any variable temperature average is 50-100%. Why? Because it produces variable average results. If we compare variable averages to trends such as CO2, the results become unusable: the rising trend of CO2 takes over, the margin of error becomes 100%, and all the data become unusable. Try to relate solar effects to average global temperatures with a CO2 trend; it can’t be done, but reliable individual temperature readings fit very well.
Here’s my processor’s average temperature, which is less variable over 8 days, between 48°C and 50°C
49°C + 50°C + 48°C + 50°C + 49°C + 50°C + 49°C + 50°C :=395/8 := 49.3 °C
average temperature over 16 days
49°C + 50°C + 48°C + 50°C + 49°C + 50°C + 49°C + 50°C + 49°C + 50°C + 49°C + 50°C + 50°C + 48°C + 50°C + 49°C :=790/16 :=49.3°C
First 8 days + second 8 days 49.3 °C + 49.3°C:= 98.6/ 2 := 49.3°C
very accurate result!
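Redoing the arithmetic in the comment above (a minimal Python sketch; variable names are mine): with the brackets in the right place, the mean of two group means equals the overall mean, but only because the two groups are the same size. Note also that the second 8-day list actually sums to 45, not 44, which is why the 16-day total of 128 checks out:

```python
first_8 = [1, 19, 3, 20, 2, 18, 0, 20]
second_8 = [1, 1, 1, 20, 1, 1, 0, 20]

mean1 = sum(first_8) / len(first_8)       # 83/8  = 10.375
mean2 = sum(second_8) / len(second_8)     # 45/8  = 5.625
overall = sum(first_8 + second_8) / 16    # 128/16 = 8.0

# With the division bracketed correctly, the mean of the two group
# means matches the overall mean -- because the groups are equal-sized.
mean_of_means = (mean1 + mean2) / 2       # 8.0, not 15.8
print(mean1, mean2, overall, mean_of_means)
```

For unequal-sized groups, averaging the group averages would not equal the overall average; the groups must be weighted by their sizes.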

Chuck Nolan
November 7, 2011 6:04 am

Further funding is required …
Willis, have you contacted the “TEAM” to see if they could steer you towards some “Obama Money” to continue your research? They would be my first choice because they have really deep pockets. (Oh wait, their hands are in my pants pockets). Good luck.
If you get the chance tell Mann I want my data I paid for.

DirkH
November 7, 2011 6:21 am

Sparks says:
November 7, 2011 at 5:45 am
“Here’s a 16 day day average temperature from the above data calculated in two different ways
First 8 days + second 8 days 10.3 °C + 5.5 °C / 2 := 15.8°C

10.3 + 5.5 = 15.8
15.8 / 2 = 7.9
Forgot a bracket?

richard verney
November 7, 2011 6:38 am

DirkH says:
November 7, 2011 at 6:21 am
Sparks says:
November 7, 2011 at 5:45 am
//////////////////////////////////////////////////////////
Yes, and isn’t 83 (first period) + 44 (second period) = 127 and not 128, so that the average of these periods should be 127/16 and not 128/16?
I have not checked the addition of the individual series.

P. Solar
November 7, 2011 6:40 am

Interesting post Willis, wise analysis.
It would be interesting to see the horizontal line recalculated for your stripped down versions. I presume this is the all latitudes average for all the data. It appears to be 0.14 K/decade
This is exactly what I just got from a totally different method looking at d2T/dt2 in HadCrut3
It needs writing up and checking, but the bottom line was a 1900 dT/dt trend of 0.2 K/century and a second difference of 1.2 K/century^2.
That would give a current value of 1.4 K/century as shown in that paper, and an average around 0.7 K/century for the last century, which seems to be a generally accepted value.
This sort of value is firmly in the DON’T PANIC zone.

Pamela Gray
November 7, 2011 6:43 am

If our degree of “agreement” were broadened, would not the correlation be near perfect? As in let’s round to the nearest whole degree and call anything less noise. That would solve a lot of problems with a noisy data set. Of course it would also cause a great deal of funding to simply dry up and blow away. Can anybody guess why?

ferd berple
November 7, 2011 6:50 am

Isn’t the real question why the northern hemisphere is getting warmer and the southern hemisphere is getting colder if CO2 is well mixed?
How is it even possible that the southern hemisphere is getting colder? AGW certainly doesn’t explain that. Aerosols certainly don’t explain that unless China and India have suddenly shifted to the southern hemisphere.
This is a huge mystery completely unaccounted for in current theories of climate change. There is no way the southern hemisphere should be cooling while the northern hemisphere is warming over a 30 year timespan.
Why isn’t climate science all over this? It is a huge unexplained mystery in an area in which the science was supposed to be settled.

P. Solar
November 7, 2011 7:00 am

jens raunsø jensen says:
>>
would you have any thoughts on this fundamental issue, eg. should we discourage the use of linear regression as a method to identify “the pattern” in temperature records ?
Thanks … jens
>>
The post-war drop is a significant oddity in the 20th-century record. The problem seems to arise when pretend scientists decide to use this trough as a point of reference when calculating temperature “trends”.
GH warming is a log function of gas concentration. CO2 concentration is reckoned to be rising in a roughly exponential way. That gives a linear increase in the “forcing”. A radiative forcing will produce a rate of change of temperature not a simple change of temp. Anyone trying to fit a linear model to temperature has not understood the first thing about the physics. They are trying to fit a straight line to a parabola.
Fitting a century long linear regression to dT/dt using HadCrut3 temps gives credible figures. It would appear that that sort of timescale may be sufficient to even out the steps.
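P. Solar’s point, that a roughly linear ramp in forcing produces a curved (parabola-like) temperature response, so a straight-line fit gives a “trend” that depends entirely on the window chosen, can be illustrated with a toy calculation (Python; all coefficients here are invented purely for illustration, not taken from any dataset):

```python
def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    num = sum((a - mt) * (b - my) for a, b in zip(t, y))
    den = sum((a - mt) ** 2 for a in t)
    return num / den

years = list(range(100))
# Toy response to a linearly increasing forcing: quadratic in time.
temps = [0.002 * yr + 0.0001 * yr ** 2 for yr in years]

early = ols_slope(years[:50], temps[:50])  # slope over the first half
late = ols_slope(years[50:], temps[50:])   # slope over the second half

# A straight line fitted to a parabola: the fitted "trend" is just the
# curve's slope at the middle of whichever window you happened to pick.
print(early, late)
```

For a quadratic, the OLS slope over a symmetric window equals the derivative at the window’s midpoint, so here the two “trends” come out to 0.0069 and 0.0169 per year even though the underlying process never changed.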

barry
November 7, 2011 7:02 am

richard verney,

What is the point of ‘reanalysis data’ when you have empirical observational data?

All temperature data has problems. All. We just don’t have God-like observational tools. Which is why data has to be tested and adjusted where appropriate, and why reanalyses are done, combining many different data streams. The goal is always to improve the fidelity of the data.
Willis gives the impression that the satellite data are better than the rest. But this is not necessarily the case. There are 5 global temperature satellite datasets that I know of, and the decadal trend for each ranges from 0.14 to 0.20°C per decade over the last 30 years or so.
Problems abound with satellite data. From the study Willis is citing:

Unfortunately, similar to the shortcoming in the radiosonde observations, the number of satellite instruments and changes in design impact observational practices and the application of the data. For example, the MSU data come from 12 different satellites and the data quality is significantly affected by intersatellite biases, uncertainties in each instrument’s calibration coefficients, changes in instrument body temperature, drift in sampling of the diurnal cycle, roll biases and decay of orbital altitude (Christy and Spencer, 2000; Zou et al., 2008).

Choices must be made. This paper builds on others (written by, for example, Kevin Trenberth), that try to point out problems with the data so that they can be better dealt with.

ferd berple
November 7, 2011 7:15 am

Seriously, anyone have an “evidence based” explanation of why there is almost perfect agreement between all the datasets that the NH has been warming for 30 years while the SH has been cooling?
What feature of the earth has been changing that could account for this? I’m not buying CFCs and ozone as a cause. The ozone hole “appeared” at the south pole when almost all the CFC use was in the northern hemisphere. Looking further at the graphs above, there really hasn’t been very much net heating once you add the SH (negative) data to the NH (positive) data.
It makes one wonder if GW is more a product of land versus sea temperature? Most of the land is in the NH, while most of the oceans are in the SH. Is this what we are seeing in this data? Evidence that GW is mostly connected to portions of the globe where there is land. In which case, AGW would be more likely due to land use, which cannot be well mixed, because the land doesn’t move very fast.

November 7, 2011 7:16 am

The reanalysis datasets do have their issues including that they include the likely flawed RSS and UAH data. A discussion is at the following URL:
http://www.skepticalscience.com/Eschenbach_satellite_part.html

Latitude
November 7, 2011 7:22 am

Willis, don’t know if you saw this…………..
Envirocensors Hide Explosive Japanese Satellite Data
Scorching new evidence of the environmental left’s scientific obstruction has surfaced in the squelching of reports of Japanese satellite data, which suggest that the underdeveloped world emits far more carbon dioxide than previously imagined, even more than many Western nations! If the claim is substantiated, it could turn the entire meme that industrialized civilization is endangering the planet on its head.
http://rogueoperator.wordpress.com/2011/11/07/envirocensors-hide-explosive-japanese-satellite-data/

ferd berple
November 7, 2011 7:29 am

How is it possible that the SH has been cooling for 30 years and CO2 is well mixed?

November 7, 2011 7:38 am

DirkH says:
November 7, 2011 at 6:21 am
Thanks, I had it right the first time before someone came in and interrupted me. I’ve been rushing about all day and probably should have waited and put a bit more time into my comment, but I hope you still understand where I was going with it despite the error. It’s a travesty. 🙂

ferd berple
November 7, 2011 7:43 am

jens raunsø jensen says:
>>
would you have any thoughts on this fundamental issue, eg. should we discourage the use of linear regression as a method to identify “the pattern” in temperature records ?
Thanks … jens
>>
Most would agree that today’s temperature is not independent of yesterdays (autocorrelation).
Here is what wikipedia has to say:
Autocorrelation violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated. While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive.
In other words, linear regression can under-estimate data errors and natural variability.
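The underestimation ferd quotes can be seen in a small Monte Carlo sketch (Python standard library only; the sample size, autocorrelation, and trial count are arbitrary choices of mine): generate trendless AR(1) series, fit an OLS slope to each, and compare the actual spread of the fitted slopes with the naive OLS standard error that assumes independent errors.

```python
import math
import random

random.seed(42)

def ols_slope_and_se(t, y):
    """OLS slope and its textbook (independence-assuming) standard error."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((a - mt) ** 2 for a in t)
    slope = sum((a - mt) * (b - my) for a, b in zip(t, y)) / sxx
    intercept = my - slope * mt
    resid = [b - (intercept + slope * a) for a, b in zip(t, y)]
    s2 = sum(r * r for r in resid) / (n - 2)
    return slope, math.sqrt(s2 / sxx)

n, rho, trials = 50, 0.8, 400
t = list(range(n))
slopes, naive_ses = [], []
for _ in range(trials):
    # Trendless AR(1) noise with lag-1 autocorrelation rho.
    y, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + random.gauss(0, 1)
        y.append(prev)
    slope, se = ols_slope_and_se(t, y)
    slopes.append(slope)
    naive_ses.append(se)

mean_slope = sum(slopes) / trials
empirical_sd = math.sqrt(sum((s - mean_slope) ** 2 for s in slopes) / trials)
mean_naive_se = sum(naive_ses) / trials
# With positively autocorrelated errors, the true spread of the fitted
# slopes is much larger than the naive OLS standard error suggests.
print(empirical_sd, mean_naive_se)
```

In other words, even pure red noise with no trend at all routinely produces fitted slopes several naive standard errors away from zero, which is exactly the trap the Wikipedia passage warns about.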

ferd berple
November 7, 2011 7:49 am

Latitude says:
November 7, 2011 at 7:22 am
Envirocensors Hide Explosive Japanese Satellite Data
http://rogueoperator.wordpress.com/2011/11/07/envirocensors-hide-explosive-japanese-satellite-data/
I’d read about the Japanese findings and wondered why they were not being widely reported. This is huge because it shows that land use, not industrialization, is driving CO2 emissions. Definitely not a message that the IPCC, WWF, and REDD want anyone to hear.

November 7, 2011 7:51 am

I think it would be a fairer presentation of the analysis for the x-axis to be plotted not as 90S to 90N in equidistant increments, but rather with increments proportional to cos(latitude), so that 0 to 10N is the widest increment and 80N to 90N is quite narrow. This would visually reflect each band’s contribution to the area of the earth. It is harder in Excel to label latitude with x = cos(latitude) than to increment by latitude, but it might be effective to do it at least once.
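The same cos(latitude) point matters for averaging, not just plotting: each 10° band’s share of the earth’s surface shrinks toward the poles. A quick sketch (Python; STAR trends copied from the data table in the post, and using cos of the band center as a close stand-in for the exact band area, which is strictly the difference of sines of the band edges):

```python
import math

lats = list(range(-80, 90, 10))  # band centers, -80 ... 80
star = [-0.104, -0.074, -0.055, 0.005, 0.07, 0.143, 0.182, 0.199, 0.195,
        0.179, 0.21, 0.23, 0.238, 0.241, 0.299, 0.317, 0.357]

# Approximate area weight of each 10-degree band.
weights = [math.cos(math.radians(lat)) for lat in lats]

unweighted = sum(star) / len(star)
weighted = sum(w * v for w, v in zip(weights, star)) / sum(weights)

# Down-weighting the high latitudes (cooling in the south, strongly
# warming in the north) shifts the global figure noticeably.
print(unweighted, weighted)
```

The two averages differ by about 0.01 K/decade for this particular column, a reminder that an unweighted latitude average overstates the polar bands’ contribution roughly fivefold relative to the tropics.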

Carrick
November 7, 2011 7:53 am

Robert:

The reanalysis datasets do have their issues including that they include the likely flawed RSS and UAH data.

I generally don’t bother with SkS but followed it since you linked it. I agree with Tom Curtis’ critique of the SkS post you linked.
“Likely flawed” just shows your (IMO) confirmation bias.

November 7, 2011 7:56 am

Another excellent article by Willis. As always, I gave it 5 stars.☺
Pamela Gray says:
“If our degree of ‘agreement’ were broadened, would not the correlation be near perfect? As in let’s round to the nearest whole degree and call anything less noise.”
Pamela rightly points out the elephant in the room: click
Error bands are wider than the tenths of a degree purportedly being measured. When a normal y-axis is used, the obvious conclusion is that there’s nothing to panic about. The small changes over the past fifteen decades are down in the noise.

commieBob
November 7, 2011 8:22 am

ferd berple says:
November 7, 2011 at 6:50 am
Isn’t the real question why the northern hemisphere is getting warmer and the southern hemisphere is getting colder if CO2 is well mixed?

Actually, it isn’t much of a mystery. The ratio of land to ocean in the NH is about 1:1.5 whereas in the SH it is 1:4. The ocean is much more efficient at storing heat than the land is.

For example, climate of Southern Hemisphere locations is often more moderate when compared to similar places in the Northern Hemisphere. This fact is primarily due to the presence of large amounts of heat energy stored in the oceans.

http://www.physicalgeography.net/fundamentals/8o.html
