Tropical Tropospheric Amplification – an invitation to review this new paper

The Amplification Invitation

Guest Post by Willis Eschenbach

The ITCZ from space. Source: NASA Earth Observatory.

A while ago I started studying the question of the amplification of the tropical tropospheric temperature with respect to the surface. After months of research bore fruit, I started writing a paper. My intention was to have it published in a peer-reviewed journal. I finished the writing about a week ago.

During that time, I also wrote and published The Temperature Hypothesis here on WUWT. This got me to thinking about science, and about how we establish scientific facts. In climate science, the peer review process is badly broken. Among other problems, it is often an “old boy” system where very poor work is waved through. In common with other sciences, turnaround of ideas in journals takes weeks. Under pressure to publish, journals often do only the most cursory examination of the papers.

Upon reflection, I have decided to try a different way to examine the truth content of my paper. This is to invite all of the authors whose work I discuss, and other interested scientists of all stripes, to comment on the paper and on whether they can find any flaws in it. To facilitate the process I have provided all of the code and data that I used to do the analysis.

Making this process work will require cooperation. First, I ask for science and science only. No discussions of motives. No ad homs. No generalizations to larger spheres. No asides. No disrespect; we can be gentlemen and gentlewomen. No comments on politics, CO2, or AGW; no snowball earth. This thread has one purpose only: to establish whether my ideas stand. Either attack and destroy the ideas I put forth in the paper below, or provide evidence and data to support them.

People think science is a cooperative endeavor. It is not. It is a war of ideas. An idea is put out, and scientists gather round to attack it and disprove it. Sometimes, other scientists may support and defend it. The more fierce the attack, the better … because if it can withstand the strongest attack, it is more likely true. When your worst scientific enemies and greatest disbelievers can’t show that you are wrong, your ideas are taken as scientific fact. (Until your ideas in turn are perhaps overthrown). Science is a blood sport, but all of the attack and parry is historically done in private. I propose to bring it out in public, and I offer my contribution below as the first victim.

Second, I will insist on a friendly, appreciative attitude towards the contributions of others. We are interested in working together to advance our primitive knowledge of how the climate works. We are doing that by trying to tear my ideas down, to disprove them, to find errors in them. To make this work we must do it with respect for the people involved and the ideas they put forward. You don’t have to smash the guy to smash the idea; in fact, doing so reduces your odds.

Third, no anonymous posting, please. If you think your ideas are scientifically valid, please put your name on them.

With that in mind, I’d like to invite any and all of the following authors whose work I discuss below, to comment on and/or tear to shreds this study.

J. S. Boyle, J. R. Christy, W. D. Collins, K. W. Dixon, D. H. Douglass, C. Doutriaux, M. Free, Q. Fu, P. J. Gleckler, L. Haimberger, J. E. Hansen, G. S. Jones, P. D. Jones, T. R. Karl, S. A. Klein, J. R. Lanzante, C. Mears, G. A. Meehl, D. Nychka, B. D. Pearson, V. Ramaswamy, R. Ruedy, G. Russell, B. D. Santer, G. A. Schmidt, D. J. Seidel, S. C. Sherwood, S. F. Singer, S. Solomon, K. E. Taylor, P. W. Thorne, M. F. Wehner, F. J. Wentz, and T. M. L. Wigley

(Man, all those 34 scientists on that side … and on this side … me. I’d better attack quick, while I have them outnumbered … )

I also invite anyone who has evidence, logic, theory, or data to disprove or to support my analysis to please contribute to the thread. At the end of this process, having exposed my ideas, data, and code to the attacks of anyone and everyone who can find flaws with them, I will have my own answer. If no one is able to disprove or find flaws in my analysis, I will consider it to be established science until someone can do so. Whether you consider it established science is up to you. However, it is certainly a more rigorous process than peer review, and anyone who disagrees has had an opportunity to do so.

I see something of this nature, a web-based process, as the future of science. We need a place where scientific ideas can be brought up, discussed and debated by experts from all over the world, and either shot down or provisionally accepted in real time. Consider this an early experiment in that regard.

Three months to comment on a journal paper is so 20th century. I’m amazed that the journals haven’t done something akin to this on the web, with various restrictions on reviewers and participants. Nothing sells journals like blood, and scientific blood is no different from any other.

So without further ado, here is my paper. Tear it apart or back it up, enjoy, ask questions, that’s science.

My best to everyone.

w.



A New Amplification Metric

 

ABSTRACT: A new method is proposed for exploring the amplification of the tropical tropospheric temperature with respect to the surface. The method looks at the change in amplification with the length of the record. The method is used to reveal the similarities and differences between the HadAT2 balloon, UAH MSU satellite, RSS MSU satellite, RATPAC balloon, AMSU satellite, NCEP Reanalysis, and five climate model datasets. In general, the observational datasets (HadAT2, RATPAC, and satellite datasets) agree with each other. The climate model and reanalysis datasets disagree with the observations. They also disagree with each other, with no two being alike.


Background

“Amplification” is the term used for the general observation that in the tropics, the tropospheric temperatures at altitude tend to vary more than the surface temperature. If the surface temperature goes up by a degree, the temperature aloft often goes up by more than a degree. If surface and tropospheric temperatures were to vary by exactly the same amount, the amplification would be 1.0. If the troposphere varies more than the surface, the amplification will be greater than one, and vice versa.

There are only a limited number of observational datasets of tropospheric temperatures. These include the HadAT2 and RATPAC weather balloon datasets, and two versions of the Microwave Sounding Unit (MSU) satellite data. At present the satellite record is about thirty years long, and the two balloon datasets are about 50 years in length.

Recently there has been much discussion of the Santer et al. 2005, Douglass et al. 2007, and Santer et al. 2008 papers on tropical tropospheric amplification. The issue involved is posed by Santer et al. 2005 in their abstract:

The month-to-month variability of tropical temperatures is larger in the troposphere than at the Earth’s surface. This amplification behavior is similar in a range of observations and climate model simulations, and is consistent with basic theory. On multi-decadal timescales, tropospheric amplification of surface warming is a robust feature of model simulations, but occurs in only one observational dataset [the RSS dataset]. Other observations show weak or even negative amplification. These results suggest that either different physical mechanisms control amplification processes on monthly and decadal timescales, and models fail to capture such behavior, or (more plausibly) that residual errors in several observational datasets used here affect their representation of long-term trends.


Metrics of Amplification

Santer 2005 utilizes two different metrics of amplification, viz:

We examine two different amplification metrics: RS(z), the ratio between the temporal standard deviations of monthly-mean tropospheric and TS anomalies, and Rβ(z), the ratio between the multi-decadal trends in these quantities.


Neither of these metrics, however, measures how much of the variation at altitude is actually related to the surface variations. Ratios of standard deviations merely measure the relative size of the swings; they cannot indicate whether one record is an amplified version of the other. The same is true of trend ratios. All they can show is the size of the difference, not whether the surface and atmosphere are actually correlated.

In order to measure whether one dataset is an amplified version of another, the simplest measure is the slope of the ordinary least squares regression line. This measures how much one temperature varies in relation to another.
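The distinction can be illustrated with a short Python sketch (the paper's actual analysis was done in R; the anomaly series here are invented for illustration). It builds one series that is a true amplified version of the "surface" and one that merely has swings of the same size:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 360                                   # 30 years of monthly anomalies
surface = rng.normal(0.0, 0.2, n)         # synthetic surface anomalies

amplified = 1.5 * surface                 # a true amplified version (slope 1.5)
noise = rng.normal(0.0, 1.0, n)
uncorrelated = noise * (np.std(amplified) / np.std(noise))  # same swing size, no link

def std_ratio(tropo, surf):
    # Santer-style R_S: ratio of temporal standard deviations
    return np.std(tropo) / np.std(surf)

def ols_slope(tropo, surf):
    # slope of the ordinary least squares regression of tropo on surf
    return np.polyfit(surf, tropo, 1)[0]

# The standard-deviation ratio is 1.5 for BOTH series, even though only
# one is actually an amplified version of the surface record...
print(std_ratio(amplified, surface), std_ratio(uncorrelated, surface))

# ...while the regression slope separates them: 1.5 versus roughly zero.
print(ols_slope(amplified, surface), ols_slope(uncorrelated, surface))
```

The standard-deviation ratio cannot tell the two cases apart; the regression slope can, which is why it is used as the amplification measure here.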

Despite a variety of searches, I was unable to find published studies showing that the “amplification behavior is similar in a range of observations and climate model simulations” as Santer et al. 2005 states. To investigate whether the tropical amplification is “robust” at various timescales, I decided to calculate the tropical and global amplification (average slope of the regression line) at all time scales between three months and thirty years or more for a variety of temperature datasets.  I started with the UAH and the RSS versions of the satellite record. Next I looked at the HadAT2 balloon (radiosonde) dataset. The results are shown in Figure 1.

To measure the amplification, for every time interval (e.g. 5 months) I calculated the amplification of all contiguous 5-month periods in the entire dataset. In other words, I exhaustively sub-sampled the full record for all possible contiguous 5-month periods. I took the average of the results for each time period. Details of the method are given in the Supplementary Online Material (SOM) Sections 2 & 3 below.
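For those who want to see the method in miniature, here is a Python sketch of the exhaustive sub-sampling (the code actually used, in R, is in the SOM; the series and the 1.4 amplification factor are invented for illustration):

```python
import numpy as np

def mean_amplification(tropo, surf, window):
    """Average OLS slope of tropo vs. surf over every contiguous
    `window`-length sub-period of the record (exhaustive sub-sampling)."""
    slopes = [
        np.polyfit(surf[i:i + window], tropo[i:i + window], 1)[0]
        for i in range(len(surf) - window + 1)
    ]
    return float(np.mean(slopes))

# Idealized example: a tropospheric series that is exactly 1.4x the surface,
# so every window (and hence the average) returns a slope of 1.4.
rng = np.random.default_rng(0)
surf = rng.normal(0.0, 0.2, 120)          # ten years of monthly anomalies
tropo = 1.4 * surf

print(mean_amplification(tropo, surf, 5))   # one point on the curve: ~1.4
```

Repeating this for every window length from three months up to the full record length gives one amplification curve per dataset and pressure level.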

I plotted the results as a curve which shows the average amplification for each of the various time periods from three months to thirty years (the length of the MSU datasets). This shows the “temporal evolution” of amplification, how it changes as we look at longer and longer timescales. I show the results at a variety of pressure levels in Figure 1. In general, at all pressure levels, short term (3 – 48 month) amplifications are much smaller than longer term amplifications.

Figure 1. Change of amplification with length of observation. Left column is amplification in the tropics (20N/S), and the right column is global amplification. T2 and TMT are middle troposphere measurements. T2LT and TLT are lower troposphere. Starting date is January 1979. Shortest period shown is amplification over three months. Effective weighted altitudes (from the RSS weighting curves) are about 450 hPa for the lower altitude TLT (~ 6.5 km) and 350 hPa (~ 8 km) for the higher altitude TMT.


Tropical Observations 1979-2008

Figure 1(a). UAH and RSS satellite data. This was the first analysis done. It confirmed the sensitivity of this temporal evolution method, as it shows a clear difference between the two versions (UAH and RSS) of the MSU satellite data. The two datasets are quite close in the first half of the record. They diverge in the second half.

The higher and lower altitude amplifications are very similar in both the RSS and the UAH versions. The oddity in Fig 1(a) is that I had expected the amplification at the higher altitude (T2 and TMT) to be larger than that at the lower altitude (T2LT and TLT). Instead, the higher altitude record had lower amplification. This suggests a strong stratospheric influence on the T2 and TMT datasets. Because of this, I have used only the lower altitude T2LT (UAH) and TLT (RSS) datasets in the remainder of this analysis.

Figure 1(b). HadAT2 radiosonde data. (Note the difference in vertical scale.) Despite the widely discussed data problems with the radiosonde data, this shows a clear and internally consistent picture of amplification increasing steadily with altitude up to 200 hPa, and decreasing steadily above that. It confirms the existence of the tropical tropospheric “hot spot”, where the amplification is large, and conforms to the result expected from theoretical consideration of the effect of lapse rate. Note that the 1998 El Nino is visible as a “bump” in the records at about month 240. It is also visible in the satellite record, in Fig. 1(a).

Figure 1(c). HadAT2, overlaid with MSU satellite data. Same vertical scale as (a). The satellite and balloon data all agree in the first half of the record. In the latter half, the fit is much better for the UAH satellite data analysis than the corresponding RSS analysis. Note that the amplification of the UAH version is a good fit with the 500 hPa level of the HadAT2 data. This agrees with the theoretical effective weighted altitude of the T2LT measurement.

Global Observations 1979-2008

Figure 1(d). Global UAH and RSS satellite data. Note difference in vertical scale. There is little amplification at the global level. Again, the UAH and RSS records are similar in the short term but not the long term.

Figure 1(e). Global HadAT2 radiosonde data. Note difference in vertical scale. Here we see that the amplification is clearly a tropical phenomenon. We do not see significant amplification at any level.

Figure 1(f). Global HadAT2, overlaid with MSU satellite data. Same vertical scale as (d). Once again, the satellite and balloon data all agree in the first half of the record. In the latter half, again the UAH analysis generally agrees with the observations. And again, the RSS version is a clear outlier.

Both in the tropics and globally, amplification of the levels above 850 hPa starts low and then shows a slow increase. The greatest amplification occurs at 8-10 years. After that, all levels (except in the RSS dataset) show a gradual decrease out to the 30-year end of the record.

Having seen the agreement between the UAH T2LT and the HadAT2 datasets, I next compared the tropical RATPAC data with the HadAT2 data. The RATPAC data is annual and quarterly, so I averaged the HadAT2 data annually and quarterly to match. The results are shown in Fig. 2. Note that these are fifty-year datasets, a much longer timespan than Fig. 1.
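The averaging itself is simple block averaging of the monthly values. A minimal Python sketch (dummy data; the real analysis used R):

```python
import numpy as np

def block_average(monthly, months_per_block):
    """Average a monthly series into quarterly (3) or annual (12) means,
    dropping any incomplete block at the end of the record."""
    n = (len(monthly) // months_per_block) * months_per_block
    return monthly[:n].reshape(-1, months_per_block).mean(axis=1)

monthly = np.arange(24, dtype=float)      # two years of dummy monthly values
annual = block_average(monthly, 12)       # -> [5.5, 17.5]
quarterly = block_average(monthly, 3)     # -> [1.0, 4.0, 7.0, ..., 22.0]
print(annual, quarterly)
```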

Figure 2. RATPAC and HadAT2 Tropical Amplification, 3 years to 50 years. Left column is annual data. Right column is quarterly data.

There is very close agreement between the HadAT2 and the RATPAC datasets. The annual version shows a number of pressure-altitude levels. The quarterly version averages the troposphere into two layers, one from 850-300 hPa and one from 300-100 hPa. Both annual and quarterly RATPAC versions agree well with HadAT2.

Before going further, let me draw some conclusions about tropical amplification from Figs. 1 & 2.

1. Three of the four observational datasets (HadAT2, RATPAC, and UAH MSU) are in surprisingly close agreement. The fourth, the RSS MSU dataset, is a clear outlier. The very close correspondence between HadAT2 and RATPAC at all levels gives increased confidence that the observations are dependable. This is reinforced by the good agreement in Figs. 1(c) and 1(f) between the UAH MSU and the HadAT2 500 hPa level amplifications, both in the tropics and globally.

2. Figure 1(c) clearly shows the theoretically predicted “tropical hot spot”. It is most pronounced at 200 hPa at about 8-10 years. At its peak the 200 hPa level has an amplification of about 2. However, this gradually decays over time to a long-term (fifty year) amplification of about 1.5.

3. The lowest level, 850 hPa, has a short-term amplification of just under 1. It gradually increases over time to an amplification of about 1. It varies very little with the length of observations. RATPAC and HadAT2 are in excellent agreement regarding the amplification at the 850 hPa level.

4. The amplification of the next two levels (700 and 500 hPa) are quite similar. The higher level (500 hPa) has slightly greater amplification than the lower. Again, both datasets (RATPAC and HadAT2) agree very closely. This is supported by the UAH MSU satellite data, which agrees with the 500 hPa level of both the other datasets.

5. The amplification of the 300 and 200 hPa levels are also quite similar. The amplification of the higher level (200 hPa) exceeds that of the next lower level in the early part of the record. However, the 200 hPa amplification decreases over time more than that of the lower level (300 hPa). This accelerated long-term decay in amplification is also seen in the 150 and 100 hPa levels.

6. Between 700 and 200 hPa, amplification rises to a peak at around 8-10 years, and declines after that.

Observations and Models

Because it is the most detailed of the observational records, I will take the HadAT2 as a comparison standard. Fig. 3(a) shows the full length (50 year) HadAT2 record. Figures 3(b) to (f) show the outputs from five climate models. These models were selected at random; they are simply the first five models I found to investigate.

In Fig. 3(a) the decline in the observed amplification at the 200 hPa level seen in the shorter 30 year record in Fig. 1(c) continues unabated to the end of the 50 year record. The 200 hPa amplification crosses over the 300 hPa level and keeps declining. The models in Fig. 3(b-f), however, show something completely different.

Figure 3. Three month to fifty year amplification for HadAT2 and for the output of various computer models.

The model results shown in Figs 3(b) to (f) were quite unexpected. It was not a surprise that the models disagreed with observations. It was a surprise that the model results varied so widely among themselves. The atmospheric amplifications at the various pressure levels are very different in each of the models.

In the observations, the greatest amplification is at 200 hPa at around eight to ten years. Only one of the models, GISS-ER, Fig. 3(d), reproduces that slow buildup of amplification. Unfortunately, the model amplification continues to increase from there to the end of the record, the opposite of the observations.

The observed amplification at all levels except 850 hPa peaks at 8-10 years and then decreases over time. This again is opposite to the models: in every model, amplification at all levels above 850 hPa either stays level or increases over time.

In the observations, amplification increases steadily with altitude from 850 hPa to 200 hPa. Not a single model showed that result. All of the models examined showed one or more inversions, instead of the expected steady increase in amplification with altitude shown in the observations.

The 850 hPa observations start slightly below 1, and gradually increase to 1. Only one model, BCCR (c), correctly reproduced this lowest level.

Variability of Observations and Models

To investigate the natural variability in the amplification of both observations and models, I looked at thirty-year subsets of the various 50-year datasets. Figure 4 shows the amplification of thirty-year subsets of the datasets and model output. This shows how much variability there is in thirty year records. I show subsets taken at 32 month intervals, with the earliest ones at the back of the stack.

Once again, there are some conclusions that can be drawn from first looking at the observations, which are shown in Fig. 4(a).

1. The relationship between the various layers is maintained in all of the subsets. The lowest level (850 hPa) is always at the bottom. It is invariably below (smaller amplification than) the rest of the levels up to 200 hPa. The 700/500 hPa pair are always very close, with the higher almost always having the greater amplification. The 200 hPa level is above the 300 hPa level for all of the early part of the record. This order, with amplification steadily increasing with altitude up to 200 hPa, holds true for every one of the thirty-year subsets of the observations.

2. The 700 hPa amplification is never less than the 850 hPa amplification. As one goes lower, so does the other. They cross only at the short end of the time scale.

3. At all but the 850 hPa level, amplification peaks at somewhere around 8-10 years, and subsequently generally declines from that peak.

4. The amplification at 200 hPa is much larger than at 300 hPa at short (decade) timescales, but decreases faster than the 300 hPa amplification.

5. There is a surprising amount of variation in these thirty-year overlapping subsets. This implies that the satellite record is still too short to provide more than a snapshot of the variation in amplification.

Figure 4. Evolution of amplification in thirty year subsets of fifty-year datasets. The interval between subsets is 32 months. Fig. 4(a) is observations (HadAT2). The rest are model hindcasts.

The most obvious difference between the models and the observations is that most of the models have much less variability than the observations.

The next apparent difference is that the amplifications in the models trend level or upwards with time, while the observed amplifications generally trend downwards.

Next, the pairings of levels are different. The only model which has the same pairings as the observations (700/500 and 300/200 hPa) is the HadCM3 model. And even in that model, both pairs are reversed from the observations. The other models have 700 hPa paired with (and often below) the 850 hPa level.

There is a final oddity in the model results. The short term (say four year) amplification at 200 hPa is very different in the various models. But at thirty years the models converge to a range of around 1.5 to 1.9 or so. This has led to a mistaken idea that the models reveal a “robust” long term amplification.

DISCUSSION

Having examined the changes in amplification over time for both observations and models, let us return to re-examine, one by one, the issues involved as stated at the beginning:

The month-to-month variability of tropical temperatures is larger in the troposphere than at the Earth’s surface.

This is clearly true. There is an obvious tropical “hot spot” of high amplification in the upper troposphere. It peaks at a pressure altitude of 200 hPa at about 8-10 years.

This amplification behavior is similar in a range of observations and climate model simulations, and is consistent with basic theory.


This amplification behavior is in fact similar in a range of observations. However, it is very dissimilar in a range of climate model simulations. While the observations are consistent with basic theory (amplification increasing with altitude up to 200 hPa), the climate models are inconsistent with that basic theory (higher levels often have lower amplification than lower levels).

On multi-decadal timescales, tropospheric amplification of surface warming is a robust feature of model simulations, but occurs in only one observational dataset [the RSS dataset].


There are no “robust” features of amplification in the model simulations. They have very little in common. They are all very different from each other.

Multi-decadal amplification decreases gradually over the 50-year observational record. Three of the observation datasets (UAH, RATPAC, and HadAT2) all agree with each other in this regard. The RSS dataset is the outlier among the observations, staying level over time. This RSS behavior is similar to several of the models, which also stay level over time. One possible explanation of the RSS difference is that, as I understand it, RSS uses computer climate modeling to inform a portion of its analysis of the underlying MSU data.

Other observations show weak or even negative amplification.


None of the tropical observational datasets above 850 hPa show “negative amplification” (amplification less than 1) at timescales longer than a few years. On the other hand, all of the observational datasets show negative amplification at short timescales, as do all of the models.

These results suggest that either different physical mechanisms control amplification processes on monthly and decadal timescales, and models fail to capture such behavior, or (more plausibly) that residual errors in several observational datasets used here affect their representation of long-term trends.


The problem is not that the observations fail to capture the long term trends. It is that every model disagrees with every other model, as well as disagreeing with the observations.

It appears that different physical mechanisms do indeed control the amplification at different timescales. The models match the observations in part of this, in that amplification starts out low and then rises to a peak over a period of years. In most of the models examined to date, however, this happens on a much shorter time scale (months to a few years) compared with observations (8-10 years).

However, the models seem to be missing a longer-term mechanism which leads to a long-term decrease in amplification. I suspect that the problem is related to the steady ratcheting up of the model temperature by CO2, which increases the longer term amplification and leads to upward trends. Whatever the cause may be, however, that behavior is not seen in the observations, which decrease over time.

Conclusions

1. Temporal evolution of amplification appears to be a sensitive metric of the state of the atmosphere. It shows similar variations in the various balloon and satellite datasets with the exception of the RSS MSU satellite temperature dataset.

2. The wide difference between the individual models was unexpected. It was also surprising that none of them shows steadily increasing amplification with altitude up to 200 hPa, as basic theory suggests and observations confirm. Instead, levels are often flipped, with higher levels (below 200 hPa) having less amplification than lower levels.

3. From this, it appears that we have posed the wrong question regarding the comparison of models and data. The question is not why the observations do not show amplification at long timescales. The real question is why model amplification is different, both from observations and from other models, at almost all timescales.

4. Even in scientific disciplines which are well understood, taking the position when models disagree with observations that “more plausibly” the observations are incorrect is adventurous. In climate science, on the other hand, it is downright risky. We do not know enough about either the climate or the models to make that claim.

5. Observationally, amplification varies even at “climate length” time scales. The thirty year subsets of the observations showed large changes over time. Climate is ponderous and never at rest. Clearly there are amplification cycles and/or variations at play with long timescales.

6. While at first blush this analysis only applies to temperatures in the tropical troposphere, it would not be surprising to find this same kind of amplification behavior (different amplification at different timescales) in other natural phenomena. The concept of amplification, for example, is used in “adjusting” temperature records based on nearby stations. However, if the relationship (amplification) between the stations varies based on the time span of the observations, this method could likely be improved upon.

Additional Information

The Supplementary Online Material contains an analysis of amplification in the AMSU satellite dataset and the NCEP Reanalysis dataset. It also contains the data, the data sources, notes on the mathematical methods used, and the R function and R program used to do the analyses and to create the graphics used in this paper.

References

Douglass DH, Christy JR, Pearson BD, Singer SF. 2007. A comparison of tropical temperature trends with model predictions. International Journal of Climatology 27: doi:10.1002/joc.1651.

Santer BD, et al. 2005. Amplification of surface temperature trends and variability in the tropical atmosphere. Science 309: 1551–1556.

Santer BD, et al. 2008. Consistency of modelled and observed temperature trends in the tropical troposphere. International Journal of Climatology 28: 1703–1722.

SUPPLEMENTARY ONLINE MATERIAL

SOM Section 1. Further investigations.


AMSU dataset

A separate dataset from a single AMSU (Advanced Microwave Sounding Unit) on the NOAA-15 satellite is maintained at <http://discover.itsc.uah.edu/amsutemps/>. Although the dataset is short, it has the advantage of not having any splices in the record. The amplification of that dataset is shown in Fig. S-1.


Figure S-1. Short-term global amplification of AMSU satellite data, MSU data, and HadAT2 data. The dataset is only ten years long.

Figure S-1(a). AMSU satellite data. This is a curious dataset. The 400 hPa level is a clear and dubious outlier. The distinctive “duck’s head facing right” shape of the second half of the 900 and 600 hPa levels is similar, while that of the 400 hPa level is quite different. It is very doubtful that one level would be that different from the levels above and below it. This is a strong indication of some unknown error in the 400 hPa data that is affecting the longer term amplification.

Figure S-1(b). Adjusted and unadjusted AMSU satellite data. To attempt to correct this error, I added a simple linear trend to the 400 hPa level. I did not adjust it until the amplification was right. Instead, I adjusted it until its long-term trend ended up proportionally between the long-term trends of the layers above and below. This gave the adjusted version (light green) of the amplification.
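As a sketch of this kind of adjustment, the following Python fragment (not the code used in the paper; the neighbor levels and trend values are assumptions for illustration) imposes a target linear trend on a series, with the target chosen proportionally, in pressure, between the trends of the layers above and below:

```python
import numpy as np

def impose_trend(series, target_trend):
    """Replace the series' own linear trend with `target_trend`
    (in units per time step), leaving all other variation intact."""
    t = np.arange(len(series), dtype=float)
    own_trend = np.polyfit(t, series, 1)[0]
    return series + (target_trend - own_trend) * t

# Hypothetical trends (deg/month) for the layers above and below 400 hPa;
# neighbor levels of 600 and 300 hPa are assumed here for illustration.
trend_600, trend_300 = 0.010, 0.016
frac = (600.0 - 400.0) / (600.0 - 300.0)
target = trend_600 + frac * (trend_300 - trend_600)   # 0.014 deg/month

rng = np.random.default_rng(1)
t = np.arange(120, dtype=float)
level_400 = rng.normal(0.0, 0.1, 120) + 0.030 * t     # spurious 0.030 trend
adjusted = impose_trend(level_400, target)
print(np.polyfit(t, adjusted, 1)[0])                  # now equals `target`
```

Because only a straight line is added, everything except the long-term trend of the series is left untouched.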

Figure S-1(c). Adjusted AMSU satellite data. After the adjustment of the 400 hPa trend, the amplifications fit together well. Curiously, despite being adjusted by a linear trend, the shape of the latter half of the 400 hPa level has changed. Now it has the “duck’s head facing right” shape of the 600 and 900 hPa levels. This unexpected change supports the idea that there is indeed an error in the trend of the 400 hPa data.

Figure S-1(d). Interpolated AMSU satellite data. Unfortunately, the AMSU levels are at different pressure altitudes than the HadAT2 levels. To compare the AMSU to HadAT2, we need to interpolate. Fortunately, the HadAT2 range (850 to 200 hPa) fits neatly inside the AMSU range (900 to 150 hPa). This, along with the similar and close nature of the signals at the various levels, allows for linear interpolation to give the equivalent AMSU amplification at the HadAT2 levels. The interpolated version is shown.
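A minimal Python sketch of the interpolation step, with made-up amplification values (the real values come from the AMSU analysis; interpolation here is linear in pressure, an assumption, and log-pressure would be an alternative):

```python
import numpy as np

# Hypothetical amplification values at the AMSU pressure levels (hPa).
amsu_levels = np.array([900.0, 600.0, 400.0, 150.0])
amsu_amp    = np.array([0.95, 1.10, 1.30, 1.60])

# The HadAT2 levels fall inside the AMSU range, so plain linear
# interpolation in pressure suffices (np.interp wants increasing
# x-coordinates, hence the reversals).
hadat2_levels = np.array([850.0, 700.0, 500.0, 300.0, 200.0])
interp_amp = np.interp(hadat2_levels[::-1], amsu_levels[::-1],
                       amsu_amp[::-1])[::-1]
print(dict(zip(hadat2_levels, interp_amp)))
```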

Figure S-1(e). HadAT2 and RSS/UAH satellite data. The global observational data over this short period (ten years) is scattered. Also, the HadAT2 data has a more jagged and variable shape. We may be seeing the effects of the paucity of the observations. In the short-term (this is ten years and less) the RSS and UAH amplification records are quite similar. As before, both are close to the 500 hPa HadAT2 amplification.

Figure S-1(f). AMSU and HadAT2 data. Close, but not a good match. The 200, 700, and 850 hPa levels match, but 300 and 500 hPa are quite different. Overall, the satellite data seems more reliable. The AMSU data shows a gradual change in amplification with altitude. The HadAT2 data is bunched and sometimes inverted.

My conclusions from the AMSU dataset are:

1. It contains an error, which appears to be a linear trend error, in the 400 hPa level.

2. Other than that, it is the most internally coherent of the observational datasets.

3. It points up the weakness of using short (one decade) subsets of the HadAT2 dataset.

NCEP Reanalysis

One of the attempts to provide a spatially and temporally complete global dataset despite limited observational data is the NCEP Reanalysis dataset. Figure S-2 compares the temporal evolution of amplification of HadAT2 and the NCEP Reanalysis output.

Figure S-2. Evolution of amplification of HadAT2 and the NCEP/NCAR reanalysis. Left column is amplification from 3 months to 50 years; right column is amplification of 30-year subsets of the 50-year datasets. The interval between the individual realizations in the right column is 32 months.

The NCEP reanalysis data in Fig. S-2(b) shows a fascinating pattern. The three middle levels (700, 500, and 300 hPa) are close to the HadAT2 observations. The 300 hPa levels agree extremely well. And while the 700 and 500 hPa levels are flipped in NCEP, they are still in the right location and very close to the observed values.

But at the same time, the amplifications of the lowest and highest levels are way off the rails. The 850 hPa amplification starts at 1, and just keeps rising. And the 200 hPa amplification starts out reasonably, but then takes a big jump with a peak around thirty years. That seems doubtful.

The observation that there are problems at the lowest and highest levels is reinforced by the analysis of the variation of thirty year subsets in the right column of Fig. S-2. These show the 200 hPa amplification varying wildly over all of the different 30 year datasets. In one of the thirty year subsets the 200 hPa amplification dips down to almost touch the highest 850 hPa line. There is clearly something wrong with the NCEP output at the 200 hPa level.

In addition, in the full NCEP record shown in Fig. S-2(b) and in all of the 30-year subsets shown in Fig. S-2(d), the lowest level (850 hPa) increases steadily over time. After about 20 years it has more amplification than either the 700 or the 500 hPa level. This behavior is not seen in either the observations or any of the models.

Conclusions from the NCEP reanalysis

1. The 700, 500, and 300 hPa levels of the NCEP reanalysis are accurate. The 850 and 200 hPa levels suffer from large problems of unknown origin.

2. Use of the NCEP reanalysis in other work seems inadvisable until the 850 and 200 hPa amplification problems are resolved.


SOM Section 2. Data Sources

KNMI was the source for much of the data. It has a wide range of monthly data and model results that you can subset in various ways. Start at http://climexp.knmi.nl/selectfield_co2.cgi?someone@somewhere . It contains both Hadley and UAH data, as well as a few model atmospheric results. My thanks to Geert for his excellent site.

 

Surface data for all observational datasets is from the CRUTEM dataset at http://climexp.knmi.nl/data/icrutem3_hadsst2_0-360E_-20-20N_n.dat

UAH data is at http://www.nsstc.uah.edu/data/msu/t2lt/uahncdc.lt

RSS data is at http://www.remss.com/data/msu/monthly_time_series/

HadAT2 balloon data is at http://hadobs.metoffice.com/hadat/hadat2/hadat2_monthly_tropical.txt

CGCM3.1 model atmospheric data is at http://sparc.seos.uvic.ca/data/cgcm3/cgcm3.shtml in the form of a large (250Mb) NC file.

Data for all other models is from the “ta” and “tas” datasets from the WCRP CMIP3 multi-model database at https://esg.llnl.gov:8443/home/publicHomePage.do

In particular, the datafiles used were:

GISSE-R: ta_A1.GISS1.20C3M.run1.nc, and tas_A1.GISS1.20C3M.run1.nc

HadCM3: ta_A1_1950_Jan_to_1999_Dec.HadCM3.20c3m.run1.nc, and tas_A1.HadCM3.20c3m.run1.nc

BCCR: ta_A1_2.bccr_bcm2.0.nc, and tas_A1_2.bccr_bcm2.0.nc

INMCM3: ta_A1.inmcm3.nc, and tas_A1.inmcm3.nc


As all of these are very large files (1/4 to 1/2 a gigabyte each), I have not included them in the online data. Instead, I have extracted the data of interest and saved it in a much smaller file with the rest of the online data.


SOM Section 3. Notes on the Function and Code.

The main function that does the calculations and creates the graphics is called “amp”.

 

The input variables to the function, along with their default values are as follows:

datablock=NA : the input data for the function. The function requires the data to be in matrix form. By default the date is in the first column, the surface data in the second column, and the atmospheric data in any remaining columns. If the data is arranged in this way, no other variables are required. The function can be called as amp(somedata), as all other variables have defaults.

sourcecols=2 : if the surface data is in some column other than #2, specify the column here

datacols=c(3:ncol(datablock)) : this is the default position for the atmospheric data, from column three onwards.

startrow=1 : if you wish to use some start row other than 1, specify it here.

endrow=nrow(datablock) : if you wish to use some end row other than the last datablock row, specify it here.

newplot=TRUE : boolean, “TRUE” indicates that the data will be plotted on a new blank chart

colors=NA : by default, the function gives a rainbow of colors. Specify other colors here as necessary.

plotb=-2 : the value at the bottom of the plot

plott=2 : the value at the top of the plot

periods_per_year=12 : twelve for monthly data, four for quarterly data, one for annual data

plottitle="Temporal Evolution of Amplification" : the value of the plot title

plotsub="Various Data" : the value of the plot subtitle

plotxlab="Time Interval (months)" : label for the x values

plotylab="Amplification" : label for the y values

linewidth=1 : width of the plot lines

linetype="solid" : type of plot lines

drawlegend=TRUE : boolean, whether to draw a legend for the plot
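Since the original amp() is an R function not reproduced here, a rough Python sketch of the calling convention may help. The names and defaults below mirror the list above (with 0-based indexing instead of R's 1-based); the body only does the column and row selection, whereas the real function goes on to compute and plot the amplification:

```python
import numpy as np

def amp(datablock, sourcecols=1, datacols=None, startrow=0, endrow=None):
    """Simplified sketch of the amp() interface (selection step only).

    Mirrors the R defaults: date in the first column, surface data in
    the second, atmospheric data in all remaining columns.
    """
    datablock = np.asarray(datablock, dtype=float)
    if datacols is None:
        # default: atmospheric data from the third column onwards
        datacols = list(range(2, datablock.shape[1]))
    if endrow is None:
        # default: use every row to the end of the datablock
        endrow = datablock.shape[0]
    block = datablock[startrow:endrow]
    surface = block[:, sourcecols]   # surface temperature series
    atmos = block[:, datacols]       # one column per pressure level
    return surface, atmos
```

With data arranged in the default layout, a bare call such as amp(somedata) is all that is needed, exactly as described above.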

SOM Section 4. Notes on the Method.

An example will serve to demonstrate the method used in the “amp” function. The function calculates the amplification column by column. Suppose we want to calculate the amplification for the following dataset, where “x” is surface temperature, “y” is, say, T200, and each row is one month:

x   y
1   4
4   7
3   9

Taking the “x” values, I create the following 3×3 square matrix, with each succeeding column offset vertically by one row (this arrangement is known as a Hankel matrix, here padded with NAs). I build it by brute force in the function:

1     4     3
4     3     NA
3    NA    NA

I do the same for the “y” value:

4     7     9
7     9     NA
9    NA    NA

I also create the same kind of 3×3 matrix for x times y, and for x squared.

Then I take the cumulative sums of the columns of the four matrices. These are used in the standard least squares trend formula to give a fifth square matrix:

slope of regression line = (N*sum(xy) - sum(x)*sum(y)) / (N*sum(x^2) - sum(x)^2)


I then average the rows to give me the average amplification at each timescale.

This method exhaustively samples all contiguous sub-samples of each given length. This means that there will be extensive overlap (samples will not be independent). However, despite the lack of independence, using all available samples improves the accuracy of the method. This can be appreciated by considering a fifty-year dataset. There are many thirty-year contiguous subsets of a fifty-year dataset, but if you restrict your analysis to non-overlapping subsets, you can only pick one of them. Which gives the better estimate of the true 30-year amplification: a single subset, or the average over all of them?
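To make the recipe concrete, here is a minimal sketch of the same computation (the original function is in R; this is an illustrative Python version with names of my own choosing). It builds the shifted matrices, takes cumulative column sums, applies the least-squares slope formula, and averages across start positions:

```python
import numpy as np

def shifted_matrix(v):
    """Square matrix whose column j holds v[j:], padded below with NaN
    (a Hankel-type arrangement of the series)."""
    n = len(v)
    m = np.full((n, n), np.nan)
    for j in range(n):
        m[: n - j, j] = v[j:]
    return m

def amplification_by_timescale(x, y):
    """Entry k of the result is the least-squares trend of y against x,
    averaged over every contiguous window of length k+1 (all starts)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # cumulative sums down the columns of the four shifted matrices;
    # NaN padding propagates, marking windows that run off the end
    sx = np.cumsum(shifted_matrix(x), axis=0)
    sy = np.cumsum(shifted_matrix(y), axis=0)
    sxy = np.cumsum(shifted_matrix(x * y), axis=0)
    sxx = np.cumsum(shifted_matrix(x * x), axis=0)
    n = np.arange(1, len(x) + 1, dtype=float)[:, None]  # N per window length
    with np.errstate(divide="ignore", invalid="ignore"):
        # standard least-squares trend formula, per window
        slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
        # average over all start positions at each window length;
        # the one-sample row is undefined (NaN), as a trend needs 2 points
        return np.nanmean(slope, axis=1)
```

For the toy series in the text (x = 1, 4, 3 and y = 4, 7, 9), the two-month windows give slopes of 1 and -2 (average -0.5), and the single three-month window gives 17/14 ≈ 1.21.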

SOM Section 5. Of Averages and Medians.

The distribution of the short-term amplifications is far from normal. In fact, it is not particularly normal at any scale. This is because the amplification is calculated as the slope of a line, and any slope is the result of a division. When the divisor approaches zero, very large numbers can result. This makes averages (means) inaccurate, particularly at the shorter time scales.

One alternative is the median. The problem with the median is that it is not a continuous function. This limits its accuracy, particularly in small samples. It also makes for a very ugly stair-step kind of graph.

Frustrated by this, I devised a continuous “Gaussian mean” function which outperforms the mean on some varieties of datasets, and outperforms the median on others. It is usually between the mean and the median in value. In every dataset I tested, it equaled or outperformed at least one of the mean and the median.

To create this Gaussian mean I reasoned as follows. Suppose I have three numbers picked at random from some unknown stable distribution; let’s say they are 1, 2, and 17. What is my best guess as to the actual underlying true center of the distribution?

Since we don’t know the true distribution, the best guess as to its shape has to be a normal distribution. With such a distribution, once we know the standard deviation, we can iteratively estimate the mean.

To do so, we start by calculating the standard deviation, and picking a number for the estimated mean, say 5. If that is the mean, the numbers (1, 2, 17) measured in standard-deviation units (sd ≈ 8.96) are roughly (-0.45, -0.33, 1.34). Each of those z-scores has an associated probability density. I reasoned that the sum of these individual densities is proportional to the probability that the mean actually is 5.

I then iteratively adjust the estimated mean to maximize the total probability (the sum of the three individual densities). It turns out that the Gaussian mean of (1, 2, 17) calculated by my method is 3.8. This compares with an average for the three numbers of 6.7, and a median of 2.

In general there is very little difference between my Gaussian mean, the arithmetic mean, and the median. However, the Gaussian mean is much better behaved than the arithmetic mean in non-normal datasets. And unlike the median, it is a continuous function, which gives greater accuracy.
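A minimal sketch of that iteration follows (in Python rather than the original R). The text leaves the maximization step open, so the fixed-point update on a Gaussian-weighted mean below is one standard way to do it, an implementation assumption on my part:

```python
import numpy as np

def gaussian_mean(values, tol=1e-10, max_iter=1000):
    """Estimate the center m that maximizes the summed normal densities
    of the data, measured in units of the sample standard deviation.

    Setting the derivative of sum(exp(-z_i**2 / 2)) to zero, with
    z_i = (x_i - m) / sd, gives sum((x_i - m) * w_i) = 0 where
    w_i = exp(-z_i**2 / 2); we solve that by fixed-point iteration
    on the weighted mean (an assumed, not the author's, iteration).
    """
    x = np.asarray(values, dtype=float)
    sd = np.std(x, ddof=1)   # sample standard deviation
    m = np.mean(x)           # start from the arithmetic mean
    for _ in range(max_iter):
        w = np.exp(-(((x - m) / sd) ** 2) / 2.0)  # Gaussian weights
        m_new = np.sum(w * x) / np.sum(w)         # weighted mean update
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m_new
```

For the example in the text, gaussian_mean([1, 2, 17]) converges to about 3.8, between the median (2) and the arithmetic mean (6.7), as described above.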

All three options are included in the amp() function, with two of them commented out, so you can see the effects of each one. The only noticeable differences are that the mean is not very accurate at short time scales, and that the Gaussian mean and the arithmetic mean are not discontinuous like the median.


114 thoughts on “Tropical Tropospheric Amplification – an invitation to review this new paper”

  1. I am too scientifically illiterate to comment on your thesis, but I applaud your courage in testing it in open forum. Good luck!

  2. This is a welcome step in the right direction, Mr. Watts. I hope you receive the required response from your peers. I hope, too, that you will share the results in good time. A layman’s version for the climate science challenged readers like myself would be greatly appreciated if your busy schedule allows.

    Good luck.

    REPLY: Just so this is made clear, I did not author this. Willis Eschenbach did, as shown clearly under the title – Anthony

  3. You have done an amazing amount of work here Anthony. I don’t have time to read all of it right now, but it seems thorough, as is all you do. There appear to be 5-6 pictures that are not displayed, just a box with a red ‘X’, and ‘show picture’ doesn’t bring them through. Congratulations on finishing this.

    REPLY: Just so this is made clear, I did not author this. Willis Eschenbach did, as shown clearly under the title – Anthony

  4. After laying your heart and soul on the line, all I have to offer is this:

    I think it’s pretty when the yellow and green squigley lines twist around each other.

    Don’t hate me

  5. This is a very elegant way of presenting your paper and I suggest you present it in similar wording to the RealClimate community. How could they refuse and not applaud such a pleasant approach to peer review?

  6. Nice work, Willis. I hope that your efforts will get some sort of open response from climate scientists, even if it’s pure rebuttal. That’s how understanding improves. I agree that an “open source” philosophy and open review via the internet will be an almost inevitable trend for science moving forward. It will be resisted, of course, but those resisting will eventually lose.

  7. This paper is by Willis Eschenbach; while I think it is great that Anthony has printed it, you may want to address comments to Willis. I doubt I will; I have only briefly skimmed it so far.

  8. Willis E:
    A few people seem to think Anthony wrote this — they must not know you or Anthony very well. :)

    A couple of questions:

    1. I assume you need a surface temperature baseline to which tropospheric temperature is compared to calculate amplification. Perhaps I missed it, but what data set are you using for the surface temperature?

    2. To me it looks like all of the models show tropospheric amplification over all time scales analyzed. To me that would mean that tropospheric amplification is a robust result of the models — the details are not the same but the general concept of amplification is robust. Your definition seems to be different. Why do you say that tropospheric amplification (in general) is not a robust feature of the models?

  9. Could there be some sort of relationship involving shear? Poleward, there is generally more shear than equatorward.

  10. Heating in the tropics seems to vary with the interrelation of humidity and barometric pressure in the vertical. Can you model one and observe the other and vice versus?

    Nice work and great idea to throw this out to the community at large. It would be nice to have a “sticky link” at the sidebar to follow the comments over time.

  11. Willis,

    Just some standard research style writing suggestions.

    1. Do a search for the word “I” in your paper and change every sentence to remove the first person singular. For example change “I next did a graph of comparisons.” to “A graph of comparisons was completed.”

    2. This paper is not narrow enough and should probably be done in parts or just narrowed down. You should be able to narrow down your topic to one simple sentence (two or fewer commas and no connecting “and”s) and have that sentence clearly define what the paper is about. If you can’t, your topic is not narrow enough. Your title should be that sentence. Yours does not tell me what the paper is about and is waaayyy too broad.

    3. Within each paper, follow standard research article section protocol. For example:

    1. Abstract (should include the conclusion, it is NOT a teaser)
    2. Introduction of general nature of paper (not too specific here)
    3. Literature Review (and especially at the end of this review, what has not been done in all these papers but should be done)
    4. Problem Statement (the why of this research endeavor, why do we need what you have done)
    5. Research design, equipment, software, and procedures (is it a meta analysis, application of a new statistical technique, experiment, etc, and then the steps you followed)
    6. Results (cold hard data)
    7. Discussion (what cold hard data means)
    8. Conclusions (new or expanded insight)
    9. Recommendations (additional research needed, IE next step that everyone should consider)
    10. Appendix (your codes, data, etc)

    Or something like that.

    4. Buy a manual on writing research papers. This paper betrays the writer as someone who has not done this before, or at least does not do it for a living. There are several good ones that will help you through the step by step process of writing up your study. Your offering here is in dire need of editing, needs to be MUCH shorter, and should be considered a first and rough draft only.

  12. Kudos to Mr Eschenbach for giving his brainchild up to slaughter with such reckless abandon, I do enjoy observing the valiant efforts of both sides in a battle of science. I will certainly look forward to reading the discussion but I second the call for a layman’s version of this paper. I am still new to climate science and need some explanation.

    Thank you Mr Watts for hosting this here – maybe it could develop into a science battle tab at the top with even more papers being discussed openly, warts and all. This kind of accessibility and transparency is exactly what (climate) science needs right now. – As long as you provide some form of commentary for the spectators, that is.

  13. Unless I’m completely misreading the header to this article (“Guest Post by Willis Eschenbach”), we should be thanking Anthony for the forum (and bandwidth), but thanking Willis for the article itself.

    Good work, Willis. My chops are way too rusty to make any substantive comments, but I commend you on your efforts. I hope it is the first step in a long and fruitful endeavor.

  14. You have no error bars on your plots. Given the uncertainties attached to the observational data you use it’s necessary to present the error bars.

    Why do you only select 5 models? The IPCC model output database contains output from all GCMs involved in the IPCC process. If you are only going to select a small subset of models you should give proper justification for doing so, demonstrate that this selection does not affect your results or use all of the available data. In addition, have you used ensemble model output or are you using single realisation model output? You discuss the inter model variability yet you include no discussion of how these model runs were setup. It’s therefore difficult to assess the inter model variability. This is compounded by the apparent lack of an assessment of the ensembles and the ensemble means from each model. These are pertinent points in light of your conclusion that inter model variability is unexpected and at odds with theory. You focus on trying to defend your central conclusion in a more thorough way.

    You’ve not included your references in this version. Including references would ease the burden on the reviewer.

    Some of your plots have very confusing legends. You should have either a standard legend at the bottom of each panel or have legends which describe each dataset in each plot.

    REPLY: please read the section on posting comments, in red above. – Anthony

  15. Recommend that you remove this comment. IMHO, it is inconsistent with the spirit that you purport to want to create.
    “(Man, all those 34 scientists on that side … and on this side … me. I’d better attack quick, while I have them outnumbered … )”

  16. As the averaging length gets up towards 1/2 of sample length surely end effects swamp any signal?

  17. I don’t think potential responders to your invitation really needed all those admonishments not to do this, and not to do that. Hardly seems like a good way to start off a scientific discussion.

  18. Pamela Gray (09:08:11):

    “…”

    Thanks a lot, Pamela. I needed your style writing guideline.

    @Willis… I’ve read your paper and found it very useful. From the next paragraph in your paper:

    “In the observations, amplification increases steadily with altitude from 850 hPa to 200 hPa.”

    My first thought was on the natural explanation of this phenomenon through the processes of induced emission and negative induced emission. It would explain the amplification of the tropical upper troposphere temperature. The algorithm should be integrated into models.

  19. Willis: Do you have any thoughts as to why there is a long term decline in the amplification factor as shown by the observational record? And having very limited statistical faculties – could I ask: if there was a ‘phase change’ mid or late in the data record, would this drag the trend line down – put another way, does your calculation of trend obscure a cyclic pattern?

    I ask because from my reading of the climate models, they do not incorporate the major ocean cycles – such as the PDO, which appears to have a global signature over roughly 30 year time periods.

  20. Fascinating work Willis! Certainly some food for thought…

    My own opinion (at this point) of the “amplification” issue is that it provides strong evidence for a bias in the long term trends of the surface data sets. That amplification seems to vary with timescale is, essentially, my reasoning!
    http://www.climateaudit.org/phpBB3/viewtopic.php?f=3&t=740
    It seems to me that you reckon that there is some physical reason why the amplification declines in the long term… would not a warm bias in the surface data introduce precisely that effect? Readers here are surely aware of the problems with the surface data by now. ;)

  21. Willis,

    I don’t have time right now to go through your interesting paper in detail, but a couple of quick comments are in order.

    1) At some point you need to deal with issues of uncertainty in the models and data (systematic, random weather noise, etc.). Scientists such as Santer and Schmidt will argue that everything is in agreement because the uncertainties are so large. Specifically in your case, are the downward slopes of the amplification versus time for the nonRSS data statistically different from the flat or increasing slopes of the models?

    2) I am having a hard time reconciling your graphs with those from Santer et al. and Douglass et al. For example, Santer et al. (2008) shows trends for HadAT2 and RATPAC-A that are below the surface trends for ALL pressures. This would imply an amplification below 1 for those data sets, but you have amplifications well above 1 at 200 and 300 hPa. Why the discrepancy?

    3) Using only two balloon data sets opens you to charges of “cherrypicking,” especially since these two were among the lowest-trend sets in Santer. What happens if your analysis is applied to RAOBCORE, IUK, and RICH, to name some others from Santer et al. (2008)?

    I hope these comments are useful.

  22. Willis, regarding: ” 4. Even in scientific disciplines which are well understood, taking the position when models disagree with observations that “more plausibly” the observations are incorrect is adventurous. In climate science, on the other hand, it is downright risky. We do not know enough about either the climate or the models to make that claim. ”

    I’m certain that E. T. Jaynes (among others ) would agree wholeheartedly with this statement. Ref: “Probability Theory As Extended Logic”, http://bayes.wustl.edu/ .

  23. Willis

    great work though I am not really qualified to comment on the science to the layman it makes sense. I hope you keep up the good work.

  24. Brian Buerke (10:11:24) : RAOBCORE makes use of ERA-40 which has been shown to have a spurious warm bias:
    Sakamoto, M. and J.R. Christy, 2009: The influences of TOVS radiance assimilation on
    temperature and moisture tendencies in JRA-25 and ERA-40. J. Atmos. Oc. Tech.,
    doi:10.1175/2009JTECHA1193.1.
    RICH is similarly but not as extensively affected. See the EPA Christy comment in my post above.

    IUK appears to be in-between RATPAC and RICH.

    Speaking of “cherry picking” Santer et al. make some curious choices when it comes to their use of SST’s, and obsolete ones at that!

    That besides, Santer et al. curiously decide to truncate the observational trend analyses around 1999, which would otherwise have changed their conclusions:
    http://arxiv.org/pdf/0905.0445

  25. Re Pamela Gray (09:08:11) :

    While your lesson on the structure of research papers is doubtless accurate, good writers of all kinds generally eschew the passive tense, though it is often used to give a false sense of importance to the proceedings.

    /Mr Lynn

  26. I hope there is some way to print this out. I find that I can’t read this type of paper on screen very easily, and keep track of details. The nomenclature is somewhat unfamiliar to me, so I would have to do some extra digging.

    Two things did stand out to me.

    You defined “amplification” as the ratio of Tropospheric to Surface (variables); then from time to time, you talk about “negative” amplification.

    To me, negative amplification would mean that if the Surface value goes up, then the Tropospheric value must go down, and vice versa. But I’m thinking that you really mean a ratio less than one, rather than greater than one. I would not call that “negative”, and would suggest you find an alternative descriptor.

    The other thing I noted is that universally, all your graphs start off with an amplification ratio that is way less than one, and then quickly rise to some higher value.
    Is that a real physical phenomenon that can be observed, or is it some artifact of the mathematical processes? It seems odd to me that the tropospheric value is always going to be smaller the first time I observe surface and tropospheric temperatures, just because I never looked before.

    Good luck on getting responses from those authors you listed.

    I can’t comment intelligently, since I have no knowledge of exactly what the theoretical basis for the various computer models mentioned is.

    I can think of lots of reasons why the troposphere and the surface would not exactly track each other. I’m sensing that you are starting with some assumption that there is some causative relationship between the two.

    What if in fact the driving sources of energy are something(s) else, and they simply drive the surface and the troposphere differently.

    If I give a bundle of US dollars to one person; say a woman; and a like bundle to a second person, say a man; the woman is likely to go and buy a bunch of clothes, or jewellery (among other things); whereas the man will buy golf clubs or perhaps fishing gear.

    That does not mean that when a man buys a fishing rod, his wife will go out and get a new dress; or vice versa. they just react to discretionary money differently.

    So just what is your thesis regarding the relationship you have described here ?

    George

  27. Willis,
    Thanks for showing the integrity and thick skin required for such an open approach. I will hold comments on content until I fully understand the subject matter but would like to amplify on Pamela’s style guidance.

    Pamela Gray (09:08:11) said:
    “…
    8. Conclusions (new or expanded insight)
    9. Recommendations (additional research needed, IE next step that everyone should consider)
    …”

    I would like to add that all conclusions should be supported by the cold hard facts presented in the report. Also all recommendations should be supported by the conclusions.

    Of course as always the cold hard facts must stand alone or be supported. The less likely each fact is to generate a positive head nod the more supporting documentation it needs.

    Good luck with this new experiment.

  28. “Three months to comment on a Journal paper is so 20th century. I’m amazed that the journals haven’t done something akin to this on the web, with various restrictions on reviewers and participants. Nothing sells journals like blood, and scientific blood is no different from any other.”

    It is being done – check out PLoS ONE sort of an “open source” approach to science publication.
    http://www.plosone.org/home.action
    http://www.plosone.org/static/information.action

  29. Some good comments. I’ll add just one: please add standard footnote references (superscript or bracketed numerals [#]), especially for this:

    “The concept of amplification, for example, is used in ‘adjusting’ temperature records based on nearby stations.[#] ”

    I’m not sure this could be true. It sounds too ridiculous, even for climatology. Well, okay, maybe not, but let’s have the reference, anyway. Thanks.

  30. I do not know if you have time to answer the following question:

    Under Figure 2, you include the following in your conclusions about tropical amplification from Figs. 1 & 2.:
    “2. Figure 1(c) clearly shows the theoretically predicted “tropical hot spot”. …”

    Point of interest by this old skeptic of AGW: Are you referring to the model profile hot spot of the type on page 675 of chapter 9 in the IPCC Assessment report 4?

    I am slugging through your work. Will have comments that I hope will provide positive contributions.

    Thank you,

  31. ABSTRACT: A new method is proposed for exploring the amplification of the tropical tropospheric temperature with respect to the surface. The method looks at the change in amplification with the length of the record. The method is used to reveal the similarities and differences between the HadAT2 balloon, UAH MSU satellite, RSS MSU satellite, RATPAC balloon, AMSU satellite, NCEP Reanalysis, and five climate model datasets. In general, the observational datasets (HadAT2, RATPAC, and satellite datasets) agree with each other. The climate model and reanalysis datasets disagree with the observations. They also disagree with each other, with no two being alike.

    I suggest rewriting the abstract for starters. I like the topic you explore.
    1. Elaborate on the method and on what is new about it.
    2. In the last statement, for example, reword it to be more specific. “Disagree with observations”: reword it like:
    a. Do not use the word “disagree”. Use an expression like “Observations are inconsistent with the reanalysis data sets.” Then explain how they differ.
    b. As soon as you apply the word “disagree” you ask the reader to seek or reject agreement. Pamela offered some ideas to improve the structure and flow of your document. You are on the right track.

    Break this down some more:
    The method looks at the change in amplification with the length of the record

  32. As a Ph.D. I would have liked to participate in this review but unfortunately your wish to ignore standard review practice of anonymity prevents that.

  33. Heck, Willis,

    If you listen to this lot you’ll have a lifetime ahead of you just refining and re refining your article ad infinitum.

    Whilst supporting everything you say and your ambition and intent I do feel that your description of tropical events whilst worthy and at least in broad terms correct needs to be supplemented by a wider global perspective.

    As I said in your other thread your ideas about convective events in the tropics fit nicely into my global scenario by providing more detail on the tropical area which then goes on to drive the global changes in the air which I think were initially provoked by sea surface temperature variations (ultimately by solar variations).

    I think we both agree that an enhanced rate of energy transfer from air to space has the potential to scotch the effects of any GHG changes and thereby prevent them from warming the oceans and changing the global temperature equilibrium. Only the oceans can change that in my opinion.

    Your ideas adequately deal with the tropical end of things but not the rest of the global process.

    Good luck and best wishes. Cracking the political/media/scientific grant maintained establishment/energy producer monolith was always going to be an uphill struggle but at least the real world is currently on our side.

  34. Willis

    Politically correct fascism can seriously damage one’s career.
    While you seek the pure scientific method, in the interest of upholding the scientific method in the midst of such minefields, may I recommend you reconsider “no anonymous posting, please” to accommodate those like John Doe who wish to submit comments anonymously.
    Perhaps the moderators could verify that the person has the credentials claimed.

    REPLY: it is up to Willis- A

  35. I don’t have time to answer all of these right now, I will return to them, but many thanks to those who have written.

    Someone above comments:

    As a Ph.D. I would have liked to participate in this review but unfortunately your wish to ignore standard review practice of anonymity prevents that.

    This, unfortunately, is one of the down sides of peer review. You get to snipe at some poor bugger, but you don’t need to sign your name. Why should reviewers be unidentified? Either you believe in your comments enough to sign them, or you don’t. Why should you be free to anonymously attack my work? If you are telling the truth, what on earth do you have to worry about? Are you concerned that people will find out what you really believe? I thought that was part of science.

    Anonymous reviews are part of the problem with science, not part of the solution. When I have submitted articles for review, the reviewers know who I am. I’m not protected by anonymity and they can anonymously put me out of business … what’s up with that? I want to face my accusers in public, not have them blackball me in their private gentleman’s clubs.

    So I would invite the anonymous Ph.D. to screw your courage to the sticking point, and post under your own name. Like I said, science is a blood sport, and I’m tired of being stabbed in the back by nameless masked men. Either do it without a mask, or sit back and watch the fun. I would love to read your thoughts.

    Finally, yes, I know it is not in proper scientific format. Somebody chided me for my poor attempt at humor, which was my saying there’s 34 of them and only one of me, so I need to attack while I have them outnumbered. Well, that’s in the intro, not in the paper, and science could stand to lighten up a bit in my opinion … but if that offends you, my apologies.

    Pamela Gray, your points on style are well taken, but that’s part of the reason that I chose this way of publishing, because I have greater latitude. In addition, when the work is done by thirty or so scientists as the Santer et al. papers were, it’s easy to say “we” and use the passive voice. I have no one but myself, laboring away in my small apartment in the Solomon Islands … not a whole lot of “we” there, and very little passivity.

    I will provide a pdf soon, it is easier to read in that form. I will be quite busy for the next few days (moving on Friday) so please excuse my temporal lapses …

    w.

    PS – Someone pointed out above that anonymity allows one to post their true scientific opinions without endangering their career. Sorry, but if your career requires that you hide your scientific beliefs, you’re in the wrong job. Get a new job if your current job censors you. That’s a dilemma, but the solution is not to end the requirement for putting your name where your mouth is. The solution is to take a job where you can speak your mind, where you don’t have to lie about your beliefs …

  36. Third, no anonymous posting, please. If you think your ideas are scientifically valid, please put your name on them.

    If the idea can stand on its own, why not let it through?

  37. Willis: I’m a non-scientist, but is there a reason why you didn’t also include Richard Lindzen or Roy Spencer?

  38. I salute you, sir! Hats off!

    A fresh breeze blows this way! At last… This reminds me of old times, when science was science. I see no reason why this forum should not have at least equivalent stature to any of the majors when it comes to review. In fact, it already has; witness the stature afforded it by the EPA!

    Best of luck on the challenge!

  39. nanny_govt_sucks (13:34:11), you ask:

    Third, no anonymous posting, please. If you think your ideas are scientifically valid, please put your name on them.

    If the idea can stand on its own, why not let it through?

    Look, obviously I will not be fanatical about this, as your username shows. If you truly feel that you will put your career in danger by posting under your real name, or you have what you consider a valid reason to not stand behind your ideas by putting your name on them, then use an alias.

    Like I said, I’m just tired of being stabbed by anonymous masked men … but I also don’t want to see this hijacked into a discussion on anonymity. So let’s do it this way. If you can’t live without your alias, fine … but I strongly encourage you to use your real names. So PhD, post away. Can’t say I like it, but discussing anonymity is too much of a distraction from discussing the science.

    Richard DeSousa (13:36:25), I had forgotten to mention the excellent work of both Lindzen and Spencer. My apologies to both men, and I certainly invite them to comment on my work.

    w.

  40. In my world, anonymity is synonymous with cowardice. In my experience, cowards are more likely to spew venomous lies than to shed light on any given subject. I believe the author is wise to insist that all participants in his grand experiment come out into the sunlight and expose their true identities.

    CH

  41. Richard deSousa (13:36:25): Willis is hoping for comments from the scientists who authored the particular papers he references. Unless you count his publication of the satellite data with Christy, Roy hasn’t written anything on the amplification issue to my knowledge. Lindzen has, but Willis makes no reference to it. For instance:
    Lindzen, R.S. (2007) Taking Greenhouse Warming Seriously, Energy & Environment, 18, 937-950.
    But the focus here seems to be on the Douglass/Santer debacle.

  42. Oh, yeah, one final comment before I go to work (I’m at GMT+11).

    Anthony kind of jumped the gun a bit, and I did not get him the code and the data on time. I am providing all of the data and code used to prepare the paper, so anyone can reproduce my results and check my math. I’ll get that to Anthony tonight, so he can post a link to it with the paper.

    Best to all,

    w.

  43. I see a couple of advantages to anonymous review, along with at least one disadvantage.

    The disadvantage first: any moron can (and likely will) take potshots at the article and all its aspects, hiding safely behind his/her anonymity.

    The advantages:

    1) serious scientists who might enjoy critiquing and offering comments via WUWT could do so anonymously, but would not do so openly due to peer pressure / fear of job loss / recriminations.

    2) Anonymity removes the gravitas of famous names in science — and thus the potential for placing undue weight on those opinions.

    3) Anonymity encourages commenters to challenge the opinions of all, including famous scientists, whereas some might not be so bold if the identity of the famous was known.

    4) Anonymity may encourage the timid, or some who are not so bold, to speak up (or write forth in this case) with a different view or comment that otherwise they may hold back for fear of ridicule or disgrace.

    I do not believe any of this is new, as these considerations prevailed for millennia in open debates. This open review will be fascinating to observe, and perhaps to participate in, no doubt.

    Reply: No more OT discussions of anonymity please ~ charles the moderator

  44. Willis,

    Unfortunately you’ve chosen to go on the attack and complain about perceived attacks instead of maintaining the civility that you requested in your intro.

    A few of us asked direct questions about your methodology. Perhaps you could take a few minutes and answer those.

  45. Maybe this experiment needs two comment threads. All comments go on one and the scientific/review material is moved across only by a moderator.
    If this process works (and science will need it to regain its credibility), then an archive copy of the main steps will be needed.

  46. For those looking for a print copy, a simple copy and paste into MS Word should suffice. I do that regularly with posts found on WUWT, with great success. Willis’ paper from the abstract through the end is twenty-one pages in my preferred format.

  47. Interesting approach, Willis (note it’s not Anthony); it may have been better if you had posted on RealClimate as well to get a wider audience. I don’t know for sure, but I suspect you’ll only get responses where you’ve FUBARed, because the Team tend not to get into debate on the science, as they don’t seem to feel they have a very strong case. I don’t know why, but they don’t, and to be sure, if I’d achieved my goal of persuading the politicos of GW I’d hardly jeopardise it by entering into open debate with those that disagreed with me. Betcha Nature and Science won’t publish your efforts.

    Good luck though.

  48. The long-term surface temperature record appears to show a linear increase with a roughly 60-year oscillation superimposed on top of it. This, combined with an extended lag in the tropospheric temperatures, could be producing the declining amplification shown in figure 3(a). Look for example at a plot of stock prices with trailing 30-, 50- and 200-day moving averages. The longer-period averages of an oscillating plot will have a smaller amplitude.

    If the climate models you used for comparison do not reproduce the oscillation of the surface temperatures they would not show the decline in amplification over the longer time periods.

    It would be interesting to see a plot of the individual amplifications (for example the 30 year) over time to see if it varies over time in sync with the long period surface temperature variation.
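    The commenter’s moving-average point is easy to check numerically. The sketch below is my own illustration, not from the paper; the 720-month period and 0.3-unit amplitude are made up for the demonstration. It smooths a 60-year oscillation with trailing averages of increasing length and prints the amplitude that survives:

```python
import numpy as np

# Illustrative only: a 60-year (720-month) oscillation, sampled monthly.
# A trailing moving average passes a linear trend through essentially
# unchanged (apart from a lag), so we examine the oscillatory part alone.
months = np.arange(12 * 120)                      # 120 years of monthly data
cycle = 0.3 * np.sin(2 * np.pi * months / 720)    # 60-year oscillation

def trailing_mean(x, window):
    """Trailing moving average, keeping only the fully covered part."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

amps = []
for window in (12, 120, 360):                     # 1-, 10-, 30-year windows
    smoothed = trailing_mean(cycle, window)
    amps.append((smoothed.max() - smoothed.min()) / 2)
    print(f"{window:4d}-month average: surviving amplitude ~ {amps[-1]:.3f}")
```

    As the commenter suggests, the longer the averaging window relative to the oscillation period, the more the oscillation is attenuated, so comparing long-period averages of surface and troposphere can understate the common swing.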

  49. The author appears too defensive. He would be wise to try to incorporate the suggestions. There is a reason for scientific style and structure. This is not the place to go into detail about scientific writing but this reviewer has published in a scientific journal and has taken graduate course work in scientific writing.

    If the end goal of this paper is to stand shoulder to shoulder with papers that have made it into snail mail journals, the author is advised to bite the bullet and use the same style and structure. It is important to note that the author presumably comes to the table with fewer credentials than the scientists listed. If this paper under consideration is to get into the ring, let alone the viewing stand, with the referenced scientific papers in the review section, this paper must wear similar attire. As it stands, it is a difficult paper to read and suffers from lack of a clear path to the conclusion, ergo the need for standard structure.

    This reviewer would also second another comment made above. A paper that reviews opposing literature (which is a good thing to do but very difficult to pull off) must do so with standard courtesy. Be careful what is written and how it is written in the literature review section. Try to use dry, neutral language. Authors of opposing ideas must agree to disagree without being disagreeable (i.e., don’t use that word).

    One more thing, this reviewer must be aging. This new style of referring to oneself in a scientific paper with first person singular pronouns doesn’t sit well with this reviewer. It is easier to read a scientific paper with the emphasis on what the researcher DID. Not what the RESEARCHER did. Even when reviewing a paper, the reviewer usually refers to him or herself with the phrase, “This examiner…” or “This reviewer…”

    Get the idea?

  50. I am an avocational, amateur climate scientist. As a regular consumer at this site, I am aware of the first line of defense against research that refutes AGW. Those who dare to oppose are slimed with labels such as “non-scientist”, “not a climate scientist”, “professional opponent funded by big oil”, or other forms of personal attack. The AGWers simply refuse to engage on the merits of the issues.

    With that in mind, I Googled Willis and discovered that he is an amateur. I pondered this and realized that Einstein was also an amateur. Go for it, Willis. You are in great company.

  51. Pamela,
    Your comments on style are well considered and will need to be taken into account if the author decides to publish this in a more traditional medium.

  52. Willis,

    This is an interesting experiment in open-source science. As it breaks a mold, you will get criticisms for not adhering to standard form. If you truly care less for personal credit than for advancement of knowledge, let them pass without reply. Good ideas eventually will be adopted, as Wegener and Milankovitch have proven.

  53. Those taking note of the unusual circumstances of a request for comments on this not-yet-submitted paper have valid points. This exercise in simulated peer review need not be constrained by so many pedantic restrictions. It is so RealClimate-like. The criticisms or support are either scientifically valid, or they are not, regardless of the author’s identity. Snip this and lose this interested and up-to-now appreciative reader.

    REPLY: Willis made this request, he’s headed into night now where he’s at, so it may be a few hours before we hear back from him. However, there’s a reply from him further up in comments that addresses your issue.- Anthony

  54. Willis,

    One small suggestion. People who have not read Santer, et al (2005 or 2008), and who do not have easy access to these papers, would benefit from a more detailed summary/description of those papers and their methodology so that your contrasting results would be more meaningful.

  55. REPLY: Willis made this request, he’s headed into night now where he’s at, so it may be a few hours before we hear back from him. However, there’s a reply from him further up in comments that addresses your issue.- Anthony

    Thank you. You are a gentleman and a scholar. I did not see Willis’s response. It should not have been necessary. His undoubtedly sincere and strenuous effort will stand or fall on its own. I do commend him for graciously conceding his error.

  56. Since this does not have to be a formal review but a comment on the paper, and although Pamela’s second set of comments is very valid, I would suggest the following:

    1) Remove 75% of the content and aim of this paper. By this I mean concentrate predominantly on defining a metric for amplification and dealing with autocorrelation. Show what happens when taking a difference between surface and troposphere trends for various time periods. Is the persistence signal minimised, and is the autocorrelation of trends better accounted for? Show that it is possible to define a better metric by calculating a previous metric and comparing.
    2) Show all errors/uncertainties for the trends. This is especially important with reference to point 1. If the metric has less uncertainty than the ratios then introduce some examples for application and comment. If the mathematics of the metric are robust enough leave further detailed analysis of data sets for a different paper.

    3) Always use the passive voice (“The author notes that …”) or don’t mention the author at all.

    4) Lastly, make sure your uncertainties and errors have influence over the remit of your conclusions. This is not a typo. Be brutally honest about your analysis and its pros and cons. Also do not make statements that cannot be referenced or that appear general, i.e. “Climate is ponderous and never at rest”, which is also why cutting a lot of information out and just concentrating on the metric and the rationale for defining it is better.

    It’s an interesting analysis and a good start. Once the autocorrelation issues are better appraised then it could be very useful.
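    On point 2 above, a common way in this literature to attach uncertainty to a trend while respecting autocorrelation is to estimate the lag-1 autocorrelation of the residuals and shrink the effective sample size by (1 − r1)/(1 + r1) before forming the standard error. The sketch below is my own minimal illustration with synthetic AR(1) data; the function name, numbers, and seed are not from the paper:

```python
import numpy as np

def trend_with_ci(y, alpha_z=1.96):
    """OLS trend with an autocorrelation-adjusted 95% half-width."""
    n = len(y)
    x = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation
    n_eff = n * (1 - r1) / (1 + r1)                 # effective sample size
    se = np.sqrt(np.sum(resid**2) / (n_eff - 2) / np.sum((x - x.mean()) ** 2))
    return slope, alpha_z * se                      # slope and CI half-width

# Synthetic data: a known trend of 0.002 per step plus AR(1) noise.
rng = np.random.default_rng(0)
noise = np.zeros(360)
for i in range(1, 360):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0, 0.1)
y = 0.002 * np.arange(360) + noise

b, half = trend_with_ci(y)
print(f"trend = {b:.4f} ± {half:.4f} per step (autocorrelation-adjusted 95% CI)")
```

    The adjusted interval is wider than the naive OLS one, which is the point of the commenter’s request: trend differences that look significant under white-noise assumptions can vanish once persistence is accounted for.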

  57. Look, people, there are standard forms for many things. Examples: Recipes. Math proofs. Fables. Tables of Contents. Indexes. Citations. There really is a reason for these forms. The message is clearer if standard form is used. Can you imagine trying to find a page number of an index item if instead of a simple word and page numbers you get a rambling paragraph? It would certainly be a fresh way of doing things, but not very useful.

  58. Willis, you’d be well advised to listen to Pamela Gray.

    Pamela Gray (15:03:26) is essentially what you would get back from whatever journal you submitted your work to! They would reject your paper on style and it would be returned without any comments on the substance.

    Save yourself a step if you are truly serious about trying to get this published.

    Best of luck (even if I don’t agree with your conclusions!)

    Ben

  59. There needs to be a physical basis/explanation before carrying the analysis beyond the timelines of the forces at work here. 600 months is 50 years.

    The tropospheric amplification needs to have boundaries around the timelines which can be logically tied to the physical warming effects provided by surface temperatures. These are the timelines I consider relevant.

    18 hours – the greenhouse effect operates at the speed of light, modulated by atmospheric, land and ocean molecules’ ability to store up the energy represented by photons from the Sun. The energy represented by a photon from the Sun only spends 18 hours in the Earth system on average before it escapes to space. Other than the timelines noted below, all the amplification in the troposphere from surface temperatures occurs within that 18 hours – over 90% of the total amplification effect will be felt within this time period.

    35 days – surface temperatures on land peak about 35 days after the solstices. Land molecules can store up a few tens of Watts/metre^2 of energy from the Sun each day for 35 days – after that the energy starts to leak off and is lost to space – with a delay of 18 hours before the troposphere starts to feel this effect again.

    80 days – ocean surface temperatures peak about 80 days after the solstice. Just as the land surface temperatures lag behind the Sun’s energy by 35 days throughout the year, the ocean surface lags behind the Sun by about 80 days throughout the year. The troposphere feels this effect about 18 hours after the 80-day lag, again.

    Several years – the upper ocean down to several hundred metres lags behind surface temperatures by several years. The troposphere in the long-term could therefore lag behind the surface for several years (probably at a very minimal level) as the upper ocean absorbs some of the energy.

    1500 years – the deep ocean takes up to 1,000 years to catch up to surface temperatures. In the interval, the deep ocean may be absorbing energy from the surface that would otherwise be available to amplify the troposphere. Ice sheets could be slowly absorbing energy, and ice-sheet melt and vegetation changes can affect the Earth’s albedo and thus, on the same 1500-year timeline, affect the tropospheric amplification factor.

    So, generally I think much shorter timelines need to be incorporated and any timeline longer than several years can be left for future generations to examine.

  60. Willis,

    An interesting article. Here are a few comments that I have on it:

    (1) You have tried very hard here to push the models until they break…I.e., to find a more sensitive metric that can probe in more detail exactly what the data and models show, and that is great. However, I would suggest that you do the same in regards to the experimental data. Hence, I would recommend looking at, say, the RAOBCORE re-analysis (despite the claims that some may make that it has some bias…After all, there are lots of concerns about the general biases in the radiosonde data set that RAOBCORE is trying to correct). And, you could look at how sensitive your result (particularly concerning whether the amplification trend with time interval is up or down at the longer times) is to the overall trend in the data set, in order to understand how sensitive things are to possible artifacts in the data set that affect the overall multidecadal trends without having much effect on the shorter-term variability, since it is the longer term secular trends that are most subject to artifacts in the observational data sets.

    (2) I think that your conclusions are in general overstated based on the results that you show. Where you see the glass half-empty, I see it half-full. I.e., you seemed to consider it some sort of significant problem that you can find any disagreement between the models and the data. In fact, I would be shocked if you couldn’t, particularly given how hard you are trying to develop a metric to test things as diligently as possible. Honestly, when I look at Figs. 3 and 4, I say, “Wow…The models are doing pretty well at getting most of the basic features correct. Are there discrepancies? Sure…and it would be interesting to further probe these and understand what problems with the models or the data they could be due to. But, the take-away message is that there are many basic features that are quite robust across the data and the models and then also some notable differences in the details.” In essence, I think you are creating a strawman that is “The models are perfect” and then demolishing it by showing that they are not perfect. (Although the extent to which the discrepancy is due to imperfections of the models and the extent to which it is due to limitations or problems with the data is unclear. However, I am willing to imagine that at least some of it is due to imperfection of the models.) I think your presentation would seem more balanced if you had more emphasis on the points where the models and observational data do agree rather than only the points where they differ.

    (3) You comment on Santer et al.’s statement that “Other observations show weak or even negative amplification”. However, I know that their statement of negative amplification in particular applied to the UAH data that was available at the time, which was just before a major fix was applied to that data that I am pretty sure changed the amplification (at least as they defined it) from negative to positive. So, it isn’t really fair to say this statement by Santer is wrong since it is based on the UAH data that has been corrected between the time they made that statement and the time you have done your analysis. [See here for UAH’s record of the corrections to their data: http://vortex.nsstc.uah.edu/data/msu/t2lt/readme.17Apr2009 I think that the correction that I am referring to is the one from 7 Aug 2005. (Note the statement “This artifact contributed an error term in certain types of diurnal cycles, most noteably in the tropics.”)]

    (4) Just to point you in this direction in case you missed it, Arthur Smith posted some stuff on his blog on the tropical tropospheric amplification issue using your metric as a jumping-off point here: http://arthur.shumwaysmith.com/life/content/hot_spot_redux_analysis_of_tropical_tropospheric_amplification I have to admit that I haven’t read that in a long time and even then not that carefully, but I thought you might want to have a look at it to see where it agrees and disagrees with your analysis.

    (5) I agree with some of Pamela’s stylistic points but I disagree with her on always using the passive voice…I think that is somewhat “old school” thinking these days and the scientific community has begun to catch up with the rest of the world in rejecting the need to use the passive voice so exclusively. In particular, I have noticed that overuse of the passive voice sometimes even makes it difficult to determine when the author is referring to work that he/she has done versus when the reference is to the work of others. Having written almost all of my papers with co-authors, I am not sure what the guidelines are for using “I” vs “we” though. I kind of think that “we” sounds better, but I know that Physical Review once had a prohibition against the use of “we” for a single-author paper that led an author to once add his dog as a co-author so that he could keep the “we”!

  61. Brian, I think some answers to your questions may be found at the Christy link in the post just before yours.

  62. I will not comment on the science of the paper as it is far outside my specialty, but I have two suggestions. In the Observations and Models section, first paragraph, you mention choosing models at random; you clarify this in the next sentence, but I suggest just leaving the sentence out or using different wording. “Random” sort of implies you looked at a bunch and picked 5 from a hat or some such thing. My second suggestion is to evaluate the scales and symbology on the graphs. The third set of graphs, with a constant Y axis, was much easier to compare. Sometimes doing this makes things more difficult to read and a larger vertical magnification helps, but I would at least look into it. Also, using only color to separate lines on a plot makes it tough for the color blind out there; in addition, the few print technical journals that I read do not incorporate color, although on-line versions often do.

    I would also try and incorporate Pamela’s suggestions on style.

  63. Dear Joel,

    Gawd. I’m old school. Listen here, sonny boy. I can run circles around you without even having any ovaries! Gawd-damned whipper snapper. That said, your comments are quite good.

  64. Willis,

    A couple more comments:

    (6) You say “The oddity in Fig 1(a) is that I had expected the amplification at higher altitude (T2 and TMT) to be larger than at the lower altitude (T2LT and TLT) amplification. Instead, the higher altitude record had lower amplification. This suggests a strong stratospheric influence on the T2 and TMT datasets.” I think this is a fairly well-known issue so I think it is not really such an oddity and it is a good place to make reference to some previous literature (the lack of such references being a general weakness of this paper that others have already pointed out).

    (7) You say

    Even in scientific disciplines which are well understood, taking the position when models disagree with observations that “more plausibly” the observations are incorrect is adventurous. In climate science, on the other hand, it is downright risky.

    However, if one looks at the context in which Santer et al. made that claim, it seems to be in arguing that they believe that the tropical tropospheric amplification as seen in the models, and not so much in the data at multidecadal timescales, really is there. And, strangely enough, it seems to me that your analysis seems to confirm this basic fact…In fact, you seem to find that the amplification is there in the data (perhaps a little weaker at the multidecadal timescales than on average in the models, although it depends on which model you compare to). So, you seem to chide Santer et al. for their hubris at the same time as you confirm their basic point that the amplification is really there at the multidecadal timescales. One thing that confuses me a bit though is why you seem to see it in the data sets whereas they don’t…I.e., how does your analysis of the data (in the 30-year limit) differ from theirs (besides the fact that you used the corrected UAH data set)? Is it the difference between the length of the data set that you used vs. what they used (since they didn’t have 30 years yet)? Or is there some other difference in the definition of the amplifications, or what?

  65. Wally, your comment on “random selection” is close to the central issue of Willis’ method. There are several vitally important parameters regarding “random selection” that researchers should be well-schooled in. It would be wise to study these when doing any kind of small-subject pseudo-random selection. Decisions regarding the choice of statistical analysis depend on subject number and “kind” of randomness.

    Regarding my own research, because my study subjects served as their own control, I needed to find subjects who had clear ABRs with clicks (white noise clicks) in order to find frequency-specific responses to pips. That meant that I had to perform thousands of data collection runs in order for the F to be acceptable for such a small subject number (6). And back then, I hand entered the data using punch cards. Sonny.

  66. Pamela is right and gives good advice. So does Michael Corbett.

    It is worth taking the time to polish good work into concise professional presentation and it makes it much easier and more rewarding to read and review.

    An important part of any paper is identifying what remains uncertain to anticipate any criticism. Be the first to find your own weaknesses.

  67. Willis,

    Do not be discouraged by format/form comments. Although it is true that if you do not submit your income taxes in the appropriate format they will not be accepted by the authorities, it is not true that a scientific paper in a format different from institutionalized science can be rejected by authority. This is simply because there are no authorities in science … no appeal-to-authority fallacies in the final outcome.

    Go for it. I will read your paper in more detail as my schedule permits.

    Regards,
    John

  68. OK, first things first. Got home, haven’t even read the thread.

    The data, the function, and the R code used to generate all the graphics are located at:

    http://homepage.mac.com/williseschenbach/.Public/Amplification.zip

    There are PDF and Word DOC copies of the original post, “A New Amplification Metric” (about 2 MB each), at

    http://homepage.mac.com/williseschenbach/.Public/Tropical_Amplification.pdf

    and
    http://homepage.mac.com/williseschenbach/.Public/Tropical_Amplification.doc

    My apologies for not posting this with the original paper, life gets away from you sometimes.

    Anthony, if you’d be kind enough to post these links at the foot of the original paper, it would be much appreciated.

    Onwards,

    w.

  69. Willis,

    I may be way off base, but I do not see that what you did shows what you claim. Amplification is an integral value of running instantaneous temperature ratios. If the temporal resolution of the data record or of the model calculations is not adequate, assumptions have to be made and supported about what happens between points. You can average the amplification over different-length time periods, but each amplification point in the average still has to be the very-short-time-period ratio, and stretching out the time between points is not physically meaningful. As one comment stated, the physics of all the processes that probably affect the temperatures occur on a short time basis, except for deep-sea effects. Based on the above comments, the values for all periods of about 3 months are well less than 1, and thus there is no amplification greater than 1. If you look at 3-month records restarting at all times along the record and average them over the total time, you still get less than 1. Please reply.

  70. Willis,

    My previous comment was overly simplified. In order for the troposphere to add net heat to the Earth (i.e., have positive amplification), the very-short-time values of T to the 4th power of the troposphere minus T to the 4th power of the Earth have to be examined, so large day-to-night swings are not symmetrical in their effect. The integral value of this difference can then be converted to a short-time value of average amplification, which then has to be integrated over the desired longer time interval. I don’t think there is much data with the required temporal resolution to do this properly, but even a small amount of data, along with estimated swings for other data, may allow approximate values to be used. Only then do you have the effective amplification. This can then be averaged over longer periods, but not the way you did it. If each point in the short average is less than one, then all longer averages will still be less than one.
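    The commenter’s point about the asymmetry of T^4 is straightforward to check numerically. The toy calculation below is my own, not from the paper; the 288 K mean and the swing sizes are arbitrary. It shows that the time-mean of T^4 over a sinusoidal day/night cycle exceeds the fourth power of the mean temperature, with the excess growing with the size of the swing:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 100_000)   # one diurnal cycle (arbitrary units)
T_mean = 288.0                       # mean temperature in kelvin (illustrative)

excesses = []
for swing in (5.0, 15.0, 30.0):      # half-amplitude of the day/night swing, K
    T = T_mean + swing * np.sin(2 * np.pi * t)
    # Radiated power scales as T^4, so average T^4, not T, over the cycle.
    excesses.append(np.mean(T**4) / T_mean**4 - 1.0)
    print(f"swing of ±{swing:4.1f} K: mean(T^4) exceeds (mean T)^4 "
          f"by {excesses[-1]:.3%}")
```

    This is just Jensen’s inequality applied to the convex function T^4; it supports the commenter’s point that large short-period swings matter for radiative accounting, though by itself it says nothing about the sign or size of tropospheric amplification.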

  71. JohnV (08:37:51)

    1. I assume you need a surface temperature baseline to which tropospheric temperature is compared to calculate amplification. Perhaps I missed it, but what data set are you using for the surface temperature?

    At the KNMI web site listed in the paper, you can get the global surface temperature for any combination of latitude and longitude. I used the data at:

    http://climexp.knmi.nl/data/icrutem3_hadsst2_0-360E_-20-20N_n.dat

    My thanks, I should have included that in the data section. For the models, I used each model’s surface air temperature datasets. See the data section above.

    2. To me it looks like all of the models show tropospheric amplification over all time scales analyzed. To me that would mean that tropospheric amplification is a robust result of the models — the details are not the same but the general concept of amplification is robust. Your definition seems to be different. Why do you say that tropospheric amplification (in general) is not a robust feature of the models?

    “Robust” to me means that the different models would give the same results. While amplification occurs in all models, the results are very different. The amplifications shown by the models have little in common, as Fig. 4 shows.

    Glug (09:19:14), thanks for your pertinent questions.

    You have no error bars on your plots. Given the uncertainties attached to the observational data you use it’s necessary to present the error bars.

    See my discussion of error bars above.

    Why do you only select 5 models? The IPCC model output database contains output from all GCMs involved in the IPCC process. If you are only going to select a small subset of models you should give proper justification for doing so, demonstrate that this selection does not affect your results or use all of the available data.

    Because I live in the Solomon Islands, the datasets are huge, I collect what I can from hotel broadband when I travel, I have been stymied by the gigantic size of the datasets exceeding my computer’s memory, I don’t have thirty people helping me, and I work full time at my day job …

    Having said that, all of those are excellent questions. I can only report that I was surprised by the differences in how the models represent the atmosphere. I would be exceedingly surprised if I had picked five that were totally unrepresentative of the full group.

    However, had we but world enough, and time …

In addition, have you used ensemble model output or are you using single-realisation model output? You discuss the inter-model variability yet you include no discussion of how these model runs were set up. It’s therefore difficult to assess the inter-model variability. This is compounded by the apparent lack of an assessment of the ensembles and the ensemble means from each model. These are pertinent points in light of your conclusion that inter-model variability is unexpected and at odds with theory. You should focus on defending your central conclusion in a more thorough way.

    Excellent points. I have used individual model runs, as this is the output of the model. To see if the results were different in different runs, I compared two GISSE runs. The temporal evolution of amplification was nearly identical in the two runs. While this was only one test, the results were so close that it was clear that I was measuring something not affected by whether it was run one or run two, something more basic to and characteristic of the model itself.

    You’ve not included your references in this version. Including references would ease the burden on the reviewer.

Analyzing amplification by looking at how it evolves over time is a new method that, as far as I know, I invented, although it is likely it has been done before … so no references there. I referenced the three papers that I was discussing. I would have thought that those three papers provide all the background needed. What other ideas or statements would you like referenced?

    Some of your plots have very confusing legends. You should have either a standard legend at the bottom of each panel or have legends which describe each dataset in each plot.

    Thank you, I will attend to that. An excellent point.

    Robert (09:22:02) :

    Recommend that you remove this comment. IMHO, it is inconsistent with the spirit that you purport to want to create.
    “(Man, all those 34 scientists on that side … and on this side … me. I’d better attack quick, while I have them outnumbered … )”

    Agree, apologies, my poor attempt at humor …

    Sandy (09:24:10) :

    As the averaging length gets up towards 1/2 of sample length surely end effects swamp any signal?

    I would have thought so too, but a look at Fig. 4 shows great variety. Since those are thirty year subsamples of a fifty year dataset, it seems that the signal is not swamped for large subsamples.

    Peter Taylor (10:03:22) :

    Willis: Do you have any thoughts as to why there is a long term decline in the amplification factor as shown by the observational record?

My guess is that it is a signal of the temperature thermostat I describe in The Thermostat Hypothesis. Since the Earth’s temperature is always trending towards some central value, in the long term the amplification value will trend down. The models, on the other hand, contain no such natural thermostat. They have instead a CO2 ratchet that is guaranteed to raise their temperature over time. This built-in underlying trend leads to a long-term upward trend in amplification.

    And having very limited statistical faculties – could I ask: if there was a ‘phase change’ mid or late in the data record, would this drag the trend line down – put another way, does your calculation of trend obscure a cyclic pattern?

    I ask because from my reading of the climate models, they do not incorporate the major ocean cycles – such as the PDO, which appears to have a global signature over roughly 30 year time periods.

    If I understand your question, my analysis is separate from the question of cycles. Certainly they exist.

    timetochooseagain (10:07:08) :


It seems to me that you reckon that there is some physical reason why the amplification declines in the long term … would not a warm bias in the surface data introduce precisely that effect? Readers here are surely aware of the problems with the surface data by now. ;)

    See above.

The amplification can also be measured against the lowest level, 850 hPa. This is unaffected by any surface temperature problems. It shows the same decline in amplification in the upper levels over time.

    Brian Buerke (10:11:24) :


    1) At some point you need to deal with issues of uncertainty in the models and data (systematic, random weather noise, etc.). Scientists such as Santer and Schmidt will argue that everything is in agreement because the uncertainties are so large. Specifically in your case, are the downward slopes of the amplification versus time for the nonRSS data statistically different from the flat or increasing slopes of the models?

    I discuss uncertainties above, as shown in Fig. 4.

2) I am having a hard time reconciling your graphs with those from Santer et al. and Douglass et al. For example, Santer et al. (2008) shows trends for HadAT2 and RATPAC-A that are below the surface trends for ALL pressures. This would imply an amplification below 1 for those data sets, but you have amplifications well above 1 at 200 and 300 hPa. Why the discrepancy?

    As the title states, we are using different metrics. He is simply comparing the long-term trends. This does not measure whether one is an “amplified” version of the other. I use the slope of the regression line.
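To illustrate the difference between the two metrics with invented numbers (a hypothetical Python toy, not the paper’s actual R analysis; the series and helper are made up): a series can have only half the surface’s long-term trend, so its trend ratio is 0.5, and yet regress against the surface with a slope near 2 because its short-term swings are amplified.

```python
import math

def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

months = list(range(120))
# A shared "seasonal" wiggle, centered so it is orthogonal to the trend:
wiggle = [math.cos(2 * math.pi * (m - 59.5) / 12) for m in months]

sfc  = [0.0010 * m + 0.3 * w for m, w in zip(months, wiggle)]  # trend 0.0010/month
trop = [0.0005 * m + 0.6 * w for m, w in zip(months, wiggle)]  # half the trend,
                                                               # double the wiggle

trend_ratio   = ols_slope(months, trop) / ols_slope(months, sfc)  # ratio of trends: 0.5
amplification = ols_slope(sfc, trop)                              # regression slope: near 2
```

So comparing long-term trends and regressing one series on the other measure genuinely different things, which is the point being made above.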

    3) Using only two balloon data sets opens you to charges of “cherrypicking,” especially since these two were among the lowest-trend sets in Santer. What happens if your analysis is applied to RAOBCORE, IUK, and RICH, to name some others from Santer et al. (2008)?

None of those are what I would call datasets. They are complex computer analyses, using different algorithms. IUK is bizarre, with the 500, 300, and 200 hPa layers all together at about 1.5, the 700 hPa layer with an amplification of 1, and the 850 hPa layer with an amplification of 0.75. It is unlike anything else, models or observations. I looked at it a while ago; I could dig it out. I haven’t found RAOBCORE gridded data, just station data. You are right, I probably should have shown IUK, but the paper was already so long.

    Onwards …

    w

  72. A key message of your analysis (which you note) is that rising CO2 forcing in the models governs the shape of these curves, but the shape does NOT match reality. I agree. Your elegant method of analysis is exactly what is needed to detect this.
A practical matter is that you need lots more citations to look “scientifical”.

  73. Excellent paper Willis and your gamble on going public here looks to be paying off, with some good points raised on style and content.

    The poor fit between the models and the observations surprised me. I think the modellers need to do much more work on understanding the mechanisms of the amplification process, which is clearly non-linear.

    Perhaps a next step could be to try to understand the energy components which drive the system so that a better understanding of the ‘thermostat’ and its operation can be successfully modelled.

    Please keep up the good work.

  74. Willis,

    A brief note regarding the R source might be in order. It was pretty obvious but I got it wrong first so I’m putting the steps I have done here:

1) > source('amp_function_090701.R');
then
2) > source('Amplification_090701.R');

Also, after drawing figure 2(c), but without actually labeling it as (c), just “HadAT2 Quarterly Balloon Data, …”, I get the following error:
Error in legend(plotlength * legendh, plotb + (plott - plotb) * legendv, :
'legend' is of length 0

    I can probably debug this (though I’m quite an R novice so possibly not) but you probably want to fix it for others.

I don’t have much hope for this model of science when the second comment identifies the author as Anthony Watts, instead of the true author, who is using a pen name based on Gödel, Escher, Bach.

    REPLY: So quick to jump to conclusions, so unwilling to ask questions. The issue with me being identified as the author was about a 5 minute window where the author name disappeared from the top of the post when I first published it. Moving the position of the top ITCZ graphic right before publishing resulted in an accidental deletion of the name. It was fixed immediately and I pointed out that I was not the author. Note that it has not recurred. But if you want to use that argument to support your implied view that people that read these things are “not very smart”, be my guest. It is typical fodder of little merit.

As for your claim that Willis is using a pen name derived from “Gödel, Escher, Bach”, all I can say is “puhleeeeze”. Having corresponded with the man, I say, prove your point. Here’s mine:

Eschenbach has published: in 2004, Energy & Environment (E&E) (Vol. 15, No. 3) published a paper on sea level rise at Tuvalu by Willis Eschenbach. He’s described as an amateur scientist and “Construction Manager” for the Taunovo Bay Resort in Fiji. Will your next point be that they didn’t check his name?

    – Anthony

  76. Citations and references in the literature review:
In the literature review section, referencing means that in the body of the paper, when talking about other authors’ works, the reference is immediately tied to the quoted phrase, or to an unquoted paraphrase taken from those papers. In between these direct references you simply supply your own comments. Based on this definition, you have not provided proper citations and references, which leads to confusion on the part of the reader.

    Use of quotes:
There is a proper style of quotation marks and indentations for short quotes and for paragraph-long quotes. I would also never include an entire abstract of someone else’s work. These formats and styles make it easy for the reader to understand when the paper is citing someone else and when it is talking about your work. Once your review is finished, from then on the paper should concentrate only on your research. It is exceedingly rare to find citations of papers other than your own after the literature review is done.

    Red-Marking a poorly written work sample:
Different styles are used for different kinds of written work, for good reason. Without proper style and format, even a work of fiction would be hard to read. Again, a style manual is most helpful and makes the scientific offering believable. If you fail at style and format, your work will not be taken seriously, as it will “read” as very amateurish. It won’t matter if it is right or wrong. You could be the most intelligent person on Earth but won’t be able to flip hamburgers if you show up for your interview dressed in rags.

On a scale of 0 to 6, with 5 being a passing mark (as this is an adult paper):

    Ideas and Content: 5 (sufficient to produce several papers)
    Organization: 1 (does not show basic organization)
    Sentence Fluency: 3 (some sentences are hard to follow and flow is disrupted)
    Conventions: 1 (does not show basic use of conventions)
    Word choice: 4 (could improve use of standard technical words used in research writing)
    Voice: 3 (needs improvement when examining other papers and should use passive voice when referring to his own work)

    Willis, the bottom line is that if this paper travels beyond this blog, it will be dismissed out of hand. And that is my final word.

  77. Joel Shore (19:05:33) :

    Huh? In what way does this analysis “confirm” the reality of amplification on a multidecadal timescale? It seems to me that it confirms it on all but the multidecadal timescales.

    Joel Shore (18:20:50) :

    In comments above I explained that RAOBCORE has a KNOWN warm bias. To use it would be inappropriate given the KNOWN problems with it.

    You also make much hay of the UAH corrections issue. This has been blathered on about a lot. I don’t feel like getting into it, however at the very least I hope you are aware that research is tending to show UAH as superior (again, I have posts above where you can find this information). You also put some innuendo in about the “stratosphere” “problem” which UAH long ago developed a satisfactory correction for…the publications you presumably wish to see referenced are blatant crap.

  78. Pamela Gray (09:43:26) :

    “Willis, the bottom line is that if this paper travels beyond this blog, it will be dismissed out of hand. And that is my final word”

She is 100% right, Willis. Like I said before, if you are truly serious about getting this published then you need to follow the established formats in science writing. You will be dismissed outright on the organization of your paper, without a word or moment of consideration of its substance, if you choose to stick with your current format.

Getting published is a tough endeavor; you may as well do everything you can to stack the deck in your favor rather than make things more difficult for yourself. Saying that you “chose this way of publishing, because [you] have greater latitude” is really shooting yourself in the foot.

  79. Bravo for this attempt at open science. I have not read much of the paper yet,
    but will. Here are two comments about the “background” section.

    “Vice versa” does not say what you want to say. I think you need a sentence with
    “less than” in it. This comment is about presentation, not science.

    “Amplification” is a bad term for the subject. It implies a causal connection,
    and that has not been established. Perhaps the term is too well established
    to be changed. If so, the bias in the terminology should be explained.

    Thanks for your current effort, and your previous useful posts.

Pamela Gray (09:43:26), my apologies for the lack of clarity in my response to you. I am not disagreeing with you; you are correct. The paper is not in the scientific style, for the reasons you list. After a number of excellent points, which I have clipped and carefully saved in addition to your earlier postings, you say:

    Willis, the bottom line is that if this paper travels beyond this blog, it will be dismissed out of hand. And that is my final word.

    Your most clear set of instructions on how I should have written the paper is excellent, and you are quite correct. As much as I might wish it otherwise, we live in a world where, as you point out, scientific style can be more important than scientific substance, and my ideas may well be “dismissed out of hand” regardless of their correctness because of the way they are presented.

    I hope that is not your final word, as your previous words have been very useful. Should I choose to re-write this for submission to a journal, I have saved all of your words because they will be extremely helpful to me, either for this or future papers.

    My thanks to you for taking the time to fight my ignorance, more later, gotta run, 6:21 am, I’m out the door.

    w.

Willis. Bravo, bravo. Yes, open science is the way to go. The world is in love with the idea of speed and democracy, so let’s not worry too much about stuffy formalities. The ideas are the thing. Open debate. Don’t hide behind the screen. Coffee culture. Guys like Leif Svalgaard are showing the way.

    Some very clever people can’t spell. Are they to be disqualified? We will be the poorer for it.

    Now to the ideas:

“Amplification”. Are we accepting here that the greenhouse effect is real? What is the mechanism that is supposed to be producing the temperature gain? Is the rate of convection slowing? Is the atmosphere getting more viscous?

At 850 hPa there might be seen to be ‘amplification’, but the term is inappropriate. Release of latent heat drives temperature at this level, so if evaporation from the ocean increases, more energy is released at 850 hPa and the temperature rises faster there than at the surface … but only if the surface has the means to limit the energy gain that would drive up its temperature. The tropical oceans are a case in point, because they appear to be almost ‘energy saturated’. More energy into the system provokes very little temperature gain at the surface but a strong increase in temperature at 850 hPa over time. That response may be evident all the way up to 500 hPa. Last time I looked, the rate of increase at 850 hPa was about double that at the surface. Logically, if we wish to monitor the rate of energy gain by the Earth system, we get a better idea of what’s happening by observing the change at 850 hPa than at the surface.

Now let’s turn our attention to the 200 hPa level. Labitzke and van Loon, in their studies of the response of the atmosphere to the sun, found plenty of evidence for a response in the stratosphere. Yes. But the important thing to note is that the response could be observed down into the troposphere, and 200 hPa was the cut-off point. The parameter responsible for this response is ozone, which is generated in the upper stratosphere but is well conserved in the lower stratosphere due to the low temperature and the dryness of the air. Just remember that in the industrial manufacture of ozone, the air is dried by chilling it to minus 80°C. Conditions for conservation gradually deteriorate below the tropopause. But in certain parts of the globe the tropopause ‘folds’. High pressure cells of the subtropics also bring ozone into the troposphere.

So, the presence of ozone in the lower stratosphere/upper troposphere means that we are dealing with a medium that is heated from two sources. Long-wave radiation from the Earth is bang on, in terms of wavelength, for absorption by ozone, and ozone also responds to UVB from the sun.

    Let’s throw in another variable. If the tropical ocean warms convection increases and outgoing long wave radiation is observed to diminish. So, wherever ozone is present the temperature of the surrounding air will be observed to fall.

    So, what is this ‘amplification’ thing?

I have just put together a post for Climate change1 where I observe a strong response in 200 hPa temperature to the seasonal increase in total column ozone in mid-winter between 20 and 40°S latitude off the coast of Chile. It looks like this drives the Southern Oscillation.

Every time we have a sudden stratospheric warming at the poles, we can observe a jump in total column ozone across the globe. That too affects 200 hPa temperature. Just plot the monthly data for 200 hPa temperature against sea surface temperature and total column ozone and you will see what I mean. Do this for places chosen to represent low-ozone and high-ozone atmospheres and see what you get.

So, the variables that determine the temperature of the upper troposphere are numerous. I would expect the relationship between 200 hPa temperature and surface temperature to change by the month, the year, the place, and over time. What does it tell us? I don’t think we are clever enough to work out what it means. Do the modellers capture all this? I don’t think so.

    Just by the way. I love your urbanity and humility and the way you acknowledge a good suggestion. You are a gentleman as well as a scholar.

  82. Willis,

Thanks for responding to my points. I acknowledge that it is difficult, from a technical standpoint, to deal with vast volumes of model output. If you’re limited by that constraint I’d recommend trying to collaborate with someone who can handle that volume of data as, IMO, it would improve your analysis.

    Wrt references, I thought that by including 34 authors you had more than 3 refs. My mistake. I would recommend placing citations in text though.

  83. Willis

    Here is a link to a story I was reading that brought your paper to mind. You will note that the pilots reported rain and warm air in the cloud tops at fl390! (39000 feet) http://www.aviationweek.com/aw/blogs/commercial_aviation/ThingsWithWings/index.jsp?plckController=Blog&plckScript=blogScript&plckElementId=blogDest&plckBlogPage=BlogViewPost&plckPostId=Blog%3A7a78f54e-b3dd-4fa6-ae6e-dff2ffd7bdbbPost%3A6edfeb4b-d969-49f5-bdb3-2558c92f6ebb
The temps at that altitude are usually quite crisp. An example of clouds convecting heat. But at such high altitudes?

    Interesting. Sorry if O/T.

  84. Data Smoothing and Spurious Correlation.

    Just kidding Willis.

    Interesting work. Still reading it.

    Regards, Allan

Having been through the peer review process myself, I agree with Pamela Gray; in its present format, this “paper” will never become a paper. Even on a blog like this, you are not likely to get much feedback from scientists, because they don’t bother to read it when it’s not properly referenced and when the methodology is not properly explained. I’m not saying this to discourage you, but to help you improve in this and later work. :)

On the actual content:
-Your use of the term temporal evolution is not actually a temporal evolution. A temporal evolution would be a figure showing how the amplification changes with actual time, not with the size of your averaging window (I could be mistaken here, as it’s not really clear how you calculate the amplification). The temporal evolution, using its proper definition, would be more interesting in my view than what is shown herein.
-It’s common to use a Gaussian fit to estimate the maximum (that will give you your mean) when one only has a few points. However, there is an inherent bias in that method, called “peak locking”. This will bias the maximum to lock onto an integer value, and will be evident if you plot a histogram of all your Gaussian values at the mean. There is an extensive literature on this in the field of Particle Image Velocimetry (do a search on PIV and peak locking).
-State your conclusions in the abstract! Most readers will never read further than that.
-In your SOM 4, the description of the method for estimating the amplification is confusing. If you do not know what a method is called, find it out. Naming methods properly is important for replication, an important constituent of the scientific method. From what I can understand from SOM 4, I could probably do this in five minutes using Matlab, but that’s provided that I actually understand what you are doing.
-Further on SOM 4, state what you are doing in plain language. Are you calculating the linear regression for a running window, then averaging over these windows? And then increasing the window gradually to get what you call the temporal evolution?
-When you are referring to equations in other work, present the equation. This will help the reader.

Finally, it’s not a valid argument to blame the bandwidth of your internet connection for not doing the total work. Either do it properly, or don’t do it. You put yourself in a position to be accused of cherry picking if you can’t give a scientific reason for your selection of data.

    It’s a good start, but it needs major re-working if it’s going to be considered in any serious journal.
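For what it’s worth, if the method is as the questions above guess — a regression slope computed in each running window, those slopes averaged, and the window length then increased — it can be sketched as follows. This is my own reconstruction in Python (the paper’s code is in R), so the function and variable names are assumptions, not the paper’s:

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def amplification_vs_window(sfc, trop, window_lengths):
    """For each window length L: slide an L-sample window along the
    paired series, compute the regression slope of tropospheric on
    surface temperature in each window, then average those slopes.
    Plotting the result against L gives the 'temporal evolution'."""
    result = {}
    for L in window_lengths:
        slopes = [ols_slope(sfc[i:i + L], trop[i:i + L])
                  for i in range(len(sfc) - L + 1)]
        result[L] = sum(slopes) / len(slopes)
    return result

# Sanity check: if trop is exactly twice sfc, every window length gives 2.
import math
sfc  = [math.sin(i / 3.0) + 0.01 * i for i in range(100)]
trop = [2.0 * v for v in sfc]
evolution = amplification_vs_window(sfc, trop, [12, 24, 48])
```

If this reconstruction is wrong, that in itself supports the point above about stating the method in plain language.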

  86. timetochooseagain says:

    Huh? In what way does this analysis “confirm” the reality of amplification on a multidecadal timescale? It seems to me that it confirms it on all but the multidecadal timescales.

    The balloon data in Figs. 1 and 2 for the tropics show amplification factors of greater than 1 out to the longest times studied for the 200 hPa and 300 hPa levels. That in a nutshell is the tropical tropospheric amplification at multidecadal timescales.

    In comments above I explained that RAOBCORE has a KNOWN warm bias. To use it would be inappropriate given the KNOWN problems with it.

    So, one claim in a 2009 paper by Christy means the data is inappropriate? And, if that is the case, why use the original radiosonde data that is known to have cool bias artifacts due to better shielding of the temperature sensor from the sun over time? (See http://www.sciencemag.org/cgi/content/full/sci%3B309/5740/1556)

I’m not even saying that the RAOBCORE re-analysis data should be favored over the raw data, but just that we should see how sensitive Willis’s results are to the data set used … and particularly to how it is adjusted to correct for any possible trend artifacts at the multidecadal timescales.

    You also make much hay of the UAH corrections issue. This has been blathered on about a lot. I don’t feel like getting into it, however at the very least I hope you are aware that research is tending to show UAH as superior (again, I have posts above where you can find this information).

I am not making hay over it. And what I am talking about is a correction that UAH acknowledged needed to be made … and that they have now made … due to a significant error in their analysis. The issue has nothing to do with the issue you raise of which data set (UAH or RSS) is superior AFTER UAH made this correction.

I am just explaining to Willis why a statement made in the 2005 Santer paper seems to contradict what he found: the Santer paper was written before the UAH data set was corrected for this major error, and hence the data set they used differs significantly from the data set that Willis used. In particular, the data set prior to the correction actually shows a slightly negative temperature trend in T2LT in the tropics!

    You also put some innuendo in about the “stratosphere” “problem” which UAH long ago developed a satisfactory correction for…the publications you presumably wish to see referenced are blatant crap.

    Take up the point with Willis since he is the one who states (correctly in my view) that “this suggests a strong stratospheric influence on the T2 and TMT datasets.” I am merely pointing out that it would be most appropriate to reference the previous literature where this was already discussed.

    Honestly, I have no idea why you wrote your post except to be argumentative.

  87. By the way, I realize that what you may be confusing the stratospheric issue with is the issue raised by Fu et al. that even the UAH ***LOWER*** tropospheric temperature record may be contaminated somewhat by the cooling in the stratosphere. I don’t know what the current belief is on this claim of Fu et al. but that is a different issue than whether T2 and TMT are so contaminated, about which I think there is little or no controversy.

  88. Joel Shore (14:54:23) : “So, one claim in a 2009 paper by Christy means the data is inappropriate?” YES.

    “why use the original radiosonde data that is known to have cool bias artifacts due to better shielding of the temperature sensor from the sun over time?”

Who is???? I’ve heard about this somewhere before, though, and I remember there being more to it than that. But I know that the “raw data” were very different from the present data … Some older papers claimed, as the satellite analyses originally did, that there was no trend at all. This was subsequently corrected. But there are certainly important discussions about what biases do and do not still remain. I’m going to investigate this further.

    “Take up the point with Willis since he is the one who states (correctly in my view) that “this suggests a strong stratospheric influence on the T2 and TMT datasets.””

    Your “view” is, frankly, totally erroneous.

    “Honestly, I have no idea why you wrote your post except to be argumentative.”

    I have no idea why you responded then…

  89. WRT radiosonde biases, from:
    http://www.pas.rochester.edu/~douglass/papers/Published%20JOC1651.pdf

“Several investigators revised the radiosonde datasets to reduce possible impacts of changing instrumentation and processing algorithms on long-term trends. Sherwood et al. (2005) have suggested biases arising from daytime solar heating. These effects have been addressed by Christy et al. (2007) and by Haimberger (2006). Sherwood et al. (2005) suggested that, over time, general improvements in the radiosonde instrumentation, particularly the response to solar heating, has led to negative biases in the daytime trends vs nighttime trends in unadjusted tropical stations. Christy et al. (2007) specifically examined this aspect for the tropical tropospheric layer and indeed confirmed a spuriously negative trend component in composited, unadjusted daytime data, but also discovered a likely spuriously positive trend in unadjusted nighttime measurements. Christy et al. (2007) adjusted day and night readings using both UAH and RSS satellite data on individual stations. Both RATPAC and HadAT2 compared very well with the adjusted datasets, being within ±0.05 °C/decade, indicating that the main cooling effect of the radiosonde changes was evidently detected and eliminated in both. Haimberger (2006) has also studied the daytime/nighttime bias and finds that ‘The spatiotemporal consistency of the global radiosonde dataset is improved by these adjustments and spurious large day–night differences are removed.’ Thus, the error estimates stated by Free et al. (2005), Haimberger (2006), and Coleman and Thorne (2005) are quite reasonable, so that the trend values are very likely to be accurate within ±0.10 °C/decade.”

References:
Christy JR, Norris WB, Spencer RW, Hnilo JJ. 2007. Tropospheric temperature change since 1979 from tropical radiosonde and satellite measurements. Journal of Geophysical Research-Atmospheres 112: D06102, DOI:10.1029/2005JD006881.
Coleman H, Thorne PW. 2005. HadAT: An update to 2005 and development of the dataset website. Data at .
Free M, Seidel DJ, Angell JK. 2005. Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC): a new dataset of large-area anomaly time series. Journal of Geophysical Research 110: D22101, DOI:10.1029/2005JD006169.
Haimberger L. 2006. Homogenization of radiosonde temperature time series using innovation statistics. Journal of Climate 20: 1377–1402.
Sherwood SC, Lanzante JR, Meyer CL. 2005. Radiosonde daytime biases and late-20th century warming. Science 309: 1556–1559.

    The Radiosonde data appear to be very accurate, thank you.

  90. Mr. Eschenbach-

    I applaud your approach to review. A thought came to mind as to how you, and most others, are approaching your research. It seems that everyone is trying to measure the Earth’s “pulse”, each by grabbing a different appendage, and discovering how different things are, tactilely.

    It leaves some “holes” as far as potential systems are concerned.

Just a thought, but yours are much more impressive than my snipes.

Good luck. (And I only have one person left to call, on our other subject; the others are no longer with us. Depressing.)

    I wish you well, no need to respond, concentrate on the others, please.

I wouldn’t call ±0.10 °C/decade “very accurate”, but whatever. Let’s assume Christy and co. are right. Then Willis could try imposing a ±0.10 °C/decade change to the trend of the data and see how that affects things. That’s a pretty significant change.
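In case it helps, the perturbation itself is simple to set up. A minimal Python sketch (a hypothetical helper of my own, not from the paper’s R code): add a linear ±0.10 °C/decade ramp to the monthly series before recomputing the amplification.

```python
def impose_trend(series, delta_c_per_decade, steps_per_year=12):
    """Return a copy of a monthly series with an extra linear trend of
    delta_c_per_decade (°C per decade) added, to test how sensitive
    the amplification results are to the stated ±0.10 °C/decade
    uncertainty in the radiosonde trends."""
    per_step = delta_c_per_decade / (10.0 * steps_per_year)
    return [v + per_step * i for i, v in enumerate(series)]

# Perturb a flat ten-year monthly record by +0.10 °C/decade:
flat = [0.0] * 121
warmed = impose_trend(flat, +0.10)
# warmed[120] - warmed[0] is 0.10 °C after exactly ten years
```

Running the full analysis on the raw, +0.10, and -0.10 versions of each data set would show whether the conclusions survive the stated uncertainty.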

  92. Joel, you propose a very interesting experiment regarding the trend. I’m out of here tomorrow on a week-long holiday, but I will definitely run the numbers when I return.

    Also, after two years overseas I’ve moved back to Nowherica now, so I’ll have fast internet. And I have a new computer with larger memory. So I plan to continue downloading and analyzing the models. I would note, however, that I have already done one model from about a third of the modeling groups (some of which submit two different versions of a given model, like the GISS-EH and GISS-ER variants.) So I’d be surprised to find that all of the rest of the models were very similar to the observations.

    Anyhow, I’m off on holiday, back soon with more responses and more models.

    My thanks to all for your review of the paper,

    w.

  93. I did six more models and the Yale kriged data, and tried to post a picture.

    w.

    REPLY: Post them to flickr or tinypic and then link with a URL
    – Anthony

  94. I posted them on my web storage space and linked to them as follows:

    No good?

    w.

    REPLY: That works, you may want to repost the link with explanation – A

  95. The previous link is to the amplification of another six models. The variation in thirty-year subsets is shown at

    As you can see, my previous prediction, viz:

    So I’d be surprised to find that all of the rest of the models were very similar to the observations.

    was 100% correct. Not a one of these looks like the observations.

    More later,

    w.

  96. Someone asked why I had not included the RICH and RAOBCORE datasets … the answer is that they have very few datapoints in the area of interest (20°N/S), viz:

    w.

  97. Willis,

    Your paper first needs to show the physics of what you are doing. As it stands you jump in with an assumption with no justification that I can see, and claim that this means something. You have not shown why your analysis shows feedback. You only show the ratio of the long term relative levels of variation. Since the ground is a boundary, vertical convection is restricted. At altitude, large vertical convection currents can change the temperature much more, so I don’t see how looking just at the relative variation amplitude shows anything other than large vertical convection effects, which you would expect in the tropics. The whole idea of a “hot” spot is that the integrated effect of the temperature at altitude is hotter than it would be for no amplification, not that it varies up and down more. I do not see how your analysis addresses that issue.

  98. lweinstein (07:53:18), thank you for your reply. You say inter alia:

    Your paper first needs to show the physics of what you are doing. As it stands you jump in with an assumption with no justification that I can see, and claim that this means something.

    Perhaps I misunderstand your point, but I ascribe no meaning to the amplification behavior of the atmosphere. I am merely trying to understand and measure the phenomenon.

    “Amplification” is the term used to describe the fact that the atmospheric temperature varies more than the surface temperature. Does amplification “mean something”? Not that I know of. It is just one of the many atmospheric phenomena that we are trying to understand.

    However, the existence, size, and nature of amplification have become a bit of a battleground in the ongoing climate wars. Many claims have been made by both sides about how large it is, and in particular, about whether the models correctly simulate the amplification behavior of the atmosphere.

    As such, I decided to investigate the phenomenon to try to clear the air and place the claims on some firm footing. Having done so, I find that the models do a very poor job of simulating the amplification at various levels in the atmosphere. I do not claim that “means” anything either, it just points out another of the models’ many failings.

    However, perhaps I am not following your objection, and you could explain further.

    My best to you,

    w.

  99. Joel, I finally have the time to answer some of your interesting and cogent points. As I said above, what seems like a while ago now, I’ll do it piece by piece. To start:

    (1) You have tried very hard here to push the models until they break…I.e., to find a more sensitive metric that can probe in more detail exactly what the data and models show, and that is great. However, I would suggest that you do the same in regards to the experimental data. Hence, I would recommend looking at, say, the RAOBCORE re-analysis (despite the claims that some may make that it has some bias…After all, there are lots of concerns about the general biases in the radiosonde data set that RAOBCORE is trying to correct). And, you could look at how sensitive your result (particularly concerning whether the amplification trend with time interval is up or down at the longer times) is to the overall trend in the data set in order to understand how sensitive things are to possible artifacts in the data set that affect the overall multidecadal trends without having much effect on the shorter-term variability, since it is the longer term secular trends that are most subject to artifacts in the observational data sets.

    I have looked at the RAOBCORE gridded data (see the link a couple of posts upstream). The problem is that there is very little coverage in the tropics. As such, I don’t see how we can conclude much of anything from the RAOBCORE set. Perhaps there is a set with more coverage, but I haven’t found it.

    Regarding changes in trends, take a look at the analysis above of the AMSU data. In figure S-1(b), the two graphs are identical except for having different trends. As you suggest, this affects the long-term amplification but not the short-term.

    I would say, however, that the gradual decline in the long-term trends is real rather than an artifact. I say so because it appears in:

    1) All layers of the HadAT tropical data from 700 hPa to 200 hPa.

    2) All layers of the RATPAC tropical data from 700 hPa to 200 hPa.

    3) The UAH MSU T2LT and TLT data.

    4) All layers of the HadAT global data from 700 hPa to 200 hPa.

    5) All layers of the RATPAC global data from 700 hPa to 200 hPa.

    6) All layers of the RATPAC quarterly tropical data.

    7) All layers of the NCEP reanalysis data from 700 hPa to 200 hPa.

    On the other hand, this behavior does not appear in a single model.
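    [Ed. note: the timescale dependence being debated here can be illustrated with one plausible metric: regress block-averaged tropospheric anomalies on block-averaged surface anomalies at several averaging windows. This is only a sketch with synthetic data; the function `amplification` and the factor of 1.5 are illustrative assumptions, not the paper's actual metric or results.]

```python
import numpy as np

def amplification(tropo, surf, window):
    """Slope of tropospheric vs. surface anomalies after averaging
    both series into non-overlapping blocks of `window` months."""
    n = (len(surf) // window) * window
    s = surf[:n].reshape(-1, window).mean(axis=1)
    a = tropo[:n].reshape(-1, window).mean(axis=1)
    return np.polyfit(s, a, 1)[0]

# Synthetic 30-year monthly series in which the troposphere
# amplifies surface variability by a factor of 1.5:
rng = np.random.default_rng(1)
surf = rng.normal(0.0, 0.2, 360)
tropo = 1.5 * surf + rng.normal(0.0, 0.05, 360)

for w in (1, 12, 60):        # monthly, annual, five-year averaging
    print(w, round(amplification(tropo, surf, w), 2))
```

    Run against real data, a metric like this would show whether the amplification recovered at monthly scales persists, grows, or declines at multidecadal scales, which is the point of disagreement between the models and the observations listed above.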

    More to come, thanks for raising the issues,

    w.

  100. Joel, in #2 you say:

    (2) I think that your conclusions are in general overstated based on the results that you show. Where you see the glass half-empty, I see it half-full. I.e., you seemed to consider it some sort of significant problem that you can find any disagreement between the models and the data. In fact, I would be shocked if you couldn’t, particularly given how hard you are trying to develop a metric to test things as diligently as possible. Honestly, when I look at Figs. 3 and 4, I say, “Wow…The models are doing pretty well at getting most of the basic features correct. Are there discrepancies? Sure…and it would be interesting to further probe these and understand what problems with the models or the data they could be due to. But, the take-away message is that there are many basic features that are quite robust across the data and the models and then also some notable differences in the details.” In essence, I think you are creating a strawman that is “The models are perfect” and then demolishing it by showing that they are not perfect. (Although the extent to which the discrepancy is due to imperfections of the models and the extent to which they are due to limitations or problems with the data are unclear. However, I am willing to imagine that at least some of it is due to imperfection of the models). I think your presentation would seem more balanced if you had more emphasis on the points where the models and observational data do agree rather than only the points where they differ.

    I suspect that you are right that my conclusions are somewhat overstated.

    But saying that “the models are doing pretty well” is a whole lot different from saying “the models are good enough to make century long forecasts”. To do that, they need to do more than be in the general ballpark.

    I’m not the one who is saying “the models are perfect”. That would be the AGW supporters, since it would obviously take a model that is nearly perfect to forecast the evolution of climate over the next century.

    Next, I’d be glad to highlight where the models and the observations agree … I just can’t find them. None of the models show the slow increase over 8-10 years to a peak. This is seen in all of the data. None of the models show increasing amplification with height up to 200 hPa and decreasing after that, as theory suggests and as all observation datasets show.

    So I’m in a bit of a mystery here … what do you see as “the points where the models and observational data do agree”?

  101. (3) You comment on Santer et al.’s statement that “Other observations show weak or even negative amplification”. However, I know that their statement of negative amplification in particular applied to the UAH data that was available at the time, which was just before a major fix was applied to that data that I am pretty sure changed the amplification (at least as they defined it) from negative to positive. So, it isn’t really fair to say this statement by Santer is wrong since it is based on the UAH data that has been corrected between the time they made that statement and the time you have done your analysis. [See here for UAH’s record of the corrections to their data: http://vortex.nsstc.uah.edu/data/msu/t2lt/readme.17Apr2009 I think that the correction that I am referring to is the one from 7 Aug 2005. (Note the statement “This artifact contributed an error term in certain types of diurnal cycles, most noteably in the tropics.”)]

    Joel, thanks for pointing this out. Here’s the record in full:

    Update 7 Aug 2005 ****************************

    An artifact of the diurnal correction applied to LT
    has been discovered by Carl Mears and Frank Wentz
    (Remote Sensing Systems). This artifact contributed an
    error term in certain types of diurnal cycles, most
    noteably in the tropics. We have applied a new diurnal
    correction based on 3 AMSU instruments and call the dataset
    v5.2. This artifact does not appear in MT or LS. The new
    global trend from Dec 1978 to July 2005 is +0.123 C/decade,
    or +0.035 C/decade warmer than v5.1. This particular
    error is within the published margin of error for LT of
    +/- 0.05 C/decade (Christy et al. 2003). We thank Carl and
    Frank for digging into our procedure and discovering this
    error. All radiosonde comparisons have been rerun and the
    agreement is still exceptionally good. There was virtually
    no impact of this error outside of the tropics.

    I don’t think this is the source of the “negative amplification”, however. I had forgotten that they are not actually measuring amplification, but are simply measuring a ratio of trends. And with or without the adjustment, the UAH MSU trend over 1979-2005 is less than the HadCRUT trend over the same period.
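    [Ed. note: the trend-ratio measure referred to here amounts to dividing the fitted tropospheric trend by the fitted surface trend over the same period. A minimal sketch with idealized, noise-free stand-in series; the slopes 0.13 and 0.11 °C/decade are purely illustrative, not the actual UAH or HadCRUT values.]

```python
import numpy as np

def trend(t, y):
    """OLS trend of anomalies y against time t; with t in decades
    this returns the trend in °C/decade."""
    return np.polyfit(t, y, 1)[0]

# Idealized stand-ins: a surface series warming at 0.13 °C/decade
# and a tropospheric series warming at 0.11 °C/decade, mimicking a
# tropospheric trend that sits below the surface trend.
t = np.arange(324) / 120.0        # 27 years of months, in decades
surface = 0.13 * t
troposphere = 0.11 * t

ratio = trend(t, troposphere) / trend(t, surface)
print(round(ratio, 2))   # 0.85 -- a ratio below 1, i.e. no
                         # amplification by this definition
```

    Note that this ratio compares only the two multidecadal trends; it says nothing about amplification of the month-to-month variability, which is a different quantity.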

    I appreciate the heads-up.

    w.

  102. Joel, your comment on style is interesting, viz:

    (5) I agree with some of Pamela’s stylistic points but I disagree with her on always using the passive voice…I think that is somewhat “old school” thinking these days and the scientific community has begun to catch up with the rest of the world in rejecting the need to use the passive voice so exclusively. In particular, I have noticed that overuse of the passive voice sometimes even makes it difficult to determine when the author is referring to work that he/she has done versus when the reference is to the work of others. Having written almost all of my papers with co-authors, I am not sure what the guidelines are for using “I” vs “we” though. I kind of think that “we” sounds better, but I know that Physical Review once had a prohibition against the use of “we” for a single-author paper that led an author to once add his dog as a co-author so that he could keep the “we”!

    Pamela and others have made excellent points. Being a lone wolf in this one, I loved the addition of the dog …

  103. Joel, you bring up a good point:

    (6) You say “The oddity in Fig 1(a) is that I had expected the amplification at higher altitude (T2 and TMT) to be larger than at the lower altitude (T2LT and TLT) amplification. Instead, the higher altitude record had lower amplification. This suggests a strong stratospheric influence on the T2 and TMT datasets.” I think this is a fairly well-known issue so I think it is not really such an oddity and it is a good place to make reference to some previous literature (the lack of such references being a general weakness of this paper that others have already pointed out).

    My problem is that I’m too damn honest. People have busted me a number of times for saying “I was surprised by” … hey, I was surprised, what can I say?

    Yes, it is a “well-known issue” … but I was surprised to see how much that well known issue affected the amplification.

    And yes, more references are always good, and I plan to add more.

    However, there has been very little peer-reviewed publication on the subject of amplification in general. In addition, this is a new metric, so there is nothing published on e.g. how stratospheric contamination (which I should have referenced) might affect my metric of amplification.

    w.

  104. Joel, I’m not sure I agree with you when you say:

    (7) You say

    Even in scientific disciplines which are well understood, taking the position when models disagree with observations that “more plausibly” the observations are incorrect is adventurous. In climate science, on the other hand, it is downright risky.

    However, if one looks at the context in which Santer et al. made that claim, it seems to be in arguing that they believe that the tropospheric tropical amplification as seen in the models and not so much in the data at multidecadal timescales really is there. And, strangely enough, it seems to me that your analysis seems to confirm this basic fact…In fact, you seem to find that the amplification is there in the data (perhaps a little weaker at the multidecadal timescales than on average in the models although it depends on which model you compare to). So, you seem to chide Santer et al. for their hubris at the same time as you confirm their basic point that the amplification is really there at the multidecadal timescales. One thing that confuses me a bit though is why you seem to see it in the data sets whereas they don’t…I.e., how does your analysis of the data (in the 30-year limit) differ from theirs (besides the fact that you used the corrected UAH data set)? Is it the difference between the length of the data set that you used vs what they used (since they didn’t have 30 years yet); Or, is there some other difference in the definition of the amplifications or what?

    The part I disagree with is:

    However, if one looks at the context in which Santer et al. made that claim, it seems to be in arguing that they believe that the tropospheric tropical amplification as seen in the models and not so much in the data at multidecadal timescales really is there.

    I don’t read it that way at all. They said:

    These results suggest that either different physical mechanisms control amplification processes on monthly and decadal timescales, and models fail to capture such behavior, or (more plausibly) that residual errors in several observational datasets used here affect their representation of long-term trends.

    Seems pretty clear, they are saying it is “A” or “B”, where “A” is the models don’t capture all the mechanisms, and “B” is the observations are wrong.

    Is amplification there at longer timescales? Yes. But it is much more subtle and complex than their analysis suggests. They claim (but never show) that the models agree with the data at (unspecified) short timescales, but not at the longest timescales. In addition, they make the implicit (but totally unfounded) assumption that the amplification is basically constant over time.

    This constancy is seen in all of the models … and in none of the datasets. That strongly suggests to me that the models “fail to capture such behavior” … but as always, YMMV.

    However, the huge differences in the models definitely mean one thing:

    They can’t all be right.

    w.
