Guest post by Willis Eschenbach
This is another of my occasional reports from my peripatetic travels through the Argo data (see the Appendix for my other dispatches from the front lines). In the comments to my previous post, I had put up a graphic showing how the January/February/March data for one gridcell varied by latitude and day of the year. Figure 1 shows that graphic:
As you can see, in this gridcell the ocean gets cooler as you go north. It also gets cooler from the first of the year to the end of March, although obviously that will change during the year as it warms and cools.
I decided to take a look at every bit of the available data for that gridcell, not just three months of each year. Figure 2 shows that result.
Figure 2. All Argo temperature measurements for the gridcell, from 2002 to 2012. (a) Upper panel shows measurements only (blue diamonds). (b) Lower panel shows measurements (blue) and the trend (yellow/orange circles) as estimated by a linear model involving the day of year and the latitude of each measurement.
Note that my linear model does a pretty good job of resolving the variation by day and by latitude. For example, in the coldest and warmest parts of 2010 and 2011, there were a variety of measurements. The model does a good job of replicating that.
Curiously, this linear model does a better job than another linear model I tried using a single variable, the insolation by day and latitude. You’d think the insolation data by day and latitude would encapsulate the day and latitude data I used in Figure 2(b). But that turned out not to be the case. It wasn’t nearly as successful at separating out the “by latitude” variation. Which was odd, because I hadn’t even bothered to use cosine(latitude) … hang on … OK, I just went and checked it with cos(latitude), and the results are indistinguishable. Not too surprising, I guess, it’s only a 5° slice so linear and cosine are not terribly different. It is surprising that it does so much better than the insolation data, though.
My model is fitted, of course. It’s kind of a linear model. For the day variable, I iteratively fit a function of the form
Temperature = (2 + Sin(2 * Pi * (JulianDay+Lag)/365.254)) ^ Somepower
JulianDay is the day count from some fixed starting point. Lag is an adjustment for the phase lag due to the delayed heating and cooling of the ocean mass. Somepower is the power to which the function is raised. The number 365.254 is days per year. The 2 * Pi converts the fraction of a year to radians for the Sin calculation.
This (2+Sin(time))^Somepower type function is a reasonable approximation of the form of a natural system where the heat input is cyclical and the heat loss is a power function of the temperature. See, for example, the ocean temperatures shown in cyan in Figure 1 here, which spike upwards to a point and then drop quickly in the summer, but have a rounded base in the winter.
In the case of the ocean, heat loss varies linearly with wind, it varies as T^2 with Clausius-Clapeyron for evaporation, and it varies as T^4 for radiation. Of course, all of that is modified by a host of factors. In this case, Somepower had a best fit at 1.4.
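For readers who want to try this kind of fit themselves, here is a minimal Python sketch. It is not the code behind the figures. It assumes the temperature is modelled as an intercept plus a scaled seasonal term plus a linear latitude term, it swaps the iterative fit for a plain grid search over Lag and Somepower, and it runs on synthetic data. The names `seasonal_shape` and `fit_model`, the grids, and the synthetic coefficients are all made up for illustration.

```python
import numpy as np

DAYS_PER_YEAR = 365.254  # value quoted in the post


def seasonal_shape(day, lag, power):
    """(2 + sin(2*pi*(day + lag)/365.254)) ** power, the form given in the post."""
    return (2.0 + np.sin(2.0 * np.pi * (day + lag) / DAYS_PER_YEAR)) ** power


def fit_model(day, lat, temp, lags, powers):
    """Grid-search lag and power; solve the linear coefficients by least squares.

    Assumed model: temp = c0 + c1 * seasonal_shape(day, lag, power) + c2 * lat
    """
    best = None
    for lag in lags:
        for power in powers:
            # Design matrix columns: intercept, seasonal term, latitude
            X = np.column_stack(
                [np.ones_like(day), seasonal_shape(day, lag, power), lat]
            )
            coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
            sse = float(np.sum((temp - X @ coef) ** 2))
            if best is None or sse < best["sse"]:
                best = {"lag": lag, "power": power, "coef": coef, "sse": sse}
    return best


# Synthetic stand-in for one gridcell (NOT Argo data): ten years of scattered
# measurements across a 5-degree latitude band, with invented coefficients.
rng = np.random.default_rng(0)
day = rng.uniform(0.0, 3650.0, 500)
lat = rng.uniform(40.0, 45.0, 500)
temp = (
    10.0
    + 3.0 * seasonal_shape(day, -60.0, 1.4)
    - 0.5 * lat
    + rng.normal(0.0, 0.1, 500)
)

fit = fit_model(
    day,
    lat,
    temp,
    lags=np.arange(-120.0, 1.0, 10.0),
    powers=np.arange(1.0, 2.01, 0.1),
)
print(fit["lag"], round(float(fit["power"]), 1), fit["coef"])
```

For a fixed Lag and Somepower the model is linear in its remaining coefficients, so each grid point costs only one least-squares solve. The residuals from a fit like this are what get examined for a trend later on.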
Moving forwards, I decided to compare the Argo data with the Reynolds sea surface temperature data. Figure 3 shows a comparison of the Argo float data with the Reynolds Optimally Interpolated (OI) temperatures, which are based on satellite plus other observational data. The Reynolds data is from KNMI.
Generally, this is good news, as the agreement between the two is quite good. I used a cubic spline to interpolate the Reynolds OI data to match the dates of the Argo data, and the match is surprisingly close.
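The interpolation step can be sketched as follows. This is an illustrative, hand-rolled natural cubic spline in NumPy, not the routine used for the figures. The end condition, the monthly spacing of the gridded series, and the names `natural_cubic_spline`, `reynolds_days`, `reynolds_sst` and `argo_days` are all assumptions, and the data are synthetic. A library routine such as SciPy's `CubicSpline` would be the usual choice in practice.

```python
import numpy as np


def natural_cubic_spline(x, y, x_new):
    """Evaluate the natural cubic spline through (x, y) at the points x_new.

    x must be strictly increasing and x_new must lie within [x[0], x[-1]].
    """
    x, y, x_new = (np.asarray(a, dtype=float) for a in (x, y, x_new))
    n = len(x)
    h = np.diff(x)
    # Linear system for the second derivatives m at the knots, with
    # m[0] = m[-1] = 0 (the "natural" end condition).
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0
    for k in range(1, n - 1):
        A[k, k - 1] = h[k - 1]
        A[k, k] = 2.0 * (h[k - 1] + h[k])
        A[k, k + 1] = h[k]
        rhs[k] = 6.0 * ((y[k + 1] - y[k]) / h[k] - (y[k] - y[k - 1]) / h[k - 1])
    m = np.linalg.solve(A, rhs)
    # Index of the knot interval that contains each target point.
    i = np.clip(np.searchsorted(x, x_new) - 1, 0, n - 2)
    left = x_new - x[i]
    right = x[i + 1] - x_new
    return (
        (m[i] * right**3 + m[i + 1] * left**3) / (6.0 * h[i])
        + (y[i] / h[i] - m[i] * h[i] / 6.0) * right
        + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left
    )


def true_sst(days):
    """Synthetic seasonal cycle standing in for the real ocean."""
    return 15.0 + 5.0 * np.sin(2.0 * np.pi * days / 365.25)


# Synthetic "gridded" series on a regular monthly grid covering ten years.
reynolds_days = np.arange(121) * 30.4375
reynolds_sst = true_sst(reynolds_days)

# Irregular float dates, as with Argo profiles.
rng = np.random.default_rng(0)
argo_days = np.sort(rng.uniform(100.0, 3500.0, 200))

sst_at_argo = natural_cubic_spline(reynolds_days, reynolds_sst, argo_days)
print(np.max(np.abs(sst_at_argo - true_sst(argo_days))))
```

Once the gridded series has been evaluated at the float dates, the two datasets can be compared point by point.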
Now here’s the oddity. The Reynolds data has a trend. The Argo data has no trend at all. Figure 4 shows the residuals, what is left after removing the cyclical portions of the signal. It also shows the linear trends of the residuals.
Figure 4. Residuals after the removal of confounding variables. Upper panel shows the Argo data after the removal of the day-of-year and latitude effects. The lower panel shows the Reynolds data after removal of the monthly averages.
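For the Reynolds panel, removing the monthly averages and fitting a linear trend to what is left might look like the sketch below. It is an illustration on synthetic monthly data, not the calculation behind Figure 4. The helper names, the 3652.5 days-per-decade conversion, and the invented 0.5°C per decade trend are assumptions.

```python
import numpy as np

DAYS_PER_DECADE = 3652.5  # assumed conversion factor


def remove_monthly_means(month, values):
    """Subtract each calendar month's average from that month's values."""
    month = np.asarray(month)
    residual = np.array(values, dtype=float)
    for m in np.unique(month):
        mask = month == m
        residual[mask] -= residual[mask].mean()
    return residual


def trend_per_decade(days, residual):
    """Ordinary least-squares slope, converted from per day to per decade."""
    slope_per_day = np.polyfit(days, residual, 1)[0]
    return slope_per_day * DAYS_PER_DECADE


# Synthetic monthly series (NOT Reynolds data): ten years of a seasonal
# cycle plus an invented trend of 0.5 degrees per decade and a little noise.
rng = np.random.default_rng(0)
days = np.arange(120) * 30.4375
month = np.arange(120) % 12
sst = (
    15.0
    + 5.0 * np.sin(2.0 * np.pi * days / 365.25)
    + 0.5 * days / DAYS_PER_DECADE
    + rng.normal(0.0, 0.02, 120)
)

residual = remove_monthly_means(month, sst)
trend = trend_per_decade(days, residual)
print(round(float(trend), 2))
```

Subtracting monthly means also removes the small within-year part of any trend, so the recovered slope comes out slightly below the one that was put in. The same `trend_per_decade` step applies to the Argo residuals once the day-of-year and latitude effects have been subtracted.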
I’m not entirely sure what to make of all of this. I am encouraged that the residuals of the Reynolds data are well correlated with those of the Argo data. Since the datasets are produced entirely independently and with no common procedural steps, this is good news. But the difference in the trends is quite large.
My guess about the reason for the difference? Bear in mind that of these two, only the Argo measurements are actual observational data. The Argo floats measure one point in time, to an accuracy of 0.005°C. The Reynolds dataset, on the other hand, is interpolated. You can find a description of the method and other data here. They say the interpolation is “optimal”, which always makes me nervous, but never mind that.
The problem is, nature doesn’t do interpolation. There isn’t any gradual transition between say the cool waters off of New York and the Gulf Stream, it is a sudden jump. And when I used to fish albacore off of the California coast, I learned that in the fall you drive the fishing boat offshore through green cold water covered with fog, miles and miles of fog and green water, and then suddenly you emerge from the fog bank to see blue warm water and clear skies, and a clear dividing line between the two … there’s no way to “interpolate” that.
So if you are interpolating between two areas that are warming, the intervening area will interpolate as warming as well … and that may or may not be the case.
As before, I am making no great over-arching claims for these results. It is one gridcell in a large ocean, and I’m using these posts to list and discuss some of the things I’m learning about it as I go. I’m working to assess the validity of the Argo data, compare it to other datasets, and take some guesses as to its accuracy. I do note that in this gridcell, over the full decade, the Reynolds and Argo trends each have a statistical error of plus/minus a tenth of a degree per decade. Yet the two trends differ from each other by half a degree per decade, and that’s a lot … so one of them has to be pretty wrong. My money’s on the Argo data as being the better of the two … but to what kind of accuracy?
There’s one final issue I want to discuss. The assumption is often made, by Hansen and others, that correlation between two datasets implies similar trends. Based on good correlation between stations, the GISS folks believe that you can use one dataset to extrapolate temperatures out 250 to 1200 kilometres away from the nearest stations. I have argued against this misconception in a post called “GISScapades”.
But look at these two datasets, Argo and Reynolds. The correlation between the Argo and Reynolds residuals is 0.66, and the correlation between the first differences of the residuals is 0.82, both quite good. There’s a reasonable amount of data, 953 data points. For climate science where small dissimilar datasets are the norm, those are impressive correlations, particularly given that the two analyses are totally independent.
Yet the Argo data shows no trend at all, 0.0°C per century … while the Reynolds data, which claims to be measuring the same thing at the same time, shows a trend of 5.1°C per century! (Yes, I know, it’s only a decade of data, I know you can’t extrapolate that out a century, and I’m not doing that. This is not a forecast of any kind. I mention the difference over a century purely because I want people to be clear about the size of the difference in trends between these two very well correlated datasets.)
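The general point, that two well-correlated series can carry very different trends, is easy to reproduce with made-up numbers. The sketch below uses purely synthetic series, not the Argo or Reynolds data. The noise levels and the 0.5 degrees per decade trend are invented, and only the count of 953 points is taken from the post.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 953                        # number of data points quoted in the post
t = np.linspace(0.0, 1.0, n)   # time, in decades

# Two synthetic series that share most of their variability. Only the
# second carries a trend (0.5 degrees per decade, an invented figure).
shared = rng.normal(0.0, 0.5, n)
a = shared + rng.normal(0.0, 0.2, n)
b = shared + rng.normal(0.0, 0.2, n) + 0.5 * t

# Correlation of the series and of their first differences
corr = np.corrcoef(a, b)[0, 1]
corr_diff = np.corrcoef(np.diff(a), np.diff(b))[0, 1]

# Least-squares trends, in degrees per decade
trend_a = np.polyfit(t, a, 1)[0]
trend_b = np.polyfit(t, b, 1)[0]

print(f"correlation of series:         {corr:.2f}")
print(f"correlation of first diffs:    {corr_diff:.2f}")
print(f"trend a, trend b (per decade): {trend_a:.2f}, {trend_b:.2f}")
```

Both correlations come out high while the trends differ by roughly the amount that was put in. Correlation is driven by the shared wiggles, and a slow drift in one series barely affects it.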
Go figure …
My best regards to everyone,
APPENDIX: Previous Posts on Argo