I got into this investigation of Argo because I disbelieved their claimed error of 0.002°C for the annual average temperature of the top mile of the ocean. I discussed this in “Decimals of Precision”, where I showed that the error estimates were overly optimistic.
I wanted to know more about what the structure of the data looked like, which led to my posts Jason and the Argo Notes, Argo Notes Part Two, and Argo and the Ocean Temperature Maximum.
This is the next part of my slow wander through the Argo data. In How well can we derive Global Ocean Indicators from Argo data? (PDF, hereinafter SLT2011), K. von Schuckmann and P.-Y. Le Traon describe their method for analyzing the Argo data:
To evaluate GOIs [global oceanic indicators] from the irregularly distributed global Argo data, temperature and salinity profiles during the years 2005 to 2010 are uploaded spanning 10 to 1500m depth.
and
To estimate the GOIs from the irregularly distributed profiles, the global ocean is first divided into boxes of 5° latitude, 10° longitude and 3 month size. This provides a sufficient number of observations per box.
So I thought I’d take a look at some gridcell boxes. I’ve picked one which is typical, and shows some of the issues involved in determining a trend. Figure 1 shows the location of the temperature profiles for that gridbox, as well as showing the temperatures by latitude and day. The data in all cases is for the first three months of the year. The top row of Figure 1 shows all of the temperatures for those three months (Jan-Feb-Mar) from all the years 2005-2011. The bottom row shows just the 2005 measurements. The following figures will show the other years.
Figure 1. Click on image for full size version. Gridcell is in the Atlantic, from 25°-30°N, and 30°-40°W. Left column shows the physical location of the samples within the gridbox. Colors in the left column are randomly assigned to different floats, one color per float. Right column shows the temperature by latitude. Small numbers above each sample show the day of the year that the sample was taken. Colors in the right column show the day the sample was taken, with red being day one of the year, shading through orange, yellow and green to end at blue at day 91. Top row shows all years. Bottom row shows 2005. Text in the right column gives the mean (average) of the temperature measurements, the standard deviation (StdDev), and the 95% confidence interval (95%CI) of the mean of the temperature data. The 95% CI is calculated as the standard error of the mean times 1.96.
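As a rough sketch of the calculation described in that caption, the snippet below computes the mean, the sample standard deviation, and a 95% CI taken as 1.96 times the standard error of the mean. The nine temperatures are made-up placeholders, not values from the Argo files, and the function name is only illustrative.

```python
import math
import statistics

def mean_with_ci95(temps):
    """Return (mean, sample std dev, 95% CI half-width) for a list of temperatures."""
    n = len(temps)
    mean = statistics.mean(temps)
    std_dev = statistics.stdev(temps)      # sample standard deviation (n - 1)
    std_err = std_dev / math.sqrt(n)       # standard error of the mean
    return mean, std_dev, 1.96 * std_err

# Hypothetical temperatures in deg C, for illustration only.
sample_temps = [21.1, 21.3, 21.4, 21.2, 21.6, 21.5, 21.3, 21.4, 21.7]
mean, std_dev, ci95 = mean_with_ci95(sample_temps)
print(f"mean = {mean:.2f}, StdDev = {std_dev:.2f}, 95%CI = +/-{ci95:.2f}")
```

Note that this formula treats the samples as independent draws from one population, which is exactly the assumption at issue in this post.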
Let’s consider the top row first. In the left column, we see the physical location of all samples that Argo floats took from 2005-2011. We have pretty good coverage of the area of the gridbox over that time. Note that the gridboxes are huge, about half a million square kilometres for this particular one. So even with the 216 samples taken over the 2005-2011 period, that’s still only one sample per 2,500 square km.
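The size and sampling density of the gridbox can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a spherical Earth of radius 6371 km; the function name is hypothetical, and the 216 samples are the count given above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def gridbox_area_km2(lat_south, lat_north, lon_west, lon_east):
    """Area of a latitude-longitude box on a sphere, in square km."""
    d_lon = math.radians(lon_east - lon_west)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * d_lon * band

# Gridcell from Figure 1: 25-30N, 30-40W (west longitudes written as negative).
area = gridbox_area_km2(25.0, 30.0, -40.0, -30.0)
samples = 216
print(f"area = {area:,.0f} sq km")
print(f"one sample per {area / samples:,.0f} sq km")
```

Under those assumptions the box comes out a little over half a million square km, or roughly 2,500 square km per sample.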
Next, let’s consider the top right image. This shows how the temperatures vary by time and by latitude. As you would expect, the further north you go, the colder the ocean, with a swing of about three degrees from north to south.
In addition, you can see that the ocean is cooling from day 1 (start of January) to day 91 (end of March). The early records (red and orange) are on the right (warmer) side of the graph. The later records (green and blue) are concentrated in the left hand (cooler) side of the records.
This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the two effects partly cancel and the spread of temperatures is reduced. If the float is moving north, on the other hand, it is moving into cooler waters, and in addition the water is generally cooling, so the spread is increased. It is unclear what effect this will have on the results … but it won’t make them more accurate. You’d think that the directions of the floats might average out, but no such luck: south is more common than north in these months for this gridcell.
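A toy model makes the direction effect concrete. The gradient of 0.6°C per degree of latitude comes from the roughly three-degree swing across the 5° box; the drift rate and the seasonal cooling per sample are invented numbers chosen only to illustrate the mechanism, and the function is hypothetical.

```python
import statistics

def float_temps(drift_deg_per_sample, n_samples=9,
                start_temp=21.5, gradient=0.6, cooling=0.05):
    """Temperatures seen by one float in a toy model.

    drift_deg_per_sample: latitude change per sample (negative = southward).
    gradient: deg C the water warms per degree of latitude moved south.
    cooling: deg C the whole gridcell cools between samples (assumed value).
    """
    temps = []
    for i in range(n_samples):
        latitude_effect = -gradient * drift_deg_per_sample * i  # south is warmer
        seasonal_effect = -cooling * i                          # everything cools
        temps.append(start_temp + latitude_effect + seasonal_effect)
    return temps

southward = float_temps(drift_deg_per_sample=-0.1)   # assumed drift rate
northward = float_temps(drift_deg_per_sample=+0.1)
print(f"southward StdDev = {statistics.stdev(southward):.3f}")
print(f"northward StdDev = {statistics.stdev(northward):.3f}")
```

In this sketch the southward float sees the two effects nearly cancel, while the northward float sees them add, so its standard deviation is much larger even though both floats sample the same water.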
A second problem affecting the accuracy can be seen in the lower left graph of Figure 1. It seems that we have nine measurements … but they’re all located within one tiny part of the entire gridbox. This may or may not make a difference, depending on exactly where the measurements are located, and which direction the float is moving. We can see this in the upper row of Figure 2.
Figure 2. As in Figure 1, with the top row showing 2006, and the bottom row 2007.
The effects I described above can be seen in the upper row, where the floats are in the northern half of the gridbox and moving generally southwards. There is a second effect visible, which is that one of the two floats (light blue circles) was only within the gridbox in the late (cooler) part of the period, with the first record being on day 62. As a result, the standard deviation of the measurements is small, and the temperature is anomalously low … which gives us a mean temperature of 20.8°C with a confidence interval of ± 0.36°C. In fact, the 95% confidence interval of the 2006 data does not overlap with the confidence interval of the mean of the entire 2005-2011 period (21.7° ± 0.12°C) … not a good sign at all.
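The non-overlap is easy to verify from the two intervals just quoted. A small sketch, with a helper name of my own choosing:

```python
def intervals_overlap(mean_a, half_width_a, mean_b, half_width_b):
    """True if the intervals mean +/- half_width share any value."""
    low_a, high_a = mean_a - half_width_a, mean_a + half_width_a
    low_b, high_b = mean_b - half_width_b, mean_b + half_width_b
    return low_a <= high_b and low_b <= high_a

# Values quoted above: the 2006 mean and the 2005-2011 mean, each with its 95% CI.
overlap = intervals_overlap(20.8, 0.36, 21.7, 0.12)
print("2006 CI overlaps all-years CI:", overlap)
```

The 2006 interval tops out at 21.16°C and the all-years interval starts at 21.58°C, so the check returns False.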
The 2007 data offers another problem … there weren’t any Argo floats at all in the gridcell for the entire three months. The authors say that in that case, they replace the year’s data with the “climatology”, which means the long-term average for the time period … but there’s a problem with that. The climatology covers the whole period, but there are more gaps in the first half of the record than in the latter half. As a result, if there is a trend in the data, this procedure is guaranteed to reduce that trend, by some unknown amount.
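Here is a toy demonstration of that effect. It assumes a made-up series warming at exactly 0.10°C per year, with the two earliest years missing and infilled with the mean of the years that do have data. This is a simplified stand-in for the infilling step, not the authors' actual climatology.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

years = list(range(2005, 2012))
# Hypothetical "true" series warming at exactly 0.10 deg C per year.
true_temps = [21.0 + 0.10 * (year - 2005) for year in years]

# Assume the two earliest years have no floats in the gridcell.
missing_years = {2005, 2006}
observed = [t for year, t in zip(years, true_temps) if year not in missing_years]
climatology = sum(observed) / len(observed)   # mean of the years that have data

infilled = [climatology if year in missing_years else t
            for year, t in zip(years, true_temps)]

true_slope = ols_slope(years, true_temps)
infilled_slope = ols_slope(years, infilled)
print(f"true trend     = {true_slope:.3f} deg C/year")
print(f"infilled trend = {infilled_slope:.3f} deg C/year")
```

Because the infilled early years are pulled up toward the later, warmer average, the fitted trend in this sketch comes out well below the true one.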
Figure 3 shows the next two years, 2008-2009.
Figure 3. As in Figure 1 and 2, for 2008 (top row) and 2009 (bottom row).
2008 averages out very close to the overall average … but that’s just the luck of the draw, as the floats were split between the north and south. 2009 wasn’t so lucky, with most of the records in the south. This leads to a warmer average, as well as a small 95%CI.
Finally, Figure 4 shows 2010 and 2011.
Figure 4. As in Figure 1 and 2, for 2010 (top row) and 2011 (bottom row).
In the final two years of the record, we are finally starting to get a more reasonable number of samples in the gridbox. However, there are still some interesting things going on. Look at the lower right graph. In the lower right of that graph there are two samples (day 71 and 81) from a float which didn’t move at all over that ten days (see bottom left graph, blue circles, with “81” on top of “71”). In that ten days, the temperature in that one single location dropped by almost half a degree …
DISCUSSION.
In this particular gridcell, the averages for each of the years 2005-2011 are 21.4°C, 20.8°C, no data, 21.7°C, 22.0°C, 21.9°C, and 21.7°C. This gives a warming trend of 0.13°C/year, as shown in Figure 5.
Figure 5. Trend of the gridcell three-month temperatures
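The 0.13°C/year figure can be reproduced from the yearly means listed above. This sketch assumes an ordinary least-squares fit with 2007 simply left out; the post does not state how the line in Figure 5 was fitted.

```python
# Yearly Jan-Mar means quoted above; 2007 is skipped because it has no data.
yearly_means = {2005: 21.4, 2006: 20.8, 2008: 21.7,
                2009: 22.0, 2010: 21.9, 2011: 21.7}

years = list(yearly_means)
temps = list(yearly_means.values())
n = len(years)
mean_year = sum(years) / n
mean_temp = sum(temps) / n

# Ordinary least-squares slope: covariance of (year, temp) over variance of year.
sxy = sum((y - mean_year) * (t - mean_temp) for y, t in zip(years, temps))
sxx = sum((y - mean_year) ** 2 for y in years)
trend = sxy / sxx
print(f"trend = {trend:.2f} deg C/year")
```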
My question is, how accurate is this trend? Me, I’d say we can’t trust it as far as we can throw it. The problem is that the early years (2005, ’06, and ’07) way undersample the gridcell, but this is hidden because they take a number of samples in one or two small areas. As a result, the confidence intervals are way understated, and the averages do not represent a valid sampling in either time or space.
My conclusion is that we simply do not have the data to say a whole lot about this gridcell. In particular, despite the apparent statistical accuracy of a trend calculated from these numbers, I don’t think we can even say whether the gridcell is warming or cooling.
Finally, the law of large numbers is generally understood to relate to repeated measurements of the same thing. But here, two measurements ten days apart are half a degree different, while two measurements at the same time in different areas of the gridcell are as much as three degrees apart … are we measuring the “same thing” here or not? And if not, if we are measuring different things, what effect does that have on the uncertainty? Finally, all of these error calculations assume what is called “stationarity”, that is to say that the mean of the data doesn’t change over a sufficiently long time period. However, there is no reason to believe this is true. What does this do to the uncertainties?
I don’t have any answers to these questions, and looking at the data seems to only bring up more questions and complications. However, I had said that I doubted we knew the temperature to anything like the precision claimed by the authors. Table 1 of the SLT2011 paper claims a precision for the annual average heat content of the top mile of the ocean of ± 0.21e+8 Joules. Given the volume involved (414e+8 cubic kilometres), this means they are claiming to measure the temperature of the top mile of the ocean to ± 0.002°C, two thousandths of a degree …
As cited above, I showed before that this was unlikely by noting that there are on the order of 3500 Argo floats. The standard error of a mean shrinks with the square root of the number of samples, so if the SLT2011 numbers are correct and the error from 3500 floats is ± 0.002°C, it means that 35 floats could measure the temperature of the top mile of the ocean to a tenth of that accuracy, or ± two hundredths of a degree. This is highly unlikely. The ocean is way too large to be measured to plus or minus two hundredths of a degree by 35 floats.
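The scaling behind that argument is the usual one for a standard error: with independent samples, the error goes as one over the square root of the sample count. A minimal sketch, with a hypothetical function name:

```python
import math

def scaled_error(known_error, known_n, new_n):
    """Scale a standard error from known_n to new_n samples, assuming
    independent samples so that the error goes as 1 / sqrt(n)."""
    return known_error * math.sqrt(known_n / new_n)

error_35_floats = scaled_error(known_error=0.002, known_n=3500, new_n=35)
print(f"implied error with 35 floats: +/-{error_35_floats:.3f} deg C")
```

A hundred times fewer floats gives ten times the error, hence ± 0.02°C.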
Finally, people have the idea that the ocean is well-mixed, and changes slowly and gradually from one temperature to another. Nothing could be further from the truth. The predominant feature of the ocean is eddies. These eddies have a curious property. They can travel, carrying the same water, for hundreds and hundreds of miles. Here’s an article on one eddy that they have studied. Their illustration is shown as Figure 6.
Figure 6. Illustration of an eddy transporting water for a long distance along the south coast of Australia.
Figure 7 shows another example of the eddying, non-uniform nature of the ocean. It is of the ocean off of the upper East Coast of the US, showing the Gulf Stream.
Figure 7. Oceanic temperature variation and eddies. Blue box is 5° latitude by 10° longitude. Temperature scale runs from blue (10°C, 50°F) to red (25°C, 77°F). SOURCE
The blue rectangle shows the size of the gridcell used in SLT2011. The red circles approximate the distribution within the gridbox of the measurements shown in the bottom row of Figure 1 for 2005. As you can see, this number and distribution of samples is way below the number and breadth of samples required to give us any kind of accuracy. Despite that, the strict statistical standard error of the mean would be very small, since there is little change in temperature in the immediate area. This gives an unwarranted and incorrect appearance of an accuracy of measurement that is simply not attainable by sampling a small area.
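The point can be illustrated with a toy temperature field. Everything here is assumed for illustration: a purely linear north-south gradient of 0.6°C per degree of latitude, nine samples bunched in a 0.2° strip, and nine samples spread evenly across the box. Area weighting within the box is ignored.

```python
import math
import statistics

def box_temperature(lat):
    """Toy temperature field: 23 deg C at 25N, cooling 0.6 deg C per degree north."""
    return 23.0 - 0.6 * (lat - 25.0)

def mean_and_ci95(temps):
    """Mean and 95% CI half-width (1.96 times the standard error of the mean)."""
    sem = statistics.stdev(temps) / math.sqrt(len(temps))
    return statistics.mean(temps), 1.96 * sem

# Nine samples bunched in a 0.2 degree strip versus nine spread across the box.
clustered_lats = [29.0 + 0.025 * i for i in range(9)]
spread_lats = [25.0 + 0.625 * i for i in range(9)]

true_mean = box_temperature(27.5)   # mid-latitude value of the linear toy field
for label, lats in (("clustered", clustered_lats), ("spread", spread_lats)):
    mean, ci95 = mean_and_ci95([box_temperature(lat) for lat in lats])
    print(f"{label:9s} mean = {mean:.2f} +/- {ci95:.2f}  (box mean {true_mean:.2f})")
```

In this sketch the clustered samples report a very tight confidence interval around a mean that is nearly a degree away from the box average, while the spread samples report a much wider interval that does contain it.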
Why is this important? It is important because measuring the ocean temperature is part of determining the changes in the climate. My contention is that we still have far too little information to give us enough accuracy to say anything meaningful about “missing heat”.
Anyhow, that’s my latest wander through the Argo data. I find nothing to change my mind regarding what I see as greatly overstated precision for the temperature measurements.
My regards to everyone,
w.
Thanks, Willis. This is yet another lesson regarding the incredible complexity of our planet’s climate system. If the ARGO measurements are so uncertain, how much worse are the pre-ARGO numbers?
Are the ARGO floats randomly distributed? If not, could their distribution be affected by temperature-related events, eg, winds, currents, upwellings, etc?
I love this: “This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced.”
I KNOW this is not what you are saying… but this does suggest another convenient trick somebody could use to create Global Warming… just progressively shift the floats toward the Equator, or progressively delete the ‘colder’ ones.
Of course, I am sure that the top scientists entrusted with this vital data would never do such a thing, except, for deletions, on land.
Willis, why are the graphs in the right hand column labeled “Change in Temperature By Latitude and Day”, instead of “Temperature By Latitude and Day”? They don’t look like “changes”, and the text says “variation”, not “change”.
To estimate the GOIs from the irregularly distributed profiles, the global ocean is first divided into boxes of 5° latitude, 10° longitude and 3 month size. This provides a sufficient number of observations per box.
Seriously, has nobody at all in climate science actually learned calculus? Does nobody know what a bloody Jacobean is?
Of all the stupid ways of dividing the ocean into cells, this has to be the absolute dumbest. It underweights the equator, overweights the poles, leads to many underful/empty/sparse cells. I mean seriously — one would have to go out of one’s way to come up with a worse way of doing it.
If somebody ever wants to do this right, there are a number of ways to do it. None of them use equal partitioning in spherical polar angles. Either use an icosahedral tessellation:
https://ziyan.info/2008/11/sphere-tessellation-using-icosahedron/
or start by using a fine-grained tessellation and use e.g. an affinity model to build equal-population cells out of the tessellation triangles or use a gaussian overlap model, offhand. Of course all of these approaches require some actual work. Latitude and longitude is easy, and wrong.
Sorry, I’m just sayin’…
rgb
The golden fleecers at work.
And is the upshot of all this that the argo data is neither widespread nor sufficiently adequate over a long enough time series for any meaningful analysis? Certainly, extrapolating it into trends is seemingly futile over such poorly sampled areas (and even worse if one starts gridding and infilling ‘missing’ data with averages, etc). It appears that:
a) the argo data is showing apparent significant temp variation
b) the temp variation could be caused by any number of issues aside from the obvious time of year and lat/long ones.
c) add in ‘depth’ and we are looking increasingly like we are measuring a tiny, almost imperceptible fraction of a very large body of water!
d) any ‘average’ or ocean temp anomaly ‘measurement’ must logically, almost by default, be very suspect as a ‘real’ or usable value, with potential to be ‘adjusted’ as required.
Hmm – now, where have we seen such datasets before?? I fear the team will be able to use this data well!
Willis, you wrote: If the float is moving south, it is moving into warmer waters,
How can you identify the direction of movement of individual floats? Can you do it in these graphs, I mean, or have you that information without displaying it?
Willis, nothing is easier than asking questions, but I have another. You wrote: Colors in the left column are randomly assigned to different floats, one color per float.
Is that mapping one-to-one, so that all the orange dots represent the same float? The orange dots are in 3 clumps, lines actually, seeming to display the same float in 3 different years. If that is correct, why are there only 3 lines, instead of one for each year in 2005 – 2011? In figure 1, lower left panel, it looks as though there are data from only one float? Is that a correct interpretation, and why only for one float that year? Is that described in the ARGO metadata?
Nice analysis Willis. It makes me appreciate the degree of interpolation in the ARGO dataset that I was using with its monthly 1×1 resolution. It also makes me wonder about which types of analysis are appropriate and which are not using this data.
Given 180*360*71% ~ 46,000 ocean 1×1 gridboxes and ~700,000 profiles, this averages about 15 profiles per box or about 2 profiles per box per year over the 7 year period 2005-2011. The resolution has been getting better though.
Willis, about the upper right-hand corner of fig 1 you write: In addition, you can see that the ocean is cooling from day 1 (start of January) to day 91 (end of March). The early records (red and orange) are on the right (warmer) side of the graph. The later records (green and blue) are concentrated in the left hand (cooler) side of the records.
If the bands of dots can be interpreted as the motions of the floats, it looks like the floats are moving generally north. It would seem odd that the NH water would be cooling from day 1 to day 91 of the year, when insolation is increasing (Dec 21 (day -10) having been the shortest day). Is it a current change in that gridcell?
I grant you that the claimed precision (represented by the small reported s.d.) is fanciful. The temps can’t be considered as a random sample from any population. I think you have made that point well. At best, even within gridcells, they could only be considered as correlated samples from within strata.
Robert Brown: Does nobody know what a bloody Jacobean is?
“Bloody Jacobean” is pretty good. Did you perhaps mean “bloody Jacobian”? In all my trying encounters with Jacobians, I never met a bloody one.
Willis, I’m having trouble following your claim of +/- 0.002 deg C as the SLT2011 paper doesn’t seem to even discuss temperature accuracy. Plus they are talking about 15 year trends and 3500 Argo units. Their claims are:
Am I missing something?
Oops, I think I was mentally confusing the spelling of an architectural era with the matrix of differentials used to relate tensor forms to e.g. area or volume elements in a coordinate system. Sorry;-)
As for the blood, it comes from my having pounded my head against them until it bled a few times, largely in trying to figure out N-dimensional spherical geometries…
rgb
The data is very sparse and unevenly distributed. If we averaged all the data for each year you’d get something that might be called the global ocean temp. for that year, plus or minus about 5 degrees. Not useful, but accurate.
@Steve from Rockwood. Willis explains his numbers in the first link of this article. The SLT2011 paper, as you note, doesn’t address temperature accuracy.
Septic Matthew says:
February 29, 2012 at 2:03 pm
Because I created and printed them one at a time, it’s a slow process, and I didn’t think of that detail ’til I was done … at which point I said something which likely sounded very much like “aw, bugger it”, but likely was something else …
w.
Hi Willis
I’m your Gualala neighbor. When you wrote: “This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced.”
Did you mean when the float is moving south … the water generally is warming, not cooling?
Robert Brown says:
February 29, 2012 at 2:09 pm
Thank you very much, Robert, for saying what needs to be said. The sad news is that the AGW climate science community doesn’t intersect with actual scientists at too many points, so it tends to do things according to the time tested methods used by Lavoisier and Arrhenius … or perhaps even before that. I’ve suggested the use of kriging, but just about anything would be better than this method.
Their method also suffers from the foolishness of being divided into months, which also introduces spurious values into the outcome … but none of this ever seems to sink in to the climate folks, who generally deal with objections by saying they’ve “moved on” ™ since whatever it is you are pointing out is wrong, and they now have some newer and stranger method that they are flogging.
Always good to hear from you,
w.
Septic Matthew says:
February 29, 2012 at 2:17 pm
It’s easier to see if you click on the graphic to see it full size. Above every measurement is a number, which is the day-number for that year. Since we’re looking at the first three months, they range from 1 to 91. Take a look in the left hand column for a float and you can see which way the numbers are getting larger. They take samples about every ten days, so the numbers will be something like “4, 14, 24, 34, 45, 55 …”
w.
Septic Matthew says:
February 29, 2012 at 2:22 pm
It’s a bit hard to tell, but I don’t think that many floats were in there more than once. The problem is that there are more floats than visible distinctions in the rainbow, especially on the web where only certain colors are allowed. Let me check …
OK, over the six years, here’s the results:
Float_ID, Samples
1900040, 2
1900074, 5
1900075, 27
1900638, 3
1900776, 9
1900779, 9
4900207, 9
4900208, 5
4900212, 16
4900444, 7
4900687, 1
4900729, 4
4900786, 2
4900805, 3
4900827, 4
4900840, 30
4901058, 9
4901205, 14
6900410, 12
6900412, 33
6900625, 1
6900771, 1
6900773, 4
6901118, 3
6901119, 3
In three months we’d expect nine samples, so as you can see, some of them are indeed in there twice, and two of them are in three times.
Regarding the first panel, yes, there’s only one float there during that three months. It’s just random chance, the Argo floats don’t follow a given path, they all just drift freely. I discussed their distribution in one of the previous articles on the subject.
w.
The experimental error of the ARGO float’s devices may be extremely small. That and about $2.50 will get you a fancy cup of coffee at some places. It would appear that the entire enterprise is based on two rather dubious assumptions. 1. The volume of water being measured is relatively homogeneous, and I am not making reference to the atomic composition of water. 2. That volume changes little over time within the measurement intervals and between intervals. I am still not convinced this whole ARGO thing is much more than some vast generalizations that sound nice but mean very little.
Steve from Rockwood says:
February 29, 2012 at 2:52 pm
Thanks, Steve. You need to convert units from heat content to temperature change. I disagree strongly with not expressing this at some point as temperature, because temperature is what is actually being measured, not heat content. Their claimed annual precision of the heat content (from their Table 1) is ± 0.21e+8 Joules. I used the volume of ocean being heated (listed above), along with the density (about 1.033) and the specific heat of the ocean (about 4 MJ per tonne per degree C) to do the conversion.
w.
majormike1 says:
February 29, 2012 at 3:15 pm
Hey, major, glad the North Coast is representing …
Sorry for the confusion. The water nearer the equator is generally warmer at any given instant than the water further away. However, over the period Jan-Feb-Mar, the whole gridcell is cooling. So it is moving south (warmer water) while there is cooling everywhere, and the two tend to cancel each other out.
w.
@Wayne2. Ah, I thought I was being lazy not going through all the posts first.
@Willis. Thanks. I’ll go through the older posts in more detail. I appreciate all the work you are putting into the Argo data and I’m trying to follow along.