I got into this investigation of Argo because I disbelieved their claimed error of 0.002°C for the annual average temperature of the top mile of the ocean. I discussed this in “Decimals of Precision”, where I showed that the error estimates were overly optimistic.

I wanted to know more about what the structure of the data looked like, which led to my posts Jason and the Argo Notes, Argo Notes Part Two, and Argo and the Ocean Temperature Maximum.

This is the next part of my slow wander through the Argo data. In How well can we derive Global Ocean Indicators from Argo data? (PDF, hereinafter SLT2011), K. von Schuckmann and P.-Y. Le Traon describe their method for analyzing the Argo data:

To evaluate GOIs [global oceanic indicators] from the irregularly distributed global Argo data, temperature and salinity profiles during the years 2005 to 2010 are uploaded spanning 10 to 1500m depth.

and

To estimate the GOIs from the irregularly distributed profiles, the global ocean is first divided into boxes of 5° latitude, 10° longitude and 3 month size. This provides a sufficient number of observations per box.

So I thought I’d take a look at some gridcell boxes. I’ve picked one which is typical, and which shows some of the issues involved in determining a trend. Figure 1 shows the location of the temperature profiles for that gridbox, as well as the temperatures by latitude and day. The data in all cases is for the first three months of the year. The top row of Figure 1 shows all of the temperature measurements for those three months (Jan-Feb-Mar) from all the years 2005-2011. The bottom row shows just the 2005 measurements. The following figures will show the other years.

*Figure 1. Click on image for full size version. Gridcell is in the Atlantic, from 25°-30°N, and 30°-40°W. Left column shows the physical location of the samples within the gridbox. Colors in the left column are randomly assigned to different floats, one color per float. Right column shows the temperature by latitude. Small numbers above each sample show the day of the year that the sample was taken. Colors in the right column show the day the sample was taken, with red being day one of the year, shading through orange, yellow and green to end at blue at day 91. Top row shows all years. Bottom row shows 2005. Text in the right column gives the mean (average) of the temperature measurements, the standard deviation (StdDev), and the 95% confidence interval (95%CI) of the mean of the temperature data. The 95% CI is calculated as the standard error of the mean times 1.96.*
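For reference, the statistics in the caption are simple to compute. Here's a minimal Python sketch; the temperatures below are invented illustrative values, not the actual gridbox data:

```python
import math

def mean_and_ci95(temps):
    """Return the mean, sample standard deviation, and the 95% confidence
    interval of the mean, computed as 1.96 times the standard error."""
    n = len(temps)
    mean = sum(temps) / n
    # sample standard deviation (n - 1 denominator)
    stdev = math.sqrt(sum((t - mean) ** 2 for t in temps) / (n - 1))
    ci95 = 1.96 * stdev / math.sqrt(n)   # standard error times 1.96
    return mean, stdev, ci95

# hypothetical surface temperatures (deg C) for one box-quarter
temps = [21.9, 22.3, 21.5, 20.9, 22.1, 21.7, 21.2, 22.0, 21.6]
mean, stdev, ci95 = mean_and_ci95(temps)
print(f"Mean = {mean:.2f} C, StdDev = {stdev:.2f}, 95%CI = +/- {ci95:.2f}")
```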

Let’s consider the top row first. In the left column, we see the physical location of all samples that Argo floats took from 2005-2011. We have pretty good coverage of the area of the gridbox over that time. Note that the gridboxes are huge, half a million square kilometres for this particular one. So even with the 216 samples taken over the six-year period, that’s still only one sample per 2,500 square km.

Next, let’s consider the top right image. This shows how the temperatures vary by time and by latitude. As you would expect, the further north you go, the colder the ocean, with a swing of about three degrees from north to south.

In addition, you can see that the ocean is cooling from day 1 (start of January) to day 91 (end of March). The early records (red and orange) are on the right (warmer) side of the graph. The later records (green and blue) are concentrated in the left hand (cooler) side of the records.

This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced. If the float is moving north, on the other hand, it is moving into cooler waters, and in addition the water is generally cooling, so the spread is increased. It is unclear what effect this will have on the results … but it won’t make them more accurate. You’d think that the directions of the floats might average out, but no such luck, south is more common than north in these months for this gridcell.
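A toy model shows the effect. Assume, purely for illustration, a linear temperature field that is cooler to the north and cools through the quarter; the gradients below are invented, not fitted to the Argo data. A southbound float's drift partially cancels the seasonal cooling, while a northbound float's drift adds to it:

```python
import statistics

# Toy temperature field: cooler to the north, and cooling over the quarter.
def temp(lat, day):
    return 24.0 - 0.55 * (lat - 25.0) - 0.01 * day

days = range(1, 92, 10)                            # a profile every ten days
south = [temp(29.0 - 0.04 * d, d) for d in days]   # float drifting south
north = [temp(25.0 + 0.04 * d, d) for d in days]   # float drifting north

sd_south = statistics.stdev(south)
sd_north = statistics.stdev(north)
print(sd_south, sd_north)   # the southbound float shows the smaller spread
```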

A second problem affecting the accuracy can be seen in the lower left graph of Figure 1. It seems that we have nine measurements … but they’re all located within one tiny part of the entire gridbox. This may or may not make a difference, depending on exactly where the measurements are located, and which direction the float is moving. We can see this in the upper row of Figure 2.

The effects I described above can be seen in the upper row, where the floats are in the northern half of the gridbox and moving generally southwards. There is a second effect visible, which is that one of the two floats (light blue circles) was only within the gridbox in the late (cooler) part of the period, with the first record being on day 62. As a result, the standard deviation of the measurements is small, and the temperature is anomalously low … which gives us a mean temperature of 20.8°C with a confidence interval of ± 0.36°C. In fact, the 95% confidence interval of the 2006 data does not overlap with the confidence interval of the mean of the entire 2005-2011 period (21.7° ± 0.12°C) … not a good sign at all.

The 2007 data offers another problem … there weren’t any Argo floats at all in the gridcell for the entire three months. The authors say that in that case, they replace the year’s data with the “climatology”, which means the long-term average for the time period … but there’s a problem with that. The climatology covers the whole period, but there are more gaps in the first half of the record than in the latter half. As a result, if there is a trend in the data, this procedure is **guaranteed to reduce that trend**, by some unknown amount.
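A toy example (all numbers invented) shows why infilling the early gaps with the climatology damps a trend:

```python
# A steady 0.1 C/yr warming, with the missing early years replaced by the
# long-term mean ("climatology"), exactly as described above.
def slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

years = list(range(2005, 2012))
true_temps = [21.0 + 0.1 * (y - 2005) for y in years]   # slope exactly 0.1
climatology = sum(true_temps) / len(true_temps)

# Suppose 2005 and 2007 had no data and are infilled with the climatology:
infilled = [climatology if y in (2005, 2007) else t
            for y, t in zip(years, true_temps)]

print(slope(years, true_temps))   # 0.1
print(slope(years, infilled))     # smaller: the infilling damps the trend
```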

Figure 3 shows the next two years, 2008-2009.

2008 averages out very close to the overall average … but that’s just the luck of the draw, as the floats were split between the north and south. 2009 wasn’t so lucky, with most of the records in the south. This leads to a warmer average, as well as a small 95%CI.

Finally, Figure 4 shows 2010 and 2011.

In the final two years of the record, we are finally starting to get a more reasonable number of samples in the gridbox. However, there are still some interesting things going on. Look at the lower right graph. In the lower right of that graph there are two samples (day 71 and 81) from a float which didn’t move at all over that ten days (see bottom left graph, blue circles, with “81” on top of “71”). In that ten days, the temperature in that one single location dropped by almost half a degree …

DISCUSSION.

In this particular gridcell, the averages for each of the years 2005-2011 are 21.4°C, 20.8°C, no data, 21.7°C, 22°C, 21.9°C, and 21.7 °C. This gives a warming trend of 0.13°C/year, as shown in Figure 5.
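As a check, that 0.13°C/year figure can be recovered with a quick least-squares fit over the six years that have data (a sketch using the yearly means quoted above, with the no-data year simply omitted):

```python
# Reproducing the Figure 5 trend from the yearly means quoted above,
# with the no-data year (2007) left out of the fit.
years = [2005, 2006, 2008, 2009, 2010, 2011]
temps = [21.4, 20.8, 21.7, 22.0, 21.9, 21.7]

n = len(years)
mx, my = sum(years) / n, sum(temps) / n
slope = sum((x - mx) * (y - my) for x, y in zip(years, temps)) / \
        sum((x - mx) ** 2 for x in years)
print(round(slope, 2))   # 0.13 (deg C per year)
```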

My question is, how accurate is this trend? Me, I’d say we can’t trust it as far as we can throw it. The problem is that the early years (2005, ’06, and ’07) way undersample the gridcell, but this is hidden because they take a number of samples in one or two small areas. As a result, the confidence intervals are way understated, and the averages do not represent a valid sampling in either time or space.

My conclusion is that we simply do not have the data to say a whole lot about this gridcell. In particular, despite the apparent statistical accuracy of a trend calculated from these numbers, I don’t think we can even say whether the gridcell is warming or cooling.

Finally, the law of large numbers is generally understood to relate to repeated measurements of the same thing. But here, two measurements ten days apart differ by half a degree, while two measurements at the same time in different areas of the gridcell are as much as three degrees apart … are we measuring the “same thing” here or not? And if not, if we are measuring different things, what effect does that have on the uncertainty? On top of that, all of these error calculations assume what is called “stationarity”, that is to say that the mean of the data doesn’t change over a sufficiently long time period. However, there is no reason to believe this is true. What does this do to the uncertainties?

I don’t have any answers to these questions, and looking at the data seems to only bring up more questions and complications. However, I had said that I doubted we knew the temperature to anything like the precision claimed by the authors. Table 1 of the SLT2011 paper claims a precision for the annual average heat content of the top mile of the ocean of ± 0.21e+8 Joules. Given the volume involved (414e+8 cubic kilometres), this means they are claiming to measure the temperature of the top mile of the ocean to ± 0.002°C, two thousandths of a degree …

As cited above, I showed before that this was unlikely by noting that there are on the order of 3500 Argo floats. The error of an average scales as one over the square root of the number of measurements, so if the SLT2011 numbers are correct and the error from 3500 floats is ± 0.002°C, it means that 35 floats could measure the temperature of the top mile of the ocean with only ten times that error, or ± 0.02°C, two hundredths of a degree. This is highly unlikely; the ocean is far too large to be measured to plus or minus two hundredths of a degree by 35 floats.
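The scaling behind that back-of-envelope argument is the usual one-over-root-N standard-error rule, which as a sketch is:

```python
import math

# Standard-error scaling for an average of N independent measurements:
# error ~ 1/sqrt(N), so cutting N by a factor of 100 grows the error 10x.
def scaled_error(error_n, n, m):
    """Error implied for m floats, given an error of error_n for n floats."""
    return error_n * math.sqrt(n / m)

print(scaled_error(0.002, 3500, 35))   # 0.02 deg C implied for 35 floats
```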

Finally, people have the idea that the ocean is well-mixed, and changes slowly and gradually from one temperature to another. Nothing could be further from the truth. The predominant feature of the ocean is eddies. These eddies have a curious property. They can travel, carrying the same water, for hundreds and hundreds of miles. Here’s an article on one eddy that they have studied. Their illustration is shown as Figure 6.

Figure 7 shows another example of the eddying, non-uniform nature of the ocean. It is of the ocean off of the upper East Coast of the US, showing the Gulf Stream.

The blue rectangle shows the size of the gridcell used in SLT2011. The red circles approximate the distribution within the gridbox of the measurements shown in the bottom row of Figure 1 for 2005. As you can see, this number and distribution of samples is way below the number and breadth of samples required to give us any kind of accuracy. Despite that, the strict statistical standard error of the mean would be very small, since there is little change in temperature in the immediate area. This gives an unwarranted and incorrect appearance of an accuracy of measurement that is simply not attainable by sampling a small area.

Why is this important? It is important because measuring the ocean temperature is part of determining the changes in the climate. My contention is that we still have far too little information to give us enough accuracy to say anything meaningful about “missing heat”.

Anyhow, that’s my latest wander through the Argo data. I find nothing to change my mind regarding what I see as greatly overstated precision for the temperature measurements.

My regards to everyone,

w.

Thanks, Willis. This is yet another lesson regarding the incredible complexity of our planet’s climate system. If the ARGO measurements are so uncertain, how much worse are the pre-ARGO numbers?

Are the ARGO floats randomly distributed? If not, could their distribution be affected by temperature-related events, eg, winds, currents, upwellings, etc?

I love this: “This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced.”

I KNOW this is not what you are saying… but this does suggest another convenient trick somebody could use to create Global Warming… just progressively shift the floats toward the Equator, or progressively delete the ‘colder’ ones.

Of course, I am sure that the top scientists entrusted with this vital data would never do such a thing, except, for deletions, on land.

Willis, why are the graphs in the right hand column labeled “Change in Temperature By Latitude and Day”, instead of “Temperature By Latitude and Day”? They don’t look like “changes”, and the text says “variation”, not “change”.

To estimate the GOIs from the irregularly distributed profiles, the global ocean is first divided into boxes of 5° latitude, 10° longitude and 3 month size. This provides a sufficient number of observations per box.

Seriously, has nobody at all in climate science actually learned calculus? Does nobody know what a bloody Jacobean is?

Of all the stupid ways of dividing the ocean into cells, this has to be the absolute dumbest. It underweights the equator, overweights the poles, and leads to many underfull/empty/sparse cells. I mean seriously — one would have to go out of one’s way to come up with a worse way of doing it.

If somebody ever wants to do this right, there are a number of ways to do it. None of them use equal partitioning in spherical polar angles. Either use an icosahedral tessellation:

https://ziyan.info/2008/11/sphere-tessellation-using-icosahedron/

or start by using a fine-grained tessellation and use e.g. an affinity model to build equal-population cells out of the tessellation triangles, or use a Gaussian overlap model, offhand. Of course all of these approaches require some actual work. Latitude and longitude is easy — and wrong.
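As a very simple illustration of the equal-area idea (much cruder than the icosahedral scheme linked above, and offered only as a sketch), one can keep fixed latitude bands and widen the longitude step by 1/cos(latitude) so cells keep roughly the same area toward the poles:

```python
import math

# Equal-area-ish alternative to fixed 5x10 degree boxes: keep 5-degree
# latitude bands, but widen the longitude step by 1/cos(latitude) so the
# cell area stays roughly constant instead of shrinking toward the poles.
def lon_step(lat_center_deg, base_step_deg=10.0):
    return base_step_deg / math.cos(math.radians(lat_center_deg))

for lat in (2.5, 27.5, 57.5, 77.5):
    print(f"band centred at {lat:5.1f}N: lon step = {lon_step(lat):6.1f} deg")
```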

Sorry, I’m just sayin’…

rgb

The golden fleecers at work.

And is the upshot of all this that the argo data is neither widespread nor sufficiently adequate over a long enough time series for any meaningful analysis? Certainly, extrapolating it into trends is seemingly futile over such poorly sampled areas (and even worse if one starts gridding and infilling ‘missing’ data with averages, etc). It appears that:

a) the argo data is showing apparent significant temp variation

b) the temp variation could be caused by any number of issues aside from the obvious time of year and lat/long ones.

c) add in ‘depth’ and we are looking increasingly like we are measuring a tiny, almost imperceptible fraction of a very large body of water!

d) any ‘average’ or ocean temp anomaly ‘measurement’ must logically, almost by default, be very suspect as a ‘real’ or useable value, with potential to be ‘adjusted’ as required.

Hmm – now, where have we seen such datasets before?? I fear the team will be able to use this data well!

Willis, you wrote:

If the float is moving south, it is moving into warmer waters …

How can you identify the direction of movement of individual floats? Can you do it in these graphs, I mean, or do you have that information without displaying it?

Willis, nothing is easier than asking questions, but I have another. You wrote:

Colors in the left column are randomly assigned to different floats, one color per float.

Is that mapping one-to-one, so that all the orange dots represent the same float? The orange dots are in 3 clumps, lines actually, seeming to display the same float in 3 different years. If that is correct, why are there only 3 lines, instead of one for each year in 2005 – 2011? In Figure 1, lower left panel, it looks as though there are data from only one float? Is that a correct interpretation, and why only for one float that year? Is that described in the ARGO metadata?

Nice analysis, Willis. It makes me appreciate the degree of interpolation in the ARGO dataset that I was using, with its monthly 1×1 resolution. It also makes me wonder about which types of analysis are appropriate and which are not using this data.

Given 180*360*71% ~ 46,000 ocean 1×1 gridboxes and ~700,000 profiles, this averages about 15 profiles per box or about 2 profiles per box per year over the 7 year period 2005-2011. The resolution has been getting better though.
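That arithmetic, as a quick sketch:

```python
# Back-of-envelope check of the sampling density quoted above.
ocean_boxes = round(180 * 360 * 0.71)   # 1x1 degree cells that are ocean
profiles = 700_000
years = 7                               # 2005-2011

per_box = profiles / ocean_boxes
print(ocean_boxes)                      # about 46,000 ocean gridboxes
print(round(per_box, 1))                # about 15 profiles per box
print(round(per_box / years, 1))        # about 2 profiles per box per year
```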

Willis, about the upper right-hand corner of fig 1 you write:

In addition, you can see that the ocean is cooling from day 1 (start of January) to day 91 (end of March). The early records (red and orange) are on the right (warmer) side of the graph. The later records (green and blue) are concentrated in the left hand (cooler) side of the records.

If the bands of dots can be interpreted as the motions of the floats, it looks like the floats are moving generally north. It would seem odd that the NH water would be cooling from day 1 to day 91 of the year, when insolation is increasing (Dec 21 (day -10) having been the shortest day). Is it a current change in that gridcell?

I grant you that the claimed precision (represented by the small reported s.d.) is fanciful. The temps can’t be considered as a random sample from any population. I think you have made that point well. At best, even within gridcells, they could only be considered as correlated samples from within strata.

Robert Brown:

Does nobody know what a bloody Jacobean is?

“Bloody Jacobean” is pretty good. Did you perhaps mean “bloody Jacobian”? In all my trying encounters with Jacobians, I never met a bloody one.

Willis, I’m having trouble following your claim of +/- 0.002 deg C as the SLT2011 paper doesn’t seem to even discuss temperature accuracy. Plus they are talking about 15 year trends and 3500 Argo units. Their claims are:

Am I missing something?

Oops, I think I was mentally confusing the spelling of an architectural era with the matrix of differentials used to relate tensor forms to e.g. area or volume elements in a coordinate system. Sorry;-)

As for the blood, it comes from my having pounded my head against them until it bled a few times, largely in trying to figure out N-dimensional spherical geometries…

rgb

The data is very sparse and unevenly distributed. If we averaged all the data for each year you’d get something that might be called the global ocean temp. for that year, plus or minus about 5 degrees. Not useful, but accurate.

@Steve from Rockwood. Willis explains his numbers in the first link of this article. The SLT2011 paper, as you note, doesn’t address temperature accuracy.

Septic Matthew says:

February 29, 2012 at 2:03 pm

Because I created and printed them one at a time, it’s a slow process, and I didn’t think of that detail ’til I was done … at which point I said something which likely sounded very much like “aw, bugger it”, but likely was something else …

w.

Hi Willis

I’m your Gualala neighbor. When you wrote: “This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced.”

Did you mean when the float is moving south … the water generally is warming, not cooling?

Robert Brown says:

February 29, 2012 at 2:09 pm

Thank you very much, Robert, for saying what needs to be said. The sad news is that the AGW climate science community doesn’t intersect with actual scientists at too many points, so it tends to do things according to the time tested methods used by Lavoisier and Arrhenius … or perhaps even before that. I’ve suggested the use of kriging, but just about anything would be better than this method.

Their method also suffers from the foolishness of being divided into months, which also introduces spurious values into the outcome … but none of this ever seems to sink in to the climate folks, who generally deal with objections by saying they’ve “moved on” ™ since whatever it is you are pointing out is wrong, and they now have some newer and stranger method that they are flogging.

Always good to hear from you,

w.

Septic Matthew says:

February 29, 2012 at 2:17 pm

It’s easier to see if you click on the graphic to see it full size. Above every measurement is a number, which is the day-number for that year. Since we’re looking at the first three months, they range from 1 to 91. Take a look in the left hand column for a float and you can see which way the numbers are getting larger. They take samples about every ten days, so the numbers will be something like “4, 14, 24, 34, 45, 55 …”

w.

Septic Matthew says:

February 29, 2012 at 2:22 pm

It’s a bit hard to tell, but I don’t think that many floats were in there more than once. The problem is that there are more floats than visible distinctions in the rainbow, especially on the web where only certain colors are allowed. Let me check …

OK, over the six years, here’s the results:

Float_ID, Samples
1900040, 2
1900074, 5
1900075, 27
1900638, 3
1900776, 9
1900779, 9
4900207, 9
4900208, 5
4900212, 16
4900444, 7
4900687, 1
4900729, 4
4900786, 2
4900805, 3
4900827, 4
4900840, 30
4901058, 9
4901205, 14
6900410, 12
6900412, 33
6900625, 1
6900771, 1
6900773, 4
6901118, 3
6901119, 3

In three months we’d expect nine samples, so as you can see, some of them are indeed in there twice, and two of them are in there three times.

Regarding the first panel, yes, there’s only one float there during that three months. It’s just random chance, the Argo floats don’t follow a given path, they all just drift freely. I discussed their distribution in one of the previous articles on the subject.

w.

The experimental error of the ARGO floats’ devices may be extremely small. That and about $2.50 will get you a fancy cup of coffee at some places. It would appear that the entire enterprise is based on two rather dubious assumptions: 1. The volume of water being measured is relatively homogeneous (and I am not making reference to the atomic composition of water). 2. That volume changes little over time within the measurement intervals and between intervals. I am still not convinced this whole ARGO thing is much more than some vast generalizations that sound nice but mean very little.

Steve from Rockwood says:

February 29, 2012 at 2:52 pm

Thanks, Steve. You need to convert units from heat content to temperature change. I disagree strongly with not expressing this at some point as temperature, because temperature is what is actually being measured, not heat content. Their claimed annual precision of the heat content (from their Table 1) is ± 0.21e+8 Joules. I used the volume of ocean being heated (listed above), along with the density (about 1.033 tonnes per cubic metre) and the specific heat of the ocean (about 4 MJ per tonne per degree C) to do the conversion.
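A hedged sketch of that conversion. Ocean heat content in the literature is conventionally quoted per square metre of sea surface, so the assumption below is that the ± 0.21e+8 J is J/m², spread over the 10-1500 m layer used in SLT2011; with the density and specific heat quoted above, that works out to a few thousandths of a degree, the same order as the ± 0.002°C figure:

```python
# Heat-content-to-temperature conversion, ASSUMING the quoted precision is
# per square metre of sea surface (the usual OHC unit), over 10-1500 m.
dQ = 0.21e8      # J per m^2 of ocean surface (assumed unit)
depth = 1490.0   # m, the 10-1500 m layer
rho = 1033.0     # kg/m^3, i.e. 1.033 tonnes per cubic metre
c = 4000.0       # J per kg per deg C, i.e. 4 MJ per tonne per deg C

# Mass under one square metre is depth * rho, so:
dT = dQ / (depth * rho * c)   # deg C
print(f"{dT:.4f} deg C")      # a few thousandths of a degree
```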

w.

majormike1 says:

February 29, 2012 at 3:15 pm

Hey, major, glad the North Coast is representing …

Sorry for the confusion. The water nearer the equator is generally warmer at any given instant than the water further away. However, over the period Jan-Feb-Mar, the whole gridcell is cooling. So it is moving south (warmer water) while there is cooling everywhere, and the two tend to cancel each other out.

w.

@Wayne2. Ah, I thought I was being lazy not going through all the posts first.

@Willis. Thanks. I’ll go through the older posts in more detail. I appreciate all the work you are putting into the Argo data and I’m trying to follow along.

Willis, Roger Tallbloke has a post up on Argo and, if I read it correctly, he says that screenshots of the current temps should be archived as the historic values are being altered downward to manufacture a warming trend. Or did I misunderstand his post?

Are the ARGO floats randomly distributed? If not, could their distribution be affected by temperature-related events, eg, winds, currents, upwellings, etc?

The Argo floats are free floating, which means they will tend to move away from areas of upwelling and stay in areas of ocean downwelling.

Outside polar regions, downwelling water is warmer than upwelling water, and this will cause an increasing warm bias over the life of a float.

There may be a similar effect with ocean eddies.

Even if Argo floats start out randomly distributed, they will become progressively less random relative to cool/warm areas of ocean.

‘My contention is that we still have far too little information to give us enough accuracy to say anything meaningful about “missing heat”.’

So all the M&Ms scattered about lead to the conclusion that we can’t say the missing heat is or is not in the oceans?

Here is a paper that describes how Argo floats have measured temperatures in upwelling/downwelling areas resulting from a Kelvin Wave. Although nothing about what bias may be introduced into ocean temp data.

http://w3.jcommops.org/FTProot/Argo/Doc/2007_mjo.pdf

Willis, IIRC, salinity has a non-trivial effect on thermal capacity and density. Can’t locate any numbers right now on thermal capacity vs salinity. Density increases by roughly 5% at 40 g/kg salt content.

The effect is obviously more pronounced in warmer waters where solubility is higher.

Willis:

Take a look in the left hand column for a float and you can see which way the numbers are getting larger. They take samples about every ten days, so the numbers will be something like “4, 14, 24, 34, 45, 55 …”

Despite some overplotting of numbers, that is just possible if you match the dots on the left to the dots on the right that have the same latitude and day. Thanks.

I am still puzzled that the NH waters are cooling even as insolation (peak and duration) is increasing.

I am still puzzled that the NH waters are cooling even as insolation (peak and duration) is increasing.

The main driver of ocean heat loss is the temperature difference between the ocean surface and the atmosphere above the ocean. In the NH winter the atmosphere is colder than the temperature that would be in equilibrium with the solar insolation heating the ocean, and as a result the oceans lose heat.

Alexander K says:

February 29, 2012 at 4:15 pm

Sorry, the good news is I’m banned by Roger Tallbloke because I said that N&Z’s claims violated conservation of energy. It’s good news because it saves me from the effort of trying to educate the ineducable.

w.

Septic Matthew says:

February 29, 2012 at 6:08 pm

That’s the lag in the system, which is large in ocean temps because of the heat capacity of the water.

w.

Excellent work, Willis. Thanks!

Willis – my take is the ARGO floats are diving to a density depth and not a specific depth below the sea surface. That means they are free to wander up and down as the weight of the water column above them changes with density while they move about the grid, much as an airplane does on a cross-country flight. Is that the case, and given the unpredictable location of thermoclines in the ocean, is there anything anyone is doing to fiddle the data to allow for thermocline crossings by these wandering floats?

Thought y’all might enjoy this … it shows the data split out by day of the year, latitude, temperature, and year. You can see the problem with the meaning and value of any presumed trend …

w.

dp says:

February 29, 2012 at 10:25 pm

Interesting question, dp. Inherently, the Argo float measures pressure at each profile level. This then has to be converted to depth, taking into account temperature and salinity. I’m using data which has been converted and then interpolated onto standard depth intervals … none of which is too relevant, since in this analysis I’m just looking at the surface.

You are correct that their “parking level” is set by pressure and not depth. However, from what I’ve looked at, the variation in depth for a given pressure, given the common range of temperature and salinity, is not that great.

Also, their parking depth is generally down at 1,000 decibars of pressure, which is about 1,000 metres down. There’s not too much in the way of thermoclines down that deep; the main thermocline (between the “mixed layer” and the deep ocean) is way shallower than that.
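For what it’s worth, the rough hydrostatic pressure-to-depth conversion looks like this (a simple sketch, not the full UNESCO/TEOS-10 formula, which also corrects for latitude and the temperature/salinity-dependent density profile):

```python
# Rough hydrostatic conversion: pressure = rho * g * depth, with
# 1 decibar = 10^4 Pa, so depth = p_dbar * 1e4 / (rho * g).
def depth_m(p_dbar, rho=1025.0, g=9.81):
    """Approximate depth in metres for a pressure in decibars."""
    return p_dbar * 1.0e4 / (rho * g)

# A parking pressure of 1000 dbar comes out a little under 1000 m:
print(round(depth_m(1000.0)))
```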

All the best,

w.

Further inquiry considers the linear model …

Translation. The second line says that the linear model (lm) is that the Argo measured surface temperature (OceanTemp) is a function (shown in R as “~”) of both the latitude (Lat) and the day of the year (Day). Both of these are shown to be highly significant factors. In the section headed “Coefficients”, they both have “p-values” ≈ zero (shown as “Pr(>|t|)” above), and by the “significance codes” of three stars. The R-squared value (how much of the variation the model explains) is about 0.7.

Next, I looked at the same thing, but I added the year of the observation as another variable. This allows me to see if there is a year-over-year trend. Here are those values …

Adding the year does not improve the model in the slightest. The “p-value” of the Year variable is 0.92, meaning far from significant. So there is no statistically significant trend in the data.

Anyhow, that’s my interpretation of those results, your mileage may vary, your comments solicited.
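The output above is from R’s `lm`. A rough Python analogue of the same model comparison can be sketched on synthetic data (invented gradients and noise, not the real profiles), showing that adding a Year term to OceanTemp ~ Lat + Day barely changes the fit when there is no underlying trend:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the gridbox profiles: temperature falls with
# latitude and with day-of-year, with NO built-in year-over-year trend.
n = 200
lat = rng.uniform(25, 30, n)
day = rng.uniform(1, 91, n)
year = rng.integers(2005, 2012, n).astype(float)
temp = 24.0 - 0.55 * (lat - 25) - 0.01 * day + rng.normal(0, 0.4, n)

def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit with an intercept column."""
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_base = r_squared([lat, day], temp)        # OceanTemp ~ Lat + Day
r2_year = r_squared([lat, day, year], temp)  # ... + Year
print(round(r2_base, 3), round(r2_year, 3))  # adding Year barely moves R^2
```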

w.

Oh, yeah, here’s the residuals of the linear model … that’s what’s left over after you subtract what your model predicts.

The colors are by float, and correspond to the floats colors in the top left graph in Figure 1.

Note that 2012 only has the first part of the data for JanFebMar.

w.

Willis,

I too, appreciate your efforts.

re the argo’s, argo placement might be more useful by being anchored, maybe with the concrete foundations of defunct wind turbines, cut to size.

argo mark 3 or whatever could, for example, be strategically placed to monitor currents, volcanic activity, ‘open’ ocean, and items of interest.

I’m betting that if the argo enterprise doesn’t become redundant, future floats will be designed for particular placement, such as in shallow water.

regards,

William (Bill) Martin

Thanks Willis.

William Martin says: “re the argo’s, argo placement might be more useful by being anchored”

The solution to non-random horizontal drift is to make them powered. This will allow them to be relocated using some randomizing algorithm or even keep them at a single location.

I’m rather surprised this wasn’t built into the Argo floats in the first place.

Willis,

Nice to see that you have begun an analysis of the sources of variance in the ARGO data. In your earlier ‘decimals of precision’ post, I had made the comment that there are other sources of variability besides the instrument precision. At the time I offered some possible sources of variability, several of which you have included in the current posting. My brainstorm suggestions were:

*Time window (month or season)

*Ocean subdivided into Ocean region (eg N S E W or Gulf Stream – sub sections don’t necessarily need to be the same size and shape)

*Latitude (to account for such things as Gulf Stream cooling as it moves from Bahama to Iceland)

*Ocean ‘phase’ (such as AMO, PDO, El Nino, etc)

*Thermocline (percentage of an Argo profile above vs below the thermocline)

In this post you show estimates of 95% confidence interval significantly larger than the 0.002°C from SLT2011 because they include other sources of variance. Although you minimized my suggestion at the time, it is gratifying to see a first slice at quantifying some of those other sources of variance.

cheers,

Dave

In this post you show estimates of 95% confidence interval significantly larger than the 0.002°C from SLT2011 because — your estimated values — include other sources of variance.

which is what I meant, although ‘they’ in my earlier posting was not succinct.

I’ve suggested the use of kriging, but just about anything would be better than this method. Their method also suffers from the foolishness of being divided into months, which also introduces spurious values into the outcome … but none of this ever seems to sink in to the climate folks, who generally deal with objections by saying they’ve “moved on” ™ since whatever it is you are pointing out is wrong, and they now have some newer and stranger method that they are flogging.

Precisely: kriging, perhaps with Gaussians, would be a reasonable thing to try, although I’d want to try several things and not just one. In all cases the real problem isn’t the dense zones where “anything” works; it is the sparse places where a single sample can have a disproportionate effect on a large volume. If you give neighboring samples too great a “range”, the real data gets overwhelmed by data from somewhere else, and in a non-uniform, equatorially peaked distribution that will always lead to net warming (compared to the correct answer, the true average) because there are simply more buoys in more area near the warmer equator than there are near the poles. If you give it too little range, your model has big holes.

We’ve seen this problem in spades recently in the infamous Antarctica paper, where a large number of densely packed thermal sensors in the one part of Antarctica that unambiguously warmed actually raised the assigned temperature of sensors thousands of miles away on the other side of the continent. Doing this right isn’t rocket science, but it does require a modicum of mathematical and statistical competence, ideally applied without the (confirmation) bias that causes you to always pick corrections that seem to make everything just a bit warmer.
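The range trade-off described above can be illustrated with a toy Gaussian-kernel interpolator, a crude stand-in for kriging. All the latitudes and temperatures below are invented for illustration; this is not the SLT2011 method:

```python
import numpy as np

def kernel_estimate(lat_obs, t_obs, lat_target, range_deg):
    """Gaussian-kernel weighted average: a crude stand-in for kriging."""
    w = np.exp(-0.5 * ((lat_obs - lat_target) / range_deg) ** 2)
    return np.sum(w * t_obs) / np.sum(w)

# Toy data: many warm samples near the equator, one cold polar sample.
lat_obs = np.array([0., 5., 10., 15., 20., 80.])
t_obs   = np.array([28., 27., 26., 25., 24., -1.])

# Estimate the temperature at 75°N with two correlation ranges.
short = kernel_estimate(lat_obs, t_obs, 75., range_deg=5.)   # dominated by the nearby polar sample
long_ = kernel_estimate(lat_obs, t_obs, 75., range_deg=60.)  # swamped by the many equatorial samples

print(round(short, 2), round(long_, 2))
```

With the short range the estimate sits near the single polar measurement; with the long range the dense equatorial data pulls the polar estimate far warmer, exactly the bias described above.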

It sounds like they have equal problems with time, especially given that the buoys move, and move in bodies of water that are not well mixed. And I don’t doubt that they have problems with depth. Building an accurate three-dimensional thermal map of the ocean is a daunting project, the first step of which is to study the data, study it some more, maybe spend some time with the data, live with the data, and only then think about how to take the data and begin to transform it into a map, using some real mathematics.

Sadly, I think you’ve very definitely proven one thing. You are dead right: their error estimate is truly absurd. But then, I was convinced of that before I even read your very first article on the subject. Given the volume of the ocean, the number of buoys, and the observed thermal granularity visible in TOA IR pictures of the ocean (which clearly reveal currents of water with very different SSTs forming a paisley mosaic at many scales), one would require very nearly photographic resolution in sampling buoys to get the kind of accuracy they claim.

In fact, that’s a very simple test right there. One can infer the SSTs from satellite data. One can also infer them from the buoy data. One can compare. I vote for satellites as the gold standard, but either way, if they aren’t in agreement within 0.002°C (truly an absurd assertion, as I’m not certain how I would measure the average temperature of the water in my bathtub to that precision even with a thermometer accurate to, say, 0.00001°C) then sorry, major fail.

rgb
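The head post defines the 95% CI as 1.96 times the standard error of the mean, so the number of independent samples needed to support a given CI follows directly. A quick sketch; the 0.5 °C standard deviation is an assumed, illustrative figure, not a value from SLT2011:

```python
import math

def n_required(sigma, ci_halfwidth, z=1.96):
    """Independent samples needed so that z * sigma / sqrt(n) <= ci_halfwidth."""
    return math.ceil((z * sigma / ci_halfwidth) ** 2)

# With an assumed, illustrative sample standard deviation of 0.5 °C,
# a 95% CI of ±0.002 °C needs roughly 240,000 *independent* samples.
print(n_required(0.5, 0.002))
```

The catch, of course, is independence: nearby profiles in the same water mass are highly correlated, so the effective sample count is far below the raw profile count.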

Thought y’all might enjoy this … it shows the data split out by day of the year, latitude, temperature, and year. You can see the problem with the meaning and value of any presumed trend …

Hi Willis,

I’m having trouble understanding the box data. Shouldn’t it be toroidally symmetric on day of year? I’m having a hard time visualizing why polar temperatures jump up on Jan 1 if the data does, as it appears to, stretch between corners. Also, does this include NH and SH data? Finally, I assume that the latitude range is equator-to-pole (or as close to pole as one can get) but I would have expected it to get a lot sparser for the polar data. A LOT sparser, given that Antarctica sort of occupies the south polar region and the Arctic ocean is really rather small.

Or maybe my question is just, what exactly are the ranges of the axes and what are the data points?

rgb

Dr. Brown,

I agree, the climate research space is full of poor spatial analysis. Just as the base stat work needed a professional review, the spatial underpinnings need a good look.

I have learned at school that when experimenting and measuring one should change as little as possible. Why did they not position the floats at fixed anchored positions?

Thanks WIllis, your tenacity is amazing.

Patrick de Boevere [March 1, 2012 at 7:28 am]

“I have learned at school that when experimenting and measuring one should change as little as possible. Why did they not position the floats at fixed anchored positions?”

Cost. That’s a lot of anchoring systems to install and maintain.

“The water nearer the equator is generally warmer at any given instant than the water further away.”

Willis, is this always really the case? The equator does not actually receive the greatest total amount of insolation. This is partly because the rate of change of solar angle is greatest at the equinoxes and least at the solstice (when it passes through zero). Think of the simple harmonic motion of a pendulum. As the sun moves from the zenith, the effect of increasing angle is actually outweighed by the length of the day in summer months, up to a certain latitude nearer the tropics.

Offhand I can’t recall the exact latitude of maximum insolation, but I remember a textbook indicating that it was a long way from the equator. Also, within the tropics “high summer” does not occur at the same time as at locations north or south of the tropics.

I wouldn’t be surprised to see this leaving a temperature signal in surface water.

@ Robert Brown

The “box” data is the same as in figure 1. It is from one gridcell and only from January 1st to March 31st. This explains why the temperatures “jump” on Jan 1 – they do compared with the previous year’s March 31st. Polar regions are not included as it is only one gridcell.

Statistical analysis requires repeated measurements of the same thing under the same conditions.

Statistically speaking, I see a few problems analyzing this data.

@ Michael Hart

There is a difference between maximum daily insolation and annual mean insolation. Although the maximums will be north and south of the equator, depending on the season, the highest annual mean insolation is at the equator.

Google “insolation as a function of latitude” for better explanations.
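The claim that the annual mean peaks at the equator can be checked with the standard daily-mean TOA insolation formula under a circular-orbit approximation. This is a sketch: the sinusoidal declination and solar-constant value are simplifications, and eccentricity (raised later in this thread) is deliberately ignored:

```python
import numpy as np

S0 = 1361.0  # solar "constant", W/m^2

def daily_mean_insolation(lat_deg, day):
    """TOA daily-mean insolation, circular-orbit approximation."""
    phi = np.radians(lat_deg)
    # Solar declination: simple sinusoid, max 23.44° near the June solstice.
    delta = np.radians(23.44) * np.sin(2 * np.pi * (day - 81) / 365.0)
    # Hour angle at sunrise/sunset, clipped to handle polar day and night.
    cos_h0 = np.clip(-np.tan(phi) * np.tan(delta), -1.0, 1.0)
    h0 = np.arccos(cos_h0)
    return (S0 / np.pi) * (h0 * np.sin(phi) * np.sin(delta)
                           + np.cos(phi) * np.cos(delta) * np.sin(h0))

days = np.arange(1, 366)
annual_mean = lambda lat: daily_mean_insolation(lat, days).mean()

# The annual mean decreases monotonically from equator to pole, even though
# the *daily* maximum wanders into the tropics (and to the summer pole)
# with the seasons.
print(round(annual_mean(0)), round(annual_mean(23.44)), round(annual_mean(45)))
```

So both commenters are right about different quantities: the daily maximum can sit well away from the equator, but the annual mean (in this circular-orbit idealization) still peaks there.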

Willis:

That’s the lag in the system, which is large in ocean temps because of the heat capacity of the water.

In principle I knew that, but having lived on land all my life (or perhaps merely being ignorant), I thought that the lag was only a month.

I de-anonymized on the other climate blogs I read, so it’s time I do so here as well. I strongly support anonymous commentary in public debates, the heritage of The Federalist and other great examples, but it’s not for me anymore.

Robert Brown says:

March 1, 2012 at 5:29 am

Sorry for the lack of clarity, Robert, posting late. All of the data is from the one gridcell, 25°-30°N and 20°-30°W, and is from the first three months of the year. This is to match the lat/long and time slicing done in the SLT2011 paper.

w.

michael hart says:

March 1, 2012 at 8:00 am

No, it’s not always the case, I see I wasn’t clear. I’m speaking solely about one gridcell (25°-30°N and 20°-30°W) and one part of the year (first three months).

w.

Willis:

Thought y’all might enjoy this …

Thanks. I almost recommended more 3D plots, especially for the upper panels of Figure 1. Also, since you use color to identify floats in Figure 1 upper left, and color to represent time in Figure 1 upper right, I thought that you might assign a different plot symbol to each float within each grid cell. But as I wrote, it’s easy to recommend work to others, and I mostly like to wait and see what you’ll come up with next.

My public LinkedIn profile is here: http://www.linkedin.com/pub/matthew-marler/15/21b/9a9

Willis:

The colors are by float, and correspond to the floats’ colors in the top left graph in Figure 1.

Very few floats are within the grid cell more than 2 consecutive years. Unless you have reliable knowledge that the Argo floats are exquisitely precisely made, the fact that different floats drift into and out of the grid cell confounds year with float. I’d recommend adding the orthogonal quadratic and cubic polynomials of year to the model (just on general principles, because a great many things are adequately modeled over short intervals of time by cubic polynomials). The residuals are surprisingly (to me) large.

It looks from the 3D graph as though day of the year, latitude and year are all confounded. Do you want to play with alternative models? Like, wouldn’t it make more sense to have sin(day) instead of day, and cos(latitude) instead of latitude? Naively, I’d expect changing day to sin(day) would matter more than changing latitude to cos(latitude) because of the fractions of the total ranges involved.

I see these data as being a useful addition for courses on applied data analysis. That’s just one grid cell.

As always, thank you for your work.
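The kind of model suggested above, sin/cos of day plus centered-year polynomials, can be sketched with ordinary least squares. All the data below are synthetic stand-ins with invented coefficients, not actual Argo measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one gridcell: day of year (1-91), latitude, year.
n = 200
day = rng.uniform(1, 91, n)
lat = rng.uniform(25, 30, n)
year = rng.integers(2005, 2012, n).astype(float)

# Fake temperatures: a seasonal cycle, a latitude gradient, and noise.
temp = (21.0 - 1.5 * np.sin(2 * np.pi * day / 365)
        - 0.2 * (lat - 27.5)
        + rng.normal(0, 0.5, n))

# Design matrix: intercept, sin/cos of day (seasonal), cos(latitude),
# and centered year up to a cubic, as suggested.
yc = year - year.mean()
X = np.column_stack([
    np.ones(n),
    np.sin(2 * np.pi * day / 365), np.cos(2 * np.pi * day / 365),
    np.cos(np.radians(lat)),
    yc, yc**2, yc**3,
])
beta, res, rank, sv = np.linalg.lstsq(X, temp, rcond=None)
resid = temp - X @ beta
print(round(resid.std(), 2))
```

With real gridcell data the interesting question is exactly the one raised above: whether the year coefficients survive once float identity and the seasonal/latitude terms are accounted for.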

Willis:

I’m speaking solely about one gridcell (25°-30°N and 20°-30°W)

The plots have -40° to -30°W, and seemingly reversed. Another “Pfui!”, I wonder?

We have to teach them about eddies? Oh My God…

I guess I spent too much time on the water. I thought everyone knew about eddies and their big brother, gyres. Next thing you know, you’ll be telling me that they don’t sample enough to allow for major ocean hot and cold currents and upwellings… Oh, wait, I think you just did…

“My conclusion is that we simply do not have the data to say a whole lot about this gridcell. In particular, despite the apparent statistical accuracy of a trend calculated from these numbers, I don’t think we can even say whether the gridcell is warming or cooling.”

SOP for the Warmistas. Take a look at GIStemp, where they had 1200 thermometers in GHCN and 8000 grid cells. (From 2007 until 2011, USHCN was not used for anything after 2007; after 2011 they added it back in, but that’s only the 2% of the surface, or gridcells, that is the USA, leaving the rest of the world covered by those original thermometers.)

Then again, they’ve now bumped the gridcell count up to 16,000.

Now, “do the math”… If you have 1200 thermometers covering 98% of 16,000 grid cells, the typical grid cell is EMPTY BY DEFINITION. I make that 15,680 grid cells for those 1200 or so thermometers, or 7.7% of the cells with a thermometer in them. 92.3% of grid cells are EMPTY.

From this GIStemp claims to compute the Global Average Temperature by imagination and handwaving… But at least the thermometers only move when they delete one, update it, move it, or build a new airport… Just so broken…

I’d love to do the kind of exposition you did above on the land grid cells, but with so much of it a complete fantasy value I’m not sure where to begin…

IMHO, the entire Global Average Temperature (values and trends) is nothing more than a statistical artifact of measurement error, splice artifacts inside grid cells, and ‘fabrication error’ in the calculation of the phantom grids.
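The grid-cell arithmetic above is easy to verify in a couple of lines:

```python
cells = 16000
thermometers = 1200

# The comment's framing: 98% of the cells are outside the USA's ~2%.
rest_of_world = round(cells * 0.98)             # 15,680 cells
occupied_frac = thermometers / rest_of_world    # at most one thermometer per cell
print(rest_of_world,
      round(100 * occupied_frac, 1),            # % of cells with a thermometer
      round(100 * (1 - occupied_frac), 1))      # % of cells empty
```

This is the best case, since in practice thermometers cluster, so more than 92.3% of cells are empty.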

Robert Brown says:

February 29, 2012 at 2:09 pm

I have always thought the proper way would be to fit a truncated expansion in spherical harmonics, just like we do for gravity anomalies and Earth’s magnetic field.

Philip Bradley says:

February 29, 2012 at 4:52 pm

“Outside polar regions, downwelling water is warmer than upwelling water, and this will cause an increasing warm bias over the life of a float.”

Hmm… Sounds like this could be a reason the water below 700m appears to be trending upward, and that potentially invalidates the claim that this supports the notion that the “missing heat” is in the depths. I assume that is what you were getting at, but wanted to spell it out to be sure.

@Judge,

What I should have said is that the eccentricity of the earth’s orbit causes the southern hemisphere to receive about 7 to 9% more solar input during the southern summer, which shifts the annual mean away from the equator. [I forgot that I wrote the same thing myself on another blog by Willis only a few days ago.] In the northern hemisphere summer I think my original explanation may still hold in at least some of the locations. Finding the mathematically detailed description is not proving easy.

Many internet explanations do not acknowledge this. They tend to use simple illustrations derived from calculated values apparently modelled assuming a circular orbit. [Models which do not make accurate simulations of reality have been noted in other areas of climatology!]

Have a look at the link below, and scroll down to page 15.

http://www.docstoc.com/docs/9654876/Sun-Insolation

I have always thought the proper way would be to fit a truncated expansion in spherical harmonics, just like we do for gravity anomalies and Earth’s magnetic field.

Yeah, but I suspect that this turns out not to be the case. Spherical harmonics are a vector decomposition based on projection via integration over the sphere. The problem there is that integration over the sphere for spherical harmonics involves spherical polar coordinates, with the spherical polar Jacobian again. If you can do the integrals analytically it is great, and you can use orthogonality and all that. If you have to do the integrals numerically you are screwed because of the horrible nonlinearities at the poles. You can’t just, e.g., cover with a uniform grid and get a decent result; you’ll be packed in like crazy at the poles and sparse as all hell at the equator. You can do an adaptive quadrature, but almost any default gridding will oversample the poles.

I’ve tried things like Gauss-Legendre and Gauss-Chebyshev quadratures, where one basically picks special angles that reduce the number of points you have to evaluate to form the quadratures (via orthogonality), but they don’t work terribly well even for relatively smooth decompositions with the maximum degree l not too large. Better than rectangular quadratures, I suppose.
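A sketch of the Gauss-Legendre idea: placing the nodes in x = cos θ absorbs the sin θ Jacobian, so smooth integrands over the sphere come out essentially exactly. Illustrative code only, not any published method:

```python
import numpy as np

def sphere_integral(f, n_theta=32, n_phi=64):
    """Integrate f(theta, phi) over the unit sphere using Gauss-Legendre
    nodes in x = cos(theta), which absorbs the sin(theta) Jacobian, and a
    uniform (trapezoidal) rule in the periodic phi direction."""
    x, w = np.polynomial.legendre.leggauss(n_theta)   # nodes/weights on [-1, 1]
    theta = np.arccos(x)
    phi = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    dphi = 2 * np.pi / n_phi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    return np.sum(w[:, None] * f(T, P)) * dphi

area = sphere_integral(lambda t, p: np.ones_like(t))   # exact answer: 4*pi
quad = sphere_integral(lambda t, p: np.cos(t) ** 2)    # exact answer: 4*pi/3

print(area, quad)
```

For smooth, low-degree integrands this is accurate to machine precision; the trouble rgb describes arrives when the field being integrated is irregular (continents, currents) or only known on scattered points.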

Ultimately, spherical harmonics work well when they match the symmetry of the problem, e.g. expanding a multipolar field or potential. Not so well expanding an elephant or Rodin’s thinker or — the Earth’s continents and oceans.

Hence the two general approaches mentioned so far: covering the surface with a uniform tessellation, e.g. a subdivided icosahedron. That basically covers the surface with tiles that are all the same area and shape, and if you divide finely enough you can actually represent a fair amount of detail such as continental shapes and medium-small islands. Or kriging, a method of smoothly interpolating or approximating a random function supported on an irregular grid. Or a combination of the two: coarse-graining the samples on the icosahedral grid at some granularity and then using the tile coordinates and centered data to krige a reasonably smooth interpolatory map.

I don’t really know what is best — I just have a pretty good idea of what is the worst, from days spent writing code to numerically decompose things in terms of spherical harmonics where there simply aren’t any really good algorithms for doing so. A good, fairly recent review of the math and methods is here:

http://www.maths.sussex.ac.uk/preprints/document/SMRR-2009-22.pdf

Some of the best (simplest) are basically tessellations, note well. Adaptive cubature/quadrature is just plain difficult for spheres.

rgb

I was wondering about eddies. If a float finds itself in a current, does it naturally move to the current boundary and get trapped in the eddies? A stuck float recording different temperatures could be stuck in an eddy between two different temperatures of water, but wouldn’t the regular deep diving free it? Unlike an Argo float, I am way out of my depth.

Robert Brown says:

March 1, 2012 at 11:19 pm

Thanks, Robert. As an example of one way that wasn’t in your very interesting reference, I have often thought one way to do this might be to use NURBS, “non-uniform rational B-splines”. The temperature could be represented as a 3-D NURBS surface, whose distance from the origin (0,0,0) at any point on the NURBS surface represents the absolute temperature at that point. Then the average temperature is available as the radius of the sphere with the same volume as the NURBS surface.

The advantage is that you can (either globally or locally) adjust the … mmm … well, I call it the “tension” of the connections between the NURBS control points and the final surface, I forget what the real name is, I haven’t written any NURBS computer code in about 20 years. That allows the surface to “drape” over temperature values which are near each other but significantly different.

NURBS surfaces adjust automatically for the number and spacing of the control points. Seems like implementing global temperatures as a 3-D kinda lumpy NURBS oblate spheroid solves a whole host of averaging problems. I just haven’t had the time (and would have to seriously advance my NURBS chops) to implement it. Here’s a sample NURBS quasi-spheroid I just whipped up as an example …

Seems like something like that (but much more complex of course) would work …

w.

Robert Brown says:

March 1, 2012 at 11:19 pm

“The problem there is that integration over the sphere for spherical harmonics involves spherical polar coordinates with the spherical polar Jacobian again.”

The idea would be to get an appropriate functional basis and do a numerical fit of the measurements to that basis. That way, the problems and kluges associated with non-uniform sampling go away. This can be done with least squares techniques to minimize the (possibly weighted) sum of the squared errors between measurements and truncated expansion model evaluated at the discrete set of coordinates where measurements are taken.

I haven’t done it for this application, so I cannot say whether there might not be complications, or that there might not be a more appropriate set of basis functions, but the technique is fairly straightforward. The basis functions used in the spherical harmonic models are the eigenfunctions of the Laplacian operator, and that is what makes them particularly appropriate for gravity and magnetic field modeling. Really, any linearly independent set of functions can be used, but the goal would be to provide the best approximation for a given number of terms in the truncated expansion.

“If you have to do the integrals numerically you are screwed because of the horrible nonlinearities at the poles.”

Not sure if I am interpreting you right. The standard way of dealing with that is to transform from latitude as a variable to normalized vertical height (sine of latitude) as the third coordinate, which is equivalent to the transformation given on page 4 of your link.
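The scattered-point least-squares fit described above can be sketched with a simple real basis: low-degree Legendre polynomials in sin(latitude) plus a couple of longitude harmonics, standing in for a full spherical-harmonic expansion. All the data below are synthetic; no quadrature over the sphere is required:

```python
import numpy as np

rng = np.random.default_rng(1)

# Scattered "measurements" with uneven latitude coverage, like Argo floats.
n = 500
lat = np.degrees(np.arcsin(rng.uniform(-0.9, 0.9, n)))
lon = rng.uniform(-180, 180, n)
field = (25 * np.cos(np.radians(lat)) ** 2
         + 2 * np.sin(np.radians(lon))
         + rng.normal(0, 0.3, n))

# Real low-degree basis: Legendre polynomials in x = sin(lat), plus
# sin/cos of longitude. Fit the truncated expansion by least squares
# directly at the measurement coordinates.
x = np.sin(np.radians(lat))
lam = np.radians(lon)
cols = [np.polynomial.legendre.Legendre.basis(k)(x) for k in range(4)]
cols += [np.sin(lam), np.cos(lam)]
A = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(A, field, rcond=None)
resid = field - A @ coef
print(round(resid.std(), 2))
```

Because the fit is done at the sample coordinates themselves, the polar-Jacobian quadrature problem never arises; the residual standard deviation comes back close to the injected noise level when the basis spans the true field.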

Robert Brown says:

March 1, 2012 at 11:19 pm

“Some of the best (simplest) are basically tessellations, note well. Adaptive cubature/quadrature is just plain difficult for spheres.”

I’ve tried a few of these schemes for surface temperatures with varying success:

1. Equal area cells. It’s the one I regularly use now.

2. Triangular tessellation, Voronoi, and similar

3. Spherical harmonics. The method is described here. I produce a map every month; here is September 2011, with a spherical projection. And a GISS comparison. Mainly used for display.