
By Walter Dnes – Edited by Just The Facts
Investopedia defines “Leading Indicator” as follows…
A measurable economic factor that changes before the economy starts to follow a particular pattern or trend. Leading indicators are used to predict changes in the economy, but are not always accurate.
Economics is not the only area where a leading indicator is nice to have. A leading indicator that could predict, in February, whether this calendar year’s temperature anomaly will be warmer or colder than the previous calendar year’s anomaly would also be nice to have. I believe that I’ve stumbled across exactly that. Using data from 1979 onwards, the rule goes like so…
- If this year’s January anomaly is warmer than last year’s January anomaly, then this year’s annual anomaly will likely be warmer than last year’s annual anomaly.
- If this year’s January anomaly is colder than last year’s January anomaly, then this year’s annual anomaly will likely be colder than last year’s annual anomaly.
This is a “qualitative” forecast. It doesn’t forecast a number, but rather a boundary, i.e. greater than or less than a specific number. I don’t have an explanation for why it works. Think of it as the climatological equivalent of “technical analysis”: event X is usually followed by event Y, leaving it to others to figure out the underlying “fundamentals”, i.e. the physical theory. I’ve named it the “January Leading Indicator”, abbreviated as “JLI” (which some people will probably pronounce as “July”). The JLI has been tested on the following six data sets: GISS, HadCRUT3, HadCRUT4, UAH5.6, RSS, and NOAA.
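The two-part rule above can be sketched in a few lines of Python. This is my own illustrative sketch, not code from the post; the example values are the GISS January anomalies for 1997 and 1998 discussed later in the article.

```python
# A minimal sketch of the JLI rule. `jan` is a hypothetical mapping from
# year to that year's January temperature anomaly (degrees C).
def jli_forecast(jan, year):
    """Predict whether `year`'s annual anomaly will be warmer or colder
    than the previous year's, using January anomalies alone."""
    return "warmer" if jan[year] > jan[year - 1] else "colder"

# GISS January anomalies for 1997 (0.31 C) and 1998 (0.60 C):
jan = {1997: 0.31, 1998: 0.60}
print(jli_forecast(jan, 1998))  # -> warmer (1998's annual anomaly was indeed warmer)
```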
In this post I will reference this zipped GISS monthly anomaly text file and this spreadsheet. Note that one of the tabs in the spreadsheet is labelled “documentation”. Please read that tab first if you download the spreadsheet and have any questions about it.
The claim of the JLI would arouse skepticism anywhere, and doubly so in a forum full of skeptics. So let’s first look at one data set and count the hits and misses manually, to verify the algorithm. The GISS text file has to be reformatted before it can be imported into a spreadsheet, but it is well suited to direct viewing by humans. The data contained within the GISS text file is summarized below.
Note: GISS numbers are the temperature anomaly, multiplied by 100, and shown as integers. Divide by 100 to get the actual anomaly. E.g. “43” represents an anomaly of 43/100=0.43 Celsius degrees. “7” represents an anomaly of 7/100=0.07 Celsius degrees.
- The first two columns on the left of the GISS text file are the year and the January anomaly × 100.
- The column after “Dec” (labelled “J-D”) is the January–December (annual) anomaly × 100.
The verification process is as follows:
- Count all the years where the current year’s January anomaly is warmer than the previous year’s January anomaly. Add a 1 in the Counter column for each such year.
- For each such year, count those where the annual anomaly is also warmer than the previous year’s annual anomaly, and add a 1 in the Hit column for each.
| Year | Counter | Jan(current) > Jan(previous) | Hit | J-D(current) > J-D(previous) | Comment |
|------|---------|------------------------------|-----|------------------------------|---------|
| 1980 | 1 | 25 > 10 | 1 | 23 > 12 | |
| 1981 | 1 | 52 > 25 | 1 | 28 > 23 | |
| 1983 | 1 | 49 > 4 | 1 | 27 > 9 | |
| 1986 | 1 | 25 > 19 | 1 | 15 > 8 | |
| 1987 | 1 | 30 > 25 | 1 | 29 > 15 | |
| 1988 | 1 | 53 > 30 | 1 | 35 > 29 | |
| 1990 | 1 | 35 > 11 | 1 | 39 > 24 | |
| 1991 | 1 | 38 > 35 | 0 | 38 < 39 | Fail |
| 1992 | 1 | 42 > 38 | 0 | 19 < 38 | Fail |
| 1995 | 1 | 49 > 27 | 1 | 43 > 29 | |
| 1997 | 1 | 31 > 25 | 1 | 46 > 33 | |
| 1998 | 1 | 60 > 31 | 1 | 62 > 46 | |
| 2001 | 1 | 42 > 23 | 1 | 53 > 41 | |
| 2002 | 1 | 72 > 42 | 1 | 62 > 53 | |
| 2003 | 1 | 73 > 72 | 0 | 61 < 62 | Fail |
| 2005 | 1 | 69 > 57 | 1 | 66 > 52 | |
| 2007 | 1 | 94 > 53 | 1 | 63 > 60 | |
| 2009 | 1 | 57 > 23 | 1 | 60 > 49 | |
| 2010 | 1 | 66 > 57 | 1 | 67 > 60 | |
| 2013 | 1 | 63 > 39 | 1 | 61 > 58 | |
| Totals | 20 | predicted > previous year | 17 | actual > previous year | |
Of 20 candidates flagged (Jan(current) > Jan(previous)), 17 are correct (i.e. J-D(current) > J-D(previous)). That’s 85% accuracy for the qualitative annual anomaly forecast on the GISS data set where the current January is warmer than the previous January.
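The manual counting procedure can be expressed as a short function. This is my own sketch rather than the spreadsheet’s logic; the example uses the GISS anomalies × 100 from the table, for the years around Pinatubo.

```python
import operator

def score_jli(jan, ann, years, direction="warmer"):
    """Count JLI candidate years and hits for one direction.
    jan, ann: dicts mapping year -> January / annual anomaly.
    Returns (candidates, hits)."""
    cmp = operator.gt if direction == "warmer" else operator.lt
    candidates = [y for y in years if cmp(jan[y], jan[y - 1])]
    hits = sum(1 for y in candidates if cmp(ann[y], ann[y - 1]))
    return len(candidates), hits

# GISS anomalies * 100 around the Pinatubo years:
jan = {1989: 11, 1990: 35, 1991: 38, 1992: 42, 1993: 34}
ann = {1989: 24, 1990: 39, 1991: 38, 1992: 19, 1993: 21}
print(score_jli(jan, ann, range(1990, 1994), "warmer"))  # (3, 1): 1991 and 1992 are the fails
```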
And now for the years where January is colder than the previous January. The procedure is virtually identical, except that we count the years where the annual anomaly is colder than the previous year’s annual anomaly, and add a 1 in the Hit column for each such year.
| Year | Counter | Jan(current) < Jan(previous) | Hit | J-D(current) < J-D(previous) | Comment |
|------|---------|------------------------------|-----|------------------------------|---------|
| 1982 | 1 | 4 < 52 | 1 | 9 < 28 | |
| 1984 | 1 | 26 < 49 | 1 | 12 < 27 | |
| 1985 | 1 | 19 < 26 | 1 | 8 < 12 | |
| 1989 | 1 | 11 < 53 | 1 | 24 < 35 | |
| 1993 | 1 | 34 < 42 | 0 | 21 > 19 | Fail |
| 1994 | 1 | 27 < 34 | 0 | 29 > 21 | Fail |
| 1996 | 1 | 25 < 49 | 1 | 33 < 43 | |
| 1999 | 1 | 48 < 60 | 1 | 41 < 62 | |
| 2000 | 1 | 23 < 48 | 1 | 41 < 41 | 0.406 < 0.407 |
| 2004 | 1 | 57 < 73 | 1 | 52 < 61 | |
| 2006 | 1 | 53 < 69 | 1 | 60 < 66 | |
| 2008 | 1 | 23 < 94 | 1 | 49 < 63 | |
| 2011 | 1 | 46 < 66 | 1 | 55 < 67 | |
| 2012 | 1 | 39 < 46 | 0 | 58 > 55 | Fail |
| Totals | 14 | predicted < previous year | 11 | actual < previous year | |
Of 14 candidates flagged (Jan(current) < Jan(previous)), 11 are correct (i.e. J-D(current) < J-D(previous)). That’s 79% accuracy for the qualitative annual anomaly forecast on the GISS data set where the current January is colder than the previous January. Note that the 1999 annual anomaly is 0.407, and the 2000 annual anomaly is 0.406, when calculated to 3 decimal places. The GISS text file only shows 2 (implied) decimal places.
The scatter graph at the head of this article compares the January and annual GISS anomalies for visual reference.
Now for a verification comparison among the various data sets, from the spreadsheet referenced above. First, all years during the satellite era that were forecast to be warmer than the previous year:
| Data set | Had3 | Had4 | GISS | UAH5.6 | RSS | NOAA |
|----------|------|------|------|--------|-----|------|
| Ann > previous | 16 | 15 | 17 | 18 | 18 | 15 |
| Jan > previous | 19 | 18 | 20 | 21 | 20 | 18 |
| Accuracy | 0.84 | 0.83 | 0.85 | 0.86 | 0.90 | 0.83 |
Next, all years during the satellite era that were forecast to be colder than the previous year:
| Data set | Had3 | Had4 | GISS | UAH5.6 | RSS | NOAA |
|----------|------|------|------|--------|-----|------|
| Ann < previous | 11 | 11 | 11 | 11 | 11 | 11 |
| Jan < previous | 15 | 16 | 14 | 13 | 14 | 16 |
| Accuracy | 0.73 | 0.69 | 0.79 | 0.85 | 0.79 | 0.69 |
The following are scatter graphs comparing the January and annual anomalies for the other five data sets:
HadCRUT3

HadCRUT4

UAH 5.6

RSS

NOAA

The forecast methodology had problems during the Pinatubo years, 1991 and 1992. And 1993 also had problems, because the algorithm compares with the previous year, in this case the Pinatubo-influenced 1992. The breakdowns were…
- For 1991, all 6 data sets were forecast to be above their 1990 values. The 2 satellite data sets (UAH and RSS) were above their 1990 values, but the 4 surface-based data sets were below.
- For 1992, the 4 surface-based data sets (HadCRUT3, HadCRUT4, GISS, and NCDC/NOAA) were forecast to be above their 1991 values, but finished below.
- The 1993 forecast was a total bust. All 6 data sets were forecast to be below their 1992 values, but all finished the year above.
In summary, during the 3 years 1991/1992/1993 there were 6×3=18 over/under forecasts, of which 14 were wrong. In plain English, if a Pinatubo-like volcano dumps a lot of sulfur dioxide (SO2) into the stratosphere, the JLI will not be usable for the next 2 or 3 years. As the USGS explains:
“The most significant climate impacts from volcanic injections into the stratosphere come from the conversion of sulfur dioxide to sulfuric acid, which condenses rapidly in the stratosphere to form fine sulfate aerosols. The aerosols increase the reflection of radiation from the Sun back into space, cooling the Earth’s lower atmosphere or troposphere. Several eruptions during the past century have caused a decline in the average temperature at the Earth’s surface of up to half a degree (Fahrenheit scale) for periods of one to three years. The climactic eruption of Mount Pinatubo on June 15, 1991, was one of the largest eruptions of the twentieth century and injected a 20-million ton (metric scale) sulfur dioxide cloud into the stratosphere at an altitude of more than 20 miles. The Pinatubo cloud was the largest sulfur dioxide cloud ever observed in the stratosphere since the beginning of such observations by satellites in 1978. It caused what is believed to be the largest aerosol disturbance of the stratosphere in the twentieth century, though probably smaller than the disturbances from eruptions of Krakatau in 1883 and Tambora in 1815. Consequently, it was a standout in its climate impact and cooled the Earth’s surface for three years following the eruption, by as much as 1.3 degrees at the height of the impact.” USGS
For comparison, here are the scores with the Pinatubo-affected years (1991/1992/1993) removed. First, the years forecast to be warmer than the previous year:
| Data set | Had3 | Had4 | GISS | UAH5.6 | RSS | NOAA |
|----------|------|------|------|--------|-----|------|
| Ann > previous | 16 | 15 | 17 | 17 | 17 | 15 |
| Jan > previous | 17 | 16 | 18 | 20 | 19 | 16 |
| Accuracy | 0.94 | 0.94 | 0.94 | 0.85 | 0.89 | 0.94 |
And for the years where the anomaly was forecast to be below the previous year’s:
| Data set | Had3 | Had4 | GISS | UAH5.6 | RSS | NOAA |
|----------|------|------|------|--------|-----|------|
| Ann < previous | 11 | 11 | 11 | 10 | 10 | 11 |
| Jan < previous | 14 | 15 | 13 | 11 | 12 | 15 |
| Accuracy | 0.79 | 0.73 | 0.85 | 0.91 | 0.83 | 0.73 |
Given the existence of January and annual data values, it’s possible to do linear regressions and even quantitative forecasts for the current calendar year’s annual anomaly. With the slope and y-intercept available, one merely has to wait for the January data to arrive in February and run the basic “y = mx + b” equation. The correlation is approximately 0.79 for the surface data sets, and 0.87 for the satellite data sets, after excluding the Pinatubo-affected years (1991 and 1992).
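The quantitative step can be sketched as an ordinary least-squares fit. This is my own illustration of the “y = mx + b” approach, assuming the same January/annual pairing; the anomaly values below are placeholders, not figures from the spreadsheet.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares slope m and intercept b for y = m*x + b."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

# Fit annual anomaly (y) against January anomaly (x) over past years,
# then forecast this year's annual anomaly once January data arrives.
jan_hist = [0.25, 0.31, 0.60, 0.48]   # placeholder January anomalies (deg C)
ann_hist = [0.33, 0.46, 0.62, 0.41]   # placeholder annual anomalies (deg C)
m, b = fit_line(jan_hist, ann_hist)
this_january = 0.55                   # hypothetical new January value
print(m * this_january + b)           # quantitative annual forecast
```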
There will probably be a follow-up article a month from now, when all the January data is in, and forecasts can be made using the JLI. Note that data downloaded in February will be used. NOAA and GISS use a missing-data algorithm which results in minor changes for most monthly anomalies, every month, all the way back to day 1, i.e. January 1880. The monthly changes are generally small, but in borderline cases, the changes may affect rankings and over/under comparisons.
The discovery of the JLI was a fluke based on a hunch. One can only wonder what other connections could be discovered with serious “data-mining” efforts.
Dr. Strangelove said on February 4, 2014 at 10:29 pm
“Bernie
The concepts are simple enough to understand without computer programming, at least for those familiar with probability theory. If not, a google search on statistics would be more informative.”
Wow – that was dismissive! I am familiar enough with probability theory. What I do not understand is the “nuts and bolts” of what you claim to be doing in your post of 5:46pm. It is vague and makes no sense – what you did, or why you are even doing certain things. If it’s not BS on your part, then post some code. Or are you just hand-waving?
walterdnes says:
February 4, 2014 at 1:30 am
Sorry, I’ve said this before, but it bears repeating. You don’t get to pick and choose what data to use based on whether it fits your theory.
More to the point … who cares? To give a real-world example, the farmers around where I live like hot summers for the grapes. So I took a look at your JLI for my local weather station, Santa Rosa. I used all of the months, not just January, as the leading indicator for that month plus the next 11 months.
Just like you said, it works a treat, it gives me a 59% success rate. So I set myself as Nostradamus of the North, the Weather Prognosticator.
So now, when the January is warmer than last year, the good farmers around here come to me and I’ll tell them “Yep, Walter’s indicator says it will be warmer”. And they all go away satisfied, because they can now plan for the future … except for one ornery old geezer who comes back and says, “Hang on … how much warmer than last year will it be?”
So I go back to my data, I average out all of the results, and I tell him “Walter’s method says it will be a bit more than a tenth of a degree warmer than last year” … he considers that a moment, then asks for the standard deviation of the results … I go back and calculate that one … “Plus or minus half a degree”, I tell him.
And the farmer says “You’re telling me that this year will be a tenth of a degree warmer than last year, plus or minus half a degree? Have you lost your mind? What do I care about a tenth of a degree, particularly with that wide an error in the results?”
I’m sure you can see the moral of the story. It’s a difference that doesn’t make a difference, and things are even worse (much smaller values) at a global level. In fact, using January alone for all of the GISS LOTI data, yes, there is a real result (Average of positive = 0.05, average of negative = -0.05), but the standard deviation is twice that value (0.10).
Finally, upstream someone commented:
Using the GISS LOTI data, we can say that if this January is warmer than last January, this year will be a heart-stopping 0.05°C warmer than last year ON AVERAGE, with a 95% confidence interval from -0.15°C to 0.25°C.
Anyone who thinks that a projected possible warming of five hundredths of a degree is an “essential ingredient for policy making” hasn’t thought this all the way through.
w.
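The interval arithmetic in Willis’s comment is easy to reproduce. This sketch simply applies the usual mean ± 2σ rule of thumb to the mean and standard deviation he quotes for the GISS LOTI data:

```python
# Reproducing the quoted ~95% interval from the mean and standard
# deviation given in the comment (GISS LOTI, January-warmer years).
mean, sd = 0.05, 0.10
lo, hi = mean - 2 * sd, mean + 2 * sd  # ~95% for roughly normal residuals
print(f"{mean:+.2f} degC on average, 95% interval {lo:+.2f} to {hi:+.2f}")
```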
Dr. Strangelove says:
February 4, 2014 at 5:46 pm
Read up on the difference between “white noise” and “red noise”, Doc. You’ve used white noise, perhaps not even random normal white noise (excel “RAND” function gives uniform random rather than normal random numbers) … but the temperature data you are testing is red noise, actually very red noise. As a result, you need red noise pseudodata for the monte carlo test.
w.
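For readers who want to try the Monte Carlo test Willis describes, a lag-one autoregressive (AR(1)) process is the standard way to generate “red” pseudodata. The sketch below is mine; the 0.9 autocorrelation is an assumed illustrative value, not one fitted to any temperature series.

```python
import random

def red_noise(n, phi=0.9, seed=None):
    """Generate n samples of AR(1) "red" noise: x[t] = phi*x[t-1] + white noise.
    phi near 1 gives strong persistence, as in monthly temperature anomalies;
    phi = 0 reduces to ordinary white noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)  # normal, not uniform, innovations
        out.append(x)
    return out

series = red_noise(240, phi=0.9, seed=42)  # 20 "years" of monthly pseudodata
```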
Guy says:
February 4, 2014 at 6:40 pm
Thanks for the support, Guy. I’ve pointed that out a bunch of times, only to be told it doesn’t matter. It does matter, but these guys love their “positive” results … go figure.
w.
Terry Oldberg says:
February 4, 2014 at 9:01 pm
I suppose I should try decoding Terry again, although it’s not been too productive in the past … Terry, defining a “Condition” in capital letters as a “condition on the Cartesian product of the values” of the independent variables, doesn’t mean anything to me.
Suppose we have two independent variables, J and K. The values of J are {a, b, c} and the values of K are {d, e, f}. The Cartesian product of those two sets, more commonly called the “cross product” is the set of all possible pairs,
{ { a, d }, { a, e }, { a, f }, { b, d }, { b, e }, { b, f }, { c, d }, { c, e }, { c, f } }
OK, that’s our Cartesian product of the values of the independent variables. But what on earth is a “condition on” the set { { a, d }, { a, e }, { a, f }, { b, d }, { b, e }, { b, f }, { c, d }, { c, e }, { c, f } }? That makes no sense at all, and its application to the current situation is completely unclear.
w.
Bernie Hutchins says:
February 4, 2014 at 9:30 pm
Dr. Strangelove says:
February 4, 2014 at 10:29 pm
Doc, Bernie’s not asking about your concepts. He wants to see exactly what you did. Not what your concepts say you did. Not what you truly believe you did.
What you actually did.
Either show your code or we are under no obligation to listen to a word you say. This is a scientific site.
w.
Willis Eschenbach says:
February 5, 2014 at 3:44 pm
Anyone who thinks that a projected possible warming of five hundredths of a degree is an “essential ingredient for policy making” hasn’t thought this all the way through.
I agree. And the Met Office is no better. See
http://www.metoffice.gov.uk/media/pdf/1/8/decadal_forecast_2014-2018_jan2014.pdf
“• Averaged over the 5-year period 2014-2018, global average temperature is expected to remain high and is likely to be between 0.17°C and 0.43°C above the long-term (1981–2010) average.”
“Conclusions
It also has a broad range of potential applications in terms of policy making and investment decisions.”
Aren’t we talking about autocorrelation here? Each month is the start point for the next.
Brian H says:
February 6, 2014 at 12:17 am
Umm … yep, I talked about that very thing, as have other folks. A search for “autocorr…” on the page will find much discussion of the subject.
w.