By Walter Dnes – Edited by Just The Facts
Investopedia defines “Leading Indicator” thusly…
A measurable economic factor that changes before the economy starts to follow a particular pattern or trend. Leading indicators are used to predict changes in the economy, but are not always accurate.
Economics is not the only area where a leading indicator is nice to have. It would also be nice to have a leading indicator that could predict, in February, whether this calendar year’s temperature anomaly will be warmer or colder than the previous calendar year’s anomaly. I believe that I’ve stumbled across exactly that. Using data from 1979 onwards, the rule goes like so…
- If this year’s January anomaly is warmer than last year’s January anomaly, then this year’s annual anomaly will likely be warmer than last year’s annual anomaly.
- If this year’s January anomaly is colder than last year’s January anomaly, then this year’s annual anomaly will likely be colder than last year’s annual anomaly.
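The two rules can be written as a tiny function. This is only a sketch: the name `jli_forecast` is made up for illustration, and returning `None` when the two Januaries tie is an assumption, since the rule as stated does not cover that case.

```python
from typing import Optional


def jli_forecast(jan_this_year: float, jan_last_year: float) -> Optional[str]:
    """Return the qualitative JLI call for this year's annual anomaly."""
    if jan_this_year > jan_last_year:
        return "warmer"   # annual anomaly likely above last year's
    if jan_this_year < jan_last_year:
        return "colder"   # annual anomaly likely below last year's
    return None           # tie: the rule makes no call (assumption)


# GISS January 2013 (0.63) versus January 2012 (0.39)
print(jli_forecast(0.63, 0.39))  # warmer
```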
This is a “qualitative” forecast. It doesn’t forecast a number, but rather a boundary, i.e. greater than or less than a specific number. I don’t have an explanation for why it works. Think of it as the climatological equivalent of “technical analysis”; i.e. event X is usually followed by event Y, leaving it to others to figure out the underlying “fundamentals”, i.e. physical theory. I’ve named it the “January Leading Indicator”, abbreviated as “JLI” (which some people will probably pronounce as “July”). The JLI has been tested on the following 6 data sets: GISS, HadCRUT3, HadCRUT4, UAH5.6, RSS and NOAA.
In this post I will reference this zipped GISS monthly anomaly text file and this spreadsheet. Note that one of the tabs in the spreadsheet is labelled “documentation”. Please read that tab first if you download the spreadsheet and have any questions about it.
The claim of the JLI would arouse skepticism anywhere, and doubly so in a forum full of skeptics. So let’s first look at one data set, and count the hits and misses manually, to verify the algorithm. The GISS text file has to be reformatted before importing into a spreadsheet, but it is optimal for direct viewing by humans. The data contained within the GISS text file is abstracted below.
Note: GISS numbers are the temperature anomaly, multiplied by 100, and shown as integers. Divide by 100 to get the actual anomaly. E.g. “43” represents an anomaly of 43/100=0.43 Celsius degrees. “7” represents an anomaly of 7/100=0.07 Celsius degrees.
- The first 2 columns on the left of the GISS text file are year and January anomaly * 100.
- The column after “Dec” (labelled “J-D”) is the January-December (annual) anomaly * 100.
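For anyone who would rather read the file with a script than a spreadsheet, here is a rough sketch of pulling those two columns out of one row. It assumes the row is whitespace-delimited with the fields in the order year, Jan through Dec, then J-D; the function name `parse_giss_row` is hypothetical, and the real file may contain header lines or missing-value markers, which this sketch simply skips.

```python
def parse_giss_row(line: str):
    """Return (year, january, annual) in Celsius degrees, or None for non-data lines."""
    fields = line.split()
    # Data rows start with the year; header and blank lines are skipped.
    if len(fields) < 14 or not fields[0].isdigit():
        return None
    try:
        january = int(fields[1]) / 100.0   # 2nd column: January anomaly * 100
        annual = int(fields[13]) / 100.0   # column after Dec: J-D anomaly * 100
    except ValueError:
        return None  # non-numeric field, e.g. a missing-value marker
    return int(fields[0]), january, annual


# The conversion on its own: "43" represents 0.43 Celsius degrees.
print(int("43") / 100.0)  # 0.43
```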
The verification process is as follows:
- Count all the years where the current year’s January anomaly is warmer than the previous year’s January anomaly. Add a 1 in the Counter column for each such year.
- Of those years, count the ones where the annual anomaly is also warmer than the previous year’s annual anomaly. Add a 1 in the Hit column for each such year.
|Year|Counter|Jan(current) > Jan(previous)|Hit|J-D(current) > J-D(previous)|Note|
|---|---|---|---|---|---|
|1980|1|25 > 10|1|23 > 12||
|1981|1|52 > 25|1|28 > 23||
|1983|1|49 > 4|1|27 > 9||
|1986|1|25 > 19|1|15 > 8||
|1987|1|30 > 25|1|29 > 15||
|1988|1|53 > 30|1|35 > 29||
|1990|1|35 > 11|1|39 > 24||
|1991|1|38 > 35|0|38 < 39|Fail|
|1992|1|42 > 38|0|19 < 38|Fail|
|1995|1|49 > 27|1|43 > 29||
|1997|1|31 > 25|1|46 > 33||
|1998|1|60 > 31|1|62 > 46||
|2001|1|42 > 23|1|53 > 41||
|2002|1|72 > 42|1|62 > 53||
|2003|1|73 > 72|0|61 < 62|Fail|
|2005|1|69 > 57|1|66 > 52||
|2007|1|94 > 53|1|63 > 60||
|2009|1|57 > 23|1|60 > 49||
|2010|1|66 > 57|1|67 > 60||
|2013|1|63 > 39|1|61 > 58||
|Total|20|Predicted > previous year|17|Actual > previous year||
Of 20 candidates flagged (Jan(current) > Jan(previous)), 17 are correct (i.e. J-D(current) > J-D(previous)). That’s 85% accuracy for the qualitative annual anomaly forecast on the GISS data set where the current January is warmer than the previous January.
And now for the years where January is colder than the previous January. The procedure is virtually identical, except that the Hit column gets a 1 for each year where the annual anomaly is colder than the previous year’s annual anomaly.
|Year|Counter|Jan(current) < Jan(previous)|Hit|J-D(current) < J-D(previous)|Note|
|---|---|---|---|---|---|
|1982|1|4 < 52|1|9 < 28||
|1984|1|26 < 49|1|12 < 27||
|1985|1|19 < 26|1|8 < 12||
|1989|1|11 < 53|1|24 < 35||
|1993|1|34 < 42|0|21 > 19|Fail|
|1994|1|27 < 34|0|29 > 21|Fail|
|1996|1|25 < 49|1|33 < 43||
|1999|1|48 < 60|1|41 < 62||
|2000|1|23 < 48|1|41 < 41|0.406 < 0.407|
|2004|1|57 < 73|1|52 < 61||
|2006|1|53 < 69|1|60 < 66||
|2008|1|23 < 94|1|49 < 63||
|2011|1|46 < 66|1|55 < 67||
|2012|1|39 < 46|0|58 > 55|Fail|
|Total|14|Predicted < previous year|11|Actual < previous year||
Of 14 candidates flagged (Jan(current) < Jan(previous)), 11 are correct (i.e. J-D(current) < J-D(previous)). That’s 79% accuracy for the qualitative annual anomaly forecast on the GISS data set where the current January is colder than the previous January. Note that the 1999 annual anomaly is 0.407, and the 2000 annual anomaly is 0.406, when calculated to 3 decimal places. The GISS text file only shows 2 (implied) decimal places.
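The manual count can be double-checked with a few lines of Python. The sketch below is not the spreadsheet's method, just an independent tally: the January and J-D values are transcribed from the two GISS tables above (units of 0.01 Celsius degrees), the 1999 and 2000 annual values are entered as 40.7 and 40.6 to reflect the 3-decimal figures just mentioned, and the names `GISS` and `score` are made up for this example.

```python
# (year, January anomaly, annual J-D anomaly) in units of 0.01 Celsius degrees
GISS = [
    (1979, 10, 12), (1980, 25, 23), (1981, 52, 28), (1982, 4, 9), (1983, 49, 27),
    (1984, 26, 12), (1985, 19, 8), (1986, 25, 15), (1987, 30, 29), (1988, 53, 35),
    (1989, 11, 24), (1990, 35, 39), (1991, 38, 38), (1992, 42, 19), (1993, 34, 21),
    (1994, 27, 29), (1995, 49, 43), (1996, 25, 33), (1997, 31, 46), (1998, 60, 62),
    (1999, 48, 40.7), (2000, 23, 40.6), (2001, 42, 53), (2002, 72, 62), (2003, 73, 61),
    (2004, 57, 52), (2005, 69, 66), (2006, 53, 60), (2007, 94, 63), (2008, 23, 49),
    (2009, 57, 60), (2010, 66, 67), (2011, 46, 55), (2012, 39, 58), (2013, 63, 61),
]


def score(rows):
    """Tally [flagged, hits] for the warmer and colder January calls."""
    tally = {"warmer": [0, 0], "colder": [0, 0]}
    # Pair each year with the year before it.
    for (_, jan_prev, ann_prev), (_, jan, ann) in zip(rows, rows[1:]):
        if jan > jan_prev:
            tally["warmer"][0] += 1
            tally["warmer"][1] += ann > ann_prev   # hit if the year was warmer too
        elif jan < jan_prev:
            tally["colder"][0] += 1
            tally["colder"][1] += ann < ann_prev   # hit if the year was colder too
    return tally


result = score(GISS)
for call, (flagged, hits) in result.items():
    print(f"{call}: {hits} of {flagged} ({hits / flagged:.0%})")
# warmer: 17 of 20 (85%)
# colder: 11 of 14 (79%)
```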
The scatter graph at the head of this article compares the January and annual GISS anomalies for visual reference.
Now for a verification comparison amongst the various data sets, from the spreadsheet referenced above. First, all years during the satellite era that were forecast to be warmer than the previous year:
|Ann > previous||16||15||17||18||18||15|
|Jan > previous||19||18||20||21||20||18|
Next, all years during the satellite era that were forecast to be colder than the previous year:
|Ann < previous||11||11||11||11||11||11|
|Jan < previous||15||16||14||13||14||16|
The following are scatter graphs comparing the January and annual anomalies for the other 5 data sets:
The forecast methodology had problems during the Pinatubo years, 1991 and 1992. It also had problems in 1993, because the algorithm compares with the previous year, in this case Pinatubo-influenced 1992. The breakdowns were…
- For 1991, all 6 data sets were forecast to be above their 1990 values. The 2 satellite data sets (UAH and RSS) were above their 1990 values, but the 4 surface-based data sets were below their 1990 values.
- For 1992, the 4 surface-based data sets (HadCRUT3, HadCRUT4, GISS, and NCDC/NOAA) were forecast to be above their 1991 values, but were below.
- The 1993 forecast was a total bust. All 6 data sets were forecast to be below their 1992 values, but all finished the year above.
In summary, during the 3 years 1991/1992/1993, there were 6*3=18 over/under forecasts, of which 14 were wrong. In plain English, if a Pinatubo-like volcano dumps a lot of sulfur dioxide (SO2) into the stratosphere, the JLI will not be usable for the next 2 or 3 years, i.e.:
“The most significant climate impacts from volcanic injections into the stratosphere come from the conversion of sulfur dioxide to sulfuric acid, which condenses rapidly in the stratosphere to form fine sulfate aerosols. The aerosols increase the reflection of radiation from the Sun back into space, cooling the Earth’s lower atmosphere or troposphere. Several eruptions during the past century have caused a decline in the average temperature at the Earth’s surface of up to half a degree (Fahrenheit scale) for periods of one to three years. The climactic eruption of Mount Pinatubo on June 15, 1991, was one of the largest eruptions of the twentieth century and injected a 20-million ton (metric scale) sulfur dioxide cloud into the stratosphere at an altitude of more than 20 miles. The Pinatubo cloud was the largest sulfur dioxide cloud ever observed in the stratosphere since the beginning of such observations by satellites in 1978. It caused what is believed to be the largest aerosol disturbance of the stratosphere in the twentieth century, though probably smaller than the disturbances from eruptions of Krakatau in 1883 and Tambora in 1815. Consequently, it was a standout in its climate impact and cooled the Earth’s surface for three years following the eruption, by as much as 1.3 degrees at the height of the impact.” USGS
For comparison, here are the scores with the Pinatubo-affected years (1991/1992/1993) removed. First, where the years were forecast to be warmer than the previous year:
|Ann > previous||16||15||17||17||17||15|
|Jan > previous||17||16||18||20||19||16|
And for years where the anomaly was forecast to be below the previous year:
|Ann < previous||11||11||11||10||10||11|
|Jan < previous||14||15||13||11||12||15|
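To turn those counts into hit rates, divide each “Ann” entry by the matching “Jan” entry. The sketch below does that for all four tables, keeping the six data sets in the same left-to-right order as the rows above; the helper name `hit_rates` and the variable names are invented for this example.

```python
def hit_rates(hits, flagged):
    """Percentage of flagged years that went the forecast way, per data set."""
    return [round(100 * h / f, 1) for h, f in zip(hits, flagged)]


# Forecast warmer: all satellite-era years, then with 1991/1992/1993 removed
warm_all = hit_rates([16, 15, 17, 18, 18, 15], [19, 18, 20, 21, 20, 18])
warm_ex = hit_rates([16, 15, 17, 17, 17, 15], [17, 16, 18, 20, 19, 16])

# Forecast colder: all satellite-era years, then with 1991/1992/1993 removed
cold_all = hit_rates([11, 11, 11, 11, 11, 11], [15, 16, 14, 13, 14, 16])
cold_ex = hit_rates([11, 11, 11, 10, 10, 11], [14, 15, 13, 11, 12, 15])

for label, rates in [("warmer, all years", warm_all),
                     ("warmer, ex-Pinatubo", warm_ex),
                     ("colder, all years", cold_all),
                     ("colder, ex-Pinatubo", cold_ex)]:
    print(label, rates)
```

Removing the three Pinatubo-affected years mostly shrinks the “Jan” counts while leaving the “Ann” counts nearly unchanged, which is why the percentages rise for most columns.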
Given the existence of January and annual data values, it’s possible to do linear regressions and even quantitative forecasts for the current calendar year’s annual anomaly. With the slope and y-intercept available, one merely has to wait for the January data to arrive in February and run the basic “y = mx + b” equation. The correlation is approximately 0.79 for the surface data sets, and 0.87 for the satellite data sets, after excluding the Pinatubo-affected years (1991 and 1992).
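As a rough illustration of that “y = mx + b” step, the sketch below fits an ordinary least-squares line to the GISS January and annual values transcribed from the tables earlier in this post, leaving out 1991 and 1992. It is not the spreadsheet's calculation: the names `fit_line` and `forecast_annual` are hypothetical, the input values are rounded to 2 decimal places, and so the slope, intercept and correlation it produces may differ somewhat from the figures quoted above.

```python
from math import sqrt

# (year, January anomaly, annual J-D anomaly) in units of 0.01 Celsius degrees
GISS = [
    (1979, 10, 12), (1980, 25, 23), (1981, 52, 28), (1982, 4, 9), (1983, 49, 27),
    (1984, 26, 12), (1985, 19, 8), (1986, 25, 15), (1987, 30, 29), (1988, 53, 35),
    (1989, 11, 24), (1990, 35, 39), (1991, 38, 38), (1992, 42, 19), (1993, 34, 21),
    (1994, 27, 29), (1995, 49, 43), (1996, 25, 33), (1997, 31, 46), (1998, 60, 62),
    (1999, 48, 40.7), (2000, 23, 40.6), (2001, 42, 53), (2002, 72, 62), (2003, 73, 61),
    (2004, 57, 52), (2005, 69, 66), (2006, 53, 60), (2007, 94, 63), (2008, 23, 49),
    (2009, 57, 60), (2010, 66, 67), (2011, 46, 55), (2012, 39, 58), (2013, 63, 61),
]


def fit_line(xs, ys):
    """Ordinary least squares: return slope m, intercept b and correlation r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x, sxy / sqrt(sxx * syy)


# Convert to Celsius degrees and drop the excluded Pinatubo years.
kept = [(jan / 100, ann / 100) for year, jan, ann in GISS if year not in (1991, 1992)]
m, b, r = fit_line([jan for jan, _ in kept], [ann for _, ann in kept])


def forecast_annual(january_anomaly):
    """Quantitative forecast of the annual anomaly from the January anomaly."""
    return m * january_anomaly + b


print(f"m={m:.3f} b={b:.3f} r={r:.2f}")
```

Once the new January figure arrives, `forecast_annual` applied to it gives the point estimate for the year.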
There will probably be a follow-up article a month from now, when all the January data is in, and forecasts can be made using the JLI. Note that data downloaded in February will be used. NOAA and GISS use a missing-data algorithm which results in minor changes for most monthly anomalies, every month, all the way back to day 1, i.e. January 1880. The monthly changes are generally small, but in borderline cases, the changes may affect rankings and over/under comparisons.
The discovery of the JLI was a fluke based on a hunch. One can only wonder what other connections could be discovered with serious “data-mining” efforts.