Guest essay by Sam Outcalt
Introduction: The object of this document is to present a brief introduction to Hurst Rescaling. More detail is presented in a paper by Outcalt et al. (1997), which is posted on the WUWT
website ( http://wattsupwiththat.files.wordpress.com/2012/07/sio_hurstrescale-1.pdf ). That paper contains an extensive reference list that can be consulted, so references are omitted here.
Background: During a study of the hydrology of the Nile, a British engineer, Harold Edwin Hurst, discovered that the annual runoff appeared to have a memory. The Hurst Exponent (H), named in his honor, is calculated using Equation 1, in which R, S and n are the rescaled range, the standard deviation and the number of observations.
Equation 1. H = Log[R(n)/S(n)] / Log (n)
The rescaled range is the amplitude of the integral trace of deviations from the mean of a serial data vector. Hurst had anticipated an exponent near 0.5, which is termed Brown Noise and can be simulated from a series of random numbers. Values above 0.5 indicate increasing auto-correlation in the data, with 1.0, termed Black Noise, indicating extreme correlation with the past, or strong “memory”. In a rather obscure paper, Outcalt et al. (1997) discovered that the extremes and inflections of the integral trace used to determine the rescaled range flag regime changes in the data. These regimes were found to pass tests for a statistically normal distribution at significance levels where the bulk data failed. Linear trend lines fit to the “regimes” also displayed significant slope differences at the transitions.
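For readers who want to see Equation 1 in action, here is a minimal sketch in Python/NumPy (an illustration only, not part of the original paper, which used other software) that estimates H from the full-record rescaled range:

```python
import numpy as np

def hurst_rs(x):
    """Estimate H from the full-record rescaled range, per Equation 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dev = x - x.mean()                 # deviations from the record mean
    trace = np.cumsum(dev)             # integral trace of the deviations
    r = trace.max() - trace.min()      # R(n): amplitude of the integral trace
    s = x.std()                        # S(n): standard deviation of the data
    return np.log(r / s) / np.log(n)   # Equation 1

# A series of random numbers should give an exponent near 0.5 (Brown Noise, as used in the text)
print(hurst_rs(np.random.default_rng(0).normal(size=1000)))
```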
Example: A useful and instructive example calculation can be carried out on the NASA GISS Data used to document “Global Warming” or “Climate Change” or the pressing need for new Carbon Taxes.
Before presenting the example let’s outline the steps in the calculation.
1. A calculation of the Hurst Exponent can be made to estimate the level of “memory” in the data.
2. The mean is subtracted if the data is not presented as deviations from the record mean.
3. The integral trace is calculated as the accumulated deviations from the record mean.
4. Any slight trend is removed from the integral trace.
An upward linear trend will produce a parabolic trace of negative values below the zero level as the early deviation sum is downward and the later values upward terminating near zero. A downward linear trend will produce a positive parabolic trace. More complex functions with upward and downward trending sectors will display an integral trace with sectors both above and below the zero level.
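As a rough illustration of steps 2 through 4 (a sketch only; any spreadsheet or plotting package that can accumulate and detrend a column will do the same job), the integral trace and its detrending might be computed like this:

```python
import numpy as np

def integral_trace(x):
    """Steps 2-4: accumulate deviations from the record mean, then remove any slight linear trend."""
    x = np.asarray(x, dtype=float)
    trace = np.cumsum(x - x.mean())                 # steps 2-3: integral trace of the deviations
    t = np.arange(len(trace))
    fit = np.polyval(np.polyfit(t, trace, 1), t)    # step 4: fit and subtract a linear trend
    return trace - fit

# The extremes and inflections of the returned trace are the candidate regime boundaries.
```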
The NASA GISS data are displayed in Figure 1. It should be mentioned that this data vector is already in deviation-from-record-mean format.
Figure 1. The data show a “strong memory” with a Hurst Exponent of 0.787.
There are two major inflections on the integral trace, in 1936 and 1976. The latter inflection is the base of the “hockey stick” warming regime. A 5-year moving average trace indicates that the end of the warming trend may have occurred early in the 21st century. Before moving to a consideration of the early 21st century, it is necessary to mention that there is an alternate method for estimating the Hurst Exponent: it can also be calculated from a log-log linear fit to the FFT of the data, as the X,Y axes of an FFT have the form listed in Equation 2.
Equation 2. Y = a + Exp [H X] or Ln Y = Ln a + H Ln X or Log10 Y = Log10 a + H Log10 X
Figure 2. The slope of the Log-Log transform of the FFT estimates H at 0.774 compared to 0.787 in Figure 1.
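For completeness, a sketch of the alternate, FFT-based estimate follows. It simply fits a straight line to the log-log amplitude spectrum, which is the procedure described above; how the fitted slope maps onto H is taken from the text as written, and is questioned in the comments at the end of this post.

```python
import numpy as np

def fft_loglog_slope(x):
    """Slope of a log-log linear fit to the FFT amplitude spectrum.
    The text interprets the magnitude of this slope as an estimate of H."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    amp = np.abs(np.fft.rfft(x))[1:]          # drop the zero-frequency term
    freq = np.fft.rfftfreq(len(x))[1:]
    slope, _ = np.polyfit(np.log10(freq), np.log10(amp), 1)
    return slope                               # negative for spectra that fall with frequency
```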
To explore the transition into the early years of the 21st century, the 1980-2010 period was lifted from Figure 1, the mean was subtracted, and the analysis was carried out on that subset. The results are displayed as Figure 3.
Figure 3. The data show a strong “memory” similar to the 1880-2010 data set and a single inflection at the integral trace minimum value in 1997.
In Figure 3, the integral inflection in 1997 and the leveling of the “warming trend” in 2004 hint that the warming regime that began in 1976 may have ended with the turn of the century.
The reader is encouraged to run through the calculations using a spreadsheet program or other resources. I use Dplot software because it has automated functions to create integral traces and FFTs. I strongly believe that it is impossible to understand analytical procedures without actually carrying out the calculations or, better still, writing the source code.
Data from Fountain Hills, AZ:
There is some controversy about whether a strong memory is required to display regime transitions. To test this hypothesis I analyzed a data set created from readings collected with a WiFi temperature probe. The probe was located on a north-facing wall of my winter house in Fountain Hills, AZ. The data were initially collected at a 30-minute interval and decimated to yield only noon readings of air temperature. The decimated data are displayed as Figure 4.
Figure 4. Data collected with a Lascar WiFi Probe from 19 March through 16 April 2013. This data set was decimated to preserve only noon readings.
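For anyone repeating the exercise, the decimation step is trivial. A sketch follows; the file and column names ("probe_log.csv", "Time", "Temperature") are hypothetical and will differ from the actual Lascar export.

```python
import pandas as pd

# Hypothetical file and column names; adjust to match the actual logger export.
df = pd.read_csv("probe_log.csv", parse_dates=["Time"])
noon = df[(df["Time"].dt.hour == 12) & (df["Time"].dt.minute == 0)]  # keep only the 12:00 reading each day
temps = noon["Temperature"].to_numpy()
```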
Even with a Hurst Exponent just above the Brown Noise threshold and well below the value for the NASA GISS data, the integral trace inflections flag air mass transitions in the Phoenix region. Although there are only 29 observations in the set, the information content of the integral remains intact, indicating the air mass transitions. A data set from three probes at radically different sites at my winter house produces some interesting conclusions. These data are displayed as Figure 5.
Figure 5. Data from 3 USB Probe sites at the Winter House in Fountain Hills, AZ.
The traces in Figure 5 indicate that the integrals flagged major air mass transitions at all three probe sites. The strong diurnal signal is still present in the integral, but the regime changes remain evident. This indicates that probe location is not critical for air mass transition analysis. The spike in the Wall plot is due to beam radiation striking the probe but has little impact on the integral.
Conclusion: Hurst Rescaling provides a unique method for detecting regime changes in weather and climate data. The detection of strong air mass control of the integral in near-Brown-Noise data indicates that strong regimes may be detected even in data with almost no auto-correlation. Probe location appears to have little effect on integral transitions. The major consideration in probe placement appears to be choosing sites where there is no unnatural time-dependent influence. A probe placed in the exhaust area of a clothes dryer running on an irregular schedule is one thing, but the diurnal passage of shadows from buildings has a minimal impact on the integral trace.
The reader is encouraged to collect data using WiFi or USB temperature probes. The Lascar USB-Lite (under US$50) will record data at 30-minute intervals for a month before the battery needs replacement. I waterproof these tiny probes with a section of old 23C road bike inner tube trimmed to the probe length, mount the probes inside a small Styrofoam coffee cup attached to a length of stiff wire, and attach the wire to a tree limb.
RCSaumarez (May 7, 2013 at 3:02 pm) wrote:
“I wrote an article on Judith Curry’s blog when Hurst dynamics were all the rage about a year ago.
http://judithcurry.com/2012/02/19/autocorrelation-and-trends/
Frankly, I wish I hadn’t!
My upshot was that it is extremely difficult, using real data, to separate a power law, i.e. Hurst dynamics, from other models.
Even if there is a pure Hurst Law relationship, what does this mean? Basically it means that there are variable dynamics with different delays. I don’t think that saying that we can regard temperature as a Hurst system is particularly helpful as it doesn’t uncover the basic mechanisms underlying temperature, it simply gives one statistical description of the signal, which could be explained in many other ways.”
I welcome Sam Outcalt’s contributions. I’m also pleased to see you encouraging more sober thinking about fundamentals here. The biggest mistake I see in general with applications of HK (not with Sam’s work) is failure to explore the variability of parameter estimates as a function of aggregation criteria. For example, what insights arise if the data are sorted by month of year or by some spatial criteria and by many, many other criteria? There’s a major blindspot in some of the narratives. The insights will deepen and the stories will get better once aggregation criteria make their way onto the radars of key HK advocates.
RCSaumarez
Good man/woman. The linked article makes things a lot clearer. Still don’t see the point of the H dimension; it seems entirely meaningless outside a purely academic field. A simple Markov approach would do the job without all this arm-waving.
1) Detrend the data
2) Discretise the data into bin ranges (e.g. 1 = -1.0 to -0.6 … 5 = 0.6 to 1.0).
3) Construct a Markov mesh using these discrete values (indicators) where the mesh is composed of two sample points at distance h = 0.
Repeat 3 using different mesh parameters: h = 1…10
4) Plot the conditional probabilities against h for each transition (1->2, 1->3 etc.)
Yes, this does give you – indirectly – a pseudo-autocorrelation of the thresholded values. The difference, however, is that you don’t need to model the relationship any further, as the conditional probabilities will either be a function of h or not. End of story!
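A minimal sketch of this recipe (one reading of it, in NumPy; the bin count, bin edges and lag range are illustrative assumptions) might look like:

```python
import numpy as np

def conditional_probs(x, n_bins=5, max_lag=10):
    """Detrend, discretise into bins, and tabulate P(state at t+h | state at t) for h = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    x = x - np.polyval(np.polyfit(t, x, 1), t)                  # 1) detrend
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    states = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)  # 2) discretise into indicators
    probs = {}
    for h in range(1, max_lag + 1):                             # 3) pairs of points separated by h
        counts = np.zeros((n_bins, n_bins))
        np.add.at(counts, (states[:-h], states[h:]), 1)
        rows = counts.sum(axis=1, keepdims=True)
        probs[h] = counts / np.where(rows == 0, 1, rows)        # 4) conditional probabilities vs h
    return probs
```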
R. C. Saumarez: Even if there is a pure Hurst Law relationship, what does this mean? Basically it means that there are variable dynamics with different delays. I don’t think that saying that we can regard temperature as a Hurst system is particularly helpful as it doesn’t uncover the basic mechanisms underlying temperature, it simply gives one statistical description of the signal, which could be explained in many other ways.
You are more polite than I was, but yeh. There are zillions of statistical summaries that can be computed for extant data, especially extant data that have been worked over as much as these have been. Without explicit methods to test the fit of the corresponding underlying models, all anyone achieves is another set of statistics of dubious value.
“Equation 2. Y = a + Exp [H X] or Ln Y = Ln a + H Ln X or Log10 Y = Log10 a + H Log10 X”
The first equation is not equivalent to the second and third equations.
The second and third equations are equivalent to Y = a X^H, not to Y = a + e^(H X).
Please explain what you had in mind when you took the logarithm of the first equation.