*Lots of pressure to publish this post early, and it is raining this morning here in The Woodlands, so no golf. I checked it over and think it is OK. Here you go!*

By Andy May

In Part 1 of this series, we examined the data and analysis presented in AR6 to support its conclusion that sea level rise is accelerating. In Part 2 we took a serious look at the observational record of sea level rise over the past 120 years and the modeled components of that rise. We concluded in Part 1 that the statistical evidence presented in AR6 for acceleration was crude and cherry-picked. In Part 2 we saw that the error in both the estimates of sea level rise and in the estimates of the components of that rise is very large. The error precluded determining acceleration with any confidence, but the data revealed an approximately 60-year oscillation in the rate of sea level rise that matches known natural ocean cycles.

Modern statistical tools allow us to forecast a time series, like GMSL (global mean sea level) change, in a more valid and sophisticated way than simply comparing cherry-picked least squares fits as the IPCC does in AR6. Our forecast is based on pure statistics. It is done in the correct way, but that does not guarantee it is correct; statistics are like that. We will not know for sure until 2100. That said, let's do it. If you have a certain kind of nerdy mind, you will enjoy this.

Figure 1 is a plot of the data we will use—the NOAA sea level dataset. Simply looking at it we can tell it is autocorrelated, which means that each quarter’s mean sea level estimate is highly dependent upon the previous quarter’s value. Autocorrelation is important to consider in least squares regression, especially when forecasting time series, but routinely ignored by the IPCC.

Figure 2 plots each sea level estimate versus the previous estimate; this is called a plot of the first lag, and the correlation of the two is a measure of autocorrelation. The R² of the first lag is 0.97, so sea level is very autocorrelated. This is obvious, but it means that normal least squares linear fit statistics are invalid: the least squares statistics, such as R², assume that the errors of regression are independent. Least squares, as used in AR6 to show acceleration, is inappropriate with a dataset like this. Most of any given value is heavily dependent upon the previous value. This means the mean-square-error (MSE) will be much too small, making the reported error of the fit too small. As a result, any least squares line through the data in Figure 1, or any portion of that data, is statistically useless unless the autocorrelation is accounted for.
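The first-lag check is easy to reproduce. The sketch below (in Python, although the post's analysis is in R) builds a hypothetical drifting random walk as a stand-in for the quarterly GMSL series and computes the R² of its first lag; the seed, drift, and length are illustrative assumptions, not the NOAA data:

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Hypothetical stand-in for an autocorrelated series: an upward-drifting
# random walk, where each value builds on the previous one.
random.seed(7)
series = [0.0]
for _ in range(499):
    series.append(series[-1] + 0.5 + random.gauss(0, 1))

# Correlate each value with the previous one (the first lag).
r = pearson(series[:-1], series[1:])
print(round(r * r, 2))  # R^2 near 1: strongly autocorrelated
```

Any series built this way yields an R² near 1 at the first lag, which is exactly the property that makes ordinary least squares error statistics unreliable for it.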

So how can we forecast GMSL in a statistically valid way? We clearly cannot use least squares and need to apply more advanced techniques. The first step is to remove the autocorrelation from the data. This is normally done by subtracting the previous GMSL value from the current one, progressing in this way through the entire dataset. We have done this and show a plot of the result in Figure 3.
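To see why differencing helps, here is a small Python illustration (synthetic data, not the NOAA series, and not the post's R code): differencing a random walk recovers its independent increments, so the lag-1 autocorrelation collapses toward zero:

```python
import random

def lag1_autocorr(x):
    """Pearson correlation between consecutive values of x."""
    a, b = x[:-1], x[1:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

random.seed(1)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0, 1))  # each value depends on the last

# First difference: subtract the previous value from the current one,
# progressing through the whole series.
diffs = [b - a for a, b in zip(walk, walk[1:])]

print(round(lag1_autocorr(walk), 2))   # near 1: the walk is autocorrelated
print(round(lag1_autocorr(diffs), 2))  # near 0: the differences look like white noise
```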

The first-difference data from GMSL look pretty good, very much like white noise. This is exactly what we want for valid statistical analysis and forecasting. We will be using an R function called "arima" to create our GMSL forecast. This function requires three parameters, called p, d, and q, which tell arima how to condition the input data and build a model that can project valid future values. The plot in Figure 3 shows us that "d" is one; taking one difference of adjoining values removes the autocorrelation. We also need the data to be stationary, that is, the statistical properties do not change with time (left to right). The original dataset (Figure 1) was clearly not stationary, and this is OK; we just do not want the way GMSL changes to be a function of time for this analysis. The R Augmented Dickey-Fuller (ADF) test confirms this: the original dataset has an ADF p-value of 0.79, meaning it is non-stationary. Note that the arima parameter p is not the same as a statistical p-value.

The differences plotted in Figure 3 have an ADF p-value of 0.01, well below 0.05, the threshold needed to show stationarity. Data are stationary when the distribution over the period being studied is evenly spread around the mean; that is, the distribution, up and down, does not vary significantly along the time axis (x).

Next, we need to derive the arima p and q values. For this we need the ACF (autocorrelation) and PACF (partial autocorrelation) plots shown in Figure 4.

Analyzing the GMSL time series gives us an arima parameter set of (1,1,2) for (p,d,q). We can also run an R function called auto.arima to see what parameters it recommends. We find that it settles on (1,1,2) as well. This is good confirmation that our parameter selection is correct. Figure 5 plots the results.

Figure 5 tells us that the model is successfully capturing the essence of the trends in mean sea level from 1880 through 2020. The model residuals show no trend and they are not autocorrelated. Figure 6 shows the arima forecast from the (1,1,2) model.

Figure 7 is a plot of the forecast from Excel that is easier to read. The forecast we created predicts that GMSL will rise between 148 mm (6 inches) and 258 mm (10 inches) by 2100. Many researchers call this alarming, but humans have successfully adapted to much higher rates of sea level rise in the past, as we can see in Figure 2 of Post 1, and they did so without the technology we have today. When we consider that the average open ocean daily tide range is 1,000 mm, or about three feet, eight inches of sea level rise over 100 years does not seem like much. In the 20th century sea level rose 5.5 inches; did anyone notice or care, aside from a few researchers?
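The unit conversions and implied rates in the paragraph above are easy to verify; the 2020 start of the forecast window used below is an assumption (the forecast begins where the data end):

```python
MM_PER_INCH = 25.4

low_mm, high_mm = 148.0, 258.0  # forecast GMSL rise by 2100, from the arima model
print(round(low_mm / MM_PER_INCH, 1))   # 5.8 inches
print(round(high_mm / MM_PER_INCH, 1))  # 10.2 inches

# Implied average rate over an assumed 2020-2100 forecast window:
years = 2100 - 2020
print(low_mm / years, high_mm / years)  # roughly 1.9 to 3.2 mm per year
```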

**Conclusions**

In the United States we would call the AR6 attempt to convince us that the rate of GMSL rise is accelerating, using adjoining cherry-picked least squares lines, "high school," meaning unsophisticated. Their method is problematic because GMSL is heavily autocorrelated and non-stationary, rendering their cherry-picked least squares fits and least squares statistics invalid.

Our fit, using the R function arima, is at least statistically valid. We specifically corrected for autocorrelation and forced the series to be stationary. We also addressed the minor partial autocorrelation that was left at one quarter and three quarters. The residuals of our model passed both the overall Ljung-Box test and multiple-lag Ljung-Box tests for white noise, meaning the arima model properly captured the 140-year trend in the NOAA sea level data.

Thus, while AR6 cherry-picked periods to support their conclusion that GMSL is accelerating, we reached the opposite conclusion using all the data in a statistically valid way. This does not mean that our forecast is correct, but it does mean that the AR6 speculation that sea level might rise 5 meters by 2150 is extremely unlikely and is best characterized as irresponsible speculation. Our analysis found no statistical evidence of acceleration and produced a linear extrapolation.

While warming of Earth’s surface is clearly the reason land-based glaciers are melting, which does contribute to rising sea level, AR6 provides no evidence the warming is caused by human activities. They use models to infer humans caused it, but unfortunately their models are also not statistically valid as shown in Part 2, here, and by McKitrick and Christy (McKitrick & Christy, 2018). We can all agree that humans probably have some impact on atmospheric warming, but we do not know how much is caused by humans and how much is natural, because we are emerging from the unusually cold Little Ice Age—the “preindustrial” period. Further, as we saw in Part 2, the 30-year rates of sea level rise reveal a distinctly natural-looking oscillation. Glacial ice and ice sheet melting is likely responsible for most of sea level rise, as AR6 states, but the human fraction of that warming might be quite small.

Thus, from a purely statistical point of view, the AR6 claims are childishly invalid. A proper analysis of the data leads to a forecast of roughly 20 cm (~8 inches) of sea level rise by 2100. In the year 2100, our descendants will know who was right.

*The data and R code to create the figures in this chapter can be downloaded **here**. The R code and spreadsheet provide much more detail about the arima forecast, including references not supplied below.*

# Works Cited

McKitrick, R., & Christy, J. (2018). A Test of the Tropical 200- to 300-hPa Warming Rate in Climate Models. *Earth and Space Science, 5*(9), 529-536. Retrieved from https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018EA000401

Forecasts are difficult. Especially about the future.

Like deja vu all over again.

As I was schooling Bindidon, he put up an image of the best-fit quadratic; it's hilarious. Not only is the future prediction wrong, the hindcast is wrong. That fit is just random noise, and anyone who puts any faith in its prediction needs a lesson in basic science.

Instead of playing the big teacher, try to contribute in a meaningful way.

For example, by doing the same job as I did:

https://wattsupwiththat.com/2022/03/21/ar6-and-sea-level-part-2-the-complexity-of-measuring-gmsl/#comment-3482268

When you have done that with the same success, come back here and show us what it looks like.

It was not my idea to compare quadratic fits: I am an engineer, and no statistician, and hence can't contribute to this current discussion anymore. What Frederikse & alii and Dangendorf & alii have achieved with their studies is and remains what people like you have to be measured against.

You, LdB, are light years away from that.

I doubt you are an engineer; a student, maybe. But if you can't see the problem with the quadratic fit given, then you probably need to stop posting now, because you aren't helping, you are just looking more stupid.

The moment you start to fit anything, you need to stop and be critical, because it's easy to fool yourself, and there are lots of correlations that are nothing more than chance or come about through badly wrong statistical analysis. One of the first things to do is sanity checks: look at the forecast and hindcast and ask, does it make sense?

This crap fails both, so there's a problem. The issue is very straightforward, so see if you can work it out. Perhaps let's give you another point to consider: if you melted all the ice on Earth, sea level rises by 80-90 m from present and then can rise no further (it flat-lines) 🙂

My problem is that I do not know stat well enough to take this on anything but trust, but eyeballing the graph gives pretty much the same result.

Tom,

It usually does. Eyeballs are pretty good forecasting tools. The methods I used are correct and commonly used in financial and economic forecasting. This is one reason I do my own investing. I only rarely use these techniques in choosing investments, they are mathematically correct, but often wrong.

Eyeballs are good to two significant figures.

I know ARIMA inside out. AM did a masterful job of it. This is as theoretically as solid as anything in statistics can be.

I agree, the indicated, measured rise is a “natural-looking oscillation” as it appears the sea level is returning to previous normal (higher) levels achieved during the medieval warm period.

Harlech castle sea gate, built 1000 years ago.

Like so much climate related stuff, all up-side down

Must be Down Under…Or at least Wales

Ah, a blast from the past. I got a summa in economics as an undergrad, had my thesis accepted for the PhD, and had already passed all the general PhD exams except history of economics. Chose the joint JD/MBA program rather than finish the PhD, as there was nothing left to learn. My general interest was math modeling, so my economics degree was largely econometrics. ARIMA (auto regressive integrated moving average) is a BIG deal in econometrics.

Well done, Andy May. So much for IPCC AR6 ‘science’.

Thanks Rud, appreciate the kind words.

The stats seem pretty valid and far more rigorous than we typically see in models from climate ‘scientists’. Indeed, it never ceases to amaze me how any so-called ‘scientists’, from climate disciplines or otherwise, can claim the climate models are remotely accurate given the number of variables involved for what is very much a non-deterministic system. It is, quite frankly, farcical.

The conclusion also points out that the author’s forecast might not be correct, which anyone who knows anything about modelling will understand is very sensible. The key point of this analysis, however, is to test the predictions of AR6; and nobody should be surprised that the AR6 ‘findings’ are nonsensical.

It’s a dreadful reflection on the state of ‘science’ in today’s world that this type of analysis should even be necessary, but it is.

Mark, I agree. I should not have had to write this. It should be obvious, and it clearly is to many.

Andy, always keep in mind it was necessary for you to write this. People like me don’t know these things, we deserve to know them but info like this will never be given to us unless people like you do it. Well done.

Thanks Bob, much appreciated.

What I just cannot understand, Andy, is why statisticians don’t question these things? I know that many academics are afraid they’ll lose their careers if they dare to challenge the climate religion, but the statistics involved here are basic. As you say, it’s High-School stuff.

How can any statistician with any degree of ethics or principles just let this stuff go without challenging it? The basis on which climate models are built means their predictions have pretty much zero statistical validity.

Saying that isn’t even a criticism of the models. It’s just a basic function of numerous variables, many being colinear, for a system that is inherently non-deterministic.

The “king has no clothes”. It’s a crazy situation.

Andy May, The Woodlands as in Texas? If so, I can recall playing golf there many years (decades) ago, but I don’t remember what course. But I can remember playing the Tour 18 in nearby Humble.

Regards,

Bob

Hi Bob. Tour 18 is fun and very popular. I haven’t played there in many years though. I play the Woodlands courses mostly, Tournament, Player, Oaks, and Panther Trail. Rarely I will play at Palmer.

It’s interesting number crunching…. but really all we need to know is that we should check how much safety factor the engineers built into the sea walls…you know, above the highest expected storm surge, and probably build them a foot higher, heck lets make it 2 ft, over the next century…..

Agreed. With a fraction of what we are paying for climate modeling and unreliable energy we could shore up all our storm defenses and be done with it!

Yes! And keep in mind that sea wall design will include maximum expected storm-driven wave height and surge, plus a safety factor. SLR may mean a bit more spray coming over a sea wall in bad weather, but is unlikely to result in serious flooding anywhere with properly designed seawalls.

Example- Melbourne, Australia

Highest astronomical tide HAS 0.6m ( theoretical still water height)

HAS + normal storm surge 1.1m

Highest ever recorded height 1.4m ( 1934)

100 year return event used for planning purposes with safety margin 1.6m.

Maximum allowable floor level for all New dwellings 2.2m.

So in Melbourne the lowest house floor level is 800mm above the highest ever recorded tide.

While there is no doubt historic homes may be low lying, new homes already have more than a century's buffer to SLR.

Some quotes that I can understand and/or offer some critique of:

"Modern statistical tools allow us to forecast time series, like GMSL (global mean sea level) change, in a more valid and sophisticated way than simply comparing cherry-picked least squares fits as the IPCC does in AR6." "Autocorrelation is important to consider in least squares regression, especially when forecasting time series, but routinely ignored by the IPCC."

The question to ask is: why don't the IPCC scientists use up-to-date, sophisticated methods?

________________________________________________________________

"This does not mean that our forecast is correct, but it does mean that the AR6 speculation that sea level might rise 5 meters by 2150 is extremely unlikely and is best characterized as irresponsible speculation."

It's more than irresponsible; they are obviously producing propaganda. You can't prove that, so you didn't say that, but the "Duck Test" says so.

I didn’t realize they said 5 meters by 2150. And if that’s so, it’s obvious that they’ve moved out 50 years past 2100 which has been the standard target date for the previous IPCC assessment reports.

________________________________________________________________

"While warming of Earth's surface is clearly the reason land-based glaciers are melting, … Glacial ice and ice sheet melting is likely responsible for most of sea level rise, as AR6 states, …"

I take issue with the notion that the ice sheets (Greenland & Antarctica) are melting. There is some surface melting during several weeks in the summer, but the calving of icebergs occurs year round, or at least I think it does. I've never confirmed that though.

But the water has to be coming from somewhere, and the ice caps are a good bet, but how much ice is gained or lost from the ice caps is a function of snow fall and calving of icebergs. Temperature on the ice caps that is well below freezing nearly everywhere nearly all of the time, has nothing to do with it. (In my opinion)

Six to ten inches of sea level rise by 2100 is very close to what I’ve always come up with after fooling around with PSMSL data. Thanks for confirming that for me.

If you can get an answer to this, let me know too. I have no idea. The statistical methods I used to make this post are well known and have been around a long time.

Page SPM-28, AR6. [You knew it was in the SPM, didn’t you?]

Here is the full quote:

This is an expert lie; notice the clever use of "low confidence" and "cannot be ruled out."

Our tax dollars at work: Leftist propaganda.

Sometimes people don’t seek answers to questions that they would rather not know.

In any case, Ph.D. atmospheric chemists usually only have one semester of graduate level statistics and the rest is on the job training. In short, the training is inadequate and staff statisticians, like good service providers, produce the desired result.

My immunology PhC son was told by his college advisor, when he was starting his microbiology degree, not to worry about taking any math courses. If he needed an experiment analyzed just find a math major to do it for him. Thankfully he listened to me and took several math courses including some advanced mathematics. He does all his own work now and does it well. I can’t tell you how scary it is in that field because of the inability of so many to make even basic judgements on their data and what it means.

It’s really the blind leading the blind. The scientists can’t understand their data from a statistical analysis basis and the statisticians can’t understand the data from a real world biology basis. And people wonder why so much of experimental science today can’t be duplicated. It’s so much like climate science it’s unbelievable.

Tim G,

We helped our 2 sons through University, one with Surveying and its heavy math, the other with Commerce hence econometrics. It turned out to be a useful combo for finding some of the many problems in climate research. Geoff S

A real statistician, especially one with a finance background is like gold. They live and die by their predictions and know this stuff inside and out. Every branch of science should pay attention to them, especially meteorologists and climate scientists – but they don’t. Climate scientists know everything, why should they listen to anyone else, right?

Read Andrew Montford's "The Hockey Stick Illusion:" CliSciFi is worse than you thought.

Sissor

“… the training is inadequate and staff statisticians, like good service providers, produce the desired result.”

I guess it depends on where you worked. At the place I worked, the engineers had mostly B.S. and M.S. degrees, our three staff statisticians were PhDs. Their goal seemed to be to keep the engineers out of trouble. I remember two pieces of advice I received early in my employment. The first: “Too many engineers think statistics is a black box into which you can pour bad data and crank out good answers. IT IS NOT!” The second was: “Next time come to me BEFORE you write the test plan.”

I am indebted to these three for keeping me out of trouble for all of my 30 year career.

Does the IPCC have scientists or merely political writers? More to the point, do they do any statistical analysis or do they just report on the statistics of journal papers they have selected as working material?

I do recall a position paper a few years ago, probably featured here, wherein the National Academy of Sciences acknowledged the woeful state of statistical analysis in "climate" papers and strongly suggested that any 'climate scientists' using statistics to analyze or explain their work co-author with a real, card-carrying statistician. Of course that would never do in practice, for obvious enough reasons, and so has been ignored.

My understanding is that the IPCC has access to (climate) scientists and their work, in addition to their "inter-governmental" (the "I" in "IPCC" …) political appointees. Most problems seem to arise because the "political writers" have priority, AKA the "final editing" rights, not the scientists.

Partially true, Mark: The political types appoint the Authors and Lead Authors (supposedly scientists) that pick and choose which scientific documentation to include in various chapters and sections. During development of the UN IPCC Third Assessment Report, for the paleoclimate section they picked Michael E. Mann, who had just completed his PhD, a unique political pick.

Lead Authors have absolute control of what goes into their section of the reports. Mann selected his MBH98 “Hockey Stick” study as the premier study to be used in his section, elevated above all the rest. The Leftist Sir John Houghton, then IPCC head, elevated the Hockey Stick graph as the grand representation of the whole report and spread it all over the world. Al Gore picked up on it for his “An Inconvenient Truth” movie and the paleo climatological community has been politically corrupted ever since.

My experience means I think of "The Chapter 8 Controversy" for the SAR (in 1995) as the first public display of this modus operandi.

It turns out a post titled "Can We 'Trust the Science'?" has just been posted here at WUWT that includes the following:

The Santer and Mann examples show just how fuzzy the line between “political writers” and “scientists” can get.

Oh yes, the outlying forecasts presume utter doom, 12 feet, 20 feet, 30 feet. This is all predicated on total ice sheet collapse because occasionally a piece of an iceberg breaks off.

Hmmm. Since NOAA and the GISS have obviously been adjusting their temperatures, both historic and present, to match the rise in atmospheric CO2 I wonder just how cold the past will have to be and how hot it will have to be by 2091 if they continue that practice.

Mark Steyn famously said:

"How are we supposed to have confidence in what the temperature will be in 2100 when we don't know what it WAS in 1950!!"

Outstanding. A little technical, but that was needed to show how dishonest and corrupt the AR6 message is and to show how proper work should be done. I salute you.

Andy,

Good job. I hope someone in the government reads this. But I’m not going to hold my breath waiting.

Quite a number of problems in the application of ARIMA here. The ACF and the ADF test suggest that the sea level is driven by a stochastic trend. The only way to forecast a stochastic trend variable is to find another stochastic trend variable that cointegrates with the sea level. For this ARIMA model the PACF suggests p = 2, not q = 2. And you must remove the stochastic trend by taking the first difference in order to determine p and q correctly. When you use the level of the series you cannot see the order of p and q because it will be completely covered by the stochastic trend. This work is hastily done and not correct. Remove the post and redo it correctly. However, ARIMA is only good for short-term forecasts and in this case only for changes in the dependent variable.

Been a million years, but would agree that the cut-off in the PACF after two lags implies p=2, hence AR(2). Not sure if Andy tested the actual or differenced series, but if the latter, would suggest ARIMA(2,1,0). We were always advised to try several models and to go with the most parsimonious model that resulted in iid residuals. It's important to remember that ARIMA is a 'black box' approach to forecasting – there's no 'causality'.

Fredrik and Frank,

You are both making several assumptions that could easily be checked by downloading the R code and doing the analysis yourself. Until you've done that, I've nothing to say. I'm pretty sure I did it correctly. I would be willing to wager that you will get the same answer I did.

The reason for posting the code and data is to answer exactly the questions you are asking.

Ouch! Great reply, as I am reasonably sure what you carefully explained is exactly correct ARIMA.

Andy,

You’re a lot smarter than I am, and like I said, it’s been a million years, so I’ll just quote from ‘Forecasting: Methods and Applications’, Makridakis, Wheelwright and McGee, 1983:

Page 380 –

“Trends of any kind result in positive autocorrelations that dominate the autocorrelation diagram, and…it is important to remove the non-stationarity before proceeding with the time-series model building.”

I assume your spectra in Figure 4 were run on the differenced (stationary) data. If so, then –

Page 375 –

“In summary when there are only p partial autocorrelations that are significantly different from zero, the process is assumed to be an AR(p).”

If your diagnostic R spectra were run on the original (non-stationary) data series, then I would have to agree with Fredrik.

The data used to make Figure 4 were stationary, as clearly stated in the post.

The next step was Figure 4. Remember AR does not equal arima, they are two different things. If you have any further questions, download the R code and run it yourself step-by-step. Otherwise, we could go around in circles for days. Running it would take you less time than it has taken for me to write this reply.

Okay. I'm not an R maven, so just looking at your code (copied below). I see that both 'acf2' (the function used to produce the ACF and PACF diagnostics shown in Figure 4) as well as the 'auto.arima' function are run on a file named MSL. Since you previously created another file named MSL_diff from MSL using R's diff() function, I'm assuming that MSL is not differenced and, therefore, is non-stationary. If that's correct, I go back to the quote from Makridakis et al. (page 380) I provided above.

```r
# Build arima model, need parameters (p,d,q)
# p is the autocorrelation term, after removing autocorrelation, which is high
# are the residuals white noise?
acf2(MSL, main="Autocorrelation and Partial Autocorrelation plots of Mean Sea Level")
#
# sarima takes three numbers, p,d,q
# p are the number of significant acf lags, here p=1 or greater, 1 is sufficient
# d is the number of differences required for stationarity, d=1
# q is the number of significant pacf lags, that is the lags after
# the first order acf lags are removed. q is sometimes called the MA term,
# or the moving average term. here q=2 (lags=1 and 3)
#
# Let's check if auto.arima gets the same answer
MSL_fit = auto.arima(MSL, approximation=FALSE, trace=FALSE)
summary(MSL_fit)
```

Frank and Fredrik,

Thanks for downloading the code! That allows us to discuss the details, even if you do not have R on your computer. I tried to keep the post as brief as possible, so the code has the details.

Remember we are talking about arima, NOT ARMA or AR, which are different.

Figure 2 shows that one difference (lag) is highly correlated with GMSL, so we need d=1 to reach stationarity. Figure 3 shows that d=1 is sufficient for stationarity.

Figure 4 uses the original dataset: MSL, not MSL_diff which is lag=1. It is the function acf2 you list, line 172 in the R code.

The ACF plot shows all lags are significant, but p cannot be that high; we need the minimum p that works, which in this case was 1, found by trial and error. Fredrik wants to use p=2 due to the PACF plot. This is possible, but the results are not as good as p=1, see the attached plot and compare to Figure 3, which is p=1. The ACF is the same, and the Ljung-Box stats are worse when p=2, especially for the critical short lags. Besides, I believe in using the smallest p that works.

The PACF plot fairly clearly points to q=2. Also, auto.arima selects q=2, which is the clincher here.

This gives us p=1, d=1, and q=2 for sarima.

auto.arima derives the same values. auto.arima uses various statistics to derive the parameters, mainly the Hyndman-Khandakar algorithm, and it works well in simple cases like this one.

This is a good reference that explains most of the methodology I used:

8.7 ARIMA modelling in R | Forecasting: Principles and Practice (2nd ed) (otexts.com)
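For readers without R, the Ljung-Box statistic referred to in this thread is straightforward to compute by hand. This Python sketch (an illustration, not Andy's code) implements Q = n(n+2) * sum over k=1..m of rho_k^2 / (n-k), where rho_k is the sample autocorrelation at lag k:

```python
def sample_acf(x, k):
    """Sample autocorrelation of series x at lag k."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    ck = sum((x[i] - m) * (x[i + k] - m) for i in range(n - k))
    return ck / c0

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag. A large Q (compared to a
    chi-squared distribution with max_lag degrees of freedom) rejects the
    hypothesis that x is white noise."""
    n = len(x)
    return n * (n + 2) * sum(sample_acf(x, k) ** 2 / (n - k)
                             for k in range(1, max_lag + 1))

# Tiny worked example on a short trending series:
x = [1, 2, 3, 4, 5]
print(sample_acf(x, 1))   # ≈ 0.4
print(ljung_box_q(x, 1))  # ≈ 1.4, i.e. 5 * 7 * 0.4**2 / 4
```

In a residual check like the one described above, x would be the arima model residuals, and a small Q at the short lags is what indicates the model has captured the autocorrelation.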

Andy,

You’ve been very kind and patient in your correspondence, as usual, so will let you go off to more important things – like golf! I also note that there’s been some heavy weather down your way, and hope that you’ve been spared.

I would, however, like to make a few points in passing:

First, as you indicate above, the ACF and PACF plots in Figure 4 were run on the undifferenced, non-stationary data, and are therefore not useful in positing possible models. If the data were stationary, then plots for an ARMA(p,q) would have shown both the ACF and the PACF plots ‘tailing off’.

The fact that you obtained a similar form of the model from auto.arima, which, if I correctly understand from the paper you graciously attached, initially performs any differencing required to achieve stationarity, seems coincidental to me.

Second, the main reason I questioned the model selection process was that a plot of the raw data (MSL) shows that the series is not exactly linear and, in fact, the slope seems to be increasing with level.

This is confirmed by the plot of the differenced series, which is not stationary in the variance, so maybe using a different functional form to transform the data would have been better.

The end result of both these points is a linear forecast whose slope appears to be visually lower than that of any sub-segment of the original data.

And finally, I’m no expert, and it’s been a while since I’ve done any such modeling, but I do recall being taught the importance of selecting models using a subset of the data, say, 80%, and then comparing the forecast of the last 20% to the actual data to see how each model fared.

The ‘best’ model would then be re-estimated using the entire data set, and only then used to forecast beyond the existing data. As an aside, I don’t see any evidence that many climate scientists of the alarmist persuasion actually do this, regardless of what forecasting tools they are using.

Anyways, that’s just my two cents, and, again, thanks for your patience and kind consideration.

Frank,

Thank you for explaining your concerns. I didn’t understand what you meant before. It seems this was your main concern:

Attached is a plot of ACF and PACF for MSL_diff, the first difference of GMSL, which is stationary.

It suggests, as we previously determined, that p=2 and q=2. But p=1 turned out to be better. Both work OK though.

Thanks Andy. You’re correct that I was uncomfortable trying to diagnose the undifferenced data series. Based on the new plots, I would have started with ARIMA(1,1,1), but would also have tried others to see if I could improve on the residuals. So it’s possible we could have come out at the same place.

It is not a coincidence and, yes, auto.arima must be fed the actual series, not the first difference. Also, I prefer to use the ACF and PACF of the time series itself, rather than of the first difference, which is obviously what you were taught. I'll have to think about that; perhaps in the future I will do both. Either methodology works, and in this case the two methodologies very quickly get to the right answer. I reached p=1 first, whereas your method got p=2 first and then p=1 (the correct answer) the second time around.

The auto.arima function looks like a real time saver, compared to what we had to go through in the old SCA days. Obviously, a good place to start, although I would remain wary of trusting the software too much, as I’ve seen automatic regression packages that will just start adding variables to maximize the ‘fit’ while resulting in serious multi-collinearity issues.

You are correct, that would be the next step and in a real study it would be required. So, I will do that next. You are the second person to ask for this. The point of the post was only to show how primitive AR6’s justification for acceleration was, but one more step won’t take long.

Agreed and thanks!

We are not making assumptions. We simply point out some methodological flaws. Thank you for sharing the data. I will get back and tell you what it says. In the meanwhile I recommend Terence Mills’s excellent introduction to the topic. https://www.thegwpf.org/content/uploads/2016/02/Forecasting-3.pdf

Fredrik,

Thanks for the reference, I will read it. I don’t think there are any “methodological flaws.” If after reading the code, and hopefully running it in R, you still think there are flaws, point them out here and refer to the code. As I discuss above, p=2 works OK, but the Ljung-Box statistics are not as good as they are with p=1; further, auto.arima picks p=1 also. q is fairly clearly 2.
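For anyone unfamiliar with the Ljung-Box statistic mentioned here: it sums squared residual autocorrelations across a set of lags and compares the total to a chi-square distribution; a small value means the residuals look like white noise. A minimal pure-Python sketch on simulated residuals (not the post's R code):

```python
# Ljung-Box portmanteau statistic on a residual series. Q near or above
# the chi-square critical value flags leftover autocorrelation, i.e. a
# model that has not captured all the structure in the data.
import random

def ljung_box_q(resid, nlags):
    """Q = n(n+2) * sum over k of rho_k^2 / (n - k)."""
    n = len(resid)
    rbar = sum(resid) / n
    c0 = sum((v - rbar) ** 2 for v in resid) / n
    q = 0.0
    for k in range(1, nlags + 1):
        ck = sum((resid[t] - rbar) * (resid[t - k] - rbar) for t in range(k, n)) / n
        q += (ck / c0) ** 2 / (n - k)
    return n * (n + 2) * q

random.seed(7)
white = [random.gauss(0, 1) for _ in range(1000)]
# For white-noise residuals, Q at 10 lags is typically below the 95%
# chi-square critical value of ~18.3 (10 degrees of freedom).
print(ljung_box_q(white, 10))
```

Autocorrelated "residuals" (e.g. a strong AR(1) series fed in directly) give a Q orders of magnitude larger, which is the failure the diagnostic is designed to catch.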

It was done correctly. See below where I tried p=2. p=2 is OK, just not as good as p=1. I agree that sea level is a stochastic trend.

This is true in real life. But the point of this exercise was to show the cherry-picked OLS line comparisons in AR6 are juvenile and invalid. This was meant to show the proper way to forecast GMSL using statistics only.

“But the point of this exercise was to show the cherry-picked OLS line comparisons in AR6 are juvenile and invalid.”

But your final point was to minimize the now-to-2100 increase in sea level, with an expected value of ~20 cm, as inconsequential. That is why I was interested in a similar evaluation of a few more recent time periods, to see those other 2100 ranges.

FYI, no gotcha intended. I think you can make the case that even larger increases are much less damaging to lower income levels and more remediable generally than other AGW consequences. I even vaguely remember Nick Stokes writing something similar. I am just interested in the more relevant evaluations….

How do we know the acceleration or deceleration of the glacial rebound or subsidence of individual locations?

There could have easily been periods of higher SLR rate pre 1900.

There were many times when it was much higher, see post #1, Figure 2.

AR6 and Sea Level Rise, Part 1 – Andy May Petrophysicist

Statistics will not provide the answer to any scientific question; they only tell you how much trust you can have in a certain answer based on probability analysis.

But having a lot of statistical trust does not mean it is correct, nor does having very little trust mean it is incorrect. Deep knowledge of a process can beat statistical analysis.

As an example, in 2016 I was among several people who noticed that despite the 2012 low in Arctic sea ice, the trend in September sea-ice extent was no longer down. Knowledge of multidecadal variability led me to understand that a climate shift had taken place and we should not expect a continuation of the previous worrisome (to some) decreasing trend. I published an article here at WUWT communicating that Arctic sea ice had turned a corner. Tamino (a statistician) used statistics to say I was wrong. Well, six years later I am still right, and the statistics are starting to show the change of trend I was able to spot based on knowledge.

The lesson is that statistics is not the final arbiter of scientific questions, just a tool to make better decisions.

I had forgotten that post, thanks for reminding me. I definitely agree with you on statistics. We should all know the rules but be skeptical and practical about it.

The joke back then was that consulting three econometricians will result in six different opinions, each ‘on the one hand, but then on the other hand’, since each has different hands.

That is why Truman said he wanted a one-armed economist, because they always came in and, after their briefing, said “on the other hand…”

Sounds very much like the UN are angling for more money to ‘combat accelerating sea level rise’ to me.

“On March 22, the newest U.S.-European sea level satellite, named Sentinel-6 Michael Freilich, became the official reference satellite for global sea level measurements. This means that sea surface height data collected by other satellites will be compared to the information produced by Sentinel-6 Michael Freilich to ensure their accuracy.”

So, a new satellite from today…

https://sealevel.nasa.gov/news/233/international-sea-level-satellite-takes-over-from-predecessor/

FYI, I get all of the “cherry picked” periods in your first post. They all improve their R^2 values when acceleration is included.

Using normal OLS for linear and accelerating trends, and all of the data, I get changes from 2022.25 to 2100 of ~128 and 242 mm respectively. So, your number is reasonable. I also understand how autocorrelation must be accounted for. I can’t replicate your plots, but what do you get if you plot more recent periods, using the same process? Say 1950–present, 1960–present, 1970–present, 1980–present. These are the relevant time periods w.r.t. AGW emissions. When you do normal OLS, with acceleration considered, the difference between now and 2100 increases between trends calculated before and after 1950, 1960, 1970, and 1980.

I know it’s a lot of work (that I can’t replicate) but I wonder if your process shows anything different.

Also, your trend seems linear, except for some possible day-of-month departures. Why is it that its linear R^2 for the forecast period of 1900–late 2020 is ~0.983, while the R^2 for a normal acceleration treatment is a slightly larger 0.987? Not challenging your process, just curious.

bigoilbob,

The process you describe is taken care of, and much more, by arima. Arima checks for curves as well as linear processes. What this process tells you is that there is no improvement in the fit with a curve, at least a power or quadratic curve. The arima process is not perfect, but since the original fit was 97%, there is not much chance of improving the fit with a curve anyway. Visually you can see what arima is saying.
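To illustrate the curvature point with a toy example (made-up data, not the NOAA series, and ordinary least squares rather than arima): when a linear fit is already very tight, adding a quadratic term barely moves R².

```python
# Compare R^2 for a straight line vs. a quadratic on the same series,
# both fit by OLS via the normal equations.
import random

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def r_squared(y, fitted):
    ybar = sum(y) / len(y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - ybar) ** 2 for v in y)
    return 1 - ss_res / ss_tot

def poly_fit_r2(y, degree):
    """OLS polynomial fit (degree 1 or 2) against the index; returns R^2."""
    xs = list(range(len(y)))
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(3)] for i in range(3)]
    b = [float(sum((x ** i) * v for x, v in zip(xs, y))) for i in range(3)]
    if degree == 1:  # force the quadratic coefficient to zero
        a[2] = [0.0, 0.0, 1.0]
        a[0][2] = a[1][2] = 0.0
        b[2] = 0.0
    c = solve3(a, b)
    fitted = [c[0] + c[1] * x + c[2] * x * x for x in xs]
    return r_squared(y, fitted)

# Linear trend plus noise: the two R^2 values are essentially identical,
# i.e. the curvature term adds almost nothing to an already tight fit.
random.seed(0)
y = [2.0 * t + random.gauss(0, 5) for t in range(200)]
print(poly_fit_r2(y, 1), poly_fit_r2(y, 2))
```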

Fair enough. The difference I found was admittedly tiny.

Now, per my comment above that, what about performing the same eval on the more recent data? The data that is more likely to be influenced by increasing [CO2] and other GHG’s.

I am curious about the effect of homogenization on autocorrelation within climate signals like sea level and temperature. Do the kinds of homogenization procedures employed by the climate modelers tend to increase autocorrelation, decrease it, or have an unpredictable effect? I would think they might increase it, thus amplifying the kinds of effects that you are trying to expose and reject.

Homogenization is used for temperature, not sea level. You correctly intuit that it will increase general autocorrelation.

They increase it. All smoothing or homogenization functions increase it.

Are any of your graphs broken down into regions? Because there are some places, like the Chesapeake Bay area, where the MSL is rising, mostly due to subsidence from what I’ve read. These places do need to invest in the appropriate infrastructure.

While I appreciate all your hard work in showing that GMSL is not accelerating, I find that global averages and anomalies from some arbitrary zero point just give in to the narrative that there is some climate utopia point.

Nope, I just worked with the data shown. This was a purely statistical exercise. I did it to make a point about the AR6 methods.

Very interesting Andy, but I’m a very poorly educated bloke who isn’t very good at stats or maths.

But here’s a question for Andy or Rud or anyone else to ponder. It took fully evolved humans 200K years to reach 1 billion global population by about 1800, roughly the start of the Industrial Revolution, so why (how?) did it take just another 127 years (1927) to reach 2 billion, and then just another 33 years to reach 3 billion?

In 1970 the global population was about 3.7 billion and today about 7.9 billion, so more than doubling in about 50 years.

So how can this occur so quickly when the Biden donkey etc have told us repeatedly that we’re facing an EXISTENTIAL threat?

BTW for the first 200K years Human life expectancy was under 40 and today is about 73 and much higher in wealthy OECD countries.

So how is our climate so dangerous today when all the real data proves they’re wrong? Africa (53 countries) is even more extreme: since 1970, when its population was just 363 million, it has grown to about 1,400 million today, and life expectancy was about 46 years then and is now about 64 years.

And Africa has experienced the misery of HIV/AIDs over the last 50 years as well. Anyway I’d like someone to point out why we’re committed to wasting endless TRILLIONs $ on this so called EXISTENTIAL threat when we can find so little evidence for it?

Because the things they say are problems are just the smoke screen they are using to gain control.

It boggles the mind how an “amateur” working solo in his spare time can provide more convincing, cogent argumentation than a raft of taxpayer-paid “professionals”.

Andy,

1. About 7 years ago I did similar autocorrelation studies on annual temperatures for several Australian cities including Melbourne. The results looked broadly similar to yours on sea level. Given that it is plausible that one affects the other (in part) there could be scope for stats nerds to look deeper.

2. To round off your present work, why not use the first half of your data to forecast the second half, then compare to measured?

3. Why do you tack your forecast onto your last observation, which has a chance of being an outlier? It might be better to tack it onto the average over the last 10 years.

4. Are you comfortable that the NOAA data you use in Figure 1 are reliable? It looks more curved than most individual tide gauge results I have studied. Good work like yours might entrench the curve, which would be a pity if it has been through an Establishment process with a purpose. Geoff S

All very good points Geoff. As your homework for tonight, I expect you to download the R code and perform the exercises you outline. Post the results for my review tomorrow my time. As for #2, I would expect a projection within the margin of error.

Choosing the NOAA data was arbitrary. I would expect very similar results with all the datasets, they are all pretty much the same. The differences are minor.

Andy,

Sadly I have an impediment that prevents me from using R, so I place a lot of value on the work of people like you. (My start in multivariate stats was with pencil, paper and eraser using Fisher’s method of analysis of variance.)

My comments were intended to be helpful, anticipating objections that readers might like settled before they understand. No way did I mean to criticize. Regards Geoff S

Throwing a bit of fat on the fire: the American Congress on Surveying and Mapping published a bulletin in December 2008 explaining sea level change.

Link to PDF here: https://tidesandcurrents.noaa.gov/publications/Understanding_Sea_Level_Change.pdf

Note the ~19-year periodicity in sea level caused by the lunar Metonic Cycle.

WUWT 2017 reference:

https://wattsupwiththat.com/2017/02/07/even-more-on-the-david-rose-bombshell-article-how-noaa-software-spins-the-agw-game/

My comment at WUWT:

Neil Jordan

Reply to

Willis Eschenbach

February 7, 2017 6:27 pm

Let me add to what the surveyors have written. It gets even better (or worse) when the cooked temperature is used to generate an also-cooked future sea level to establish a coastal boundary. ACSM BULLETIN December 2008 covers how it should be done, averaging over the previous 19-year tidal epoch.

https://www.tidesandcurrents.noaa.gov/…/Understanding_Sea_Level_Change.pdf

Note that sea level has a periodic component from the ~19-year lunar/solar Metonic cycle that the local coastal agency attributed to wind-driven upwelling. As a result of this and other shaky reasoning, there are three different sea level rises for 2100 at the same location, 11″, 42″, and 66″.

My understanding was that the purpose of autocorrelation is precisely to detect these hidden signals in the data. Detecting this Metonic cycle in the data would actually be a valid use of this technique. But I am not sure if Andy’s work here is.

Neil and kzb,

Correct, this methodology can be used to look for specific suspected periodic components and tell you how significant they are. But it is a lot of work and involves much more than what I did here.

Ferdinand,

All of the papers I have read on the derivation of Henry’s Law factors were done in a constrained lab setting. The water in contact with the air held only so much CO2, the volumes were small, and a steady state was soon reached.

In the open oceans, of course, the sea surface is constantly replenished with water that can have various levels of CO2 depending on factors like how old the water was when it reached the atmosphere interface. Much more dynamic and longer term concepts than the lab work.

Ferdinand, can you refer me to any Henry’s Law work that is studied in experiments closer to the natural case? I have searched and not found.

Regards Geoff S

As a non-specialist I don’t know if a technique used in econometrics is appropriate for this use. It seems to me that every simple relation in science could be said to be autocorrelated if this is true.

Switch on an immersion heater in a vessel of water and plot the temperature increase with time. That would be autocorrelated by the definition used here.

I was taught that the way to detect if a fitted equation is satisfactory is to plot the residuals.

Sounds pretty good to another non-specialist.

arima is a form of residual analysis. Simply plotting them is useful, but you can do a much more thorough analysis with arima.

So why have 80% of coral atoll islands grown in size over the last 40 years?

Of course the young Charles Darwin worked this out on his journey of discovery over 180 years ago.

And Prof Kench has been carrying out these studies and reporting on his findings for decades.

https://www.rnz.co.nz/national/programmes/saturday/audio/2018640643/climate-change-in-the-pacific-what-s-really-going-on

Sorry, 40% of islands have grown and 40% are stable.

Thanks

You did a good academic job, I guess.

But I still like to look at this graph for an acceleration check:

https://tidesandcurrents.noaa.gov/sltrends/sltrends_station.shtml?plot=50yr&id=140-012

Even if it rises an entire 12 inches by 2100, don’t you think we can adjust in 80 years? It’s not a disaster.

This post explains why the claims about rising Global Sea Levels are untrustworthy from a statistical standpoint. But meanwhile, the green PR scare machine uses webpages and desktop publishing tools to frighten people living or invested in coastal settlements. I recently did a tour of several US cities to illustrate how imaginary flooding is promoted far beyond anything appearing in tidal gauge records. For example, Boston:

Example of media warnings:

“Could it be the end of the Blue Line as we know it? The Blue Line, which features a mile-long tunnel that travels underwater and connects the North Shore with Boston’s downtown, is at risk as sea levels rise along Boston’s coast. To understand the threat sea-level rise poses to the Blue Line, and what that means for the rest of the city, we’re joined by WBUR reporter Simón Ríos and Julie Wormser, Deputy Director at the Mystic River Watershed Association. As sea levels continue to rise, the Blue Line and the whole MBTA system face an existential threat. The MBTA is also facing a serious financial crunch, still reeling from the pandemic, as we attempt to fully reopen the city and the region. Joining us to discuss is MBTA General Manager Steve Poftak.”

[Figure: the computer simulation of the future — Imaginary vs. Observed Sea Level Trends (2021 Update)]

Already the imaginary rises are diverging greatly from observations, yet the chorus of alarm goes on. In fact, the added rise to 2100 from tidal gauges ranges from 6 to 9.5 inches, except for Galveston, projecting 20.6 inches. Meanwhile models imagined rises from 69 to 108 inches. Clearly coastal settlements must adapt to evolving conditions, but they also need reasonable rather than fearful forecasts for planning purposes.

Seven US cities presented at

https://rclutz.com/2022/01/27/sea-level-scare-machine-2021-update/

Excellent analysis. Now try it on a tide-gauge-only dataset and a satellite-only dataset. I’m curious to see the result. The NOAA data you used appears to be the infamous hybrid of the two where sea level rise increases suddenly starting in the mid-1990’s when the tide gauge data is discarded and the satellite data is grafted in. There are two distinctly different, essentially linear, slopes in the second plot. If you draw a straight line from 7/22/1926 to 1/1/1995 you can easily see that the trend is linear. And if you draw a line from 1/1/1995 through the end in 2022 the trend is also linear, but at a steeper slope; satellite data (which has consistently shown a higher trend) grafted onto tide gauge data. The tide gauges never stopped measuring but they appear to have discarded the tide gauge in the 1990’s in favor of the higher-trending satellite data. I wonder why.

I’m sorry, but your analysis is not correct (about Andy May’s I can’t say anything, because I lack the math and stat skill to evaluate it – except that trying to eliminate autocorrelation through first differencing looks a bit strange to me).

“The NOAA data you used appears to be the infamous hybrid of the two where sea level rise increases suddenly starting in the mid-1990’s when the tide gauge data is discarded and the satellite data is grafted in.”

“The tide gauges never stopped measuring but they appear to have discarded the tide gauge in the 1990’s in favor of the higher-trending satellite data.”

Where do you have such strange allegations from?

*

Here you see four different evaluations of the PSMSL data, of which mine is the most simple one, because it is based only on tide gauge data, whereas Thomas Frederikse’s and Sönke Dangendorf’s are much more complex:

The red plot is the yearly averaging of the NOAA data made available by Andy May in a zip file.

*

The reason why you believe this 1990 story is that, like many others, you confound, in the tide gauge evaluations you read about, lifetime trends with satellite-era trends.

In the chart below you may compare, for all four evaluations above, consecutive linear trends spaced five years apart and all ending in 2015, i.e. from 1900–2015 through 1995–2015:

I hope you see now that it makes little sense to compare lifetime trends with sat-era trends when looking at tide gauges. In a previous sea level thread, I posted two links to uploaded pdf files, each containing, for over 300 gauges, lifetime and sat-era trends.

*

What makes me suspicious is not the NOAA data for the satellite period! It is the entire data; you see it when comparing the right plot with the rest. NOAA’s evaluation is way above all others (which keep pretty near each other, except a peak of Frederikse – blue plot – between 1980 and 1990).

My guess: the difference between NOAA and the rest might be due to their anomaly construction technique.

Links to the PSMSL tide gauge trend lists (generated during my ‘Bin’ evaluations)

Lifetime

https://drive.google.com/file/d/1jIAhx1OifHrLF4Pf5YUqCwRenw26Ev3u/view

Sat altimetry era

https://drive.google.com/file/d/19dXIBq8Q7_ZtQm_V7tfcAPCmvEvHiY1P/view

A second way to look at the data is to compute the trends for the four series in the inverse manner, i.e. with the start fixed instead of the end:

You can see that if you were right, the slope of the red plot would have been much steeper from 1995 on.

The difference between NOAA and the rest starts much earlier than the sat era; hence, in my opinion, it cannot be due to a mix of sat and gauge data in the final series.
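The fixed-end and fixed-start consecutive trends described above are easy to reproduce in outline. A pure-Python sketch on a made-up accelerating series (not the PSMSL evaluations):

```python
# Consecutive linear trends with one endpoint fixed: end-fixed
# (1900-2015, 1905-2015, ...) or start-fixed (1900-2015, 1900-2010, ...).
# Toy annual series below, indexed 0, 1, 2, ... rather than by year.

def trend(y):
    """OLS slope of y against its index (e.g. mm per year for annual data)."""
    n = len(y)
    xbar = (n - 1) / 2
    ybar = sum(y) / n
    sxy = sum((i - xbar) * (v - ybar) for i, v in enumerate(y))
    sxx = sum((i - xbar) ** 2 for i in range(n))
    return sxy / sxx

def end_fixed_trends(y, step=5, min_len=20):
    """Trends of y[k:] for k = 0, step, 2*step, ... (common end, later starts)."""
    return [trend(y[k:]) for k in range(0, len(y) - min_len + 1, step)]

def start_fixed_trends(y, step=5, min_len=20):
    """Trends of y[:m] for m = len(y), len(y)-step, ... (common start, earlier ends)."""
    return [trend(y[:m]) for m in range(len(y), min_len - 1, -step)]

# An accelerating series shows it in both views: end-fixed trends grow as
# the start moves later; start-fixed trends shrink as the end moves earlier.
y = [0.01 * t * t for t in range(120)]
print(end_fixed_trends(y)[:3])
print(start_fixed_trends(y)[:3])
```

A genuinely linear series would return essentially the same slope in every window, which is what makes these plots a useful visual test for acceleration.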

I anticipate the reaction:

“They made the past lower to get the present higher”

or so.

Here’s your hoped for response –

“They made the past lower to get the present higher”

Mainly because it’s true –

https://joannenova.com.au/2020/02/acorn-adjustments-robbed-marble-bar-of-its-legendary-world-record-death-valley-now-longest-hottest-place/

Thanks a lot. I’m aware of that Marble Bar story.

People, Andy May has just provided us with an excellent example of how peer review should work. As opposed to the paleo climatological club, Andy provided all his data and methods freely. When people questioned him, he politely answered in full, giving his reasoning. He directed people on different ways to look at the analyses and admitted to the limitations of his study.

We know damned good and well that none of the paleo climatological hockey stick studies revealed their data and methods, but actively fought people seeking such information. None of their studies have been subjected to rigorous statistical analyses and their pal reviewers did not look at nor analyze data and methodologies.

Andy,

Nice analysis of available data sets. This should be the P50 or best case for future forecasts.