- by Pat Frank
- “*A good emulator can mimic the output of the black box.*”

Last February 7, statistician Richard Booth, Ph.D. (hereinafter, Rich) posted a very long critique titled, *What do you mean by “mean”: an essay on black boxes, emulators, and uncertainty*, which is very critical of the GCM air temperature projection emulator in my paper. He was also very critical of the notion of predictive uncertainty itself.

This post critically assesses his criticism.

An aside before the main topic. In his critique, Rich made many of the same mistakes in physical error analysis as do climate modelers. I have described the incompetence of that guild at WUWT here and here.

Rich and climate modelers both describe the probability distribution of the output of a model of unknown physical competence and accuracy, as being identical to physical error and predictive reliability.

Their view is wrong.

Unknown physical competence and accuracy describes the current state of climate models (at least until recently; see Anagnostopoulos, et al. (2010), Lindzen & Choi (2011), Zanchettin, et al. (2017), and Loehle (2018)).

GCM climate hindcasts are not tests of accuracy, because GCMs are tuned to reproduce hindcast targets; see, for example, here, here, and here. Tests of GCMs against a past climate that they were tuned to reproduce are no indication of physical competence.

When a model is of unknown competence in physical accuracy, the statistical dispersion of its projective output cannot be a measure of physical error or of predictive reliability.

Ignorance of this problem entails the very basic scientific mistake that climate modelers evidently strongly embrace and that appears repeatedly in Rich’s essay. It reduces both contemporary climate modeling and Rich’s essay to scientific vacancy.

The correspondence of Rich’s work with that of climate modelers reiterates something I realized after much immersion in published climatology literature — that climate modeling is an exercise in statistical speculation. Papers on climate modeling are almost entirely statistical conjectures. Climate modeling plays with physical parameters but is not a branch of physics.

I believe this circumstance refutes the American Statistical Association’s statement that more statisticians should enter climatology. Climatology doesn’t need more statisticians because it already has far too many: the climate modelers who pretend at science. Consensus climatologists play at scienceness and can’t discern the difference between that and the real thing.

Climatology needs more scientists. Evidence suggests many of the good ones previously resident have been caused to flee.

Rich’s essay ran to 16 typescript pages and nearly 7000 words. My reply is even longer — 28 pages and nearly 9000 words. Followed by an 1800-word Appendix.

For those disinclined to go through the Full Tilt Boogie below, here is a short precis followed by a longer summary.

The very short take-home message: Rich’s entire analysis has no critical force.

A summary list of its problems:

1. Rich’s analysis shows no evidence of physical reasoning.

2. His proposed emulator is constitutively inapt and tendentious.

3. Its derivation is mathematically incoherent.

4. The derivation is dimensionally unsound, abuses operator algebra, and deploys unjustified assumptions.

5. Offsetting calibration errors are incorrectly and invariably claimed to promote predictive reliability.

6. The Stefan-Boltzmann equation is inverted.

7. Operators are improperly treated as coefficients.

8. Accuracy is repeatedly abused and ejected in favor of precision.

9. The GCM air temperature projection emulator (paper eqn. 1) is fatally confused with the error propagator (paper eqn. 5.2).

10. The analytical focus of my paper is fatally misconstrued to be model means.

11. The GCM air temperature projection emulator is wrongly described as used to fit GCM air temperature means.

12. The same emulator is falsely portrayed as unable to emulate GCM projection variability, despite 68 examples to the contrary.

13. A double irony is that Rich touted a superior emulator without ever displaying a single successful emulation of a GCM air temperature projection.

14. All the difficulties of measurement error and model error are assumed away (qualifying Rich to be a consensus climatologist).

15. Uncertainty statistics are wrongly and invariably asserted to be physical error or an interval of physical error.

16. Systematic error is falsely asserted as restricted to a fixed constant bias offset.

17. Uncertainty in temperature is falsely and invariably construed to be an actual physical temperature.

18. Error is invariably assumed, ad hoc and without empirical justification, to be a random variable.

19. The JCGM description of standard uncertainty variance is self-advantageously misconstrued.

20. The described use of rulers or thermometers is unrealistic.

21. Readers are advised to record and accept false precision.

A couple of preliminary instances that highlight the difference between statistical thinking and physical reasoning.

Rich wrote that, “*It may be objected that reality is not statistical, because it has a particular measured value. But that is only true after the fact, or as they say in the trade, a posteriori. Beforehand, a priori, reality is a statistical distribution of a random variable, whether the quantity be the landing face of the die I am about to throw or the global HadCRUT4 anomaly averaged across 2020.*”

Rich’s description of an a priori random variable status for some as-yet unmeasured state is wrong when the state of interest, though itself unknown, falls within a regime treated by physical theory, such as air temperature. Then the a priori meaning is not the statistical distribution of a random variable, but rather the unknown state of a deterministic system that includes uncontrolled but explicable physical effects.

Rich’s comment implied that a new aspect of physical reality is approached inductively, without any prior explanatory context. Science approaches a new aspect of physical reality deductively from a pre-existent physical theory. The prior explanatory context is always present. This inductive/deductive distinction marks a fundamental departure in modes of thinking. The first neither recognizes nor employs physical reasoning. The second does both.

Rich also wrote, “*It may also be objected that many black boxes, for example Global Circulation Models, are not statistical, because they follow a time evolution with deterministic physical equations. Nevertheless, the evolution depends on the initial state, and because climate is famously “chaotic”, tiny perturbations to that state, lead to sizeable divergence later. The chaotic system tends to revolve around a small number of attractors, and the breadth of orbits around each attractor can be studied by computer and matched to statistical distributions.*”

But this is not known to be true. On the one hand, an adequate physical theory of the climate is not available. This lack leaves GCMs as parameterized engineering models. They are capable only of statistical arrays of outputs. Arguing the centrality of statistics to climate models as a matter of principle begs the question of theory.

On the other hand, supposing a *small number of attractors* flies in the face of the known large number of disparate climate states spanning the entire variation between “snowball Earth” and hothouse Earth. And supposing those states can be studied by computer and expressed as statistical distributions again begs the question of physical theory. Lots of hand-waving, in other words.

Rich went on to write that the problem of climate could be approached as “*a probability distribution of a continuous real variable*.” But this assumes the behavior of the physical system is smoothly continuous. The many Dansgaard-Oeschger and Heinrich events are abrupt and discontinuous shifts of the terrestrial climate.

None of Rich’s statistical conjectures are constrained by known physics or by the behavior of physical reality. In other words, they display no evidence of physical reasoning.

**The Full-Tilt Boogie.**

In his Section B, Rich set up his analysis by defining three sources of result:

1. physical reality → X(t) (data)

2. black box model → M(t) (simulation of the X(t)-producing physical reality)

3. model emulator → W(t) (emulation of model M output)

__I. Problems with “Black Box and Emulator Theory” Section B__:

Rich’s model emulator W is composed to, “*estimate of the past black box values and to predict the black box output.*” That is, his emulator targets model output. It does not emulate the internal behavior or workings of the full model in some simpler way.

Its formal structure is given by his first equation:

W(t) = (1-a)W(t-1) + R₁(t) + R₂(t) + (-r)R₃(t), (1ʀ)

where W(t-1) is some initial value and W(t) is the final value after integer time-step ‘t.’ The equation number subscript “ʀ” designates Rich as the source.

As an aside here, it is not unfair to notice that despite its many manifestations and modalities, Rich’s superior GCM emulator is never once used to actually emulate an air temperature projection.

The eqn. 1ʀ emulator manifests persistence, which the GCM projection emulator in my paper does not. Rich began his analysis, then, with an analogical inconformity.

The factors in eqn. 1ʀ are described as: “*R₁(t) is to be the component which represents changes in major causal influences, such as the sun and carbon dioxide. R₂(t) is to be a component which represents a strong contribution with observably high variance, for example the Longwave Cloud Forcing (LCF). … R₃(t) is a putative component which is negatively correlated with R₂(t) with coefficient -r, with the potential (dependent on exact parameters) to mitigate the high variance of R₂(t).*”

The emulator coefficient on R₃(t), -r, is always negative. The R₃(t) itself is negatively correlated with R₂(t) so that R₃(t) offsets (reduces) the magnitude of R₂(t), and 0 ≤ a ≤ 1. The Rn(t) are defined as time-dependent random variables that add into (1-a)W(t-1).

The relative impact of each Rn on W(t-1) is R₁(t) > R₂(t) ≥ |rR₃(t)|.
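For concreteness, eqn. 1ʀ can be iterated directly. The sketch below is a minimal illustration, not Rich’s code: the Rn(t) are stand-in Gaussian draws chosen only to exercise the recursion, and all parameter values are arbitrary assumptions of mine.

```python
import random

def simulate_w(steps, a=0.3, r=0.5, w0=0.0, seed=1):
    """Iterate Rich's eqn 1R: W(t) = (1-a)W(t-1) + R1(t) + R2(t) - r*R3(t).
    The Rn are illustrative Gaussian draws, not Rich's actual definitions."""
    rng = random.Random(seed)
    w = w0
    path = [w]
    for _t in range(1, steps + 1):
        r1 = rng.gauss(0.1, 0.05)        # stand-in for major causal influences
        r2 = rng.gauss(0.0, 1.0)         # stand-in for a high-variance term (e.g. LCF)
        r3 = -r2 + rng.gauss(0.0, 0.1)   # negatively correlated with R2, as specified
        w = (1 - a) * w + r1 + r2 - r * r3
        path.append(w)
    return path

path = simulate_w(20)
```

Running it makes the persistence visible: each W(t) carries a (1-a) fraction of W(t-1) forward, which the GCM projection emulator in the paper does not do.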

__A problem with factor R₃(t)__:

The R₃(t) is given to be “*negatively correlated*” with R₂(t), “*to mitigate the high variance of R₂(t).*” However, factor R₃(t) is also multiplied by coefficient -r.

“*Negatively correlated*” refers to R₃(t). The ‘-r’ is an additional and separate conditional.

There are three cases governing the meaning of ‘negative correlation’ for R₃(t).

1) R₃(t) starts at zero and becomes increasingly negative as R₂(t) becomes increasingly positive.

or

2) R₃(t) starts positive and becomes smaller as R₂(t) becomes large, but remains greater than zero.

or

3) R₃(t) starts positive and becomes small as R₂(t) becomes large but can pass through zero into negative values.

If 1), then -rR₃(t) is positive and has the invariable effect of increasing R₂(t) — the opposite of what was intended.

If 2), then -rR₃(t) has a diminishing effect on R₂(t) as R₂(t) becomes larger — again opposite the desired effect.

If 3), then -rR₃(t) diminishes R₂(t) at low but increasing values of R₂(t), but increases R₂(t) as R₂(t) becomes large and R₃(t) passes into negative values. This is because -r(-R₃(t)) = rR₃(t). That is, the effect of R₃(t) on R₂(t) is concave upwards around zero (∪).

That is, none of the combinations of -r and negatively correlated R₃(t) has the desired effect on R₂(t). A consistently diminishing effect on R₂(t) is frustrated.

With negative coefficient -r, the R₃(t) term must be greater than zero and positively correlated with R₂(t) to diminish the contribution of R₂(t) at high values.
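The variance algebra behind this point can be checked numerically. In the sketch below (illustrative values of mine, not Rich’s), R₂ and R₃ are generated with a specified correlation ρ, and the variance of R₂ - rR₃ is estimated by Monte Carlo. Analytically that variance is s₂² + r²s₃² - 2rρs₂s₃, so for r > 0 a negative ρ inflates the combined variance while a positive ρ reduces it — consistent with the argument above.

```python
import random
import statistics

def var_combined(rho, r, s2=1.0, s3=1.0, n=100_000, seed=2):
    """Monte Carlo estimate of Var(R2 - r*R3) when corr(R2, R3) = rho."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        r2 = s2 * z1
        r3 = s3 * (rho * z1 + (1 - rho**2) ** 0.5 * z2)  # corr(r2, r3) = rho
        samples.append(r2 - r * r3)
    return statistics.pvariance(samples)

v_anti = var_combined(rho=-0.9, r=0.5)  # negatively correlated R3, as Rich specifies
v_pos = var_combined(rho=+0.9, r=0.5)   # positively correlated R3
# analytic values: 1.25 + 0.9 = 2.15 (anti) versus 1.25 - 0.9 = 0.35 (positive)
```

With the -r coefficient in place, only a positively correlated R₃ actually shrinks the combined variance.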

Curiously, Rich did not designate what X(t) actually is (perhaps air temperature?).

Nor did he describe what process the model M(t) simulates, nor what the emulator W(t) emulates. Rich’s emulator equation (1ʀ) is therefore completely arbitrary. It’s merely a formal construct that he likes, but lacks any topical relevance or analytical focus.

In strict contrast, my interest in emulation of GCMs was roused when I discovered in 2006 that GCM air temperature projections are linear extrapolations of GHG forcing. In December 2006, John A publicly posted that finding at Steve McIntyre’s Climate Audit site, here.

That is, I began my work after discovering evidence about the behavior of GCMs. Rich, on the other hand, launched his work after seeing my work and then inventing an emulator formalism without any empirical referent.

Lack of focus or relevance makes Rich’s emulator irrelevant to the GCM air temperature emulator in my paper, which was derived with direct reference to the observed behavior of GCMs.

I will show that the irrelevance remains true even after Rich, in his Section D, added my numbers to his invented emulator.

__A Diversion into Dimensional Analysis__:

Emulator 1ʀ is a sum. If, for example, W(t) represents one value of an emulated air temperature projection, then the units of W(t) must be, e.g., Celsius (C). Likewise, then, the dimensions of W(t-1), R₁(t), R₂(t), and -rR₃(t), must all be in units of C. Coefficients a and r must be dimensionless.
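The requirement that every term in a sum carry the same unit can be mechanized. Below is a minimal, hypothetical unit-tracking sketch (the `Quantity` class and its values are mine, for illustration only): adding a Celsius term to a raw Wm⁻² term fails, which is the dimensional objection in miniature.

```python
class Quantity:
    """Minimal unit-tracking value: addition demands matching units."""
    def __init__(self, value, unit):
        self.value, self.unit = value, unit

    def __add__(self, other):
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Quantity(self.value + other.value, self.unit)

w_prev = Quantity(14.0, "C")      # emulator state, in Celsius
r1_ok = Quantity(0.4, "C")        # consistent: already converted to Celsius
r1_bad = Quantity(0.4, "W/m^2")   # inconsistent: raw forcing

ok = w_prev + r1_ok               # fine: both terms are Celsius
try:
    w_prev + r1_bad               # mixed units: must fail
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

A real analysis would use a units library, but the point stands: eqn. 1ʀ is only coherent if every Rn(t) has already been converted into the unit of W(t).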

In his exposition, Rich designated his system as a time series, with t = time. However, his usage of ‘t’ is not uniform, and most often designates the integer step of the series. For example, ‘t’ is an integer in W(t-1) in equation 1ʀ, where it represents the time step prior to W(t).

__Continuing__:

From eqn. (1ʀ), for a time series i = 1→t and when W(t-1) = W(0) = constant, Rich presented his emulator generalization as:

W(t) = (1-a)ᵗW(0) + Σ(1-a)ᵗ⁻ⁱ[R₁(tᵢ) + R₂(tᵢ) - rR₃(tᵢ)], summed over i = 0 to t-1 (2ʀ)

Let’s see if that is correct. From eqn. 1ʀ:

W(t₁) = (1-a)W(0) + R₁(t₁) + R₂(t₁) - rR₃(t₁), (1ʀ1)

where the subscript on t indicates the integer step number.

W(t₂) = (1-a)W(t₁) + R₁(t₂) + R₂(t₂) - rR₃(t₂) (1ʀ2)

Substituting W(t1) into W(t2),

W(t₂) = (1-a)[(1-a)W(0) + R₁(t₁) + R₂(t₁) - rR₃(t₁)] + [R₁(t₂) + R₂(t₂) - rR₃(t₂)]

= (1-a)²W(0) + (1-a)[R₁(t₁) + R₂(t₁) - rR₃(t₁)] + (1-a)⁰[R₁(t₂) + R₂(t₂) - rR₃(t₂)]

(NB: (1-a)⁰ = 1, and is added for completeness)

Likewise, W(t₃) = (1-a){(1-a)²W(0) + (1-a)[R₁(t₁) + R₂(t₁) - rR₃(t₁)] + [R₁(t₂) + R₂(t₂) - rR₃(t₂)]} + (1-a)⁰[R₁(t₃) + R₂(t₃) - rR₃(t₃)]

= (1-a)³W(0) + (1-a)²[R₁(t₁) + R₂(t₁) - rR₃(t₁)] + (1-a)[R₁(t₂) + R₂(t₂) - rR₃(t₂)] + (1-a)⁰[R₁(t₃) + R₂(t₃) - rR₃(t₃)]

Generalizing:

W(tₜ) = (1-a)ᵗW(0) + Σ(1-a)ᵗ⁻ⁱ[R₁(tᵢ) + R₂(tᵢ) - rR₃(tᵢ)], summed over i = 1 to t (1)

Compare eqn. (1) to eqn. (2ʀ). They are not identical.

In generalized equation 1, when i = t = 1, W(tₜ) goes to W(t₁) = (1-a)W(0) + R₁(t₁) + R₂(t₁) - rR₃(t₁), as it should do.

However, Rich’s equation 2ʀ does not go to W(t₁) in the limiting case i = t = 1.

Instead 2ʀ becomes W(t₁) = (1-a)W(0) + (1-a)[R₁(0) + R₂(0) - rR₃(0)], which is not correct.

The R-factors should have their t₁ values, but do not. There are no Rn(0)’s because W(0) is an initial value that has no perturbations. Also, coefficient (1-a) should not multiply the Rn’s (look at eqn. 1ʀ).

So, equation 2ʀ is wrong. The 1ʀ→2ʀ transition is mathematically incoherent.
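The correct generalization can be verified numerically: iterate eqn. 1ʀ directly, then compare with the closed form (1-a)ᵗW(0) + Σ(1-a)ᵗ⁻ⁱR(tᵢ), where R(tᵢ) lumps together R₁ + R₂ - rR₃ at step i. A sketch with arbitrary illustrative numbers of mine:

```python
def iterate(w0, a, r_seq):
    """Direct iteration of eqn 1R, with the combined innovation R(t_i) = R1+R2-r*R3."""
    w = w0
    for ri in r_seq:
        w = (1 - a) * w + ri
    return w

def closed_form(w0, a, r_seq):
    """Generalization derived in the text: (1-a)^t W(0) + sum_i (1-a)^(t-i) R(t_i)."""
    t = len(r_seq)
    return (1 - a) ** t * w0 + sum(
        (1 - a) ** (t - i) * r for i, r in enumerate(r_seq, start=1)
    )

a, w0 = 0.3, 5.0
r_seq = [0.7, -0.2, 1.1, 0.4]   # four arbitrary combined innovations
direct = iterate(w0, a, r_seq)
closed = closed_form(w0, a, r_seq)
```

The two agree to machine precision; a form with an extra (1-a) multiplying the R terms and shifted indices, as in 2ʀ, would not.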

There’s a further conundrum. Rich’s derivation, and mine, assume that coefficient ‘a’ is constant. If ‘a’ is constant, then (1-a) becomes raised to the power of the summation e.g., (1-a)ᵗW(0).

But there is no reason to think that coefficient ‘a’ should be a constant across a time-varying system. Why should every new W(t-1) have a constant fractional influence on W(t)?

Why should ‘a’ be constant? Apart from convenience.

Rich then defined E[] = expectation value and V[] = variance = (standard deviation)², and assigned that:

E[R₁(t)] = bt+c

E[R₂(t)] = d

E[R₃(t)] = 0.

Following this, Rich allowed (leaving the derivation to the student) that, “*Then a modicum of algebra derives*

“*E[W(t)] = b(at + a - 1 + (1-a)ᵗ⁺¹)/a² + (c+d)(1 - (1-a)ᵗ)/a + (1-a)W(0)*” (3ʀ)

Evidently 3ʀ was obtained by manipulating 2ʀ (can we see the work, please?). But as 2ʀ is incorrect, nothing worthwhile is learned. We’re told that eqn. 3ʀ → 4ʀ as coefficient ‘a’ → 0.

E[W(t)] = bt(t+1)/2 + (c+d)t + W(0) (4ʀ)
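Taken on its own terms (and setting aside the dimensional objections that follow), the arithmetic of 4ʀ in the a → 0 limit can be checked: summing E[R₁(tᵢ)] + E[R₂(tᵢ)] = (b·i + c) + d over i = 1…t should reproduce bt(t+1)/2 + (c+d)t + W(0). A quick check with arbitrary values of mine:

```python
def expectation_direct(t, b, c, d, w0):
    """a -> 0 limit: E[W(t)] = W(0) + sum over steps of E[R1]+E[R2] = (b*i + c) + d."""
    return w0 + sum(b * i + c + d for i in range(1, t + 1))

def expectation_closed(t, b, c, d, w0):
    """Rich's eqn 4R: E[W(t)] = b*t*(t+1)/2 + (c+d)*t + W(0)."""
    return b * t * (t + 1) / 2 + (c + d) * t + w0

vals = [
    (expectation_direct(t, 0.3, 1.0, 2.0, 10.0),
     expectation_closed(t, 0.3, 1.0, 2.0, 10.0))
    for t in range(1, 6)
]
```

So 4ʀ is internally consistent with the assigned expectations; the objection below is to its dimensions, not its arithmetic.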

__A Second Diversion into Dimensional Analysis__:

Rich assigned E[R₁(t)] = bt+c. Up through eqn. 2ʀ, ‘t’ was integer time. In E[R₁(t)] it has become a coefficient. We know from eqn. 1ʀ that R₁(t) must have the identical dimensional unit carried by W(t), which is, e.g., Celsius.

We also know R₁(t) is in Wm⁻², but W(t) is in Celsius (C). Factor “bt” must be in the same Celsius units as [W(t)]. Is the dimension of b, then, Celsius/time? How does that work? The dimension of ‘c’ must also be Celsius. What is the rationale of these assignments?

The assigned E[R₁(t)] = bt+c has the formula of an ascending straight line of intercept c, slope b, and time the abscissa.

How convenient it is, to assume a linear behavior for the black box M(t) and to assign that linearity before ever (supposedly) considering the appropriate form of a GCM air temperature emulator. What rationale determined that convenient form? Apart from opportunism?

The definition of R₁(t) was, “…*the component which represents changes in major causal influences, such as the sun and carbon dioxide.*”

So, a straight line now represents the major causal influence of the sun or of CO2. How was that decided?

Next, multiplying through term 1 in 4ʀ, we get *bt(t+1)/2 = (bt²+bt)/2*. How do both bt² and bt have the units of Celsius required by E[R₁(t)] and W(0)?

Factor ‘t’ is in units of time. The internal dimensions of bt(t+1)/2 are incommensurate. The parenthetical sum is physically meaningless.

__Continuing__:

Rich’s final equation for the total variance of his emulator,

Var[W(t)] = (s₁² + s₂² + s₃² - 2rs₂s₃)(1 - (1-a)²ᵗ)/(2a - a²) (5ʀ)

included all the Rn(t) terms and the assumed covariance of his R₂(t) and R₃(t).
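The structure of 5ʀ is that of an AR(1) accumulation. A simplified single-innovation sketch (an illustrative variance s² standing in for Rich’s combined s₁² + s₂² + s₃² - 2rs₂s₃ term) shows the behavior he relies on: for a = 0 the variance grows linearly as s²t, while any a > 0 bounds it at s²/(2a - a²).

```python
def var_w(t, a, s2_innov):
    """Variance of W(t) = (1-a)W(t-1) + R(t) for i.i.d. innovations of
    variance s2_innov (cf. eqn 5R): s^2 * (1 - (1-a)^(2t)) / (2a - a^2)."""
    if a == 0:
        return s2_innov * t  # a -> 0 limit: plain random-walk accumulation
    return s2_innov * (1 - (1 - a) ** (2 * t)) / (2 * a - a * a)

growth_a0 = [var_w(t, 0.0, 1.0) for t in (1, 10, 100)]    # linear growth
bounded_a02 = [var_w(t, 0.2, 1.0) for t in (1, 10, 100)]  # converges
limit = 1.0 / (2 * 0.2 - 0.2 ** 2)                        # finite asymptote
```

The decay parameter a is doing all the work here, which is why its ad hoc introduction matters so much.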

Compare his emulator 4ʀ with the GCM air temperature projection emulator in my paper:

ΔTₜ(K) = fCO₂ × 33 K × [(F₀ + Σ ΔFᵢ)/F₀] + a (2)

In contrast to Rich’s emulator, eqn. 2 has no offsetting covariances. Not only that, all the ΔT-determining coefficients in eqn. 2 except *fCO₂* are givens. They have no uncertainty variance at all.

In short, both Rich’s emulator itself and its dependent variances are utterly irrelevant to any evaluation of the GCM projection emulator (eqn. 2). Utterly irrelevant, even if they were correctly derived, which they were not.

Parenthetical summary comments on Rich’s “**Summary of section B**:”

- “*A fairly general iterative emulator model (1) is presented.*”

(Trivially true.)

- “*Formulae are given for expectation and variance of the emulator as a function of time t and various parameters.*”

(Never once used to actually emulate anything, and of no focused relevance to GCM air temperature projections.)

- “*The 2 extra parameters, a, and R₃(t), over and above those of Pat Frank’s emulator, can make a huge difference to the evolution.*”

(An emulator that is critically vacant and mathematically incoherent, and with an inapposite variance.)

- “*The “magic” component R₃(t) with anti-correlation -r to R₂(t) can greatly reduce model error variance whilst retaining linear growth in the absence of decay.*”

(Extra parameters in an emulator that does not deploy the formal structure of the GCM emulator, and missing any analytically equivalent factors. The extra parameters are ad hoc, while ‘a’ is incorrectly specified in 3ʀ and 4ʀ. The emulator is critically irrelevant and its expansion in ‘a’ is wrong.)

- “*Any decay rate a>0 completely changes the propagation of error variance from linear growth to convergence to a finite limit.*”

(Component R₃(t) is likewise ad hoc. It has no justified rationale. That R₃(t) has a variance at all requires its rejection (likewise rejection of R₁(t) and R₂(t)) because the coefficients in the emulator in the paper (eqn. 2 above) have no associated uncertainties.)

(The behavior of a critically irrelevant emulator engenders a deserved ‘so what?’

Further, a>0 causes general decay only by allowing the mistaken derivation that put the (1-a) coefficient into the Rn(t) factors in 2ʀ.)

*Section I conclusion*: The emulator construction itself is incongruous. It includes an unwarranted persistence. It has terms of convenience that do not map onto the target GCM projection emulator. The -rR₃(t) term cannot behave as described.

The transition from eqn. 2ʀ to eqn. 3ʀ is mathematically incoherent. The derivations following that employ eqn. 3ʀ are therefore wrong, including the variances.

The eqn. 1ʀ emulator itself is ad hoc. Its derivation is without reference to the behavior of climate models and of physical reasoning. Its ability to emulate a GCM air temperature projection is undemonstrated.

__II. Problems with “New Parameters” Section C__:

Rich rationalized his introduction of so-called decay parameter ‘a’ in the “Parameter” section C of his post. He introduced this equation:

M(t) = b + cF(t) + dH(t-1), (6ʀ)

where M = temperature, F = forcing, and H(t) is “*heat content*.”

The ‘b’ term might be the ‘b’ coefficient assigned to E[R₁(t)] above, but we are not told anything about it.

I’ll summarize the problem. Coefficients ‘c’ and ‘d’ are actually functions that transform forcing and heat flux (not heat content) in Wm⁻², into their respectively caused temperature, Celsius. They are not integers or real numbers.

However, Rich’s derivation treats them as real number coefficients. This is a fatal problem.

For example, in equation 6ʀ above, function ‘d’ transforms heat flux H(t-1) into its consequent temperature, Celsius. However, the final equation of Rich’s algebraic manipulation ends with ‘d’ inappropriately operating on M(0), the initial temperature. Thus, he wrote:

“*M(t) = b + cF(t) + d(H(0) + e(M(t-1)-M(0))) = f + cF(t) + (1-a)M(t-1)* (7ʀ)

*where a = 1-de, f = b + dH(0) - **deM(0)**.*” (my bold)

There is no physical justification for a “deM(0)” term; d cannot operate on M(0).

Rich also assigned “*a = 1-de*,” where ‘e’ is an integer fraction, but again, ‘d’ is an operator function; ‘d’ cannot operate on ‘e’. The final *(1-a)M(t-1)* term is a cryptic version of *deM(t-1)*, which contains the same fatal assault on physical meaning. Function ‘d’ cannot operate on temperature M.

Further, what is the meaning of an operator function standing alone with nothing on which to operate? How can “*1-de*” be said to have a discrete value, or even to mean anything at all?

Other conceptual problems are in evidence. We read, “*Now by the Stefan-Boltzmann equation M [temperature – P] should be related to F^¼ …*” Rather, S-B says that M should be related to H^¼ (H is here taken to be black body radiant flux). According to climate models, M is linearly related to F.

We are also told, “*Next, the heat changes by an amount dependent on the change in temperature: …*” while instead, physics says the opposite: temperature changes by an amount dependent on the change in the heat (kinetic energy). That is, temperature is dependent on atomic/molecular kinetic energy.

Rich finished with, “*Roy Spencer, who has serious scientific credentials, had written “CMIP5 models do NOT have significant global energy imbalances causing spurious temperature trends because any model systematic biases in (say) clouds are cancelled out by other model biases”.*”

Roy’s comment was originally part of his attempted disproof of my uncertainty analysis. It completely missed the point, in part because it confused physical error with uncertainty.

Roy’s and Rich’s offsetting errors do nothing to remove uncertainty from the prediction of a physical model.

Rich went on, “*This means that in order to maintain approximate Top Of Atmosphere (TOA) radiative balance, some approximate cancellation is forced, which is equivalent to there being an R₃(t) with high anti-correlation to R₂(t). The scientific implications of this are discussed further in Section I.*”

The only, repeat only, scientific implication of offsetting errors is that they reveal areas requiring further research, that the theory is inadequate, and that the predictive capacity is poor.
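The distinction can be made concrete with hypothetical numbers (mine, taken from neither critique): two calibration biases of equal size and opposite sign cancel in a hindcast, but their standard uncertainties still combine in quadrature, so the predictive uncertainty is untouched by the cancellation.

```python
def rss(*uncertainties):
    """Root-sum-square combination of independent standard uncertainties (JCGM-style)."""
    return sum(u * u for u in uncertainties) ** 0.5

bias_cloud = +2.0    # illustrative calibration bias, W/m^2
bias_other = -2.0    # offsetting bias of equal size, W/m^2
u_cloud, u_other = 4.0, 4.0   # standard uncertainties of each component

net_bias = bias_cloud + bias_other   # 0.0: the biases cancel in the hindcast
u_total = rss(u_cloud, u_other)      # the predictive uncertainty does not cancel
```

A tuned model can exhibit a near-zero net calibration error while the uncertainty in its prediction remains large.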

Rich’s approving mention of Roy’s mistake evidences that Rich, too, apparently does not see the distinction between physical error and predictive uncertainty. Tim Gorman especially, and others, have repeatedly pointed out the distinction to Rich, e.g., here, here, here, here, and here, but to no obvious avail.

Conclusions regarding the Parameter section C: analytically impossible, physically disjointed, wrongly supposes offsetting errors increase predictive reliability, wrongly conflates physical error with predictive uncertainty.

And once again, no demonstration that the proposed emulator can emulate anything relevant.

__III. Problems with “Emulator Parameters” Section D__:

In Section I above, I promised to show that Rich’s emulator would remain irrelevant, even after he added my numbers to it.

In his “Emulator Parameters” section Rich started out with, “*Dr. Pat Frank’s emulator falls within the general model above.*” This view could not possibly be more wrong.

First, Rich composed his emulator with my GCM air temperature projection emulator in mind. He inverted significance to say the originating formalism falls within the limit of a derivative composition.

Again, the GCM projection emulator is:

ΔTₜ(K) = fCO₂ × 33 K × [(F₀ + Σ ΔFᵢ)/F₀] + a (2)

Rich’s emulator is W(t) = (1-a)W(t-1) + R₁(t) + R₂(t) + (-r)R₃(t) (1ʀ again)

(In II above, I showed that his alternative, M(t) =* f + cF(t) + (1-a)M(t-1)*, is incoherent and therefore not worth considering further.)

In Rich’s emulator, temperature T₂ has some persistence from T₁. This dependence is nowhere in the GCM projection emulator.

Further, in the GCM emulator (eqn. 2, again), the temperature of time *t-1* makes no appearance at all in the emulated air temperature at time t. Rich’s 1ʀ emulator is constitutionally distinct from the GCM projection emulator. Equating them is a category mistake.

Analyzing further, emulator R₁(t) is a, *“component which represents changes in major causal influences, such as the sun and carbon dioxide,”*

Rich’s R₁(t) describes all of,

ΔTₜ(K) = fCO₂ × 33 K × [(F₀ + Σ ΔFᵢ)/F₀] + a (2)

Rich’s R₁(t) thus exhausts the entire GCM projection emulator. What then is the purpose of his R₂(t) and R₃(t)? They have no analogy in the GCM projection emulator. They have no role to transfer into meaning.

The R₂(t) is “*a strong contribution with observably high variance, for example the Longwave Cloud Forcing (LCF).*” The GCM projection emulator has no such term.

The R₃(t) is, “*a putative component which is negatively correlated with R₂(t)…*” The GCM projection emulator has no such term. R₃(t) has no role to play in any analytical analogy.

Someone might insist that Rich’s emulator is like the GCM projection emulator after his (1-a)W(t-1), R₂(t), and (-r)R₃(t) terms are thrown out.

So, we’re left with this deep generalization: Rich’s emulator-emulator pared to its analogical essentials is M(tᵢ) = R(tᵢ),

where R(tᵢ) = fCO₂ × 33 K × [(F₀ + Σ ΔFᵢ)/F₀] + a.

Rich went on to specify the parameters of his emulator: “*The constants from [Pat Frank’s] paper, 33K, 0.42, 33.3 Wm⁻², and +/-4 Wm⁻², the latter being from errors in LCF, combine to give 33*0.42/33.3 = 0.416 and 0.416*4 = 1.664 used here.*”

Does anyone see a ±4 Wm⁻² in the GCM projection emulator? There is no such term.

Rich has made the same mistake as did Roy Spencer (one of many). He supposed that the uncertainty propagator (the right-side term in paper eqn. 5.2) is the GCM projection emulator.

It isn’t.

Rich then presented the conversion of his general emulator into his view of the GCM projection emulator: “*So we can choose a = 0, b = 0, c+d = 0.416 F(t) where F(t) is the new GHG forcing (Wm⁻²) in period t, s₁ = 0, s₂ = 1.664, s₃ = 0, and then derive*

*W(t) = (c+d)t + W(0) +/- sqrt(t)·s₂*” (8ʀ)

There are Rich’s mistakes made explicit: his emulator, eqn. 8ʀ, includes persistence in the *W(0)* term and a ±sqrt(t)·s₂ term, neither of which appears anywhere in the GCM projection emulator. How can eqn. 8ʀ possibly be an analogy for eqn. 2?

Further, including “*+/- sqrt(t)·s₂*” will cause his emulator to produce two values of W(t) at every time-step.

One value of W(t) stems from the positive sign on sqrt(t)·s₂ and the other from the negative sign. A plot of the results will show two W(t) trends, one perhaps rising while the other falls.

To see this mistake in action, see the first Figure in Roy Spencer’s critique.

The “+/-” term in Rich’s emulator makes it not an emulator.
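Using the 0.416 and 1.664 values Rich quotes above, the ± term of eqn. 8ʀ can be evaluated directly. The sketch below computes the two branches it produces at each step; their separation grows as √t, the signature of an uncertainty envelope rather than of a single emulated trajectory.

```python
def w_branches(t, cd=0.416, w0=0.0, s2=1.664):
    """The two values eqn 8R produces at step t: trend -/+ sqrt(t)*s2.
    cd and s2 are the 0.416 and 1.664 values quoted in the text; w0 is arbitrary."""
    trend = cd * t + w0
    half_width = t ** 0.5 * s2
    return trend - half_width, trend + half_width

lo1, hi1 = w_branches(1)
lo25, hi25 = w_branches(25)
spread_1 = hi1 - lo1     # envelope width after 1 step
spread_25 = hi25 - lo25  # 5x wider after 25 steps, per sqrt(t)
```

Two diverging branches cannot be a single emulated temperature trajectory, which is the point.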

However, a ‘±’ term does appear in the error propagator:

*±uᵢ(T) = [fCO₂ × 33 K × (±4 Wm⁻²)/F₀]* — see eqns. 5.1 and 5.2 in the paper.
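With the constants quoted earlier (fCO₂ = 0.42, 33 K, F₀ = 33.3 Wm⁻², and the ±4 Wm⁻² LWCF calibration error), the per-step uncertainty and its root-sum-square growth can be sketched as below. The result is an uncertainty statistic in Kelvin, not a physical temperature.

```python
def step_uncertainty(f_co2=0.42, k33=33.0, delta_f=4.0, f0=33.3):
    """Per-step calibration uncertainty: f_CO2 * 33 K * (4 W/m^2) / F0 (cf. eqn 5.1)."""
    return f_co2 * k33 * delta_f / f0

def propagated_uncertainty(n_steps):
    """Root-sum-square propagation over n steps (cf. eqn 5.2): grows as sqrt(n)."""
    u_i = step_uncertainty()
    return (n_steps * u_i ** 2) ** 0.5

u1 = propagated_uncertainty(1)      # one annual step
u100 = propagated_uncertainty(100)  # a century of annual steps: 10x wider
```

Note what is propagated: the calibration uncertainty statistic, not any projected temperature. The emulator and the propagator are different equations doing different jobs.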

It should now be obvious that Rich’s emulator is nothing like the GCM projection emulator.

Instead, it represents a category mistake. It is not only wrongly derived, it has no analytical relevance at all. It is conceptually adverse to the GCM projection emulator it was composed to critically appraise.

Rich’s emulator is ad hoc. It was constructed with factors he deemed suitable, but with no empirical reference. Theory without empiricism is philosophy at best; never science.

Rich then added in certain values taken from the GCM projection emulator and proceeded to zero out everything else in his equation. The result does not demonstrate equivalence. It demonstrates tendentiousness: elements manipulated to achieve a predetermined end. This approach is diametrical to actual science.

The rest of the Emulator Parameters section elaborates speculative constructs of supposed variances given Rich’s irrelevant emulator. For example, “*Now if we choose b = a(c+d) then that becomes (c+d)(t+1), etc. etc.*” This is to choose without any reference to any explicit system or any known physical GCM error. The *b = a(c+d)* is an ungrounded levitated term. It has no substantive basis.

The rest of the variance speculation is equally irrelevant, and in any case derives from an unreservedly wrong emulator.

Nowhere is its competence demonstrated by, e.g., emulating a GCM air temperature projection.

I will not consider his Section D further, except to note that Rich’s Case 1 and Case 2 clearly imply that he considers the variation of model runs about the model projection mean to be the centrally germane measure of uncertainty.

It is not.

The precision/accuracy distinction was discussed in the introductory comments above. Run variation supplies information only about model precision — run repeatability. The analysis in the paper concerned accuracy.
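A toy illustration with hypothetical numbers of mine makes the precision/accuracy distinction explicit: a tight ensemble spread (good precision) says nothing about the offset from the observable (poor accuracy).

```python
import statistics

runs = [15.8, 15.9, 16.0, 16.1, 16.2]  # hypothetical model runs (deg C): tight spread
observed = 14.0                         # hypothetical observed value (deg C)

precision = statistics.stdev(runs)                 # spread about the ensemble mean
accuracy_error = statistics.mean(runs) - observed  # offset from the observable

# precision of about 0.16 C looks reassuring; the 2.0 C accuracy error is what matters
```

Reporting only the run spread would hide a bias an order of magnitude larger than the spread itself.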

This distinction is absolutely central, and was of immediate focus.

Introduction paragraph 2:

“*Published GCM projections of the GASAT typically present uncertainties as model variability relative to an ensemble mean (Stainforth et al., 2005; Smith et al., 2007; Knutti et al., 2008), or as the outcome of parameter sensitivity tests (Mu et al., 2004; Murphy et al., 2004), or as Taylor diagrams exhibiting the spread of model realizations around observations (Covey et al., 2003; Gleckler et al., 2008; Jiang et al., 2012). **The former two are measures of precision, while observation-based errors indicate physical accuracy. Precision is defined as agreement within or between model simulations, while accuracy is agreement between models and external observables** (Eisenhart, 1963, 1968; ISO/IEC, 2008).*” (bold added)

…

“*However, projections of future air temperatures are invariably published without including any physically valid error bars to represent uncertainty. Instead, the standard uncertainties derive from variability about a model mean, which is only a measure of precision. Precision alone does not indicate accuracy, nor is it a measure of physical or predictive reliability.* (added bold)

*“The missing reliability analysis of GCM global air temperature projections is rectified herein.*”

It is evidently possible to read the above and fail to grasp it. Rich’s entire approach to error and variance ignores it and thereby is misguided. He has repeatedly confused model precision with predictive accuracy.

That mistake is fatal to critical relevance. It removes any valid application of Rich’s critique to my work or to the GCM projection emulator.

Finally, I will comment on his last paragraph: “*Pat Frank’s paper effectively uses a particular W(t;u) (see Equation (8) above) which has fitted m_w(t;u) to m_m(t), but ignores the variance comparison. That is, s₂ in (8) was chosen from an error term from LCF without regard to the actual variance of the black box output M(t).*”

The first sentence says that I fitted “*m_w(t;u) to m_m(t)*.” That is, Rich supposed that my analysis consisted of fits to the model mean.

He is wrong. The analysis focused on single projection runs of individual models.

SI Figure S3-2 illustrates the method: each fit tested a single temperature projection run of a single target GCM, plotted against a standard GHG forcing (SRES, Meinshausen, or other).

SI Figure S3-2. Left: fit of cccma_cgcm3_1_t63 projected global average temperature plotted vs. SRES A2 forcing. Right: emulation of the cccma_cgcm3_1_t63 A2 air temperature projection. Every fit had only one important degree of freedom.

Only Figure 7 showed emulation of a multi-model projection mean. The 68 others were all single model projection runs. All of which Rich apparently missed.

There is no ambiguity in what I did, which is not what Rich supposed I did.

The second sentence, “*That is, s₂ in (8) was chosen from an error term from LCF without regard to the actual variance of the black box output M(t),*” is also factually wrong. Twice.

First, there is no LCF term in the emulator, nor any standard deviation. The “*s₂*” is a fantasy.

Second, the long wave cloud forcing calibration error in the uncertainty propagator is the annual average error CMIP5 GCMs make in simulating annual global cloud fraction (CF).

That is, LWCF calibration error is exactly *the actual [error] variance of the black box output M(t)* with respect to observed global cloud fraction.

Rich’s “*the actual variance of the black box output M(t).*” refers to the variance of individual GCM air temperature projection runs around a projection mean; a precision metric.

The accuracy metric of model variance *with respect to observation* is evidently lost on Rich. He brought up inattention to bare precision as though it faulted an analysis concerned with accuracy.

This fatal mistake is a commonplace among the critics of my paper.

It shows a foundational inability to effectuate any scientifically valid criticism at all.

The explanation for entering the LWCF error statistic into the uncertainty propagator is given within the paper (p. 10):

“*GHG forcing enters into and becomes part of the global tropospheric thermal flux. Therefore, any uncertainty in simulated global tropospheric thermal flux, such as LWCF error, must condition the resolution limit of any simulated thermal effect arising from changes in GHG forcing, including global air temperature. LWCF calibration error can thus be combined with ΔFᵢ in equation 1 to estimate the impact of the uncertainty in tropospheric thermal energy flux on the reliability of projected global air temperatures.*”

This explanation seems opaque to many for reasons that remain obscure.

The cited Zhang et al. (2005) and Dolinar et al. (2015) gave similar estimates of LWCF calibration error.

Summary conclusions about the Emulator Parameter Section D:

1) The proposed emulator is ad hoc and tendentious.

2) The proposed emulator is constitutively wrong.

· It wrongly includes persistence.

· It wrongly includes a cloud forcing term (or the like).

· It wrongly includes an uncertainty statistic.

3) Rich confused the uncertainty propagator with the GCM projection emulator.

4) He mistakenly focused on precision in a study about accuracy.

Or, perhaps, he was ignorant of the concept of physical accuracy itself.

5) He wrongly imputed that the study focused on GCM projection means.

6) He never once demonstrated that his proposed emulator can actually emulate.

__IV. Problems with “Error and Uncertainty” Section E__:

Thus far, we’ve found that Rich’s emulator analysis is ad hoc, tendentious, constitutively wrong, dimensionally impossible, mathematically incoherent, confuses precision with accuracy, includes incongruous variances, and is empirically unvalidated. His analysis could almost not be more bolloxed up.

I here step through a few of his Section E mistakes, which always seem to simplify things for him. Quotes are marked “R:” followed by a comment.

R: “*Assuming that X is a single fixed value, then prior to measurement, M-X is a random variable representing the error,…*”

Except when the error is systematic stemming from uncontrolled variables. In that case M – X is a deterministic variable of no fixed mean, of a non-normal dispersion, and of an unknowable value. See the further analysis in the Appendix.

R: “*±s_m is described by the JCGM 2.3.1 as the “standard” uncertainty parameter.*”

Rich is being a bit fast here. He’s implying the JCGM Section 2.3.1 definition of “standard uncertainty” is limited to the SD of random errors.

The JCGM is the *Evaluation of measurement data — Guide to the expression of uncertainty in measurement*, the standard guide to the statistical analysis of measurements and their errors, provided by the Bureau International des Poids et Mesures.

The JCGM actually says that the *standard uncertainty* is “*uncertainty of the result of a measurement expressed as a standard deviation*,” which is rather more general than Rich allowed.

The quotes below show that the JCGM includes systematic error as contributing to uncertainty.

Under E.3 “**Justification for treating all uncertainty components identically**” the JCGM says,

*The focus of the discussion of this subclause is a simple example that illustrates how this Guide treats uncertainty components arising from random effects and from corrections for systematic effects in exactly the same way in the evaluation of the uncertainty of the result of a measurement*. It thus exemplifies the viewpoint adopted in this Guide and cited in E.1.1, namely, that

**all components of uncertainty are of the same nature and are to be treated identically**. (my bold)

Under JCGM E.3.1 and E.5.2, we have that the variance of a measurement *wᵢ* of true value μᵢ is given by σᵢ² = E[(wᵢ − μᵢ)²], which is the standard expression for error variance.

After the usual caveats that “*[the] expectation of the probability distribution of each εᵢ is assumed to be zero, E(εᵢ) = 0, …*”, the JCGM notes that,

*It is assumed that probability is viewed as a measure of the degree of belief that an event will occur, implying that **a systematic error may be treated in the same way as a random error** and that εᵢ represents either kind.* (my bold)

In other words, the JCGM advises that systematic error is to be treated using the same statistical formalism as is used for random error.

R: “*The real error statistic of interest is E[(M−X)²] = E[((M−m_m)+(m_m−X))²] = Var[M] + b², covering both a precision component and an accuracy component.*”

Rich then referenced that equation to my paper and to long wave cloud forcing (LWCF; Rich’s LCF) error. However, this is a fundamental mistake.

In Rich’s equation above, the bias *b = M − m_m* is a constant. Among GCMs, however, step-wise cloud bias error varies across the global grid-points for each GCM simulation. And it also varies among GCMs themselves. See paper Figure 4 and SI Figure S6-1.

The factor (M − m_m) = b, above, should therefore be (Mᵢ − m_m) = bᵢ, because b varies in a deterministic but unknown way with every Mᵢ.

A correct analysis of the case is:

*E[(Mᵢ − X)²] = E[((Mᵢ − m_m) + (m_m − X))²] = Var[M] + Var[b]*

Systematic error is discussed in more detail in the Appendix.
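The effect of a varying bias is easy to demonstrate numerically. The sketch below is mine, not taken from either analysis, and the variances are purely illustrative: when the bias bᵢ varies from measurement to measurement with mean zero, the mean-square error picks up Var[b] on top of the random variance, rather than a fixed b².

```python
# Sketch (not from the paper; illustrative values): a per-measurement bias b_i
# with mean zero adds its variance Var[b] to the total error variance.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
v = 0.5        # variance of the random error component (illustrative)
var_b = 0.3    # variance of the varying systematic bias b_i (illustrative)

eps = rng.normal(0.0, np.sqrt(v), n)                 # random error
half = np.sqrt(3 * var_b)                            # uniform half-width giving Var[b]
b_i = rng.uniform(-half, half, n)                    # varying bias, mean zero
err = eps + b_i                                      # total error, M_i - X

mse = np.mean(err**2)
print(round(mse, 2), round(v + var_b, 2))            # mean-square error ≈ v + Var[b]
```

A fixed bias would instead contribute b² exactly as in Rich’s equation; the varying case is the one at issue for GCM grid-point cloud error.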

Rich goes on, “*But the theory of converting variances and covariances of input parameter errors into output error via differentiation is well established, and is given in Equation (13) of the JCGM.*”

Equation (13) of the JCGM provides the formula for the error variance in *y*, *u²c(y)*, but describes it this way:

*The combined variance, u²c(y), can therefore be viewed as a sum of terms, each of which represents the estimated variance associated with the output estimate y generated by the estimated variance associated with each input estimate xᵢ.* (my bold)

That is, the combined variance*, u²**c(y),* is the variance that results from considering all forms of error; not just random error.
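For reference, equation (13) of the JCGM is the general, correlated-inputs form of the law of propagation of uncertainty (equation (10) of clause 5.1.2 is the uncorrelated special case); in the Guide's notation it reads:

```latex
u_c^2(y) \;=\; \sum_{i=1}^{N}\left(\frac{\partial f}{\partial x_i}\right)^{2} u^{2}(x_i)
\;+\; 2\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\frac{\partial f}{\partial x_i}\,\frac{\partial f}{\partial x_j}\,u(x_i, x_j)
```

Each input variance u²(xᵢ) is, per the Guide, a variance “however evaluated,” Type A or Type B, random or systematic.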

Under JCGM 3.3.6:

*The standard uncertainty of the result of a measurement, when that result is obtained from the values of a number of other quantities, is termed combined standard uncertainty and denoted by u_c. It is the estimated standard deviation associated with the result and is equal to the positive square root of the **combined variance** obtained from all variance and covariance (C.3.4) components, **however evaluated**, using what is termed in this Guide the law of propagation of uncertainty (see Clause 5).* (my bold)

Under JCGM E 4.4 EXAMPLE:

*The systematic effect due to not being able to treat these terms exactly leads to an unknown fixed offset that cannot be experimentally sampled by repetitions of the procedure. Thus, the uncertainty associated with the effect cannot be evaluated and included in the uncertainty of the final measurement result if a frequency-based interpretation of probability is strictly followed. However, interpreting probability on the basis of degree of belief allows the uncertainty characterizing the [systematic] effect to be evaluated from an* a priori *probability distribution (derived from the available knowledge concerning the inexactly known terms) and to be included in the calculation of the combined standard uncertainty of the measurement result like any other uncertainty.* (my bold)

The JCGM says that the combined variance, *u²c(y)*, includes systematic error.

The systematic error stemming from uncontrolled variables becomes a variable component of the output; a component that may change unknowably with every measurement. Systematic error then necessarily has an unknown and almost certainly non-normal dispersion (see the Appendix).

The JCGM further stipulates that systematic error is to be treated using the same mathematical formalism as random error.

Above we saw that uncontrolled deterministic variables produce a dispersion of systematic error biases in an extended series of measurements and in GCM simulations of global cloud fraction.

That is, the systematic error is a “*fixed offset*” = bᵢ only in the Mᵢ time-step. But the bᵢ vary in some unknown way across the n-fold series of Mᵢ.

In light of the JCGM discussion, the dispersion of systematic error, bᵢ, requires that any complete error variance include Var[b].

The dispersion of the bᵢ can be determined only by way of a calibration experiment against a known X carried out under conditions as identical as possible to the experiment.

The empirical methodological calibration error, Var[b] of X, is then applied to condition the result of every experimental determination or observation of an unknown X; i.e., it enters the reliability statement of the result.

In Example 1, the 1-foot ruler, Rich immediately assumed away the problem. Thus, “*the manufacturer assures us that any error in that interval is equally likely[, but I will] write 12+/-_0.1 …, where the _ denotes a uniform probability distribution, instead of a single standard deviation for +/-.*”

That is, rather than accept the manufacturer’s stipulation that all deviations are equally likely, Rich converted the uncertainty into a random dispersion, in which all deviations are no longer equally likely. He has assumed knowledge where there is none.

He wrote, “*If I have only 1 ruler, it is hard to see how I can do better than get a table which is 120+/-_1.0″.*” But that is wrong.

The unknown error in any one ruler is a rectangular distribution over −0.1″ to +0.1″, with all possibilities equally likely. Ten measurements with a ruler of unknown specific error can therefore be anywhere from −1″ to +1″ in error. The expectation interval is (1 − (−1))″/2 = 1″. The standard uncertainty is then 1″/sqrt(3) = ±0.58″, thus 120 ± 0.58″.
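The one-ruler arithmetic can be checked with a quick simulation (my illustration): a single ruler’s unknown error is drawn once from the rectangular ±0.1″ distribution and then repeats identically in all ten measurements, so the summed error is rectangular over ±1″, with standard deviation 1″/sqrt(3).

```python
# Sketch: one ruler, fixed unknown error uniform in +/-0.1", used ten times.
# Each trial draws one ruler; its error repeats in every measurement.
import numpy as np

rng = np.random.default_rng(1)
trials = 200_000
b = rng.uniform(-0.1, 0.1, trials)   # each trial: one ruler's fixed unknown error
total_error = 10 * b                 # ten measurements with the same ruler

print(round(float(np.std(total_error)), 2))   # ≈ 1/sqrt(3) ≈ 0.58"
```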

He then wrote that if one instead made ten measurements using ten independently machined rulers, then the uncertainty of measurement = “*sqrt(10) times the uncertainty of each*.” But again, that is wrong.

The original stipulation is equal likelihood across ±0.1″ of error for every ruler. For ten independently machined rulers, every ruler has a length deviation equally likely to be anywhere within −0.1″ to +0.1″. That means the true total error using 10 independent rulers can again be anywhere from −1″ to +1″.

The expectation interval is again (1 − (−1))″/2 = 1″, and the standard uncertainty after using ten rulers is 1″/sqrt(3) = ±0.58″. There is no advantage, and no reduction of uncertainty at all, in using ten independent rulers rather than one. This is the outcome when knowledge is lacking and one has only a rectangular uncertainty estimate, a not uncommon circumstance in the physical sciences.

Rich’s mistake is founded in his immediate recourse to pseudo-knowledge.

R: “*We know by symmetry that the shortest plus longest [of a group of ten rulers] has a mean error of 0…*” But we do not know that, because every ruler is independently machined. Every length error is equally likely. There is no reason to assume a normal distribution of lengths, no matter how many rulers one has. The shortest may be only 0.02″ too short, and the longest 0.08″ too long. Then ten measurements, five with each, produce a net error of +0.3″. How would anyone know? One has no way of knowing the true error in the physical length of a shortest and a longest ruler.

The length uncertainty of any one ruler is [(0.1 − (−0.1))/2]/sqrt(3) = ±0.058″. The only reasonable stipulation one might make is that the shortest ruler is (0.05 ± 0.058)″ too short and the longest (0.05 ± 0.058)″ too long. Then 5 measurements using each ruler yield a measurement with an uncertainty of ±0.18″.

Complex variance estimates notwithstanding, Rich assumed away all the difficulty in the problem, wished his way back to random error, and enjoyed a happy dance.

Conclusion: Rich’s Section E is wrong, wherever it isn’t irrelevant.

1. He assumed random error when he should have considered deterministic error.

2. He badly misconstrued the message of JCGM concerning systematic error, and the meaning of its equation (13).

3. He ignored the centrally necessary condition of uncontrolled variables, and the consequent unknowable variation of systematic error across the data.

4. He wrongly treated systematic error as a constant offset.

5. His treatment of rectangular uncertainty is wrong.

6. He then wished rectangular uncertainty into a random distribution.

7. He treated assumed distributions as though they were known distributions — OK in a paper on statistical conjectures, a failing grade in an undergraduate instrumental lab course, and death in a real-world lab.

__V. Problems with Section F__:

The first part concerning comparative uncertainty is speculative statistics and so is here ignored.

__Problems with Rich’s Marked Ruler Example 2__:

The discussion neglected the resolution of the ruler itself, typically 1/4 of the smallest division.

It also ignored the question of whether the lined division marks are uniformly and accurately spaced — another part of the resolution problem. This latter problem can be reduced with recourse to a high-precision ruler that includes a manufacturer’s resolution statement provided by the in-house engineers.

It ignored that the smallest divisions on a to-be-visually-appraised precision instrument are typically manufactured in light of the human ability to resolve the spaces.

To achieve real accuracy with a ruler, one would have to calibrate it at several internal intervals using a set of high-accuracy length standards. Good luck with that.

__VI. Problems with Rich’s thermometer Example 3 Section G__:

Rich brought up what was apparently my discussion of thermometer metrology, made in an earlier comment on another WUWT essay.

He mentioned some of the elements I listed as going into uncertainty in the read-off temperature, including: “*the thermometer capillary is not of uniform width, the inner surface of the glass is not perfectly smooth and uniform, the liquid inside is not of constant purity, the entire thermometer body is not at constant temperature. He did not include the fact that during calibration human error in reading the instrument may have been introduced.*”

I no longer know where I made those comments (I searched but didn’t find them) and Rich provided no link. However, I would never have intended that list to be exhaustive. Anyone wondering about thermometer accuracy can do a lot worse than to read Anthony Watts’ post about thermometer metrology.

Among impacts on accuracy, Anthony mentioned hardening and shrinking of the glass in LiG thermometers over time. After 10 years, he said, the reading might be 0.7 C high. A process of slow hardening would impose a false warming trend over the entire decade. Anthony also mentioned that historical LiG meteorology thermometers were often graduated in 2 ⁰F increments, yielding a resolution of ±0.5 ⁰F = ±0.3 ⁰C.

Rich mentioned none of that, in correcting my apparently incomplete list.

Here’s an example of a 19th century min-max thermometer with 2 ⁰F divisions.

Louis Cassella-type 19th century min-max thermometer with 2 ⁰F divisions.

Image from the Yale Peabody Museum.

High-precision Louis Cassella thermometers included 1 ⁰F divisions.

Rich continued: “*The interesting question arises as to what the (hypothetical) manufacturers meant when they said the resolution was +/-0.25K. Did they actually mean a 1-sigma, or perhaps a 2-sigma, interval? For deciding how to read, record, and use the data from the instrument, that information is rather vital.*”

Just so everyone knows what Rich is talking about, pictured below are a couple of historical LiG meteorological thermometers.

Left: a 19th century Negretti and Zambra minimum thermometer from the Welland weather station in Ontario, Canada, mounted in the original Stevenson Screen. Right: a C.W. Dixey 19th century Max-Min thermometer (London, after ca. 1870). Insets are close-ups.

The finest lineations in the pictured thermometers are 1 ⁰F and are perhaps 1 mm apart. The Welland instrument served about 1892 – 1957.

The resolution of these thermometers is ±0.25 ⁰F, meaning that smaller values to the right of the decimal are physically dubious. The 1880-82 observer at Welland, Mr. William B. Raymond, age about 20 years, apparently recorded temperatures to ±0.1 ⁰F, a fine example of false precision.

In asking, “*Did [the manufacturers] actually mean a 1-sigma, or perhaps a 2-sigma, interval?*”, Rich is posing the wrong question. Resolution is not about error. It does not imply a statistical variable. It is a physical limit of the instrument, below which no reliable data are obtainable.

The modern Novalynx 210-4420 Series max-min thermometer below is, “*made to U.S. National Weather Service specifications.*”

The specification sheet (pdf) provides an accuracy of ±0.2 ⁰C “*above 0 ⁰C*.” That’s a resolution-limit number, not a 1σ number or a 2σ number.

A ±0.2 ⁰C resolution limit means the thermometers are not able to reliably distinguish between external temperatures differing by 0.2 ⁰C or less. It means any finer reading is physically suspect.

The Novalynx thermometers record 95 degrees across 8 inches, so that each degree traverses 0.084″ (2.1 mm). Reading a temperature to ±0.2 ⁰C requires the visual acuity to discriminate among five 0.017″ = 0.43 mm unmarked widths within each degree interval.
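The arithmetic behind those figures can be checked directly (my restatement of the numbers in the text):

```python
# Sketch: scale geometry of the Novalynx thermometer as described above.
# 95 degrees span 8 inches; reading to one-fifth of a degree means resolving
# five unmarked widths within each degree interval.
span_in, degrees = 8.0, 95
per_degree_in = span_in / degrees          # inches per degree
per_degree_mm = per_degree_in * 25.4       # millimetres per degree
fifth_in = per_degree_in / 5               # width of a one-fifth-degree reading

print(round(per_degree_in, 3), round(per_degree_mm, 1), round(fifth_in, 3))
# ≈ 0.084 in, ≈ 2.1 mm, ≈ 0.017 in
```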

Historical thermometers were no better.

This leads to the question: even though the thermometer is accurate to ±0.2 ⁰C, is it reasonable to propose, as Rich did, that an observer should be able to regularly discriminate individual ±0.1 ⁰C intervals within merging 0.22 mm blank widths? Hint: hardly.

Rich’s entire discussion is unrealistic, showing no sensitivity to the meaning of resolution limits, of accuracy, of the graduation of thermometers, or of limited observer acuity.

He wrote, “*In the present [weather thermometer] example, I would recommend trying for t² = 1/100, or as near as can be achieved within reason*.” Rich’s t² is the variance of observer error, meaning he recommends reading to ±0.1 ⁰C on thermometers that are not accurate to better than ±0.2 ⁰C.

Rich finished by advising the manufacture of false data: “*if the observer has the skill and time and inclination then she can reduce overall uncertainty by reading to a greater precision than the reference value.* (my bold)”

Rich recommended false precision; a mistake undergraduate science and engineering students have flogged out of them from the very first day. But one that typifies consensus climatology.

His conclusion that, “*Again, real life examples suggest the compounding of errors, leading to approximately normal distributions*.” is entirely unfounded, based as it fully is on unrealistic statistical speculations. Rich considered no real-life examples at all.

The moral of Rich’s section G is that it’s not prudent to give advice concerning methods about which one has no experience.

The whole thermometer section G is misguided and is yet another example, after the several prior, of an apparently very poor grasp of physical accuracy, of its meaning, and of its fundamental importance to all of science.

__VII. Problems with “The implications for Pat Frank’s paper” Section H__:

Rich began his Section H with a set of declarations about the implications of his various sections, now known to be overwrought or plain wrong. Stepping through:

Section B: Rich’s emulator is constitutively inapt. The derivation is both wrong and incoherent. Tendentiously superfluous terms promote a predetermined end. The analysis is dimensionally unsound and deploys unjustified assumptions. No empirical validation of claimed emulator competence.

Section C: incorrectly proposes that offsetting calibration errors promote predictive reliability. It includes an inverted Stefan-Boltzmann equation and improperly treats operators as coefficients. As in other sections, Section C evinces no understanding of accuracy.

Section D: displays confusion about precision and accuracy throughout. The GCM emulator (paper eqn. 1) is confused with the error propagator (paper eqn. 5.2) which is fatal to Section D. No empirical validation of claimed emulator competence. Fatally misconstrues the analytical focus of my paper to be GCM projection means.

Section E: again, falsely asserted that all measurement or model error is random and that systematic error is a fixed constant bias offset. It makes empirically unjustified and ad hoc assumptions about error normality. It self-advantageously misconstrued the JCGM description of standard uncertainty variance.

Section F: has unrealistic prescriptions about the use of rulers.

Section G: displays no understanding of actual thermometers and advises observers to record temperatures to false precision.

Rich wrote that, “*The implication of Section C is that many emulators of GCM outputs are possible, and just because a particular one seems to fit mean values quite well does not mean that the nature of its error propagation is correct*.”

There we see again Rich’s fatal mistake that the paper is critically focused on mean values. He also wrote there are many possible GCM emulators without ever demonstrating that his proposed emulator can actually emulate anything.

And again here, “*Frank’s emulator does visibly give a decent fit to the annual means of its target,…*”

However, the analysis did not fit annual means. It fit the relationship between forcing and projected air temperature.

The emulator itself **reproduced** the GCM air temperature projections. It did not fit them. Contra Rich, that performance is indeed “*sufficient evidence to assert that it is a good emulator.*”

And in further fact, the emulator tested itself against dozens of **individual** GCM single air temperature projections, not projection means. SI Figures S4-6, S4-8 and S4-9 show that the fit residuals remain close to zero.

The tests showed beyond doubt that every tested GCM behaved as a linear extrapolator of GHG forcing. That invariable linearity of output behavior entirely justifies linear propagation of error.

Throughout, Rich’s analysis displays a thorough and comprehensively mistaken view of the paper’s GCM analysis.

The comments that finish his analysis demonstrate that case.

For example: “*The only way to arbitrate between emulators would be to carry out Monte Carlo experiments with the black boxes and the emulators.*” recommends an analysis of precision, with no notice of the need for accuracy.

Repeatability over reliability.

If ever there was a demonstration that Rich’s approach fatally neglects science, that is it.

This next paragraph really nails Rich’s mistaken thinking: “*Frank’s paper claims that GCM projections to 2100 have an uncertainty of +/- at least 15K. Because, via Section D, uncertainty really means a measure of dispersion, this means that Equation (1) with the equivalent of Frank’s parameters, using many examples of 80-year runs, would show an envelope where a good proportion would reach +15K or more, and a good proportion would reach -15K or less, and a good proportion would not reach those bounds.*”

First, it was his Section E, not Section D, that supposed uncertainty to be the dispersion of a random variable.

Second, Section IV above showed that Rich had misconstrued the JCGM discussion of uncertainty. Uncertainty is not error. Uncertainty is the interval within which the true value should occur.

Section D 6.1 of the JCGM establishes the distinction:

*[T]he focus of this Guide is uncertainty and not error. *

And continuing:

*The exact error of a result of a measurement is, in general, unknown and unknowable. All one can do is estimate the values of input quantities, including corrections for recognized systematic effects, together with their standard uncertainties (estimated standard deviations), either from unknown probability distributions that are sampled by means of repeated observations, or from subjective or a priori distributions based on the pool of available information; ..*.

Unknown probability distributions sampled by means of repeated observations describes a calibration experiment and its result. Included among these is the comparison of a GCM hindcast simulation of global cloud fraction with the known observed cloud fraction.

Next, the ±15 C uncertainty does not mean some projections would reach “*+15K or more”* or *“-15K or less.”* Uncertainty is not error. The JCGM is clear on this point, as is the literature. Uncertainty intervals are not error magnitudes. Nor do they imply the range of model outputs.

The ±15 C GCM projection uncertainty is an ignorance width. It means that one has no information at all about the possible air temperature in year 2100.
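For readers who want to see where a number of that size comes from, here is a compressed sketch of the propagation, using illustrative values from the paper (f_CO₂ ≈ 0.42, F₀ = 33.30 W m⁻², and the ±4 W m⁻² annual LWCF calibration error). This is my condensation, not a verbatim transcription of the paper’s eqn. 5.2:

```python
# Hedged sketch: per-step uncertainty u = f_CO2 * 33 K * (4 W/m^2) / F0,
# compounded in quadrature (root-sum-square) over an 80-year projection.
# Values are illustrative, taken from the paper's stated parameters.
import math

f_co2 = 0.42     # CO2 fraction of greenhouse warming (paper value)
F0 = 33.30       # total greenhouse forcing, W m^-2 (paper value)
lwcf = 4.0       # CMIP5 annual LWCF calibration error, +/- W m^-2

u_step = f_co2 * 33.0 * lwcf / F0        # per-year uncertainty, K
u_total = math.sqrt(80) * u_step         # root-sum-square over 80 years
print(round(u_step, 2), round(u_total, 1))   # ≈ 1.66 K per year, ≈ 14.9 K total
```

The sqrt(80) growth is a widening ignorance width, not a projected temperature path.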

Supposing that uncertainty propagated through a serial calculation directly implies a range of possible physical magnitudes is merely to reveal an utter ignorance of physical uncertainty analysis.

Rich’s mistake that an uncertainty statistic is a physical magnitude is also commonplace among climate modelers.

Among Rich’s Section H summary conclusions, the first is wrong, while the second and third are trivial.

The first is, “*Frank’s emulator is not good in regard to matching GCM output error distributions.*” There are two mistakes in that one sentence.

The first is that the GCM air temperature projection emulator can indeed reproduce all the single air temperature projection runs of any given GCM. Rich’s second mistake is to suppose that GCM individual run variation about a mean indicates error.

Regarding Rich’s first mistake, the Figure below is taken from Rowlands, et al., (2012). It shows thousands of individual HadCM3L “perturbed physics” runs. Perturbed physics means the parameter sets are varied across their uncertainty widths. This produces a whole series of alternative projected future temperature states.

Original Figure Legend: “*Evolution of uncertainties in reconstructed global-mean temperature projections under SRES A1B in the HadCM3L ensemble.*”

This “perturbed physics ensemble” is described as “*a multi-thousand-member ensemble of transient AOGCM simulations from 1920 to 2080 using HadCM3L,…*”

Given knowledge of the forcings, the GCM air temperature projection emulator could reproduce every single one of those multi-thousand ensembled HadCM3L air temperature projections. As the projections are anomalies, emulator coefficient *a* = 0. The emulations would proceed by varying only the *f_CO₂* term. That is, the HadCM3L projections could be reproduced using the emulator with only one degree of freedom (see paper Figures 1 and 9).

So much for, “*not good in regard to matching GCM output [so-called] error distributions.”*

Second, the variance of the spread around the ensemble mean is not error, because the accuracy of the model projections remains unknown.

Studies of model spread, such as that of Rowlands, et al., (2012) reveal nothing about error. The dispersion of outputs reveals nothing but precision.

In calling that spread “error,” Rich merely transmitted his lack of attention to the distinction between accuracy and precision.

In light of the paper, every single one of the HadCM3L centennial projections is subject to the very large lower limit of uncertainty due to LWCF error, of the order ±15 C, at year 2080.

The uncertainty in the ensemble mean is the rms of the uncertainties of the individual runs. That’s not error, either. Or a suggestion of model air temperature extremes. It’s the uncertainty interval that reflects the total unreliability of the GCMs and our total ignorance about future air temperatures.
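As a formula sketch (my rendering of the sentence above, not from the paper): for N runs with individual uncertainties uᵢ, the ensemble-mean uncertainty taken as their rms is:

```python
# Sketch: ensemble-mean uncertainty as the rms of individual run uncertainties.
# The u_i values are arbitrary illustrative numbers, not from the paper.
import numpy as np

u_i = np.array([14.0, 15.0, 16.0, 15.5])   # per-run centennial uncertainties, K
u_mean = np.sqrt(np.mean(u_i**2))          # root-mean-square
print(round(float(u_mean), 1))
```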

Rich wrote, “*The “systematic squashing” of the +/-4 W/m² annual error in LCF inside the GCMs is an issue of which I for one was unaware before Pat Frank’s paper. The implication of comments by Roy Spencer is that there really is something like a “magic” component R₃(t) anti-correlated with R₂(t), … GCM experts would be able to confirm or deny that possibility.*”

Another mistake: the ±4 Wm⁻² is not error. It is uncertainty: a statistic. The uncertainty is not squashed. It is ignored. The unreliability of the GCM projection remains no matter that errors are made to cancel in the calibration period.

GCMs do deploy offsetting errors, but studied model tuning has no impact on simulation uncertainty. Offset errors do not improve the underlying physical description.

Typically, error (not uncertainty) in long wave cloud forcing is offset by an opposing error in short wave cloud forcing. Tuning allows the calibration target to be reproduced, but it provides no reassurance about predictive reliability or accuracy.

General conclusions:

The entire analysis has no critical force.

The proposed emulator is constitutively inapt and tendentious.

Its derivation is mathematically incoherent.

The derivation is dimensionally unsound, abuses operator mathematics, and deploys unjustified assumptions.

Offsetting calibration errors are incorrectly and invariably claimed to promote predictive reliability.

The Stefan-Boltzmann equation is inverted.

Operators are improperly treated as coefficients.

Accuracy is repeatedly abused and ejected in favor of precision.

The GCM emulator (paper eqn. 1) is fatally confused with the error propagator (paper eqn. 5.2).

The analytical focus of the paper is fatally misconstrued to be model means.

The difficulties of measurement error and model error are assumed away, by falsely and invariably asserting all error to be random.

Uncertainty statistics are wrongly and invariably asserted to be physical error.

Systematic error is falsely asserted to be a fixed constant bias offset.

Uncertainty in temperature is falsely construed to be an actual physical temperature.

Ad hoc assumptions about error normality are empirically unjustified.

The JCGM description of standard uncertainty variance is self-advantageously misconstrued.

The described use of rulers or thermometers is unrealistic.

Readers are advised to read and record false precision.

**Appendix: A discussion of Error Analysis, including the Systematic Variety**

Rich also posted a comment under his “*What do you mean by “mean”*” critique here, attempting to show that systematic error cannot be included in an uncertainty variance.

Comments closed on the thread before I was able to finish a critical reply. The subject is important, so the reply is posted here as an Appendix.

In his comment, Rich assumed uncertainty to be the dispersion of a random variable with mean *b* and variance *s²*. He concluded by claiming that an uncertainty variance cannot include bias errors.

Bias errors are another name for systematic errors, which Rich represented as a non-zero mean of error, ‘*b*.’

Below, I go through a number of relevant cases. They show that the mean of error, ‘b’, never appears in the formula for an error variance. They also show that the systematic errors from uncontrolled variables must be included in an uncertainty variance.

That is, the foundation of Rich’s derivation, which is:

“*the uncertainty of a sum of n independent measurements with respective [error] means bᵢ and variances vᵢ is that given by JCGM 5.1.2 with unit differential: sqrt(sumᵢ g(vᵢ,bᵢ)²), where v = sumᵢ vᵢ, b = sumᵢ bᵢ.*”

is wrong.

Given that mistake, the rest of Rich’s analysis there also fails, as demonstrated in the cases that follow.

Interestingly, the mean of error, ‘b,’ does not enter in the variance equation (10) in JCGM 5.1.2, either.

++++++++++++

For any set of n measurements xᵢ of X, xᵢ = X + eᵢ, where eᵢ is the total error in the xᵢ.

Total error eᵢ = rᵢ + dᵢ where rᵢ = random error and dᵢ = systematic error.

The errors eᵢ cannot be known unless the correct value of X is known.

In what follows “sumᵢ” means sum over the series of i where i = 1 → n, and Var[x] is the error variance of x.

**Case 1: X is known.**

__1.1) When X is known, and only random error is present.__

The experiment is analogous to an ideal calibration of the method.

Then eᵢ = xᵢ – X, and Var[x] = [sumᵢ(xᵢ – X)²]/n = [sumᵢ(eᵢ)²]/n. In this case eᵢ = rᵢ only, because systematic error = dᵢ = 0.

Then [sumᵢ (eᵢ)²]/n = [sumᵢ(rᵢ)²]/n.

For n measurements of xᵢ the mean of error = b = sumᵢ(eᵢ)/n.

When only random error contributes, the mean of error b tends to zero at large n.

So, Var[x] = sumᵢ[(xᵢ – X)²]/n = sumᵢ[((X + rᵢ) – X)²]/n = sumᵢ[(rᵢ)²]/n

and the standard deviation describes a normal dispersion centered around 0.

Thus, when error is a random variable, the mean of error ‘b’ does not appear in the variance.

In case 1.1, Rich’s uncertainty, sqrt[sumᵢ g(vᵢ,bᵢ)²] is not correct and in any event should have been written sqrt[sumᵢ g(sᵢ,bᵢ)²].
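Case 1.1 is easy to check numerically. The sketch below (illustrative Python; the true value, error SD, and sample size are arbitrary choices for the example) draws purely random errors about a known X and confirms that the error mean b tends to zero while the variance is just the mean square of the random errors:

```python
import random

random.seed(1)
X = 20.0                                   # known true value
n = 100_000
x = [X + random.gauss(0.0, 0.3) for _ in range(n)]   # random error only

b = sum(xi - X for xi in x) / n            # mean of error: tends to 0 at large n
var = sum((xi - X) ** 2 for xi in x) / n   # error variance = sum_i(r_i^2)/n
# note: 'b' appears nowhere in the variance formula; var ~ 0.3^2 = 0.09
```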

__1.2) X is known, and both random error and constant systematic error are present.__

When the dᵢ are present and constant, then dᵢ = d for all i.

The mean of error = ‘b’ = sumᵢ[(xᵢ – X)]/n = sumᵢ[(X + rᵢ + d) – X]/n = sumᵢ[(rᵢ + d)]/n = nd/n + sumᵢ[(rᵢ)]/n, which goes to ‘d’ at large n.

Thus, in 1.2, b = d.

And: Var[x] = sumᵢ[(xᵢ – X)²]/n = sumᵢ{[(X + rᵢ + d) – X]²}/n = sumᵢ [(rᵢ+d)²]/n, which produces a dispersion around ‘d.’

Thus, because X is known and ‘d’ is constant, ‘d’ can be found exactly and subtracted away.

The mean of the final error, ‘b’ never enters the variance.

That is, b → d, a real-number constant that can be known and can be corrected out of subsequent measurements of samples.

This last remains true in other laboratory samples where the X is unknown, because the method has been calibrated against a similar sample of known X and the methodological ‘d’ has been determined. That ‘d’ is always constant is an assumption, i.e., that experimenter error is absent and the methodology is identical.
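A numerical sketch of Case 1.2 (again with arbitrary illustrative values) shows the error mean converging on the constant systematic error d, which, because X is known, can be estimated and subtracted away:

```python
import random

random.seed(2)
X = 20.0      # known calibration value
d = 0.7       # constant systematic error, unknown to the analyst
n = 100_000
x = [X + random.gauss(0.0, 0.3) + d for _ in range(n)]

b = sum(xi - X for xi in x) / n        # mean of error -> d at large n
d_hat = b                              # X is known, so d can be found...
corrected = [xi - d_hat for xi in x]   # ...and subtracted away
```

After the correction the mean of the corrected measurements recovers X, which is the point of calibrating against a sample of known value.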

**Case 2: X is UNknown, and both random error and systematic error are present**

Then the mean of xᵢ = [sumᵢ (xᵢ)/n] = x_bar.

As before, let xᵢ = X + eᵢ = X + rᵢ + dᵢ.

Var[x] = sumᵢ[(xᵢ – x_bar)²]/(n-1), and the SD describes a dispersion around x_bar.

__2.1) Systematic error = 0.__

If dᵢ = 0, then eᵢ = rᵢ is random, and x_bar becomes a good measure of X at large n.

Var[x] = sumᵢ[(xᵢ – x_bar)²]/(n-1) = sumᵢ[((X + rᵢ) – (X + r_r))²]/(n-1) = sumᵢ[(rᵢ – r_r)²]/(n-1), where r_r is the residual of error in x_bar over interval ‘n’.

As above, the mean of error ‘b’ = sumᵢ[(xᵢ – x_bar)]/n = sumᵢ[(X + rᵢ) – (X + r_r)]/n = sumᵢ[(rᵢ – r_r)]/n = sumᵢ(rᵢ)/n – n(r_r)/n, and b = r_bar – r_r, where r_bar is the average of error over the ‘n’ interval.

Then b is a real number, which again does not enter the uncertainty variance and which approaches zero at large n.
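Case 2.1 in the same illustrative style: X is withheld from the analysis, the sample mean stands in for it, and the (n-1) variance again contains no term in b:

```python
import random

random.seed(3)
X = 20.0   # unknown to the analyst; used here only to generate the data
n = 100_000
x = [X + random.gauss(0.0, 0.3) for _ in range(n)]

x_bar = sum(x) / n                                  # good measure of X at large n
var = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)  # dispersion about x_bar
# again no term in 'b'; var ~ 0.3^2, as in the known-X case
```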

__2.2) If d__ᵢ__ is constant = c__

Then xᵢ = X + rᵢ + c.

The error mean = ‘b’ = sumᵢ[(xᵢ – x_bar)]/n = sumᵢ[(X + rᵢ + c) – (X + c + r_r)]/n = sumᵢ[(rᵢ – r_r)]/n = sumᵢ(rᵢ)/n – n(r_r)/n, and b = (r_bar – r_r), wherein the signs of r_bar and r_r are unspecified.

The Var[x] = sumᵢ[(xᵢ – x_bar)²]/(n-1) = sumᵢ [(X+rᵢ+c) – (X+r_r+c)]²/(n-1) = sumᵢ[(rᵢ – r_r)²]/(n-1).

The variance describes a dispersion around r_r.

The mean error, ‘b’ does not enter the variance.

**Case 3: X is UNknown and systematic error, d**ᵢ**, varies due to uncontrolled variables.**

Uncontrolled variables mean that every measurement (or every model run) is impacted by inconstant deterministic perturbations, i.e., inconstant causal influences. These modify the value of each result with unknown biases that vary with each measurement (or model run).

Any measurement, xᵢ = X +rᵢ + dᵢ, and dᵢ is a deterministic, non-random variable, and usually non-zero.

Over two measurement sequences of number n and m, the mean errors are bn = sumᵢ(rᵢ + dᵢ)/n and bm = sumj(rj + dj)/m, and bn ≠ bm even if interval n equals interval m.

Var[x]n = sumᵢ[(xᵢ – x_bar-n)²]/(n-1), where x_bar-n is x-bar over sequence n.

Var[x]m = sumj[(xj – x_bar-m)²]/(m-1)

and Var[x]n = sumᵢ[(xᵢ – x_bar-n)²]/(n-1) = sumᵢ[(X + rᵢ + dᵢ) – (X + r_r + d_bar-n)]²/(n-1) = sumᵢ[(rᵢ – r_r + dᵢ – d_bar-n)²]/(n-1) = sumᵢ[rᵢ – (d_bar-n + r_r – dᵢ)]²/(n-1).

Likewise, Var[x]m = sumj[rj – (d_bar-m + r_r – dj)]²/(m-1).

Thus, neither bn nor bm enter into either Var[x], contradicting Rich’s assumption.

The dᵢ, dj enter into the total uncertainty of the x_bar-n, x_bar-m. Further, the variation of dᵢ, dj with each i, j means that the dispersion of Var[x]n,m will include the dispersion of the dᵢ, dj. The deterministic cause of dᵢ, dj will very likely make their distribution non-normal.

That is, when systematic error is inconstant due to uncontrolled variables, dᵢ will vary with each i, and will produce a dispersion represented by the standard deviation of the dᵢ.

This negates the claim that systematic error cannot contribute an uncertainty interval.

Also, x_bar-n ≠ x_bar-m, and [dᵢ – d_bar-n] ≠ [dj – d_bar-m].

Therefore Var[x]n ≠ Var[x]m, even at large n, m and including when n = m over well-separated periods.
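Case 3 can be illustrated by giving each sequence a deterministic drift as a stand-in for uncontrolled variables (all functional forms and numbers here are invented for the sketch). Two sequences of equal length then return unequal variances, and the dispersion of the dᵢ enters both:

```python
import math
import random

def sequence(seed, n, phase):
    # x_i = X + r_i + d_i, where d_i is a deterministic, inconstant
    # systematic error; a slow sinusoidal drift stands in for the
    # effect of uncontrolled variables
    random.seed(seed)
    X = 20.0   # unknown to the analyst
    x = [X + random.gauss(0.0, 0.3) + 1.5 * math.sin(0.01 * i + phase)
         for i in range(n)]
    x_bar = sum(x) / n
    var = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    return x_bar, var

x_bar_n, var_n = sequence(seed=3, n=5000, phase=0.0)
x_bar_m, var_m = sequence(seed=4, n=5000, phase=2.0)
# var_n != var_m even though n = m, and both variances are inflated well
# beyond the random-error-only value (~0.09) by the dispersion of the d_i
```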

**Case 4: X is known, and d**ᵢ** varies due to uncontrolled variables.**

This is a case of calibration against a known X when uncontrolled variables are present, and mirrors the calibration of GCM-simulated global cloud fraction against observed global cloud fraction.

__4.1) A series of n measurements.__

Here, eᵢ = xᵢ – X, and Var[x] = sumᵢ[(xᵢ – X)²]/n = sumᵢ(eᵢ)²/n = sumᵢ[(rᵢ + dᵢ)²]/n = [u(x)²].

As eᵢ = rᵢ + dᵢ, Var[x] = sumᵢ[(rᵢ + dᵢ)²]/n, but the values of each rᵢ and dᵢ are unknown.

The denominator is ‘n’ rather than (n-1) because X is known and degrees of freedom are not lost to a mean in calculating the standard variance.

For n measurements of xᵢ the mean of error = b = sumᵢ(eᵢ)/n = sumᵢ(rᵢ + dᵢ)/n, which varies with ‘n’ because dᵢ varies in an unknown but deterministic way across n.

However, X is known, therefore (xᵢ – X) = eᵢ is known to be the true and complete error in the i-th measurement.

At large n, the sumᵢ(rᵢ) becomes negligible, and Var[x] = sumᵢ[(eᵢ)²]/n = sumᵢ[(dᵢ)²]/n = [u(x)²] at the limit, which is very likely a non-normal dispersion.

The systematic error produces a dispersion because the dᵢ vary. At large n, the uncertainty reduces to the interval due to systematic error.

The mean of error, ‘b,’ does not enter the variance.

The claim that systematic error cannot produce an uncertainty interval is again negated.
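A sketch of the Case 4.1 calibration (hypothetical numbers, chosen loosely to evoke a cloud-fraction calibration against a known observed value) shows the calibration variance converging on the dispersion of the inconstant systematic error:

```python
import math
import random

random.seed(5)
X = 0.68    # known calibration target, e.g. an observed cloud fraction
n = 5000
# total error e_i = r_i + d_i: small random error plus a deterministic
# but inconstant systematic error (an invented slow oscillation)
e = [random.gauss(0.0, 0.01) + 0.05 * math.cos(0.02 * i) for i in range(n)]
x = [X + ei for ei in e]

u_sq = sum((xi - X) ** 2 for xi in x) / n   # calibration variance [u(x)]^2
u = math.sqrt(u_sq)
# at large n the random part is swamped: u approaches the dispersion
# of the d_i, i.e. the interval due to systematic error
```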

**Case 5: X is UNknown and dᵢ varies due to uncontrolled variables. The experimental sample is physically similar to the calibration sample in Case 4.**

5.1) Let xᵢ’ be the i-th of n measurements of experimental sample 5.

The estimated mean of error = b’ = sumᵢ(x’ᵢ – x’_bar)/n

When x’ᵢ is measured, and X’ is unknown, Var[x’] = sumᵢ[(x’ᵢ – x’_bar)²]/(n-1).

= sumᵢ[(X’ + r’ᵢ + d’ᵢ) – (X’ + d’_bar + r’_r)]²/(n-1).

and Var[x’] = sumᵢ[r’ᵢ + (d’ᵢ – d’_bar – r’_r)]²/(n-1).

Again, the mean of error b’ does not enter into the empirical variance.

And again, the dispersion of the implicit dᵢ contributes to the total uncertainty interval.

The empirical error mean ‘b’ is not an accuracy metric because the true value of X is not known.

The empirical dispersion, Var[x’], is an uncertainty interval about x’_bar within which the true value of X’ is reckoned to lie. Var[x’] does not describe an error interval, because the true error is unknown.

In the event of an available calibration, the uncertainty variance of the mean of the xᵢ’ can be assigned as the methodological calibration variance [(u(x)]² as in 4.1, if the multiple of measurements is close to the conditions of the calibration experiment.

The methodological uncertainty then describes an interval within which the true value of X’ is expected to lie. The uncertainty interval is not a dispersion of physical error.

Modesty about uncertainty is deeply recommended in science and engineering. So, if measurement n₅ < calibration n₄, we choose the conservative empirical uncertainty, [u(x’)²] = Var[x’] = sumᵢ[(x’ᵢ – x’_bar)²]/(n-1).

The estimated mean of error, b’, does not enter the variance.

The presence of the d’ᵢ in the variance again ensures that the uncertainty interval includes a contribution from the interval due to variable systematic error.

__5.2) A single measurement of x’. The experimental sample is again similar to 4.__

One measurement does not have a mean, so (x’ᵢ – x’_bar) is undefined.

However, from 4.1, we know the methodological [u(x)²] from the calibration experiment using a sample of known X.

Then, for a single measurement of x’ in an unknown but analogous sample, we can indicate the reliability of x’ by appending the standard deviation of the known calibration variance above, sqrt(u(x)²) = ±u(x).

Thus, the measurement of x’ is conveyed as x’±u(x), and ±u(x) is assigned to any given single measurement of x’.

Single measurements do not have an error mean, ‘b,’ which in any case cannot appear in the error statement of x’.

However, the introduced calibration variance includes the uncertainty interval due to systematic error.

The uncertainty interval again does not represent the spread of error in the measurement (or model output). It represents the interval within which the physically true value is expected to lie.

**Conclusions**:

In none of these standard cases does the error mean, ‘b,’ enter the error variance.

A constant systematic error does not produce a dispersion.

When variable systematic error is present and the X is __known__, the uncertainty variance of the measured ‘x’ represents a calibration error statistic.

When variable systematic error is present and X is unknown, the true error variance in measured x is also unknown, but is very likely a non-normal uncertainty interval that is not centered on the physically true X.

That uncertainty interval can well be dominated by the dispersion of the unknown systematic error. A calibration uncertainty statistic, if available, can then be applied to condition the measurements of x (or model predictions of x).

Rich’s analysis failed throughout.

This Appendix finishes with a very relevant quote from Vasquez and Whiting (2006):

“*[E]ven though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected. … Evidence and mathematical treatment of random errors have been extensively discussed in the technical literature. On the other hand, evidence and mathematical analysis of systematic errors are much less common in literature.*”

My experience with the statistical literature has been the almost complete neglect of systematic error as well. Whenever it is mentioned, systematic error is described as a constant bias, and little further is said of it. The focus is on random error.

One exception is in Rukhin (2009), who says,

“*Of course if [the expectation value of the systematic error is not equal to zero], then all weighted means statistics become biased, and [the mean] itself cannot be estimated. Thus, we assume that all recognized systematic errors (biases) have been corrected for …*”

and then off he goes into safer ground.

Vasquez and Whiting go on: “*When several sources of systematic errors are identified, ‘β’ is suggested to be calculated as a mean of bias limits or additive correction factors as follows:*

*β = sqrt{sumᵢ[u(x)ᵢ]²}*

*where i defines the sources of bias errors, and [dᵢ] is the bias range within the error source i . Similarly, the same approach is used to define a total random error based on individual standard deviation estimates,*

*eₖ = sqrt{sumᵢ[σ(x)ᵢ]²}*”

That is, Vasquez and Whiting advise estimating the variance of non-normal systematic error using exactly the same mathematics as is used for random error.

They go on to advise combining both into a statement of total uncertainty as,

*u(x)_total = sqrt[β² + (eₖ)²].*

The Vasquez and Whiting paper completely justifies the method of treating systematic error employed in “Propagation…”
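The Vasquez and Whiting combination can be written directly (a sketch of their root-sum-square prescription; the function name and input values are invented for the example):

```python
import math

def total_uncertainty(bias_limits, random_sds):
    # beta: root-sum-square of the identified systematic bias limits
    beta = math.sqrt(sum(b * b for b in bias_limits))
    # e_k: root-sum-square of the individual random standard deviations
    e_k = math.sqrt(sum(s * s for s in random_sds))
    # total uncertainty combines both in quadrature: sqrt(beta^2 + e_k^2)
    return math.sqrt(beta ** 2 + e_k ** 2)

# hypothetical example: two systematic sources and two random sources
u_total = total_uncertainty([0.3, 0.4], [0.05, 0.12])
```

The point of the sketch is that the systematic terms are propagated with exactly the same root-sum-square mathematics as the random terms, which is the treatment at issue.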

++++++++++++

References:

V. R. Vasquez and W. B. Whiting (2006) *Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods* Risk Analysis 25(6), 1669-1681; doi: 10.1111/j.1539-6924.2005.00704.x

A. L. Rukhin, (2009) *Weighted means statistics in interlaboratory studies* Metrologia 46, 323-331; doi: 10.1088/0026-1394/46/3/021

My head just exploded and I was a physicist

I am so glad to hear that! I was not a physicist so, I expected my head to explode, which it did. Glad I’m in good company. I did get the gist of it, I think, which is that climate models are predictive of nothing but whatever the modeler wants. Is that close enough?

I have a built-in safety. My eyes glaze over, preventing cranial explosion.

“I did get the gist of it, I think, which is that climate models are predictive of nothing but whatever the modeler wants. Is that close enough?”

wrong. the models dont match observations in some key areas.

skeptics make 2 contradictory arguments:

A) models output what the scientists want

B) models fail because they dont match observations

Mosher

Your statement is illogical. For A), the implication is that scientists want lots of warming; the models provide that. For B), observations show little warming. There is nothing contradictory there. Move along, there is nothing to see here.

I’m with Clyde S’s reply on that, which basically means that I do not see a contradiction either.

(A) Models output quite a bit of warming. Climate “scientists” WANT warming.

(B) Such output of quite a bit of warming that scientists want does not match observations.

Where’s the contradiction?

Climate scientists want something that is not there.

Climate models output something that is not there.

Models and scientists, thus, fail.

Steve Mosher, “*wrong. the models dont match observations in some key areas.*”

Wrong. Models are forced to match observations in some key areas.

Their match of observations is no indication of physical accuracy. Their dis-match of observations is a positive indication of physical inadequacy.

AGW asserters make two contradictory arguments:

1) model air temperature projections are just story lines, not predictions

2) inter-model consistency means air temperature projections are predictions

I’m not even prepared to attack something that long, especially after not being particularly impressed with previous offerings.

Greg, your impressions are not important to the conversation.

How about MY impression of readers like Greg? — not important scientifically, of course, but in terms of willingness to intellectually engage deeply, … perhaps.

Does this count as a passive-aggressive ad hominem? Sorry, I couldn’t resist.

Carry on with the real discussion, intact-head folk. I will, at least, try to tune in, without being dissuaded by my previous impressions, which went something like, “Good God, I’ll never be able to understand that level of technical detail!”

… yet here I am — I WILL read it all, and take away what I can.

I just skipped 99% of it.

The problem is the model.

I used to say if one modeled Earth as a world completely covered with an ocean, then you might get somewhere.

But it seems that is too hard. Now I have a simpler idea: the average temperature of the entire volume of the ocean determines global climate, or global temperature.

And Earth’s average ocean temperature is currently about 3.5 C.

And having an entire-volume ocean temperature of 3.5 C means we have a cold ocean. And during our current Ice Age, the average temperature of the ocean has been in the range of about 1 to 5 C.

Whenever the ocean temperature is nearer 5 C, Earth is in the warmest periods of an interglacial period.

Increasing the ocean temperature from about 3.5 C to 4 C will result in a significant amount of “global warming”. But such a significant amount of global warming has nothing to do with the Earth becoming “too hot”. Though one could characterize the world as becoming more tropical.

And if or when the ocean warms to 4 C, we are still in an Ice Age.

Likewise there would be a huge effect if ocean temperature were to cool from about 3.5 C to about 3 C.

Though just a .5 C increase doesn’t cause Earth to become “too hot”, a .5 C decrease doesn’t cause Earth to become too cold. A .5 C drop in ocean temperature is an indication that we could be entering a glacial period. Or, since we can’t predict when we are going to enter a glacial period [and apparently we currently can’t], a .5 C drop in ocean temperature indicates we are somewhere near the “brink” of entering one.

And if we are entering a glacial period, it still doesn’t mean Earth is “too cold”; the effects would be a tendency toward more desertification and/or Earth becoming less “tropical”. Of course another aspect of entering a glacial period is the growth of temperate glaciers. Another aspect of cooling oceans is more violent weather in the Temperate Zones. And you can get warmer winters, but most notable are the colder winters; generally you get more extreme weather.

I can give a concise, ten-word summary:

As serious predictive tools for policy makers, climate models suck.

Yes, I am proposing that the last word in my summary be instituted as standard language in modern scientific writing.

Seriously, though, the thing about this article that attracted me was the phrase, “physical reasoning” in the title.

I’m getting the impression that this phrase, “physical reasoning” has a more in-depth, defined meaning in computing than I associate it with my plain-language understanding of the phrase. But I also get the impression that my plain-language understanding and the possibly more in-depth formal meaning might not be so separated.

During my intense, although brief, encounter with the study of mathematics, I was always amazed by students who could crank out the calculations flawlessly at the highest level of complexity. But I always wondered whether they really understood the meaning of what they were doing. I always needed to understand the deep meaning of what I was doing, and I felt as though people were not taught this — there was no time for this. I needed to go more slowly, connecting to first principles, first definitions, and so forth. I could not advance as fast as my robotic-minded comrades. Teaching math does not seem to be oriented in this way at universities. I could not do it the way it was being done, and so I left the scene.

It seems easy to be captivated by the sheer power of math, so much so that the meaning of it can get lost, and hence the person gleefully cranking it out can also get lost, because he/she has lost touch with the meaning, which is what I would call “physical meaning”.

This is why I have to ask questions like, “What does a global average temperature really mean?” OR “What does an average solar flux really mean?” The math can be correct, sure, no doubt, BUT is the meaning correct? — I think not. That’s why I treat the concept of “global average temperature” with a great sense of caution, and why I flat-out reject the concept of “average solar flux”.

Resolving this argument between Pat and Richard, then, seems to be a very, very advanced exercise in pinpointing the minutia of differences between a strictly calculating approach and a calculating-with-meaning approach. I cannot understand the minutia, for sure, but I think I get the gist that implies that people can weave all manner of complex math justifications to cover their failures to understand the meaning behind their math.

I don’t feel so bad now either. It’s going to take me a few days to read it all, but I think the “short take-home message” was very helpful.

Will read it later, but it shines a light on something that even us lesser-trained knew about long ago.

Statisticians should enter research as in Smith, E.P. 2020. Ending Reliance on Statistical Significance Will Improve Environmental Inference and Communication. Estuaries and Coasts 43, 1–6. https://doi.org/10.1007/s12237-019-00679-y https://link.springer.com/article/10.1007/s12237-019-00679-y

From the abstract–“Numerous authors have commented and criticized its use as a means to identify scientific importance of results and have called for an end to using the term “statistical significance.” Recent articles in Estuaries and Coasts were evaluated for reliance on the use of statistical significance and reporting errors were identified.”

Thanks for the reference HD Hoese.

That paper looks like an important corrective tonic, and went right into my Endnote library.

My serious thanks to Anthony and Charles for posting my essay, and to all of you.

Thank you for taking the time to unwind Booth’s paper; for me it was obtuse, opaque, and hard-to-read, and now I know why. It was so obtuse that I completely missed how the dimensions in his main equation are unbalanced. I will note that when I pointed out his treatment of uncertainty did not follow the JCGM GUM, he had no real answer.

Thanks Pat for your systematic explorations and Monte for JCGM GUM ref.

BIPM’s “GUM: Guide to the Expression of Uncertainty in Measurement”

See Evaluation of measurement data — Guide to the expression of uncertainty in measurement. JGCM 100:2008 PDF

Sadly this international guide standard is rarely if ever mentioned or applied by the IPCC or climate science authors. While they address statistical distributions, they hardly ever mention systematic uncertainties (Type B) that can be as large or larger than the statistical (Type A) errors.

The wide gap between climate model predictions of the Anthropogenic Signature and the reality of satellite and radiosonde data is exposed by McKitrick & Christy 2018 etc.

McKitrick R, Christy J. A Test of the Tropical 200‐to 300‐hPa Warming Rate in Climate Models. Earth and Space Science. 2018 Sep;5(9):529-36.

When will Booth etc. dare to address the massive systematic uncertainties between those? Where are they? How do we identify them? How large are they? etc.

https://www.bipm.org/en/publications/guides/gum.html

https://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf

https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2018EA000401

A few special characters have mistranslated into HTML. Most especially the special character forward arrow, such as ->, translated as an R inscribed within a circle.

There are a few other special character glitches, including that none of the Word-defined super- or subscripts appear, which I hope do not cause anyone any trouble.

Pat,

I suspect I have stumbled on another instance of mistranslation. Is the following correct, and if not, what did you intend? “…, and 0 £ a £ 1.”

Hi Clyde, regrets about that.

The British pounds in the original were ‘less than or equal to,’ as in 0 ≤ a ≤ 1. Or, if that doesn’t come out either, 0 < or = a < or = 1.

Hi Pat,

Looking forward to your thorough, expanded version of this rebuttal.

Over on Climate Etc I am showing examples of problems with temperature measurements and methods and getting blank stares in reply, plus a few diehard comments invoking authority.

Sooner or later, proper error treatment will sink in for climate researchers. It is so important. Geoff S

Pat Frank, thank you again for your effort.

“Rich’s description of an a priori random variable status for some as-yet unmeasured state is wrong when the state of interest, though itself unknown, falls within a regime treated by physical theory, such as air temperature. Then the a priori meaning is not the statistical distribution of a random variable, but rather the unknown state of a deterministic system that includes uncontrolled but explicable physical effects.”

I think you are wrong there.

Unfortunately the universal presence of random measurement variation (variation that is neither exactly predictable nor reproducible) makes the last sentence untestable. There was much discussion of empirical (epistemological) random variability and fundamental (metaphysical) random variability in the quantum mechanical developments of the early 20th century. If you can’t tell whether there is in fact an underlying determinism, then you are unwise to assume one. Both before and after measurement, all that is known empirically about a measurable attribute is the range of its most likely values and estimates of the parameters of the distribution. This is true of the center of mass of an aircraft and the center of lift of its wing, as well as your blood hemoglobin concentration and the O2 saturation of the blood. Without believing in the accuracy of the Bayesian mathematics, you can do well to think of measurement as reducing the variance of the measured quantity, and reducing the bias, but not of eliminating the random variation.

Random variation in measurement outcomes is the most thoroughly documented result in all of empirical science. Whatever you think is the outcome of a deterministic process, the next 3 measurements of it will, with high probability, not all be equal.

Admittedly, some measurement variation is extremely slight. Modern measurements of the speed of light and Avogadro’s number are so precise that the former is taken as a constant and used to redefine the meter; the second has been proposed as a constant to redefine mass. Cases like that are not common.

I don’t think my metaphysical/epistemological note affects any thing substantial in your response to Richard Booth.

Hi Matthew, thanks for your kind encouragement.

In your comment, it looks to me like you’ve first raised the inevitable appearance of random error in some measurement, i.e., your “random measurement variation.” That appears to concern measurement error rather than the measurement itself.

In the bit you quoted, Rich wasn’t addressing error. He was addressing the unknown state as though it should be considered a random variable prior to measurement.

That is not true in science, where theory is always present. The measurement either validates the theory or refutes the theory.

Even when some new phenomenon is discovered, it is interpreted as unexpected, given a deficient explanatory scope of the existing theory. In science, as-yet unmeasured phenomena are neither viewed nor defined to be random variables until they are measured.

Quantum Mechanics is a fully deterministic theory, as the quantum state itself evolves in an entirely deterministic way described by the equations of the theory; in particular as governed by the Bell inequality (pdf).

The fact that quantum states emerge into a probabilistic distribution when they are scattered, e.g., on their measurement, does not imply that the unmeasured state itself is a random variable.

I don’t want to get into the philosophy of QM, which would be a distraction in this forum. And in any case, far too much air has been expended on it by others.

I just want to know when my Infinite Improbability Drive will be ready.

Flash Gordon had it in 1964, Jeff. I know because it was reliably reported in the comics. Somehow, we lost it. 🙂

Pat Frank:

“That appears to concern measurement error rather than the measurement itself.”

There is no “measurement itself” free from measurement error. It’s at best a conceptual distinction which disappears when actual measurements are undertaken.

No problem with that, Matthew.

I had measurement as a category in mind.

Matthew:

“Unfortunately the universal presence of random measurement variation”

Think of it this way. When you go out to read your thermometer in the morning just how many measurements do you take? Do you read it once? Twice?

In order for there to be a “random measurement variation” you need multiple measurements. One or two don’t give you any kind of a probability distribution which you can use to determine a “variation”.

But there *is* an uncertainty associated with that morning measurement. For all the reasons that Pat discussed.

If you use that measurement to determine a daily average then how do you account for the uncertainty you have associated with that morning measurement in the daily average itself? The uncertainty doesn’t just disappear when you form the average. If that uncertainty interval is greater than the difference you are trying to discern, e.g. a +/-0.01deg difference from a +/-0.5deg uncertainty, then you are only fooling yourself. If you follow the rules of significant digits when you calculate the average, i.e. one decimal place, you won’t even see a 0.01deg difference from your average calculation!

There are too many computer scientists and statisticians associated with the GCMs and too few engineers and physical scientists. The GCMs shouldn’t even be outputting temperature differences past the first decimal; their inputs just don’t justify any more significant digits than that! Any engineer or physical scientist can tell you that simple rule of significant digits!

Many modern thermometers are electronic. As far as I know, the most common use a resistance measurement (voltage drop compared to a standard) of a bit of special metal that is (hopefully) calibrated to the temperature of said piece of metal, and which is hopefully in close agreement with the air temperature. I have read many comments about the ability of such thermometers to make very brief interval measurements. A frequent complaint is that in some places a momentary spike, which can be fairly large, will be recorded as the official high temperature (e.g. in Australia).

I have also read that the international metrological standard practice for such thermometers is to make one measurement per second for two and a half minutes, calculate the average over those 2.5 minutes, throw out any extreme values (greater than 2 SD?), calculate the average of the remaining values, and record that as the official temperature. I have also read that NOAA uses 5 minutes instead of 2.5 minutes.

It should be possible to empirically determine, in general, what the distribution and variation in air temperature is over an average measurement period. In this way one could determine what the reasonable uncertainty is that should be expressed. In the best case it would be like making multiple unbiased measurements of something that does not vary and thus getting a random distribution of variations that will reduce the uncertainty in the simplest way. This should include the variation of the instrument itself when the temperature of the air surrounding the instrument is held to a closer temperature than the precision of the instrument (i.e. how linear the thermometer is).

However, I have no idea if such measurements will usually produce anything like a normal distribution, or if any such validation is employed. Does anyone?

AndyHce

You asked "if such measurements will usually produce anything like a normal distribution". Sometimes they will, sometimes they won't. A typical day will provide an approximately sinusoidal temperature change from sunrise to sunset. Now, imagine a day when a cold front moves through in the late morning and the temperature drops 20 deg F. That will not be a normal distribution, and may not even be symmetrical.

Andy,

Modern devices being electronic doesn't do away with uncertainty in their readings. Even the thermistors used in the Argo floats are non-linear over the range of temperatures they measure, and it gets worse when the thermistor is embedded in a device that has its own contribution to uncertainty.

Averaging over a period of time has its own uncertainty. In a two minute period clouds can put the device in shade, out of shade, and in shade. A five minute period makes it even worse. Even a one minute exposure to bare sun can change the temperature of the atmosphere surrounding the measurement device. So can one minute of shade. So what does that do to the uncertainty surrounding the temperature measurements? How do you determine what a “reasonable uncertainty” actually is?

It would seem that, as with so many, you are confusing error with uncertainty. You can take multiple measurements of the same thing using the same device and use the central limit theorem to determine a more accurate mean. But spreading temperature measurements over a period of time is *not* taking multiple measurements of the same thing even though the same device is used. That's like measuring 1000 8′ 2″x4″ boards with the same measuring tape, averaging the results, and then saying that average is the actual length of each of the 1000 boards. Common sense should be all you need to understand that just isn't the case.
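The board analogy is easy to simulate. The lengths below are invented, assuming each board's true length varies within ±0.5″ of the nominal 96″:

```python
import random
random.seed(42)

# 1000 nominally 8-foot (96") boards, each with its own true length
boards = [96 + random.uniform(-0.5, 0.5) for _ in range(1000)]
avg = sum(boards) / len(boards)

# The average is very close to 96", but it is not the length of any
# particular board: individual boards still differ from it by up to ~0.5",
# no matter how many boards go into the average.
worst_individual_error = max(abs(b - avg) for b in boards)
```

Averaging different things tightens the estimate of the *mean*, not the knowledge of any single board.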

“In the best case it would be like making multiple unbiased measurements of something that does not vary and thus getting a random distribution of variations that will reduce the uncertainty in the simplest way.”

How does the temperature not vary over time? And even if it doesn’t vary how does that lessen the uncertainty associated with the measuring device? You seem to still be addressing error and not uncertainty.

If I take multiple measurements of an 8′ rod using a ruler I think is 12″ long but is actually 13″ long, how does averaging those measurements reduce the error in the mean I determine?

“This should include the variation of the instrument itself when the temperature of the air surrounding the instrument is held to a closer temperature ”

How do you hold the temperature of the outside air to a closer temperature than the precision of the thermometer? Don't confuse calibration with either error or uncertainty. No calibration lab I know of pretends to be able to duplicate all environmental conditions a thermometer might endure in the outside environment. They will duplicate certain, specified conditions and calibrate the instrument to those conditions. Everything else has an uncertainty associated with it. And once the instrument leaves the calibration lab, its calibration will begin to degrade, thus introducing uncertainty into any measurement. How many thermometers in the 19th and 20th centuries were regularly calibrated? Yet we use those measurements as a baseline to compare to today's measurements.

The manipulation of data today adds even *more* uncertainty to the results. When they increase or decrease temperature data to "adjust" it, with no absolute knowledge of the calibration of each instrument whose measurement is thus adjusted, how do we know they aren't *increasing* uncertainty instead of decreasing it?

Matthew ->> “you can do well to think of measurement as reducing the variance of the measured quantity, and reducing the bias, but not of eliminating the random variation.”

Unless you make enough multiple readings of the temperature at a given time to be able to plot a distribution of measurements, you do not have anything but the one reading to work with. Therefore, you can not reduce the variance because there is no variance with one reading, and one reading only. The only uncertainty you have is the uncertainty involved with that one time measurement.

It is also impossible to understand how someone can think that the Central Limit Theorem applies to temperature measurements and their averages. Each time a temperature measurement is made at a given device, that reading is the only measurement that you have, and could ever have, of that measurand at that point in time. You simply cannot make another reading hours later, average the two, and thereby increase the accuracy and precision while also reducing the uncertainty by claiming the CLT lets you divide by N. The same logic applies to averaging readings from multiple devices at multiple locations. Each measurand that goes into the average is a single unique population of 1 (one) with a given precision and a minimum uncertainty specified by the range of the next digit after the recorded value. There are multiple sources on the internet that tell you how to combine populations with different variances. Use them.

Invoking the CLT to divide the population standard deviation by sqrt(N) only tells you how close a sample-mean distribution is to the true mean. It doesn't let you increase the precision and accuracy of the mean, nor reduce the variance of the population. I've done it. Combine all your readings into one population and find the simple mean and the variance. Then go to the work of choosing samples, calculating each sample mean, then finding the mean of that sample-mean distribution. The mean will be the same.
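That last claim, that averaging sample means reproduces the simple mean, is easy to check with made-up readings (equal-sized samples assumed; unequal samples would need weighting):

```python
import random
random.seed(7)

# One reading each from 1200 hypothetical stations, pooled into one population
readings = [random.uniform(-10, 35) for _ in range(1200)]
simple_mean = sum(readings) / len(readings)

# Split into 40 equal samples of 30, average each, then average the sample means
samples = [readings[i:i + 30] for i in range(0, 1200, 30)]
sample_means = [sum(s) / len(s) for s in samples]
mean_of_means = sum(sample_means) / len(sample_means)

# With equal-sized samples the two agree exactly (up to float rounding)
print(simple_mean, mean_of_means)
```

The sampling exercise changes nothing about the mean itself; it only describes how sample means scatter around it.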

One should be careful about using the GUM (the Guide to the Expression of Uncertainty in Measurement) in determining how to handle uncertainties from different devices, different times of reading, and different temperatures. The GUM basically deals with determining the uncertainty associated with the measurement of one thing (the measurand) at a time. It also deals with uncertainty in measuring devices and how their uncertainty should be included in an uncertainty budget.

The GUM has little about trending which is what we are really dealing with. To arrive at a GAT (global average temperature) stock traders probably have a better idea of how it should be done since they deal with indexes made up of unique individual measurements and trend them to resolve index pricing variations.

Sadly, it’s far beyond my horizon

There are many other things, possibly more important, within your horizon.

I trust all that maths has some meaning to it.

"'estimate of the past black box values and to predict the black box output.' That is, his emulator targets model output. It does not emulate the internal behavior or workings of the full model in some simpler way."

Black boxes for climate simulation are most likely a waste of time.

In years past I designed and built a few 'black boxes' which were later manufactured in limited numbers for a number of customers. At the start I knew what the input was and what the required output was meant to be, but I never ever bothered to start with an equation. When the whole lot was done, I would do a number of tests varying the input and recording the output. The data were used to produce graphs with the useful working range clearly indicated, and then the maths of the transfer curve would be 'constructed' and numerically verified. The less wise were suitably impressed with the paperwork, but the engineers were only interested in the simple but essential 'does it do the job'; and as it happens, most of the time it did, though sadly there were one or two failures.

This reminds me of Magritte's famous painting, 'This is not a pipe'.

Climate modelling would have us understand a pipe by looking at a painting of a pipe. The first thing we would notice is that pipes are very thin and cannot actually hold much tobacco. The dimensions of the pipe have been corrupted to produce the painting.

A climate model is not climate. It is a painting of climate. The dimensions of the painting of climate are not the dimensions of the original.

Rich and climate modelers both describe the probability distribution of the output of a model of unknown physical competence and accuracy, as being identical to physical error and predictive reliability.

===========

Mathematically, this is an error. The statistics of a painting of climate will not equal the statistics of actual climate, because your painting is not an exact replica. It is not a pipe. It is simply a picture of a pipe. Perhaps a complex and expensive picture, but a picture all the same.

What was the middle part again?

The climate models have not been Validated or Verified. Hence they are useless.

Regards

Climate Heretic

Not all software is a model of a physical system. Rules have been created for the verification and validation of such software.

The obvious first requirement is that software that models a physical system should accurately reproduce that system’s behavior.

Since climate models fail at reproducing the climate’s behavior, the climate modelers try to use the verification and validation rules which apply to software that doesn’t model physical systems. It’s the old switcheroo. Their claim that their models are verified and validated is just bunk.

I disagree. As engineering process models they are useful in a limited way: for studying certain subsets of atmospheric physics and improving our understanding of them. But they are not fit for the purpose of projecting (predicting) future “global climate” states.

Climate models fail at a very basic level. The linked chart shows the precipitation minus evaporation for the mean of the CMIP5 models for RCP85:

http://climexp.knmi.nl/data/icmip5_pme_Amon_modmean_rcp85_0-360E_-90-90N_n_+++_2000:2020.png

I produced this chart to see how well the models tracked the measured atmospheric water vapour. The TPW (total precipitable water) data is not available on the KNMI site. I realised that if I integrated the pme for each month I should get the TPW.

If you take a close look at the chart it is clear that there is considerably more precipitation than evaporation over each yearly cycle. Integrating over this 20 year period results in a very dry atmosphere. In fact the atmosphere needs to be manufacturing water because the integration results in MINUS 60mm TPW over the 20 years.
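The integration described can be sketched with invented numbers: a sinusoidal seasonal cycle plus a constant 0.25 mm/month precipitation excess, chosen only to reproduce the -60 mm scale. This is not the KNMI data:

```python
import math

months = 240  # 20 years of monthly P - E, in mm/month
# hypothetical seasonal cycle plus a small constant excess of precipitation
pme = [5.0 * math.sin(2 * math.pi * m / 12) + 0.25 for m in range(months)]

# Net precipitation removes water from the atmosphere, so the cumulative TPW
# change is the negative of the accumulated P - E
tpw_change = -sum(pme)  # a 0.25 mm/month imbalance integrates to -60 mm
```

Any persistent imbalance, however small per month, accumulates without bound over the integration, which is why a conserved water budget matters.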

However the models are constructed, they do not bear any resemblance to the real physical world. Given the importance of atmospheric water vapour to global weather and climate, it should be one of the key variables to get physically correct.

Indeed. Increased water vapor is the mechanism which provides the positive feedback which increases the climate sensitivity beyond 2°C per doubling of CO2. As far as I can tell, the models ignore the energy it takes for the evaporation that results in water vapor.

It just occurred to me that it shouldn’t take increased CO2 to raise the temperature, thus causing enhanced evaporation which causes even more greenhouse effect. Just water, by itself, should be able to start the process of runaway global warming. Why doesn’t that happen?

Have I just found another fatal flaw in CAGW, or is that just my cold medication talking?

commieBob,

You aren’t the first one to notice this. It’s the old “don’t believe your lying eyes” magic. Supposedly water vapor by itself doesn’t provide a positive feedback loop which would, sooner or later, run away. It requires CO2 to make the water vapor into a positive feedback loop.

Which, of course, is garbage.

Yes, I pointed out that fallacy a long time ago. The ridiculous “positive feedback loop” wouldn’t need any CO2 to drive it.

What? Another one? (It's been done.) The science is settled! 😂

Rick, my admiration for doing the detailed work, and congratulations for a striking result. It’s very worth publishing; most especially if the same result appears for other scenarios.

Your data are the CMIP5 mean, which means the random error in precipitation and evaporation should have averaged down to some small residual. So, the result you got then reveals a deterministic error.

You can demonstrate that as fact by assessing several models to see if they all make the same error.

Pat

Other than mean, I have only looked at one model – the CSIRO. The KNMI site only provides a 10 run average for the CSIRO Mk3 model. This link is the 2000 to 2020 pme plotted for the RCP85:

http://climexp.knmi.nl/data/icmip5_pme_Amon_CSIRO-Mk3-6-0_rcp85_0-360E_-90-90N_n_+++_2000:2020.png

I have not integrated the actual data but by observation it appears that precipitation and evaporation are better balanced than the model mean.

The KNMI site has numerous runs for CMIP 3 and 5 models. This is the list of CMIP5 runs available:

http://climexp.knmi.nl/selectfield_cmip5.cgi?id=someone@somewhere

You can download the data from the site but it is tedious and any basic analysis demonstrates it is rubbish.

I have plotted the NASA earth observation data for TPW and OLR for the last 3 years:

https://1drv.ms/b/s!Aq1iAj8Yo7jNg1uzA-KKFEvD5BzX

The annual variation in TPW and OLR are positively correlated and in phase. This is the opposite of the “greenhouse effect”.

I was aiming to see how well the models related to what has been measured in the last three years. I didn't bother looking at any more models once I found the model mean was so far "unphysical". The models are fundamentally no better than an X-order polynomial where X is chosen to fit the number of slope reversals in the historical temperature record. Between CMIP3 and CMIP5, there must have been additional orders (more tunable factors) because there were a few more slope reversals that needed to be accounted for. There is no doubt your black box is as effective as any of these unphysical models in predicting some imagined future climate, and orders of magnitude simpler. (Could you imagine putting your hand out for billions to come up with a simple equation?) It is no wonder climate modellers resent the concept of a black box with a single very simple equation.

Congrats Rick, that’s an iceberg below the water line.

“He then wrote that if one instead made ten measurements using ten independently machined rulers then the uncertainty of measurement = “sqrt(10) times the uncertainty of each.” But again, that is wrong.

The original stipulation is equal likelihood across ±0.1″ of error for every ruler. For ten independently machined rulers, every ruler has a length deviation equally likely to be anywhere within -0.1″ to 0.1″. That means the true total error using 10 independent rulers can again be anywhere from 1″ to -1″.

The expectation interval is again (1″ − (−1″))/2 = 1″, and the standard uncertainty after using ten rulers is 1″/sqrt(3) = ±0.58″. There is no advantage, and no loss of uncertainty at all, in using ten independent rulers rather than one. This is the outcome when knowledge is lacking, and one has only a rectangular uncertainty estimate — a not uncommon circumstance in the physical sciences."

The above is not correct. The error distribution for ten different rulers is much different than using one ruler.

Tom,

“The above is not correct. The error distribution for ten different rulers is much different than using one ruler.”

Ten *independent* rulers don’t have an error distribution. Ten independent rulers each have their own separate uncertainty interval. Their uncertainty intervals add.

If one picks a single ruler then its error distribution will be the same rectangular distribution of everything the manufacturer produces, that being an equal likelihood of error from -0.1” to +0.1 inches per foot. A repeated measurement with that one ruler just multiplies the error, so that in the case of 10 repetitions, the error would be from -1.0″ to +1.0″.

If instead one picks 10 rulers all at once and measures by putting them end to end, the result is not going to be an equal likelihood of from -1.0″ to +1.0″. A selection of 10 rulers will result in a group with a normal (Gaussian) distribution with a mean of zero. The probability of picking ten rulers all with the same error is vanishingly small. It's also as Rich said: if you just looked at all the rulers and picked the shortest one and the longest one you could find, you could be almost certain that the mean error of the two rulers would be zero. So my understanding of the problem is that repeated measurement improves the accuracy of the result when the error of the measuring device is not constant, as would be the case with almost any real-world situation I can think of.

This has little to do with the problem of models which have errors of a different sort. The error in measuring todays temperature is one thing; an error in predicting tomorrow’s temperature is something else altogether.

Tom –> Let’s put it a different way. You assumed that “A selection of 10 rulers will result in a group with a normal (gaussian) distribution with a mean of zero”. You simply can’t make this assumption because you are uncertain what each ruler has for an error. Hence the term uncertainty. You could end up with ten rulers too long or ten that are too short or even ten rulers that are right on.

Consequently, you can't assume any distribution. This is why WHEN you use the Central Limit Theorem to predict a "true value" by averaging measurements there are two main assumptions: independent measurements of the same thing and a random distribution of measurements. In many cases not enough measurements of the same thing by the same device are made to ensure a random distribution. This by itself results in additional uncertainty.

Ultimately, you can not reduce uncertainty by using different measuring devices (ten different rulers) to measure the same thing. Nor can you increase accuracy and precision. The uncertainties add whether you are using one ruler or ten.

If I pick ten of the rulers at random (of the stipulated rectangular error distribution), what are the chances of picking two in a row that both are short by 0.1″; ten in a row?

Tom,

Let’s consider picking ten rulers at random from a run of 1000 from the same machine in a single day.

At the start of that day the operator should have calibrated the machine so the first few rulers churned out will be pretty close to the correct length. As the production run continues however, the cutting die will either start churning out rulers that get shorter and shorter or longer and longer. In other words you will no longer have a random distribution, it will be significantly skewed one way or the other.

If you pick from the machine that is progressively cutting the rulers shorter and shorter, then your likelihood of picking ten rulers that are too short is pretty high. The odds of picking any that are too long are pretty low. If you pick from a machine that is progressively cutting the rulers longer and longer, then your likelihood of picking ten rulers that are too long is pretty high. The odds of picking any ruler that is too short are pretty low.
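The drifting-die scenario can be simulated with invented numbers (the drift rate and noise level are arbitrary, chosen only for illustration):

```python
import random
random.seed(1)

# Hypothetical production run: the die drifts 0.0002" shorter per ruler cut,
# on top of small random variation
run = [12.0 - 0.0002 * i + random.gauss(0, 0.005) for i in range(1000)]

# Pick ten rulers at random from that run and average their errors
ten = random.sample(run, 10)
mean_error = sum(ten) / 10 - 12.0
# Against a drifting die the ten picks are almost certainly biased short;
# averaging them cannot recover the nominal 12" length.
```

With a drifting process there is no symmetric error population for averaging to cancel, which is the point being made.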

“stipulated rectangular error distribution”

And now we are back to the usual position of trying to equate error and uncertainty. They are not the same! That is what Pat’s whole point has been all along.

Can you suggest an experiment that would help me understand?

Tom,

An experiment?

Go to ebay and buy three, inexpensive analog multimeters. You can get them for about $5.50 each.

Take one reading each on your car battery (with car not running). Estimate the uncertainty associated with each reading. Average the readings together and estimate the uncertainty associated with that average.

Then, if you can, find a lab-grade, recently calibrated voltmeter and see what it gives you for a reading.

Tom: "A selection of 10 rulers will result in a group with a normal (gaussian) distribution with a mean of zero."

You don't know that, Tom. And the manufacturer's specs do not say so. You're just making a convenient assumption — rather like Rich did.

Every ruler has an equal chance of being anywhere between -0.1″ and 0.1″. Any distribution of lengths is equally possible for 10 rulers. The distribution is not Gaussian. It’s rectangular.

The manufacturer's specs say any ruler has an equal chance of being in error by up to +/- 0.1″. If I get ten of their rulers, what are the chances they will all be in error by the same amount? I think the answer is approximately zero.

Getting all the rulers with the same error has the same probability as getting any other set of lengths. There’s no way to know.

Following on from Tim Gorman, all we know is the manufacturer specification that any given ruler is within (+/-)0.1 inch of 12 inches. And every ruler can be anywhere in that interval with equal probability.

There is no known distribution of error.

Ten independent measurements yield an uncertainty interval of (+/-)1 inch, i.e., one could have 10 rulers all off by -0.1 inch or +0.1 inch. There’s no way to know.
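For reference, the arithmetic here follows the GUM's Type B treatment of a rectangular interval: for half-width a, the standard uncertainty is a/sqrt(3). A minimal check of the numbers quoted above (whether a ±1″ rectangular interval is the right model for ten independent rulers is, of course, the point in dispute):

```python
import math

# GUM Type B: a rectangular (uniform) interval of half-width a has
# standard uncertainty u = a / sqrt(3)
def rect_standard_uncertainty(half_width):
    return half_width / math.sqrt(3)

u_one_ruler = rect_standard_uncertainty(0.1)   # one +/-0.1" ruler: ~0.058"
u_ten_rulers = rect_standard_uncertainty(1.0)  # the +/-1" interval: ~0.577"
```

This reproduces the ±0.58″ figure in the quoted passage.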

If you submit a 9-meter tape measure (i.e. a long ruler) to a cal lab for calibration, what they will do is give you a report that states the distance indicated by the tape is within the manufacturer’s specifications at several discrete points, such as 10%, 50%, and 90% of full scale. They do not generate a calibration correction table for every 1-mm hash mark, it would be horrendously expensive. The uncertainty of a measurement using the tape can only be calculated from the specs as a Type B uncertainty interval. Averaging multiple measurements of the same distance cannot reduce the uncertainty.

There is no known distribution of error? I understood that the error distribution was rectangular, and therefore there is equal probability of selecting a ruler with any error. This means that the mean error for the entire population of rulers is zero. I contend that as you increase your sample size, the mean of that sample will approach the mean of the total population, or zero, and therefore you are more likely to get an accurate reading if you use multiple rulers, if the population is as I understand it to be.

Perhaps my understanding of the problem is incorrect?

The uncertainty specification is about manufacturing deficiencies in accuracy, Tom. It’s not about the distribution of any population of manufactured rulers.

They could just as well have a run of rulers short by 0.03″ as anything else. But one never knows.

If someone wanted to get a proper handle on the problem, they’d have to do an accuracy study of methods and machines on the manufacturing floor. Is one operator or one machine more likely to produce shorter or longer rulers than another?

The point of the present exercise is that one never knows the length distribution of any given population, no matter how large. All one has is the (+/-)0.1″ uncertainty.

If one never knows, then why did you set up the problem on the basis of a rectangular distribution with the error range of +/- 0.1 in/ft? Doesn’t that have a very well defined meaning?

Rich set up the rectangular distribution, not I.

We’ve been discussing the well-defined meaning. All length errors are equally possible. You continue to treat them as though they are not. So did Rich.

I think what Pat is describing is a kludge.

Having evaluated a lot of student software, I am painfully familiar with kludges. Well written software is elegant and easy to understand. A kludge will make your brain explode. The writer of the kludge will assume that other peoples’ inability to understand his crap proves his superior intellect. The truth is that the writer of said crap, almost 100% guaranteed, understands it worse than the poor benighted person trying to grade it.

Just because someone can throw a bunch of stuff together, and it doesn’t actually crash, doesn’t mean it’s useful or valid in any way.

Kudos Pat. I suspect Hercules had an easier time cleaning the Augean Stables.

"Well written software is elegant and easy to understand."

Well written articles are elegant and easy to understand. This was not one such.

Not helpful, Nick.

To belabor my metaphor, we are presented with the image of muck flying in all directions rather than that of the shining edifice at the end of the process.

I’m sure you can point to examples where this kind of thing has been handled much better.

Do you think the article is elegant and easy to understand?

Not at all. Given what Pat is trying to do, I don’t think I could do better. I was hoping you could provide some useful guidance or an example or two.

"I was hoping you could provide some useful guidance or an example or two."

Well, take just the treatment of Rich's recurrence relation, starting

"From eqn. (1ʀ), for a time series". To the end of the 3R section is several pages. It's just blundering about with a simple first order difference equation. Rich in fact got it right, in a few lines. 3R is the correct solution of 1R with his E{} values. The error with the starting point of the summation in 2R is evidently just a typo.

It’s true that 3R as shown here has the last term wrong; it should be (1-a)^t W(0), as Rich correctly wrote it.

Rich’s 3ʀ derives from his 2ʀ, which is wrong.

From Rich’s 2ʀ:

Case i = 0 = t

W(0) = (1-a)⁰[R₁(0)+R₂(0)-rR₃(0)]+ (1-a)⁰W(0)

which reduces to W(0) = W(0) because the Rn(0) are undefined.

Case i = 1 = t

W(t1) = (1-a)¹[R₁(t1)+R₂(t1)-rR₃(t1)]+ (1-a)¹W(0), which is wrong.

Case 1 should yield W(t1) = [R₁(t1)+R₂(t1)-rR₃(t1)] + (1-a)¹W(0), which is Rich's foundational "concrete time evolution of W(t)" eqn. 1ʀ, produced de novo, ex cathedra, and apropos of nothing.

The "blundering about" included the demonstration that 2ʀ is wrong.

The blundering also noticed the violation of dimensionality, the tendentious linearity, and the inappropriate variances.

Thank you for pointing out the typo that left out the t-exponent in (1-a)^t W(0) in my rendition of 3ʀ.
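For what it's worth, the corrected closed form can be checked numerically against the recurrence W(t) = [R1(t)+R2(t)-r*R3(t)] + (1-a)*W(t-1), with arbitrary placeholder values standing in for a, r, W(0), and the R terms:

```python
import random
random.seed(0)

# Placeholder constants and forcing terms (arbitrary; for checking only)
a, r = 0.1, 0.5
t_max = 10
R1 = [random.random() for _ in range(t_max + 1)]
R2 = [random.random() for _ in range(t_max + 1)]
R3 = [random.random() for _ in range(t_max + 1)]
W0 = 2.0

# Iterate the recurrence W(t) = [R1(t)+R2(t)-r*R3(t)] + (1-a)*W(t-1)
W = W0
for t in range(1, t_max + 1):
    W = (R1[t] + R2[t] - r * R3[t]) + (1 - a) * W

# Closed form: sum_{i=1..t} (1-a)^(t-i) [R1(i)+R2(i)-r*R3(i)] + (1-a)^t W(0)
closed = sum((1 - a) ** (t_max - i) * (R1[i] + R2[i] - r * R3[i])
             for i in range(1, t_max + 1)) + (1 - a) ** t_max * W0

print(W, closed)  # the two agree to machine precision
```

The newest term carries exponent zero and the initial value carries (1-a)^t, consistent with the Case 1 analysis above.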

Mr. Stokes,

If you want to elucidate how Mr. Frank’s analysis is incorrect, you would do well to address the following:

***

Until you refute the above, Rich Booth’s attempt to criticize Frank fails and rather miserably, at that.

FYI: My highest level of math is college level Calculus and, thanks to his clear presentation and eloquent writing, I understood Mr. Frank quite well. He writes as we were taught to in Computer Science: accurately, completely, economically, and logically.

Janice

Janice, really nice to see you. 🙂

I’m impressed with your advanced STEM education. I had no idea. 🙂 Congratulations!

Hope things are well with you, and best wishes 🙂

Aw, Pat. How kind of you … . One of my degrees was in Computer Science, but, I never “practiced” in that field. Sure am glad I took those courses, though. Well, heh, most of them…

You are such a fine scientist. A scientist’s scientist. I am currently reading Walter Isaacson’s biography of Albert Einstein. You remind me of him (Einstein – heh). A fine thinker AND an all-around fine human being. With an excellent sense of humor! 🙂

Thank you for your kind wishes (and for taking the time to write them — I was kind of hoping you would… it’s funny, but, just getting a friendly “Hi!” on WUWT can really make my day. And being ignored can make me sad …). Things are not exactly “well,” but, really, not so bad. After all, I can see. I can hear. I am healthy. Thus, I am rich, indeed.

And, now that I have had the pleasure of reading what you wrote to me — I am happy, too. 🙂

Take care,

Janice

"Do you think the article is elegant and easy to understand?"

Suppose you had to convince somebody that ingesting crap was not that good for them. Trouble is, this person has been eating crap for a lifetime.

You might first start by giving a detailed description of the human body, then a detailed description of its biochemistry, elucidating all levels of minutia about why the molecular structure of crap was not good for the body. This could go on for page after page, because, after all, the person you are trying to illuminate with better insight has a visceral, reflexive attachment to eating crap.

Breaking down crap-eating into its many, many faults is itself a crappy process. To assume that it would be elegant and easy is, perhaps, an inelegant starting assumption. (^_^)

"Well written articles are elegant and easy to understand. This was not one such."

Bite yer Nickers there budrow . . .

ad hom Stokes understanding as much as it seeks it.

But is it right? I assume you think it is, since your only criticism is of the style.

Nick,

Cleaning up someone else’s mess is often a messy process in itself. Been there, done that.

Pat,

Thanks for tackling such a massive analysis. I have only a couple of comments of a general nature.

First, you say…

I am unsure what gets flogged out of whom early in one's education, but education is not very uniform. Lots of graduate physical-science educations are deficient in probability and statistics. That is why our statistics department began to offer a statistics course for new faculty hires, post-docs, etc. I think they gave up most recently because our university is shorthanded across the "sciences".

This was true of my own education. If it weren’t for a great deal of study on my own, plus one graduate course in probability, and then a willingness to teach statistics courses that no one else ever wanted to teach, I’d be pretty ignorant. I am sure most of my cohort stayed ignorant. The one course in my scientific discipline (physics/geophysics) where I was introduced to anything advanced was a course in “inverse theory” where our textbook was Philip Bevington’s book on data analysis. Yet, what was emphasized in this course were not the sections on propagation of error, but rather the algorithms for inverting data to find model parameter values. I.e. find an answer but be unable to articulate how insignificant it might be.

At any rate, the inability to do statistics, and misunderstandings about what propagation of error is, are understandable. You ought to look at what ABET recommends that engineers know about probability, statistics, or measurement uncertainty to see what a "low bar" means.

Second point — I see the reference to GUM and to JCGM which I was unfamiliar with, and could not find in the context of this work or in others referred to here. Perhaps others are puzzled about these acronyms too, so I will just mention that JCGM refers to the “Joint Committee for Guides in Metrology”. I have looked now at JCGM 100:2008 and see that it is very similar to the NIST Statistics Handbook. There is a PDF version of 100:2008 found here.

Kevin,

“I am unsure what gets flogged out of whom early in one’s education, but education is not very uniform. Lots of graduate physical science educations are deficient in probability and statistics.”

Respectfully, this really isn’t an issue of probability and statistics. And engineering students *do* get false precision beat out of them pretty early.

You can take ten voltage measurements using a voltmeter that reads out to two decimal places, average the measurements together, and come up with an answer out to 16 decimal places on your calculator. At least when I was in the engineering lab the instructor would give you an F for such an answer.

I know they still teach the rules for significant digits in high school chemistry and physics. You are expected to know those rules when you get to college.

But it seems pretty apparent that the climate scientists and computer programmers doing the GCMs either never learned the rules or are ignoring them out of convenience. Probability and statistics simply can’t create more precision than your inputs can provide. Yet the GCMs do.

Tim,

Respectfully, you have missed what I had to say, utterly.

1. I have had some 5,000-plus students in my half-career as a college professor. I do my best to inform them about significant digits, but they persist, many of them through all four years. So I am not sure who gets what beat out of whom. No one, to my knowledge, flunks out of engineering school for reporting excessive precision. In fact, nowadays, you will lose a grade appeal for trying.

2. I didn’t say this is a problem with statistics and probability per se, although it is hard to argue it is unrelated. Pat stated that there are too many statisticians in climate science already. My point was that there is a problem of inadequate preparation throughout the science disciplines. Our statistics faculty recognized it among our scientist hires. I was never required to take a single course in probability and statistics through three scientific degrees. I did so on my own, but few others did. The problem of what one does not know runs in all directions, and becomes critical in interdisciplinary studies, like climate science. I was reacting to something that Pat said with which someone like Nic Lewis, for example, might beg to differ.

3. I have no idea what is apparent with climate scientists. Perhaps they are not well versed in a broad range of topics, and tend to do what amateurs do. What I have noticed among the few climate science types I have met, and what is just about universally true among people with science degrees who believe deeply in climate change, is this: they have mixed their work with a belief system and with their politics.

Thanks, Kevin. In my earliest undergrad lab courses, we were introduced to significant digits and the limits of resolution. Notions of measurement error were introduced.

At year 3, my Analytical Chemistry lab came down hard on experimental error and its propagation, as you would expect. Analytical chemists are fanatics about accounting for error.

Then the major went on to “Instrumental Methods of Analysis,” a strong physical methods lab course that required full treatment of error. Many years later, they still teach it as Chem 422.

I’d be surprised if engineering doesn’t teach treatment of error and its basic statistics in the context of the undergrad courses, even if not in a formal statistics class. Can that be true?

Thanks for posting the link to the JCGM. I should have done that.

If you’re not familiar with it, the NIST published B.N. Taylor and C.E. Kuyatt (1994) Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, which is a good treatment as well.

Though there, too, the attention to systematic error is sparse.

Pat,

Thanks. I will have a look at the Taylor, Kuyatt publication. I learn something new with each treatment I examine.

I have the distinct impression that chemists do the best job with significant digits and propagation of error at present, although one of my uncles was a land surveyor and wore me out as his young assistant with his treatment of closure error. Thus, I think you are really over-estimating what other people are likely to know, even if well educated.

Engineering curricula, in my experience, vary quite a bit depending on discipline and instructor. Some may not ever see the topic, others may get a formulaic approach. For example, for a time I taught a laboratory fluids course. By the time students reached it, some had been introduced to a sort of propagation of error through the “measurement equation.” What they did was to evaluate a measurement equation at a number of extreme values of its parameters to arrive at an uncertainty envelope. The idea of a coverage factor was unknown to them. Others had no idea what I was speaking about.

The reference guide for the FE (fundamentals of engineering) exam just states an equation to handle measurement errors, as what they call the Kline-McClintock eq., without elaboration. I doubt most engineering programs teach anything especially rigorous.

Kevin,

I had uncertainty beaten into me in my electrical engineering courses, especially those using analog computers. You could only read the voltmeters and ammeters to a certain resolution. If you ran a simulation and got an answer, then cleared it all, and then reran the simulation you could get different answers – unless you took into consideration significant digits and uncertainty intervals. It’s why two different people could run the same simulation and get two different answers – close but not the same.

With the advent of digital computers, the concept of significant digits and uncertainty seems to have gotten lost. Perhaps analog computing needs to be re-introduced into engineering labs!

Btw, it wasn’t just in analog computing that this all became apparent. Two different people measuring stall loads on a motor could come up with different answers because of the resolution of the voltmeters, ammeters, and load inducers, and the inherent uncertainty of setting them. Same for using Lecher wires to measure the wavelength of microwave signals (nowadays they use frequency counters but ignore their uncertainties because they are digital!).

This just happens to also be equation 10 from the JCGM GUM (Guide to the Expression of Uncertainty in Measurement)…
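For anyone unfamiliar with it, a minimal sketch of that root-sum-square form — the Kline-McClintock equation, which is GUM eq. (10) for uncorrelated inputs: u_c(y)² = Σᵢ (∂f/∂xᵢ)² u(xᵢ)². The power-measurement numbers below are illustrative, not from any cited experiment.

```python
import math

# Root-sum-square combination of uncorrelated input uncertainties
# (Kline-McClintock / GUM eq. 10): u_c(y)^2 = sum_i (df/dx_i)^2 * u(x_i)^2

def combined_uncertainty(sensitivities, u_inputs):
    """Combined standard uncertainty from sensitivity coefficients
    (partial derivatives) and the input standard uncertainties."""
    return math.sqrt(sum((c * u) ** 2 for c, u in zip(sensitivities, u_inputs)))

# Example: power P = V * I, so dP/dV = I and dP/dI = V.
V, I = 10.0, 2.0          # volts, amps (illustrative values)
u_V, u_I = 0.05, 0.01     # standard uncertainties (illustrative)

u_P = combined_uncertainty([I, V], [u_V, u_I])
print(f"P = {V * I:.1f} W, u(P) = {u_P:.3f} W")
```

Note this form assumes the input uncertainties are uncorrelated; the GUM adds covariance terms when they are not.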

Excellent.

Pat Frank, thanks for the long post rebutting the Booth critique of your work. I have a formal PhD level background in probability theory and statistics, all in the service of econometrics. I thought you were right originally, and said so with a brief non math explanation. I thought Booth was off point (and in some ways wrong by perhaps unwittingly misdefining your carefully delineated actual issues), and said so, but without explanation. You have provided an eloquent rigorous explanation. Kudos.

To state the core issues differently and logically, not mathematically (as posted here several times before), there is an inescapable basic model problem. The CFL constraint means grids are 6-7 orders of magnitude bigger than needed to properly solve phenomena like convection cells using CFD. That forces parameterization of key processes. That requires parameter tuning to best hindcast. That drags in the unknowable (yet) attribution problem. So models provably run hot in aggregate; the nonexistent tropical troposphere hotspot that models produce is sufficient evidence. All the climate model kluging cannot erase that.

And you add a second inescapable problem, that the real physical uncertainty around this wrong result compounds greatly—something no amount of pseudostatistical math futzing with model results can ever fix.

“The CFL constraint means grids are 6-7 orders of magnitude bigger than needed to properly solve phenomena like convection cells using CFD. That forces parameterization of key processes. That requires parameter tuning to best hindcast.”

Using CFD, as engineers do, grids are always too large to resolve turbulent eddies, including those generated by convection cells. That does not require parameter tuning to best hindcast. CFD practitioners don’t do that (they can’t). Nor do GCMs.

What it does require is an analysis of turbulent kinetic energy and its transport and dissipation.

In other words Nick, “settled science” 😉

Heh. Good one, Derg. 🙂

Dr. Chris Essex presents the actual state of the physics concisely and clearly here:

http://wattsupwiththat.com/2015/02/20/believing-in-six-impossible-things-before-breakfast-and-climate-models/

Dr. Christopher Essex, Chairman, Permanent Monitoring Panel on Climate, World Federation of Scientists, and Professor and Associate Chair, Department of Applied Mathematics, University of Western Ontario (Canada) in London, 12 February 2015

{video here on youtube: https://www.youtube.com/watch?v=19q1i-wAUpY} …

{25:17} 1. Solving the closure problem. {i.e., the “basic physics” equations have not even been SOLVED yet, e.g., the flow of fluids equation “Navier-Stokes Equations” — we still can’t even figure out what the flow of water in a PIPE would be if there were any turbulence}

Nick, why not post something on ‘engineering’ CFD, as you recommended to explain the validity of climate models. Come on, post what you falsely said exists. But before doing so, see my long-ago technical post here on that subject.

Nick, so per your comment, why do all climate modelers parameter tune. You deny that they do?

“why do all climate modelers parameter tune. You deny that they do?”

I said they don’t parameter tune to best hindcast. And they don’t. One recent paper that describes tuning of a specific model was Mauritsen (2012). He says specifically:

“The MPI‐ESM was not tuned to better fit the 20th century. In fact, we only had the capability to run the full 20th Century simulation according to the CMIP5‐protocol after the point in time when the model was frozen.”

They did use SST for a couple of decades post-1880 to tune something.

You have never shown, with evidence, that any specific GCM tuned to best hindcast. In your earlier writings, you seemed to rely on a paper by Taylor et al on CMIP5 experiment design. But that is just misunderstanding; the paper said nothing at all about tuning. The “design” just specified what the GCMs should actually investigate, so that results could be built up and compared.

The MPI-ESM didn’t need much further tuning because it is based upon the ECHAM5 model, which was extensively tuned (pdf).

In any case, Rud’s point is that climate models are tuned. The abstract of Mauritsen et al, (2012) fully supports Rud, by admitting that climate models are invariably tuned to produce target climates.

You were silent on that inconvenient contradiction. Your demand for a specific model as an example merely deflects from Rud’s point.

Here’s the abstract:

“During a development stage (my bold) global climate models have their properties adjusted or tuned in various ways to best match the known state of the Earth’s climate system. These desired properties are observables, such as the radiation balance at the top of the atmosphere, the global mean temperature, sea ice, clouds and wind fields. The tuning is typically performed by adjusting uncertain, or even non‐observable, parameters related to processes not explicitly represented at the model grid resolution. The practice of climate model tuning has seen an increasing level of attention because key model properties, such as climate sensitivity, have been shown to depend on frequently used tuning parameters. Here we provide insights into how climate model tuning is practically done in the case of closing the radiation balance and adjusting the global mean temperature for the Max Planck Institute Earth System Model (MPI‐ESM). We demonstrate that considerable ambiguity exists in the choice of parameters, and present and compare three alternatively tuned, yet plausible configurations of the climate model. The impacts of parameter tuning on climate sensitivity was less than anticipated.”

Nick,

A peer reviewed manuscript will soon appear that mathematically proves that climate (and weather) models are based on the wrong dynamical system of equations. And so one must ask how they can be said to be providing any result close to reality in a hindcast. The answer has been provided at Climate Audit using a simple example. If one is allowed to choose the forcing, then one can obtain any solution one wants for any time-dependent system of equations, even if that system is not the correct dynamical system. That is exactly what the climate modelers have done.

Jerry

Rud, as you discussed back in 2015, engineers run physical experiments so as to derive the parameters needed to computationally reproduce the measured behavior over the entire range of operational limits.

The parameters are incorporated into their engineering model. This calibrates the model to produce accurate simulations of system behavior within their specification region.

They interpolate to predict the behavior of their system between the experimental points.

I’d expect good engineers to run further experiments at selected interpolation points as well, to verify that their engineering model correctly predicted behavior in the calibration region.

But engineering models are not reliable outside their calibration region.

Leo Smith had a great post on this topic, on-the-futility-of-climate-models-simplistic-nonsense/ back in 2015, which engendered lots of intense commentary by bona-fide engineers. Pretty much all of them expressed considerable critical disapproval of climate models.

Frank: “But engineering models are not reliable outside their calibration region.” Which is why, as a modeler, I have said in several comment areas that, in general, one can trust well-anchored numerical models to interpolate between known data points, but extrapolate beyond them only with extreme caution…

Rud, there is recent support for your point, “That requires parameter tuning to best hindcast.” Here is an instance of modelers explicitly tuning to obtain and report improved hindcast results. In my view, it doesn’t get any clearer than this. This is a link to Zhao et al, 2018a, in which simulation characteristics of GFDL’s AM4.0/LM4.0 components of CM4.0 are reported. At this link, one may search “tuning” and see Figure 5 and nearby.

https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017MS001208

“We emphasize especially that the reduction of global mean bias in OLR from AM2.1 (-2.5 W/m^2) and AM3 (-4.1 W/m^2) to AM4.0 (-0.6 W/m^2) is due to the explicit tuning of AM4.0 OLR toward the value (239.6 W m−2) in CERES‐EBAF‐ed2.8.” [dd re-formatted within the parentheses to read properly here]

This was not even worth the scroll down to see the comments.

Pat Frank

Does the following get to the gist of what you are saying?

AR5 presents a graph of multiple model runs, with a mean and bands at the 95% boundary. They use the term “95 percent confidence interval” for the model runs within the bands. In ordinary study design the term “95 percent confidence interval” is used when sampling a population, such that the derived mean and standard deviation produces a confidence interval (i.e. probability density function) that indicates the statistical likelihood that the sample mean distribution matches the true population value.

Climate modelers erroneously use the term “95 percent confidence interval” to indicate the probability that the model mean matches future temperature, i.e. there is a 95% chance that future temperature will be within the bands. In reality, the only thing they have shown is that there is a 95% probability that the next model run will be within the bands.
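A toy simulation makes the distinction concrete (invented numbers, not a climate model): an ensemble drawn from a biased generator has a tight, well-defined spread, yet its mean can sit far from the true value. The spread characterizes future runs, not future reality.

```python
import random
import statistics

# Toy ensemble: 100 "runs" from a generator biased +0.5 from the truth.
# The +/- 2 sd band reliably brackets the NEXT RUN, but says nothing
# about whether the ensemble mean is accurate. Values are illustrative.
random.seed(42)

true_value = 1.0                                     # truth (unknown in practice)
runs = [1.5 + random.gauss(0.0, 0.1) for _ in range(100)]

ens_mean = statistics.mean(runs)
ens_sd = statistics.stdev(runs)

lo, hi = ens_mean - 2 * ens_sd, ens_mean + 2 * ens_sd  # ~95% of future runs
print(f"ensemble: {ens_mean:.2f} +/- {2 * ens_sd:.2f}")
print("truth inside the band:", lo <= true_value <= hi)
```

The band is a statement about the repeatability of the ensemble; only an independent comparison with measurement can say anything about its accuracy.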

MPassey

I agree completely with your interpretation in your last paragraph.

While the 95% probability envelope characterizes the variance (and hence the precision, which is low) of the ensemble runs, it begs the question of the accuracy of the mean. That is, we can say something about the repeatability of the multiple runs, and the probability of future runs; however, without comparison of the post-calibration mean to actual measurements, we don’t have a quantitative measure of accuracy. If the variance envelope is large enough, it will encompass reality. Unfortunately, that isn’t very useful for predicting what we can expect in the future unless we can say, with confidence, what the standard deviation is of the run(s) that do correspond to reality! It is necessary to calculate the bias and slope of the mean and correct the results accordingly. Climatologists routinely adjust historical temperature data, why not future data? It isn’t necessary to wait 30 years (although we do have Hansen’s predictions!). A decade should give us a fair idea of whether the bias and slope of the mean are even reasonable. Indeed, a decade out can be expected to be more accurate than three decades.

Clyde,

“That is, we can say something about the repeatability of the multiple runs, and probability of future runs; however, without comparison of the post-calibration mean to actual measurements, we don’t have a quantitative measure of accuracy. If the variance envelope is large enough, it will encompass reality.”

Sorry, I missed this earlier. It’s pretty astute.

Tim

Thanks! I was once astutent! 🙂

Assuming the models are in fact merely linear extrapolations of CO2 content, the variation graphs versus time published in the IPCC reports are a reflection of the operators’ opinions about what future CO2 concentrations will be. And as such, they are most certainly not random samples of identical quantities. In addition, what they call a 95% CI is merely the standard deviation of the average of all the models at a certain point of time in the future, multiplied by 2. This assumes there is a normal distribution about a mean of the different lines. It is meaningless.

MPassey,

+1

You are correct!

MPassey, you’ve pretty much got it right.

“AR5 presents a graph of multiple model runs, with a mean and bands at the 95% boundary. They use the term “95 percent confidence interval” for the model runs within the bands.”

As so often, no quote, no link, and it just isn’t true. Doesn’t seem to bother anyone. The reference is presumably to Fig TS14, also 11.25. They show a spaghetti plot and refer to a 5 to 95% range, which is just by count of model runs. They don’t use the term “95 percent confidence interval”.

Stokes

“The reference is presumably to Fig TS14, also 11.25.” How about a link or citation, which you complained that others rarely use?

Now, if 90% of the multi-thousand runs fall into the displayed range, and you deny that they represent uncertainty of the mean, just what do they represent? Why are they displayed? This is more of your typical sophistry. If you take thousands of sample measurements, and create a probability distribution graph, the +/- standard deviations are commonly accepted as the uncertainty of the central measure. What is different other than the mode of display?

You keep hand waving like that, Nick, and you are going to injure yourself!

Nick Stokes

Figure SPM6 shows graphs of temperature anomaly vs models with the explanation: “Model results shown are Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model ensemble ranges, with shaded bands indicating the 5 to 95% confidence intervals.”

Maybe it’s not a big deal because it’s in the Summary for Policymakers?? But here is just one example of why this misuse of the phrase “confidence interval” is important. Steven Novella runs the Neurologica Blog, where he criticizes all kinds of anti-science: vaccines, GMOs, global warming, etc. He calls himself a science communicator. When it comes to climate change, he constantly invokes the IPCC consensus against all criticisms. So, for example, in a 2/2/2018 post on carbon capture he writes:

“Climate scientists have gone beyond just establishing that AGW is happening. They are trying to quantify it and project the trend lines into the future. This type of effort is always fraught with uncertainty, with the error bars increasing with greater time into the future. However, we can take a 95% confidence interval and make reasonable extrapolations of what is likely to happen.”

So, here is a smart guy, with enough knowledge of statistics to know the term “confidence interval”, who believes that the IPCC can quantify temperature trends into the future inside a 95% confidence interval band.

“Rich’s emulator equation (1ʀ) is therefore completely arbitrary. It’s merely a formal construct that he likes, but is lacking any topical relevance or analytical focus.”

We’d call that *ex posterior*.

“Let’s see if that is correct. …

Compare eqn. (1) to eqn. (2ʀ). They are not identical.”

They are identical, except for the start point of the summation. Rich has simply given the standard convolution solution to a first order linear recurrence relation. It is well understood than in such convolutions you can equally exchange the adding index.

Sum (i=0 to t) a(t-i)b(i)

is the same as

Sum (i=0 to t) a(i)b(t-i)

I think Rich should have written the summation starting from i=1. That is a minor point.
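The index exchange is easy to check numerically; a small sketch with arbitrary coefficients (the values below are invented for illustration):

```python
# Numerical check of the convolution index exchange:
# sum_{i=0..t} a[t-i]*b[i]  ==  sum_{i=0..t} a[i]*b[t-i]
# The two sums contain the same products, just enumerated in reverse order.
a = [2.0, -1.0, 0.5, 3.0, 1.5]
b = [1.0, 4.0, -2.0, 0.25, 2.0]
t = 4

lhs = sum(a[t - i] * b[i] for i in range(t + 1))
rhs = sum(a[i] * b[t - i] for i in range(t + 1))

print(lhs, rhs)  # identical sums
```

The exchange says nothing, of course, about whether the start point of the summation is correct, which is the point actually in dispute below.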

Nick,

“They are identical, except for the start point of the summation.”

Huh?

W(t₁) = (1-a)W(0) + R₁(t₁) + R₂(t₁) - rR₃(t₁)

W(t₁) = (1-a)W(0) + (1-a)[R₁(0) + R₂(0) - rR₃(0)]

are *not* identical. It’s not just an issue of starting point.

You’re right, Tim.

Rich’s t = i = 1 includes (1-a)*(Rn), which is not correct.

It should match his eqn. 1ʀ, but does not.

Sorry, Pat, you’ve failed at the start.

You called the fudged linear progressions the warmists display “models” when they do not qualify as such; they are statistical constructs based on no functioning theory, just an assumption that temperature increases follow CO2 concentration increases. We know the relationship between atmospheric CO2 and temperature is non-linear, which makes their charts garbage to begin with.

Basically they’re little more than charts of margin of error.

I’ve now read through Pat’s post twice and it is quite complicated in some areas, but it seems to be mostly necessary to respond fully to Richard Booth’s critique. I do find Pat’s defense to be sound. The underlying issue is clearly the difference between error and uncertainty. Error is, of course, a concept that statisticians should understand. However, Uncertainty of Measurement is a subject studied in metrology and engineering. Not many statisticians spend time designing or using measurement instruments.

I took a lot of math courses including probability and statistics. I did not encounter Measurement Uncertainty until I started working in laboratories. The fact is no one worried much about MU outside of national standards bureaus (e.g. NIST), physicists, electrical engineers and some Quality Control specialists. The global adoption of ISO 17025 as the basis for accrediting laboratories and calibration agencies in the 1990’s included a requirement to determine and clearly state Measurement Uncertainty for all certified measurements and calibrations. It took more than a decade for this to become common practice and it is still frequently misapplied or ignored.

That’s funny. My freshman Physics class had a lab that basically spent the whole semester learning about Uncertainty of Measurement. For example, using a 1 ft ruler, measure a 10 ft bench. The result was obviously 10 ft +/- the estimated error in the ruler. It was a requirement to report, say, 10 measures of the bench and estimate how much of the error came from the ruler and how much came from how it was used; the errors always had to be stacked, adding together all the calculated error for each time the bench was measured. That was the total possible error.

It was a good demonstration of how hard reliable measurements can sometimes be.

On Thermometers: I did a Station Survey for the Surface Stations Project. I did Santo Domingo, Dominican Republic. I spoke to the emeritus national meteorologist — he was no longer paid, but still turned up for work each day at the office of the national meteorological station. He took it upon himself to show my wife and me the currently in-service Stevenson screen with its glass hi-lo thermometers. (They had had an electronic weather station for a few years, but a hurricane had blown it away.)

He showed me the system — each day at a certain time (10 am I think) one of the junior weathermen would come out to the screen, bringing the log book, open the screen, and read the thermometers — record the values, and push the reset button. The log book showed that all readings were done to 0.5 degrees. All looked straight-forward, until I noticed the concrete block on the ground next to the screen.

I asked about the block. “Oh,” he said, “that’s for the short guys…” Me: “Huh?” Him: “Yes, the short ones have to stand on the block to be at eye level to read the thermometer… or they get a false reading. Of course, none of them will… it is embarrassing… so many readings are off a degree or so.”

True story….

Kip,

And yet we are to believe that the GCMs can take inputs of +/- 1 deg and come out with a precision of 0.01 deg or better for a global average!

“we are to believe that the GCMs can take inputs of +/- 1 deg”

GCMs do not take surface temperatures as inputs at all. So these stories are irrelevant.

So their starting conditions don’t bother to start with starting conditions? They generate the starting surface temperatures directly from physical processes somehow?

I think not.

Nick,

Did you see the word “surface” in my reply anywhere?

If the inputs to GCMs are not temperature related, then exactly what are they? And what is their resolution in significant digits, and what are their uncertainty intervals?

Very reminiscent of Irving Langmuir’s discussion with Joseph Rhine about his experiments with ESP. There was a file in the office holding the results of people who didn’t like him and reported results too low….

Let it not be me who calls them block heads.

Pat Frank,

Your previous posts on this topic, and Richard Booth’s post on February 7th, attracted extensive comments referring to CFD (Computational Fluid Dynamics). The point of those comments, generally, seemed to be to describe GCM’s as a special case of CFD. Perhaps there will be similar comments here at this new post.

The purpose of my comment here is to make reference to this web site linked below. “Tutorial on CFD Verification and Validation”.

https://www.grc.nasa.gov/www/wind/valid/tutorial/tutorial.html

In this section of the tutorial, “Uncertainty and Error in CFD Simulations”, I find support for the distinction you consistently emphasize between uncertainty and error.

https://www.grc.nasa.gov/www/wind/valid/tutorial/errors.html

One more reference from this website, “Validation Assessment.”

https://www.grc.nasa.gov/www/wind/valid/tutorial/valassess.html

In this section, “Applying the code to flows beyond the region of validity is termed prediction.” This is the case for the use of GCM’s for projecting future air temperatures in response to greenhouse gas forcing, as the tuning is based on hindcasts (with historical estimated forcings) and on longer term “preindustrial control” simulations (with no change in greenhouse gas or other anthropogenic forcings.) Anything beyond those cases is prediction.

So as I see it, references to the accepted use of CFD simply reinforce the relevant questions about uncertainty and error. If a CFD simulation is proposed for prediction, a method for conditioning of the model output for calibration error would have to be applied, to estimate its reliability.

I’m replying to my own comment here to clarify that I don’t regard any GCM as having been “validated,” or in any sense confirmed valid for diagnosing or projecting the impact of greenhouse gas emissions, by hindcasting.

Perhaps we need a new professional field, Climate Engineer, to control these Climatologists.

with testing and registration!

Pat,

A masterful post. While many would disagree I found it succinct and to the point. You just covered a lot of territory!

I must say that I am simply continually amazed at how few people understand the rules of significant digits – rules that were developed to handle uncertainty in measurements, not error but uncertainty. You can’t gain precision by averaging independent measurements of different things nor can you reduce uncertainty in the averaged result. It truly is that simple. All the rest is just providing a way to handle the uncertainty in a standard manner.
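One way to see the role of systematic error here (illustrative numbers only, a sketch, not anyone's actual data): random scatter in independent measurements of different things largely cancels in an average, but a shared systematic offset survives averaging untouched, no matter how large N gets.

```python
import random
import statistics

# 100 different quantities, each measured once with random noise (+/-0.2)
# plus a shared systematic offset (+0.5). Averaging beats down the noise
# but leaves the systematic component intact. Values are illustrative.
random.seed(7)

true_vals = [20.0 + i for i in range(100)]      # 100 different things
bias = 0.5                                      # shared systematic error
measured = [v + bias + random.gauss(0.0, 0.2) for v in true_vals]

errors = [m - v for m, v in zip(measured, true_vals)]
mean_err = statistics.mean(errors)

# The random component shrinks roughly as 1/sqrt(N); the 0.5 offset does not.
print(f"mean error after averaging 100 measurements: {mean_err:.2f}")
```

This is exactly why the thread keeps returning to the thin treatment of systematic error in most statistics training: the 1/√N intuition only applies to the random part.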

Tim

The one professional field that had an early grasp of the importance of accuracy and precision is land surveying. Whether, with differential GPS, laser range finders, and electronic theodolites, they still teach the basics, I don’t know. However, I doubt that the training is as rigorous in fundamentals as it once was.

Speaking of rules of significant digits, I am reminded of using a slide rule, back in the day, to multiply numbers just by adding the logarithmic distances slide-wise. By some stretch of the imagination, you could *almost* get three digits of precision on those multiplications, but you could *never* get four digits.

Now, one thing that occurs to me is that calculating a circumference means multiplying the diameter of a circle by the value ‘pi’, right? So suppose someone hands me a precisely manufactured lens, or some other circular object, and it is understood to be precisely toleranced enough that a good four digits of precision should easily apply to the circumference over diameter ratio. I am also told that the diameter is precisely 2.25 centimeters. At that point, I have my slide rule handy and I multiply 2.25 by the ‘pi’ mark at 3.14, as close as I can read the values, anyway. Obviously, since ‘pi’ on my preferred calculating device has only three significant digits, I can by no means conclude a 7.069 centimeter circumference! Since ‘pi’ obviously has only 3 digits (my slide rule says so), it doesn’t matter how precise the 2.25 value is, I can only conclude that the circumference comes out to about 7.06 cm (a three significant digit result), although notice that my best ‘eyeball’ read of the slide rule might just as easily say 7.07, I suppose?

I confess, I’m probably being a bit of a ‘troll’ here, but, well, I’m being sincere too, in the sense that it’s just not easy to decide how to properly apply measurement limits in every situation?
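The slide-rule arithmetic above is easy to reproduce (a sketch, using the same 2.25 cm diameter from the comment): with pi read as 3.14, the product differs from the full-precision value in the third decimal place, so a four-digit circumference is unsupportable.

```python
import math

# Slide-rule pi (3 digits) vs full-precision pi for a 2.25 cm diameter.
# The difference shows up in the third decimal place, so quoting
# 7.069 cm from the slide-rule product would claim precision the
# 3-digit pi cannot supply.
d = 2.25
c_slide = 3.14 * d        # slide-rule reading of pi
c_exact = math.pi * d     # full-precision pi

print(f"slide rule: {c_slide:.4f} cm, exact: {c_exact:.4f} cm")
print(f"difference: {c_exact - c_slide:.4f} cm")
```

Which is the sincere part of the question: the least-precise factor, not the best-toleranced one, sets the precision of the product.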

David, you have a pretty good handle on this. Add this in: my slide rule has markings of a certain width. When you are trying to estimate where the movable slide actually is do you use the left side of the mark? the right side of the mark? Or do you use the middle of the mark? If you use the middle then how closely can you estimate where the middle of the mark is?

All of this adds even more uncertainty to the result.

Now consider a liquid-in-glass thermometer. All liquids have some kind of meniscus, some have a concave and some a convex. Some have a very distinguishable meniscus and some have a small meniscus. Without looking it up do you know how to read a thermometer where a meniscus exists? Think of the meniscus as the width of the mark on a slide rule. Just how “precise” can the reading of such a thermometer be? How much uncertainty is in-built?

Your use of the significant digits rule is perfect, and it applies directly to the temperature record. If the baseline temperature record has an uncertainty interval of +/- 1 deg, then how can anyone determine differences of 0.01 deg when comparing to the baseline?

In my example of reading the constant ‘pi’ as marked on a slide rule, I’d now want to mention that I’d certainly rather be using a digital calculator, or digital computer, with lots of precision on the value of pi as such! After all, in math terms, pi isn’t supposed to be a measurement, right? Rather, it is supposed to be a precise number, as exact in principle as the integer ‘2’, say, despite the fact that pi is an irrational ‘real’ (with its supposedly precise nature being not so intuitive)?

One problem with this, is that if I and a lot of other people are persuaded about the precise reality of pure math (and the special results that math assumptions tend to generate), then at what point does the math model take over, we just live in a nice model world then? Granted that the mathematicians, the ‘Einstein’s’ of this world, can get a long, long, way on mathematical models, maybe there is still more uncertainty in the end, that even the best models can’t keep track of?

When it comes to models, all models that have predictions as their outcome are based only on probabilities and are only as good as their inputs. I find it uncanny how often models that say there is a 95% chance of a certain future event are wrong. Climate models are even more chaotic: a suggestion that it's going to be warmer or cooler in the future could be right by tossing a coin. Even a coin toss can't be more than 50% likely, even if the previous 20 throws were heads. I get flabbergasted that governments are prepared to spend trillions of our money on coin tosses. A climate-change forecast which purports to use historic information to predict future outcomes should at least have certainty in relation to that history. Yet the major bureaus and scientific institutions have no qualms about adjusting away inconvenient data which doesn't suit the alarmist narrative. Michael Mann got rid of the Medieval Warm Period; the Australian BOM got rid of the Federation Drought, adjusted out the extreme heat wave conditions of the 1930s, and recently, to top it all, adjusted away Marble Bar's Guinness Book of Records 160 consecutive days of 100 degrees F in 1923, the longest heat wave in the world. Climate models are a sad joke. If a model can't predict the past, how can anyone expect it to predict the future?

Accuracy and precision of what? We do not take actual near-ground measurements of all grid cells, so it's impossible to determine the accuracy and/or precision of a near-ground temperature prediction. Averaging all of the cells (or just sets of cells) and comparing that answer to a partial measurement treated the same way is even worse. The models cannot accurately predict something that we have never successfully measured and understood; it's just a bunch of computer output that is based on people's biases about how climate evolves.

So trying to pick apart statistical uncertainty versus physical error, while interesting and pertinent to scientifically based research, doesn't apply to the garbage output by climate models. One can only determine that the output demonstrates some amount of error after taking real measurements, but you still have no idea whether that applies at all to the future predictions (i.e. guesses).

I don't even understand why people seem to assume that climate is at all predictable over a long period of time. Imagine "climate" as being the result of interactions between many "almost" (but not completely) independent physical processes, each of which could be complex. In fact, imagine these processes as orbital bodies revolving around their center of mass. If there are only two processes then their interactions (through gravity in this case, but it could be temperature if we are talking climate) are predictable. But if there are three bodies, all we can hope for is an estimate that worsens over time. Now imagine that "climate" is made up of 6 or 8 or even 10 bodies of various masses and orbital speeds. We try to predict the path of a single body (we call it temperature) without ever even knowing how many bodies are interacting. Good luck with this.

We have to understand what drives climate before we can hope to model it. Calling it “settled science” is just giving up. And we may discover it is IMPOSSIBLE to build a model that predicts (within a certain accuracy) more than a few tens of years out – in fact I think this is likely.
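The sensitivity point above can be illustrated without an n-body integrator. Here is a minimal sketch using the chaotic logistic map, purely as a stand-in for any nonlinear interaction: a one-part-in-a-billion initial difference destroys predictability within a few dozen steps.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the chaotic logistic map x -> r*x*(1-x), returning the orbit."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.3)
b = logistic_orbit(0.3 + 1e-9)  # initially indistinguishable from a
max_gap = max(abs(x - y) for x, y in zip(a, b))
print(max_gap > 0.1)  # True: the orbits completely diverge
```

The map is of course not a climate model; it only demonstrates the generic property of chaotic systems that prediction horizons are finite no matter how precisely the equations are known.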

Robert,

Ahhh! The three-body (or n-body) problem! Very astute observation! My guess is that the computer programmers who developed the GCMs likely know nothing about it. But it would seem to be applicable here.

As far as I understand it, in the ‘old’ days all of this was called ‘the tolerance’ for the thermometers, and effectively it was a function of their manufacturability for the materials used. That is to say the aggregation of competing variations and their impact on accuracy and precision gave a maximum figure by which the measurements could be wrong. Individual thermometers might actually be much better than the stated tolerance at some readings but they should never be worse.

I think that climate modelers approach the Earth as scientists might approach the study of the atom: "it can only be described by statistics," which ignores all the causal things that influence the earth. I think that climate modelers like to think of themselves as akin to quantum physicists because it gives them greater status than they would otherwise have; it also takes away our ability to reject the models with observation.

The answer is 42, but what is the question?

Excellent article that contains a simple aphorism: "climate modeling is an exercise in statistical speculation."

“reality is a statistical distribution of a random variable,” B*ll*cks

Reality is reality, which also includes such things as a tree making a noise in a forest if no one is there to hear it, and a room continuing to exist if everyone goes out of said room.

Nikola Tesla: “Today’s scientists have substituted mathematics for experiments, and they wander off through equation after equation, and eventually build a structure which has no relation to reality.”

His comment was valid a century ago, and apparently more so now!

While I believe Pat Frank is correct in that models are unreliable for predictions, the root or core is not statistics or error propagation; it is that they are unphysical. (That the climate modelers get error wrong is moot if the model does not reflect reality in the first place.) (Or maybe I misinterpreted Pat's thrust: is he saying that if they get error propagation so wrong, then in fact this means their root or core model is incorrect?)

The late Freeman Dyson said it in much fewer than 9000 words: “I just think they don’t understand the climate,” he said of climatologists. “Their computer models are full of fudge factors.”

https://wattsupwiththat.com/2013/04/05/freeman-dyson-speaks-out-about-climate-science-and-fudge/

D. Boss,

You said, “…or maybe I misinterpreted Pat’s thrust – is he saying if they get error propagation so wrong, then in fact this means their root or core model is incorrect?”

If I’m understanding what Dr. Frank is saying, then it would be more correct to say that since they get (calibration) error propagation so wrong, then in fact it is impossible to determine if their root model is correct or incorrect, which renders them useless for the purposes of gaining insight into future climate states. You can believe that the projections are correct or incorrect, but either way it’s just speculation on your part.

Paul Penrose:

Ok, that makes more sense. Pat Frank is saying the error propagation is so wrong, it's impossible to determine whether their root models are correct or incorrect, thus rendering them useless as predictors.

I still think there are easier ways to disprove CO2 as control knob though…However I suppose since the goal of the climate cult is to usher in a New Marxist world order using fake science as the cudgel, every refutation of their nonsense is helpful.

Paul and D. Boss, you’ve got it right. The model cloud error is so large, they cannot resolve the impact of CO2 on the climate, if any.

The calibration error propagation just shows that the projection uncertainty is immediately so great that nothing is presently knowable about future air temperature, CO2 increases or no.
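The "immediately so great" growth described above follows from root-sum-square accumulation of a constant per-step calibration uncertainty. A sketch with an arbitrary illustrative per-step value u = 1 (not a figure from the paper):

```python
import math

def propagated_uncertainty(u_step, n_steps):
    """Root-sum-square growth of a constant per-step calibration uncertainty:
    u(n) = sqrt(n) * u_step."""
    return math.sqrt(n_steps) * u_step

# Illustrative: one unit of uncertainty per step, growing without bound
for n in (1, 25, 100):
    print(n, propagated_uncertainty(1.0, n))  # 1.0, 5.0, 10.0
```

Whatever the per-step value, the envelope grows as the square root of the number of steps and never shrinks, which is why the projection uncertainty swamps the quantity being projected.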

The models can’t reliably predict anything about CO2 and the climate.

D. Boss

to paraphrase Tesla

'Climate change (Life) is an equation incapable of solution, but it contains certain known factors.'

Liars, damn liars, and statisticians.

Well done, Pat! 🙂

Your analysis is proof of your keen intelligence, your depth of knowledge, your fine writing ability, your fun sense of humor and sharp wit, and, most of all, of your highly admirable perseverance.

Thank you for all you are doing, ultimately, for FREEDOM.

Janice

Thank-you, Janice.

I do it to help bring our wonderful future, for individual freedom, for our species ad astra.

AGWism is collectivism is slavery. I detest it.

My best to you Janice,

Pat

“… for our species ad astra.”

To infinity — and beyond!🙂

Pat Frank, your mathematical gish-gallop sucks. If you can’t provide an argument that a sophomore high school student can comprehend, you’ve lost. Throwing everything you have against the wall is a poor strategy. Not to mention that 97% of the folks reading this blog can’t get past your first equation.

PS, stop wasting your time posting to this blog, and try to get any kind of peer review of your "work."

Professor Pool,

Did you take a poll of readers to determine that 97% can’t get past the first equation, or are you just projecting based on your personal problems? Why do we have children go on to college when you theorize that it is possible to explain everything to a high school sophomore? By your ‘logic’ it should be possible to save those potential students a lot of debt by just finding the right explainers for all the world’s problems. Even Greta might understand.

Pat is getting peer review here. Some of us are or have been reviewers of technical articles and we perform the same function here. The difference is that Pat doesn’t have to pay to get his paper published, and it gets a wider distribution than a paper behind a paywall. Your negativity contributes nothing and I suggest that YOU “stop wasting your time posting to this blog.” You apparently are getting nothing out of even reading it.

Mr. Pool….. have you and “Pippen Kool” ever been seen together…..? Hm. Heh.

Here is the peer-reviewed work that is under discussion in this forum, Henry Pool. The paper shows that no one has a clue what, if anything, CO2 has done, is doing, or will do to the climate.

Here (870 kb pdf) is a peer-reviewed paper showing that the air temperature compilers have utterly neglected systematic measurement error. No one knows the rate or magnitude of the warming since 1850.

Here is a peer-reviewed paper showing that there is no science in any of consensus AGW work. The whole caboodle is a crock full of false precision.

Have fun reading, Henry Pool. And enjoy your crow.

A large fraction of readers here are very well trained. Others are science-minded, intelligent people. Your disdain is misplaced and, in appearance, self-servingly prejudicial.

No one took up the rulers question, but I would like to have heard a more in depth discussion of it. (I agree with Rich).

Tom, your rulers taken up by Tim Gorman here, followed by further comment.

Reading this ongoing discussion while focussing all the time on the fundamental difference between science and simulation, I felt no need to go for any of the formulas or the detailed reasonings. Because you could tell from the start of these discussions that Pat Frank derives his arguments from the fundamental approach in science, i.e. an ongoing confrontation of possible a priori explanations of phenomena with physical reality. His critics derive their arguments from model parametrization via trial and error, to see if it has any simulation validity about physical reality. The fundamental difference between these approaches being: science looks for explanations, while the model approach is about just making a simulation look right in reference to physical reality, without explanation.

In my simple mind I equate a model with authority. You have to trust it, without being able to grasp the why. Science, in contrast, I equate with an open, inquisitive mind toward physical reality, without any authoritarian directive.

I am not surprised models are so important for the believers and followers of the great leaders that will save them from doom. I wish them luck.

Hi, Jurgen,

A couple years (or 4 or 5? — time goes by so fast…) ago, you were so kind to try to help me with the “screen bouncing” when my computer accessed WUWT. I tried to thank you a couple of times, but just as I sent up my signal flags, your ship passed over the horizon and you couldn’t read my “thank you.”

In the hopes that you read this: THANK YOU, JURGEN! 🙂

Hope all is well with you over there,

Janice

Hi Janice, I remember a positive reaction from you, so something still came through. Glad to help with computers. It is a hobby of mine to subdue these beasts into submission.

I am not a regular at WUWT. So I'll make up for that by answering your question a bit more elaborately.

In Holland formal developments go in parallel with the official climate-scare policy of the European Union, although our coalition tries hard to surpass even that over-idealistic course and Germany's "Klimawandel" with our "Klimaatwet", intended to be the champion of champions battling the supposed climate doom. Here we go, fasten your seat belt:

1: in 2030 our greenhouse gas production will have to be 49% lower than in 1990

2: in 2050 this has to be 95% lower than in 1990

3: electricity production in 2050 has to be 100% CO2 neutral

We have our own Patriot Act and our own Soros. Our local Patriot Act is called "Crisis en herstelwet" (herstel = "repair"), intended for emergencies and now being abused to effectuate draconian idealistic measures outside of standard democratic monitoring. Instead, and not surprisingly, this comes with dominant involvement of privately owned and financed NGOs.

Our Soros is called Boudewijn Poelmann. In practical terms he owns and steers billions of dollars with his lottery empire (the biggest reaper called “Postcode Loterij”).

You can sense and see the corruptive political involvement in several ways. There is this strange loophole in our Civil Law (art 3:305a, from 1-7-1994) which remains unrepaired to this very day, allowing private parties without any representative status to challenge official government policy in court. All an organization has to do is state in its mission statement that it is after the "common interest". Unbelievably, just this simple statement gives the legal ground to go all the way to the High Court without any representative or democratic backing from the local population. No membership or consultation of the public is required at all. The success of our "Urgenda" NGO is based on this legal loophole. And of course on their sole financier Poelmann (8.3 million euros since 2013).

Another corruptive political involvement is smelled from the activities of this Poelmann empire. He cleverly involves many politicians in the subsidiaries of his organization. So, not surprisingly, his legal obligation to donate 50% of his lottery income to activities of "public interest" has been lowered to just 40%. There is also the fact that he has a practical monopoly on lotteries in our country. And of course, the NGOs he supports are well in line with the policy of our government.

It is not all gloom and doom. A grassroots-like movement inspired by common sense is getting stronger all the time. Our “elite” tries to subdue that in many ways, among them smears (“populist movement”), exclusion from politics (“cordon sanitaire”) or worse. I’ll give an example of that.

Rypke Zeilstra from our province Friesland has for years now been merciless in tracing and exposing the money flows and corruptive intimate connections within the green jet set and the policy-makers. He has a very sharp pen. So he is feared. So they try to subdue and smear and silence him with a recent preposterous criminal charge and arrest. Will they really succeed with this ploy? In my eyes it confirms their foolishness. It won't silence Rypke for sure.

https://www.interessantetijden.nl/2020/02/21/dutch-sciencejournalist-arrested-for-windfarm-thoughtcrime/

Thank you, Jurgen, for your VERY generous reply. 🙂

I enjoyed reading of Dutch patriot Zeilstra. Abominable, how he and all who fight the CO2 scammers (Big Wind, Solar, electric vehicle, et al.) are being treated. The linked article in a phrase:

cui bono.

Solar, wind, electric vehicle, "carbon storage," temperature data products and other scammers of the public are, ultimately, behind all the tyranny. For some, it is about power or career advancement, per se. For a tiny % of true believers, it is about their faith. For the vast majority of AGWists, however, it is about one thing: MONEY.

Disgusting (“smelled from the activities,” indeed!).

And sad.

Keep up the good fight, Jurgen.

In the end, truth wins. Every time.

Janice

The elephant in the room has woken up.

Or to put it another way, I am as usual late to the party, having been on holiday and then browsing different WUWT articles from this one.

I do think it would have been a nice courtesy if Charles The Moderator could have emailed me to notify me about this new posting of Pat’s. Obviously it is very long, and I don’t know how long it will take me to compose a useful response.

I shall now consider my position. Resign! I hear many of you Frank supporters say 😉

Rich.

P.S. It’s a glorious cool March day here so I’m off for a walk first.

I wasn’t notified of your critique, Rich.

I wish you had been. I think it is a basic service that the moderators could be encouraged to offer, in circumstances where one blogger is writing about another’s work.

"Frank supporters"? Supporters of standard uncertainty analysis, rather.

Unfair and unkind, Rich.

Well it’s only because I am jealous – from my POV I’d like to have a good set of supporters. Janice, wow!

Hi, Rich,

Here is some support:

Both you and the readers of WUWT deserved to have you given timely notice and an opportunity to appear in a timely manner (IOW, Due Process). While I agree with Pat's analysis, that you were not given reasonable notice was inept publishing management, indeed.

If it makes you feel any better (or, at least, less dismayed), Pat, also, was treated this way. When Roy Spencer’s article criticizing Pat’s analysis was published on WUWT, there was no corresponding Reply article featured by Pat within a reasonable time frame (a few hours, that is). Roy’s LOUD blast on the horn aimed at Pat’s work just echoed on and on and on… for days.

So! There you go. 🙂

Best wishes to you in what appears to be a diligent search for truth,

Janice

P.S. Also, I have noticed that you have been over the years, in general, a fine promoter of facts and data about CO2. Way to go! 🙂

Pat, it’s going to be best if I respond to your posting in chunks, to keep the size manageable.

Response 1

At this stage I won't reply to your 21 summary points, leaving that until after the detail is covered. Here I cover your first 8 pages (roughly, as per copying to MS Word A4), starting with your analysis of my Section B. Your very first equation (1R) is not what I wrote. If you go back to my posting you will find no (-r) coefficient. My remote diagnosis of how this happened is as follows, and you can tell me if I am on the right lines. You copied my equation onto paper, and then whilst studying the connection between R2(t) and R3(t) you wrote in a (-r) before the latter to remind you of the anti-correlation. But later you forgot that this was an annotation, and took it to be a coefficient. This is the generous explanation, so is likely to be true.

Anyway, most of what you write later about (-r) is incorrect, but you can replace it with 1 to good effect.

You note that I never once use my emulator on real temperature data. There are 3 reasons for that: first it would have unnecessarily lengthened the essay, second I don’t have suitable data to hand, and third in Section D I show that my emulator with a > 0 can roughly mimic the expectations from the emulator with a = 0. However, I should be happy to test this assertion if you could provide me with suitable digital data.

I don’t understand your “The relative impact of each Rn on W(t-1) is R₁(t) > R₂(t) ³ |rR₃(t)|”, so please could you explain it?

You write that I did not designate what X(t) actually is. True, I merely wrote “X(t) be the random variable for reality at integer time t”, without specifying the reality. That section started out in generality, but by Equation (1) (WordPress turned it into a bullet point) I should have mentioned I was moving towards specifics of GAST (Global Average Surface Temperature). However, W(t) could either be temperature anomaly in Kelvin or radiative forcing in W/m^2 and then later converted to Kelvin by a sensitivity factor; the rescaling does not materially affect the analysis of propagation of error.

You write about the lack of focus or relevance of my emulator. Until I read your comments on my Section D, I’ll defer substantive comment on this, but merely note that my emulator is just a generalization of yours, so it has potential relevance by building upon yours. The question is whether your Newtonian special case suffices to explain the data well enough, and whether mine can improve upon that.

You analyze my Equation (2R) for correctness. Unfortunately there is a typographical error in it, which may have sent you down a garden path. The upper limit of summation should be t-1, not t. Nick Stokes was smart enough to spot a problem there, and it should be evident enough merely by trying out t=1. I did originally write a precursor of it correctly at Equation (5) of https://wattsupwiththat.com/2019/10/15/why-roy-spencers-criticism-is-wrong/#comment-2830795, but somehow that typo crept in. Anyway, you reached your Equation (1), which written in ASCII/TeX is

W(t_t) = (1-a)^tW(0) + sum_{i=1}^t (1-a)^{t-i}[R_1(t_i)+R_2(t_i)-rR_3(t_i)]

You do choose some weird notations sometimes – t_i should be i, an integer time step. Your equation is the same as my (2R), with typo fixed, as follows: change t_i to i, change –r to 1 (as previously mentioned), let j = t-i and get

W(t) = (1-a)^tW(0) + sum_{j=0}^{t-1} (1-a)^j[R_1(t-j)+R_2(t-j)+R_3(t-j)]

which is (2R) except for using a variable called j instead of i. So yes, it is correct.
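As a neutral numerical check of the reindexing step as stated above, here is a sketch with arbitrary random R values and a = 0.3 (illustrative values only; this verifies only the algebraic substitution j = t-i, not either emulator):

```python
import random

random.seed(0)
t, a = 7, 0.3
R = [None] + [random.random() for _ in range(t)]  # R[1..t]; index 0 unused

# Sum over integer time steps i = 1..t
lhs = sum((1 - a) ** (t - i) * R[i] for i in range(1, t + 1))
# Same sum reindexed with j = t - i, so j runs 0..t-1
rhs = sum((1 - a) ** j * R[t - j] for j in range(t))
print(abs(lhs - rhs) < 1e-12)  # True: the two index conventions agree
```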

Next, you ask why ‘a’ should be constant apart from convenience. That is a bit rich considering you have been using the constant a=0 in your emulator. But yes, in mathematical modelling there are trade-offs between convenience, under-fitting, and over-fitting. For God’s sake don’t ask that a = a(t) be a function with 4 tunable parameters or Willis will come over brandishing an elephant’s trunk!

Under Equation (3R) you ask to see the work. I could do that, but it’s pretty tedious, being standard summation of a geometric series including a spare variable x, differentiation wrt x, and setting x to 1. Instead though, I’ll show you that t=2 works, and leave others to try t=3 if they wish. Let’s just take the R1(t-i) piece. Then Equation (2R) with t=2 and proper limits 0 and 1 gives E[R1(2)+(1-a)R1(1)] = (2b+c)+(1-a)(b+c) = (3-a)b + (2-a)c. And the relevant bit of Equation (3R) gives (2a+a-1+1-3a+3a^2-a^3)b/a^2 + (1-1+2a-a^2)c/a = (3-a)b+(2-a)c. Magic isn’t it?
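The t=2 check above can also be confirmed numerically for arbitrary parameter values (the two bracketed numerators simplify to a^2(3-a) and a(2-a)); a quick sketch:

```python
import random

# Numerically confirm the t=2 identity for several random (a, b, c) triples
random.seed(1)
for _ in range(5):
    a, b, c = random.random(), random.random(), random.random()
    direct = (2 * b + c) + (1 - a) * (b + c)             # E[R1(2) + (1-a)R1(1)]
    closed = (3 * a**2 - a**3) * b / a**2 + (2 * a - a**2) * c / a
    target = (3 - a) * b + (2 - a) * c
    assert abs(direct - target) < 1e-9 and abs(closed - target) < 1e-9
print("t=2 identity holds")
```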

Now to your second diversion into dimensional analysis. You say that R1(t) must be Celsius (I use Kelvin = K as the SI unit) and then in the next sentence that R1(t) is in W/m^2. Bizarre. As per previous discussion, the model could use either K or W/m^2 as units, but in Section D I plump for K.

You ask how I decided to use a straight line to represent the causal influences of the sun and CO2. Well, AFAIK GCMs assume constant sun into the future. For CO2 they assume various RCPs (Representative Concentration Pathways) which tend to be exponential increase of CO2 (1% per year is widely quoted for Transient Climate response studies even though 0.5% is nearer the observed mark recently), but the radiative effect is logarithmic and log(exp(kt)) = kt. OK?

As for the internal dimensions of bt(t+1)/2, it is the only comment of yours so far (I’m sure there are better ones to come) which has piqued my interest (not “peaked” as I recently saw written on WUWT). I can’t give a definitive answer, but would suggest not to worry about dimensionality along the x-axis, with the following analogy. Suppose I give you $5i on each of the days numbered i=1,2,…,t, then at the end you have $5t(t+1)/2. Are you really worried about the apparently day^2 dimension of your wad of cash?

That’s all for now, Rich.

Rich, I just saw your response. You wrote, "Your very first equation (1R) is not what I wrote. If you go back to my posting you will find no (-r) coefficient. … You copied my equation onto paper, and then whilst studying the connection between R2(t) and R3(t) you wrote in a (-r) before the latter to remind you of the anti-correlation. But later you forgot that this was an annotation, and took it to be a coefficient. This is the generous explanation, so is likely to be true."

Here's what you wrote about R3(t), Rich: "R3(t) is a putative component which is negatively correlated with R2(t), with coefficient -r, with the potential (dependent on exact parameters) to mitigate the high variance of R2(t)."

That is, you described R3(t) itself as negatively correlated and then you went on to say it has a coefficient -r. The complete and correct rendering is therefore -rR3(t), by your own description.

My rendering of the R3(t) term accurately followed your description of it.

If you repudiate that meaning now, you repudiate your own original description as having been inaccurate or else admit it to have been so poorly rendered as to be inadvertently misleading.

Should I finish with your, "This is the generous explanation, so is likely to be true."?

OK, when I wrote "with coefficient -r" I didn't write "with correlation coefficient -r" because to most intelligent mathematicians that is entirely understood from the "negatively correlated" precursor. And I did write R2(t)+R3(t) with no intervening (-r). If I have two random variables X and Y, negatively correlated with correlation coefficient -r, their sum is written as X+Y, not as X-rY. And then Var[X+Y] = Var[X]+Var[Y]+2Cov[X,Y] = Var[X]+Var[Y]-2r sqrt(Var[X]Var[Y]).
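That variance identity is easy to sanity-check by simulation; a minimal sketch with an illustrative r = 0.6 and unit variances (values chosen arbitrarily, not taken from either emulator):

```python
import numpy as np

rng = np.random.default_rng(42)
r, n = 0.6, 200_000  # illustrative correlation magnitude and sample size

# Construct X, Y with Corr(X, Y) = -r via a shared latent normal variable
z = rng.standard_normal(n)
e = rng.standard_normal(n)
x = z
y = -r * z + np.sqrt(1 - r**2) * e  # Var[Y] = r^2 + (1 - r^2) = 1

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) - 2 * r * np.sqrt(np.var(x) * np.var(y))
print(abs(lhs - rhs) < 0.05, abs(lhs - 2 * (1 - r)) < 0.05)  # True True
```

Both sides come out near 2(1-r) = 0.8, i.e. the negative correlation does reduce the variance of the sum below Var[X]+Var[Y].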

Your ignorance of basic probability theory is becoming quite disturbing. Yet readers seem to trust your outlook on uncertainty theory.

But you didn't write "with negative correlation coefficient -r," did you, Rich.

You originally wrote, "R3(t) is … negatively correlated with R2(t) with coefficient -r," which means something different than your revision.

And now, having been careless in expression, you try to shift the blame onto me for taking the meaning from what you actually wrote.

Here’s what’s disturbing, Rich: your repeatedly evident ignorance of basic empirical physical error analysis never gives you pause.

Pat, whatever words I wrote about R_3(t) following Equation (1R) could not change the equation itself, and therefore the words have to be viewed in that light. Mathematicians would see that in “putative component which is negatively correlated with R_2(t) with coefficient -r”, the coefficient means the correlation just mentioned, not a multiplier. But you are a very intelligent physicist, rather than a mathematician, so sorry if this confused you.

Rich, you asked, "I don't understand your 'The relative impact of each Rn on W(t-1) is R₁(t) > R₂(t) ³ |rR₃(t)|', so please could you explain it?"

There were a few mistranslations from Word to HTML when Charles posted the essay. I thought we caught them all, with Clive Spencer's help. You just found another.

In its submitted original, that line read, "The relative impact of each Rn on W(t-1) is R₁(t) > R₂(t) (>/=) |rR₃(t)|," where (>/=) is 'greater than or equal to' and the vertical strokes around rR₃(t) indicate the absolute value.

You wrote, "but merely note that my emulator is just a generalization of yours, …"

No, it's not. Your equation includes persistence. Mine doesn't. Yours includes climate feedback. Mine doesn't. Yours includes a tuning factor. Mine doesn't. Yours is whimsical, invented, and oracular. Mine is empirical. They have nothing in common.

You wrote, "The upper limit of summation should be t-1, not t. Nick Stokes was smart enough to spot a problem there, …"

Nick's supposed correction was that you should have begun the summation at i = 1, not that you should have ended at t-1.

Your summation should have been over i=1->t. There is no zeroth step to a summation.

If a zeroth term exists, it is an initial value to which step results are summed, e.g., W(t) = W(0) + [sum(i=1->t)(W_i)].

Your mistake is not that the sum in 2R is to t rather than t-1. It’s that you’re summing over i=0 ->t-1 rather than over i = 1->t. You made that same mistake in your equation 5 touch-stone.

You wrote, "and [the t-1 fix] should be evident enough merely by trying out t=1."

I did try the t=1 test. Your generalized equation yields W(t_1) = (1-a)[R1(0)+R2(0)+R3(0)] + (1-a)W(0), which is wrong. Your "(t-i)" is 't minus i', not 't_i' (t-sub-i), leading to undefined R(0) terms.

My notation is science-standard. You chose ’t’ to be integer time in your definition of X(t), but you also use it as a place-designator, e.g., R(t), in your notation.

You wrote, "change t_i to i, change -r to 1 (as previously mentioned), let j = t-i and get W(t) = (1-a)^tW(0) + sum_{j=0}^{t-1} (1-a)^j[R_1(t-j)+R_2(t-j)+R_3(t-j)]."

If j = t-i, then i = t-j. In that equation, j is not 0 when i = 0. When i = 0, j = t. So your sum[i = 0 -> (t-1)] becomes sum[t -> (t-1)], which is nonsense.

You wrote, "which is (2R) except for using a variable called j instead of i."

Except that "j = t-i" means that 'j' is now the entire interval between 'i' and 't'. Your 'j' is not an integer time. Your new equation 2R does not have the same (attempted) physical meaning as your old equation 2R.

And it certainly does not have the same meaning as my eqn. (1).

You’ve merely hidden your mistake in a cryptic formalism; a superficial but false analogy.

So yes, 2R is not correct. And neither is your attempted reformulistic encryption.

You wrote, "That is a bit rich considering you have been using the constant a=0 in your emulator."

Is that so? What, then, do you make of the values for 'a' in the Figure 3 legend and SI Tables S4-1 through S4-4?

You wrote, "Now to your second diversion into dimensional analysis. You say that R1(t) must be Celsius (I use Kelvin = K as the SI unit) and then in the next sentence that R1(t) is in W/m^2. Bizarre. As per previous discussion, the model could use either K or W/m^2 as units, but in Section D I plump for K."

You wrote that, "R1(t) is to be the component which represents changes in major causal influences, such as the sun and carbon dioxide. R2(t) is to be a component which represents a strong contribution with observably high variance, for example the Longwave Cloud Forcing (LCF)."

Cloud forcing is not a temperature. It is power in W/m^2. A sum of terms means all the rest of the Rn(t) must also be dimensionally identical, i.e., in W/m^2. Causal forcings from the sun and CO2 are also in W/m^2, not temperatures. But your W(t) are temperatures, Kelvin as you like.

There must be an operator to convert the W/m^2 of each Rn(t) to temperature. You don’t provide it.

Your emulator equation (1R) is a sum of physical terms. The Rn(t) terms to the right of the equal sign must all have the same dimensions — W/m^2.

To sum with the W(t-1) term and to produce the W(t), they must also have the same dimension as the W(t)'s, your Kelvin. But they do not.

Your use of dimensions is incoherent.

You pay no attention to physical meaning all the while supposing you’re involved in physical analysis.

You wrote, "You ask how I decided to use a straight line to represent the causal influences of the sun and CO2. Well, AFAIK GCMs assume constant sun into the future. For CO2 they assume various RCPs (Representative Concentration Pathways) which tend to be exponential increase of CO2 (1% per year is widely quoted for Transient Climate Response studies even though 0.5% is nearer the observed mark recently), but the radiative effect is logarithmic and log(exp(kt)) = kt. OK?"

No.

GCMs model the terrestrial climate. They do not model the sun. They do not model CO2 forcing. The terrestrial climate is dynamically non-linear. There is no reason to assume that the climate responds linearly to a constant or a linear forcing. The chaotic history of the terrestrial climate all the while solar irradiance and CO2 forcing remained relatively unchanged demonstrates that nonlinearity.

Finding that GCMs project a linear temperature response to GHG forcing was surprising to me.

Your emulator supposedly models the GCMs themselves, which in turn model the climate. Your a priori assumption of a linear response is tendentious. You begged the question of climate physics. My emulator treats GCM observable behavior. It has nothing to do with their internal workings (Nick Stokes’ mistake as well) or the climate. The origination of my emulator is completely at variance with the origination of yours. One might say they’re orthogonal.

That distinction is yet another bit of evidence that your emulator has no correspondence to mine.

You wrote, “Are you really worried about the apparently day^2 dimension of your wad of cash?”

Wrong analogy. Dimensional analysis was one of my first lessons in high-school chemistry. If the terms in an equation combine into the wrong dimensions, the equation is wrong. Period.

Not only is your equation wrong, but you treat operators as integers. After your manipulations they end up operating on the wrong terms. The result is physically meaningless.

Your work here shows the same lack of attention to physically important detail as your careless delta-delta T mistake.

That sort of thing shows up repeatedly in your work, with plenty more examples above. It’s really tedious and time-consuming going through and exposing them.

It’s time to figure out that knowing statistics isn’t knowing science, Rich. Another incommensurate you share with climate modelers.

Pat Mar 7 3:06pm

I’ll respond to your substantive comments and prepend them with ‘P:’

P: You wrote, “but merely note that my emulator is just a generalization of yours, …”

P: No, it’s not. Your equation includes persistence. Mine doesn’t. Yours includes climate feedback. Mine doesn’t. Yours includes a tuning factor. Mine doesn’t. Yours is whimsical, invented, and oracular. Mine is empirical. They have nothing in common.

Please can you explain “persistence”? After you do, I shall probably say “oh, obviously, I shouldn’t have been so thick”, but for the moment I don’t see what you mean.

P: You wrote, “The upper limit of summation should be t-1, not t. Nick Stokes was smart enough to spot a problem there, …”

P: Nick’s supposed correction was that you should have begun the summation at i = 1, not that you should have ended at t-1.

It is astute of you to spot that (I did too), but Nick didn’t get it quite right.

P: Your summation should have been over i=1->t. There is no zeroth step to a summation.

If a zeroth term exists, it is an initial value to which step results are summed, e.g., W(t) = W(0) + [sum(i=1->t)(W_i)].

P: Your mistake is not that the sum in 2R is to t rather than t-1. It’s that you’re summing over i=0 ->t-1 rather than over i = 1->t. You made that same mistake in your equation 5 touch-stone.

P: You wrote, “and [the t-1 fix] should be evident enough merely by trying out t=1”

I did try the t=1 test. Your generalized equation yields, W(t_1) = (1-a)[R1(0)+R2(0)+R3(0)] + (1-a)W(0), which is wrong. Your “(t-i)” is ’t minus i’ not ’t_i’ (t-sub-i), leading to undefined R(0) terms.

Pat, your notation t_k is crazy, it just means k. Let’s ignore R2 and R3 and start with

W(t) = (1-a)W(t-1) + R1(t) for t = 1,2,3,…

We assume W(0) is a known starting point, so

W(1) = (1-a)W(0) + R1(1)

W(2) = (1-a)W(1) + R1(2) = R1(2)+(1-a)R1(1)+(1-a)^2 W(0)

W(3) = (1-a)W(2) + R1(3) = R1(3)+(1-a)R1(2)+(1-a)^3 W(0)

So in those 3 cases, and in all by induction,

W(t) = sum_{k=1}^t (1-a)^{t-k} R1(k) + (1-a)^t W(0)

There, you have your preferred sum from 1 to t. Now let i = t-k, which runs from 0 to t-1:

W(t) = sum_{i=0}^{t-1} (1-a)^i R1(t-i) + (1-a)^t W(0) (2R)

That really is Equation (2R) with the correct upper limit t-1. If you really really want a variable with limits 1 and t then I can offer you j = i+1 = t-k+1, giving:

W(t) = sum_{j=1}^t (1-a)^{j-1} R1(t-j+1) + (1-a)^t W(0)

but in my humble opinion, as copyright holder of the equation, the latter is much less elegant. Your attempts to disprove (2R) fail yet again, unless you can find something wrong in the preceding few lines.
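The derivation above can be spot-checked numerically. A minimal sketch, with toy values for a, W(0), and the R1 series (none taken from either paper), compares the recursion (1R) against both closed-form sums:

```python
# Spot-check of the unrolled recursion W(t) = (1-a)W(t-1) + R1(t)
# against the two closed forms, using arbitrary toy values.
a, W0 = 0.3, 10.0
R1 = {t: 0.5 * t + 1.0 for t in range(1, 11)}  # arbitrary forcing series

# Recursion (1R)
W = {0: W0}
for t in range(1, 11):
    W[t] = (1 - a) * W[t - 1] + R1[t]

for t in range(1, 11):
    # Closed form summed over k = 1..t
    Wk = sum((1 - a) ** (t - k) * R1[k] for k in range(1, t + 1)) + (1 - a) ** t * W0
    # Closed form (2R), summed over i = 0..t-1
    Wi = sum((1 - a) ** i * R1[t - i] for i in range(0, t)) + (1 - a) ** t * W0
    assert abs(W[t] - Wk) < 1e-12 and abs(W[t] - Wi) < 1e-12
```

Any choice of a in (0,1) and any R1 series gives the same agreement, since the closed forms are just the unrolled recursion.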

P: You wrote, “That is a bit rich considering you have been using the constant a=0 in your emulator.”

P: Is that so? What, then, do you make of the values for ‘a’ in Figure 3 Legend and SI Table S4-1 through S4-4?

What I make of them, is that your ‘a’ is an additive and my (1-a) is a multiplier, so they are different things. Perhaps I should have chosen a different letter. But I’ll concede that ‘1’, for your case, is a more canonical constant than (1-a) is in my case.

P: You wrote, “Now to your second diversion into dimensional analysis. You say that R1(t) must be Celsius (I use Kelvin = K as the SI unit) and then in the next sentence that R1(t) is in W/m^2. Bizarre. As per previous discussion, the model could use either K or W/m^2 as units, but in Section D I plump for K.”

P: You wrote that, “R1(t) is to be the component which represents changes in major causal influences, such as the sun and carbon dioxide. R2(t) is to be a component which represents a strong contribution with observably high variance, for example the Longwave Cloud Forcing (LCF).”

P: Cloud forcing is not a temperatures. It is power in W/m^2. A sum of terms means all the rest of the Rn(t) must also be dimensionally identical, i.e., in W/m^2. Causal forces from sun and CO2 are also in W/m^2, not temperatures. But your W(t) are temperatures – Kelvin as you like.

P: There must be an operator to convert the W/m^2 of each Rn(t) to temperature. You don’t provide it.

No, I take each Rn(t) to be a temperature converted from the forcing. I don’t give the conversion factor in Section B, but I do in Section D. It is 33*0.42/33.3 = 0.416 Km^2/W, derived from various conversion coefficients in your paper. The value will only be of interest if I ever do any actual data analysis, which you have pointed out I have not done yet.

P: You wrote, “You ask how I decided to use a straight line to represent the causal influences of the sun and CO2. Well, AFAIK GCMs assume constant sun into the future. For CO2 they assume various RCPs (Representative Concentration Pathways) which tend to be exponential increase of CO2 (1% per year is widely quoted for Transient Climate response studies even though 0.5% is nearer the observed mark recently), but the radiative effect is logarithmic and log(exp(kt)) = kt. OK?”

P: No.

P: GCMs model the terrestrial climate. They do not model the sun. They do not model CO2 forcing. The terrestrial climate is dynamically non-linear. There is no reason to assume that the climate responds linearly to a constant or a linear forcing. The chaotic history of the terrestrial climate all the while solar irradiance and CO2 forcing remained relatively unchanged demonstrates that nonlinearity.

I think you’ll find they have to model solar input, since otherwise they cannot model day being warmer than night, which I am sure they do.

P: Finding that GCMs project a linear temperature response to GHG forcing was surprising to me.

It wouldn’t have surprised me, but I wasn’t in your shoes at the time.

P: Your emulator supposedly models the GCMs themselves, which in turn model the climate.

Absolutely correct. And one has to start somewhere. A linear model is the simplest starting point. Plus, of course, it is inspired by your own work which does show a generally linear response. So I’d prefer you to call me a plagiarist than tendentious!

P: Your a priori assumption of a linear response is tendentious. You begged the question of climate physics. My emulator treats GCM observable behavior. It has nothing to do with their internal workings (Nick Stokes’ mistake as well) or the climate. The origination of my emulator is completely at variance with the origination of yours. One might say they’re orthogonal.

P: If they are orthogonal, it is because I started at the same point but went off at a tangent. I’m glad that you treat GCM’s observable behaviour, but you only treat their mean behaviour. Consequently you cannot understand their dispersive behaviour, which leads you, in my opinion, to draw unfounded conclusions on their uncertainty. We are definitely into “what do you mean by mean” here, and I hope to write more on this later in the week.

P: You wrote, “Are you really worried about the apparently day^2 dimension of your wad of cash?”

P: Wrong analogy. Dimensional analysis was one of my first lessons in high-school chemistry. If the terms in the equation combine into the wrong dimensions, the equation is wrong. Period.

Well, if you refuse to engage with my quite legitimate analogy, I’ll have to do it the hard way, which is why this set of comments is later than I hoped. I’ll be interested to see what anyone else thinks, except perhaps only two of us are now reading here, but the conclusion I reach is that the act of discretization of a dimension breaks its dimensionality for practical purposes. Let’s take my Equation (1R) and simplify with R2(t)=R3(t)=0 and R1(t) = bt+c (i.e. no error). Then

W(t) = (1-a)W(t-1) + bt + c, for t = 1,2,3…

Before analyzing dimensionality of that, let’s look at the continuous time analogue:

dW/dt = -aW +bt + c

Let the two dimensions be K for temperature in Kelvins and y for time in years. I’ll use ~ to denote that a variable has a certain dimension. Then the above is commensurate if

W ~ K, t ~ y, a ~ 1/y, b ~ K/y^2, c ~ K/y.

The solution to that differential equation is

W(t) = k exp(-at) + c/a – b/a^2 + bt/a

where k = W(0)-c/a+b/a^2 ~ K.

So everything works out there.

Now go back to the discrete case, and look at W(2).

W(2) = (1-a)W(1)+2b+c = (1-a)^2 W(0)+(1-a)b+2b+(1-a)c+c

The time is all out of joint. W ~ K is still OK, and the dimensions still work provided that y is replaced by 1 in each of them. But if not, then a ~ 1/y means that in (1-a) we have 1 ~ 1/y (that’s not contradictory), but (1-a)^2 W(0) ~ Ky^2 and that is contradictory. Nevertheless, discrete approximation of continuous time series is a time-honoured science.
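The sense in which the discrete recursion approximates the continuous equation can be illustrated with a short sketch: an explicit Euler step of dW/dt = -aW + bt + c, with assumed toy coefficients, converges to the analytic solution given above as the step size shrinks.

```python
import math

# Toy check (assumed values, not from the discussion) that the discrete
# step W -> W + h(-aW + bt + c) converges to the exact ODE solution.
a, b, c, W0 = 0.5, 0.2, 1.0, 3.0
k = W0 - c / a + b / a**2

def exact(t):
    # Analytic solution: W(t) = k exp(-at) + c/a - b/a^2 + bt/a
    return k * math.exp(-a * t) + c / a - b / a**2 + b * t / a

def euler(t_end, n):
    # Explicit Euler with n equal steps
    h = t_end / n
    W = W0
    for i in range(n):
        t = i * h
        W += h * (-a * W + b * t + c)
    return W

err_coarse = abs(euler(5.0, 50) - exact(5.0))     # h = 0.1
err_fine = abs(euler(5.0, 5000) - exact(5.0))     # h = 0.001
assert err_fine < err_coarse / 10   # first-order convergence in h
```

The error falls roughly in proportion to the step size, which is the usual first-order Euler behaviour.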

Typo above in “P: If they are orthogonal, …”: that is my current reply, not Pat’s comment.

Rich.

Re: Response 1: Rich, you wrote, “Please can you explain “persistence”?”

Persistence means your equation includes memory. Every W(t) explicitly includes a fraction of the W(t-1). One must know a W(t-1) to calculate a W(t). In contrast, with paper eqn. 1, one can calculate any temperature, T, from forcing alone.

You wrote, “Pat, your notation t_k is crazy, it just means k.”

I was just adapting to the R(t) notation you introduced, Rich.

Continuing: in your “W(t) = (1-a)W(t-1) + R1(t) for t = 1,2,3,…” derivation, your “W(3) = (1-a)W(2) + R1(3) = R1(3)+(1-a)R1(2)+(1-a)^3 W(0)” is not correct. Let’s do the substitutions:

W(1) = (1-a)W(0) + R1(1)

W(2) = (1-a)W(1) + R1(2) = (1-a)[(1-a)W(0) + R1(1)] + R1(2) = (1-a)^2 W(0) + (1-a)R1(1) + R1(2)

W(3) = (1-a)W(2)+R1(3) =(1-a)[(1-a)^2W(0)+(1-a)R1(1) + R1(2)] + R1(3) = (1-a)^3 W(0) +(1-a)^2 R1(1)+(1-a)R1(2) +R1(3)

Putting your version above mine will illuminate your error:

“W(3) = R1(3)+(1-a)R1(2)+(1-a)^3 W(0)”

W(3) = R1(3) + (1-a)R1(2) + (1-a)^2 R1(1) + (1-a)^3 W(0)

You are missing the entire ‘(1-a)^2 R1(1)’ term.

Then you wrote, “There, you have your preferred sum from 1 to t. Now let i = t-k, which runs from 0 to t-1: W(t) = sum_{i=0}^{t-1} (1-a)^i R1(t-i) + (1-a)^t W(0) (2R)”

That equation is wrong as written.

Your integer series is still 1 -> t, but the ’i’ label is stepped back one unit.

In your original k-notation, your equation is now sum[k = (-1 -> t-1)], where -1 labels the initial value.

If new i = t-k, then i = 0 when t = k = 1, your initial value is at t = -1 and old W(0) is now become W(-1) and your original W(1) becomes W(0).

Now we have the series:

t = 0; i = -1, and old W(0) = new W(-1)

t = 1; i = 0; old W(1) = new W(0) = (1-a)W(-1) + R1(0)

t = 2; i = 1; old W(2) = W(1) = (1-a)W(0) + R1(1) = (1-a)[(1-a)W(-1)+R1(0)]+R1(1) = (1-a)^2 W(-1)+(1-a)R1(0) + R1(1)

t = 3; i = 2; Old W(3) = new W(2) = (1-a)^3 W(-1) + (1-a)^2 R1(0) + (1-a)^1 R1(1) + (1-a)^0 R1(2)

then generalizing: W(t-1) = [(1-a)^(i+1)]W(-1) + {sum(i=0->(t-1))}[(1-a)^i R1(i)]

The “(i+1)” exponent over (1-a) should be retained, even though (i+1) = t, because the step-labels traverse 0 -> (t-1), while the number of steps traverse 1 -> t.

In your formalism, Rich, the ’t’ is always unit greater than the ‘i’. This can confuse if one is not careful.

Nevertheless, changing (i+1) to ’t’ for the sake of comparison:

compare, W(t-1) = (1-a)^t W(-1) + {sum(i=0->(t-1))}[(1-a)^i R1(i)]

with your, “W(t) = (1-a)^t W(0) + sum_{i=0}^{t-1} (1-a)^i R1(t-i).”

When i = 0, t = 1 your equation becomes W(1) = (1-a)W(0) + R1(1).

However, from above: when t = 1; i = 0; the correct equation is W(0) = (1-a)W(-1) + R1(0).

Your equation is wrong.

With mine, i = 0, t = 1 yields W(0) = (1-a)W(-1)+R1(0), which is correct.

When you set i = 0 as summation point one, everything changed. Your notation is wrong; your equation is wrong.

Stepping back, when you substituted i = t-k into my equation, it was necessarily trivially correct.

But then you uncritically substituted in the W(0), without realizing it had become old W(1) when the initial value became W(-1).

Moving the initial summation point to zero changed the expression. But you didn’t pay attention to that detail.

You wrote, “Your attempts to disprove (2R) fail yet again…”

I did not attempt to disprove 2R. I tested 2R, found it wrong, and reported that result. Eqn. 2R is still wrong.

You wrote, “What I make of them, is that your ‘a’ is an additive and my (1-a) is a multiplier, …”

You claimed I used a constant a = 0, and when I called you on it, your reply circumlocuted around admitting that mistake.

I wrote, “There must be an operator to convert the W/m^2 of each Rn(t) to temperature. You don’t provide it.”

To which you replied, “No, I take each Rn(t) to be a temperature converted from the forcing.”

You represented R1 to be the causal influence of the sun or CO2. You exemplified R2 as long wave cloud forcing. Neither R1 nor R2 is a temperature. In Section C, you say R3(t) cancels error in TOA flux, making its unit W/m^2.

Your eqn (1) is dimensionally wrong. And now you’re changing meaning in mid-stream as a means of exit.

Then you go on to aver, “I don’t give the conversion factor in Section B, but I do in Section D. It is 33*0.42/33.3 = 0.416 Km^2/W.”

That’s not a conversion factor for any of the Rn(t). It’s the climate sensitivity to greenhouse gas forcing derived from Manabe and Wetherald (1967).

Incredible. You’re just winging it, aren’t you.

You wrote, regarding my emulator and GCM projections, “I’m glad that you treat GCM’s observable behaviour, but you only treat their mean behaviour.”

You’re wrong, Rich. Your mistake is so obvious, it’s as though you never read my paper. My figures demonstrate 68 successful emulations of individual projection runs.

My emulator reproduces single GCM runs, not just means. Look at Figure 1 in my paper. Emulator equation 1 will reproduce every single one of those 19 GCM projections merely by varying fCO2.

You continued, “Consequently you cannot understand their dispersive behaviour, which leads you, in my opinion, to draw unfounded conclusions on their uncertainty.”

My emulator can reproduce their dispersive behavior. But much more important than that, the predictive uncertainty, i.e., the physical reliability, is not determined by GCM dispersive behavior. Predictive uncertainty is determined by simulation error in comparison to observables. By calibration, Rich.

I and others have communicated that methodological truth over and over again to you. And you just as often ignore it or dismiss it, or otherwise disregard it. And yet that concept is absolutely central.

GCMs are physical models subject to physical accuracy tests. You’re treating them as statistical models subject to conjectural dispersion. Your entire approach to predictive uncertainty is misguided. It’s worse than careless, Rich.

You wrote, “Well, if you refuse to engage with my quite legitimate analogy, …”

I did engage it. Your analogy was sloppy and wrong. No more need be said than that if dimensions are not identical on both sides of the equal sign, the equation is wrong.

Pat Mar 11 12:45pm

Pat, below I’ll mark your comments with ‘P:’, followed by my responses. I’m only going to bother with the persistence thing and my Equation (2R).

P: Re: Response 1: Rich, you wrote, “Please can you explain “persistence”?”

P: Persistence means your equation includes memory. Every W(t) explicitly includes a fraction of the W(t-1). One must know a W(t-1) to calculate a W(t). In contrast, with paper eqn. 1, one can calculate any temperature, T, from forcing alone.

OK, my Equation (1R) is like a first difference of your Equation (1). If you do that first difference, then on the RHS you get delta F_t/F_0, i.e. (F_t-F_{t-1})/F_0, which now looks like (1R) if my a=0. Conversely, if I sum my (1R) to get (2R), then my W(t) only depends on the forcings plus a term for W(0), which is like your Equation (1) where your a corresponds to an initial condition. So it merely depends on presentation as to whether persistence is apparent. What my equation does have, if my a > 0, is decay.

P: You wrote, “Pat, your notation t_k is crazy, it just means k.”

P: I was just adapting to the R(t) notation you introduced, Rich.

Whatever. We are working with a discrete time series and all arguments to R_i(t), W(t) etc. are integers, and to denote these it is simplest to use single letters.

P: Continuing: in your “W(t) = (1-a)W(t-1) + R1(t) for t = 1,2,3,…” derivation, your

“W(3) = (1-a)W(2) + R1(3) = R1(3)+(1-a)R1(2)+(1-a)^3 W(0)” is not correct.

You are quite right, that was a slip. So I’ll cut your correction of it and continue with:

P: You are missing the entire ‘(1-a)^2 R1(1)’ term.

Yes, I was missing that; I have heard some people call it a “thinko” – a bit like a “typo” but a greater brain failure! So, putting that term back in, we have:

W(3) = (1-a)W(2) + R1(3) = R1(3)+(1-a)R1(2)+(1-a)^2 R1(1) + (1-a)^3 W(0) (#)

P: Then you wrote, “There, you have your preferred sum from 1 to t. Now let i = t-k, which runs from 0 to t-1:

P: W(t) = sum_{i=0}^{t-1} (1-a)^i R1(t-i) + (1-a)^t W(0) (2R)”

P: That equation is wrong as written.

No, it isn’t, and I can’t believe we’re still going over this. Substitute t=3 into it, term by term for the 3 terms i=0,1,2 and you get:

W(3) = (1-a)^0 R1(3) + (1-a)^1 R1(2) + (1-a)^2 R1(1) + (1-a)^3 W(0)

which is the same as (#) above. It is not a hard substitution to do, changing from k to i = t-k, and high school students are expected to be able to do that. k runs forward from 1 to t, and i runs backward from t-1 to 0. Until you accept that, there is no point in considering the rest of your comment, which seems to imagine i running forward in parallel with k, despite the fact that i+k is the constant t. The laws of commutative algebra allow for writing a sum in the reverse order!

Rich, you wrote, “Well, all this makes me feel as I am in some sort of high octane contest, …”

One that you began.

You wrote, “But unlike you, I am content for us to have these points of view without claiming that one is more accurate than the other.”

Your entire approach, until now, has been that the statistics of random numbers account for all physical error. It does not; it is badly incomplete.

You wrote, “in fact I know a good deal of science, …”

Your post and comments have provided no evidence for that. It is not arrogant to notice.

You wrote, in your March 11, 2:58, “No, it isn’t, and I can’t believe we’re still going over this. Substitute t=3 into it, term by term for the 3 terms i=0,1,2 and you get:

W(3) = (1-a)^0 R1(3) + (1-a)^1 R1(2) + (1-a)^2 R1(1) + (1-a)^3 W(0)

which is the same as (#) above. It is not a hard substitution to do, changing from k to i = t-k, and high school students are expected to be able to do that. k runs forward from 1 to t, and i runs backward from t-1 to 0.”

Your t = 3 does not yield W(3). It yields W(2), because old W(0) is your new W(-1). In your formalism, when t = 3, i = 2.

I showed that with, t = 3; i = 2; Old W(3) = new W(2) = (1-a)^3 W(-1) + (1-a)^2 R1(0) + (1-a)^1 R1(1) + (1-a)^0 R1(2)

Your notation is wrong.

You wrote, “The laws of commutative algebra allow for writing a sum in the reverse order!”

But we’re not working strictly within the laws of commutative algebra. We’re working with your temperature-time series, within the scheme of physics. Your emulator sum requires knowing the prior temperature before the subsequent temperature can be calculated. It’s physical nonsense to run that sum backward. The ‘i’ must necessarily run from 0 -> (t-1), never from (t-1) -> 0.

Pat, I promised yesterday to let you have the last words. I forgot to put in a proviso, which would be “except for mathematical correctness”. So for example, you have made some interesting comments on my 10:9 games, which I would happily comment on, but it is all a matter of interpretation and relevance, and not correctness, so I shall remain quiet on that.

However, the question of W(3) and Equation (2R) is about correctness, so on that I break my embargo and revisit it. I think we had agreed that my Equation (1R) W(t) = (1-a)W(t-1) + R1(t) led to the equation:

W(3) = (1-a)W(2) + R1(3) = R1(3)+(1-a)R1(2)+(1-a)^2 R1(1) + (1-a)^3 W(0) (#)

You then claimed that

W(t) = sum_{i=0}^{t-1} (1-a)^i R1(t-i) + (1-a)^t W(0) (2R)

is wrong as written. I said no, it isn’t, substitute t=3 into it, term by term for the 3 terms i=0,1,2 and you get:

W(3) = (1-a)^0 R1(3) + (1-a)^1 R1(2) + (1-a)^2 R1(1) + (1-a)^3 W(0) (##)

which is the same as (#) above. On the RHS the first term is i=0, the second is i=1, the third is i=2. Do you dispute the equality of (#) and (##)?

In your Mar 14 11:41am comment you write: “Your t = 3 does not yield W(3). It yields W(2) because old W(0) is your new W(-1). In your formalism, when t = 3, i = 2.”

No, there is no old W and new W, there is just W, the same sequence W(t) of numbers deducible by either (1R) or (2R) from given quantities W(0), a, R1(1), R1(2),… A value of t (say 3) does not imply a value of i (2), because i is a variable running from the limit 0 to the limit t-1. If we don’t reverse the order of summation then we get:

W(t) = sum_{j=1}^t (1-a)^{t-j} R1(j) + (1-a)^t W(0) (2Rj)

(2R) and (2Rj) are both mathematically correct consequences of (1R). You can argue about which one looks nicer, which one makes it easier to understand the physics behind it, or which one is easier to use in further manipulations. But you can’t argue, correctly, about the correctness of either of them.
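The equivalence of (2R) and (2Rj) is a finite reindexing, checkable directly; a toy sketch with assumed values:

```python
# Check (assumed toy values) that (2R) and (2Rj) are the same finite sum
# written in opposite orders: i = t-j runs backward as j runs forward.
a, W0, t = 0.3, 10.0, 7
R1 = [None] + [0.1 * j**2 + 1.0 for j in range(1, t + 1)]  # R1[1..t], arbitrary

w_2R  = sum((1 - a) ** i * R1[t - i] for i in range(0, t)) + (1 - a) ** t * W0
w_2Rj = sum((1 - a) ** (t - j) * R1[j] for j in range(1, t + 1)) + (1 - a) ** t * W0
assert abs(w_2R - w_2Rj) < 1e-12
```

The same identity holds for any t and any R1 series, since each term (1-a)^i R1(t-i) with i = t-j is literally the term (1-a)^(t-j) R1(j).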

You’re wrong again, Rich, because you’re not keeping track of your own formalism.

Your ‘i’ starts at -1, and your purported initial value W(0) is properly W(-1).

All the labels thereby need to be modified.

As a way out of that corner you also claimed a sum of physical results can be run backwards — a nonsense exit from your mistaken usage.

No, in my (2R) i starts at 0, not -1. It’s my equation and you can’t just change it to suit your purpose. I don’t think you will find even 1% of mathematicians to agree with your interpretation, any more than they would if you said 2 plus 2 was 6 (you obviously wouldn’t say 5 because we all know even plus even is even :-)).

And I only said that a sum of real numbers can be run backwards, as in 2+3 = 3+2, and I’m not sure what you mean by “physical results” in the context.

You started your sum at 0 rather than 1, Rich. That requires your zeroth state to have label -1. The first term of your sum must be W(0) = (1-a)W(-1)+R1(0).

Your ‘t’ and ‘i’ have become out of synchrony throughout.

We’re working with physical reality here, Rich. What mathematicians think is pretty much irrelevant, unless they choose to operate in a physical context.

I tend to doubt, in any case, that you’d find much agreement among mathematicians for your position, after they note the 0->(t-1) range on your sum.

The zeroth state can only be W(-1). All else follows.

Pat: You wrote, “Well, if you refuse to engage with my quite legitimate analogy,…”

Pat: I did engage it. Your analogy was sloppy and wrong. No more need be said than that if dimensions are not identical on both sides of the equal sign, the equation is wrong.

If you really believe what you say, given that I solved a differential equation with perfect dimensional balance, but on moving to a discrete time approximation had to convert time units to unity, then why bother with a paper saying that the GCMs are awful because of huge uncertainty bounds? Why not just say the GCMs are dead wrong because they start with differential equations, discretize them to run them on computers, and lo and behold they are no longer dimensionally correct! Simples!

Rich.

You should have started using a properly dimensioned equation.

You wrote, “given that I solved a differential equation with perfect dimensional balance,” after giving ‘a’ the units of 1/y for reasons that don’t seem distinguishable from immediate opportunism.

Your starting emulator was W(t) = (1-a)W(t-1) + Rn(t).

The ‘1-a’ was a scale factor. The ‘a’ was necessarily dimensionless. Now, suddenly, it has dimension 1/y in order to solve your immediate need to account for the time derivative.

And then, when it later turns out to be inconvenient, you discretize to encrypt its presence and so solve the inconvenience.

Paraphrasing you: why bother having dimensions at all in physics, when you can add them or remove them ad libitum?

Response 2

In this response I deal with the end of my Section B and with Section C.

Pat, you wrote “Further, a>0 causes general decay only by allowing the mistaken derivation that put the (1-a) coefficient into the Rn(t) factors in 2R”. But there was no mistaken derivation, with the typo in 2R corrected (t -> t-1). Your further conclusions on Section B are then unwarranted, though I could comment further if you were to expand on what you mean by “unwarranted persistence”.

In Section C (your Section II) you start with the words “so-called decay parameter a”. You have accused me in the past of being tendentious; you need to be a bit careful to avoid hypocrisy here, as “so-called” is a pretty loaded phrase. ‘a’ is a decay parameter, no two bones about it.

A bit later you say that my parameters ‘c’ and ‘d’ are actually functions. I’m afraid you can’t do that. I may be Tweedledum to you, but it was my essay and my model equations and my coefficients. You may not like them, but you have to treat them on my terms. So I’ll reiterate them, explain their intention, and since you are a dimensioneer, I’ll add dimensions.

M(t) = b + cF(t) + dH(t-1) (6R)

H(t-1) = H(t-2) + e(M(t-1)-M(t-2)) = H(0) + e(M(t-1)-M(0)) (99R)

You didn’t include the second one, so I’ve labelled it (99R). M is temperature of dimension K, F is forcing flux of dimension W/m^2, H is heat content of dimension J (joules). Yes, when I wrote “heat content” I actually did mean that, not “heat flux” as you later describe H(t-1). Therefore the real numbered coefficients b, c, d, e have the following dimensions:

b: K

c: Km^2/W

d: K/J

e: J/K

Therefore de is dimensionless, and a = 1-de is a perfectly sensible dimensionless decay rate.

Those two equations are just a model, an attempt to penetrate what might be going on in either the climate or the GCMs or both. I am trying to understand what happens to errors inside the black box GCMs, by opening them a little into grey boxes if you will. Do the equations make any sense, not in an exact physical way, but as a crude model of physical processes? Possibly they make more sense as anomalies rather than full values. So (6R) says that if the oceans have more heat H(t-1) than “average” then they can, and probably will, give up some of that heat to raise surface temperature. And (99R) says that if temperatures have risen over a period of time then some of that will have forced more energy to be stored in the oceans. I wrote “there may be some quibbles about it, but it shows a proof of concept of heat buffering leading to a decay parameter”. No doubt you will quibble, but I stand by the general concept of a buffer in the GCMs leading to a non-zero decay parameter. I can’t prove this; no modeller has come along and said “yes, Rich, that is what is going on”. Or the reverse; for me for now it is an open question.
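As a sanity check on the algebra of this grey-box model, (99R) can be substituted into (6R) numerically; a toy sketch (all coefficients assumed, nothing fitted to GCM output) shows the pair collapsing to a single recursion with decay parameter a = 1 - de and intercept f = b + dH(0) - deM(0):

```python
# Sketch with assumed toy coefficients: substituting (99R) into (6R)
# reduces the pair to M(t) = f + c*F(t) + (1-a)*M(t-1), a = 1 - d*e.
b, c, d, e = 0.2, 0.4, 0.003, 50.0      # d*e = 0.15, so a = 0.85
M0, H0 = 14.0, 100.0
F = [None, 1.0, 1.5, 2.2, 3.0]          # arbitrary forcing F(1..4)

a = 1 - d * e
f = b + d * H0 - d * e * M0

M, H = {0: M0}, {0: H0}
for t in range(1, 5):
    M[t] = b + c * F[t] + d * H[t - 1]   # (6R)
    H[t] = H[0] + e * (M[t] - M[0])      # (99R), advanced one step

for t in range(1, 5):
    reduced = f + c * F[t] + (1 - a) * M[t - 1]
    assert abs(M[t] - reduced) < 1e-12
```

The reduction is exact for any coefficient values; whether the resulting decay parameter says anything about real GCM internals is, as stated above, an open question.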

Then further down, for my anti-correlation term R3(t), you rightly say that I approve of Roy Spencer’s critique. You then say that we are both wrong because we conflate physical error with predictive uncertainty. That is a big topic. It lies at the heart of our differing approaches to uncertainty analysis, of what we want our emulators to say, or not to say, about GCM performance, and possibly even of what we mean by mean. As such it must await a later response.

Rich, in Response 2 you wrote, “But there was no mistaken derivation, with the typo in 2R corrected (t -> t-1).”

But mistake there is. And correcting to i->(t-1) doesn’t correct the first mistake; it just goes on to make another one. Summations go to the final step of the series, not to the penultimate step.

My criticisms of your Section B remain in place and unscathed.

You wrote, “you start with the words “so-called decay parameter a”. You have accused me in the past of being tendentious; you need to be a bit careful to avoid hypocrisy here, as “so-called” is a pretty loaded phrase. ‘a’ is a decay parameter, no two bones about it.”

Your decay parameter ‘a’ itself is tendentious, Rich. Yet another oracular presumption; more evidence, as if we needed any more, that your emulator is fundamentally unlike mine. Mine has no decay parameter and no persistence. Yours has both.

You wrote, “A bit later you say that my parameters ‘c’ and ‘d’ are actually functions. I’m afraid you can’t do that.”

I’m afraid right back that I can, as justified by mere inspection. Your ‘c’ converts a forcing into a temperature. It is necessarily a function. Your ‘d’ converts Joules into Kelvins. It is necessarily a function. They are your functions, but you clearly misunderstand them. They are not coefficients.

You wrote, “So I’ll reiterate them, explain their intention, and since you are a dimensioneer, I’ll add dimensions.”

b: K

c: Km^2/W

d: K/J

e: J/K

Such a convenient after-the-fact set.

So, now let’s look at your final emulator equation,

M(t) = b + cF(t) + dH(t-1)

and put in the units:

Temperature at time ’t’ = M(t) = b(K) + (Km^2/W)·F(t)(W/m^2) + (K/J)·H(t-1)(J).

Your (K/J)*H(t-1)(J) provides the Kelvins of the prior step entering into subsequent time-step ’t.’ Your (Km^2/W)*F(t)(W/m^2) adds the Kelvins produced by the new forcing of time step ’t.’

Prior temperature plus new added temperature = total new temperature. There is nothing left for your term ‘b(K)’ to do.

The F(t) and H(t-1) terms account for the entire new temperature at every new step. Necessarily and in every step, b(K) = 0.

Now let’s look at ‘d’ and ‘e’: d = K/J, while e = J/K. They are exact reciprocals. de = ed = 1. Therefore, “de is dimensionless” is true because de = 1, and your “a = 1-de is a perfectly sensible” becomes a = 1 – 1 = 0.

Therefore your (1-a)M(t-1) term becomes (1-(1-1))M(t-1) = 1·M(t-1).

Meanwhile your f = b+dH(0)-deM(0) becomes 0 + M(0) – 1·M(0) = 0.

And finally your entire emulator M(t) = f + cF(t) + (1-a)M(t-1) becomes M(t) = 0 + cF(t) + M(t-1). It has no decay term.

So new temperature M(t) = prior temperature + new delta temperature. That’s your whole emulator, Rich.

And let’s note it’s not like mine at all. My emulator has no temperature terms to the right of the equal sign.

Your emulator is not a generalization that can be reduced to mine. It is the physically trivial: new temp = old temp plus change in temp.

You wrote, “I am trying to understand what happens to errors inside the black box GCMs…”.

I used observed and published GCM calibration error to propagate uncertainty. Errors made within the GCM are irrelevant. Error that appears in the output is my critical focus.

Your focus is not mine. Why would you think your quest has any relevance to mine?

I have now spent considerable time answering your criticisms. They have invariably been ill-considered. I’m of a mind not to spend any more time on them.

Rich responded:

OMG, J/K and K/J are dimensions, not values! d in units of J/K times e in units of K/J equals d*e in units of 1, and de is not in general equal to 1. d and e are real numbers with stated dimensions, a = 1-de is a real dimensionless number.

It is clear from all you have written that you are a fine physicist and chemist, with a probing independent mind, but mathematics is a weak point, yet mathematics is indelibly woven into uncertainty theory so I, for one, cannot trust it in your hands.

I’m going to persevere a little longer with the huge swathe of things you have written against my essay, but I may soon follow you in like mind not to converse further. How can chalk talk to cheese? Mathematics should be a common language, but between us it seems not to be so.

Rich.

After the 2nd paragraph above I put a [/ad hominem] marker to indicate I realized I was being ad hominem there, or merely truthful depending on POV and sensitivity. That marker got lost.

Rich, you wrote, “OMG, J/K and K/J are dimensions, not values! d in units of J/K times e in units of K/J…”

You got it backwards, Rich. Your d is K/J and your e is J/K.

Honestly, Rich, sometimes you get in over your head without even knowing you’re near a lake.

Let’s look at the relevant equations:

M(t) = b + cF(t) + dH(t-1)

and

H(t-1) = H(0) + e(M(t-1)-M(0))

In the first, d must convert H(t-1) in Joules into the appropriate magnitude of Kelvins. Doing so requires the proper conversion factor. That factor can only be your ‘d,’ because ‘d’ is all you have provided.

In the second, ‘e’ must convert delta-M in Kelvin to Joules. Doing so requires the proper conversion factor. That factor can only be your ‘e,’ because ‘e’ is all you have provided.

So, ‘d’ converts J to K, while ‘e’ converts K to J. They are necessarily perfect reciprocals; de = ed = 1.

Going further, ’d’ supposedly converts Joules to Kelvins. So, you just blithely give it dimension K/J. But proper conversion requires, e.g., knowing the heat capacity of the substance.

There is no such thing as J/K or K/J outside of a material context.

Your H(t-1) in Joules is presumably the heat content of the atmosphere. So, you’d need the atmospheric heat capacity, which varies with the humidity. The units of specific heat capacity are J/(g-K), symbol Cg. Converting H(t-1) in Joules into Kelvins requires dividing by Cg times the mass of the material: d = 1/(Cg × wt), with units (g-K/J) × (1/g) = K/J. The K/J is there, but its value is fixed by the heat capacity and mass of the atmosphere. It is not yours to assign.

Likewise, you made ‘e’ to be J/K merely by assignment. Suiting your convenience, but with no physical meaning. There, you want to convert Kelvins to Joules. To do that, you’d need Cg times the mass right back: e = Cg × wt, with units (J/g-K) × g = J/K.

So d = 1/(Cg × wt) and e = Cg × wt. The material context forces them to be exact reciprocals: de = 1, and your decay term vanishes, exactly as I said.
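One way to give the J-to-K and K-to-J conversion factors physical meaning is to build them from a specific heat capacity Cg (J/g-K) and a mass; then d and e are exact reciprocals by construction. A sketch with placeholder values (Cg and the mass are round illustrative numbers, not real atmospheric data):

```python
# A J<->K conversion only exists in a material context: a specific heat
# capacity Cg in J/(g*K) and a mass in grams fix the factors' values.
Cg = 1.0        # J/(g*K), placeholder (moist-air values vary with humidity)
mass = 5.0e21   # g, placeholder mass of the material

d = 1.0 / (Cg * mass)   # K/J: converts Joules of heat content to Kelvins
e = Cg * mass           # J/K: converts Kelvins back to Joules

# Built this way, d and e are forced reciprocals, so the decay factor vanishes:
a = 1.0 - d * e
assert abs(d * e - 1.0) < 1e-12
assert abs(a) < 1e-12
```

Whatever material values are plugged in, d·e = 1 and a = 0 follow automatically, which is why the choice of dimensions alone cannot rescue the decay term.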

Life gets tough when one must keep track of the details, doesn’t it.

You wrote, “d and e are real numbers with stated dimensions, a = 1-de is a real dimensionless number.”

Not the way you wrote them, they’re not.

For d & e to be dimensionless, you have to write your equations this way:

M(t) = b + cF(t) + d × H(t-1)/(Cg × wt)

and

H(t-1) = H(0) + e × Cg × wt × (M(t-1)-M(0)),

where ‘Cg’ is the specific heat capacity (J/g-K) and ‘wt’ is the mass of the material (the atmosphere in this case). In that formalism, ‘d’ and ‘e’ become dimensionless scale factors.

However, in your usage you’ve opportunistically assigned ‘d’ and ‘e’ conveniently reciprocal dimensional units. Their values must then carry the material conversion, de = 1 follows, and your decay term vanishes all over again.

All of that just shows you don’t know what you’re doing.

Even taking your original equations as representational with implied conversion factors, your emulator has no correspondence with mine. It casts no illumination on my work. Your entire critical effort has been an irrelevant exercise.

You wrote, “It is clear from all you have written that you are a fine physicist and chemist, with a probing independent mind, but mathematics is a weak point, yet mathematics is indelibly woven into uncertainty theory so I, for one, cannot trust it in your hands.”

I’m a physical methods chemist, not a physicist. Our debate is not mathematics but science. I know some of the one, but you know none of the other. And that’s been the crux of the entire problem.

It is ironic that you say I am weak in mathematics, when it is your efforts that have been rife with careless math mistakes.

You have yet to show any math mistake in my work or in my discussions of the mathematics of uncertainty. Instead, you have invariably tried to turn every notion of systematic error into the statistics of random numbers. This, despite repeated demonstrations that you’re mistaken. The weakness is yours, Rich. You plain will not accommodate the hard-won analytical methods of science.

A more accurate rendering of our difference might be:

“It is clear from all you have written that you are a fine statistician, but physical error analysis is a weak point. Uncertainty theory must accommodate science in the analysis of non-normal error, in the demands of accuracy, and in the expansion of ignorance across sequential error-ridden calculations, all of which you are loath to do. So I, for one, cannot agree to its stifling at your hands.”

You wrote, “the huge swathe of things you have written against my essay”

One huge swathe deserves another, Rich.

You wrote, “How can chalk talk to cheese? Mathematics should be a common language, but between us it seems not to be so.”

Our conversation concerns science, not mathematics. Our common language should be science. But you insist on stuffing the conversation into your statistical box.

Math serves science around here, Rich. It provides our grammar. It does not provide our content.

You have invariably insisted that math alone — your statistical math — defines the content of science-based physical error analysis. It does not.

You have brought your chalk into the land of cheese. You — you, Rich — have repeatedly insisted that our cheese is your chalk. It’s not.

Science is not the branch of statistics you intend it should be.