Guest Post by Willis Eschenbach
In the comments to my post called “Inside the Acceleration Factory“, we were discussing how good the satellite measurements of sea surface heights might be. A commenter said:
Ionospheric Delay is indeed an issue. For Jason, they estimate it using a dual frequency technique. As with most everything in the world of satellite Sea Level Rise, there is probably some error in their estimate of delay, but it's hard to see why any errors don't either cancel or resolve, over a very large number of measurements, to a constant bias in their estimate of sea level — which shouldn't affect the estimate of Sea Level Rise.
Keep in mind that the satellites are making more than 1000 measurements every second and are moving their “target point” about 8km (I think) laterally every second. A lot of stuff really will average out over time.
I thought I should write about this common misunderstanding.
The underlying math is simple. The uncertainty of the average (also called the “mean”) of a group of numbers is equal to the standard deviation of the numbers (a measure of how spread out the numbers are), divided by the square root of how many numbers there are. In Mathspeak, this is

Uncertainty of the mean = σ / √N
where sigma (σ) is the standard deviation and N is how many numbers we’re analyzing.
Clearly, as the number of measurements increases, the uncertainty about the average decreases. This is all math that has been well-understood for hundreds of years. And it is on this basis that the commenter is claiming that by repeated measurements we can get very, very good results from the satellites.
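For readers who would rather see that formula as code than as Mathspeak, here is a minimal Python sketch; the eight readings are invented purely for illustration.

```python
import math

def uncertainty_of_mean(values):
    """Standard error of the mean: sigma / sqrt(N)."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))  # sample SD
    return sigma / math.sqrt(n)

# eight invented readings of the same length, in millimeters
readings = [85.5, 86.0, 85.0, 86.0, 85.5, 85.0, 86.0, 85.5]
print(round(uncertainty_of_mean(readings), 3))  # about 0.15 mm
```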
With that prologue, let me show the limits of that rock-solid mathematical principle in the real world.
Suppose that I want to measure the length of a credit card.

So I get ten thousand people to use the ruler in the drawing to measure the length of the credit card in millimeters. Almost all of them give a length measurement somewhere between 85 mm and 86 mm.
That would give us a standard deviation of their answers on the order of 0.3 mm. And using the formula above for the uncertainty of the average gives us:

0.3 mm / √10,000 = 0.003 mm
Now … raise your hand if you think that we’ve just accurately measured the length of the credit card to the nearest three thousandths of one millimeter.
Of course not. And the answer would not be improved if we had a million measurements.
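To make the point concrete, here is a hypothetical simulation of the credit-card exercise (my own sketch, with invented numbers): ten thousand readings with about 0.3 mm of random scatter, plus a shared 0.2 mm offset because everyone is using the same slightly stretched ruler. The formula happily reports a tiny uncertainty of the mean, but the average never gets closer to the true length than that shared offset allows.

```python
import random
import statistics

random.seed(1)
TRUE_LENGTH = 85.60   # mm, the hypothetical real card length
RULER_BIAS = 0.20     # mm, the same for every reading, so it cannot cancel

readings = [TRUE_LENGTH + RULER_BIAS + random.gauss(0, 0.3) for _ in range(10_000)]

mean = statistics.mean(readings)
sem = statistics.stdev(readings) / len(readings) ** 0.5

print(f"uncertainty of the mean per the formula: {sem:.4f} mm")                  # about 0.003 mm
print(f"actual error of the mean:                {mean - TRUE_LENGTH:+.3f} mm")  # about +0.200 mm
```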
Contemplating all of that has given rise to another of my many rules of thumb, which is:
Regardless of the number of measurements, you can’t squeeze more than one additional decimal out of an average of real-world observations.
Following that rule of thumb, if you are measuring say temperatures to the nearest degree, no matter how many measurements you have, your average will be valid to the nearest tenth of a degree … but not to the nearest hundredth of a degree.
As with any rule of thumb, there may be exceptions … but in general, I think that it is true. For example, following my rule of thumb I would say that we could use repeated measurements to get an estimate of the length of the credit card to the nearest tenth of a millimeter … but I don’t think we can measure it to the nearest hundredth of a millimeter no matter how many times we wield the ruler.
Best wishes on a night of scattered showers,
w.
My general request: when you comment please quote the exact words you are referring to, so we can avoid misunderstandings.
“A lot of stuff really will average out over time.”
A septic tank does that. Averages stuff, over time. But it’s still crap.
But it’s still crap.
Ha ha ha ha ha ha ha ha ha ha ha!
Can I have the Australian and Canadian rights to your quip Bruce?
“But it’s still crap”
Over time, that crap turns to natural fertilizer. All craps do that given enough time.
Well then, let’s use the model mean for fertilizer.
willis
What happens if the thing you are measuring is not a single thing (like the length of a credit card) but is itself the average of (say) 1000 things? In that case, would taking 1 million, or 100 million measurements make the average more accurate?
My understanding, and I could be wrong, is that such techniques are valid for multiple measurements of the same thing.
But not for multiple measurements of something which is in the process of varying over time, or which is different depending on which part of the thing you are measuring.
I know that a number of commenters in previous threads had this same argument regarding such things as the global temp, and averaging readings from different places or the same place on different days.
My understanding, which I freely admit may be wrong, is that you have to be measuring the same thing multiple times. If you have one ore sample, it is one thing, and results can be averaged.
You cannot average measurements taken one for each of one hundred samples. AFAIK.
Besides that, in every single class I ever took, you cannot report any result to more significant figures than the least number of sig figs in the measurements, whenever two or more measurements are factors in the final result. Because then you are multiplying errors and claiming precision you did not measure.
I have not read any of the comments yet, but I know this is always a huge debate whenever this comes up. This is my out of the gate thoughts only…and going by memories from a long time ago.
I do not think I have done this sort of work since I got an A in analytical chemistry in college, which was a long time ago.
I am looking forward to this comment thread, because I do not know nearly as much about this as Willis or probably numerous others here.
You’re right, Menicholas, and so is Willis.
The lower limit of accuracy is given by the resolution of the instrument. In the case of Willis’ ruler, the smallest division is 1 mm. The most careful measurement, using one’s most analytical eye-ball, is to 0.25 mm.
It won’t matter how many gazillion measurements are taken, the length of the credit card will never be better than, e.g., 86 ± 0.25 mm. And that’s assuming all the score-marks on the ruler are regularly spaced to 10-times that accuracy.
There’s another little interesting fillip about satellite sea level. Assuming all the error is random, it will average out as per Willis’ example: 1/√N. But if the true height of the sea level varies from place to place, then the measured heights will have different physically true values.
That means sea level will have a state uncertainty because its height — its physical state itself — varies. So, the reported sea level should be reported as mean±error, and mean±(physical variation). Unless all the physical variations can be shown to be normally distributed about the mean.
Provided the measurement instrument has the accuracy levels claimed of it, and you deal with operator error.
So in this case, to what level of precision is the ruler marked, is 1 mm really 1 mm through the whole range being used, and what measures were taken to deal with factors such as eyesight?
These really matter when you start to make great claims about either the data or the accuracy of the information.
You can try the average route, but you are still averaging the numbers without knowing any errors, and hoping, not knowing, that the process covers the errors.
Remember Wittgenstein’s Ruler: Unless you have confidence in the ruler’s reliability, if you use a ruler to measure a table, you may also be using the table to measure the ruler.
The ENSO state uncertainty comes to mind.
It can be seen that the multivariate index, or sea level, shows a range of some 5 “normalized values”. It would be interesting to know how these values translate into meters, but this is not random variation.
Theoretically this variation is achieved by ENSO causing more rain to fall over the ocean and less over land, causing particularly South American tropical land droughts.
The Rossby and Kelvin alternating reaction waves bouncing back and forth across the basins, and the trade wind stacking are other state uncertainties at annual scale.
The rule works if errors are perfectly random, so that with millions of measurements nearly every measurement of +0.251 is cancelled by a −0.251 measurement. You can’t round measurements to the nearest whole number and expect the rule to still be true. And even if you did each measurement to many more significant figures, the rule assumes there is no systematic error, e.g., a temperature change during the measurements that has contracted the ruler and the card differently. A bit silly to just assume that when applying the rule to a million measurements, even if they don’t vary spatially or with time.
Thank you Pat and Robert for the replies.
Very good point about systematic errors…I was thinking the same thing…the ruler could be defective, or it could be at a far different temperature than when it was manufactured.
The picture could have been taken from an angle, instead of square on, etc…
Another thing I wanted to mention was being careful to use consistent terminology, and to use the words that mean what one is attempting to communicate, in particular the distinction between accuracy and precision.
In common parlance, these terms are interchangeable, but we know that is not the case when discussing measurements, and errors, and statistics, and such. Mentioning this for any readers who may not be familiar with this distinction, which is anything but a trivial one. But also to remind myself, because even though I know this, I still find myself using the wrong words sometimes in my haste to get my thoughts typed out.
About the sea level measurements…we know for sure that the level of the sea is not anything like a symmetrical oblate spheroid, and for several reasons this is true. The actual shape of the Earth is called the geoid. I will post a link to an image of this shape. It is a bumpy, lumpy, and irregular shape. And sea level is even more so, given that sea level is defined in terms of the gravimetric field of the Earth at each given spot of the surface, and this varies tremendously, as mentioned and depicted briefly in the Minute Physics video on sea level.
I am not even gonna pretend I have any special insight based on knowledge of the method used by the people interpreting the satellite data…but given all of the comments over the past few days from people who evidently do have such knowledge of at least some aspects of the method and possible confounders, and what I have gleaned about what “sea level” even means…I think I would doubt the results of satellites over tide gauges and old photos of known landmarks, even if I thought the people in charge of the entire process were as unbiased as a person could possibly be…which I do not.
I do not know if there are different versions of this…but most of the images of the geoid look more or less like this…using this one because it is a rotating gif:
http://2.bp.blogspot.com/-2o3hXVPI2EM/VnTlaYLmeVI/AAAAAAAAuEA/fDXIf3NA8yE/s1600/geoid.gif
I was working on a complex analogy, and realised that it was summed up by Robert’s point that each error of +0.251 is balanced by a −0.251 if the errors on each side of the measurement are balanced.
However, is this discussion really relevant to satellite measurements? If we are discussing tide gauges then yes, measurements are taken to the nearest millimeter (or tenth, or whatever they use). Satellites will measure to much greater precision than that. I just hope that the average was taken before the measurements were rounded.
It’s been a while since I read up on it. I remember that the average distance to a swathe of surface is measured with a precision of ±2 cm, with a method of judging what the conditions of the area are and choosing the right model for the wave conditions. You really can’t expect the law of large numbers to fix up any systematic errors.
Haven’t read the other responses yet, but my first reaction is, you are correct. My background is geology (so last century), but we had to take chemistry and then had geochemistry classes. Don’t get me started on the geophysics professor who had no concept of sig figs.
I took a bunch of geology classes, starting with physical, and then took one called Geology of the National Parks, and also took Earth history. At that time I was thinking I would pursue a degree program called Interdisciplinary Natural Science, but mostly I was just taking classes in subjects I wanted to know more about and was not really thinking about a degree.
So I also took physics, history of science classes, zoology and other biology classes, and some other classes that were heavy on earth history and the history of science.
Then I started to take more classes in physical geography, since I wanted to study the weather and those were prerequisite to meteorology and climatology and hydrology.
But then I found out that physical geography, meteorology and all of those classes were not considered natural science classes, they were in the humanities dept, and I had so many science credits it was impossible to get a degree in any of those subjects without taking over a year of extra classes. At that point I had a lot of chemistry classes, and it seemed very easy to me and hard to other people, and also it seemed like pretty much everything is based on chemistry and physical chemistry at some level, so chemistry it was.
If I had been getting good advice or I had in mind getting a degree from the get go that was going to result in a wide choice of readily available jobs, I would have been in the engineering dept, or premed/medical school.
I was just about to decide on what graduate degree to pursue when I found myself under intense pressure to help out with the family business, the plant nursery, when my dad got a brain tumor and was unable to do anything and was facing a tax nightmare. Previously, I was just building the place for them on my weekends and Holidays and a couple of summers…no plans to have anything to do with the biz, but I was the only one in the whole clan who knew anything about construction, so I just built and built until I had put up 80,000 square feet under shade and glass.
I was getting letters of employment offers from everyone from the Navy (civilian nuclear tech on a sub) to the EPA, and the nursery thing was supposed to be just seasonal, then a year.
By the time I realized it, I had been out of school so long, could not decide on what to pursue or where to do it (I had been accepted to Penn twice…family legacy) and then wound up renovating historical buildings and shooting pool and womanizing.
Life is never what you think it will be, at least it was not for me.
There is no way to answer your question AndyL; this comes back to the issue of the salary for the Greenpeace employees. Once you start abstracting things you need to know the distribution of the thing we are talking about.
In the Visa card situation you will get most around the same sort of number, and a few who make complete mistakes etc., so you can apply averaging knowing that you will get closer to the answer.
Wages, for example, can have a heavily Cauchy distribution, with management in one grouping of salaries and pleb workers in a different grouping of salaries. An average in such a situation is meaningless in most analysis because it represents a salary no one actually gets. No matter how many samples you take, it doesn’t improve things, because you need to first understand the distribution.
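A quick sketch of that point, with invented salary numbers for a hypothetical firm: in a two-humped distribution the mean can be a value that hardly anyone actually earns, and taking more samples only pins that unrepresentative number down more firmly.

```python
import random
import statistics

random.seed(0)
# an invented firm: 90% of staff near $40k, 10% of managers near $200k
salaries = ([random.gauss(40_000, 3_000) for _ in range(9_000)] +
            [random.gauss(200_000, 20_000) for _ in range(1_000)])

print(round(statistics.mean(salaries)))    # about 56,000 -- a salary almost nobody gets
print(round(statistics.median(salaries)))  # about 40,000 -- close to a typical worker
```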
What Willis did not make especially clear in this post is that he is describing the precision of the standard deviation, not the precision of the measurement of card length.
So yes, if you collect a million measurements and they all fit within the described range of 1 mm, then the precision of the standard deviation becomes very tight indeed. But the standard deviation itself does not grow smaller … it remains 0.3 mm.
Same with any other statistical description based upon real world measurements, or conditions.
And amazingly, the SD from 36 measurements (samples) is almost always the same as the SD when taking a million measurements.
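A quick sketch of that claim, with simulated ruler readings (the 85.6 mm true value and 0.3 mm scatter are invented): the standard deviation barely moves between 36 readings and a million, while the formula's standard error of the mean keeps shrinking.

```python
import random
import statistics

random.seed(2)

def sd_and_sem(n):
    """Standard deviation and standard error of the mean for n simulated readings."""
    readings = [85.6 + random.gauss(0, 0.3) for _ in range(n)]
    sd = statistics.stdev(readings)
    return sd, sd / n ** 0.5

for n in (36, 1_000, 1_000_000):
    sd, sem = sd_and_sem(n)
    print(f"N = {n:>9,}: SD = {sd:.3f} mm, SD/sqrt(N) = {sem:.5f} mm")
```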
if you are measuring say temperatures to the nearest degree, no matter how many measurements you have, your average will be valid to the nearest tenth of a degree … but not to the nearest hundredth of a degree.
This assertion is probably wrong.
Because the statement clearly assumes that there are only 10 distinct markings between each degree (C or F or whatever) on the scale. However, one can always subdivide the scale into a hundred, a thousand or even a million markings (increasing precision). Once thousands of observers make millions of recordings of a set temperature (let’s say freezing water at sea level), the uncertainty given by the above formula for the Standard Error of the Estimate (SEE) will be smaller than a hundredth of a degree. However, that does not mean the measured value (the mean) will be exactly 0 degrees C, i.e., accurate.
In other words, there seems to be confusion about accuracy and precision in the above statement.
Chris, my point is that beyond a certain limit, increasing the number of measurements doesn’t increase either the real-world accuracy or the real-world precision. All it changes is the statistical uncertainty, not the real-world uncertainty.
Best regards,
w.
“doesn’t increase either the real-world accuracy or the real-world precision”
I don’t really understand what these two terms mean. I have never heard of “real-world” metrics before.
However, I agree with you that there is a limit to what can be measured. It is called the Planck length, 1.6 × 10⁻³⁵ m. Perhaps you meant this as your “real-world precision”.
You are both correct, in a way, but also incorrect. I’ve had a bit of training in metrology. You both seem to be confusing accuracy and precision. The limit on the precision of any measurement [or aggregation of measurements] is the precision available to you provided by the measuring tool. The limit on the precision of the ruler in the image is 1mm. You cannot measure tenths of a millimeter, or any fraction of a millimeter, with a tool that is not marked in fractions of a millimeter. If you take a thousand measurements with that ruler, or one, you will never measure more precisely than ± 1mm. Similarly, if your radar altimeter, on one measurement, reads to the nearest tenth of a millimeter, your aggregation of fifty million measurements taken with that altimeter will never be more precise than ±.1mm. Whether it is accurate or not is another question, requiring comparison to a measurement taken of a known standard.
Brings to mind the same issue with the “warmest the Earth has been in history by 0.04 °C” garbage we heard about a couple of years ago (or was it last spring). Up until the mid 20th Century, meteorological temperatures were measured to the nearest degree (F in the case of Central England, which has the longest record) because greater precision didn’t matter. And until the late 20th century the tenth of a degree was eyeballed (estimated). I know, I was an observer. There were no computers to number crunch and make predictions, and plus or minus one degree didn’t matter. Claims of warmer or colder with greater precision than one degree F (1 degF is more precise than 1 degC) are pure garbage!!!!!
“If you take a thousand measurements with that ruler, or one, you will never measure more precisely than ± 1mm.”
This is not actually correct. Andrew Preece’s comment (way, way below) gives a example from the world of electronics of how it is done.
When Kip posted a column about exactly the same thing, I created a math experiment in Excel that you can repeat for yourself. You can create 1,000 instances of measurements, with randomly distributed errors around a specific known value. Round each individual measurement to the nearest round number. You can still estimate the mean with surprising accuracy. More measurements produce better estimates. With 1,000 measurements, you’ll almost always be within two decimal places.
This is the fundamental assumption where theory departs from reality. The reality is that measurements are rarely evenly distributed around the value you want to measure.
If a measuring device is capable of measuring with say 3 values, 1, 2 and 3 and you’re wanting to measure 2.75 then you’ll probably actually measure 3. If on the other hand you’re trying to measure 2.5 then you might do it if there are evenly distributed values of 2 and 3 coming from the device.
” Never heard “real-world” metrics before.”
Measure the thickness of printer paper, I’ll provide the yardstick.
Measure the thickness of 500 sheets of printer paper 400 times.
That’s more relevant.
Then try to figure out “why” the differences are present:
Too much pressure on the caliper?
Too little pressure sometimes?
Different air masses between the 500 sheets of paper in each ream?
Different number of sheets in different reams?
Different sheet thicknesses between reams and between paper mills?
Different calipers?
Bad or changing caliper measurements?
Caliper differences in successive measurements?
Different users for the calipers?
Different humidity when measured?
but remember! The alarmists would have you believe none of these occur in the sea level measurements!
Instead of using credit cards or sheets of paper as an analogy (even though you have shown how many things can vary, even with something as straightforward-seeming as measuring an object that seems fixed and definite), maybe a better analogy would be trying to find out how much cats weigh, and if they are gaining weight over time, and if so, is the weight gain accelerating?
Suppose one had set up scales all over the cat world and the weight of cats was measured as they ran or walked across it.
And someone else was driving around with a very accurate laser scanner which had been cleverly designed to measure the shape and size of cats very precisely, and had also devised an algorithm to translate these readings into a weight.
Or even something simpler…how long are cats from nose to tip of tail? And like the sea and the tides, the darn things are never holding still, and can change the shape and length of their tails, and do so continuously.
And then I graphed all the results…and on the page with the graphs I said all of the values were averages which had removed seasonal and daily variations to correct for meal times and cats that got more to eat during certain times of year.
I like comparing trying to measure cats with trying to determine the level of the never-still sea surface! Very apt analogy
SR
“…the darn things [cats] are never holding still…” Oh then you’ve never met my slacker cat. She will sit in your lap for hours on end.
Andrew Stanbarger December 20, 2018 at 4:02 pm
You have also forgotten “repeatability” of the instrument.
Ever done Gauge Capability Studies?
” beyond a certain limit, increasing the number of measurements doesn’t increase either the real-world accuracy or the real-world precision.”
Number of measurements increases precision but not always accuracy. You can be precisely inaccurate.
You really should take a stats class sometime.
In general, the S.E. (the S.D. divided by (approximately) the square root of N) dictates the significant figures of the measurement. You can easily add one or two decimals with 30 measurements. And with 1000 measurements, go crazy.
That said, you do have to use a little judgment (no math formulas involved), but we have extended really crappy measurements to nice statistical differences all the time. It’s not rocket science. Just stats 101.
If you measure the height of the two sides of your favorite table enough times, you will always find them different (unless you use a ruler calibrated in cm only, with no tenths of a mm).
trafamadore December 20, 2018 at 3:41 pm
Accuracy is how close successive measurements are to the true value.
Precision is how close successive measurements are to each other.
Please explain, using my credit card example above, how taking more measurements of the credit card will make the measurements either more accurate or more precise.
w.
The more measurements the more confident you can be that the actual value is within a certain range.
However, the size of that range is determined by the physical nature of the measuring process and equipment.
They are two different things. But they are both called uncertainty because they are both related to the chance that a measurement isn’t reflective of the actual value.
Hope that helps resolve the confusion.
Actually, with your credit card example, if the true value was 85.8 mm, more people will get 86 than 85 and even fewer 87, and even fewer 84 and 88. So even though the ruler was calibrated in mm, you can easily increase precision of the measurement.
But I think you are just being purposely obtuse about this; you could easily look that up in a stats book.
Willis, as long as the errors are randomly distributed, it is possible to determine the true value with surprising accuracy.
In a cell in Excel, create a “true value” between 85 and 86mm. Copy that number down into a column 1,000 rows long, and in the next column use the random function to create measurement errors and then add the errors to the true value. (Create random numbers between 1 and minus 1. Or heck, make the errors between 2 and minus 2.) In the next column, round all the measurements to the nearest whole number.
Now you have 1,000 measurements, all of which are wrong. How close do you expect the average of the 1,000 incorrect measurements to be to your true value? If you take 5 minutes to do this, I guarantee you are in for a surprise.
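Steve O's Excel procedure translates directly to Python. In this sketch (the 85.73 mm true value is invented) the errors are uniform between −1 and +1 and every reading is rounded to the nearest whole millimeter; because the scatter is symmetric and wider than the rounding step, the average of 1,000 wrong readings does land close to the true value.

```python
import random
import statistics

random.seed(3)
true_value = 85.73  # mm, invented for the demonstration

# add a uniform error between -1 and +1, then round to the nearest whole millimeter
readings = [round(true_value + random.uniform(-1, 1)) for _ in range(1_000)]

print(statistics.mean(readings))  # typically within a few hundredths of 85.73
```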
Steve O, isn’t it possible you learned something surprising about Excel’s “Random()” function?
trafamadore December 21, 2018 at 6:33 am
I agree … but only so far. First, despite your snide comment to me about taking a statistics class, you cannot increase precision by repeated measurements. Precision is the standard deviation of the measurements, which generally doesn’t change much with repeated measurements.
And while you can increase the accuracy by the means you describe, you cannot do so indefinitely. You’ll get another decimal of accuracy out of your procedure, but not another three decimals of accuracy. WHICH IS WHAT I SAID!
“Purposely obtuse”? Perhaps you and your friends play underhanded tricks like that. I do not, nor do I appreciate being accused of that kind of behavior. I tell the truth as best I know it.
w.
Steve O December 21, 2018 at 7:51 am
Steve, I’ve done that a number of times, starting back when I first got hold of a computer.. So despite your “guarantee”, that is absolutely no surprise to me at all.
Stop assuming I don’t know what I’m talking about, and start thinking about the example I gave.
w.
Willis, you posted :
“Accuracy is how close successive measurements are to the true value.
Precision is how close successive measurements are to each other.”
Well, I was taught that “accuracy” is how closely the reported measurement represents the true value of the parameter being measured. On the other hand, “precision” just represents the number of digits used in the numerical value being reported.
Thus, for example, a mechanical caliper may report a length measurement of 6.532 inches and yet be much more accurate than a scanning electron microscope that reports the same object feature as being 6.54798018* inches long, if the SEM has been incorrectly calibrated or has an undetected electronic failure. (*Note: modern SEMs can indeed achieve resolutions of one nanometer, or about 4e-8 inches.)
So, precision actually has nothing to do with accuracy or the number of times a given parameter is measured. And accuracy is not necessarily related to the precision of the measurement, but it can be improved by statistical analysis of repeated measurements at any given level of precision.
It is the combination of using highest precision within the known/calibrated range of accuracy of the measuring device that is of utmost importance to “truthful” value reporting, whether it be for a single measurement or multiple measurements of the same parameter.
Trafamadore,
if the true value was 85.8 mm, more people will get 86 than 85
That’s totally true. You can get SOME increased accuracy, but not ANY increase just by increasing the number of measurements.
In your example, you can be sure that your average will be closer to 86 than to 85. But it is NOT true that, by increasing the number of measurements, you will get exactly 80% 86s and 20% 85s. It may be 80%, or 70%, or 90%. You won’t change that by taking more and more measurements. You could increase precision to about 0.1 mm as Willis said, but not more than that, being realistic.
This doesn’t mean that the maths are wrong. What it means is that the condition for the maths to work (perfectly evenly distributed errors) never happens in the real world.
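Nylo's caveat is easy to demonstrate with the opposite case to Steve O's experiment earlier in the thread: here the true value is 85.8 mm (invented for the demonstration) but the scatter in the readings is much smaller than the 1 mm rounding step, so virtually everyone writes down 86 and the average stays 0.2 mm off no matter how many readings are added.

```python
import random
import statistics

random.seed(4)
true_value = 85.8  # mm, invented for the demonstration

# everyone reads the ruler to within about 0.05 mm of the truth, then rounds to 1 mm
readings = [round(true_value + random.gauss(0, 0.05)) for _ in range(100_000)]

print(statistics.mean(readings))  # 86 -- a hundred thousand readings, still 0.2 mm off
```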
IF SATELLITES are capable of measuring
sea level to the nearest millimeter,
or even to the nearest centimeter,
I’ll eat my hat (a scientific term).
And I would say the same thing
if the oceans had no tides,
and no waves.
In my opinion, there is no accuracy,
and 100% uncertainty, with the
satellite methodology!
Of course after the “proper adjustments,”
the conclusion will be the usual:
“It’s even worse than we thought.”
The real question is whether sea level rise
is causing property damage on the shorelines,
such as on the Maldives Islands … where
investors are building new resorts
like money was growing on trees there.
Who is getting hurt by sea level rise now?
Who is getting hurt by global warming now?
The correct answer is “the taxpayers” —
getting fleeced by leftist con men,
scaremongering about
(harmless) global warming
and sea level rise.
Satellite sea level data
does not pass the “smell test”
(another scientific term)
My climate science blog:
http://www.elOnionBloggle.Blogspot.com
In climatespeak, “I’ll eat my hat” translates to “is not a robust methodology”.
It is incredible how the smell test, the nose, makes scientists look blind.
Smell is the most ancient sense. The smells of peppermint and caraway, quite distinct, are yet of two molecular enantiomers.
Even we, nowhere near a cat, can actually smell chirality (molecular handedness).
There is more to measurement than meets the eye!
The satellites are sensing through a plasma, the atmosphere, of highly variable electromagnetic polarizability. Ironically, they may be telling us more about the atmosphere, or about ocean surface physics, than about sea level. It would be an expensive scandal if that data were thrown away in a mad pursuit of height accuracy.
“Jason-2 flies in a low-Earth orbit at an altitude of 1336 km. With global coverage between 66°N and 66°S latitude and a 10-day repeat of the ground track, Jason maps 95% of the world’s ice-free oceans every ten days. Sea surface height accuracy is currently 3.4 centimetres, with 2.5 expected in the future.” — source: https://www.eumetsat.int/jason/print.htm#page_1.3.0
The “Inside the Acceleration Factory“ article linked in the above article’s first sentence states that the C&W data analysis gives a SLR slope of 2.1 +/- 0.5 mm per year for the last 20 years, using a large amount of data from one or more spacecraft instruments (presumably the Poseidon-3 dual frequency altimeter that is on Jason-2, or something with similar accuracy), having at best a 25 mm accuracy.
As to how anyone can assert that satellite radio altimetry (independent of GPS use) above oceans is accurate to +/- 1 mm or better . . . go figure.
Richard Greene:
If satellites weren’t capable of measuring
sea level to the nearest millimeter,
or even to the nearest centimeter,
why did they build and deploy them?
Even though the oceans have tides.
And waves.
The area struck by the radar beam
is many waves wide.
It can’t see them.
Averaging enough points
can handle random noise.
It is systematic error
we must fear.
The GPS in your cell phone
will soon be accurate to 1 cm.
The first Grace satellites
measured their separation to 1 um
over 220 kilometers with microwaves
The second generations with visible lasers
should be much more accurate.
LIGO detected gravity waves
by detecting motions of
1/10,000 the width of a nucleus.
1/1,000 wasn’t good enough.
You are talking out of your hat.
Now eat it. (It doesn’t pass the smell test.)
For those who don’t know,
the satellites are being calibrated
by measuring the distance down
to sites with sea level known by GPS.
Using one set of sites for calibration
and a second set for validation
might work.
The potential for systematic error
is high.
I take real-world measurements at my holiday cottage, the address of which is:
1 Derwater Street
Tipping Point
Maldive Islands
Merry Christmas to all! Glub glub glu…..
“Beyond a certain limit” – what defines that “certain limit”? One can imagine many experiments where you can get very high precision with very crude tools. For instance, think of Buffon’s needle to determine many digits of π. It is basically binary precision that leads to multi-digit precision.
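Buffon's needle is a nice illustration of that point: each observation is only a coarse yes/no (did the needle cross a line?), yet with enough genuinely random throws the estimate of π keeps sharpening, if only at the slow 1/√N rate. A hypothetical Monte Carlo sketch, with the needle length set equal to the line spacing:

```python
import math
import random

random.seed(5)

def estimate_pi(n_throws, needle=1.0, spacing=1.0):
    """Buffon's needle: estimate pi from coarse hit/miss observations."""
    crossings = 0
    for _ in range(n_throws):
        centre = random.uniform(0, spacing / 2)       # distance from needle centre to nearest line
        theta = random.uniform(0, math.pi / 2)        # needle angle relative to the lines
        if centre <= (needle / 2) * math.sin(theta):  # does the needle cross a line?
            crossings += 1
    return 2 * needle * n_throws / (spacing * crossings)

for n in (1_000, 100_000, 1_000_000):
    print(n, estimate_pi(n))  # the estimate of pi slowly gains digits as N grows
```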
Of course there were none of the subdivisions of a degree in 1850 when humans were visually eye-balling liquid mercury thermometers in a few sparse locations.
So to pretend that we know what the “global temperature” was in 1850 to a tenth, or a hundredth, of a degree is completely absurd, dishonest and completely unscientific.
Not to mention the number of sensors was 2 to 3 orders of magnitude too few to even begin contemplating such a thing, even if they were accurate to 0.001C.
ChrisB,
Not sure, even with my glasses, if I could differentiate the markings of the scale were it “subdivided to a hundred, thousand or even a million markings.”
“In theory there is no difference between theory and practice; in practice there is.”
No one could read a scale with the divisions you suggest. And you mention a confusion about accuracy and precision. Your divisions could be precise, but not accurate.
PhilR: “No one could read a scale with the divisions you suggest. ”
There is a tool called an electron microscope. With it one can measure a distance of 43 picometers, which is roughly twenty billion times smaller than a meter. https://www.youtube.com/watch?v=eSKTFXv5rdI.
As for the discussion about precision and accuracy please google it. You’ll be surprised what is out there.
So what? You do not use an electron microscope to measure sea level or temperatures, in fact not for anything “large” either.
So how accurate or precise is an electron microscope at measuring something 1 metre across?
Willis’ article refers to the practical limits of measurement accuracy; the Planck length is a theoretical limit on the measurement of distance which you will never achieve.
In Willis’ article the ruler itself has practical limits of measurement accuracy; it, for example, expands and contracts with temperature, as does the Visa card. So no matter what number you finally agree on, it is only representative of a certain temperature. We have all made an assumption that the ruler, which was probably printed in China, is actually the right scale, which is why most countries have standards bodies to oversee devices that measure things.
So there are practical limits to the measurement accuracy of any measurement equipment, and no, averaging does not improve that limit, because the errors lie outside the measurement distribution.
The point about a ruler and eyesight is that you are measuring to 1 mm and guessing or estimating anything less.
It is you who is mixing up terminology Chris. Subdividing the scale will increase resolution, not precision. Precision relates to repeated measurements. A meter could have excellent resolution (resolution is the smallest signal change that it can detect) and poor precision, although in my experience meters with excellent resolution tend to have excellent precision. The word precision is not recommended for use anymore because so many people mix up what it means. Instead the term Repeatability is recommended.
I recommend the ISO Guide to the Uncertainty of Measurement to everyone interested in this topic. An excellent guide to it with loads of examples is the SAC Technical Reference 1.
“Precision relates to repeated measurements.”
I disagree. Let’s say I measure a coin with digital calipers and record its value as 20.2 mm, which may be all I need to assist in verifying its authenticity. But in reality, I could record the full readout of 20.1825 mm displayed on the caliper’s digital scale. The first numerical value is less precise than the second numerical value but both represent the exact same measurement.
I don’t need any repeat measurement to establish a given level of precision.
And the above measurement scenario tells you absolutely nothing about how accurate those numerical values are unless you know the digital caliper has been recently calibrated (or you yourself just did so) against a known standard at some time before or after that measurement.
“When values obtained by repeat measurements of a particular quantity exhibit little variability, we say that those values are precise” (Les Kirkup & Bob Frenkel, ‘An Introduction to Uncertainty in Measurement’, 2006, page 33, section 3.1.9).
The old ‘the errors cancel out’ anti-argument. I bet those that claim that would believe the same even if guessing the future from goat entrails. You just need many goats for accurately guessing the future 🙂
The same line of thinking goes for averaging wrong results from computer models. Why don’t they just pick results at random then average them, if they think the errors will cancel out no matter what?
On a less funny note, the central limit theorem has its limits.
Even then, as Kip Hansen pointed out here, MOST of the time, you aren’t going to get ANY reduction in uncertainty:
https://wattsupwiththat.com/2017/10/14/durable-original-measurement-uncertainty/
It only occurs under certain strict conditions, which in real-world climate data are very seldom present.
Thanks, Lonny, an interesting post.
w.
Excellent explanation.
Those who have used slide rules easily understand this. A simple way of realizing / demonstrating this is to perform a moderately complex calculation on a slide rule (one with more than three steps) as you normally would. Write down the answer. Next, perform the first step of the calculation and write down the result of that step. Slide the slider back and forth, then set the slide on the result. Do the same for the rest of the steps, each time reading and writing down the result of the intermediate step, sliding the slide back and forth, then resetting to that intermediate result to obtain the next result. Even with problems that only have two or three intermediate steps, the final result is drastically different. A clear example of why this happens is when you divide the circumference by the radius: the index is EXACTLY on PI. If your slide rule does not have a clear mark for PI you will never put the slide in the correct spot.
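The slide-rule effect is just rounding error accumulating through the intermediate steps. Here is a hypothetical Python sketch, with made-up numbers, that rounds every intermediate result to three significant figures (roughly one slide-rule reading) and compares the chained result with the answer kept at full precision until the end.

```python
def sig3(x):
    """Round x to three significant figures, roughly one reading of a slide rule."""
    return float(f"{x:.3g}")

a, b, c, d = 12.34, 0.4567, 78.9, 3.1416   # made-up inputs

exact = sig3(a * b / c * d)   # keep full precision throughout, round only the final answer

step1 = sig3(a * b)           # write down (round) each intermediate result...
step2 = sig3(step1 / c)
chained = sig3(step2 * d)     # ...and the small roundings pile up

print(exact, chained)         # 0.224 versus 0.225 with these made-up numbers
```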
There are also problems from the fact that the satellite is moving, thus measuring a different spot on the ocean in a different portion of the swell, a different wave height, and even a different ocean level, through a different wavy atmosphere.
Only my dad could use a slide rule. Messed with them a bunch when I was a kid, but could never figure how it worked.
I still have one NEW, in its box! My son, who is in the Oregon National Guard discovered them throwing one out and brought it home to me. I still remember how to multiply, divide and do logs but not much more. I will keep it, one of these years it will probably be worth a bunch of money.
There is a web site for that.
Look it up.
I have a yellow metal Pickett Dual Base ~1961
Yours should have the name of the maker, copyright year, & model #.
Slide rules as a topic cycle through about every 2 years.
Last time, I spent an hour looking at sites.
Give it a try.
I still have the slide rules I used in the first two years of mechanical engineering, nuclear physics, statics, dynamics, and reactor design. I competed in slide rule and mathematics in high school though, so they are well-used. Definitely not “in the box” shape!
And I have a Carmody Navigational Slide Rule. Never actually got past multiplication and division though, just kept using Norie’s Tables.
But at Nautical College before my Mate’s exam we had a lecturer who used one all the time. We set him up. During an exercise someone looked up from his paper and asked as if he had forgotten, “What is the square root of four?” Out came the slide rule and the answer “two nearly” had the class in stitches. The rule didn’t get used nearly as much after that.
Usurbrain
My slide rule had PI clearly marked but it only showed up when the circumference was divided by the diameter or twice the radius! : ]
Correct! My Bad.
To make this more like climate science. Let’s make that measuring 1000 different credit cards using 1000 different rulers by 1000 different people.
Would these measurements increase the overall accuracy at all?
Unlike thermometers, the readings from rulers won’t drift over time if they aren’t re-calibrated regularly.
Depends. What’s the ambient temperature and the coefficient of expansion of the ruler?
That all depends on the ruler material’s sensitivity to environmental conditions.
Wood: heat and humidity
Plastic: heat and humidity
metal: heat
laser: everything
They will drift because they expand and contract with temperature, and many will be printed in China with a “that is close enough” attitude 🙂
“To make this more like climate science, let’s make that measuring 1000 different credit cards using 1000 different rulers by 1000 different people. Would these measurements increase the overall accuracy at all?”
Only after the data was properly adjusted of course…
Mark W,
Exactly!
MarkW
It is worse than you suggest. The credit card doesn’t change length over time, whereas the temperature is never the same even in the same place. Thus, the randomness of the measuring error doesn’t come into play.
I think weighing all the cats in the world to determine if they are fattening up at an accelerating rate, may be a robust analogy.
Sooo, with respect to satellite sea level measurements, what uncertainty can be expected?
Also, I am curious why the rate of sea level change as measured by satellite seems always to be a factor of 2 higher than that measured by tide gauges. Can it be a reflection effect, like an echo, where the change in distance that the satellite measures is twice the change in sea level?
There was a thread a while back showing that satellites are measuring different rates in different areas, I dimly recall noting that they in general showed higher levels of rise in the middle of the oceans than they did close to shore. I can think of a few ways that this might actually be physically possible given all the cycles at play and the short observation time we have. But let’s put that aside for the moment and assume it is true.
If the processes at play express themselves first (or more at this point in time) in the deep ocean than they do close to shore, then tide gauges, which by definition are located at the shore, would show less rise than would the satellites which are looking at a lot more area.
…Indonesia has had a positive anomaly for over 30 years
It now looks like this…
http://www.yohyoh.com/img_upload/server/php//files/4a6bf8fdb0df8a17f7f3dc1fa88cba99.jpeg
Yay!
Waves are big in the open ocean, and are very chaotic.
Tide gauges, I do think, are in protected harbors and such, and in places where the sea is not as rough.
Measuring the height of a kid that is holding at least somewhat still is easier than one who is running around a schoolyard while you are measuring her.
Menicholas,
Not only are tide gauges often in harbors, but they are intentionally designed to dampen the effects of waves, and primarily be sensitive to the low-frequency tides. There is nothing in the open ocean to dampen the waves of any frequency.
Originally the satellite showed a decline in sea levels. And, as they did with the ARGO data, they adjusted it to fit their preconceived ideas — the adjustment is called GIA and based on the belief that tectonic activity was hiding the “real” sea level rise.
This of course raises the question: if the sea isn’t rising where we live and where the tide gauges are, why should we care?
Well, yes, the GIA adjustment is clearly bogus and has to be removed to get Eustatic (apparent) sea level. But it’s only about 12% (roughly 0.35 mm/yr). The rest of the discrepancy between tidal gauges and satellites is a mystery.
When the original readings did not agree with what they were expecting, they went in and looked at everything very carefully and identified some things that were out of whack or improperly calibrated, or maybe they rewrote the algorithm, or all of the above.
Then they got a result which not only showed what they wanted to find, but found that it was worse than we thought!
Jackpot!
They then stopped looking for things that might be out of whack or miscalibrated and stopped looking for better algorithms.
Just a hypothetical, but this is how confirmation bias, and climate science, works.
From a satellite sea level measurement perspective the biggest problem is waves. You are trying to measure the surface of something that has these lumpy things on it. Those waves by definition distort the very surface you are trying to measure locally; the more of them, the more they distort.
Many years ago, on my first day in CHEM 101 class, the instructor told us we would have points deducted for any lab or test result where we express an answer to a calculation with a higher degree of accuracy than the least accurate measurement that was used in the calculation.
This story makes me think of the announcements saying the current year is the hottest ever because the average temperature is 0.00023 degrees higher. If the weather stations providing these measurements have an accuracy of only 0.05 degrees, someone is full of bull waste products.
Do we need to send these people back to undergrad classes to remind them of the basic rules of math and science?
“Do we need to send these people back to undergrad classes to remind them of the basic rules of math and science?”
That would be a waste of time and money, as witch doctors don’t follow the basic rules of math and science. But we do need to keep reminding readers that the reason ‘climate scientists’ don’t follow basic freshmen chemistry science rules is because they’re witch doctors, and NOT scientists!
As applied to climate studies, a better analogy would be a large group of riflemen, shooting at different targets. All the riflemen have varying skills, and all the rifles/ammunition have differing inherent accuracies.
So would combining the results give any better understanding of the odds of any one target being hit?
Do those satellites used for sea level measurements ever measure something that is at a known level, just for a reality check?
Every time a satellite passes over a harbor, why not take a reading, at the same degree of accuracy, at that moment?
SR
My understanding is that they are checked by measuring the level over lakes, where waves and tides are not much of a problem. However, the same issues persist.
w.
Wouldn’t that mean the lake level, used for calibration, is dependent on the tide gauge base data for the lake?
I wonder if the satellites say those lake levels are accelerating?
SR
Wouldn’t that also mean the satellite reading would necessarily be constrained to the same degree of accuracy as can be had from the lake’s gauge?
SR
If it is then the uncertainty of the tide gauge must be included as part of the uncertainty calculation for the satellite measurement. Thus the satellite measurement uncertainty figure must be higher than lake tide gauge uncertainty figure.
If they are using lake surfaces for reference, how do they know what the level of the lake is? Prior to GPS this was all done with eyeballs and transits. And even GPS has a prolate ellipsoidal error envelope. It has about 6 meters radial horizontal accuracy and 10-20 meters vertical accuracy.
That’s one reason inexpensive drones won’t be making deliveries any time soon.
https://www.gps.gov/systems/gps/performance/accuracy/
Large lakes often have their own “tide gauges”. Here’s a link to a web site for the Lake Champlain site at Rouses Point New York. https://water.weather.gov/ahps2/hydrograph.php?wfo=btv&gage=roun6
And another for a site at Burlington, VT https://nh.water.usgs.gov/echo_gage/measurements.htm
There would be all sorts of problems with trying to use these as calibration points. All the problems of tide gauges less tides themselves but plus changes in lake level due to precipitation, evaporation, river inflow and outflow. Plus which, the level of any point on the lake varies with which way the wind is/has been blowing.
I honestly don’t know how/if satellite RAs are calibrated against surface targets. There are many papers on the subject and they concur that it’s a tough problem at the accuracy levels required. My vague impression is that they are checked periodically to make sure they are not drifting off into fantasy land, but that there’s no overt correction for RA calibration error.
I’ll add it to my lengthy list of things to look into.
Had some surveying done for property values this Fall. He just used a GPS pole for legal surveys. Next year use your smartphone!
“Smartphones’ GPS systems are going to start getting a lot more accurate. According to IEEE Spectrum, Broadcom is starting to make a mass-market GPS chip that can pinpoint a device’s accuracy to within 30 centimeters, or just under one foot. That’s compared to today’s GPS solutions, which typically have a range of three to five meters, or up to 16 feet away.”
https://www.theverge.com/circuitbreaker/2017/9/25/16362296/gps-accuracy-improving-one-foot-broadcom
There are specific sites around the globe with their own floating laser-reflective targets which have GPS surface fixes.
I know Lake Issykkul (Kyrgyzstan) has one; Jason 1 and 2 had a floating target in Bass Strait off Tasmania in Australia, which I assume Jason 3 also uses.
Jason 3 is now flying the same orbital path Jason 2 used to so I imagine it uses the same calibration sites.
So the algae and seaweed and dirt growing underneath and on top of the calibration site buoys didn’t change the “elevation” of the reflectors over those many years in the middle of Kyrgyzstan?
The buoy is GPS positioned, what you are saying makes no sense????
All that matters is that the buoy is floating on the water surface; are you saying it isn’t?
The buoy is floating on a water surface. The reflectors on the top of the buoy are held above the water level, obviously tilting as the buoy jerks and moves on its anchor chain above the bottom as random waves go by, and as wind tilts the buoy slightly.
As the buoy gets physically heavier (as with every marine object with a waterline and anchor chain supporting growing biologics and dirt and scum), the buoy sinks down. This requires daily cleaning of the whole surface, if you’re going to claim a “sub-millimeter” calculation accuracy in the RESULT of the “measurements” made from your instrument that is calibrated against a moving, irregular source. A laser reflector on the moon, on the other hand, IS capable of giving higher accuracy, because the only things moving it are “moonquakes” and aliens.
I could add: imagine the lake got lifted or dropped 100 m in an earthquake. The GPS reading on the floating buoy measures the new position. Jason 3 sees the new height of the lake, and it should match the buoy position from its own GPS as up/down 100 m, and so we know the satellite is in calibration. The only way it can fail is if the buoy isn’t floating.
As for accuracy, Jason 3 clearly states its accuracy as 2.5 cm RMS, so I don’t know where you get sub-millimeter accuracy. Climate science models things from the data, but that has nothing to do directly with Jason 3.
LdB
No, earthquakes move the surface anywhere from 2 cm (mag 4, 5 or 6) to 2-3 meters sideways (mag 7.5+) and “maybe” 1/2 cm to 1-2 meters vertically in limited areas around the quake. 100 meters of lake surface movement? No.
Look at the fence broken in the 1906 San Francisco quake, or the road busted a few days ago in Alaska. Underwater, even Japan’s 8+ quake disturbed the sea floor “only” a few meters over a long line.
Well, you see, regardless of what the satellite accuracy is, the result of the satellite readings is a sea level change “accelerating” from 2.1 mm/year to 2.5 mm/year, with a ‘trend line” being analyzed to 3 and 4 decimal places to find anything but a linear (or steady) trend. But, you see, an “acceleration” had to be found!
The sea level measurement and acceleration you are talking about are a different thing; they have nothing to do with the instrument doing the measurement. That belongs to the climate scientists.
At the moment your criticisms are all over the place like a rabid dog but you clearly don’t know how it all works. I suggest you do some reading so you can voice whatever complaint you want to make.
One of my favorites the first sentence in Chapter five of the IPCC’s 4th assessment report:
Climate Change 2007: Working Group I: The Physical Science Basis *
The oceans are warming. Over the period 1961 to 2003, global ocean temperature has risen by 0.10°C from the surface to a depth of 700 m
Really? They can measure the global ocean to within 0.10°C? Is that second zero after the decimal point really significant? Let’s see, 0.10 / (2003-1961) ≈ 0.002°C per year. They must have some very accurate data.
* That’s a WayBack Machine link. It seems that the people at the IPCC have recently decided to make it difficult to navigate their website. Could that possibly be by design?
“Regardless of the number of measurements, you can’t squeeze more than one additional decimal out of an average of real-world observations.” I was taught that you can’t get more accurate than the least accurate number used when averaging but I can see why you might push it out one decimal place. My caveat to that is while I’ve always enjoyed math I stayed away from statistics classes.
True, Darrin, the concept of “significant figures” seems to have been forgotten in climate science.
w.
Significant figures in climate science.
(To quote someone else about something else)
“They don’t know what it is, they just know what it’s called.”
“Significant figures in climate science.”
That’s the ilk of Mann et al isn’t it? Nothing to do with real numbers.
And to top in all off, the “sea level” that the satellites measure is not the “sea level” that human civilization needs to care about. I hate to claim it’s not possible, but I doubt even the most sophisticated deep ocean creature will notice if the mid-Pacific ocean depth changes by a meter or two. But in any case, we don’t. We care about sea level relative to shorelines where there is significant population and fixed infrastructure.
And in those cases, if sea level rise is a problem, it’s most likely due more to land subsidence than actual change in the ocean level.
The satellite data is interesting and no doubt useful for some things, but it isn’t what we should pay attention to.
There is one case where open-ocean measurement is critical: tsunami warning. Tsunamis start as harmless swell, until they arrive at some coastline. A lot of talk about buoy sensors went nowhere, afaik, after the big one in Indonesia. And there were strong rumors that a deep-sea ocean creature, a US Ohio-class boat, was severely damaged, rolled like a toy, and had to limp home.
Some of these are caused by major earthquakes, which as it happens also have warning signatures in the form of ionospheric effects, which ironically could in theory be monitored with the same satellite radar system (if the subtle effects are not averaged out). The “Fukushima” earthquake precursors were in fact measured; resistance to this earthquake-precursor “smell test” blindsides otherwise capable scientists, and gets a lot of people killed.
An excellent common-sense rule, Willis, thank you. Regarding sea level measurements: back when estimates for 2100 were a rise of twelve feet plus (Hansen?), I argued that if that is the worry, we needn’t run down to the sea with a micrometer; a yardstick will do, and if it’s worse than that, axe handles are sufficient for the job.
Also, for the global average temperature rise of 6°C which seemed to be the worry, a dozen thermometers scattered across the Arctic, where with polar enhancement we would see 2 or 3 times that warming, would be a perfectly adequate warning system. Had they done this in 1988, we would know by now that another degree at most is really all we are in for.
It would be instructive to know what all the temperature adjustments have done to the uncertainty of measurement. Indeed, the algorithms are changing previous temperatures as we speak. As Mark Steyn noted at a Senate hearing on data quality, ‘we don’t know what the temperature in 1950 will be by 2100, and yet certainty is high about 2100’s temperature’.
Even today 1950 is a lot colder than it was in 1950. Proof positive of global warming. The further we go into the future the colder the past becomes. Entropy in action. Eventually 1950 will be the date of the big bang.
“Even today 1950 is a lot colder than it was in 1950.”….LOL and true!
From my file of tag lines and smart remarks:
Climatologists don’t know what the temperature is today within ±0.5°C, and don’t know what it was 100 years ago, but they know it’s 0.8°C ±0.1°C hotter now than it was then.
CO2, is a temporal gas, it steals heat from the past and moves it to the future.
That deserves a house point, or two. Brilliant.
I am off to calibrate my hockey stick, I use it to measure climate change, I sometimes need to re-calibrate it, especially when historic matches have to be replayed.
CO2, is a temporal gas, it steals heat from the past and moves it to the future.
I’ve added that one to my smart remarks and tag line file.
Even today 1950 is a lot colder than it was in 1950. Proof positive of global warming. The further we go into the future the colder the past becomes. Entropy in action. Eventually 1950 will be the date of the big bang.
Yogi Berra would have been a climate scientist if he had been born 100 years later than he was.
“It gets late early out here”, “But the towels at the hotels are the best…I could barely close my suitcase”.
Accuracy depends on resolution. Uncertainty depends on distribution relative to the true statistic (so-called “normal”). The average temperature is highly time and space dependent, with a potentially large spread in even a narrow frame of reference.
And your point is… ?
Mosher, is that you?
Averaging measurements WILL increase resolution for a discretely sampled signal, as long as there exists an appropriate random dither noise.
See the excellent paper by Walt Kester: “ADC Input Noise: The Good, The Bad, and The Ugly. Is No Noise Good Noise?”, published in Analog Dialogue 40-02, Feb 2006.
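A minimal sketch of that effect in Python rather than in ADC hardware (the 85.37 mm “true” value, the noise level and the seed are made-up illustrations, not figures from the Kester paper):

```python
import numpy as np

rng = np.random.default_rng(42)      # arbitrary seed, illustrative only
true_length = 85.37                  # hypothetical "true" value in mm
n = 100_000                          # number of repeated readings

# Ruler with 1 mm resolution and no noise: every reading lands on the same mark
no_dither = np.round(np.full(n, true_length))

# Same ruler, but with random dither noise comparable to the 1 mm step
dithered = np.round(true_length + rng.normal(0.0, 0.5, n))

print(no_dither.mean())   # stays at 85.0 no matter how many readings we take
print(dithered.mean())    # converges toward 85.37 as n grows
```

That is Kester’s point: with suitable dither the quantization error behaves like zero-mean random noise and averages away; without it, averaging cannot recover anything finer than the 1 mm step.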
Willis wrote: “Regardless of the number of measurements, you can’t squeeze more than one additional decimal out of an average of real-world observations.”
That depends on what type of error you are dealing with. Using Excel or some other software, generate 10,000 numbers with a mean of 15 (°C) and a standard deviation of 1 (°C). Each value will have more than 10 digits to the right of the decimal point if you use an add-on for Excel. Calculate the mean and standard deviation of the 10,000 numbers you actually received, with all of their decimal places. Then make another column of data and round your values to the nearest 0.1. Take the mean and standard deviation of the first 100 rounded values, and separately of all 10,000 rounded values. Do you get closer to the “right” answer using 100 rounded values or all 10,000?
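Something like the following Python sketch reproduces that exercise (the seed is arbitrary and the printed values are only illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, illustrative only

full = rng.normal(15.0, 1.0, 10_000)    # "temperatures" with many decimal places
rounded = np.round(full, 1)             # recorded only to the nearest 0.1 deg C

print(full.mean(), full.std(ddof=1))    # close to 15 and 1
print(rounded[:100].mean())             # mean of the first 100 rounded values
print(rounded.mean())                   # mean of all 10,000 rounded values
```

Both rounded means land close to 15, but the 10,000-value mean is usually closer, because here the rounding error behaves like small random noise spread across the 0.1 °C steps.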
If you have random noise in your data, averaging a lot of values can give you a more accurate mean. If you have systematic errors in your data, your mean won’t get closer to the right answer. For example, if you fail to look straight down and always look at a slight angle, parallax could make all of your measurements systematically bigger or smaller than they should be. If you sometimes look slightly from the left and equally often slightly from the right, your measurements will have random noise in them, and averaging can help.
IMO, because there are so many such large adjustments in converting the time for the radar signal to return into a distance, the possibility of systematic error is the biggest problem. One large adjustment is for humidity, and it involves meters IIRC. The humidity data is determined from re-analysis, and re-analysis is based on a set of inputs that changes over time. A gradually changing small bias in humidity of 0.1%/yr appears capable of biasing the trend by 1 mm/yr. IIRC, there have been three major corrections of systematic errors in calculating SLR from satellite altimetry data.
Exactly. What Willis is saying is basically that in the real world you can never count on having completely eliminated every possible systematic error in your measurements. That only happens in the world of mathematics. Taking it for granted that less than 10% of the error of any measurement is systematic is a wrong assumption.
I’m confused. The commenter said, “…resolve over a very large number of measurements to a constant bias…”, which strikes me as the gist of the comment. I don’t think he was commenting on the accuracy of the measurement but on its precision. If the measurements are as precise as he seems to imply, then the inaccuracy should be consistent. You can usually correct for inaccuracy if your measuring device is precise.
This seems to be like what I call the False Precision fallacy. A true measure of accuracy (not precision) would be sigma/mean, but of course we cannot measure the mean accurately, which is why we are resorting to statistics in the first place.
Let’s generate a random number between 1 and 100, a million times. The mean would be 50, every time, whether we measured it to 1 significant digit or 100.
Sorry, Robert, not true. First, I think you meant a random number between zero and a hundred. Here’s the experiment:
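A minimal sketch of that kind of test in Python (the seed is arbitrary, and the printed numbers will vary from run to run):

```python
import numpy as np

rng = np.random.default_rng(1)              # arbitrary seed, illustrative only
x = rng.uniform(0, 100, 1_000_000)          # a million random numbers from 0 to 100

print(x.mean())                             # close to 50, but not exactly 50
print(x.std(ddof=1) / np.sqrt(len(x)))      # standard error of the mean, about 0.029
```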
Note that this is in line with the theoretical uncertainty of the mean as calculated above, viz. σ/√N = (100/√12)/√1,000,000 ≈ 0.03.
w.
Even the best random number generators are still just pseudo random number generators.
Sorry, Mark, but that is NOT the problem with Robert’s claim. The random number generator is giving numbers close enough to random.
The problem is that his claim goes against centuries of mathematical knowledge about the “standard error of the mean”. Even with N = one million, you don’t get exactly fifty as Robert claims.
w.
The length of the credit card is at least assumed to remain effectively constant. That is not the case for either temperature measurements or those of sea-surface height. They are both dynamic.
Bingo!
Willis,
I think you are confusing resolution (1 mm divisions on the ruler) with precision (how repeatable the measurements are). Also, you are using an example whose precision/repeatability is smaller than its resolution (everyone measures between 85 and 86).
I think if you redo your analysis with something that changes by more than the resolution of your measurement, you will come to a different conclusion.
For example, what if I wanted to estimate the average length of pencil being used by primary-grade students in a district with 100,000 students, by asking them to measure their pencils and report to their teacher to the nearest mm. And let’s say students will have up to a quarter-mm reading error (they might round x.25 mm up or x.75 mm down, but not worse).
In this example, sampling more students will still improve accuracy (assuming zero bias), because the variation in individual pencils will be much greater than the measurement precision, even if the average over the district changes very little. With enough students I can still get better than 0.1 mm accuracy (relative to the actual mean), even with measurements rounded to the nearest mm.
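A minimal Python sketch of that pencil scenario (the length range, reading error and seed are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)                    # arbitrary seed, illustrative only
n_students = 100_000

# Hypothetical true pencil lengths, spread over many millimeters
true_lengths = rng.uniform(60.0, 180.0, n_students)

# Each student mis-reads by up to 0.25 mm, then reports to the nearest mm
reported = np.round(true_lengths + rng.uniform(-0.25, 0.25, n_students))

print(true_lengths.mean())    # the actual district-wide mean
print(reported.mean())        # typically agrees to well within 0.1 mm
```

With 100,000 students the averaged rounding-plus-reading error has a standard error on the order of a thousandth of a millimeter, so the 0.1 mm target is comfortably met, provided there is no bias.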
David
Thanks, David. I’ve chosen that example because it is related to our measurements using thermometers. If the actual temperature is 85.37°F and we ask 10,000 people what the temperature is, that is very much like my example, and unlike your example.
Finally, your example only works if the errors are symmetrically distributed … which in the real world is generally not true. For example, when measuring extreme cold people will tend to round down, because it’s more impressive …
My best to you,
w.
I was not in the original discussion, but measuring ocean conditions such as wave heights can be misleading. I was on the deck of a very large derrick barge one time off the shore of the North Island of New Zealand, trying to get a fixed production platform piled to the ocean floor. The waves did not seem too high to me compared to other locations around the world. As I was pondering what to do, I noticed the wave pole on the fixed platform. The swell was 19 feet. I then timed the swell period, and it was 20 seconds. I then thought about the 4,000-mile fetch of open water and understood why one could not see the wave/swell height changing very much.
With the satellite moving so fast, I wonder how the surface measurement is corrected for wave heights, periods and fetches? There would be a difference between the middle of an ocean and the near-shore area. As an old sea salt, Willis probably understands this aspect well.
This is why we have tolerances on drawings/tooling: to eliminate guesswork and scrap.
In my position as a mechanical engineer for over 50 years, I had the distinction of being the evil one for rejecting mechanical parts that failed unusually tight tolerances on the drawing. Many times the problem was alleviated by loosening the tolerance where allowed, with an Engineering Change Order agreed to by all involved. No need to use a micrometer, let’s say, on a credit card. Or a yardstick on a precision part.
Regards, retired mechanical engineer
The issue is the absolute uncertainty of each measurement. If this error is consistently biased in one direction and/or not random, then the error will not cancel, and the error of the average converges to the bias. However, if the error is normally distributed around the actual values, then the precision of the average will continue to increase as the number of samples increases.
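A quick Python sketch of the difference (the 0.7 bias, the noise level and the seed are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)                # arbitrary seed, illustrative only
true_value = 100.0
n = 1_000_000

unbiased = true_value + rng.normal(0.0, 1.0, n)        # zero-mean random error
biased = true_value + 0.7 + rng.normal(0.0, 1.0, n)    # same noise plus a fixed 0.7 bias

print(unbiased.mean())   # approaches 100.0 as n grows
print(biased.mean())     # approaches 100.7 -- the bias never averages out
```

More samples shrink the scatter around those two limits, but they never pull the biased mean back to 100.0.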
Thanks, CO2, but that’s not the issue I’m highlighting. In my example, the error will be normally distributed, but we still cannot use a ruler as a micrometer no matter how many readings we take.
w.
Willis,
Yes, a ruler can’t be used as a micrometer by measuring one distance over and over and taking an average, but that’s not the case with satellite-based sea level measurements.
In the first case, the same instrument will always return the same measurement of the same thing, so there’s no variability to average out. However, if you add stochastic noise to the measurement whose variability is more than the precision of the measuring device and is centered on the actual distance, then there is variability to average out, and the precision will continue to get better as more measurements are made.
In the second case, the same instrument makes many measurements of different distances at different places at different times, and never measures the same thing twice. To the extent that the data is normally distributed around the steps of the measurement, the precision of the average will continue to get better as more measurements are made.
No one seems to have used the term “measurement resolution” in this discussion, but it is distinct from both accuracy and precision:
“In addition to accuracy and precision, measurements may also have a measurement resolution, which is the smallest change in the underlying physical quantity that produces a response in the measurement.”
https://en.wikipedia.org/wiki/Accuracy_and_precision