From Lockdown Sceptics
Stay sane. Protect the economy. Save livelihoods.
An experienced senior software engineer, Sue Denim, has written a devastating review of Dr. Neil Ferguson’s Imperial College epidemiological model, the model that set the world on its current lockdown course of action.
She appears quite qualified.
My background. I wrote software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects.
She explains how the code she reviewed isn’t actually Ferguson’s, but instead a modified version produced by a team trying to clean it up as a face-saving measure.
The code. It isn’t the code Ferguson ran to produce his famous Report 9. What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others. This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice).
She then discusses a fascinating aspect of this model. You never know what you’ll get!
Non-deterministic outputs. Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.
This problem makes the code unusable for scientific purposes, given that a key part of the scientific method is the ability to replicate results. Without replication, the findings might not be real at all – as the field of psychology has been finding out to its cost. Even if their original code was released, it’s apparent that the same numbers as in Report 9 might not come out of it.
Ms. Denim elaborates on this “feature” at some length. It’s quite hilarious when you read the complete article.
Imperial are trying to have their cake and eat it. Reports of random results are dismissed with responses like “that’s not a problem, just run it a lot of times and take the average”, but at the same time, they’re fixing such bugs when they find them. They know their code can’t withstand scrutiny, so they hid it until professionals had a chance to fix it, but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right.
Readers may be familiar with the averaging of outputs of climate model outputs in Climate Science, where it’s known as the ensemble mean. Or those cases where it’s assumed that errors all average out, as in certain temperature records.
Denim goes on to describe a lack of regression testing, or any testing, undocumented equations, and the ongoing addition of new features in bug infested code.
Denim’s final conclusions are devastating.
Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.
On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.
Would the same (or very similar) comments apply to Dr Mann’s work product – if anyone can find it?
M&M uncovered Mann’s amateurish attempts at PCA and statistics, to be charitable. No doubt Mann’s coding skills 20 years ago were on the same level… amateurish, if even existent. Nowadays, he can hire a knowledgeable coder to create his spaghetti code. But any major software project beyond a few tens of thousands of lines of code needs a qualified software architect at the beginning to stop the spaghetti factory from getting started, and to inject process, traceability, and regression testing on bug fixes and changes.
If you are going to test it, you need a verifiable historical record of the last pandemic, and that record must include all factors. The design phase should produce a specification covering such things as known person-to-person transmission and so on. For instance, you cannot know whether person-to-person transmission is decreased by social distancing unless you have confirmed data from the past. I doubt we have any of that, and since all models must be tested before you can rely on them, this model becomes a guess.
Excerpted from article:
“WOW”, ….. now that revelation really surprises me.
And here I was sure that those “coronavirus contagion” models …… were being produced/generated by those “climate modeling” computer programs.
When the “upgraded” and “improved” code is still a piece of crap, you know the original was really bad. Frankly I’ve dealt with such code a few times in my career. Ultimately I ended up determining what the data and processing requirements were and rewriting from scratch. Some things just can’t be fixed.
Even worse than a GIGO analysis using bad data would be a model that does “Good data In, Garbage Out”….a GIGO that seems to apply to this model
Bingo!
Accurate and well stated, Paul.
I have had several similar experiences.
No amount of searching and correcting ever finds all of the bugs/errors/’incorrect mathematical formulas’. Especially as authors of spaghetti code never bother to set up proper test data to ensure the program runs correctly.
Ergo, “just run it a lot of times and take the average”; the epitome of GIGO.
The problem is compounded thousandfold when amateurs include shortcuts where program control is diverted based on some local program condition.
This is commonly called spaghetti code because there is no organization or rationale to program logic, process and order.
Spaghetti code programs get worse over time as said amateurs patch bugs/fudge errors and hardcode datums into the software as fixes. Never throwing away the bad code in favor of properly designed software/databases/’test data’.
Yes.
“Academic efforts don’t have these people, and the results speak for themselves.”
Ouch! That’ll leave a mark.
That’s why the term ‘but that’s academic’ is used to point out when something isn’t real.
Is there a “Harry – Read Me” file ?
No but there’s an “Antonia – F*** Me” file.
This comment definitely applies to ModelE …
“that’s not a problem, just run it a lot of times and take the average”
“that’s not a problem, just run it a lot of times and take the MAX”
There, fixed it…
In an article at National Review, John Fund exposes a litany of failed alarmist predictions by Neil Ferguson. (https://www.nationalreview.com/corner/professor-lockdown-modeler-resigns-in-disgrace/).
What particularly caught my attention was… “Ferguson declined to release his original code so other scientists could check his results. He only released a heavily revised set of code last week, after a six-week delay.” A very Mannly approach to rigorous science.
As an expert in vascular biology said of Ferguson, “I’m normally reluctant to say this about a scientist, but he dances on the edge of being a publicity-seeking charlatan.” Again, very Mannly.
The article concludes, “So the real scandal is: Why did anyone ever listen to this guy?”
What is it about the media that leads them to promote alarmist predictions made by “scientists” shown by experience to be serial failures? It can only be that “journalists” of the modern, post-truth era care nothing about being recognized for their insightful analysis, but are focused merely on grabbing the attention of the public, which is where the money is: click-bait and eyeballs. Unfortunately 99% of journalists give the remainder a bad name.
“Unfortunately 99% of journalists give the remainder a bad name.” Good one. –AGF
The problem? Every single journalist (and climate scientist) believes himself to be in the 1%, not the 99%. That is the central problem.
“I’m normally reluctant to say this about a scientist, but he dances on the edge of being a publicity-seeking charlatan.”
A person who does this sort of slapdash thing really doesn’t do a good job of his core “expertise” either. I hope there aren’t innocent souls out there who have to rely on his vascular biology skills.
Steve McIntyre, a very gentlemanly and modest fellow, having been complimented on his takedown of the hockey stick said of the consensus climate “Team” after dealing with them and their work methods:
“It seems to me that most famous “amateurs” from the past were highly professional in their field. Nor do I find invocation of their stories very relevant since the sociology of the science enterprise has changed so much.
In my opinion, most climate scientists on the Team would have been high school teachers in an earlier generation – if they were lucky. Many/most of them have degrees from minor universities. It’s much easier to picture people like Briffa or Jones as high school teachers than as Oxford dons of a generation ago. Or as minor officials in a municipal government.
Allusions to famous past amateurs over-inflates the rather small accomplishments of present critics, including myself. A better perspective is the complete mediocrity of the Team makes their work vulnerable to examination by the merely competent.
– Steve McIntyre, Climate Audit Aug 1, 2013
Similar to academics and climate scientists applying thermodynamics and heat transfer they “studied” on Wiki or a 45-minute YouTube video.
https://en.wikipedia.org/wiki/Earth%27s_energy_budget
The referenced NASA energy budget is linked above.
According to the diagram 163.3 W/m^2 made it to the surface.
18.4 W/m^2 leave the surface through non-radiative processes, i.e. conduction and convection.
86.4 W/m^2 leave the surface through latent processes, i.e. evaporation and condensation.
That leaves 163.3-18.4-86.4-0.6 = 57.9 W/m^2 leaving as LWIR.
That’s it!
The energy balance is closed!
Fini!!
But what about!?
LWIR: 398.2 total upwelling – 57.9 from balance – 0.6 absorbed = 340.3??
An “extra” 340.3 W/m^2 have just appeared out of thin air!!!???
So where does this 398.2 W/m^2 upwelling “extra” energy come from?
Well, actually the 398.2 W/m^2 is a theoretical “what if” S-B heat radiation calculation for an ideal, 1.0 emissivity, Black Body with a surface temperature of 289 K or 16 C.
The SINGLE amount of LWIR energy leaving the surface has just been calculated by TWO different methods!! and then combined to effectively double the amount!!!!
398.2 is THEORETICAL!!!!!
340.3 is NOT REAL!!!
340.3 VIOLATES conservation of energy!!!!!
And, no, it is NOT measured except by the amateurs who don’t understand how IR instruments work or emissivity and assume 1.0 when it is in fact 57.9/398.2=.145.
There is no 398.2 upwelling “extra” energy, there is no 340.3 “trapping” and “back” radiating “extra” energy, no RGHE, no GHG warming and no CAGW.
BSME CU ’78 where thermo and heat transfer are formally taught, studied and applied.
And 35 years of power generation where thermo and heat transfer turn fossil fuel into electricity.
And where not understanding thermo and heat transfer equals loud unpleasant, expensive noises and unemployment.
The whole basis for global warming is where that extra energy comes from. No alarmist has ever shown where it comes from. As you say, it does not exist in reality.
The ‘extra’ energy comes from the feedback model which assumes an infinite, implicit source of Joules powering the gain. This, along with strict linearity, are the only two preconditions for using linear feedback amplifier analysis. both of which were ignored by Hansen and Schlesinger when they misapplied Bode’s analysis to the climate.
They then try to claim that the average not accounted for by the incremental analysis is the power supply. That the effect of the average is different from the incremental effect necessitating this bastardization of the theory presumes non linearity which precludes using feedback analysis in the first place! Two wrongs don’t make a right.
Not to mention the idea that you can somehow convert energy into temperature without invoking the gas law. CAGW is all a house of cards.
I want more information on Sue Denim. She gives her bio but a simple search is only getting hits on this particular article.
I am definitely curious about the history and identity of the author “Sue Denim” as well.
Say Sue Denim quickly.
Pseudonym. 🙂
Doh!
😄 Hi, cBob. Thanks for letting me know that you, too, didn’t figure that out. That was gracious of you. Hope all is well with you Mr. NOT a commie (hm…. that could be….. Mr. Natakami (hope that doesn’t mean something horrid in Japanese, lol)).🌷
I didn’t figure it out, though, clever Charles. I was very seriously searching in multiple ways for that scholar “Sue Denim.” “Susan Denim” didn’t work at all for some reason, heh.
Thanks Charles – I was a bit slow.
Crossed my mind as well….Pseudonymically speaking.
“Sue Denim” appears to be qualified to critique Ferguson’s code (i.e., her/his remarks, unless blatant lies, appear to be legit — at least, they are technically savvy enough to be worth considering), however, I don’t think that is her or his real name… .
The reason I think this is that I just found this….🤨
Sue DeNim is a pen name for a government employee who aspires to writing fiction.
July 16, 1979
https://books.google.com/books?id=oX2Qc-GVSAYC&pg=PA42&lpg=PA42&dq=Sue+DeNim,+software&source=bl&ots=laQ6AW45pB&sig=ACfU3U0Dv0NA5I9LGA–ZUKS2fu6wOfvdg&hl=en&sa=X&ved=2ahUKEwivrKK52aDpAhVLvZ4KHeUXBmQQ6AEwBXoECAkQAQ#v=onepage&q=Sue%20DeNim&f=false
Obviously a play on pseudonym. The hostage crisis has existed for years among people that want to keep their jobs, even POTUSes themselves.
Even Pierre Delecto…
Perhaps Sue Denim will identify herself or himself through these columns.
SF writer Harry Turtledove used to identify himself in his early fiction as “Tak Hallus” – the Arabic word for something like ‘pseudonym’!
Sue Denim – pseudonym?
You’ll have to be quicker yet to outpace Janice Moore, I’m always fashionably late to the party so I can see what other folks think. 😎
Pop! Am I glad to see you! :). You mentioned not feeling well awhile back and ever since, I have prayed for you. Glad to see you are, at least, well enough to comment. Better late than never! 🙂
And, thanks for the encouragement.
Take care and enjoy all those pretty Midwest wildflowers.
Cool shades, btw
In the last paragraph of https://lockdownsceptics.org/code-review-of-fergusons-model/ where this critique came from, the person did actually say
“My identity. Sue Denim isn’t a real person (read it out).”
So it required one to read the article, to grasp this fact.
That’s not his/her real name.
What is truly amazing is that it has taken all this time to actually have someone review the code that has cost the world trillions of dollars, and it turns out it is garbage. I have not reviewed the code, nor will I unless paid to do so. Yes, I have over 50 years of writing code and I still get paid to write code. I still believe the focus should be on treatment and on analyzing the best possible plan for each patient. But it seems that the only thing that counts is the optics of politics. Let the bad outcomes fall where they may.
Terry Bixler
May 6, 2020 at 6:22 pm
The code itself is innocent; the guy behind it, not so much.
And there is more than one “guy” behind and upfront of that code.
many many many…. (It seems his girlie too)
cheers
“This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice).”
I think the common term is “spaghetti code”? Like unravelling a bald eagle’s nest I imagine, a shitty and tedious job that no one wants to do. I bet she got paid to review it lol
It was a certainty.
Great post. Thanks. Maybe we can turn her loose on climate science and its dark statistics mysteries.
https://tambonthongchai.com/2018/12/03/tcruparody/
British MPs are taking notice.
Sue Denim – I wish she would review the climate models; I bet she would find the same or worse!
You need access to the climate models’ raw code – just ask Mark Steyn how little forthcoming Mr Mann has been.
Harry already has – http://di2.nu/foia/HARRY_READ_ME-20.html
Surprising that this did not kill the credibility of climate models, nor did the many critiques of how the models break most of the rules modellers are supposed to live by.
While politicians follow the advice of the likes of Ferguson there seems little hope of sanity prevailing
“[The model] attempts to simulate households, schools, offices, people and their movements, etc.”
Taking into account how little we know about the epidemiology of the virus itself, the “model” introduces a whole additional set of arbitrary parameters. Not a way to achieve usable results. A post-normal “science” in all its ugliness.
Article Summary:
What the actual F#@ur momisugly% have we been doing, using this model?
500,000 deaths in the Imperial world but 30,000 or so in the real world.
Ferguson has £ billions to answer for, along with the significant and growing number of deaths of people untreated or too frightened to seek treatment for life-threatening, non-Covid diseases.
As far as I can ascertain, this model was also used and supported by the prof (King, or whatever he was, who died last week) to make the decision to slaughter healthy rare breeds and people’s pets and livestock in the foot and mouth outbreak.
Animals recover and get well, although it is ugly and painful while the scabs are there.
What the nation lost was damned near unrecoverable bloodlines and lifetimes of breeding.
His death tolls then were orders of magnitude wrong, just like now.
Sorry, that was Lord May, an ex-Aussie we were well rid of!
“that’s not a problem, just run it a lot of times and take the average”
Where have we seen that before? Let me think………
Head in oven + feet in liquid helium=comfy person!
One article I’m not going to drill down into. Still have PTSD from my years in the trenches…
I will just reiterate my previous response here.
The excuse for the model’s non-deterministic behavior is that the model is stochastic. OK then.
If the model is truly stochastic, it should be backed up with reams and reams of data. You can tell if a program is properly stochastic. I have to assume that Sue Denim knows the difference.
Describing a buggy apparently useless model as stochastic is simply brilliant. Don’t they say that a well developed sense of humor is the sign of a great intellect? My hat is off to whoever thought to call that model stochastic.
They use the word “stochastic” when they should use the word “non-deterministic”.
The former will give values conforming to a statistical distribution when input parameters are varied slightly. That’s ok.
The latter will give different results even for the same inputs. That is not ok. There can be non-determinism due to multi-threading but in this case it is due to bugs. That is, the code is WRONG.
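To make the distinction concrete, here is a minimal C++ sketch (illustrative only, not code from the Imperial model): a properly stochastic model driven by a seeded pseudo-random generator reproduces its output exactly for the same seed, so any run-to-run difference with identical inputs points to a bug.

    #include <cstdio>
    #include <random>

    // Toy "epidemic" draw: stochastic, but fully reproducible for a given seed.
    double toy_model(unsigned seed) {
        std::mt19937 rng(seed);                        // fixed seed -> fixed stream of draws
        std::normal_distribution<double> contacts(2.5, 0.5);
        double total = 0.0;
        for (int day = 0; day < 100; ++day)
            total += contacts(rng);                    // pseudo-random, yet repeatable
        return total;
    }

    int main() {
        std::printf("run A: %f\n", toy_model(42));
        std::printf("run B: %f\n", toy_model(42));     // must match run A exactly
        std::printf("run C: %f\n", toy_model(43));     // different seed -> different sample
        return 0;
    }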
I’ve written code for 30 years. I still get paid (handsomely) to code. I wrote scientific code for a PhD thesis in Astrophysics. I reviewed and modified lots of other institutions’ code as well. Scientific code is garbage compared to what the Free Market/industry produces using modern software engineering techniques.
I looked at the Github repository for the code even after Microsoft had come through to clean it up. Absolutely terrible and I’d fire people who produced that.
Worst of all, not only are there zero unit tests, but the author seems not to know that those have been a minimum standard for 15 years.
The politicians and bureaucracies were too incompetent to know how bad the software is – especially given the stakes involved for Billions of people and Trillions of dollars.
Anyone who did understand software engineering who promoted anything based on this garbage should be prosecuted for criminal conspiracy to defraud.
Yesterday I downloaded the source code from GitHub and fed it into Microsoft Visual Studio Community 2019. To my surprise, it compiles with only a few warnings. A quick inspection of the code reveals a number of issues, probably noted by the “Sue Denim” group as well. A couple of note:
1) There are a number of compiler / IDE warnings about type conversions (double to float, 8-byte to 4-byte) that could lead to loss of precision or overflow (arithmetic errors).
2) Tons of hard-coded char array declarations, i.e. “char buf[65536]” that can lead to memory management issues, and is a rather lazy coding practice. (VS2019 presents warnings to that effect, no need for the programmer to search for the problem.)
I didn’t do a thorough code inspection, as it can be difficult to evaluate code written for an unfamiliar project at the best of times, and the cluttered code makes my eyes glaze over. But it seems that a first step should be to deal with the warnings from the IDE and the compiler.
This simulation could probably make for a nice science project, but it’s in no condition to be used as a basis for government policy.
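For readers who have not used Visual Studio, here is a hypothetical C++ fragment (not taken from the repository) showing the two warning patterns described above and the usual remedies.

    #include <cstdio>
    #include <cstring>
    #include <string>

    void narrowing_example(double transmission_rate) {
        float r = transmission_rate;   // implicit double -> float conversion: the compiler warns
                                       // of possible loss of data; fix by keeping double or
                                       // casting explicitly with static_cast<float>()
        std::printf("%f\n", r);
    }

    void fixed_buffer_example(const char* line_from_file) {
        char buf[65536];                       // large fixed stack buffer, no bounds checking
        std::strcpy(buf, line_from_file);      // silently overflows if the input is longer
        std::printf("%s\n", buf);

        std::string safer(line_from_file);     // sizes itself; cannot overflow
        std::printf("%s\n", safer.c_str());
    }

    int main() {
        narrowing_example(0.333333333333333);
        fixed_buffer_example("short input line");
        return 0;
    }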
Perhaps not so surprising if the following is true:
“What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others.”
It would be interesting to run it through an AQSC analysis. At my company we are required to do this every time we modify code before checking it back in. A lot of the time it’s just nit-picky, but from time to time it finds a real problem.
When I get some time, I’ll examine the code and conclusions of another group called “The London School of Hygiene & Tropical Medicine”. The city of Ottawa is using their models and on April 8 stated “the number of cases in Ottawa is likely between 11,000 and 34,000.”
https://ottawa.ctvnews.ca/ottawa-s-top-doctor-can-t-say-when-covid-19-will-peak-as-models-suggest-up-to-34-000-cases-in-ottawa-1.4888091
Today, May 7, the reported total in Ottawa is 1,579 laboratory-confirmed cases, and the medical officer is claiming we’re “in the post-peak period.”
It looks like the LSHTM model’s predictions are much too high:
https://www.lshtm.ac.uk/newsevents/news/2020/estimates-relative-impact-different-covid-19-control-measures-uk
I certainly would not hold up Microsoft as an example of a company which can fix buggy code.
I don’t have time now (and probably won’t have) to do a proper read of this. What I will say is that stochastic models are a Good Thing for certain problems. They are used in certain disciplines (the wikipedia article you reference is about insurance, but that’s only one of the disciplines).
You run a stochastic program multiple times, and it randomly chooses data points each time (or to be precise, pseudo-randomly, since computers can’t do real random choices). Because you choose different sets of data points on each run, you *expect* the results to differ.
Why would you create such a program? One possibility is that there is too much data to input all the data points and compute something; the program would go on forever. Another possibility is that the complete set of data points itself represents a sampling of the real world, and using the whole set would give you results that might depend too closely on what that original sample was.
Another way to think of stochastic modeling is that you have some idea what the Brazilian butterfly did, but you don’t know *exactly* what he did. So you try a bunch of different flutterings of that butterfly’s wings, so as to get an idea of the range of effects you might see, how closely or how widely they cluster around a single outcome, and just how unlikely certain events are.
Let me amplify that last point. It may be that the butterfly in Brazil doesn’t have much effect on storms in Montana. If that’s the case, then a good stochastic model should give nearly (but not exactly) the same result every time. But if that butterfly really does have a substantial effect, then the model ought to give substantially different results each time. If that happens, it’s not an indication that the model is bad, it means the initial conditions strongly affect the outcome–and you really do want to know that.
So as I said, stochastic models are a good way to understand certain kinds of situations. Whether they were in this case, I have no idea. But simply saying that the program is wrong because it gives variable results is a Bad Misunderstanding of what stochastic models are. If they give the same result all the time, then something is wrong with the program.
But — following the article, the complaint was NOT about stochastic calculations. Valid stochastic models give the same results each time if the inputs are exactly the same, no? The discussed model is reported as giving different results over multiple runs, each using the exact same input values.
I first became aware of stochastic models in the 1970s and then managed to completely ignore them so I have no first hand experience. That said:
So, a stochastic model doesn’t appear to be the same as just using statistics. If I use statistics to calculate how many phone lines are needed between two cities, I’ll get the same results every time. link
The point is that these “stochastic” models are not random. They are pseudo-random. The inputs are known and controllable – only the outputs are created by the model.
They run the model over and over with slightly different starting parameters and the output provides an envelope of scenarios.
If the envelope is narrow then the model is predicting low sensitivity to starting parameters.
If the envelope is wide then the model is predicting high sensitivity to starting parameters.
So far so good. The models are investigating our understanding of the real world and so can be compared to the real world.
BUT… this is different. The variation of up to 80,000 deaths is not due to the starting conditions. It’s due to which computer you happen to run the model on.
Now that’s the problem. Because the real world is not run on a computer (or at least, no-one is swapping the computer over day to day).
The model output is not reflecting our understanding of the real world. It is confounded by gibberish.
A related aside, is anyone using TRNGs or QRNGs to drive the random functions in statistical modelling?
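A rough sketch of that “envelope” idea in C++ (purely illustrative, with a toy growth model standing in for the real thing): the same deterministic code is run with slightly perturbed starting parameters, and the spread of outcomes is reported.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Stand-in for a full epidemic model: deterministic given its starting parameter.
    double run_model(double r0) {
        double infected = 100.0;
        for (int week = 0; week < 10; ++week)
            infected *= r0;                    // toy exponential dynamics
        return infected;
    }

    int main() {
        std::vector<double> results;
        for (int i = 0; i <= 10; ++i)          // perturb the starting parameter
            results.push_back(run_model(1.05 + 0.01 * i));

        auto mm = std::minmax_element(results.begin(), results.end());
        // A narrow envelope means low sensitivity to starting conditions, a wide one high.
        std::printf("envelope: %.0f to %.0f infections\n", *mm.first, *mm.second);
        return 0;
    }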
Had you done a proper read, you might have saved yourself the embarrassment of writing your comment. The author clearly understands the difference between a stochastic model and a buggy mess. In a proper stochastic model, if you feed it the same random number seeds each time it will give the same result. For the code under question, if you feed it the same data each time, it gives different results.
Considering that everyone is doing their best, there is always the maxim:
“No operation extends with any certainty beyond the first encounter with the main body of the enemy.”
=======
“There is no instance of a nation benefitting from prolonged warfare.”
― Sun Tzu, The Art of War
Ironic that the crowd who scolds the rest of us about denying science seem to be the primary advocates and supporters of a methodology that fails basic standards of reproducibility.
From what I understand, the good doctor has generated hugely erroneous studies before. Someone needs to file a class action lawsuit to put him out of business.
Yes. Nobody cared to vet the doctor, the model or the results before publishing and sensationalizing the “prediction.”
That our media as a whole are incurious is not questionable. However, it appears this study influenced the lock downs in the US and UK. If this proves to be true, the government “experts” have a lot of explaining to do.
“vet the doctor” ….. very droll, Jeffery, and very apposite. When you say doctor, it is just a PhD…
…… Ferguson first came to notoriety in the UK foot and mouth problem. His coding caused the death of perhaps 10 million healthy animals: he was running around urging “kill the cows, kill the cows ..”
The veterinary authorities were appalled at the way Ferguson; and his then boss, RM Anderson, elbowed their way into an animal health issue; with coding; they knew nothing of veterinary issues.
The lack of subject knowledge has, of course, not inhibited him in any way in this current business.
Gets very funny with the latest post at “Lockdown Sceptics”
” Ferguson has been carrying on an affair with a married mother-of-two during the lockdown. Talk about breaking the social distancing rules! And the icing on the cake is that the name of his 38 year-old mistress is Antonia Staats! ”
https://lockdownsceptics.org/
“Antonia Staats, senior climate activist of avaaz.org”
https://avaaz.org/page/en/highlights/
revealed by a comment from Petit_Barde here
https://realclimatescience.com/2020/05/shutting-down-incompetent-academic-modelers/
“Antonia Staats, senior climate activist of avaaz.org, wants to end imaginary fossil fuel subsidies. By a ‘staggering coincidence’ she met Ferguson, the gravedigger of the world economy. The climate clown show is a very small world.”
Sue Denim is quoted: “Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.”
I am dubious (and I’ve been programming computers for 55+ years). Programs are deterministic — if the machine they run on does not fail, the answers do not differ from run to run on “identical inputs.” (I say this in full knowledge that the origins of Chaos Theory stem from the discovery that “same inputs” may result in different results if the same inputs are “not the same” due to rounding errors — see Edward Lorenz and his study of weather in early 1960s.)
If Ferguson’s model suffers in the same way as did Lorenz’s, then this is old news. The same problem also plagues ALL weather models. If this is the main gripe against Ferguson, then I yawn and move on.
“If Ferguson’s model suffers in the same way as did Lorenz’s”
Unlikely. It may incorporate a non-linear differential equation, but is unlikely to include the kind of attractor that Lorenz showed with multiple interacting variables. I too am dubious about the “same inputs” story. And even then, the question would be “how much different?”. There is a lot missing from this account.
Agreed, there is a lot missing. Someone should get hold of the original code, and scrub it.
I, too, am skeptical about the idea of identical inputs producing dramatically different results. It is certainly possible if a random seed generator is buried somewhere in the code – and the generator is a hardware derived seed (like a disk drive randomness source). But this review doesn’t allow one to judge whether that is the case.
The differences resulting from running on single core or multi-core processors may make some sense, but the rest of the core discussion does not make sense to me.
Uninitialised stack variable used before set.
You beat me by one minute, Harry!
+1 Harry!
I can think of a few software errors I’ve made over the years that did it. Mostly not cleaning up after myself.
“…I too am dubious about the ‘same inputs’ story…”
Probably invokes a random number generator at some point, e.g., when utilizing a genetic algorithm. Code is fubar so while the results should always match with the same inputs, a coding error ruins it. Or that they thought they would be smart to include randomness rather than have a hard-wired model which was overly constrained.
“…And even then, the question would be ‘how much different?”…”
Well the article says “very.” But honestly, ANY difference is a sign of failure unless the randomness were intentional.
To quote the linked article:
Delta 80,000 deaths in 80 days is suspiciously round.
But if true then I would agree it was “very” different.
Yes but there is a possibility the code was giving cool results due to the creative use of some random number function. Makes it seem more real. I remember some things sort of like that on a CDC 6600.
No, historically even single-threaded programs have not always been deterministic, if they are badly written. At least, if you were writing in a language in which it’s possible to fail to initialize variables, and in an environment where the contents of a block of memory you are handed by the operating system (e.g. stack space or dynamically allocated memory) is not guaranteed. Modern programming languages handle these issues better than older ones did; but Ferguson was writing in C, which has been around for a long time.
Apparently significant portions were the result of running FtoC on ancient card deck-style FORTRAN.
You are right, for one-threaded programs.
For multi-threaded routines, you can easily have different results in two cases :
1. if the result integrator routine is bugged. If the order in which partial results arrive for integration is important, and you can’t guarantee a particular order, then you get different results.
2. If shared variables (global or not) between processes are used and their state is important. You can get different results depending on the point in time at which the various processes modify and read them.
In both cases, it’s poor programming practice.
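A minimal C++ sketch of case 2 above (an unsynchronized shared variable; it assumes nothing about the actual model code): identical inputs, yet the printed total varies from run to run because of lost updates.

    #include <cstdio>
    #include <thread>
    #include <vector>

    int main() {
        double total = 0.0;                    // shared, with no synchronization
        auto worker = [&total] {
            for (int i = 0; i < 100000; ++i)
                total += 0.001;                // read-modify-write data race: lost updates
        };

        std::vector<std::thread> pool;
        for (int t = 0; t < 8; ++t)
            pool.emplace_back(worker);
        for (auto& th : pool)
            th.join();

        // The fix is a mutex, an atomic, or per-thread partial sums reduced in a fixed order.
        std::printf("total = %f\n", total);
        return 0;
    }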
It appears there are cases hidden in the code where the seeds for random number generation may be set by e.g. the machine clock, rather than as definitive inputs. Also, multi threaded behaviour is clearly different. That suggests that there is insufficient care in setting up the PDE integration regimes.
The article offers examples of how deterministic code can produce non-deterministic results.
If multiple unmanaged threads are modifying shared memory, each run will reflect the time-slice that the OS provides for each thread.
There was also the mention of using a database. Databases are required to persist and return data based on queries; unless there is an explicit requirement for grouping/sorting, the order in which database rows are returned is at the discretion of the database software provider. That is, I can persist ten thousand rows in a particular order. If I select those same rows without specifying an order, the rows returned may or may not reflect the order in which those rows were originally stored – now add the fun of multiple threads contending to write to the same database (yet another form of shared memory).
Looking through some of the code, there were some choices made to run certain algorithms based on available memory. C/C++ code is notorious for allowing neophyte programmers to create memory holes. If 100 units of memory are available for allocation and a particular process requires 10 units of memory, but there is no contiguous block of 10 units or larger available, then that particular process would fail to execute due to lack of memory.
The amount of memory available may have differed due to competing processes or even the OS having no memory to give – part of the reason why running on different computers yields different results.
NeedleFactory posted: “Programs are deterministic — if the machine they run on does not fail, the answers do not differ from run to run on “identical inputs.”
Uhhh . . . would that statement apply to code (a “program”) that has buried within one or more calls to a random number generator function?
Well, no, but it’s a rather elementary blunder to use a PRNG without first seeding it, except in those cases when you truly want different numbers from run to run. If you DO want different results, it’s still usually best to seed the RNG with different seeds from run to run, giving you reproducibility so you can debug or understand unexpected or weird results.
I am reminded by others here that using a value before setting it can be an obscure source of non-determinism. Modern systems make this more rare; and therefore even harder to locate when it happens. One could argue that if it’s possible for a program to use a value without initializing it, either (a) the computer or (b) the program language or (c) the program-development-system on which it runs is at fault.
The issue of different results with the same inputs is likely unintentional and is a bug.
Many intelligent people who are proficient in their fields think they can write useful code without knowing more than the basics. Experience tells us amateurs do not do well in their attempts to code for complex problems.
GIGO rules!
I wish that I’d used the term ‘stochastic’ when writing a simulator to train Air Traffic Controllers! I just went a long way to set up GOOD pseudo-random number generators and a ‘normal’ distribution model for radar detection. The boss didn’t approve of me ‘messing with maths’ while I was testing the generators!
Would someone please help me understand how a software model can output different results from the same inputs? Unless there’s a random number generator in there, it should give exactly the same result for a given input — every time.
Sure. See my comment just prior to yours, then google for Lorenz, weather, chaos and butterfly. You’re in for an intellectual treat!
Yes but you qualified that the inputs were not the same in the case of Lorenz.
Exactly so.
Thus, either Ms. Denim is mistaken about so-called “same” inputs, or she is not. Either way, her allegation implies nothing noteworthy.
Thanks for your reply.
What if the model is designed as a deterministic model?
E.g., Moa above writes:
“They use the word ‘stochastic’ when they should use the word ‘non-deterministic’.
The former will give values conforming to a statistical distribution when input parameters are varied slightly. That’s ok.
The latter will give different results even for the same inputs. That is not ok. There can be non-determinism due to multi-threading but in this case it is due to bugs. That is, the code is WRONG.”
https://wattsupwiththat.com/2020/05/06/brutal-takedown-of-fergusons-model/#comment-2988042
If “Ms. Denim” is not mistaken about the inputs being the same, but rather “she” is correct that buggy code is producing non-deterministic results from the same inputs, shouldn’t that be considered noteworthy for a deterministic model?
According to my reading of history, Lorenz simply realized (or possibly was the first to discover?) that very small input-value variations, i.e. beyond the 12th decimal place, will lead to major differences in results in a chain of many calculations, each based on the results of the earlier calculations.
“Discovered” seems unlikely considering how well this is understood today for so many kinds of useful processing. Surely it was understood, as least in theory, before Lorenz was running his first simple weather model. Perhaps, however, as computers were fairly new, and with rather limited memory, he was the first to become aware of it with a practical application.
Yet another lifelong software industry professional here.
While I think the poster is spot on that it is junk code, given all the flaws cited, there are legitimate cases where different results from identical inputs can be correct. Say that at one point in the model the code needs to simulate how many people in a given locale decided to go to a store on a given day. The model can’t pick an exact number, so it may, indeed, use a pseudo-random number. But a correct model cannot pick just ANY pseudo-random number. There needs to be a valid statistical basis for the range and distribution within which the number falls, and so it is not so much pseudo-random as probabilistic. Each run may then get a different result from the original identical inputs, but the reliability will tie back to how valid the math is behind each such choice. I suspect if this is what they were trying to do, there was probably just wild guessing at best, resulting in it being completely unfit for use.
RANDSEED (X) where X is defined, and not e.g. taken from milliseconds since boot time.
Then make sure your calls to RAND follow consistent threads of random numbers.
You can easily, if you’re in a multi-threaded environment, where the order of partial results is important.
Quite. Part of the fun with the Limits to Growth models was the order in which the 4th order Runge Kutta iterations were applied to the simultaneous higher order PDEs implicit in the model.
Uninitialized local variables will be whatever the last process left in that memory location. Depending on your program and operating system, the “last process” might not even be a part your program.
MarkW,
Quite right. To my mind, this is the most likely reason a single-threaded process will exhibit this type of behavior. I saw this exact thing recently where I work. The tester couldn’t understand why the results varied even though they ran the test the “exact” same way each time. From their description I was able to find the bug in just a few minutes because I was sure the problem was an uninitialized buffer. I was right.
And if the code was decent C++, for example, the buffer creation would have a constructor that automatically initializes the buffer to default values, eliminating the problem and the need to track down the culprit.
Emphasis on “decent”. I’ve seen really horrendous C++ and Java code too. I’d rather work on well written C code than poorly designed C++ or Java code any day.
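For illustration, a hypothetical C++ fragment (not from the model) showing how an uninitialized local turns a deterministic calculation into a non-deterministic one, and the trivial fix.

    #include <cstdio>

    // BUG: 'sum' starts as whatever garbage happens to be on the stack, so the same
    // inputs can give different answers from run to run and machine to machine.
    double total_deaths_buggy(const double* daily, int n) {
        double sum;
        for (int i = 0; i < n; ++i)
            sum += daily[i];
        return sum;
    }

    // FIX: explicit initialization restores determinism.
    double total_deaths_fixed(const double* daily, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            sum += daily[i];
        return sum;
    }

    int main() {
        const double daily[] = {3, 5, 8, 13};
        std::printf("buggy: %f  fixed: %f\n",
                    total_deaths_buggy(daily, 4), total_deaths_fixed(daily, 4));
        return 0;
    }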
The average of bad data and bad models is still bad
“Take Down and Pin – Susan Denim!”
I love the way ‘a team from microsoft’ immediately jumps in to try to ‘update’ this steaming heap of code….
Well, why shouldn’t they? They have the best experience.
They’ve been doing exactly that since they pinched original code from Digital Research (more than once).
M$ can’t even fix their own code. Let’s not talk about Hyper-V. The company I work for wants out of Hyper-V and into VMWare. I support that move!
This problem makes the code unusable for scientific purposes,
========
This exact same problem exists in climate models. Indeed they typically employ a random seed to ensure each run is different. This of course makes it impossible to test if the model is programmed correctly.
Which is in fact quite a useful feature if you are writing/hiding buggy code.
“. . . makes it impossible to test if the model is programmed correctly.”
Maybe, or maybe not:
Substitute a single defined numeric value for the output number coming from the random number generator. Run the program with this modification many times . . . see if the output value(s) is (are) the same for all runs: YES=good, NO=trash the code.
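Something like the following C++ harness would do it (a sketch under the assumption that the model’s random draws can be routed through one interface; the names here are hypothetical): swap the real generator for a constant, and then any run-to-run difference has to be a bug rather than intended randomness.

    #include <cassert>
    #include <cstdio>
    #include <random>

    struct Rng {
        virtual double draw() = 0;
        virtual ~Rng() = default;
    };

    struct RealRng : Rng {                     // generator used in production runs
        std::mt19937 gen{12345};
        double draw() override {
            return std::uniform_real_distribution<double>(0.0, 1.0)(gen);
        }
    };

    struct FixedRng : Rng {                    // deterministic stand-in for testing
        double draw() override { return 0.5; }
    };

    // Stand-in for one model run; the real model would take many such draws.
    double model_run(Rng& rng) {
        double infections = 1.0;
        for (int day = 0; day < 30; ++day)
            infections *= 1.0 + 0.1 * rng.draw();
        return infections;
    }

    int main() {
        FixedRng first_rng;
        const double first = model_run(first_rng);
        for (int i = 0; i < 100; ++i) {
            FixedRng rng;
            assert(model_run(rng) == first);   // must be bit-identical every time
        }
        std::printf("deterministic check passed: %f\n", first);
        return 0;
    }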
” Indeed they typically employ a random seed to ensure each run is different.”
Not in any of the codes I have examined. Climate models are usually fully deterministic. Typically, each model lab runs a pre-industrial control run (“PIControl”) for several hundred years. (This is prescribed by the CMIP.) When a prediction run is required, the initial conditions are chosen to be slightly different in each run by the simple expedient of sampling different kick-off times from the stored PIControl state data, say at intervals of every 5 years. The end-state minus the initial-state is then the reported result for each run. The high sensitivity of the models to initial conditions (aka chaotic mathematics) on its own is sufficient to ensure that that the small changes in initial conditions from the sampling of kick-off times translates into a spread of outcomes for the same imposed forcing drivers and parameters.
And now for the brutal takedown of Dr Fauci & friends.
Dr Judy Mikovits PHD, lays bare the dirty side of Dr Fauci. Circulate this widely.
https://banned.video/watch?id=5eb3062575314400169f3e6c
The original video posted to YouTube had over 1 million views before it was deleted for contradicting the World Health Organization and the NIH.
TRM May 6, 2020 at 7:38 pm
——————-
oh dear INFO WARS
where you can buy (this link contains adverts)
Leading the charge in super high quality nascent iodine, Infowars Life Survival Shield X-2 will help boost your health beyond expectations. *
A Stronger Formula From 7,000 Feet Below The Surface
Is not iodine just iodine, or are we getting into the memory of water!!!!!!!!!!!
Do you have any intelligent critique of Dr Judy Mikovits’ statements?
Do you have any intelligent critique of the points raised in the documentary?
If you don’t like getting it from the “banned.video” site you can get it from “plandemicmovie.com”.
Many here are wishing someone would review the GW models; however, the modelers will not allow a proper investigation and hide their inputs, claiming copyright etc. I remember a loooong time ago David Evans wanted to run a few warmists’ highly confident assumptions, inputs and outcomes through his first-class models and… they refused. I guess they were not so much confident as fraudulent.
I haven’t heard from Evans for a while, so I had a quick look… and here is an article that contains criticism of modeling, amongst many other criticisms as well. An interesting article would put the raw data next to the homogenized “data”; especially the deleted stations should be examined. I think we would all find that very interesting.
https://www.skepticalscience.com/David_Evans_arg.htm
As the tabloids noted, he was forced to resign due to unlawful entries into his model 😉
6 May: Breitbart: Who Would Buy a Used Computer Model from ‘Bonking Boffin’ Neil Ferguson?
by James Delingpole
Professor Neil Ferguson is currently the most reviled, scorned and mocked man in Britain.
Such a pity, though, that it’s for all the wrong reasons…
Being an arrogant, sexually incontinent, hypocrite is a fairly venial slip, after all, when compared to destroying the world’s fifth-largest economy – and with it the livelihoods, job prospects and prosperity of a nation of 65 million people…
Few men in recent British history, I would argue, have done more long-term damage to their country. It’s about time this was recognised by two important groups: by the politicians who, up till now, have been taking him seriously as their guru; and by the general populace, a significant portion of which has been terrified by his doom-laden prognostications into a hysteria out of all proportion to the nature of the actual threat posed by coronavirus.
Ferguson, let us not forget, has a long track record of failure…
https://www.breitbart.com/europe/2020/05/06/would-buy-a-used-computer-model-from-bonking-boffin-neil-ferguson/