Brutal Takedown of Neil Ferguson’s Model

From Lockdown Sceptics
Stay sane. Protect the economy. Save livelihoods.

An experienced senior software engineer, Sue Denim, has written a devastating review of Dr. Neil Ferguson’s Imperial College epidemiological model that set the world on our current lockdown course of action.

She appears quite qualified.

My background. I wrote software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects.

She explains that the code she reviewed isn’t actually Ferguson’s original, but a modified version from a team trying to clean it up as a face-saving measure.

The code. It isn’t the code Ferguson ran to produce his famous Report 9. What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others. This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice).

She then discusses a fascinating aspect of this model. You never know what you’ll get!

Non-deterministic outputs. Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.

This problem makes the code unusable for scientific purposes, given that a key part of the scientific method is the ability to replicate results. Without replication, the findings might not be real at all – as the field of psychology has been finding out to its cost. Even if their original code was released, it’s apparent that the same numbers as in Report 9 might not come out of it.
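To make the replication point concrete, here is a minimal sketch of what deterministic behaviour looks like in a stochastic simulation. It is not taken from the Imperial code; the function `simulate_outbreak` and all of its parameters are hypothetical, chosen only for illustration. The idea is that a model may use random numbers and still return identical output whenever it is given identical inputs, including the seed.

```python
import random

def simulate_outbreak(seed, days=30, population=1000, p_transmit=0.3):
    """Toy stochastic outbreak. Every parameter here is an illustrative assumption."""
    rng = random.Random(seed)  # private generator, so no hidden global state
    infected = 1
    history = []
    for _ in range(days):
        susceptible = population - infected
        # Each infected person may infect at most one susceptible contact per day.
        contacts = min(infected, susceptible)
        new_cases = sum(1 for _ in range(contacts) if rng.random() < p_transmit)
        infected += new_cases
        history.append(infected)
    return history

run_a = simulate_outbreak(seed=42)
run_b = simulate_outbreak(seed=42)  # same inputs, same seed
run_c = simulate_outbreak(seed=43)  # different seed, so the path may differ

print(run_a == run_b)  # True
print(run_a[-1], run_c[-1])
```

Averaging over many seeds is a reasonable way to summarise random variation, but in a sketch like this each individual run can still be reproduced exactly from its seed, which is the property the review says is missing.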

Ms. Denim elaborates on this “feature” quite a bit. It’s quite hilarious when you read the complete article.

Imperial are trying to have their cake and eat it.  Reports of random results are dismissed with responses like “that’s not a problem, just run it a lot of times and take the average”, but at the same time, they’re fixing such bugs when they find them. They know their code can’t withstand scrutiny, so they hid it until professionals had a chance to fix it, but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right.

Readers may be familiar with the averaging of climate model outputs in Climate Science, where it’s known as the ensemble mean. Or those cases where it’s assumed that errors all average out, as in certain temperature records.

Denim goes on to describe a lack of regression testing, or any testing, undocumented equations, and the ongoing addition of new features in bug-infested code.
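For readers unfamiliar with the term, a regression test simply records the output of a known-good run and fails if a later version of the code produces anything different for the same inputs. The sketch below is a hypothetical illustration, not the review's or Imperial's method; `toy_model` and `check_against_baseline` are made-up names standing in for a real model and test harness.

```python
import json
import os
import tempfile

def toy_model(r0, days):
    """Hypothetical stand-in for a model run: deterministic geometric growth."""
    cases = [1.0]
    for _ in range(days):
        cases.append(cases[-1] * r0)
    return cases

def check_against_baseline(path, inputs, output):
    """Record the output on the first run; afterwards fail if it ever changes."""
    if not os.path.exists(path):
        with open(path, "w") as f:
            json.dump({"inputs": inputs, "output": output}, f)
        return "baseline recorded"
    with open(path) as f:
        baseline = json.load(f)
    if baseline["inputs"] != inputs:
        raise ValueError("baseline was recorded for different inputs")
    if baseline["output"] != output:
        raise AssertionError("regression: output differs from baseline")
    return "matches baseline"

baseline_path = os.path.join(tempfile.mkdtemp(), "baseline.json")
inputs = {"r0": 2.0, "days": 5}
first = check_against_baseline(baseline_path, inputs, toy_model(**inputs))
second = check_against_baseline(baseline_path, inputs, toy_model(**inputs))
print(first, "/", second)  # baseline recorded / matches baseline
```

A check of this kind only works if the program is deterministic for fixed inputs, which is why the two complaints in the review go together.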

Denim’s final conclusions are devastating.

Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one. 

On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.

Full article here.

191 Comments
Ed Zuiderwijk
May 7, 2020 4:28 am

Old code of more than 15000 lines. Sounds like a good old Fortran job to me with legacy stuff full of GOTO statements and labels. Transcribing such a muddle to an object language like Java or C++ is an absolute nightmare (I speak from experience). Better to redesign the whole thing from scratch. Takes less effort and is much more transparent (also here speaking from experience). Should have been done 30 years ago. You need well-documented algorithms though and that may have been the problem.

Newminster
May 7, 2020 4:49 am

Just maybe, this time, perhaps, politicians and civil servants will start to understand that the science “community” is no more or less venal or corrupt than any other and take the “facts” that “the science” produces with the same caution and initial scepticism they would naturally apply (one would hope!) to estate agents and second-hand car dealers.

I live in hope but I’m not holding my breath. (I might add that if dealing with King or Deben the caution should be the same as they would apply to snake oil salesmen!)

Steven Mosher
May 7, 2020 5:35 am

Now I know who to blame for google maps and mail.

Kevin kilty
Reply to  Steven Mosher
May 7, 2020 7:18 am

Your sense of humor is obscure at times. I didn’t get this until I reread the article.

Curious George
Reply to  Steven Mosher
May 7, 2020 7:20 am

Blame Google.

Kevin kilty
May 7, 2020 7:09 am

The original code in “C”? I would have thought in Fortran. Lots of legacy codes are patched up with pieces of newer languages, but may have flaws still residing from decades back. I got some old codes from “COSMIC” at the University of Georgia in the early 80’s and found flaws in the original Fortran dating from more than a decade before. Remember the Assigned GoTo? Or block common? Errors on different machines with different word lengths. Or something uninitialized, but no one ever notices?

leowaj
May 7, 2020 7:42 am

I write code professionally using a range of languages. One of the worst things to hear, especially from academia, is: “the original program was ‘a single 15,000 line file.’”

In the 21st century, programmers are still writing all their code in a single file. What a crying shame.

Any time I have a chance to talk to academics in computer science, mathematics, statistics, and engineering, I try to interject the recommendation that academia learn good software development practices from the business world. Things like test-driven development, separation of concerns, refactoring, and so forth, all of which are battle-tested practices that produce superior code and code that developers can demonstrate their confidence in. Academia should not be stuck in ~1980 software development practices.

Paul Penrose
Reply to  leowaj
May 7, 2020 10:45 am

More like pre-1980, maybe even pre-1970.

Michael Jankowski
May 7, 2020 10:25 am

Who would have thought that modeling in epidemiology was so stunted?

Where are all of the other models and their predictions/results?

Thus far the best model I’ve seen is from a civil engineering professor with a water resources background applying a pond storage model.

JCM
May 7, 2020 10:40 am

Respectfully, I doubt a programmer is adept at epidemiological modelling. I come across IT techs and coders every day who seem to suggest they know subjects of engineering, modelling, and many other areas better than professionals in those fields. I often feel embarrassment for them and it appears to be an issue of ego. The code structure is rather irrelevant and perhaps speaks to a broader issue of underinvestment in this area of epidemiology. I doubt the authors of the model ever insisted draconian measures were to be taken based on the outputs. Negative consequences and misrepresentation of this information are the mistake of the political class and media.

Janice Moore
Reply to  JCM
May 7, 2020 2:03 pm

The code structure is rather irrelevant … .

Leaving aside the question of whether a given coder is skilled at epidemiological modeling,

I would suggest that you clarify the quoted remark. Until you do, you will appear to be ignorant of what matters in computer simulations called “models.” The coding (along with the underlying equations being coded) is more than relevant, it is virtually EVERYTHING.

lb
May 7, 2020 12:44 pm

I too develop software for money. In my experience, there sure are situations when a program with the same input doesn’t always produce the same output. Without random number generators or multithreading.

Remember the pentium bug? Or working with float data type? Sometimes 1.0 actually is 0.99999 or 1.000001. That’s no problem for the software you develop for the local birdwatcher club, but not good enough for banks, insurance companies or science.

Personally I don’t write modelling software, but I think they often use the output from say simulated year one as input for simulated year two and so on. These errors are going to add and multiply like crazy, and not always in the same way. Fun stuff.
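A short sketch of the rounding behaviour described above, using only the Python standard library. The numbers are illustrative and have nothing to do with any real model. For a fixed sequence of operations the result is repeatable; what changes the answer is the order in which the operations are carried out, which can differ between compilers, optimisation settings, or thread schedules.

```python
import math

# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Adding 0.1 ten times does not give exactly 1.0.
naive_total = 0.0
for _ in range(10):
    naive_total += 0.1
print(naive_total == 1.0)  # False

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
print(math.fsum([0.1] * 10) == 1.0)  # True
```

When each step's output becomes the next step's input, small differences like these can be carried forward, which is the accumulation effect the comment is pointing at.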

Russ Wood
Reply to  lb
May 9, 2020 9:38 am

My wife, programming back in the 1980’s, had a really ‘orrible bug in a FORTRAN program she was working on. After days of debugging (this was in line editor /compile / test run /read the printout days), she found that a provided subroutine was CHANGING THE VALUE OF ONE! The subroutine writer was unaware how FORTRAN passed its call data, and was using an input parameter as ‘scratch’, thereby changing the value of ‘1’ in the calling routine! And I myself found that a brand-new ICL FORTRAN 77 compiler created incorrect code when you turned the debugging off! (That required assembly-level debugging!)
Ah – the fun we had!

Richard Saumarez
May 7, 2020 4:01 pm

I don’t think that we can pass judgment until the code is released and it is analysed forensically (If it is ever released).

It is likely that there are errors in the code. Speaking as an academic I know that academics write terrible code and that there are errors in the code.

I think that the errors will stem from assumptions about the processes in the model and their parameterisation.

michel
May 8, 2020 12:44 am

This is what happens, and this is the parallel to Climate.

You start out with a clear and simple idea of how it’s going to work. CO2 levels, for instance, are going to drive warming. Your intuition is expressible in a simple formula.

But obviously this is not good enough to be convincing. So you now model in exhaustive detail. You are now splitting the world into thousands, hundreds of thousands, millions, billions of cells, and doing each one separately. You now treat CO2 as an output of modelling every country and every fuel. Same with takeup.

The next stage is invisible to everyone including the developer: your model in its early versions now does not give output which matches your simple formula. So, with or without realizing what you are doing, you tweak it till it does.

You’d have been better off staying with the simple formula, and testing it and refining it. If the energy had gone into validation and examination of the parameters, the result would at least have been something people could profitably argue about.

This is probably what has happened to Ferguson, several times in succession.

Mardler
May 8, 2020 1:19 am

Link to the original article now broken.

I hope “Miss” Pseudonym hasn’t been found out and sacked from his/her present position.

kramer
May 8, 2020 3:20 am

Interesting. The software was released on GitHub and Microsoft upgraded it. I’d like to know how this unfolded.
Microsoft was once run by Bill Gates, a billionaire who is concerned with overpopulation and climate change. I wonder if he pulled any strings at Microsoft to get them to ‘upgrade’ the code?

Meanwhile, Gates is getting involved in finding a vaccine for the virus while his Windows OS, which has been upgraded eleventy-zillion times with virus patches over a span of several decades, still gets viruses.

I can see the future, we will be getting notified every couple of months to install (inject) new virus vaccinations or update patches.

What, me worry? Ha!

lb
Reply to  kramer
May 8, 2020 10:12 am

kramer
“I can see the future, we will be getting notified every couple of months to install (inject) new virus vaccinations or update patches.”

Ha. Yes. And we will pay through the nose for the license and the upgrade service and…

May 8, 2020 7:31 am

Unbelievable. Science literally depends on being able to replicate and analyze results, but these guys have an amazing “Just do it, it will work out” perspective that transcends anything and everything. So… all of those papers contributing to the alarm regarding climate change were based on a faulty model? Sincerely, it’s hard to believe most things, if anything, nowadays, man.

Brett Keane
May 8, 2020 1:04 pm

Sue Denim – Pseu Donym: Nice One There! Brett Keane (the real one haha).

Mardler
Reply to  Brett Keane
May 8, 2020 4:11 pm

As I said above.

SocietalNorm
May 8, 2020 4:11 pm

Spaghetti code — aauugg!
I had to modify code in the Space Shuttle simulation programs I was using that had been written originally back in the 60’s for Apollo and had been added to over and over again through the years.
Of course, we had massive testing requirements done by multiple independent companies and, of course, many users were still finding bugs in them constantly.
These guys have none of that rigor at all.
The consequences of using the modeling for the corona virus from Wuhan, as well as the modeling of climate forecasting are much more devastating to humanity than possibly losing a Space Shuttle and the astronauts.

These guys are playing around at simulation and modeling and playing around with billions of people’s lives.

davidgmillsatty
May 9, 2020 12:11 pm

I am not sure that this is the takedown that some think. Commenters dr_t and earthflattener had pretty good takedowns of Sue Denim aside from Sue Denim’s anonymity.

Rune
May 9, 2020 2:28 pm

“a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice).

And she then links to John Carmack’s comments on Twitter (https://twitter.com/ID_AA_Carmack/status/1254872369556074496). Yes, he says it is 15000 lines of code, but: “It turned out that it fared a lot better going through the gauntlet of code analysis tools I hit it with than a lot of more modern code. There is something to be said for straightforward C code. Bugs were found and fixed, but generally in paths that weren’t enabled or hit.”

I’m sure I would’ve grumbled if someone handed me a single file containing 15k lines of code, but it would be downright silly for anyone to pick an argument with Carmack on this topic. I’ve never heard of Sue before. John Carmack OTOH is a living legend (for 30-odd years).

Harry Passfield
May 10, 2020 5:10 am

In refusing to produce his code is Ferguson the UK’s very own Mann?

Peter D. Tillman
May 11, 2020 2:00 am

“Sue Denim” is a pseu-donym of author Dav Pilkey: https://en.wikipedia.org/wiki/Dav_Pilkey
I think he was inspired by one of Bruce Sterling’s running-mates at the original Cheap Truth zine, https://en.wikipedia.org/wiki/Cheap_Truth
where “Sue Denim” (as in pseu-donym) was, in this case, a pen name for Lewis Shiner. Oh, those SF/F guys!