Another data science professional destroys Ferguson’s coronavirus model

From @thatkatyagirl:

I’ve spent the last few days picking through Neil Ferguson’s Covid model.

Thread 👇

It contains 450 different parameters, each of which is either a single float or a float matrix. The majority of these are not based on any ground truth data that I can see and are just “magic numbers” presented in the model. Any one of them can change the outputs of the model in unpredictable ways.

The model is essentially a giant, complex state machine. It’s stochastic, which basically just means it contains random elements, so no two runs produce the same results.
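To make “stochastic state machine” concrete, here is a minimal sketch of the general idea. It is not Ferguson’s code: the states, the parameter names and every number in it are hypothetical, chosen only for illustration. People move between susceptible, infected and recovered, and each transition is decided by a random draw, so separate runs give different totals. The explicit seed is an assumption added here so that an individual run can be replayed.

```python
import random

# Hypothetical toy parameters, chosen only for illustration.
POPULATION = 1000
INITIAL_INFECTED = 10
INFECTION_PROB = 0.3   # chance that a contact with a susceptible person infects them
RECOVERY_PROB = 0.1    # chance per day that an infected person recovers
DAYS = 60


def run_once(seed):
    """Run one toy simulation and return the total number ever infected."""
    rng = random.Random(seed)
    susceptible = POPULATION - INITIAL_INFECTED
    infected = INITIAL_INFECTED
    recovered = 0
    for _ in range(DAYS):
        # Each infected person meets one random member of the population per day.
        new_infections = 0
        for _ in range(infected):
            meets_susceptible = rng.random() < susceptible / POPULATION
            if meets_susceptible and rng.random() < INFECTION_PROB:
                new_infections += 1
        new_infections = min(new_infections, susceptible)
        # Each infected person independently recovers with a fixed daily chance.
        new_recoveries = sum(1 for _ in range(infected) if rng.random() < RECOVERY_PROB)
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
    return infected + recovered


# Five runs that differ only in their random seed.
totals = [run_once(seed) for seed in range(5)]
print(totals)
```

Even with three states and two parameters, the spread between runs is part of the output, which is why a single run of a model like this says little on its own.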

The code itself contains many special rules held in imperative C++. There is no explanation as to why these are present. For example, why are hotels excluded in the “place sweep”? Who knows, but they are.

The complexity is hence the Cartesian product of the input parameters and the embedded code rules.
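A rough sketch of what that Cartesian product means in practice. The three toy parameters below are hypothetical, and the assumption that each of the 450 real parameters takes only two candidate values is a deliberate understatement, since they are floats or float matrices.

```python
from itertools import product

# Hypothetical toy parameters; the last one stands in for an embedded code rule.
toy_parameters = {
    "infection_prob": [0.2, 0.3],
    "recovery_prob": [0.05, 0.1],
    "exclude_hotels": [True, False],
}

# Every combination of values is a distinct scenario the model could be run under.
scenarios = list(product(*toy_parameters.values()))
print(len(scenarios))  # 2 * 2 * 2 combinations

# Assumption: only two candidate values for each of 450 parameters.
n_parameters = 450
values_per_parameter = 2
total = values_per_parameter ** n_parameters
print(f"{len(str(total))} decimal digits in the number of combinations")
```

The count doubles with every parameter added, so exhaustively checking how the parameters interact is out of the question long before you reach 450.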

All this wouldn’t matter if it delivered reasonable results. However, no amount of fiddling with parameters delivers Swedish deaths of less than 90k. The model just consistently overcounts the infected and hence the dead.

I am not even going to go into the bugs that are listed in the issues list, some of which result in +/- 80k deaths.

Here are just a few of the free parameters you can fiddle with in order to “make up” a result which suits your narrative. Remember there are 450 such cogs and switches to fiddle with.

All in all, this is not a clear, transparent model based on firm ground truth data. It’s a fairly arbitrary Heath Robinson machine, which overcounts infections on a consistent basis.

Thanks all for your responses, questions and challenges. Too many to respond to them all, I’m afraid. But all will be read. Strange that many academics have jumped in with “get off my land” style comments. This is such a thread. Worth a read.

Sample of parameters for which arbitrary inputs are included:

This is the model that broke the global economy….

2 thoughts on “Another data science professional destroys Ferguson’s coronavirus model”

  1. Stumbled onto her on Twitter, too. Just checked back and she just pointed out care (nursing) homes aren’t even in the model … and she’s getting lots of mansplaining about code. She’s killing it.

    1. I don’t know who she is, but I know she knows her stuff for sure.

      It’s unbelievable, isn’t it, how awful this stuff is? I had the sense that his model was garbage from the way he described it early on, as someone who works in fintech and codes for a living. He clearly had no control over his product and no serious quantitative background.

      This is a general problem in academia though. You have a bunch of people who self-credentialize as “experts” running around saying extreme stuff, and you know they are silly as they misappropriate technical vocabulary, etc. But they try to bully and shame anyone who points that out. We have hundreds of these sorts of people running around across disciplines. Tom Nichols is this personality par excellence. Taleb is getting that way too, and I used to really respect him. I almost wrote a piece fisking his fight with Calpers over the cost of end-of-the-world hedges and long-term investors.


