Posted on 05/20/2020 11:08:22 AM PDT by ImJustAnotherOkie
I have been asked by a source in Britain to review the Ferguson model code and give my opinion. Just so everyone has some idea: the original program used by Ferguson was a single 15,000-line file that had been worked on for a decade, and it is by no means sophisticated. I seriously doubt that Imperial College will want to go public with the code, because it is that bad. To put this in perspective, just the core that conducts basic analysis in Socrates is about 150,000 lines of code. The Ferguson code is so convoluted that it takes a tremendous amount of concentration just to trace the paths it has available to it for basic analysis.
To keep this in traders' terms, reviewing the code reveals this is just a stochastic, which is INCAPABLE of forecasting a high, a low, or a projected price target expected to be achieved. Any trader knows that a stochastic is a trend-following measure, not a forecaster of the trend, nor a projection tool that says when a market is overbought or oversold. This clearly shows the vast chasm between trading models, where the money is on the line, and academic models, where it never is. The documentation even states:
The model is stochastic. Multiple runs with different seeds should be undertaken to see average behaviour.
Stochastic is simply defined as randomly determined; having a random probability distribution or pattern that may be analyzed statistically but may not be predicted precisely. In other words, they begin with a presumption, and therein lies the FIRST error: Ferguson's assumption was wrong to begin with. Then this model is so old that they recommend it be run only on a single-core processor, as if we were dealing with an old IBM XT.
Effectively, you start the program with what is called a seed number, which is then used to produce a stream of random numbers. Many children's games begin this way. In fact, this is similar to the game SimCity, where you create a city starting from scratch and it simulates what might happen based upon the beginning presumption. There are numerous bugs in the code, and the documentation suggests running it several times and taking the average. This is just unthinkable! A program should produce the same result when started from the same data. Yet there is no possible way this model will ever produce the same results: in reality, it produces completely different results even when beginning with the very same starting seeds and parameters, because of the attempt to also make the seed random. This is not even as sophisticated as SimCity, which is really questionable. And this is where Imperial College claims the errors will vanish if you run it on an old system in single-threaded mode, as if you were using a 1980s XT.
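For readers who have not worked with seeded simulations, here is a minimal sketch (not the Ferguson code; the toy `run_simulation` function is invented for illustration) of the property being demanded above: a correctly seeded stochastic program is random in its internals but fully deterministic given its seed, so the same seed and parameters reproduce the same output every time.

```python
import random

def run_simulation(seed, steps=5):
    """Toy stochastic 'model': a seeded random walk.

    A local random.Random instance is used so the seed alone
    fully determines every number drawn, with no shared state.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(steps):
        total += rng.randint(-1, 1)  # random step, but reproducible per seed
    return total

# Same seed, same parameters -> identical result, run after run.
assert run_simulation(42) == run_simulation(42)
# Different seeds are free to differ; that is the intended randomness.
```

The complaint in the post is that the Imperial code reportedly failed this basic contract: identical seeds did not yield identical outputs, which makes independent verification impossible.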
In programming, you run what is known as a regression test: re-running functional and non-functional tests to ensure that previously developed and tested software still performs after a change. In market terminology, it's called back-testing. In the most unprofessional manner imaginable, the Imperial College code does not even have a regression-test structure. They apparently attempted one, but the extent of the random behavior caused by bugs in the code prevented that check. On April 4th, 2020, Imperial College noted:
However, we haven't had the time to work out a scalable and maintainable way of running the regression test in a way that allows a small amount of variation, but doesn't let the figures drift over time.
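What Imperial describes wanting is not exotic. Here is a hedged sketch (all names here, `regression_check` and `toy_model`, are hypothetical, not from the Imperial repository) of a regression test for a stochastic model: run it over a fixed set of seeds, average, and compare against a stored baseline within a tolerance, so small run-to-run variation is allowed but drift beyond the tolerance fails the build.

```python
import statistics

def regression_check(run_model, seeds, baseline_mean, rel_tol=0.02):
    """Run the model over fixed seeds; flag drift from a stored baseline.

    Tolerates small stochastic variation (within rel_tol) but fails
    if the averaged output drifts, which is exactly the property the
    quoted Imperial note says they could not achieve.
    """
    results = [run_model(seed) for seed in seeds]
    mean = statistics.mean(results)
    drift = abs(mean - baseline_mean) / abs(baseline_mean)
    return drift <= rel_tol

# Hypothetical model: deterministic given its seed, as any seeded model should be.
def toy_model(seed):
    return 100 + (seed % 7) - 3

seeds = range(50)
baseline = statistics.mean(toy_model(s) for s in seeds)
assert regression_check(toy_model, seeds, baseline)          # no drift: passes
assert not regression_check(toy_model, seeds, baseline * 2)  # drift: fails
```

The prerequisite, of course, is the seed determinism discussed above; without it, no baseline can be stored in the first place.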
This Ferguson Model is such a joke that it is either an outright fraud or the most inept piece of programming I may have ever seen in my life. There is no valid test to warrant any funding of Imperial College for providing ANY forecast based upon this model. This is perhaps the most UNPROFESSIONAL operation in computer science. The entire team should be disbanded, an independent team should be put in place to review the work of Neil Ferguson, and he should NOT be allowed to oversee any review of this model.
The only REASONABLE conclusion I can reach is that this has been deliberately used to justify bogus forecasts intended for political activism, or I must accept that these academics are totally incapable of even creating a theoretical model, let alone coding it as a programmer. There seems to have been no independent review of Ferguson's work, which is unimaginable!
A 15,000-line program is nothing. I would be glad to write a model like this in two weeks and will only charge $1 million instead of $79 million. If you really want one that works globally, no problem. It will take a bit more time, and the price will be a discounted $50 million, on sale, refunds not accepted, as is the deal with Imperial College.
The Bill Gates connection is the laugh line here although not mentioned.
I did my own projections on the back of an envelope.
It ain’t that hard.
Well, who would have thought an ‘epidemiologist’ would have a Crackerjack mathematical biology diploma...../s
Follow the Science, they said
Trump should let the science experts manage the national response, they said
Fauci and Birx are the smartest scientists in the US, they said
Funny, they are still saying this
The author is bloviating. He may be right on a few details, but he misses the point altogether.
It doesn't matter how many lines of code there are. It's the basic algorithm that drives the "answer".
This part he gets right. The problem, as I have stated many times now, is that trending, fitting historical data, whatever you want to call it, does not lead to good prediction. The only way to do that is by having a model of the complete process that has been proven out with real data. There are no good models that have all the details of both viral growth/decay in the real world (vice a lab) AND the infinite impact of human behaviors and interactions. All you can do is postulate.
Reminds me of the Obamacare website which cost something like a billion dollars and basically could not function as a website.
Fraud? Or incompetence? Sometimes the line seems very thin.
I appreciate that the author does not get caught up in which language the program was written in. He does get worked up about the program wanting one core. Adding cores allows more speed and complexity, but does not increase accuracy (except that you have more time to run more simulations and scenarios).
Well, the number of lines does matter. The original was done in Fortran; the Fortran program was the original model. It's since been refactored into R, Python, and C++, from what I understand.
After all this refactoring, all the estimates have been changed.
A program written in this manner is almost impossible to change. Using copy/paste instead of loops is one issue.
Instead of writing subroutines, they just copy and paste code.
Programs of this size have almost no structure either. I know this for a fact because I deal with them, and I have been a developer for almost 40 years.
Make a small change and everything goes out the window.
Yes, size does matter.
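To make the copy/paste complaint concrete, here is a tiny invented example (not taken from the Ferguson code; the group names and the update rule are made up) of the same logic written both ways. The copied version triples the places a bug can hide; the loop version gives one place to fix and one place to test.

```python
# Copy/paste style: the same update repeated inline for each group.
# A typo in any one copy silently diverges from the others.
def infect_copy_paste(s):
    s["child"]  = s["child"]  * 0.9 + 1
    s["adult"]  = s["adult"]  * 0.9 + 1
    s["senior"] = s["senior"] * 0.9 + 1
    return s

# The same logic factored into one loop over the groups.
def infect_loop(s, groups=("child", "adult", "senior")):
    for g in groups:
        s[g] = s[g] * 0.9 + 1
    return s

a = infect_copy_paste({"child": 10, "adult": 20, "senior": 30})
b = infect_loop({"child": 10, "adult": 20, "senior": 30})
assert a == b  # identical behavior, but only one copy to maintain
```

In a 15,000-line single file, the commenter's point is that this pattern repeats hundreds of times, so every fix has to be applied in every copy.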
Hope he never looks at the models for climate change. They might make this one look accurate.
Fortran of all things.
In the most unprofessional manner imaginable, the Imperial College code does not even have a regression-test structure. They apparently attempted one, but the extent of the random behavior caused by bugs in the code prevented that check. On April 4th, 2020, Imperial College noted:
However, we haven't had the time to work out a scalable and maintainable way of running the regression test in a way that allows a small amount of variation, but doesn't let the figures drift over time.
This Ferguson Model is such a joke that it is either an outright fraud or the most inept piece of programming I may have ever seen in my life.
My point is not about mistakes from failing to re-validate migrated code.
My point is that code is an implementation of a thought process, an algorithm. If the algorithm is faulty, it doesn't matter how many lines there are or what language it's in.
The author IS bloviating.
As a former programmer who took a lot of pride in good design, I would never talk this way. It is good enough to say, in one sentence, that the code is written poorly and not maintainable. Boom, done.
What matters is how it works, and that is touched on just momentarily. The fact that it cannot produce repeatable results for a given set of inputs is the showstopper. Whatever it’s doing, it cannot be independently validated.
When we look back on this crisis, we’ll learn how certain people found themselves through pure luck to be in the right place at the right moment to drive policy regarding virus mitigation.
This is supposed to be science, with one person checking that another’s theories are sound. Funny how godly voices from a mountaintop can be perceived as science from those that know better.
True, but in the early stages of a novel infection like this it's all you have.
The glaring omission in all of the Imperial criticism I've seen is a reference to a model that was better at the time.
It's mostly been weak post hoc whining.
Just remember folks: A real FORTRAN programmer can write FORTRAN code in ANY language! /(jk - for the humor impaired)
However... For example, one model tried to predict the number of deaths, yet the 95% confidence range spanned three orders of magnitude. Since the actual number of deaths was in this range, the authors declared success. This model was cited by a number of others.
Another model tried to deduce the infectiousness of asymptomatic carriers using phone-tracing data from Tencent. Once again there was a glowing conclusion in the abstract and plenty of cites, but the supplement revealed a 95% confidence range for the number of deaths spanning 10x from high to low. The authors also admitted that a wide range of values for the number of asymptomatic carriers and their infectiousness was actually possible; they settled on the published values after repeated iteration, specifically choosing values that kept the model numerically stable in a MATLAB solver library.
I cringed and stopped reviewing models.
And THAT is the most egregious part of this. Anyone who understands math and modeling should know that you really have NOTHING to support prediction. It's the Hockey Stick global warming model all over again. The best you can do is be like the weatherman: "there is a good chance that tomorrow will be like today." Now that's a great job to have.
An algorithm is not a monolithic thing. There are probably thousands of algorithms here. I would hazard a guess that there are hundreds of duplicated algorithms in this code, and I would also wager that the copies are not exactly the same.
This is software that, from what I see, was never really tested. Just a "that sounds about right." Last-minute changes to correct a small bug without good testing are just the ticket to drain the world economy of trillions of dollars. There was no vetting; it was impossible.
Software is all about details, details. One small detail can literally wreck a whole application.
Evidently you never get the same result twice with this thing, which should tell you something.
Complicated code is bad code. Period. If you need someone to explain what they’re doing with the code, they are not good coders.
As Saint-Exupéry famously said, "Perfection is attained, not when there is nothing left to add, but when there is nothing left to take away."
Let's just say Ferguson's mind was on other things — like banging some man's wife in the midst of a pandemic.
This code and a code of conduct are thrown by the wayside. Both are in ruins and so is his reputation.