Two things stand out in the article.
> the problem goes away if you run the model in single-threaded mode
In other words, multiple threads are running and each is accessing the same memory without synchronization, corrupting the numbers. When run as designed, NOTHING valid can come from this program.
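To make that concrete, here is a minimal sketch of the kind of bug I mean. It is not the model's code (the article doesn't show it, and I'm using Python just for illustration); the names `run`, `state` and `EXPECTED` are made up. Several threads do a read-modify-write on a shared total, first with no synchronization and then with a lock.

```python
import threading
import time

def run(n_threads, n_increments, lock=None):
    """Have several threads each add 1 to a shared total n_increments times."""
    state = {"total": 0}

    def worker():
        for _ in range(n_increments):
            if lock is not None:
                # Synchronized: the read-modify-write happens as one unit.
                with lock:
                    state["total"] += 1
            else:
                # Unsynchronized: another thread can run between the read
                # and the write, so the value written back may be stale.
                current = state["total"]
                time.sleep(0)  # yield, to make the interleaving likely
                state["total"] = current + 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["total"]

EXPECTED = 4 * 2000
unsafe_total = run(4, 2000)                       # usually loses updates
safe_total = run(4, 2000, lock=threading.Lock())  # always EXPECTED

print("expected:", EXPECTED)
print("without lock:", unsafe_total)
print("with lock:", safe_total)
```

The unlocked total typically comes out short and changes from run to run, which is the "random results" symptom. The locked version gives the same answer every time.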
> Reports of random results are dismissed with responses like thats not a problem, just run it a lot of times and take the average,
Averaging results is NOT for dealing with bugs. It makes sense to use random sampling and average the results rather than run through every possibility, as you would when working out the odds in gambling, for example. That works because the randomness is put in deliberately and follows a known distribution. Here they are corrupting the data in unpredictable ways and expecting it to somehow average out.
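A toy sketch of the difference, using the gambling case: estimate the average roll of a fair die by sampling. The "bug" here is my own assumption, chosen to be simple: some fraction of rolls gets overwritten with 0, as a lost write might do. Real memory corruption is not guaranteed to be even this well behaved. `average_roll` and `corrupt_prob` are hypothetical names.

```python
import random

def average_roll(n, corrupt_prob=0.0, seed=1):
    """Average of n simulated die rolls.

    corrupt_prob is the (assumed) chance that a roll is overwritten with 0,
    standing in for a value clobbered by a bug.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        roll = rng.randint(1, 6)         # legitimate randomness: fair die
        if rng.random() < corrupt_prob:  # illegitimate: corrupted value
            roll = 0
        total += roll
    return total / n

TRUE_MEAN = 3.5  # (1 + 2 + 3 + 4 + 5 + 6) / 6

clean = average_roll(200_000)
buggy = average_roll(200_000, corrupt_prob=0.1)

print("true mean:", TRUE_MEAN)
print("clean average:", clean)
print("buggy average:", buggy)
```

With more samples the clean average settles near 3.5. The buggy one also settles, but near 0.9 * 3.5 = 3.15, so running it more times just makes you more confident in the wrong number.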
This is a multi-trillion-dollar software problem.
Also... the problems exposed by the article are about the implementation. The “expert” assumptions and formulas that went into the model could be a whole other layer of problems.