Posted on 11/28/2017 1:27:41 PM PST by nickcarraway
In The Signal and the Noise, Nate Silver admits the truth. BIAS.
Statistics are based on a population, a sample, on collected data. There is bias in which data to collect and which data to ignore. There is bias in the weight given to each piece of data collected. There is bias in refusing to admit/recognize the bias. There is bias in refusing to admit what you do not know. There is bias in refusing to admit that you don’t know what you don’t know.
Then there is bias in believing the data. A famous Artificial Intelligence company did a study of immunizations. They believed in advance that immunizations were useful. When accurate math did not support their preconceived belief, they adjusted the denominator to make it fit. They did not do this to intentionally lie. They did it because they knew that the correct answer could not possibly be correct because everybody knew immunizations were good.
They then recommended more immunizations based on their failure of fifth-grade math.
Their original math was correct, but their understanding of the raw data was seriously flawed. Sick people go to the doctor more often than healthy people. When people go to the doctor, the doctor always pushes a flu shot or whatever immunization is available. So invariably sick people get more shots than healthy people. Naturally, sick people come down with the thing they got a shot for more often than healthy people do, despite the shot.
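For anyone who wants to see the doctor-visit effect in numbers, here is a rough sketch in Python. Everything in it is made up for illustration and none of it comes from the study mentioned above: the 30% share of frail people, the chance of getting a shot, the flu risks, and the assumption that the shot halves the risk for everyone. The names `simulate` and `flu_rate` are hypothetical helpers.

```python
import random


def simulate(n=100_000, seed=0):
    """Return tallies[(frail, vaccinated)] = [people, flu_cases].

    All probabilities below are invented for illustration only.
    """
    rng = random.Random(seed)
    tallies = {(f, v): [0, 0] for f in (True, False) for v in (True, False)}
    for _ in range(n):
        # Assumption: 30% of people are frail and see the doctor often.
        frail = rng.random() < 0.30
        # Assumption: frequent doctor visits mean a much higher chance of a shot.
        vaccinated = rng.random() < (0.80 if frail else 0.20)
        # Assumption: frail people catch the flu more often to begin with,
        # and the shot halves the risk for everyone.
        base_risk = 0.40 if frail else 0.05
        risk = base_risk * (0.5 if vaccinated else 1.0)
        sick = rng.random() < risk
        tallies[(frail, vaccinated)][0] += 1
        tallies[(frail, vaccinated)][1] += int(sick)
    return tallies


def flu_rate(tallies, vaccinated, frail=None):
    """Flu rate for one vaccination status, optionally within one health group."""
    keys = [k for k in tallies
            if k[1] == vaccinated and (frail is None or k[0] == frail)]
    people = sum(tallies[k][0] for k in keys)
    cases = sum(tallies[k][1] for k in keys)
    return cases / people


tallies = simulate()
print("everyone: shot %.3f, no shot %.3f"
      % (flu_rate(tallies, True), flu_rate(tallies, False)))
print("frail:    shot %.3f, no shot %.3f"
      % (flu_rate(tallies, True, frail=True), flu_rate(tallies, False, frail=True)))
print("healthy:  shot %.3f, no shot %.3f"
      % (flu_rate(tallies, True, frail=False), flu_rate(tallies, False, frail=False)))
```

With these invented parameters the lumped-together flu rates work out to roughly 13.6% for people who got the shot versus 8.4% for those who did not, even though the shot cuts the risk in half inside each group. The raw comparison mostly measures who goes to the doctor, which is the point being made about misreading the raw data.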
But highly paid AI gurus with PhDs don’t know what your uncle knows.
They drink objectivity from a chalice sent by Congress.
Beat me to it. That’s my favorite stats book of all time. Unfortunately academia and the media treat it as a guidebook rather than a warning.
I’m guessing that the misinformation created by either deliberate or ignorant misuse of statistics is greater in volume than the accurate information produced.
Academia has had decades and decades of experience manipulating data in order to get government grants.
Their expertise in this field is nonpareil...
I think the most common thing I am seeing is applying statistical analysis to a dataset, then applying statistical tests to the processed data rather than the original dataset. Of course it is going to have a positive result for whatever you are trying to prove or disprove.
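One reading of this is what is sometimes called circular analysis or "double dipping": the data picks the variables, then the same data is used to test them. Below is a rough sketch in Python using pure random noise, so any "effect" it finds is manufactured by the processing. The group sizes, the number of features, and the helper names (`make_noise`, `welch_t`, `double_dip`) are all hypothetical choices for illustration.

```python
import random
import statistics


def make_noise(rng, n_subjects, n_features):
    """Rows are subjects, columns are features; every value is pure noise."""
    return [[rng.gauss(0, 1) for _ in range(n_features)]
            for _ in range(n_subjects)]


def column_means(data):
    return [statistics.fmean(col) for col in zip(*data)]


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def double_dip(seed=0, n_subjects=30, n_features=500, keep=10):
    rng = random.Random(seed)
    group_a = make_noise(rng, n_subjects, n_features)
    group_b = make_noise(rng, n_subjects, n_features)

    # "Processing" step: keep the features where A happens to beat B the most.
    diffs = [ma - mb for ma, mb in zip(column_means(group_a), column_means(group_b))]
    chosen = sorted(range(n_features), key=diffs.__getitem__, reverse=True)[:keep]

    def score(data):
        # Per-subject average over the chosen features.
        return [statistics.fmean([row[j] for j in chosen]) for row in data]

    # Test on the same data that was used to choose the features.
    t_same = welch_t(score(group_a), score(group_b))

    # Test the same chosen features on fresh noise that played no part in choosing.
    fresh_a = make_noise(rng, n_subjects, n_features)
    fresh_b = make_noise(rng, n_subjects, n_features)
    t_fresh = welch_t(score(fresh_a), score(fresh_b))
    return t_same, t_fresh


t_same, t_fresh = double_dip()
print("t on the data used for selection: %.2f" % t_same)
print("t on fresh data:                  %.2f" % t_fresh)
```

There is no real difference between the groups, yet the test on the processed data should come out strongly positive, while the same features checked on fresh noise should hover around zero. Holding back data that played no part in the processing is one common guard against this.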