Posted on 03/20/2015 11:36:31 AM PDT by E. Pluribus Unum
Michael Mann, you are being paged......
This applies to many things...like it’s OK to lie to the public if you’re a politician as long as it’s “unknowingly” done...
Better page Marie Harf too.
Unknowingly?
I doubt it.......................
Why is P-hacking the fault of scientists?
Seems more likely something that would be a direct result of Anthropomorphic Glowbull Warming.
How do we know this study wasn't tweaked to increase its chances of getting results that are easily published?
These are NOT scientists. They are just the more educated members of the “Gimme Dat” crowd pushing in for their piece of the federal pie.
Unknowingly? Oh, puleez. My Aunt Fanny.
That is not unknowingly.
That is outright fraud. That is why there are lies, damn lies and statistics.
You run the numbers several different ways and then pick out the most common result, the worst-case scenario, and the best-case scenario.
You present all three.
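For what it's worth, here is a minimal sketch of that idea in Python (NumPy assumed, made-up numbers, not anyone's real data): resample the measurements and report the typical value alongside the worst-case and best-case ends of the interval, instead of quoting whichever single number looks best.

import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(5.0, 1.5, 40)                  # hypothetical measurements

# resample the data many times and collect the mean of each resample
boots = np.array([rng.choice(data, size=data.size, replace=True).mean()
                  for _ in range(10_000)])
worst, typical, best = np.percentile(boots, [2.5, 50.0, 97.5])
print(f"worst case {worst:.2f}, typical {typical:.2f}, best case {best:.2f}")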
I’m not so certain about this.
In my field, results are usually validated with either a Student’s t-test or an ANOVA.
You can set up an experiment with all of the proper controls, and when you graph the results at the end, they look wonderful. The graph bars are different heights, the standard deviations are fairly small. But then, you do the t-test and get a p-value of 0.051... which just barely misses the threshold. A repeat of the experiment gives a p-value of 0.048. Another repeat gives a p-value of 0.050.
An honest scientist would report that as seeing a difference that was not statistically significant. In other words, more study needs to be done to answer the question.
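To make that concrete, here is a minimal sketch in Python (SciPy assumed, simulated numbers, not real data): a modest true difference measured with noisy samples gives p-values that bounce around near 0.05 from one repeat of the experiment to the next.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
for repeat in (1, 2, 3):
    control   = rng.normal(100.0, 15.0, 20)    # hypothetical control group
    treatment = rng.normal(110.0, 15.0, 20)    # hypothetical treated group
    p = stats.ttest_ind(control, treatment).pvalue
    print(f"repeat {repeat}: p = {p:.3f}")      # p varies noticeably between repeats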
When you have the option of designing a study so that the only variable in the study is the one you are trying to manipulate in order to test the hypothesis, you can set the validation standards quite high.
In some studies, such as drug studies in large populations, it becomes quite difficult to discern the effects of the drug versus other variables that cannot be controlled. Human beings are notoriously difficult to standardize, and we can’t establish a population of genetically identical humans for research the way we can with mice or rabbits. So, then, it takes some really heavy-duty statistics to make sense of the data, and different statistical tests can give different statistical significance.
I think the issue is not so much bias, but the difficulty of interpreting study results in a highly variable background.
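One small illustration of that last point, sketched in Python with simulated skewed data (SciPy assumed, numbers are hypothetical): the very same two samples can land on different sides of the 0.05 line depending on which test you reach for.

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.lognormal(0.0, 1.0, 25)     # hypothetical group A (skewed data)
b = rng.lognormal(0.5, 1.0, 25)     # hypothetical group B (skewed data)

print(f"Welch t-test:  p = {stats.ttest_ind(a, b, equal_var=False).pvalue:.3f}")
print(f"Mann-Whitney:  p = {stats.mannwhitneyu(a, b, alternative='two-sided').pvalue:.3f}")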
My statistics and design prof referred to it as data massaging.
Statistics is one of the easiest branches of mathematics. However, even if you do all the calculations correctly, your answer can still be garbage. Statistics is the most misapplied branch of mathematics. Most scientists do not have a clue about properly designing a statistical experiment. It requires a significant amount of thought in selecting an appropriate P-value (nerds will get this pun). Often, the value is arbitrarily picked as p=0.05 (the most common p-value used). Rarely do scientists consider the ramifications of a type I or type II statistical error in their research.

Many scientists do not clearly understand statistics; they tend to mimic the statistics that they have seen in the past. I have had six different courses just in statistical design of experiments (5 A’s and 1 B) and I get it wrong sometimes, usually because I lack a proper understanding of the application. I have seen good scientists unknowingly get it wrong.
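As a rough illustration of the type I / type II trade-off that post describes, here is a minimal simulation sketch in Python (SciPy assumed, toy numbers): tightening the threshold from 0.05 to 0.01 lowers the false-positive rate, but it also lowers the power to detect a real effect.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_sims, n = 2000, 20

def significant_fraction(true_diff, alpha):
    """Fraction of simulated experiments declared significant at this alpha."""
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_diff, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits += 1
    return hits / n_sims

for alpha in (0.05, 0.01):
    type1 = significant_fraction(0.0, alpha)   # no real effect: false positives
    power = significant_fraction(0.8, alpha)   # real effect present: detections
    print(f"alpha={alpha}: type I rate ~ {type1:.3f}, power ~ {power:.3f}")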
“You can set up an experiment with all of the proper controls, and when you graph the results at the end, they look wonderful. The graph bars are different heights, the standard deviations are fairly small. But then, you do the t-test and get a p-value of 0.051... which just barely misses the threshold. A repeat of the experiment gives a p-value of 0.048. Another repeat gives a p-value of 0.050.”
My attitude has always been that if you need statistics to prove the results are significant, they probably aren’t. As we know, the p=0.05 standard is arbitrary: it means there is roughly a 1-in-20 chance of getting a result like that by chance alone even when there is no real effect. That’s hardly definitive.
There's an interesting history behind the choice of .05 as the threshold of statistical significance. Before the advent of electronic computers, tables of statistics were calculated, literally, by hand, possibly using a mechanical calculator for the actual arithmetic. Most published tables, for no reason better than tradition, included a column of the 5% value of the distribution. If a researcher didn't want to compute a new set of tables for himself, it was convenient to use the published tables, and choose 5% as the cutoff because that value was in the table. Now it's possible to compute the actual probability of the result you obtained, rather than just observing whether it's over or under the 5% limit. However, few researchers bother to do that.
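For the curious, a minimal sketch of what that looks like today in Python (SciPy assumed, hypothetical numbers): the exact probability for an observed t statistic is one function call, no table required.

from scipy import stats

t_observed, dof = 2.10, 18                        # hypothetical t statistic and degrees of freedom
p_exact = 2 * stats.t.sf(abs(t_observed), dof)    # exact two-sided p-value
print(f"p = {p_exact:.4f}")                       # instead of just "under or over the 5% column"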
Model selection, or even the choice of nonparametric modeling, is also not cut and dried. People can have honest differences of opinions on these, although I take the article to say that there is a bias in favor of justifying models that get you published (e.g., Hey, look, p=.048 under this model!!).
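A quick simulation sketch of that bias, in Python with SciPy (pure-noise data, all of the analysis choices hypothetical): if you try several defensible analyses and report only the most favorable one, the false-positive rate drifts above the nominal 5%.

import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
n_sims, n = 2000, 30
false_hits = 0
for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, n)       # no real difference between the groups
    b = rng.normal(0.0, 1.0, n)
    candidates = [
        stats.ttest_ind(a, b).pvalue,                                     # pooled t-test
        stats.ttest_ind(a, b, equal_var=False).pvalue,                    # Welch t-test
        stats.mannwhitneyu(a, b, alternative="two-sided").pvalue,         # rank-based test
        stats.ttest_ind(a[np.abs(a) < 2], b[np.abs(b) < 2]).pvalue,       # "trim the outliers"
    ]
    if min(candidates) < 0.05:        # keep whichever analysis looks best
        false_hits += 1
print(f"false-positive rate with pick-the-best: {false_hits / n_sims:.3f}")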
First thing my stats prof in undergrad said was “Figures lie and liars figure.” The second thing he said is that “if you have a pre-conceived notion, you can prove anything you want with statistics.”
In grad school, my stats class proved to me that most people who took undergrad stats really don’t know how to properly build a statistical analysis. This is now being exacerbated by “Visualization Technology,” where raw data goes in and, instead of creating data outputs, everything is visualized into some creative graph that lets people who don’t have a clue what they are looking at go “ooh” and “ahh” and “oh, it’s obvious that (fill in the blank) is happening.” Visualizations are stats for those who don’t understand stats.
I have a copy of this book from the early 50's. It's still in print!...............
One of the other things that people need to be aware of is that the conclusions can be the direct opposite of the data.
I wish I could find it, but there was a medical study published that had some inflammatory conclusion a few years back. I was curious enough to actually read the damn thing.
To my horror, I found that the data said the exact opposite of the conclusion. There was a paragraph within the study that dismissed the researchers’ own data. They went on about why their own data was supposedly wrong and then proceeded to the conclusion anyway.
It kills me that I can’t find it.
The peer review process has totally broken down. There’s too much trust, not enough confirmation, and not enough rigorous examination. This is how we end up with decades of ‘don’t eat cholesterol’ being foisted onto people by their doctors and public policy.
In the field of biochemistry, we cannot get published without showing significance through the P value. Typically, we repeat identical experiments three times, with P < 0.05 each time, before we even accept our result.
I know that the P value is somewhat arbitrary, but it is a good tool for discarding results that are absolute junk. The cut-off has to be somewhere.
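The arithmetic behind that three-repeats rule is worth spelling out (a tiny sketch, using the poster's stated 0.05 cutoff and assuming the repeats are independent): if there were truly no effect, the chance that all three repeats clear p < 0.05 is about 1 in 8,000.

alpha = 0.05
chance_all_three_by_luck = alpha ** 3      # three independent repeats, all under the cutoff
print(chance_all_three_by_luck)            # 0.000125, roughly 1 in 8,000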