Posted on 03/20/2015 11:36:31 AM PDT by E. Pluribus Unum
Dr Megan Head in her evolutionary biology lab at the Research School of Biology.
A new study has found some scientists are unknowingly tweaking experiments and analysis methods to increase their chances of getting results that are easily published.
The study conducted by ANU scientists is the most comprehensive investigation into a type of publication bias called p-hacking.
P-hacking happens when researchers either consciously or unconsciously analyse their data multiple times or in multiple ways until they get a desired result. If p-hacking is common, the exaggerated results could lead to misleading conclusions, even when evidence comes from multiple studies.
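As a rough illustration of the mechanism (a simulation sketched for this article, not taken from the study itself), measuring several outcome variables and reporting only whichever one comes out significant inflates the false-positive rate well past the nominal 5 per cent, even when no real effect exists:

import numpy as np
from scipy import stats

# Simulate experiments where the null is true (no real effect), but several
# outcomes are tested and only the smallest p-value is kept.
rng = np.random.default_rng(0)
n_experiments, n_per_group, n_analyses = 5000, 30, 5

planned_hits = 0   # significant on the single pre-planned outcome
hacked_hits = 0    # significant on at least one of the outcomes

for _ in range(n_experiments):
    control = rng.normal(size=(n_analyses, n_per_group))
    treated = rng.normal(size=(n_analyses, n_per_group))
    pvals = [stats.ttest_ind(control[i], treated[i]).pvalue
             for i in range(n_analyses)]
    planned_hits += pvals[0] < 0.05
    hacked_hits += min(pvals) < 0.05

print("pre-planned analysis :", planned_hits / n_experiments)   # about 0.05
print("best of five analyses:", hacked_hits / n_experiments)    # roughly 0.2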
"We found evidence that p-hacking is happening throughout the life sciences," said lead author Dr Megan Head from the ANU Research School of Biology.
The study used text mining to extract p-values - numbers indicating how likely it is that a result at least as extreme would occur by chance alone - from more than 100,000 research papers published around the world, spanning many scientific disciplines, including medicine, biology and psychology.
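The extraction step itself is conceptually simple; a minimal sketch (not the authors' actual code) might pull reported p-values out of paper text with a regular expression:

import re

# Match expressions such as "P = 0.048", "p < 0.05" or "p > 0.10"
P_VALUE_RE = re.compile(r"p\s*([<=>])\s*(0?\.\d+)", re.IGNORECASE)

def extract_p_values(text):
    """Return (comparator, value) pairs, e.g. ('=', 0.048)."""
    return [(op, float(val)) for op, val in P_VALUE_RE.findall(text)]

sample = ("The treatment effect was significant (t = 2.1, P = 0.048), "
          "unlike the interaction term (p > 0.10).")
print(extract_p_values(sample))   # [('=', 0.048), ('>', 0.1)]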
"Many researchers are not aware that certain methods could make some results seem more important than they are. They are just genuinely excited about finding something new and interesting," Dr Head said.
"I think that pressure to publish is one factor driving this bias. As scientists we are judged by how many publications we have and the quality of the scientific journals they go in.
"Journals, especially the top journals, are more likely to publish experiments with new, interesting results, creating incentive to produce results on demand."
Dr Head said the study found a high number of p-values that only just cleared the traditional threshold most scientists use to call a result statistically significant.
"This suggests that some scientists adjust their experimental design, datasets or statistical methods until they get a result that crosses the significance threshold," she said.
"They might look at their results before an experiment is finished, or explore their data with lots of different statistical methods, without realising that this can lead to bias."
The concern with p-hacking is that it could get in the way of forming accurate scientific conclusions, even when scientists review the evidence by combining results from multiple studies.
For example, if some studies show a particular drug is effective in treating hypertension, but other studies find it is not effective, scientists would analyse all the data to reach an overall conclusion. But if enough results have been p-hacked, the drug would look more effective than it is.
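One common way that pooling is done is an inverse-variance meta-analysis; in a minimal sketch (hypothetical numbers, not from any real trial), each study contributes an effect estimate weighted by its precision, so a handful of p-hacked, inflated estimates drag the pooled answer with them:

import numpy as np

# (effect estimate, standard error) for hypothetical hypertension trials,
# e.g. change in systolic blood pressure in mmHg
studies = [(-4.0, 2.0), (-1.0, 1.5), (-5.5, 2.5), (0.5, 1.8)]

effects = np.array([e for e, _ in studies])
weights = 1.0 / np.array([se for _, se in studies]) ** 2   # inverse-variance weights

pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))
print(f"pooled effect: {pooled:.2f} mmHg (SE {pooled_se:.2f})")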
"We looked at the likelihood of this bias occurring in our own specialty, evolutionary biology, and although p-hacking was happening it wasn't common enough to drastically alter general conclusions that could be made from the research," she said.
"But greater awareness of p-hacking and its dangers is important because the implications of p-hacking may be different depending on the question you are asking."
The research is published in PLOS Biology.
Michael Mann, you are being paged......
This applies to many things...like it’s OK to lie to the public if you’re a politician as long as it’s “unknowingly” done...
Better page Marie Harf too.
Unknowingly?
I doubt it.......................
Why is P-hacking the fault of scientists?
Seems more likely something that would be directly caused as a result of Anthropomorphic Glowbull Warming.
How do we know this study wasn't tweaked to increase its chances of getting results that are easily published?
These are NOT scientists. They are just the more educated members of the “Gimme Dat” crowd pushing in for their piece of the federal pie.
Unknowingly? Oh, puleez. My Aunt Fanny.
That is not unknowingly.
That is outright fraud. That is why there are lies, damn lies and statistics.
You run the numbers several different ways and then pick out the most common result, the worst-case scenario and the best-case scenario.
You present all three.
I’m not so certain about this.
In my field, results are usually validated with either a Student's t-test or an ANOVA.
You can set up an experiment with all of the proper controls, and when you graph the results at the end, they look wonderful. The graph bars are different heights, the standard deviations are fairly small. But then, you do the t-test and get a p-value of 0.051... which just barely misses the threshold. A repeat of the experiment gives a p-value of 0.048. Another repeat gives a p-value of 0.050.
An honest scientist would report that as seeing a difference that was not statistically significant. In other words, more study needs to be done to answer the question.
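For anyone who wants to see what that borderline case looks like in practice, here is a quick sketch (made-up numbers, not my actual data):

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Two groups with a modest real difference; with samples this small the
# Student's t-test often lands right around the 0.05 boundary.
control = rng.normal(loc=10.0, scale=2.0, size=12)
treated = rng.normal(loc=11.6, scale=2.0, size=12)

t_stat, p_value = stats.ttest_ind(control, treated)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# Whether a given run comes out at 0.048 or 0.051, the honest report is the
# same: borderline evidence that needs replication, not a firm conclusion.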
When you have the option of designing a study so that the only variable in the study is the one you are trying to manipulate in order to test the hypothesis, you can set the validation standards quite high.
In some studies, such as drug studies in large populations, it becomes quite difficult to discern the effects of the drug versus other variables that cannot be controlled. Human beings are notoriously difficult to standardize, and we can’t establish a population of genetically identical humans for research the way we can with mice or rabbits. So, then, it takes some really heavy-duty statistics to make sense of the data, and different statistical tests can give different statistical significance.
I think the issue is not so much bias, but the difficulty of interpreting study results in a highly variable background.
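To make that last point concrete, the same messy data can land on different sides of the 0.05 line depending on which test you reach for (made-up skewed data below):

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Skewed, outlier-prone responses, typical of uncontrolled human data
group_a = rng.lognormal(mean=0.0, sigma=0.8, size=25)
group_b = rng.lognormal(mean=0.5, sigma=0.8, size=25)

print("Student's t :", stats.ttest_ind(group_a, group_b).pvalue)
print("Welch's t   :", stats.ttest_ind(group_a, group_b, equal_var=False).pvalue)
print("Mann-Whitney:", stats.mannwhitneyu(group_a, group_b).pvalue)
# None of these choices is dishonest by itself; the bias creeps in when the
# test is chosen after seeing which p-value looks best.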
My statistics and design prof referred to it as data massaging.
Statistics is one of the easiest branches of mathematics. However, even if you do all the calculations correctly, your answer can still be garbage. Statistics is also the most misapplied branch of mathematics. Most scientists do not have a clue about properly designing a statistical experiment. It requires a significant amount of thought in selecting an appropriate P-value (nerds will get this pun). Often the value is arbitrarily picked as p = 0.05 (the most common p-value used). Rarely do scientists consider the ramifications of a Type I or Type II statistical error in their research. Many scientists do not clearly understand statistics; they tend to mimic the statistics that they have seen in the past. I have had six different courses just in statistical design of experiments (5 A's and 1 B) and I still get it wrong sometimes, usually because I lack a proper understanding of the application. I have seen good scientists unknowingly get it wrong.
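A quick simulation of what Type I and Type II errors actually mean in practice (my own toy numbers):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, n, effect, runs = 0.05, 20, 0.5, 4000

type1 = type2 = 0
for _ in range(runs):
    # Null true: groups identical, so any "significant" result is a Type I error
    a, b = rng.normal(size=n), rng.normal(size=n)
    type1 += stats.ttest_ind(a, b).pvalue < alpha

    # Real effect of 0.5 SD present, so a non-significant result is a Type II error
    c, d = rng.normal(size=n), rng.normal(loc=effect, size=n)
    type2 += stats.ttest_ind(c, d).pvalue >= alpha

print("Type I rate :", type1 / runs)   # sits near alpha by construction
print("Type II rate:", type2 / runs)   # driven by sample size and effect size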
“You can set up an experiment with all of the proper controls, and when you graph the results at the end, they look wonderful. The graph bars are different heights, the standard deviations are fairly small. But then, you do the t-test and get a p-value of 0.051... which just barely misses the threshold. A repeat of the experiment gives a p-value of 0.048. Another repeat gives a p-value of 0.050.”
My attitude has always been that if you need statistics to prove the results are significant, they probably aren't. As we know, the p = 0.05 standard is an arbitrary cutoff indicating there is roughly a 1-in-20 chance of seeing a result like this by chance alone when no real effect exists. That's hardly definitive.
There's an interesting history behind the choice of 0.05 as the threshold of statistical significance. Before the advent of electronic computers, tables of statistics were calculated, literally, by hand, possibly using a mechanical calculator for the actual arithmetic. Most published tables, for no reason better than tradition, included a column for the 5% value of the distribution. If a researcher didn't want to compute a new set of tables for himself, it was convenient to use the published tables and choose 5% as the cutoff, because that value was in the table. Now it's possible to compute the actual probability of the result you obtained, rather than just observing whether it's over or under the 5% limit. However, few researchers bother to do that.
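Computing the actual probability is a one-liner these days (a generic example, not tied to any particular study in this thread):

from scipy import stats

t_statistic, df = 2.10, 18          # hypothetical observed t value and degrees of freedom
p_exact = 2 * stats.t.sf(abs(t_statistic), df)   # exact two-sided tail probability
print(f"p = {p_exact:.4f}")         # report this number, not just "p < 0.05" or "n.s."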
Model selection, or even the choice of nonparametric modeling, is also not cut and dried. People can have honest differences of opinion on these, although I take the article to say that there is a bias in favor of justifying whichever model gets you published (e.g., Hey, look, p = .048 under this model!!).
First thing my stats prof in undergrad said was “Figures lie and liars figure.” The second thing he said is that “if you have a pre-conceived notion, you can prove anything you want with statistics.”
In grad school, my stats class proved to me that most people who took undergrad stats really don't know how to properly apply statistics. This is now being exacerbated by "Visualization Technology," where raw data goes in and, instead of plain data outputs, everything is visualized into some creative graph that lets people who don't have a clue what they are looking at go "ooh" and "ahh" and "oh, it's obvious that (fill in the blank) is happening." Visualizations are stats for those who don't understand stats.
I have a copy of this book from the early 50's. It's still in print!...............
One of the other things that people need to be aware of is that the conclusions can be the direct opposite of the data.
I wish I could find it, but there was a medical study published that had some inflammatory conclusion a few years back. I was curious enough to actually read the damn thing.
To my horror, I found that the data said the exact opposite of the conclusion. There was a paragraph within the study that dismissed the researchers' own data: they explained away why their data was supposedly wrong and pressed on to the conclusion anyway.
It kills me that I can’t find it.
The peer review process has totally broken down. There’s too much trust, not enough confirmation, and not enough rigorous examination. This is how we end up with decades of ‘don’t eat cholesterol’ being foisted onto people by their doctors and public policy.
In the field of biochemistry, we cannot get published without showing significance through the P value. Typically, we repeat identical experiments three times, with the P < 0.05 each time, before we even accept our result.
I know that the P value is somewhat arbitrary, but it is a good tool for discarding results that are absolute junk. The cut-off has to be somewhere.
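If you want to fold the three replicates into a single number, Fisher's method is one standard way to combine independent p-values (not saying that's what our field requires, just an illustration):

from scipy import stats

replicate_pvalues = [0.048, 0.032, 0.041]   # hypothetical replicate results
chi2_stat, combined_p = stats.combine_pvalues(replicate_pvalues, method="fisher")
print(f"combined p = {combined_p:.4f}")     # far smaller than any single replicate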