Free Republic

How to Lie with P-values
Data Science Central ^ | 6-11-19 | Vincent Granville

Posted on 02/07/2020 3:15:54 PM PST by spintreebob

P-values are used in statistics and scientific publications, much less so in machine learning applications where re-sampling techniques are favored and easy to implement today thanks to modern computing power. In some sense, p-values are a relic from old times, when computing power was limited and mathematical / theoretical formulas were favored and easier to deal with than lengthy computations.

Recently, p-values have been criticized and even banned by some journals, because researchers misuse them: they cherry-pick observations and repeat experiments until they obtain a p-value worth publishing, whether to win grant money, get tenure, or for political reasons. Even the American Statistical Association wrote a long article about why to avoid p-values, and what you should do instead: see here. For data scientists, obvious alternatives include re-sampling techniques: see here and here. One advantage is that they are model-independent, data-driven, and easy to understand.

Here we explain how the manipulation and treachery works, using a simple simulated data set consisting of purely random, non-correlated observations. Using p-values, you can tell anything you want about the data, even the fact that the features are highly correlated, when they are not. The data set consists of 16 variables and 30 observations, generated using the RAND function in Excel. You can download the spreadsheet here.
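A rough Python sketch of the kind of experiment described above, not the author's spreadsheet. It assumes `random.random()` as a stand-in for Excel's RAND, and it flags a pair as "significant" when |r| exceeds the two-sided 5% cutoff for 30 observations (t = 2.048 for 28 degrees of freedom, taken from a standard t table). The function names are made up for illustration.

```python
import math
import random
from itertools import combinations


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spurious_correlations(n_vars=16, n_obs=30, t_crit=2.048, seed=0):
    """Generate independent uniform noise and list the 'significant' pairs.

    t_crit is the two-sided 5% critical t value for n_obs - 2 = 28 degrees
    of freedom; it must be changed by hand if n_obs changes.
    """
    rng = random.Random(seed)
    data = [[rng.random() for _ in range(n_obs)] for _ in range(n_vars)]
    df = n_obs - 2
    # Convert the t cutoff into the equivalent cutoff on |r|.
    r_crit = t_crit / math.sqrt(t_crit ** 2 + df)
    results = [(i, j, pearson_r(data[i], data[j]))
               for i, j in combinations(range(n_vars), 2)]
    significant = [row for row in results if abs(row[2]) > r_crit]
    return results, significant, r_crit


if __name__ == "__main__":
    results, significant, r_crit = spurious_correlations()
    print(f"{len(results)} pairs tested, cutoff |r| > {r_crit:.3f}")
    print(f"{len(significant)} pairs look 'significant' in pure noise")
```

With 16 variables there are 120 pairs to test, so at a 5% cutoff roughly 6 of them should pass by chance alone even though nothing is related to anything.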

(Excerpt) Read more at datasciencecentral.com ...


TOPICS: Business/Economy; Crime/Corruption; Culture/Society; Philosophy
KEYWORDS: cause; correlation; lie; statistics
To: Repeal The 17th

“Will you marry me?”

Sorry, someone else beat you to it! Been married 11 years now!


21 posted on 02/07/2020 4:34:21 PM PST by MeganC (There is nothing feminine about feminism.)
[ Post Reply | Private Reply | To 19 | View Replies]

To: cpdiii
There is nothing wrong with P Values. If you cherry pick the data all the results are crap. It is no longer valid data.

You don't need to cherry pick; 100 researchers do the same experiment (does drug A cause weight loss?). 98 of them find no effect, which is not a publication-worthy result, so they don't get published, but the two who do find an effect (say p = 0.03 and p = 0.04) publish their results and now we have a problem.
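Here is a small Python sketch of that scenario, under assumptions of my own: every lab compares two groups drawn from the same normal distribution (so the drug truly does nothing), the standard deviation is treated as known and equal to 1, and the test is a simple two-sided z-test. The names `run_null_study` and `simulate_labs` are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def run_null_study(rng, n_per_group=50):
    """One lab's experiment when the drug has no effect at all."""
    control = [rng.gauss(0, 1) for _ in range(n_per_group)]
    treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
    diff = sum(treated) / n_per_group - sum(control) / n_per_group
    # Standard error of a difference of two means with known sigma = 1.
    z = diff / math.sqrt(2 / n_per_group)
    return two_sided_p(z)


def simulate_labs(n_labs=100, alpha=0.05, seed=1):
    """Return every lab's p value and the subset that would get published."""
    rng = random.Random(seed)
    p_values = [run_null_study(rng) for _ in range(n_labs)]
    published = [p for p in p_values if p < alpha]
    return p_values, published


if __name__ == "__main__":
    p_values, published = simulate_labs()
    print(f"{len(published)} of {len(p_values)} labs got p < 0.05 with no real effect")
```

On average about 5 labs in 100 will clear the 0.05 bar here, and if only those get into print the literature shows an effect that does not exist.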

22 posted on 02/07/2020 4:36:53 PM PST by LambSlave
[ Post Reply | Private Reply | To 9 | View Replies]

To: MeganC

Yeah,
I’ve been married for 44 years, myself.
Probably would not have worked out anyway...


23 posted on 02/07/2020 4:40:32 PM PST by Repeal The 17th (Get out of the matrix and get a real life)
[ Post Reply | Private Reply | To 21 | View Replies]

To: cpdiii
In some sense, p-values are a relic from old times, when computing power was limited and mathematical / theoretical formulas were favored and easier to deal with than lengthy computations.

And THAT is where the writer surrenders his position.

As stated, correctly, there is nothing wrong with p-values...just like there is nothing wrong with guns. It is when p-values are put in the hands of unscrupulous people with the intent to do harm that bad things can happen. Furthermore, models that have other problems, like multicollinearity, can lead to inefficient parameter estimates and problematic p-values.

However, this is all Stat 101 stuff. The REAL intent behind this article is to subtly paint statistics in a bad light and puff up the data science/machine learning professionals. Which is sad, since Dr. Granville has good credentials.


24 posted on 02/07/2020 4:46:42 PM PST by DoodleBob (Gravity's waiting period is about 9.8 m/s^2)
[ Post Reply | Private Reply | To 9 | View Replies]

To: Repeal The 17th

OMG you fool! It’s a trap!


25 posted on 02/07/2020 5:31:44 PM PST by Secret Agent Man (Gone Galt; Not Averse to Going Bronson.)
[ Post Reply | Private Reply | To 19 | View Replies]

To: Secret Agent Man; MeganC

Well, she turned me down.
...and my wife would have killed me anyway!
...it was all in fun.


26 posted on 02/07/2020 5:35:01 PM PST by Repeal The 17th (Get out of the matrix and get a real life)
[ Post Reply | Private Reply | To 25 | View Replies]

To: spintreebob

As others have said, there’s absolutely nothing wrong with p values. They are a tool for reporting the significance of an experimental result, but like most tools, they can be misused. This is especially true if researchers don’t really understand what they are and how to use them, as seems to be the case in many areas of research.

For the unfamiliar, suppose you are conducting an experiment such as a drug trial. You would split your subjects into two groups, a control group that does not receive the drug and a treatment group that does. You then compare the two groups on some measure to see if that measure is different between the groups. All that sounds simple enough, but the problem is that even if you compared two groups receiving no treatment, they’d never give EXACTLY the same measurements. There would always be some differences. The big question then is how different the results have to be before we can really claim that the drug did something.

That’s where p values come in. A p value is a measure of probability (hence the “p”) that tells you how likely it is that sampling two groups of people at random and performing the measurement on those random groups would give a difference at least as large as the difference that you observed between your control and treatment groups. If that probability is relatively large, then you probably haven’t observed a real difference. You would be likely to observe a similar difference between ANY two groups of subjects, even absent any treatment. If that probability is small then MAYBE you found something.
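That definition can be turned into code almost word for word with a permutation test. This is a minimal sketch of one way to do it, not the only way to get a p value: it pools the measurements, reshuffles them into two random groups many times, and counts how often the random difference in means is at least as large as the observed one. The function name and the add-one correction are my own choices.

```python
import random


def permutation_p_value(control, treatment, n_perm=2000, seed=0):
    """Estimate how often random regrouping beats the observed difference."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        # Small tolerance so floating point noise does not drop exact ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    control = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3]
    treatment = [5.2, 5.0, 5.4, 4.7, 5.5, 5.1]
    print(permutation_p_value(control, treatment))
```

A large value means random regrouping produces differences like yours all the time; a small one means it rarely does.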

The value generally accepted in most scientific work is p=0.05. That is, results are not considered significant unless there’s less than a 5% chance that they could have occurred randomly. That should be considered to be a guideline, but too often it seems to be a set-in-stone benchmark instead. There are many instances where p<=0.05 is completely inappropriate. As a simple example, suppose instead of one drug, you are testing 20 drugs as potential treatments for some condition. If you blindly use p<=0.05 as your cutoff, you would very likely be publishing a positive result when all you really found was a random difference: with 20 independent tests and no real effects, the chance of at least one p<=0.05 is 1 - 0.95^20, or about 64%. That should be obvious: if you roll a 20-sided die 20 times you wouldn’t be all that shocked if you rolled a 1 on one of your rolls. You wouldn’t conclude your die was somehow loaded based on that observation.
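To put a number on the 20-drug and 20-sided-die examples, here is a short Python sketch. It assumes the 20 tests are independent and that none of the drugs does anything; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # 20 useless drugs, each tested at the 0.05 cutoff.
    print(f"20 drugs at p<=0.05: {familywise_error_rate(0.05, 20):.3f}")
    # Same arithmetic as rolling at least one 1 in 20 rolls of a d20.
    print(f"20 rolls of a d20:   {familywise_error_rate(1 / 20, 20):.3f}")
```

Both lines come out the same because a 5% cutoff per test is exactly a 1-in-20 chance per roll.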

Similarly, you should NOT automatically just publish a positive result in such a situation. A much lower p value should be used in such a case. Particle physicists, for example, do just this when they search for new particles. They recognize that they are looking for a particle at a wide range of energies, essentially conducting a large number of experiments simultaneously. Therefore they require a p value on the order of 10^-7 (the “five sigma” standard) in order to announce a discovery.
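One common way to pick that "much lower p value" is the Bonferroni correction, which simply divides the cutoff by the number of tests. The post does not name a method, so treat this Python sketch as one option among several; the function names are made up.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test cutoff that keeps the overall false positive rate near alpha."""
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p values survive the corrected cutoff."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [p <= cutoff for p in p_values]


if __name__ == "__main__":
    # Hypothetical p values for 20 drugs: one at 0.03, the rest unremarkable.
    p_values = [0.03] + [0.5] * 19
    print(bonferroni_threshold(0.05, len(p_values)))
    print(any(bonferroni_significant(p_values)))
```

With 20 drugs the cutoff drops from 0.05 to 0.0025, so a lone p of 0.03 no longer counts as a finding.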


27 posted on 02/07/2020 6:13:26 PM PST by stremba
[ Post Reply | Private Reply | To 1 | View Replies]

To: LambSlave

That is correct...also..they will test drug A at a dose which demonstrates a good p value...but since the drug is tested on basically young healthier people the doses approved are sometimes too much for smaller...sicker...older people. Which is why some older patients may get beneficial effects from smaller than recommended doses or may have more adverse effects using the recommended doses. Gotta be careful when dosing geriatric patients.


28 posted on 02/07/2020 6:19:19 PM PST by Getready (Wisdom is more valuable than gold and diamonds, and harder to find.)
[ Post Reply | Private Reply | To 22 | View Replies]

To: stremba

Thank you for your post.


29 posted on 02/07/2020 6:22:10 PM PST by Getready (Wisdom is more valuable than gold and diamonds, and harder to find.)
[ Post Reply | Private Reply | To 27 | View Replies]

To: stremba

“Our Constitution was made only for a moral and religious People. It is wholly inadequate to the government of any other.”
- John Adams.

Any serious scientist/statistician would say the same about the scientific method. [See Global Warming, Gender Studies, and similar far-left interpretations of science for details.]


30 posted on 02/07/2020 6:29:19 PM PST by Pollster1 ("Governments derive their just powers from the consent of the governed")
[ Post Reply | Private Reply | To 27 | View Replies]

To: Repeal The 17th

:)


31 posted on 02/07/2020 7:01:11 PM PST by Secret Agent Man (Gone Galt; Not Averse to Going Bronson.)
[ Post Reply | Private Reply | To 26 | View Replies]

To: spintreebob

P values are only an easy way to communicate the relative significance of a test. They are a tool and any tool can be misused.

The problem is not P values; it is liars and cheats.

Of course, even in the harder science disciplines there is a strange lack of statistical knowledge. Some areas of psychology and sociology are actively opposed to the use of statistics at all. You are just supposed to believe their conclusions because they say so.


32 posted on 02/07/2020 7:56:21 PM PST by wjr123
[ Post Reply | Private Reply | To 1 | View Replies]

To: bigbob

Nope. Disraeli.


33 posted on 02/07/2020 7:58:37 PM PST by wjr123
[ Post Reply | Private Reply | To 15 | View Replies]

To: Getready

Speaking of smaller doses for older people...

An old man walks into a pharmacy and hands the pharmacist a script.

The pharmacist says, “I can fill this right now, Mr. Smith. You can wait right there,” as he points to a chair.

The old man (OM) says, “Hang on, Sonny, can I ask you a question?”

Pharm: “Sure Mr. Smith, what is it?”

OM: “Can you cut the pills in half?”

Pharm: “Yes, I can do that.”

OM: “How about a quarter, sonny, can you cut them into quarters?”

Pharm: “Well, Mr. Smith, this prescription is for Viagra. I’m afraid a quarter of a Viagra tablet will not give you a sufficient erection for sex, Sir.”

OM: “Sex? Erection, schmerection! I just want to stop peeing on my slippers!”


34 posted on 02/07/2020 8:11:08 PM PST by Alas Babylon! (The prisons do not fill themselves. Get moving, Barr!)
[ Post Reply | Private Reply | To 28 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson