Posted on 02/11/2016 1:49:12 PM PST by Riflema
Countless news organizations refer breathlessly to the Real Clear Politics "RCP Poll Average" as the go-to source for the big-picture view of the candidates' standings. But has anyone taken a look at that source and figured out how they come up with their numbers? With some trepidation about what I would find, I did.
Here's today's RCP Average and the individual polls it summarizes:
Poll | Date | Sample | Trump | Cruz | Rubio | Carson | Bush | Kasich | Christie | Fiorina | Spread
RCP Average | 1/22 - 2/4 | -- | 29.5 | 21 | 17.8 | 7.8 | 4.3 | 4 | 2.5 | 2.5 | Trump +8.5
Quinnipiac | 2/2 - 2/4 | 507 RV | 31 | 22 | 19 | 6 | 3 | 3 | 3 | 2 | Trump +9
Rasmussen Reports | 2/3 - 2/4 | 725 LV | 31 | 20 | 21 | 5 | 4 | 6 | 3 | 3 | Trump +10
PPP (D) | 2/2 - 2/3 | 531 LV | 25 | 21 | 21 | 11 | 5 | 5 | 3 | 3 | Trump +4
IBD/TIPP | 1/22 - 1/27 | 395 RV | 31 | 21 | 10 | 9 | 5 | 2 | 1 | 2 | Trump +10
Now take a look at any candidate's column and do their math, RCP style:
e.g., Trump: 31%, 31%, 25%, 31%, add 'em up and divide by 4 to get the mean, bingo!, 29.5%!!
I'll simplify the problem: suppose one of those polls sampled 2000 voters and another sampled 100 voters, and they came in as follows:
Poll | Date | Sample | Trump
RCP Average | 1/22 - 2/4 | -- | 22.5
Poll 1 | 2/2 - 2/4 | 2000 RV | 30
Poll 2 | 2/3 - 2/4 | 100 LV | 15
But are you ready to buy that? What Real Clear Politics is actually demonstrating here is innumeracy.
If 30% of 2000 voters and 15% of 100 voters favor Trump, the correct average is:
(30% x 2000) + (15% x 100) = 600 + 15 = 615 voters out of 2100 = 29.3%
It's called a weighted average, RCP. The poll of 2000 voters has way more weight than the one of 100 voters. Get it?
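For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two calculations using the hypothetical 2000-voter and 100-voter polls from the example. The function names are made up for illustration, and nothing here is RCP's actual code.

```python
def simple_average(shares):
    """Unweighted mean of the poll percentages (the method described above)."""
    return sum(shares) / len(shares)


def weighted_average(shares, sizes):
    """Mean of the poll percentages, weighted by each poll's sample size."""
    total_respondents = sum(sizes)
    return sum(s * n for s, n in zip(shares, sizes)) / total_respondents


# Hypothetical polls from the example: 30% of 2000 voters, 15% of 100 voters.
shares = [30.0, 15.0]
sizes = [2000, 100]

print(round(simple_average(shares), 1))           # 22.5
print(round(weighted_average(shares, sizes), 1))  # 29.3
```

The weighted figure is the same as pooling all 2100 respondents into one sample and counting the 615 who favor Trump.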
I guess since the majority of consumers of this junk are journalists we should not be surprised that it is swallowed and regurgitated so readily. Over all my years of reading their junk, I long ago ceased to be amazed at their fundamental inability to master any of the hard sciences, you know, the ones that involve math. Sheesh.
As John Huang used to say, that's just my 2c. (and excuse the crude HTML!)
I stopped putting any stock in RCP after their prolonged insistence that Romney was going to win, right up until the bitter end.
My dismissal of them is purely anecdotal, but there’s been like four or five races over time that I cared about where they were telling me one (comfortable, ultimately untrue or skewed) thing, and other sources and reality let me down by being correct.
As much as I hate their editorial content, I go with FiveThirtyEight.com for poll data. Their track record is pretty good.
As you can see above, their mighty average can be easily swayed by any polling firm surveying their favorite 200 RINOs and adding that to the mix....
The problem is that you don’t understand polling. The LV/RV number represents the sample size, which matters only in calculating the margin of error. (LV means they polled likely voters, RV means that they polled registered voters). A higher sample size should mean a smaller margin of error, not more actual voters. Conversely, a smaller sample size is going to mean a larger margin of error.
Most polls are going to give you a margin of error between 3 and 5 percent.
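To put rough numbers on that, here is a small Python sketch using the textbook 95% margin-of-error formula for a simple random sample, 1.96 * sqrt(p(1-p)/n), with the worst case p = 0.5. This is an assumption for illustration: pollsters' published margins can differ because of their own weighting and design adjustments. The sample sizes are the four from the table in the original post.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points,
    for a simple random sample of n respondents."""
    return z * math.sqrt(p * (1 - p) / n) * 100


# Sample sizes of the four polls in the RCP table above.
for name, n in [("Quinnipiac", 507), ("Rasmussen", 725),
                ("PPP (D)", 531), ("IBD/TIPP", 395)]:
    print(name, n, round(margin_of_error(n), 1))
```

Under these assumptions all four polls land between roughly 3.6 and 4.9 points, which is consistent with the 3 to 5 percent range mentioned above.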
There is room for disagreement as to whether RV or LV will give you the most accurate results. Also look at the methodology as to whether they used only landlines or cell and landlines (more accurate as to younger voters).
Each poll has a separate error rate so using a weighted average doesn’t provide much more accuracy than not using one.
Not really my point, I’m afraid. Their math is wrong any way you cut it. Read my example above.
Er ok. Not sure you got the point ;-)
What he said. Thanks Ray.
Same again. If a poll of 1000 has a MOE of 1% and one of 100 has a MOE of 10%, I am most definitely going to weight the large poll more heavily, no?
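One common way to act on that intuition is inverse-variance weighting, where each poll's weight is proportional to 1 / MOE². The sketch below is only an illustration: the 1% and 10% margins are the hypothetical figures from the comment above, the 30% and 15% shares are borrowed from the original post's example, and the function names are invented.

```python
def inverse_variance_weights(moes):
    """Normalized weights proportional to 1 / MOE^2."""
    raw = [1.0 / (m * m) for m in moes]
    total = sum(raw)
    return [w / total for w in raw]


def combine(shares, moes):
    """Inverse-variance weighted average of the poll percentages."""
    weights = inverse_variance_weights(moes)
    return sum(w * s for w, s in zip(weights, shares))


moes = [1.0, 10.0]     # hypothetical margins of error from the comment
shares = [30.0, 15.0]  # hypothetical poll results from the original post

print([round(w, 3) for w in inverse_variance_weights(moes)])  # [0.99, 0.01]
print(round(combine(shares, moes), 1))                        # 29.9
```

For simple random samples the variance shrinks in proportion to 1/n, so this works out close to weighting by sample size.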
Yeah, what PAR said too. Thanks Par
Then start a website that weights the polls. RCP doesn’t claim to do so. It provides the poll numbers, the links to the polling companies, and a simple average.
My larger point seems to have got lost. The top of the hour news regularly quotes this stuff, obviously without a second thought as to what it means. It’s like a charming young Boston Globe reporter writing 15 years ago about the scary new mega container ships that carry 80,000 containers.
No, it isn’t lost. It just doesn’t matter or make any sense. Who gives a flying fig?
Most people, including news reporters, couldn’t tell you what an MOE means, what sample size is suspect, how polling companies weight results internally, or myriad other aspects of polling.
To focus on minutiae like whether RCP has weighted various polls that have already been internally weighted and arrived at through various methodologies is the very definition of a “waste of time.”
RCP is wise not to spend hours and hours of work to satisfy less than 1% of their web visitors.
The data is, just like all polling data, a snapshot of results that may or may not be accurate.
That’s why whenever a new poll comes out, you will see Freepers who know about polling say things like “I’d like to see more polls that show this trend” or “I think this might be an outlier” or “MOE of 7%? Well, that’s worthless” or “Trends are the key, not an individual poll.”
Not necessarily. It depends on the quality of the sample selected. Did they get the 1000 from a diverse sample (likely voter, on the street vs phone, non-likely, registered, geography, etc) and the 100 from a controlled sample (registered and likely, diverse areas, etc)? If that is the case, I would weight the 100 sample more heavily.
>> Their math is wrong anyway you cut it <<
No, as other posters have observed, you apparently don’t understand Lesson Number One about sampling, which is that sample size affects the standard deviation (margin of error), but not the estimated mean. Therefore, I regret to say that it’s YOUR math that’s wrong.
The main point to remember is that an unweighted average of polls is basically OK, as long as the sampling that underlies each subject poll has been performed competently.
On the other hand, a simple average of the standard deviations (margins of error) is definitely not appropriate. Instead, something like a “Pythagorean average”, weighted by sample size, could be used. But for whatever reasons, the people who compute the Real Clear Politics averages have decided not to publish this kind of number. I guess maybe they think such info simply would be too complicated to explain to the typical reader.
(Still, I wouldn’t be surprised to learn that somewhere in the sub-basement of the RCP building, there’s a statistics nerd who daily grinds out Pythagorean weighted averages of the standard deviations, either for his own amusement or for use in the RCP’s internal deliberations.)
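One possible reading of the "Pythagorean average" mentioned above is the usual root-sum-of-squares rule for combining the errors of independent estimates. The Python sketch below assumes that reading, assumes the polls are independent simple random samples, and uses the textbook margin-of-error formula. None of it is a published RCP method, and the function names are invented for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in points, for a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n) * 100


def combined_moe(moes, weights):
    """Margin of error of a weighted average of independent polls:
    the square root of the sum of (normalized weight * MOE) squared."""
    total = sum(weights)
    norm = [w / total for w in weights]
    return math.sqrt(sum((w * m) ** 2 for w, m in zip(norm, moes)))


sizes = [507, 725, 531, 395]  # sample sizes from the RCP table
moes = [margin_of_error(n) for n in sizes]

equal = combined_moe(moes, [1, 1, 1, 1])  # error of the simple average
by_size = combined_moe(moes, sizes)       # error of the size-weighted average

print(round(equal, 2), round(by_size, 2))
```

Under these assumptions the size-weighted figure equals the margin of error of a single poll of all 2158 respondents, and the simple average's margin comes out only slightly larger.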
Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.