Political Polls and Small Sample Sizes
One type of survey drawing a lot of interest this month is the political poll. Two questions I hear often are “Can the polls be trusted?” and “How can they get away with polling so few people?”
We can increase our understanding of polls by answering two questions:
- How can polls be accurate with such seemingly small samples?
- How do we align popular polls with reality? (The general form of this question, “How do we align our survey research with the real world?”, is important in all survey research.)
The Source of Our Trust in Polls
Political polls rely on two statistical principles for their trustworthiness — randomness and weighting.
Think back to high school. Picture yourself in your cafeteria at lunch time. Your task is to pick one person at a time and ask the question, “Who do you want to win the election for student body president?”
How many students do you think you will have to poll before your overall results no longer change very much? Eight people? 15? 25? 50?
As a rule of thumb, your results are going to be reasonably stable after you’ve queried 30 people, provided your selections have been truly random.
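To see what “no longer change very much” can look like, here is a minimal simulation sketch. It assumes a hypothetical school where exactly half of the students support one candidate, and it assumes each respondent is an independent random draw. The function name `running_share`, the 50 percent support level, and the seed are all illustrative choices, not part of the cafeteria example.

```python
import random

def running_share(true_support=0.5, n=50, seed=1):
    """Simulate polling n students one at a time and track the running
    share of respondents who back the candidate after each interview."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    supporters = 0
    shares = []
    for i in range(1, n + 1):
        # Each respondent backs the candidate with probability true_support.
        if rng.random() < true_support:
            supporters += 1
        shares.append(supporters / i)
    return shares

shares = running_share()
for n in (8, 15, 25, 30, 50):
    print(f"after {n:>2} respondents: {shares[n - 1]:.0%}")
```

Running this with several different seeds gives a feel for how much the early percentages jump around compared with the later ones.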
Achieving Random Survey Results
How do we achieve a random sample? It is difficult, and it is critical to the reliability of small samples. Obviously, in the cafeteria example above, you’re not going to query six of your friends all seated at the same table.
But what else will affect the randomness of your sample? Here are some questions to consider in the current example:
- Is there only one cafeteria in your school and is every student equally likely to use it?
- How many lunch periods are there during a day? Is there any bias regarding which students go to lunch at which hours of the day? For instance, do all freshmen go to Lunch I, all sophomores to Lunch II, and so on?
- What affects seating patterns? Is there a salad bar? A hot lunch queue? Do students who bring their lunches to school eat in the cafeteria or outside on the campus?
- Can polling at lunch time in the cafeteria account for students who take post-secondary options in the afternoon? Or do they leave the building before lunch and head to their other classes?
- In selecting students to poll, did you inadvertently avoid people you’re uncomfortable with? (Maybe you should have had someone from another school select respondents and conduct the poll.)
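Many of the questions above go away if respondents are drawn from a complete list of students rather than picked by eye in the cafeteria. The sketch below assumes such a roster exists, which the cafeteria scenario does not promise. The roster contents, its size of 1,200, and the helper name `draw_sample` are hypothetical.

```python
import random

def draw_sample(roster, k=30, seed=None):
    """Return k distinct students chosen so that every student on the
    roster has the same chance of selection (simple random sampling)."""
    rng = random.Random(seed)
    return rng.sample(roster, k)  # sampling without replacement

# Hypothetical roster of 1,200 enrolled students.
roster = [f"student_{i:04d}" for i in range(1, 1201)]

chosen = draw_sample(roster, k=30, seed=42)
print(chosen[:5])
```

Because the computer makes the selection, the pollster’s comfort level, the lunch period, and the seating pattern no longer influence who is asked. Reaching the chosen students is a separate problem.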
Let’s say you’ve satisfied yourself that polling in the cafeteria is going to work. You conduct your poll and find that, of the 30 students you polled, 9 are freshmen, 8 are sophomores, 5 are juniors and 8 are seniors. Here are your data:
[Table: “Who are you voting for?”, reported as percent of the sample. The values are not available here.]
Overall, it looks close — maybe a tie!
One of the downsides of randomness is that it is not guaranteed to generate a representative sample. You know that, in your school, 30 percent of the students are freshmen, 27 percent are sophomores, 23 percent are juniors and 20 percent are seniors.
So your random sample does not represent your electorate on a key variable — class.
Therefore, we weight.
Here is the distribution of your students by class:
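Class         Proportion of school    Proportion of sample
Freshmen      30%                     30.0% (9 of 30)
Sophomores    27%                     26.7% (8 of 30)
Juniors       23%                     16.7% (5 of 30)
Seniors       20%                     26.7% (8 of 30)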
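The comparison between school and sample can be worked out directly from the counts given earlier. The sketch below also computes one common form of weight, the school share divided by the sample share. That choice of formula is an assumption about how the weighting is done, and the variable names are illustrative.

```python
# Respondents per class in the 30-person sample (from the poll above).
sample_counts = {"freshmen": 9, "sophomores": 8, "juniors": 5, "seniors": 8}

# Share of each class in the whole school (from the figures above).
school_share = {"freshmen": 0.30, "sophomores": 0.27, "juniors": 0.23, "seniors": 0.20}

n = sum(sample_counts.values())
sample_share = {c: k / n for c, k in sample_counts.items()}

# A weight above 1 means the class is under-represented in the sample;
# a weight below 1 means it is over-represented.
weights = {c: school_share[c] / sample_share[c] for c in sample_counts}

for c in sample_counts:
    print(f"{c:<11} school {school_share[c]:.0%}  "
          f"sample {sample_share[c]:.1%}  weight {weights[c]:.2f}")
```

Juniors end up with the largest weight (about 1.38) and seniors with the smallest (0.75), which matches the gap between the two columns.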
And here is how we adjust our polling data to represent what we think will happen on Election Day. (There are several ways to carry out this weighting; for a single variable such as class, they arrive at the same result.)
So what looked like a pretty close race may actually be an 8-point victory for Henry. Of course, how well your poll reflects reality will depend partly on the validity of your weighting assumption: that the class you are in affects whom you will vote for.
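The adjustment described above can be sketched in a few lines. The per-class vote counts below are invented for illustration and are not the poll’s actual data; only the class sizes (9, 8, 5, 8) and the school shares come from the text. The second candidate’s label, “Opponent”, is also a placeholder.

```python
# Share of each class in the whole school (from the text).
school_share = {"freshmen": 0.30, "sophomores": 0.27, "juniors": 0.23, "seniors": 0.20}

# HYPOTHETICAL votes per class as (Henry, Opponent). Class totals match the
# sample (9, 8, 5, 8), but the split within each class is invented.
votes = {"freshmen": (5, 4), "sophomores": (4, 4), "juniors": (4, 1), "seniors": (2, 6)}

def unweighted_share(votes):
    """Henry's share when every respondent counts equally."""
    henry = sum(h for h, _ in votes.values())
    total = sum(h + o for h, o in votes.values())
    return henry / total

def weighted_share(votes, school_share):
    """Henry's share when each class counts in proportion to its size in
    the school: sum of (school share x Henry's support within the class)."""
    return sum(school_share[c] * h / (h + o) for c, (h, o) in votes.items())

raw = unweighted_share(votes)
adjusted = weighted_share(votes, school_share)
print(f"unweighted: Henry {raw:.1%}")
print(f"weighted:   Henry {adjusted:.1%}, lead {2 * adjusted - 1:+.1%}")
```

With these invented counts, a 15 to 15 tie in the raw sample becomes a lead of about 7 points for Henry once the under-sampled juniors are given their full weight. Different counts would move the result in a different direction or by a different amount.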
So this is where polling starts. Of course, there’s a lot more to it, such as:
- Confidence intervals
- Timing of the poll
- Whether or not those whom we poll will actually vote
- How to represent students who may vote but never go into the cafeteria
- Controlling for students who lied to us about their choice
Also, we know that with weighting we have made our sample more representative of the school overall, but how do we know that we’ve represented each class reliably? All of these things affect poll results also.
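One rough way to gauge how reliably each class is represented is the standard 95 percent margin of error for a proportion. The sketch below assumes a simple random sample and a worst-case 50/50 split, and it uses the normal approximation, which is only a loose guide for groups this small. The function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from a
    simple random sample of n respondents (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Respondents per class in the 30-person sample (from the poll above).
class_counts = {"freshmen": 9, "sophomores": 8, "juniors": 5, "seniors": 8}

per_class = {c: margin_of_error(k) for c, k in class_counts.items()}
overall = margin_of_error(sum(class_counts.values()))

for c, moe in per_class.items():
    print(f"{c:<11} n={class_counts[c]}  +/- {moe * 100:.0f} points")
print(f"{'overall':<11} n={sum(class_counts.values())}  +/- {overall * 100:.0f} points")
```

Under these assumptions the uncertainty within any single class is much larger than for the sample as a whole, and it is largest for the five juniors, the same group that receives the biggest weight.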
Randomness and weighting, however, are what give us a firm methodological beginning.
Randomness and weighting are part of how polls generate reasonable results with what appear to be small sample sizes. But there’s another important question to pose to align polling with reality. That is, how do we align popular polls with the Electoral College? For that, see http://www.electoral-vote.com.
It is an excellent site. It aligns popular polls with the realities of the Electoral College and you can learn how they do it by reading the FAQs.