I am often asked what sample size is needed so that statistically significant differences can be found and inferences to a larger population can be made. What is often ignored is that these statistical tests were designed to work within the framework of probability sampling theory.
Since the advent of online panels and the increase of online surveys using panel-provided samples, the issue of testing for significant differences using standard parametric tests has become a moot point in many research studies.
Nowadays many online surveys use samples provided by online panels, but these are mostly convenience (non-probability) samples. Online panels include only people who are willing to participate in studies. They exclude those unwilling to join a panel, who may nonetheless be members of the target population we are after.
In probability sampling, each possible respondent from the target population has a known, nonzero probability of being chosen. Probability sampling helps us avoid some of the selection biases that can make a sample unrepresentative of the target population. For more on this read Representative Samples – Does Sample Size Really Matter?
Unfortunately, taking a probability sample is hard and costly. For most consumer research and social behavior studies, we really don’t know the size of the actual population of consumers who behave in certain ways or consume certain products. Trying to find this out would make the research prohibitively expensive. This is why we often have to settle for convenience samples like the ones offered by online panels. They can still offer valuable insights if designed with care, but statistical testing on a convenience sample is pointless, since the assumptions of probability sampling are violated.
Online panels are here to stay, and they will continue to be a source of affordable sample for market research. Research using a convenience sample is often better than no research at all, provided the survey is well designed and screening criteria are used to define the target population.
A more appropriate case for testing statistically significant differences is a random sample taken from a customer database, since the database is essentially the sampling frame: we can count all its members and know their probability of being chosen.
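As a minimal sketch of what that looks like in practice, assuming the customer database can be exported as a plain list of IDs, a simple random sample can be drawn as follows. The function name, the frame of 10,000 IDs, and the sample size of 500 are all hypothetical and chosen only for illustration.

```python
import random


def draw_simple_random_sample(customer_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement from a customer list.

    Returns the sampled IDs and the inclusion probability shared by
    every member of the frame.
    """
    if sample_size > len(customer_ids):
        raise ValueError("sample_size cannot exceed the size of the frame")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    sample = rng.sample(customer_ids, sample_size)
    # Under simple random sampling, every member of the frame has the
    # same known chance of selection: n / N.
    inclusion_probability = sample_size / len(customer_ids)
    return sample, inclusion_probability


# Hypothetical frame: 10,000 customer IDs (illustrative only).
frame = list(range(1, 10001))
sample, p = draw_simple_random_sample(frame, 500, seed=42)
print(len(sample), p)
```

Because the frame is fully enumerated, the selection probability is known before anyone is contacted, which is exactly what an online panel cannot offer. Non-response can still introduce bias after the draw.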
However, if you don’t have a customer database or are interested in surveying non-customers, then use a convenience sample. You may feel more confident in your sample if you are able to replicate the results in repeated surveys, but always be cautious about inferences made from convenience samples since there could be a hidden systematic bias in the data.
Whenever you use convenience samples, consider the following when analyzing the results:
1. Who is systematically excluded from the sample?
2. What groups are over- or under-represented in the sample?
3. Have the results been replicated with different samples and data collection methods?
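The second question can be checked roughly by comparing the sample's composition with an external benchmark, when one exists. The sketch below assumes you have group counts from your survey and known population shares for the same groups (from a census, for example). The age groups and all numbers are made up for illustration, and the function name is hypothetical.

```python
def representation_ratios(sample_counts, population_shares):
    """Compare each group's share of the sample with its population share.

    A ratio above 1 means the group is over-represented in the sample,
    a ratio below 1 means it is under-represented.
    """
    total = sum(sample_counts.values())
    ratios = {}
    for group, population_share in population_shares.items():
        sample_share = sample_counts.get(group, 0) / total
        ratios[group] = sample_share / population_share
    return ratios


# Hypothetical figures, for illustration only.
sample_counts = {"18-34": 450, "35-54": 350, "55+": 200}
population_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

ratios = representation_ratios(sample_counts, population_shares)
for group, ratio in ratios.items():
    print(group, round(ratio, 2))
```

A check like this only covers characteristics you can measure and benchmark. It says nothing about hidden differences, such as the willingness to join a panel in the first place.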
If testing for significant differences gives you peace of mind, even when using convenience samples, do it to confirm the “direction” of the data, but refrain from making inferences to a larger population.
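For readers who want to run such a directional check, here is a minimal sketch of a two-proportion z-test, one common test for comparing two percentages. It is offered as an illustration rather than a recommendation of a specific method; the function name and the counts in the example are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    The p-value is only meaningful under probability sampling. With a
    convenience sample, treat the sign of z as a hint about direction.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("test is undefined when all answers are identical")
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 220 of 400 respondents in group A chose the
# product versus 180 of 400 in group B.
z, p_value = two_proportion_z_test(220, 400, 180, 400)
print(round(z, 2), round(p_value, 4))
```

A positive z says group A scored higher in this sample. It does not, on its own, license a claim about the wider population.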
For help with sample size calculation, check out Survey Sample Size – What Should It Be?
To calculate sample size and margins of error, use the Sample Size and Margin of Error Calculators from Relevant Insights.
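For reference, the textbook calculation for a proportion under simple random sampling can be sketched as below. This is the standard formula, not necessarily what the calculators mentioned above implement; the function names are hypothetical. The defaults (z = 1.96 for roughly 95% confidence, p = 0.5 as the most conservative proportion) are common conventions and are assumptions here. Like the tests discussed above, these numbers presuppose a probability sample.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def required_sample_size(target_margin, proportion=0.5, z=1.96):
    """Smallest sample size whose margin of error is within the target.

    Assumes a large population, so no finite population correction.
    """
    n = (z ** 2) * proportion * (1 - proportion) / (target_margin ** 2)
    return math.ceil(n)


# A margin of error of +/- 5 points at about 95% confidence.
print(required_sample_size(0.05))

# The margin of error achieved with 400 completed interviews.
print(round(margin_of_error(400), 3))
```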
Photo courtesy of lrargerich – Flickr, Creative Commons (Attribution)
Tags: convenience sample, Michaela Mora, Representative Sample, statistical significance, Survey Expert



