What’s a statistically significant sample size?

Do my results apply to the population as a whole?

Exactly how many responses do I need before I can trust my survey data?

These are common questions about how to get survey data that you can act on with confidence, and we hear variations on this theme quite often. They reflect valid concerns, but the main problem is that they’re usually asked about convenience samples, which by their nature can’t support valid tests of statistical significance.

The ability to get sample sizes that are both statistically relevant AND representative of the target population is limited to respondent pools created through probability sampling.

We’ll take some time in this article to explore what it means to have survey data that you can confidently act on, but first we need to get clear on the difference between convenience and probability samples.

The Rise of Survey Panels (and Problems With Their Data)

Since the advent of online panels and the increase in the number of online surveys run using panel-provided audiences, testing for significant differences with standard parametric tests has become a moot point in many research studies.

Nowadays many of the surveys conducted online use samples provided by panel companies. These businesses charge survey administrators a fee per response that they provide.

This is a great way to get access to respondents who meet specific demographic criteria, but every panel respondent shares an important characteristic: they’ve agreed to take surveys as part of a panel.

So, if certain segments of your target population don’t belong to survey panels, these types of samples won’t give you access to those segments.

Because they are a convenient way to access particular respondent groups, these types of audiences are known as convenience samples.

Problems With Probability Samples

The other type of sample that you can use to conduct research is known as a probability sample, in which each possible respondent from the target population has a known, nonzero probability of being chosen.

Although difficult to achieve, this type of audience sampling bypasses the selection biases that can prevent a convenience sample from being truly representative of the target population.

The issue is that for most consumer research studies and social behavior studies, we really don’t know the size of the actual population of consumers behaving in certain ways or consuming certain products. Trying to find out would make the research prohibitively expensive.

Convenience Samples Are Here to Stay

Instead of trying to craft a perfect probability sample, most researchers can safely settle for convenience samples like those offered by panel companies. These types of studies can still offer valuable insights if they’re designed with care, but the results of statistical tests run on a convenience sample can’t be generalized to the wider population.

Research conducted using convenience samples is most often better than no research at all, especially if the survey is well-designed and carefully crafted screening criteria are used to clearly define the target population.

Probability Samples of a Customer Database

A more appropriate case for testing statistically significant differences is a random sample taken from a customer database. This is the type of population where we can count all the members and accurately estimate their probability of being chosen to take the survey.
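As a rough illustration of what "accurately estimate their probability of being chosen" means in practice, here is a minimal Python sketch of a simple random sample drawn from a customer list. The customer IDs, the list size of 5,000, and the function name are all hypothetical; a real database would supply its own identifiers.

```python
import random


def draw_simple_random_sample(customer_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every customer has the same inclusion probability:
    sample_size / len(customer_ids).
    """
    customer_ids = list(customer_ids)
    if sample_size > len(customer_ids):
        raise ValueError("sample_size cannot exceed the number of customers")
    rng = random.Random(seed)  # seeded only so the draw can be reproduced
    sample = rng.sample(customer_ids, sample_size)
    inclusion_probability = sample_size / len(customer_ids)
    return sample, inclusion_probability


# Hypothetical database of 5,000 customer IDs (assumption for illustration).
customer_ids = list(range(1, 5001))
invited, inclusion_probability = draw_simple_random_sample(
    customer_ids, 400, seed=42
)
print(len(invited), inclusion_probability)
```

Because the full list is known, the chance of any one customer being invited is simply the sample size divided by the list size. That known probability is exactly what a panel-based convenience sample cannot offer.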

In this situation you’ll have confidence in each customer’s chance of being offered the survey, and you should also know a great deal about the demographic makeup of your audience.

So if 65% of your customers are men but only 40% of your survey respondents are, you can weight the results accordingly in your reporting.
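One common way to do this kind of weighting is to give each group a weight equal to its share of the customer base divided by its share of the respondents. The sketch below uses the 65% and 40% figures from the example above. It assumes, for simplicity, that everyone else falls into a single second group, and the satisfaction rates are invented purely to show the effect.

```python
def poststratification_weights(population_share, sample_share):
    """Weight for each group = population share / sample share."""
    return {
        group: population_share[group] / sample_share[group]
        for group in population_share
    }


# Shares from the example: 65% of customers are men, 40% of respondents are men.
# Treating the remainder as one group is a simplifying assumption.
population_share = {"men": 0.65, "women": 0.35}
sample_share = {"men": 0.40, "women": 0.60}

weights = poststratification_weights(population_share, sample_share)

# Hypothetical share of each group answering "satisfied" (made-up numbers).
satisfied_rate = {"men": 0.70, "women": 0.80}

# Unweighted estimate: each group counts according to its share of respondents.
unweighted = sum(sample_share[g] * satisfied_rate[g] for g in sample_share)

# Weighted estimate: each group's respondent share is scaled by its weight,
# which brings it back to its share of the customer base.
weighted = sum(
    sample_share[g] * weights[g] * satisfied_rate[g] for g in sample_share
)

print(weights)
print(round(unweighted, 3), round(weighted, 3))
```

With these made-up rates, the unweighted figure leans toward the over-represented group, while the weighted figure reflects the actual makeup of the customer base.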

Convenience Samples Outside Your Existing Audience

However, if you don’t have a customer database or are interested in surveying non-customers, you can certainly use a convenience sample. You may feel more confident in your sample if you’re able to replicate the results in repeated surveys, but always be cautious about inferences made from convenience samples.

There could be a hidden systematic bias in the data.

It’s therefore always important that you consider the following questions when using convenience samples:

  1. Who is systematically excluded from the sample?
  2. What groups are over- or underrepresented in the sample?
  3. Have the results been replicated with different samples and data collection methods?

When To Test For Significant Difference

If testing for significant differences gives you peace of mind, even when using convenience samples, feel free to do it to confirm the “direction” of the data. But resist using it to draw inferences about the larger population.
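For readers who want to see what such a test looks like, here is a minimal sketch of a two-proportion z-test, one standard parametric test for comparing two groups. The response counts are hypothetical, and the p-value only carries its usual meaning when the data come from a probability sample; on a convenience sample, treat the output as directional at most.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Uses the pooled proportion for the standard error, which is the
    usual choice under the null hypothesis of no difference.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 120 of 200 respondents in group A chose an option,
# versus 90 of 200 in group B.
z, p_value = two_proportion_z_test(120, 200, 90, 200)
print(round(z, 2), p_value < 0.05)
```

A positive z here says group A's proportion is higher than group B's, which is the "direction" of the data. Whether that difference holds in the larger population is a separate question that the test cannot answer for a convenience sample.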

To calculate sample size and margins of error, feel free to check out Relevant Insights’ Sample Size and Margin of Error Calculators.
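If you would rather see the arithmetic, the sketch below uses the standard textbook formulas for a proportion, with an optional finite population correction. We can't say whether the calculators mentioned above use exactly these formulas; the function names, the default 95% confidence level, and the conservative default proportion of 0.5 are assumptions made for illustration. The formulas also assume a simple random sample, so they apply to something like a customer database, not to a convenience sample.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


def margin_of_error(sample_size, confidence_level=0.95, proportion=0.5,
                    population_size=None):
    """Margin of error for a proportion from a simple random sample."""
    z = z_score(confidence_level)
    moe = z * sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None:
        # Finite population correction for sampling without replacement.
        moe *= sqrt((population_size - sample_size) / (population_size - 1))
    return moe


def required_sample_size(margin, confidence_level=0.95, proportion=0.5,
                         population_size=None):
    """Smallest sample size that achieves the requested margin of error."""
    z = z_score(confidence_level)
    n = z ** 2 * proportion * (1 - proportion) / margin ** 2
    if population_size is not None:
        # Adjust downward when the population itself is small.
        n = n / (1 + (n - 1) / population_size)
    return ceil(n)


# Completed responses needed for a +/-5% margin at 95% confidence.
print(required_sample_size(0.05))
# The same target for a hypothetical database of 5,000 customers.
print(required_sample_size(0.05, population_size=5000))
# Margin of error actually achieved with 385 completed responses.
print(round(margin_of_error(385), 4))
```

Using a proportion of 0.5 gives the largest possible variance, so the resulting sample size is a safe upper bound when you have no prior estimate of the answer distribution.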