New Survey Data Cleaning Tool

Before you start analyzing and acting on your survey data, you need to make sure it’s the highest possible quality.

This is particularly true if you used a panel or incentives to collect your responses, as these audiences are more likely to speed through a survey to get their promised reward.

Fortunately, SurveyGizmo offers a Data Cleaning Tool that allows you to thoroughly scrub your survey data. Here we’ll walk you through why you should take the time to run your survey results through the data cleaner, and how to do it.

Why Survey Data Needs Cleaning

Most people who read and answer a survey thoughtfully display similar behavior. They all take about the same time to finish the survey, and their radio button and checkbox answers won’t follow any pattern.

However, there are almost always respondents who rush through their answers, or check off radio buttons in a clear pattern.

These people’s data won’t guide your actions in the right direction because they didn’t give answers based on their real experiences or opinions. To analyze and act on survey data with confidence, we need to exclude these responses from our results.

Common Issues With Survey Data

Speeding, or rushing through a survey without really reading it, is the most common indicator of poor data quality, but there are other indicators too. Our data cleaning tool guides you through an investigation of your results to weed out these common problems:

  • Speeding through surveys
  • Straightlining (e.g. always selecting the first response), or otherwise answering in a clear pattern
  • Writing gibberish in open text questions
  • Creating fake answers
  • Answering trap questions incorrectly
  • Inconsistently answering consistency check questions
  • Selecting all checkboxes
  • Selecting only a single checkbox
  • Writing one word answers for an essay question

The data cleaning tool walks you through each of these common issues so you can customize the corresponding checks and apply them to your own results as needed. You’ll then get an overview of how many responses you excluded, so you can see the impact your cleaning has on your response total.
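To make a few of these checks concrete, here is a minimal Python sketch of three of them: straightlining, the two checkbox flags, and one-word essay answers. It is only an illustration of the ideas, not how the Data Cleaning Tool is implemented, and the function names and rules are our own assumptions. In particular, the straightlining check only catches identical answers, not every possible pattern.

```python
def is_straightlined(grid_answers):
    """True when every answer in a set of rating questions is identical.

    Simplification: other patterns (e.g. 1, 2, 3, 1, 2, 3) are not detected.
    """
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1


def checkbox_flags(selected, options):
    """Flags for one checkbox question: all boxes ticked, or only one."""
    return {
        "all_checkboxes": len(options) > 1 and set(selected) == set(options),
        "single_checkbox": len(selected) == 1,
    }


def is_one_word_answer(text):
    """True when an essay answer is exactly one word (blank answers are skipped)."""
    return len(text.split()) == 1


print(is_straightlined([1, 1, 1, 1]))            # every rating is the same
print(checkbox_flags(["a", "b"], ["a", "b"]))    # all boxes ticked
print(is_one_word_answer("fine"))                # a one-word essay answer
```

Whether a flag like "single checkbox" actually signals a poor response depends on the question, which is why each check is worth reviewing against your own survey.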

Speeding Through Surveys

The data cleaning process starts with identifying respondents who sped through your survey.

We start with this step because speeding is one of the most serious indicators of poor quality data. Respondents who proceed through your survey faster than others are likely not carefully reading your questions, which generally means that their answers are less accurate.

In the speeding step, the tool charts the distribution of average response time per question for you.

Average response time is computed for each response by dividing the total response time by the number of questions that the respondent was shown. The chart starts with a red line, used to eliminate speeders, at the fastest 1% of responses and a blue line, used to normalize slow outliers, at the slowest 10% of responses.
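If you want to see the arithmetic behind those two lines, the following Python sketch works through it. It assumes a nearest-rank percentile and made-up function and label names; the tool’s exact percentile method and the way it normalizes slow outliers aren’t described here, so this sketch only labels the slow responses rather than adjusting them.

```python
import math


def average_time_per_question(total_seconds, questions_shown):
    """Total response time divided by the number of questions shown."""
    if questions_shown <= 0:
        raise ValueError("questions_shown must be positive")
    return total_seconds / questions_shown


def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def classify_by_speed(responses, fast_pct=1, slow_pct=10):
    """Label each response 'speeder', 'slow', or 'ok'.

    responses maps a response id to (total_seconds, questions_shown).
    fast_pct and slow_pct mirror the starting positions of the red and
    blue lines and can be changed, just like dragging the lines.
    """
    averages = {
        rid: average_time_per_question(total, shown)
        for rid, (total, shown) in responses.items()
    }
    red_line = percentile(averages.values(), fast_pct)
    blue_line = percentile(averages.values(), 100 - slow_pct)

    labels = {}
    for rid, avg in averages.items():
        if avg <= red_line:
            labels[rid] = "speeder"
        elif avg > blue_line:
            labels[rid] = "slow"
        else:
            labels[rid] = "ok"
    return labels


# 100 made-up responses: response i took i * 10 seconds over 10 questions.
responses = {i: (i * 10, 10) for i in range(1, 101)}
labels = classify_by_speed(responses)
speeders = [rid for rid, label in labels.items() if label == "speeder"]
print(f"{len(speeders)} of {len(labels)} responses flagged as speeders")
```

Counting the flagged responses, as in the last two lines, is a quick way to check how much data a given threshold would set aside before you commit to it.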


You’ll be able to adjust the position of both the red and blue lines, so you can customize how fast is too fast for your particular survey.

Be sure to keep an eye on how many responses your settings are putting in quarantine. You don’t want to eliminate all your data in this step.

Other Data Cleaning Options

Once you’ve calibrated your settings for survey speed, you can move on to deciding which additional data cleaning features apply to your project.


Each data quality flag in this step has a default weight, but you can adjust each one based on your own survey questions and data needs.
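One way to picture weighted flags is as a simple score: add up the weights of the flags a response raised and compare the total to a cutoff. The Python sketch below does exactly that. The weights, the flag names, and the idea of a single threshold are all hypothetical, chosen for illustration; they are not the tool’s actual defaults or scoring rule.

```python
# Hypothetical weights for illustration only; not the tool's real defaults.
DEFAULT_WEIGHTS = {
    "speeding": 3,
    "straightlining": 2,
    "gibberish": 2,
    "failed_trap_question": 3,
    "all_checkboxes": 1,
    "one_word_answer": 1,
}


def quality_score(raised_flags, weights=DEFAULT_WEIGHTS):
    """Sum the weights of the flags a response raised (unknown flags count as 0)."""
    return sum(weights.get(flag, 0) for flag in raised_flags)


def should_quarantine(raised_flags, weights=DEFAULT_WEIGHTS, threshold=3):
    """True when the weighted score reaches the (assumed) cutoff."""
    return quality_score(raised_flags, weights) >= threshold


# Adjusting a weight: this survey legitimately allows "select all that apply".
custom_weights = dict(DEFAULT_WEIGHTS, all_checkboxes=0)

print(should_quarantine(["straightlining", "one_word_answer"]))
print(quality_score(["all_checkboxes"], custom_weights))
```

Setting a weight to zero, as with `custom_weights`, effectively switches that flag off for a survey where the behavior is expected.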

The Human Element of Survey Data Cleaning

Keep in mind that data cleaning, even when done with a sophisticated tool like this one, is a subjective process.

You should always use your judgement to determine the quality of each response. Even if it fails the data cleaning inspection, a response may still be valid and have a place in your final results.

Additionally, not every poor-data flag will apply to every survey. For the best results, consider your survey and your respondents. Think carefully about which data and which respondent behaviors are likely to skew your results, and use the right data cleaning options to flag only that data.


Learn more about the step-by-step process for setting up your own custom data cleaning system in SurveyGizmo by visiting our Help Documentation.

