How to Clean Your Market Research Survey Data

You’ve fielded a market research survey.

For weeks, you wrote and rewrote your survey questions. You paid for a SurveyMonkey license and spent hours learning how to program your survey. You leveraged dozens of industry connections to get survey answers — a hard-earned set of 300 respondents.

Getting here wasn’t easy.

But unfortunately, you’re not done. Before drawing findings from your survey, you need to clean your data. This is absolutely essential for maintaining the quality of your research. Here are the three most important things to look for when cleaning your survey data.*


Speeders

These are respondents who took your survey too fast. You identify them by comparing each respondent's completion time to the median time spent taking your survey.

The rule of thumb here is to disqualify responses from anyone who completed your survey in less than half the median time. There are some exceptions, such as when your survey includes a logic branch that had certain respondents answering just a few questions. But in general, anyone going more than twice as fast as the median respondent is likely someone who sped through the survey without giving the questions much thought.

You can identify speeders by downloading your survey data into Excel, then subtracting the “time started” from the “time completed” to get each respondent’s time spent. Most survey platforms I’ve used record this information. If yours doesn’t, you may have to skip this flag, but be sure to check for the following two.

In general, not more than 10% of your survey sample should be discarded for speeding.
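
If you'd rather script this check than do it in Excel, here is a minimal Python sketch of the half-the-median rule. The rows, the column names (`started`, `completed`), and the timestamp format are all made up for illustration; your platform's export will look different.

```python
from datetime import datetime
from statistics import median

# Hypothetical export rows: respondent id, start time, completion time.
responses = [
    {"id": "r1", "started": "2024-03-01 10:00:00", "completed": "2024-03-01 10:12:00"},
    {"id": "r2", "started": "2024-03-01 10:05:00", "completed": "2024-03-01 10:15:00"},
    {"id": "r3", "started": "2024-03-01 10:07:00", "completed": "2024-03-01 10:10:00"},
    {"id": "r4", "started": "2024-03-01 10:20:00", "completed": "2024-03-01 10:31:00"},
    {"id": "r5", "started": "2024-03-01 10:30:00", "completed": "2024-03-01 10:39:00"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def duration_seconds(row):
    """Time spent on the survey: completion time minus start time."""
    started = datetime.strptime(row["started"], TIME_FORMAT)
    completed = datetime.strptime(row["completed"], TIME_FORMAT)
    return (completed - started).total_seconds()


def find_speeders(rows):
    """Return ids of respondents who finished in less than half the median time."""
    durations = {row["id"]: duration_seconds(row) for row in rows}
    cutoff = median(durations.values()) / 2
    return [rid for rid, secs in durations.items() if secs < cutoff]


speeders = find_speeders(responses)
speeder_share = len(speeders) / len(responses)
```

With only five toy rows, the share of speeders isn't meaningful. On a real dataset, compare `speeder_share` to the 10% guideline above, and remember to handle any logic branches separately.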


Flatliners

These are people who picked the same answer to every (or most) question in a set, whether multiple-choice or open-ended. For example, say that you asked four open-ended questions about price (like a Van Westendorp question set). A flatliner answered the same thing for each of these four questions (say, $10).

If you notice that more than 10% of your survey respondents are being flagged as flatliners, you may want to look more closely at the questions you’re including in your scan for flatliners. It may be that some respondents only appear to be flatlining, when they are, in fact, giving honest answers. This may have to do with the way you’ve asked your questions (for example, if you placed student, 18–22 years, unmarried, and no kids all as the first answer options to questions asking about employment, age, marital status, and children).

You can only determine this by looking at respondents’ individual answers — but don’t bother with this if less than 10% of your survey sample is flatlining.
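
A flatliner scan can be sketched in a few lines of Python. The data below is invented, and the `min_share` parameter is my own addition: it controls how strict the check is (1.0 means every answer in the set is identical, 0.75 means "most").

```python
# Hypothetical answers to a four-question price set (Van Westendorp style).
price_answers = {
    "r1": [5, 8, 12, 20],
    "r2": [10, 10, 10, 10],
    "r3": [4, 6, 6, 15],
    "r4": [7, 7, 7, 9],
}


def is_flatliner(answers, min_share=1.0):
    """True when the most common answer fills at least min_share of the set."""
    if not answers:
        return False
    most_common_count = max(answers.count(value) for value in set(answers))
    return most_common_count / len(answers) >= min_share


flatliners = [rid for rid, answers in price_answers.items() if is_flatliner(answers)]
flatline_rate = len(flatliners) / len(price_answers)

# More than 10% flagged suggests reviewing which questions are in the scan.
needs_review = flatline_rate > 0.10
```

Here only `r2` is flagged under the strict setting. Loosening it with `is_flatliner(answers, min_share=0.75)` would also catch `r4`, so pick the threshold that fits your question set.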

Gibberish and Contradictory Answers

These types of responses can be harder to spot. They require you to look, line by line, at answers to open-ended questions in order to identify ones that 1) are gibberish (e.g., dk3i8sw) and/or 2) don’t correspond to other answers in that row. For example, if someone says they are single at the beginning of the survey, then mentions their “wife” or “husband” in a later open-ended question, delete that respondent. They are not being honest, and you want data that you can stand on.

If you designed your survey well, your data cleaning shouldn’t result in discarding more than 15% of your responses. If you’re worried you’re throwing out too many, take a closer look at the ones you’re throwing out. Consider keeping a few that give other indications of being good, honest answers or the ones that flagged on only one of the three criteria listed above.
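
Once you have a list of flagged respondents for each of the three criteria, tallying them is straightforward. The sketch below assumes you've already built those lists (the ids and the sample size of 20 are made up); it computes your discard rate and pulls out the respondents who flagged on only one criterion, since those are the ones worth a second look.

```python
from collections import Counter

# Hypothetical flag sets produced by the three checks described above.
total_respondents = 20
speeders = {"r03", "r11"}
flatliners = {"r11", "r14"}
gibberish_or_contradictory = {"r07", "r11"}

# Count how many criteria each respondent flagged on.
flag_counts = Counter()
for flagged in (speeders, flatliners, gibberish_or_contradictory):
    flag_counts.update(flagged)

# Share of the sample you would discard if every flagged respondent is removed.
discard_rate = len(flag_counts) / total_respondents
too_many_discards = discard_rate > 0.15

# Respondents flagged on only one criterion are candidates for a second look.
second_look = sorted(rid for rid, count in flag_counts.items() if count == 1)
```

In this toy example, four of 20 respondents are flagged (20%), which is over the 15% guideline, so `r03`, `r07`, and `r14` would be the ones to re-read before deciding.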

*PeopleFish analysts clean every one of our clients’ survey datasets according to the criteria set forth in this article. Data cleaning is a standard piece of our survey project offering. Nevertheless, it’s helpful for researchers to understand how survey data is cleaned, for their own knowledge and, of course, should they want to conduct their own market research survey independently of a market research firm like PeopleFish.

What Does Startup Market Research Actually Look Like?

“Don’t go to market before doing market research.”

Aspiring entrepreneurs hear that all the time. But what does this actually mean? Does every startup do market research? Is market research realistic for a startup with a small budget, little-to-no marketing team, and no product prototypes?

Here’s our Founder’s outline for startup market research projects, based on years of experience running surveys and focus groups for companies large and small. This is the lowest-cost way to validate your business concept before going to market, and wow investors in the process.

Read the full article at Startup Grind. To learn more about how PeopleFish empowers entrepreneurs and innovators with real-world consumer feedback, click here.

Rule #1 When Writing Screener Questions for Your Survey

Screener questions are the gate-keepers of your market research surveys. They sit at the beginning of your survey instrument, and they disqualify anyone who you don’t want to hear from.

Poorly-designed screener questions will undermine the entire purpose of your survey project. They will let the wrong people into your survey, diminishing the accuracy of your data—that is, the extent to which your findings will reflect the feelings and preferences of your actual target market.

There’s more to writing screener questions, though, than just getting the “logic” right. The fact is, dishonest people will always be out there, trying to enter surveys for which they don’t qualify in order to be compensated. That said, here’s the most important thing to remember when writing your survey’s screener questions:

Don’t be obvious.

What does this mean? By way of example, imagine you’re trying to survey parents of children with asthma. The easy and obvious screener question would be something like:

Do you have a child with asthma?


This would be fine if all respondents were honest. But they aren’t. And the fact is, this approach makes the “right answer” obvious. Dishonest respondents who want to be compensated for taking your survey will know what to select (Yes) in order to proceed (because it seems unlikely that a survey would target parents of children who don’t have asthma).

So instead, ask a series of questions that don’t give away the right answers. Here’s an example we used in a recent survey project:

1. You have young children who still live at home.


2. Do any of your children have Asthma, ADHD, or Diabetes?


3. Which of the following conditions do any of your children have?

Asthma, Diabetes, ADHD

This approach doesn’t give away the “right” answers, and it has three points at which someone might be disqualified. It’s a much more robust approach than one simple question.

An additional quality control measure would be to disqualify anyone who selects all three conditions in question #3. While it’s possible that someone’s child, or children, have all three of these conditions, it’s highly unlikely. Better to just disqualify such respondents from your survey than risk allowing dishonest respondents to enter.
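
If your survey platform lets you script qualification logic (or you want to re-check it after fielding), the three-question series plus the "selected everything" check might look something like this in Python. The function and parameter names are hypothetical, and I'm assuming questions 1 and 2 are answered Yes/No.

```python
TARGET_CONDITION = "Asthma"
LISTED_CONDITIONS = {"Asthma", "ADHD", "Diabetes"}


def qualifies(has_young_children_at_home, any_listed_condition, selected_conditions):
    """Apply the three screener questions plus the 'selected everything' check."""
    if not has_young_children_at_home:      # question 1 (assumed Yes/No)
        return False
    if not any_listed_condition:            # question 2 (assumed Yes/No)
        return False
    selected = set(selected_conditions)
    if TARGET_CONDITION not in selected:    # question 3: must include the target
        return False
    if selected >= LISTED_CONDITIONS:       # extra check: all three selected
        return False
    return True


# A parent who selects asthma and ADHD gets in; one who selects all three does not.
honest_parent = qualifies(True, True, ["Asthma", "ADHD"])
suspicious = qualifies(True, True, ["Asthma", "ADHD", "Diabetes"])
```

Note that the respondent never sees which condition is the target, which is the whole point of the series.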

In an ideal world, we wouldn’t have to worry about all this. Respondents would simply be honest. But as surveys become more ubiquitous, dishonest respondents are finding creative ways to enter into surveys for which they don’t qualify. Robust screener question series combat this trend, and if written well, can almost entirely eliminate the risk of fraudulent responses.

How to Design your First Market Research Survey

At the end of the day, entrepreneurs need consumer data. Investors simply won’t trust your gut. Nor should you.

That said, the first big step toward turning your product or service idea into a sound business concept, and ultimately toward wowing investors, is market research. And the bottom line is that your first market research survey should answer three very specific questions.

In this Startup Grind article, our Founder Nick Freiling identifies these three questions, and explains how to design a basic, first-pass market research survey to test & validate your product idea.

To learn more about how PeopleFish empowers entrepreneurs to get feedback on their market research questions, click here.

How to Overcome Sampling Bias in Your Market Research Survey

In the market research world, sampling bias is a consistent error that arises due to the way a survey’s sample was selected. It occurs when a sample is not random, meaning certain types of respondents are more or less likely to be chosen for the sample.

The result: Survey results that don’t reflect the population you purport to represent. Instead, they reflect a skewed sample.

For example, a survey of potential voters in the upcoming presidential election may suffer from sampling bias if the list of people invited to take the survey comes from, say, a conservative think tank’s donor list. Such respondents are going to be more likely to favor the Republican candidate than are voters in general, and it’s precisely those voters in general that a political pollster probably cares about.

Overcoming sampling bias

Generally speaking, sampling bias cannot be identified or overcome by examining a survey’s response data alone. Sampling bias is identified only by comparing a survey’s sample to the population of interest.

In other words, you can’t just look at a survey’s results and decide the sample is biased one way or another. You can (and should) compare a survey’s results to other similar surveys to see how respondents’ sentiments might differ, but that’s inexact.

The only way to accurately measure sampling bias is to compare your survey’s sample, on every relevant characteristic imaginable, to the general population your survey aims to understand. This, of course, is impossible, but that doesn’t mean we can’t get close.

For example, pollsters from the hypothetical voter survey mentioned above might include questions about their respondents’ ages, political affiliations, and past voting behavior, then compare those results to other surveys of the voting population to see how well their sample compares.

You can see here, though, that judging sampling bias relies heavily on intuition. What characteristics are relevant to your particular survey? What should you look for when judging whether your survey sample is biased, based on the issues you’re trying to understand?

What does this mean for you?

First, don’t trust just any survey about your customers. Regardless of the topic or sample, analysts must consider how the sample selection may be biased, and what differences may exist between those who did and did not complete the survey. Further, these differences must be considered in light of the client’s final research question — it could be, for whatever reason, that differences between those who did and did not complete the survey aren’t meaningful to you and won’t affect your key takeaways.

Second, be vigilant in collecting your customers’ contact information. When my team conducts a survey, we try to include as many customers as possible in the survey sample. If we are surveying a coffee shop’s customers, for example, about their willingness to pay more for a particular menu item, our results come with big caveats if we are only able to survey customers that have, say, a rewards account at the coffee shop. Those customers are going to differ from customers who do not belong to the rewards program. They may love the coffee more than non-members and be more willing to pay extra for the menu item in question. Or they may feel betrayed, as rewards members, being asked if they’d pay extra.

There’s really no way of knowing for sure that the sample isn’t biased unless we survey every single customer, and we get closer to that if we have contact info for as many customers as possible.

Finally, include demographic variables in your customer surveys. This way, you can compare the makeup of your sample (the “average” respondent) to what your intuition tells you is your “average” customer. Useful variables include gender, age, household income, job, family size, and/or other behaviors and characteristics relevant to your product or service. Asking for these demographic variables also has the benefit of allowing you to cut and segment your results along them, perhaps exposing opportunities to up-sell or improve your targeted marketing campaigns.
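
To make that comparison concrete, here is a rough Python sketch that lines up your sample's age mix against the mix you'd expect among your customers. Everything here is illustrative: the age groups, the expected shares (which would come from your own records or intuition), and the 10-percentage-point threshold are assumptions, not standards.

```python
from collections import Counter

# Hypothetical age groups reported by survey respondents.
sample_age_groups = ["18-34"] * 12 + ["35-54"] * 6 + ["55+"] * 2

# Assumed customer mix, from your own records or intuition (shares sum to 1).
expected_shares = {"18-34": 0.40, "35-54": 0.35, "55+": 0.25}


def share_gaps(sample_values, expected):
    """Sample share minus expected share for each group."""
    counts = Counter(sample_values)
    total = len(sample_values)
    return {group: counts[group] / total - share for group, share in expected.items()}


gaps = share_gaps(sample_age_groups, expected_shares)

# Flag groups that are off by more than 10 percentage points (arbitrary threshold).
skewed = sorted(group for group, gap in gaps.items() if abs(gap) > 0.10)
```

In this made-up sample, younger customers are over-represented and the 55+ group is under-represented, which is a cue to caveat your findings or recruit more respondents from the missing group.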

One more thing…

All this might sound hopeless. Unless we survey every single customer, we can’t be 100% sure our sample isn’t biased one way or another.

But as mentioned above, intuition really is key to knowing whether your sample is biased and how that might affect your key findings and inferences. Our best work happens with clients who understand their customers in a way that supplements whatever survey they’re trying to run. If a business owner knows from experience, for example, that his rewards program customers are more loyal and generally willing to pay more for his products, we can account for that when drawing inferences from the survey results. If he knows from experience that previous price increases have not affected his business with non-rewards members, we can account for that when drawing inferences from the survey results.

Surveys are typically quantitative market research, but quantitative data must be interpreted through the lens of experience and subject matter expertise. When it comes to your customers, you probably know more than any single survey can tell you.