Verbatims from open-ended survey questions are a rich source of insight for market researchers, and a great way for your survey to tell you something you didn’t already know. But surveys often leave them out, because analyzing text responses has historically been a hassle.

What if coding text verbatims were fun and easy? Would we ask them more often? Might we learn more of what the market is often very willing to tell us?

If you have a current survey with text verbatim responses, let us know. We’re running a study you might be interested in…

A typical survey

Look at nearly any quant market research survey. If it is like most, all or nearly all of the questions are closed-end:

  • Check all that apply…
  • Rate the following…
  • Rank these items…

Ok, maybe there are some numeric open ends. And maybe there are a couple “Other (specify)____” items. But still.

Now go back and look at the questions clients ask in a typical RFP:

  • “Why do customers purchase …”
  • “What are the top strengths…”

Product managers, marketers, and the other people who depend on insights from the research tend to ask open-ended questions. In fact, on survey platforms where product managers design their own surveys (e.g. SurveyMonkey), you tend to see a lot of open-ended questions. But when the analyst on the hook for the analysis is involved in survey design, verbatims typically disappear quickly.

This is because analyzing verbatim text responses has historically been a pain.

"I avoid text open ends because they're a lot of work, not because they're not valuable." -- Senior market researcher


A simple way to use text verbatims is to scan through them and pull out a few great quotes to add “color” to the final report in a “Zagat review” style. That’s pretty simple.


More systematically, coding verbatims is a great way to make quantitative sense of the responses. You can summarize them, crosstab them, and look for correlations.
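
As a minimal sketch of what "summarize and crosstab" can look like once responses are coded, the snippet below counts how often each code appears overall and within a respondent segment. The data, the "segment" field, and the code labels are all hypothetical, made up purely for illustration; a real study would load coded responses from the survey file.

```python
from collections import Counter

# Hypothetical coded responses: a human coder has assigned one or more
# codes to each verbatim. Field names and values are illustrative only.
responses = [
    {"segment": "New", "codes": ["mobility"]},
    {"segment": "New", "codes": ["cost"]},
    {"segment": "Existing", "codes": ["mobility", "cost"]},
    {"segment": "Existing", "codes": ["mobility"]},
]


def code_frequencies(rows):
    """Count how many responses mention each code (once per response)."""
    counts = Counter()
    for row in rows:
        counts.update(set(row["codes"]))
    return counts


def crosstab(rows, by):
    """Count codes separately for each value of the `by` field."""
    table = {}
    for row in rows:
        table.setdefault(row[by], Counter()).update(set(row["codes"]))
    return table


overall = code_frequencies(responses)
by_segment = crosstab(responses, "segment")

for segment, counts in sorted(by_segment.items()):
    print(segment, dict(counts))
```

Counting each code once per response keeps a long, repetitive answer from outweighing a short one.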

But coding verbatims is a major hassle, so much so that surveys rarely ask for open-ended text responses, and when they do, only sparingly.

Text Analytics

There are a number of software platforms that promise automated “text analytics” based on sentiment or keywords. But automated text analysis misses much of the nuance. For instance, in a recent survey, medical device patients said they liked a product because it connects to their cell phone, with quotes like: “It works with my cell phone, so I can travel.” “It works with my cell phone, so I can cancel my landline.”

These both mention “cell phone” but give completely different reasons: one promises mobility, the other promises economics. APIs for text analytics and “sentiment” analysis often miss nuances like these, which are essential to product marketing insights. If verbatims were jelly beans, automated text analytics might sort them by color, but still mix “lime” and “peppermint.”

Automated analytics might sort jelly beans by color, but a human can tell 'Lime' from 'Peppermint'
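
To make the point concrete, here is a deliberately naive keyword tagger applied to the two quotes above. It is a toy sketch, not how any particular text analytics product works; the rule table and function name are hypothetical. It tags both quotes identically, while the human codes (mobility vs. economics, as described above) differ.

```python
quotes = [
    "It works with my cell phone, so I can travel.",
    "It works with my cell phone, so I can cancel my landline.",
]

# Hypothetical keyword rules: phrase to look for -> topic to assign.
KEYWORD_TOPICS = {"cell phone": "cell phone"}


def keyword_tags(text, rules=KEYWORD_TOPICS):
    """Return the sorted topics whose keyword appears in the text."""
    lowered = text.lower()
    return sorted({topic for keyword, topic in rules.items() if keyword in lowered})


# Codes a human reader might assign to the same two quotes.
human_codes = {quotes[0]: "mobility", quotes[1]: "economics"}

auto_tags = [keyword_tags(q) for q in quotes]

for quote, tags in zip(quotes, auto_tags):
    print(tags, "vs. human code:", human_codes[quote])
```

Adding more keywords ("travel", "landline") would separate these two quotes, but only after a person has read them and noticed the distinction, which is the coding work itself.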

In most surveys, there are dozens, hundreds, or maybe a couple thousand responses, but few enough that a person could look at them all – and would probably want to. We just want it to be fun and easy.

Outsourced coding

Coding can be outsourced, and there are professionals and even firms who do nothing but this. But that’s pretty expensive, and reserved for a very few questions, on a very few surveys, by a very few firms. Otherwise, in our world, it’s up to you, the market researcher.

Text Verbatims in Protobi

Protobi now offers a Verbatim Coding Widget that dramatically streamlines the process of creating, refining and analyzing codes for text responses (as well as more advanced widgets such as

If you just want the coding done for you, Protobi can work with external professional coding partners to do this, typically within a week, for $0.25 to $0.50 per response. Contact us for a quote.

Current research

If you have a current survey with text verbatim responses, let us know. We’re running a study to crowdsource verbatim coding and answer some interesting questions:

  • How long does it take people to code a question?
  • Can our crowd code verbatims as well as the study’s own professional analyst?
  • How consistent / reliable are codes across people?
  • Who comes closest to matching the researcher’s own codes?
  • Do other coders identify categories/distinctions that the researcher finds useful?
  • How does the number of categories emerge / coalesce vs. the number of items?
  • How many items do you need to categorize before you’ve basically got it all?
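
A couple of these questions can be put in numbers. The sketch below is one possible way to do it, not the study's actual method: it uses Cohen's kappa, a standard chance-corrected agreement statistic, to compare two coders who each assigned one code per response, and a simple running count of distinct codes to see how quickly new categories stop appearing. The example codes and the function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders, one code per item."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of items")
    n = len(codes_a)
    # Share of items where the two coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


def saturation_curve(codes):
    """Number of distinct codes seen after each successive item."""
    seen = set()
    curve = []
    for code in codes:
        seen.add(code)
        curve.append(len(seen))
    return curve


# Hypothetical codes from the study's analyst and from one crowd coder.
analyst = ["mobility", "economics", "mobility", "other"]
crowd = ["mobility", "economics", "economics", "other"]

kappa = cohens_kappa(analyst, crowd)
curve = saturation_curve(analyst)
print(round(kappa, 3), curve)
```

A kappa near 1 means the coders agree far more than chance would predict, and a value near 0 means they agree about as often as chance. When the saturation curve flattens, additional responses are no longer producing new categories.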