Clean/revise survey data

Sometimes you need to change the data from your survey, for example to correct an outlier or recode a value. This article shows a few different ways to do that.

{filename}

1. Edit data directly in Excel and upload

For project teams in a hurry, a time-honored approach is to download the data, modify it in Excel, and upload the revised file. This is fine, and here's an article on how to do that: https://help.protobi.com/update-project-data.

In the example above, you manually edit cell E5 from 999 to 120, then upload the revised file.

But we generally don't recommend this approach, because you or a teammate might later need to repeat these edits by hand when new data arrives.

2. Transform data within Protobi

If the issue is a simple outlier, Protobi has a number of features to recode, transform, and winsorize data.

  • Recode selected values, e.g. missings to zero, or 999 to missing.
  • Winsorize or trim values to cap them within reasonable ranges for reporting and summary stats. See

In the example above, you could set element Q1 to squash answers to the range 0 to 120.

The benefit of keeping your changes in the app is that they are clearly documented, easily modified, and reapplied whenever the data is refreshed.
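To make the recode and winsorize operations concrete, here is a minimal sketch of what each does to a column of answers. It is plain Python for illustration only, not Protobi's implementation or configuration syntax; the function names `recode` and `winsorize` and the sample values are hypothetical.

```python
from typing import Dict, List, Optional

Value = Optional[float]  # None stands in for a missing answer


def recode(values: List[Value], mapping: Dict[Value, Value]) -> List[Value]:
    """Replace every value that appears as a key in mapping; keep the rest."""
    return [mapping.get(v, v) for v in values]


def winsorize(values: List[Value], low: float, high: float) -> List[Value]:
    """Cap each non-missing value to the range [low, high]."""
    return [None if v is None else min(max(v, low), high) for v in values]


# Hypothetical answers to Q1, including a 999 code and an outlier.
q1 = [35, 999, None, 150, 62]

as_missing = recode(q1, {999: None})   # [35, None, None, 150, 62]
as_zero = recode(q1, {None: 0})        # [35, 999, 0, 150, 62]
capped = winsorize(q1, 0, 120)         # [35, 120, None, 120, 62]
```

Note the difference between the two: recoding 999 to missing removes the value from summary stats, while winsorizing keeps the respondent but caps the value at 120.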

3. Revise data programmatically in Protobi

Rather than edit data values directly, we recommend identifying the rules and programming those in Protobi as a data process. Here's an article on that:

The benefit of writing your changes as code is that they can be reapplied when new data arrives, plus the rules are documented if you ever need to review, revise, or reverse them.
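As a rough sketch of that idea, the example below expresses each revision as a described rule and reapplies the same rules when a new wave of data arrives. It is written in Python for illustration and does not use Protobi's data process syntax; the `Rule` class, the `apply_rules` function, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass
class Rule:
    description: str                 # documents why the change is made
    field: str                       # which column the rule touches
    applies: Callable[[Any], bool]   # when the rule fires
    fix: Callable[[Any], Any]        # what the value becomes


# Hypothetical rules for Q1; edit this list to revise or reverse a change.
RULES = [
    Rule("Q1: treat 999 as missing", "Q1",
         lambda v: v == 999, lambda v: None),
    Rule("Q1: cap values above 120", "Q1",
         lambda v: v is not None and v > 120, lambda v: 120),
]


def apply_rules(rows: List[Row], rules: List[Rule]) -> Tuple[List[Row], List[str]]:
    """Return cleaned copies of rows plus a log of every change made."""
    cleaned, log = [], []
    for i, row in enumerate(rows):
        new_row = dict(row)  # copy, so the original data stays untouched
        for rule in rules:
            value = new_row.get(rule.field)
            if rule.applies(value):
                new_row[rule.field] = rule.fix(value)
                log.append(f"row {i}: {rule.description}")
        cleaned.append(new_row)
    return cleaned, log


wave1 = [{"id": 1, "Q1": 45}, {"id": 2, "Q1": 999}]
wave2 = wave1 + [{"id": 3, "Q1": 150}]  # new data arrives later

cleaned1, log1 = apply_rules(wave1, RULES)
cleaned2, log2 = apply_rules(wave2, RULES)  # same rules, no manual re-editing
```

Because the original rows are never modified, reversing a change is a matter of removing its rule and running the rules again.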

You can write code that runs every time the project is opened (aka "Precalculate"). This is useful for smaller code or when the study is still in field.

Or you can write code that runs once on request and stores the result for rapid access (aka "Data process"). This is useful when the code is complex or you want to lock the result down.
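The trade-off between the two modes can be sketched as follows. This is a conceptual illustration in Python, not how Protobi stores or runs code; the function names and the `cleaned.json` file are hypothetical stand-ins for "run on open" and "run once and store".

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List

Row = Dict[str, Any]


def clean(rows: List[Row]) -> List[Row]:
    """The revision logic itself (here: treat 999 in Q1 as missing)."""
    return [{**r, "Q1": None if r["Q1"] == 999 else r["Q1"]} for r in rows]


def open_with_precalculate(raw_rows: List[Row]) -> List[Row]:
    """Run-on-open style: recompute from the raw data on every call."""
    return clean(raw_rows)


def run_process_once(raw_rows: List[Row], store: Path) -> None:
    """Run-on-request style: compute once and save the result."""
    store.write_text(json.dumps(clean(raw_rows)))


def open_with_stored_result(store: Path) -> List[Row]:
    """Later opens read the stored result and run no cleaning code."""
    return json.loads(store.read_text())


raw = [{"id": 1, "Q1": 45}, {"id": 2, "Q1": 999}]
store = Path(tempfile.mkdtemp()) / "cleaned.json"
run_process_once(raw, store)

raw.append({"id": 3, "Q1": 999})         # a new respondent arrives in field
live = open_with_precalculate(raw)       # 3 rows: always reflects current data
stored = open_with_stored_result(store)  # 2 rows: locked until the process is rerun
```

The run-on-open result always tracks the latest data at the cost of recomputing each time, while the stored result is fast and stable but only changes when the process is run again.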

4. Revise data programmatically in R, Python or SPSS

Protobi is designed to play well with other data tools such as R or SPSS. For heavy lifting, particularly statistical modeling like regression, clustering or conjoint analysis, it's easy to get data from Protobi into your preferred toolset, and then back into Protobi for display.
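As one possible shape for that round trip, the sketch below revises an exported CSV in Python using only the standard library and writes a new CSV to bring back. It assumes the data has been exported as CSV with a column named Q1; the function name `revise_csv`, the column, the thresholds, and the sample data are hypothetical, and the sketch does not call Protobi's R library or REST API.

```python
import csv
import io
from typing import TextIO


def revise_csv(src: TextIO, dst: TextIO, field: str = "Q1",
               low: float = 0, high: float = 120,
               missing_code: str = "999") -> None:
    """Copy src to dst, blanking the missing code and capping field to [low, high]."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        raw = row[field]
        if raw == missing_code:
            row[field] = ""            # blank cell stands for missing
        elif raw != "":
            value = float(raw)
            if value < low:
                row[field] = str(low)
            elif value > high:
                row[field] = str(high)
            # in-range values keep their original text
        writer.writerow(row)


# In-memory stand-ins for the exported file and the revised file.
exported = io.StringIO("id,Q1\n1,45\n2,999\n3,150\n")
revised = io.StringIO()
revise_csv(exported, revised)
print(revised.getvalue())
```

With real files, the same function can be called with `open("export.csv", newline="")` and `open("revised.csv", "w", newline="")`, where both file names are placeholders.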

Protobi has a full-featured R library and REST API to support working in other languages: