Protobi allows you to weight data with respondent-level weights.

By default, Protobi counts each respondent equally. In practice we may need to weight some responses or respondents more than others.

Why weight?

A survey sample might not exactly match the population in some aspect. For instance, in a survey of 1000 consumers we collect a sample that has 600 male and 400 female respondents. However, there is other data that tells us the actual market in that category is 50% male and 50% female.

Gender    Survey sample    Population
Male      60%              50%
Female    40%              50%


In this case, we might need to differentially weight the responses to match both the population proportions and the sample size. To do this we can

  • Up-weight female respondents by 1.250 (i.e. 400 x 1.250 = 500)
  • Down-weight male respondents by 0.8333 (i.e. 600 x 0.8333 ≈ 500)
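The arithmetic above can be sketched in a few lines of JavaScript. This is only an illustration of the calculation, not Protobi code; the function name cellWeights is hypothetical. Each group's weight is its population share divided by its share of the sample.

```javascript
// Sample counts from the survey and target shares from the outside market data.
const sampleCounts = { Male: 600, Female: 400 };
const populationShare = { Male: 0.5, Female: 0.5 };

// Hypothetical helper: weight = population share / sample share.
function cellWeights(counts, targets) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const weights = {};
  for (const group of Object.keys(counts)) {
    weights[group] = targets[group] / (counts[group] / total);
  }
  return weights;
}

const weights = cellWeights(sampleCounts, populationShare);
console.log(weights.Female.toFixed(4)); // 1.2500
console.log(weights.Male.toFixed(4));   // 0.8333
```

Because the weights are ratios of shares, the weighted total stays at the original sample size of 1000.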

Another use for weighting is weighting by patient volume or purchase volume. In a survey of physicians about their treatment patterns, some physicians may treat many patients and some fewer. We can weight respondents by the number of patients the physician treats, so that the results project to the patients they treat.
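As a rough illustration of volume weighting, the sketch below compares a per-physician share with a per-patient share. The three physicians, the field names patients and prefersA, and the helper weightedShare are all made up for this example.

```javascript
// Hypothetical responses: patients treated, and whether drug A is the
// physician's first choice (1 = yes, 0 = no).
const physicians = [
  { id: 1, patients: 200, prefersA: 1 },
  { id: 2, patients: 50, prefersA: 0 },
  { id: 3, patients: 50, prefersA: 0 },
];

// Weighted share = sum(weight * value) / sum(weight).
// With no weight field, every respondent counts as 1.
function weightedShare(rows, valueField, weightField) {
  let numerator = 0;
  let denominator = 0;
  for (const row of rows) {
    const w = weightField ? row[weightField] : 1;
    numerator += w * row[valueField];
    denominator += w;
  }
  return numerator / denominator;
}

const byPhysician = weightedShare(physicians, "prefersA");
const byPatient = weightedShare(physicians, "prefersA", "patients");
console.log(byPhysician.toFixed(2)); // 0.33
console.log(byPatient.toFixed(2));   // 0.67
```

One in three physicians prefers drug A, but those physicians account for two thirds of the patients, which is the figure that projects to the patient population.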

Define a weight column or value

In your data file

To weight data differentially in Protobi, your dataset must have a column that has a weight value for each respondent. If the column is not already in your data file, you can add it manually and upload the revised data. You can name the weight field anything, e.g., ‘RESP_WT’. In the example above, the column would have the value 1.25 for female respondents, and 0.8333 for male respondents.

If your SPSS SAV file identifies a weight column, Protobi should automatically recognize it as a global weight.

In Protobi

Alternatively, you can define a weight as a scalar (e.g. 1.02 or 100), instead of a field name. This can be useful if you need to weight all respondents equally but by some number other than one. This works with both global and individual weights.
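The difference between a field name and a scalar can be pictured with the sketch below. It is an illustration only: resolveWeight and weightedCount are hypothetical helpers, not Protobi functions, and the three rows are made up.

```javascript
// Hypothetical helper: turn a weight setting into a numeric weight
// for one respondent row.
function resolveWeight(row, weight) {
  if (weight === null || weight === undefined) return 1; // unweighted
  if (typeof weight === "number") return weight;          // scalar: same for everyone
  return Number(row[weight]);                             // field name: per-respondent value
}

// Weighted count = sum of the respondents' weights.
function weightedCount(rows, weight) {
  return rows.reduce((sum, row) => sum + resolveWeight(row, weight), 0);
}

const respondents = [{ RESP_WT: 1.25 }, { RESP_WT: 0.8333 }, { RESP_WT: 1.25 }];
console.log(weightedCount(respondents, null));                  // 3
console.log(weightedCount(respondents, 100));                   // 300
console.log(weightedCount(respondents, "RESP_WT").toFixed(4));  // 3.3333
```

A scalar changes the scale of the counts but leaves percentages unchanged, since every respondent is multiplied by the same number.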

Weight by an element

You can theoretically weight by any element in your project by referencing its key in the Weight field.

For instance, we could weight each customer by their annual sales. Using sales as a weight assigns each respondent a weight equal to their answer to that question.

Some questions are not suitable to use as weights. For instance, if you weight by the element below, type, you are assigning weights based on the underlying unformatted values: 74% of respondents would get a weight of “0”, and 26% of respondents a weight of “1”.

Unformatted type values:


If we were to apply type as a weight to the element below, we would reduce the N size from 157 to 41, effectively discounting all respondents with vehicle type “Automobile” because the underlying value is “0”.
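The drop in N follows directly from summing the weights. The sketch below assumes 116 respondents coded 0 and 41 coded 1, counts inferred from the 74% / 26% split of 157 respondents rather than taken from the data file.

```javascript
// Assumed counts: 116 respondents coded 0 ("Automobile") and 41 coded 1
// (the other category), which matches a 74% / 26% split of 157.
const typeValues = [...Array(116).fill(0), ...Array(41).fill(1)];

// Unweighted N counts every respondent once.
const unweightedN = typeValues.length;

// Weighted N is the sum of the weights, so a weight of 0 removes a respondent.
const weightedN = typeValues.reduce((sum, w) => sum + w, 0);

console.log(unweightedN); // 157
console.log(weightedN);   // 41
```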


Set individual weights

More properties dialog

For individual elements, specify weights by selecting “More properties” from the context menu. Next to the “weight” field, enter the data column to reference.

You can also enter Excel-like formulas to specify weights.


Note: When adding numeric fields in a formula you need to include a “+” sign before the variable name to tell it to add numerically (2+7=9) not alphabetically (“2” + “7” = “27”).
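The note above mirrors how plain JavaScript treats string values. The sketch below shows that behavior in standalone JavaScript, on the assumption that weight formulas follow the same rules; the field names q1 and q2 are hypothetical.

```javascript
// Values read from a data file often arrive as strings.
const answers = { q1: "2", q2: "7" };

// Without a leading "+", the operator joins the two strings.
const concatenated = answers.q1 + answers.q2;

// A leading "+" converts each value to a number first, so they add numerically.
const summed = +answers.q1 + +answers.q2;

console.log(concatenated); // 27 (the string "27")
console.log(summed);       // 9
```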

If weights are specified, a default footnote appears; you can overwrite the footnote.

In JSON editor

Weights for an individual element can be set in JSON as well. Select the element and press “Edit JSON….” to modify the element properties in JSON syntax. Define a property “weight” with the name of the weight field.

Weights on individual elements override any global weights. For instance, to avoid weighting the global weight field by itself, we can specify "weight": null.

The example below is from the car_sales.sav dataset and here we’re weighting automobiles by the field sales:
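A minimal sketch of what such element properties might look like, written as JavaScript objects. Only the "key" and "weight" properties come from the text above; the element keys, the global weight name RESP_WT, and the helper effectiveWeight are assumptions used to illustrate the override rule, not Protobi's internal logic.

```javascript
// Hypothetical element definitions; only "key" and "weight" matter here.
const elements = [
  { key: "type", weight: "sales" },  // weighted by the sales column
  { key: "RESP_WT", weight: null },  // explicitly unweighted
  { key: "region" },                 // no weight property: uses the global weight
];

// Hypothetical rule: an element's own "weight" property (even null)
// takes precedence over the global weight.
function effectiveWeight(element, globalWeight) {
  const hasOwnWeight = Object.prototype.hasOwnProperty.call(element, "weight");
  return hasOwnWeight ? element.weight : globalWeight;
}

for (const element of elements) {
  console.log(element.key, "->", effectiveWeight(element, "RESP_WT"));
}
// type -> sales
// RESP_WT -> null
// region -> RESP_WT
```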

Set a global weight

You can define a weight field globally that applies to all elements. Press the icon under the toolbar to edit Project properties.


This will bring up the Project properties dialog, where you can specify a global weight by entering the name of the column that holds the weights:


Toggle weights on/off

If a global weight field is defined for your dataset (e.g., using data column S8), you’ll see a new toolbar button that specifies the weight that the project is using.

You can press this to toggle weights on or off. Its name will change from “Weighted” to “Unweighted” so you can quickly see if the results are weighted or unweighted. “Weighted” is the default.

Define more than one global weight scheme

Protobi can include multiple weight schemes. For instance, a study may weight data one way to project results to the population of patients and another way to project to the population of physicians.

Protobi looks for a special group element with the key $weights, and interprets each child element of this group as a variable that can be used as weights. Each appears in the Weights dropdown.

Press the button at the end of the list of tabs, and enter “$weights”. This will find or create a group with that key. Within this group, add child elements corresponding to each weight column. One way is to drag weight elements from the tree on the left and drop them into the $weights group (optionally hold the Shift key when dropping to copy rather than move).

Alternatively you can directly specify weight columns by editing the JSON for the $weights group. For example:

{
    "roundby": "auto",
    "key": "$weights",
    "children": [
        "S8v1",
        "S1",
        "S2"
    ],
    "type": "empty"
}

Now the dropdown menu in the Weight button in the toolbar will contain all the children as global weight schemes:


Weight to multiple target characteristics

The weighting examples above are applicable when you only want to apply one weight scheme to the project at a time. However, you might want to weight to multiple target characteristics. We can do this using the Random Iterative Method, also known as RIM weighting or raking.

For example, a survey sample may have set quotas to equally sample physicians and nurses, but in the target market nurses may be 70% of customers.

Role         Survey sample    Population
Physician    50%              30%
Nurse        50%              70%

Similarly, the distribution of region in the survey may differ from the target population.

Region    Survey sample    Population
East      40%              30%
West      60%              70%

It is possible to calculate one weight scheme that adjusts the distribution for more than one variable. This is a little more complex, as setting weights for one variable may affect the distribution of other correlated variables.

For this you can use an iterative algorithm called “RIM weighting”, described here and also in the accordion below. This gist shows a function that calculates weights for selected variables, and runs in a data process.
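To show the idea behind the iteration, here is a minimal standalone sketch of raking in plain JavaScript. It is not the Protobi function or the gist mentioned above: rimWeights, weightedShare, the ten sample rows, and the iteration and tolerance defaults are all assumptions made for illustration. Each pass rescales the weights so one variable matches its targets, then moves to the next variable, and repeats until the adjustments become negligible.

```javascript
// rows: array of respondent objects.
// targets: { variable: { category: targetShare } }.
function rimWeights(rows, targets, maxIterations = 50, tolerance = 1e-6) {
  const weights = rows.map(() => 1); // start with everyone weighted equally
  for (let iter = 0; iter < maxIterations; iter++) {
    let maxChange = 0;
    for (const [variable, shares] of Object.entries(targets)) {
      // Current weighted total for each category of this variable.
      const totals = {};
      let grandTotal = 0;
      rows.forEach((row, i) => {
        const category = String(row[variable]);
        totals[category] = (totals[category] || 0) + weights[i];
        grandTotal += weights[i];
      });
      // Rescale each respondent so the category shares hit the targets.
      rows.forEach((row, i) => {
        const category = String(row[variable]);
        if (!(category in shares)) return; // no target: leave the weight alone
        const factor = (shares[category] * grandTotal) / totals[category];
        maxChange = Math.max(maxChange, Math.abs(factor - 1));
        weights[i] *= factor;
      });
    }
    if (maxChange < tolerance) break; // every adjustment was negligible
  }
  return weights;
}

// Weighted share of rows where row[variable] equals category.
function weightedShare(rows, weights, variable, category) {
  let hit = 0;
  let total = 0;
  rows.forEach((row, i) => {
    total += weights[i];
    if (row[variable] === category) hit += weights[i];
  });
  return hit / total;
}

// Made-up sample: 50% physicians, 40% East, with role and region correlated.
const sample = [
  { role: "Physician", region: "East" }, { role: "Physician", region: "East" },
  { role: "Physician", region: "East" }, { role: "Physician", region: "West" },
  { role: "Physician", region: "West" }, { role: "Nurse", region: "East" },
  { role: "Nurse", region: "West" }, { role: "Nurse", region: "West" },
  { role: "Nurse", region: "West" }, { role: "Nurse", region: "West" },
];
const rim = rimWeights(sample, {
  role: { Physician: 0.3, Nurse: 0.7 },
  region: { East: 0.3, West: 0.7 },
});
console.log(weightedShare(sample, rim, "role", "Physician").toFixed(2)); // 0.30
console.log(weightedShare(sample, rim, "region", "East").toFixed(2));    // 0.30
```

Because role and region are correlated in this sample, one pass is not enough: fixing region disturbs role slightly, which is why the loop repeats until both distributions settle.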


Example RIM weighting algorithm

// Runs inside a Protobi data process; `callback` is assumed to be supplied by that environment.
Protobi.get_tables(["main", "OE"], function (err, data) {
    if (err) return callback(err);

    Protobi.get_elements(function (err, protobi) {
        if (err) return callback(err);

        var rows = data["main"]; // primary data file
        protobi.setData(rows);

        // Target proportions for each variable, by category.
        Protobi.calculate_rim_weights(protobi, "weight", {
            "s3": {
                "1": 0.49,
                "2": 0.51
            },
            "s2": {
                "0": 0.84,
                "1": 0.16
            },
            "region": {
                "Northeast": 0.17,
                "Midwest": 0.21,
                "West": 0.24,
                "South": 0.38
            }
        });

        return callback(null, rows);
    });
});


Support

Weighting can get complex and each firm has its own approaches. Our support team is ready to help you with your specific goals. We can help you set up code that does what you need and show you how to modify it from there. Please contact us at support@protobi.com.
