Prohibitions and Ways to Avoid Them – The Data Story Guide

This article defines prohibitions, explains why people create them and why they are often not a good idea, and describes alternatives to using them.

Definition

A prohibition is a rule indicating that certain combinations of attribute levels should not appear in an experiment. For example, in a study of the price and brand of a car, showing premium brands with low prices may be prohibited.
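As a minimal sketch of what a prohibition does, the snippet below enumerates every brand-price combination and drops the prohibited one. The brands, prices, and the prohibited pair are hypothetical values chosen for illustration, and real design algorithms do more than filter a full factorial; this only shows the effect of the rule on the set of permitted combinations.

```python
from itertools import product

# Hypothetical attribute levels for a car study (illustrative only).
brands = ["Ferrari", "BMW", "Toyota"]
prices = [20_000, 50_000, 100_000]

# Hypothetical prohibition: a premium brand at the lowest price.
prohibited = {("Ferrari", 20_000)}

# Every brand-price combination, then the subset the prohibition permits.
full_factorial = list(product(brands, prices))
allowed = [combo for combo in full_factorial if combo not in prohibited]

print(len(full_factorial), len(allowed))  # 9 8
```

Because the prohibited combination never appears, brand and price are no longer fully balanced against each other, which is the source of the efficiency loss discussed later in the article.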

Reasons for creating prohibitions

Broadly speaking, there are two quite different reasons for creating prohibitions:

To ensure that experiments are not difficult for respondents to answer. For example, if an experiment included a treatment reflecting different descriptions of a product, to see which description was more appealing, it is generally advisable to prohibit one respondent from seeing both descriptions.
To ensure that experiments are consistent with the "real world". For example, if for cost reasons a telecommunications company would never try to create a phone that was both very cheap and had lots of features, the company may not want the experimental design to include such combinations.

Ensuring that an experimental design addresses the first of these considerations will increase the validity of the experiment.

The problem with prohibitions

Ensuring that the experimental design addresses the second of these considerations may reduce the validity of the experiment, because the increase in standard errors caused by the prohibition may offset any benefit obtained.

For example, when a prohibition preventing Ferrari from appearing at $20K is added to the balanced overlap design reviewed in Choice of Experimental Design Algorithm, a sample size approximately 15% larger is required to produce the same D-errors as with the prohibition-free design.

Ways of avoiding prohibitions

Alternatives to using prohibitions include:

Doing nothing. Often prohibitions come about because a stakeholder sees an alternative in a choice questionnaire and says "we would never do that". For example, Godiva may regard itself as a luxury brand and see sugar-free chocolate as being inconsistent with this positioning. This is not a good reason to use a prohibition. A better reason would be that the combination is so implausible that it would cause respondents to give poor-quality data.
Combining attributes. In the car example, rather than creating an experimental design with a price attribute and a brand attribute, create a design with a single combined brand-price attribute (also known as a superattribute). It can still be presented to respondents as separate attributes.
Alternative-specific designs (see Alternative-Specific Designs).
Conditional pricing tables. For example, create the experimental design with four price levels, but have a specific set of price levels for each of the brands. If doing this, it is important to check the design carefully using the techniques described in the next two chapters. Additionally, it is likely to be a good idea to estimate price as a numeric attribute in the final model (other approaches are possible, but they require some careful thought at analysis time).
Summed pricing. This approach involves creating an experimental design using all the non-price attributes and then creating a price variable by:
- Assigning some value to each of the attribute levels. For example, Ferrari may be assigned a value of $100,000, BMW $50,000, Electric $5,000, etc.
- Summing up the values for each alternative, so that you have a base price for each alternative in each question.
- Randomly generating a number (e.g., in the range of 0.8 to 1.25), and multiplying this by the prices computed in the previous step.
- Modifying the prices to make them consistent with how they are likely to appear in the category (e.g., rounded, or ending in 99).
Pivot designs. With these designs, a respondent is asked to describe their current purchasing (e.g., their current house), and the experimental design is used to generate levels relative to this. For example, if a person’s house price is $500,000, and the price attribute has levels of -10%, 0%, and 10%, then the respondent is shown levels of $450,000, $500,000, and $550,000. A practical problem with such experimental designs is working out the best way to model the data (e.g., should the price be modeled as a numeric variable using the actual prices, or as a categorical variable with three levels).
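Two of the alternatives above, summed pricing and pivot designs, come down to simple arithmetic and can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names, the "Petrol" level, the uniform random multiplier, and rounding to the nearest $1,000 are all hypothetical choices, while the Ferrari, BMW, and Electric values and the 0.8 to 1.25 range echo the examples in the text.

```python
import random

# Values assigned to attribute levels. Ferrari, BMW, and Electric follow the
# article's example; "Petrol" is a hypothetical placeholder level.
LEVEL_VALUES = {
    "Ferrari": 100_000,
    "BMW": 50_000,
    "Electric": 5_000,
    "Petrol": 0,
}


def summed_price(levels, rng, low=0.8, high=1.25, round_to=1_000):
    """Sum the level values, scale by a random factor, round to a tidy price."""
    base = sum(LEVEL_VALUES[level] for level in levels)
    factor = rng.uniform(low, high)  # assumed uniform; other choices are possible
    return round(base * factor / round_to) * round_to


def pivot_prices(current_price, relative_levels=(-0.10, 0.0, 0.10)):
    """Convert relative design levels into prices around a respondent's own price."""
    return [round(current_price * (1 + r)) for r in relative_levels]


rng = random.Random(0)  # seeded so the sketch is reproducible
price = summed_price(["BMW", "Electric"], rng)
print(price)  # a multiple of 1,000 between 44,000 and 69,000

print(pivot_prices(500_000))  # [450000, 500000, 550000]
```

In the summed pricing sketch, the random multiplier is what keeps price from being perfectly determined by the other attributes, so that its effect can still be estimated separately.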