When we weight data, we create weights so that the weighted data matches known facts about the world. These known facts are called targets. Targets can be set for categorical adjustment variables and for numeric adjustment variables.
Examples of targets
- 12% of people live in California
- 12% of adults are aged 18 to 24
- 49% of adults are male
- 6% of adults are aged 18 to 24 and male
- 0.72% of adults are aged 18 to 24, male, and living in California
- Coca-Cola’s market share is 44%
Categorical adjustment variables
In the example about Brad Pitt and Tiger Woods, gender was the adjustment variable and the adjustment targets were Male 50% and Female 50%. These targets are proportions. Targets can also be provided as population totals (e.g., Male 159,000,000 and Female 172,000,000).
Because gender consists of categories, it is a categorical adjustment variable.
The categories of a categorical adjustment variable need to be mutually exclusive and exhaustive. That is, every respondent must belong to exactly one category.
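To make the idea concrete, here is a minimal sketch of how a weight for a single categorical adjustment variable could be computed: each respondent's weight is the target proportion of their category divided by the proportion observed in the sample. The function name `categorical_weights` and the ten-person sample are hypothetical and are used only for illustration.

```python
from collections import Counter


def categorical_weights(sample, targets):
    """Return one weight per respondent so weighted shares match `targets`."""
    # Exhaustive categories: the target proportions must add up to 1.
    if abs(sum(targets.values()) - 1.0) > 1e-9:
        raise ValueError("target proportions must sum to 1")
    counts = Counter(sample)
    # Every observed category needs a target, and every target needs data.
    if set(counts) != set(targets):
        raise ValueError("sample categories and target categories must match")
    n = len(sample)
    # Weight for a category = target share / observed share.
    factor = {c: targets[c] / (counts[c] / n) for c in targets}
    return [factor[c] for c in sample]


# Hypothetical sample: 3 men and 7 women, weighted to 50% / 50%.
sample = ["Male"] * 3 + ["Female"] * 7
targets = {"Male": 0.5, "Female": 0.5}
weights = categorical_weights(sample, targets)

male_share = sum(w for w, g in zip(weights, sample) if g == "Male") / sum(weights)
print(round(male_share, 3))
```

In this sketch the under-represented men receive weights above 1 and the over-represented women receive weights below 1, while the weights still add up to the sample size.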
Numeric adjustment variables
It is also possible to use numeric adjustment variables, such as the number of people in the household or the number of products consumed.
When using numeric adjustment variables, the targets are the population averages of these variables or, alternatively, their population totals.
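The text does not say how weights are computed for a numeric target, so the sketch below assumes one simple option, linear calibration: each weight is 1 plus a multiple of the respondent's distance from the sample mean, with the multiple chosen so that the weighted mean equals the target. The function name `numeric_weights` and the household sizes are hypothetical.

```python
def numeric_weights(values, target_mean):
    """Linear calibration: weights average 1 and hit `target_mean` exactly."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    if variance == 0:
        raise ValueError("cannot calibrate on a variable with no variation")
    # Slope chosen so that the weighted mean moves from `mean` to `target_mean`.
    slope = (target_mean - mean) / variance
    return [1 + slope * (x - mean) for x in values]


# Hypothetical sample of household sizes with a population target mean of 2.6.
household_sizes = [1, 2, 2, 3, 4]
weights = numeric_weights(household_sizes, target_mean=2.6)

weighted_mean = sum(w * x for w, x in zip(weights, household_sizes)) / sum(weights)
print(round(weighted_mean, 3))
```

One caveat with this assumed method: if the target is far from the sample mean, some weights can become negative, which is why practical software usually bounds the weights or uses a different calibration function.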
It is often appropriate to use composite adjustment variables, such as age-by-gender-by-geography. This is so common in practice that most software used for creating weights lets you supply tables as adjustment variables (e.g., see the example below) and constructs the composite variables in the background.
The Excel screenshot above shows targets expressed in terms of age-by-gender-by-region. Note that these are expressed as percentages that add up to 100. It is possible in many programs to weight to actual population totals (e.g., the number of males aged 18 to 29 in DC, etc.). However, it tends to be safer to first weight to proportions and then, if necessary, multiply this weight by sample size. This is because when you enter population totals it becomes a lot harder to get an intuitive feel for the numbers, as they become too big to easily interpret.
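As a rough illustration of what software does in the background, the sketch below builds a composite age-by-gender-by-region variable by combining the three values into a single cell label, and then weights to a table of percentages that add up to 100. The respondents, the cell labels, and the target percentages are all made up for the example.

```python
from collections import Counter

# Hypothetical respondents.
respondents = [
    {"age": "18-29", "gender": "Male", "region": "DC"},
    {"age": "18-29", "gender": "Female", "region": "DC"},
    {"age": "18-29", "gender": "Female", "region": "DC"},
    {"age": "30+", "gender": "Male", "region": "DC"},
    {"age": "30+", "gender": "Female", "region": "DC"},
]

# Hypothetical table of targets, one percentage per cell, adding up to 100.
targets_pct = {
    ("18-29", "Male", "DC"): 20.0,
    ("18-29", "Female", "DC"): 30.0,
    ("30+", "Male", "DC"): 25.0,
    ("30+", "Female", "DC"): 25.0,
}

# The composite adjustment variable: one cell label per respondent.
cells = [(r["age"], r["gender"], r["region"]) for r in respondents]
counts = Counter(cells)
n = len(cells)

# Weight = target proportion of the cell / observed proportion of the cell.
weights = [(targets_pct[c] / 100) / (counts[c] / n) for c in cells]

for cell, weight in zip(cells, weights):
    print(cell, round(weight, 3))
```

Because the targets are proportions, the resulting weights average 1, which keeps them easy to inspect by eye.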
Using multiple adjustment variables
It is routine to use multiple adjustment variables when creating sampling weights. For example, in How to Improve a Weight, the following targets are used:
Many business-oriented studies will weight by age, gender, geography, and some measure of brand usage (e.g., market share).
The Pew Research Center's American Trends Panel weights by:
- Country of birth among Hispanics
- Residing inside vs outside the US among Hispanics and Asian Americans
- Years lived in the US
- Home internet access
- Census region-by-Metro/Non-metro
- Voter registration
- Party affiliation
- Frequency of internet use
- Religious affiliation
- Self-reported voter turnout
- Leaned party affiliation among nonvoters
- Vote choice among voters
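When several adjustment variables are used at once, the weights have to satisfy all of the targets simultaneously. The text does not name an algorithm, so the sketch below assumes one widely used approach, raking (iterative proportional fitting): adjust the weights to match the first variable's targets, then the second's, and repeat until they stop changing. The function names, the eight respondents, and the target proportions are hypothetical.

```python
def weighted_share(weights, respondents, variable, category):
    """Weighted proportion of respondents in `category` of `variable`."""
    total = sum(weights)
    hit = sum(w for w, r in zip(weights, respondents) if r[variable] == category)
    return hit / total


def rake(respondents, targets, max_iterations=100, tolerance=1e-10):
    """Raking: cycle through the variables, rescaling weights to each target."""
    weights = [1.0] * len(respondents)
    for _ in range(max_iterations):
        largest_adjustment = 0.0
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            current = {}
            for w, r in zip(weights, respondents):
                current[r[variable]] = current.get(r[variable], 0.0) + w / total
            # Rescale each respondent by target share / current share.
            for i, r in enumerate(respondents):
                factor = shares[r[variable]] / current[r[variable]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tolerance:
            break  # every target is already matched
    return weights


# Hypothetical sample and targets for two adjustment variables.
respondents = (
    [{"gender": "Male", "age": "18-24"}] * 2
    + [{"gender": "Male", "age": "25+"}] * 1
    + [{"gender": "Female", "age": "18-24"}] * 2
    + [{"gender": "Female", "age": "25+"}] * 3
)
targets = {
    "gender": {"Male": 0.5, "Female": 0.5},
    "age": {"18-24": 0.3, "25+": 0.7},
}
weights = rake(respondents, targets)

print(round(weighted_share(weights, respondents, "gender", "Male"), 3))
print(round(weighted_share(weights, respondents, "age", "18-24"), 3))
```

Unlike a composite variable, this approach only needs the targets for each variable separately, which is useful when the joint distribution (for example, party affiliation by religious affiliation) is not known.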