Binary Data

Binary data is a special case of nominal data, which either:

Contains exactly two unique values (e.g., 0 or 1, Male or Female, Taller or Shorter); or
Contains more than two categories, where:
- Two of the categories are of interest.
- The remaining categories are treated as missing values in the analysis (e.g., Missing data, Don't know, Refused).
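
As a rough illustration of the second case, the sketch below recodes a variable with more than two categories into a binary variable. The response labels, the `CODES` mapping, and the `to_binary` helper are all hypothetical, and `None` is used here as one possible way to represent a missing value.

```python
# Hypothetical survey responses; the category labels are illustrative only.
responses = ["Yes", "No", "Don't know", "Yes", "Refused", "No", "Yes"]

# The two categories of interest are coded as 1 and 0.
CODES = {"Yes": 1, "No": 0}


def to_binary(value):
    """Return 1 or 0 for the two categories of interest, otherwise None (missing)."""
    return CODES.get(value)


binary = [to_binary(r) for r in responses]
print(binary)  # [1, 0, None, 1, None, 0, 1]
```

Any later calculation would then need to skip the `None` entries, so that the remaining categories are excluded from the analysis rather than counted as 0.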

Binary variables are also known as boolean variables, dichotomous variables, flags, and indicator variables.

Binary data is particularly useful in data analysis because, despite being a special case of nominal data, it also has the properties of ordinal and interval data. For example, consider a binary variable that contains a value of 0 for people who consumed no cans of Coca-Cola in the past week, and a value of 1 for people who consumed one or more cans of Coca-Cola in the past week. With such data:

We can perform all the calculations that we would perform using nominal data (because it is technically nominal data).
When binary data is coded as 1s and 0s, the average of the data is the same as the proportion of 1s in the data. This useful relationship can save a lot of time when computing summary statistics. More generally, many techniques developed for interval data are applicable to binary data (e.g., linear regression).
Binary variables can be merged using a logical OR operation, so that the merged variable is 1 whenever any of the original variables is 1.
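
The last two points can be checked with a small sketch. The data below is made up, and the variable names (`coke`, `pepsi`, `either`) are hypothetical; the example assumes both variables are coded as 1s and 0s with no missing values.

```python
# Hypothetical data: 1 = consumed one or more cans in the past week, 0 = consumed none.
coke = [1, 0, 0, 1, 1, 0, 1, 1]
pepsi = [0, 0, 1, 1, 0, 0, 0, 1]

# The proportion of 1s, computed by counting.
proportion_of_ones = coke.count(1) / len(coke)

# The average, computed as for interval data.
average = sum(coke) / len(coke)

print(proportion_of_ones, average)  # 0.625 0.625

# Merge the two variables with an OR: 1 if either input is 1.
either = [a | b for a, b in zip(coke, pepsi)]
print(either)  # [1, 0, 1, 1, 1, 0, 1, 1]
```

Because the merged variable is itself binary, its average (0.75 here) can again be read as a proportion: the share of people who consumed at least one of the two drinks.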
