Filtering involves analyzing a subset of all the available data. This article presents a simple example of filtering, discusses the main ways of applying filters, and explains boolean logic and filter variables.
The table below on the left shows age groups for a sample of 327 people. The table on the right has been filtered to show only the data for males.
Alternative ways of presenting filters
Sometimes filters are presented as lists of checkboxes at the top of columns. The rows of the table are then filtered according to the selection.
The example below is more complex. In this example, the visualization is filtered by clicking on the age and gender categories, and this causes the entirety of the analysis and visualization to be re-created using only the selected ages and genders.
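The selection-driven filtering described above can be sketched in a few lines of Python. This is only an illustration: the rows, the column names age_group and gender, and the helper matches are all hypothetical. It also assumes a common convention, which the source does not spell out, that a row is kept when its value is among the ticked categories in every column.

```python
# Hypothetical rows; the column names and values are made up for illustration.
rows = [
    {"age_group": "18 to 24", "gender": "Male"},
    {"age_group": "18 to 24", "gender": "Female"},
    {"age_group": "25 to 34", "gender": "Female"},
    {"age_group": "35 to 44", "gender": "Female"},
]

# The categories a user has ticked or clicked, one set per column (assumed).
selected = {
    "age_group": {"18 to 24", "25 to 34"},
    "gender": {"Female"},
}

def matches(row, selection):
    """Keep a row only if, for every column, its value is among the selected categories."""
    return all(row[column] in allowed for column, allowed in selection.items())

# Any analysis or visualization would then be rebuilt from these rows only.
filtered_rows = [row for row in rows if matches(row, selected)]
print(len(filtered_rows), "of", len(rows), "rows kept")
```

Changing the contents of selected and re-running the last two lines mimics clicking a different set of categories.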
What is a filter?
A filter is a rule about the subset of data to be used in an analysis. It can be a simple rule, such as Males, or a much more complex rule, such as Males in Florida who eat S'mores on Sundays.
Formally, filters are defined using boolean logic.
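As a rough sketch of what "defined using boolean logic" can mean in practice, the two rules mentioned above might be written as Python functions that return true or false for each observation. The records and field names (gender, state, eats_smores_on_sundays) are assumptions made up for this example.

```python
# Hypothetical observations; field names and values are illustrative only.
people = [
    {"gender": "Male", "state": "Florida", "eats_smores_on_sundays": True},
    {"gender": "Male", "state": "Texas", "eats_smores_on_sundays": True},
    {"gender": "Female", "state": "Florida", "eats_smores_on_sundays": True},
    {"gender": "Male", "state": "Florida", "eats_smores_on_sundays": False},
]

def is_male(person):
    """Simple rule: a single condition."""
    return person["gender"] == "Male"

def male_floridian_smores_eater(person):
    """Complex rule: several conditions joined with boolean AND."""
    return (
        is_male(person)
        and person["state"] == "Florida"
        and person["eats_smores_on_sundays"]
    )

# Applying a filter means keeping the observations for which the rule is true.
males = [p for p in people if is_male(p)]
complex_subset = [p for p in people if male_floridian_smores_eater(p)]
print(len(males), len(complex_subset))
```

Other rules can be built the same way by combining conditions with and, or, and not.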
Applying versus creating filters
In simpler analysis software, filters are applied manually to each analysis (e.g., by choosing options from a list of checkboxes). In more advanced software, there is a clear distinction between:
- Creating filters.
- Applying filters that have previously been created.
For example, the image below shows how previously created filters are applied in Displayr.
Data analysis software that allows filters to be reused will often create a new variable in the data set that represents the filter. This filter variable can then be supplied to any analysis that needs the filter, and it can also be used as an ordinary input to other analyses (e.g., as a column in a crosstab).
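The create-once, apply-many pattern can be sketched as follows. This is a minimal illustration rather than how any particular product implements it; the data, the variable name filter_male, and the helper age_counts are hypothetical.

```python
from collections import Counter

# Hypothetical data set: one dict per observation.
data = [
    {"gender": "Male", "age_group": "18 to 24"},
    {"gender": "Female", "age_group": "18 to 24"},
    {"gender": "Male", "age_group": "25 to 34"},
    {"gender": "Female", "age_group": "25 to 34"},
    {"gender": "Male", "age_group": "25 to 34"},
]

# Creating the filter: evaluate the rule once and store it as a new variable.
for row in data:
    row["filter_male"] = row["gender"] == "Male"

def age_counts(rows, filter_name=None):
    """Count age groups, optionally using only rows where the named filter is true."""
    kept = [r for r in rows if filter_name is None or r[filter_name]]
    return Counter(r["age_group"] for r in kept)

# Applying the filter: the same stored variable is passed to an analysis.
all_counts = age_counts(data)
male_counts = age_counts(data, "filter_male")

# The filter variable can also act as an ordinary input, here as the
# column dimension of a simple crosstab keyed by (age group, filter value).
crosstab = Counter((r["age_group"], r["filter_male"]) for r in data)
print(all_counts, male_counts, crosstab)
```

Because the rule is evaluated only where the filter is created, changing its definition in that one place updates every analysis that uses it.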
A filter variable stores a true or false value for every observation in the data set. For example, the variable below on the left shows Gender, and the filter variable to its right represents the rule Gender is Male, with the label Not selected representing false and Selected representing true.
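In code, such a filter variable is simply a column of true and false values with display labels attached. The sketch below uses made-up Gender values; only the labels Selected and Not selected come from the description above.

```python
# Hypothetical Gender variable, one value per observation.
gender = ["Male", "Female", "Female", "Male", "Male"]

# The filter variable holds one true/false value per observation.
filter_values = [g == "Male" for g in gender]

# Labels used when the filter variable is displayed.
LABELS = {True: "Selected", False: "Not selected"}
filter_labels = [LABELS[value] for value in filter_values]

# Show the two variables side by side.
for g, label in zip(gender, filter_labels):
    print(f"{g:<8}{label}")
```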