Grouping together the most common combinations of missing data can also provide insight into data integrity problems. A heatmap visualization, such as the one below, can help to identify common issues.
- Cyan cells show missing data.
- The most common missing data patterns are shown at the bottom.
We can see that:
- The most common missing data pattern, covering 364 observations, is no data on Q3 but data on all the other questions.
- 321 observations have no missing data.
- Q3 most commonly shows up in missing data patterns.
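
The same pattern counts can be tabulated directly rather than read off a heatmap. The following is a minimal sketch using pandas, not the tool that produced the figure. The data frame, its column names, and the helper `missing_patterns` are all hypothetical and only illustrate the idea of grouping rows by which variables are missing.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data; the values are illustrative only.
df = pd.DataFrame({
    "Q1": [1, 2, np.nan, 4, 5, 3],
    "Q2": [3, 1, 2, 2, np.nan, 4],
    "Q3": [np.nan, 2, np.nan, np.nan, 1, 5],
})


def missing_patterns(data: pd.DataFrame) -> pd.Series:
    """Count how often each combination of missing columns occurs."""
    mask = data.isna()
    # Label each row with the names of the columns that are missing in it.
    labels = mask.apply(
        lambda row: ", ".join(col for col, is_missing in row.items() if is_missing)
        or "(none missing)",
        axis=1,
    )
    # value_counts sorts the patterns from most to least frequent.
    return labels.value_counts()


patterns = missing_patterns(df)
print(patterns)

# Number of missing values per question, to see which question
# is missing most often overall.
per_column = df.isna().sum()
print(per_column)
```

Each row of `patterns` corresponds to one missing data pattern, and `per_column` shows which question is missing most often.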
Understanding the implications of such missing data patterns depends on the context. For example, if Q3 is a question that is only asked of a subset of respondents, the above patterns are unlikely to be of interest, because the missing values are expected rather than a sign of a data integrity problem.
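
One way to check whether missing values are explained by the questionnaire design is to cross-tabulate missingness against the rule that decides who is asked the question. The sketch below assumes a hypothetical skip rule (Q3 is asked only when Q2 is "Yes") and made-up data. The real rule, if any, has to come from the questionnaire itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Q2 acts as a filter question for Q3.
df = pd.DataFrame({
    "Q2": ["Yes", "No", "No", "Yes", "Yes", "No"],
    "Q3": [4.0, np.nan, np.nan, 2.0, np.nan, np.nan],
})

# Assumed skip rule (not from the source): Q3 is only asked when Q2 == "Yes".
asked_q3 = df["Q2"] == "Yes"
q3_missing = df["Q3"].isna()

# Cross-tabulate who was asked Q3 against whether Q3 is missing.
table = pd.crosstab(asked_q3, q3_missing,
                    rownames=["asked Q3"], colnames=["Q3 missing"])
print(table)

# Missing values among respondents who were never asked are expected.
explained = int((~asked_q3 & q3_missing).sum())
# Missing values among respondents who were asked may need investigation.
unexplained = int((asked_q3 & q3_missing).sum())
print(f"explained by skip rule: {explained}, unexplained: {unexplained}")
```

If nearly all missing values fall in the "not asked" group, the pattern reflects the survey design. Missing values among respondents who were asked are the ones worth a closer look.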