Only a small number of analysis methods automatically address missing values. In particular (code sketches of several of these approaches appear after the list):
- Some classification tree methods automatically deal with missing values by treating them as extra categories of the predictor variables. In particular:
  - SPSS's CHAID algorithm.
  - Q's CART algorithm.
- Some segmentation algorithms automatically address missing values using a Missing At Random (MAR) assumption. In particular, Q and Displayr's:
  - K-Means Cluster Analysis
  - Latent Class Analysis
- Some regression and predictive modeling algorithms address missing values using the dubious Missing Completely At Random (MCAR) assumption. In particular:
  - SPSS's Linear Regression when Missing Values is set to Exclude cases pairwise
  - Q and Displayr's Linear Regression when Missing data is set to Use partial data (pairwise correlations)
- Some regression and predictive modeling algorithms address missing values using a Missing At Random (MAR) assumption. In particular, Q and Displayr do this for all of their predictive models (e.g., linear regression, logistic regression) when Missing data is set to Multiple imputation.
- Some regression and predictive modeling algorithms address missing values using a Non-ignorable assumption, which is appropriate when it is believed the data is missing because the cases have no experience relating to the missing predictor variables. In particular, Q and Displayr do this when Missing data is set to Dummy variable adjustment.
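
The sketches below illustrate the general ideas behind these approaches in Python. They are illustrations only, not the implementations used by SPSS, Q, or Displayr, and all data sets and variable names in them are made up. The first sketch shows a classification tree that treats missing values as an extra category of each categorical predictor (scikit-learn stands in for CHAID/CART here):

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy data: two categorical predictors with gaps, and a binary outcome.
df = pd.DataFrame({
    "age_group": ["18-34", None, "35-54", "55+", None, "18-34"],
    "region":    ["North", "South", None, "North", "South", "South"],
    "bought":    [1, 0, 1, 0, 1, 0],
})

# Recode missing values as an explicit "Missing" category...
X = df[["age_group", "region"]].fillna("Missing")

# ...and one-hot encode, so the tree can split on "Missing" just like
# any other level of the predictor.
X = pd.get_dummies(X)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, df["bought"])
```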
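For segmentation, one common way of handling missing values under MAR is to use only the observed values of each case in both the assignment and the centroid-update steps of k-means. The sketch below shows that idea with plain NumPy; it is not Q or Displayr's K-Means Cluster Analysis implementation:

```python
import numpy as np

def kmeans_with_missing(X, k, n_iter=50, seed=0):
    """K-means that uses only the observed entries of each case,
    rather than dropping incomplete cases."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = ~np.isnan(X)

    # Initialise centroids at randomly chosen cases, with their own
    # gaps filled by the column means.
    filled = np.where(observed, X, np.nanmean(X, axis=0))
    centroids = filled[rng.choice(n, size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Assignment step: squared distance over observed dimensions only.
        dists = np.empty((n, k))
        for j in range(k):
            diff = np.where(observed, X - centroids[j], 0.0)
            dists[:, j] = (diff ** 2).sum(axis=1)
        labels = dists.argmin(axis=1)

        # Update step: per-cluster means of the observed values,
        # leaving a dimension unchanged if no member observed it.
        for j in range(k):
            obs = observed[labels == j]
            sums = np.where(obs, X[labels == j], 0.0).sum(axis=0)
            counts = obs.sum(axis=0)
            has_data = counts > 0
            centroids[j, has_data] = sums[has_data] / counts[has_data]

    return labels, centroids
```

Usage would be, for example, `labels, centroids = kmeans_with_missing(data, k=3)`, where `data` is a NumPy array with `np.nan` marking the missing values.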
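The pairwise (MCAR) approach estimates the regression from covariances computed pair by pair, each using whatever cases are available for that pair of variables. A minimal sketch, with simulated data:

```python
import numpy as np
import pandas as pd

# Simulate some data and poke holes in it completely at random
# (the MCAR assumption); the variable names are made up.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=200)
df = pd.DataFrame(X, columns=["x1", "x2", "x3"])
df["y"] = y
df = df.mask(rng.random(df.shape) < 0.2)

predictors = ["x1", "x2", "x3"]

# pandas computes each covariance and mean from whatever cases are
# available for that pair/variable, which is the pairwise treatment.
cov = df.cov()
means = df.mean()

beta = np.linalg.solve(cov.loc[predictors, predictors].to_numpy(),
                       cov.loc[predictors, "y"].to_numpy())
intercept = means["y"] - means[predictors].to_numpy() @ beta
```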
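Multiple imputation under MAR fills in the missing values several times, fits the model to each completed data set, and pools the results. The sketch below shows that general procedure with scikit-learn; it is not Q or Displayr's implementation, and the pooling step is simplified (it averages the estimates and omits Rubin's combined variance formula):

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

def pooled_regression(df, predictors, outcome, n_imputations=10):
    """Fit a linear regression to each of several imputed data sets and
    average the intercept and coefficients across imputations."""
    estimates = []
    for m in range(n_imputations):
        imputer = IterativeImputer(sample_posterior=True, random_state=m)
        completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
        model = LinearRegression().fit(completed[predictors], completed[outcome])
        estimates.append(np.append(model.intercept_, model.coef_))
    return np.mean(estimates, axis=0)

# e.g. pooled_regression(df, ["x1", "x2", "x3"], "y") with the data
# frame from the previous sketch.
```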
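Dummy variable adjustment retains every case by adding, for each predictor with missing values, an indicator of whether the value was missing, and replacing the missing values themselves with a constant. A minimal sketch with a made-up data set:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "x1": [1.0, 2.0, np.nan, 4.0, 5.0, np.nan],
    "x2": [3.0, np.nan, 1.0, 2.0, np.nan, 4.0],
    "y":  [2.1, 3.0, 1.4, 3.8, 4.2, 2.5],
})

X = df[["x1", "x2"]].copy()
for col in ["x1", "x2"]:
    # 1 = the case had no data for this predictor, 0 = observed.
    X[col + "_missing"] = X[col].isna().astype(int)
    # Replace the missing value with a constant so the case is kept;
    # the indicator's coefficient absorbs the adjustment.
    X[col] = X[col].fillna(0)

model = LinearRegression().fit(X, df["y"])
```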
When a technique is needed and no variant of it that handles missing data is available, the process is to:
- Understand the missing data (see Checking and Understanding Missing Data).
- If appropriate, impute the missing data (both steps are sketched below).
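
A minimal sketch of these two steps, with made-up variables (the imputation model and settings are illustrative only):

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.DataFrame({
    "x1": [1.0, np.nan, 3.0, 4.0, 2.0],
    "x2": [np.nan, 2.0, 2.5, np.nan, 1.5],
})

# Step 1: understand the missing data - how much is missing, and in
# which combinations of variables.
print(df.isna().mean())          # proportion missing per variable
print(df.isna().value_counts())  # frequency of each missingness pattern

# Step 2: if appropriate, impute (a model-based single imputation here).
completed = pd.DataFrame(IterativeImputer(random_state=0).fit_transform(df),
                         columns=df.columns)
```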