Only a small number of analysis methods automatically address missing values. In particular (code sketches of several of these approaches appear after the list):
- Some classification tree methods automatically deal with missing values by treating them as extra categories of the predictor variables. In particular:
  - SPSS's CHAID algorithm.
  - Q's CART algorithm.
- Some segmentation algorithms automatically address missing values using a Missing At Random (MAR) assumption. In particular, Q and Displayr's:
  - K-Means Cluster Analysis
  - Latent Class Analysis
- Some regression and predictive modeling algorithms address missing values using the dubious Missing Completely At Random (MCAR) assumption. In particular:
  - SPSS's Linear Regression when Missing Values is set to Exclude cases pairwise
  - Q and Displayr's Linear Regression when Missing data is set to Use partial data (pairwise correlations)
- Some regression and predictive modeling algorithms address missing values using a Missing At Random (MAR) assumption. In particular, Q and Displayr do this for all of their predictive models (e.g., linear regression, logistic regression) when Missing data is set to Multiple imputation.
- Some regression and predictive modeling algorithms address missing values using a Non-ignorable assumption, which is appropriate when it is believed the data is missing because the cases have no experience relating to the missing predictor variables. In particular, Q and Displayr do this when Missing data is set to Dummy variable adjustment.
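
The sketches below illustrate the general ideas behind these approaches in Python. They are illustrations only, not the implementations used by SPSS, Q, or Displayr, and all data sets and variable names in them are made up. The first sketch shows a classification tree that treats missing values as an extra category of each categorical predictor (scikit-learn stands in for CHAID/CART here):

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy data: two categorical predictors with gaps, and a binary outcome.
df = pd.DataFrame({
    "age_group": ["18-34", None, "35-54", "55+", None, "18-34"],
    "region":    ["North", "South", None, "North", "South", "South"],
    "bought":    [1, 0, 1, 0, 1, 0],
})

# Recode missing values as an explicit "Missing" category...
X = df[["age_group", "region"]].fillna("Missing")

# ...and one-hot encode, so the tree can split on "Missing" just like
# any other level of the predictor.
X = pd.get_dummies(X)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, df["bought"])
```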
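For segmentation, one common way of handling missing values under MAR is to use only the observed values of each case in both the assignment and the centroid-update steps of k-means. The sketch below shows that idea with plain NumPy; it is not Q or Displayr's K-Means Cluster Analysis implementation:

```python
import numpy as np

def kmeans_with_missing(X, k, n_iter=50, seed=0):
    """K-means that uses only the observed entries of each case,
    rather than dropping incomplete cases."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = ~np.isnan(X)

    # Initialise centroids at randomly chosen cases, with their own
    # gaps filled by the column means.
    filled = np.where(observed, X, np.nanmean(X, axis=0))
    centroids = filled[rng.choice(n, size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Assignment step: squared distance over observed dimensions only.
        dists = np.empty((n, k))
        for j in range(k):
            diff = np.where(observed, X - centroids[j], 0.0)
            dists[:, j] = (diff ** 2).sum(axis=1)
        labels = dists.argmin(axis=1)

        # Update step: per-cluster means of the observed values,
        # leaving a dimension unchanged if no member observed it.
        for j in range(k):
            obs = observed[labels == j]
            sums = np.where(obs, X[labels == j], 0.0).sum(axis=0)
            counts = obs.sum(axis=0)
            has_data = counts > 0
            centroids[j, has_data] = sums[has_data] / counts[has_data]

    return labels, centroids
```

Usage would be, for example, `labels, centroids = kmeans_with_missing(data, k=3)`, where `data` is a NumPy array with `np.nan` marking the missing values.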
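The pairwise (MCAR) approach estimates the regression from covariances computed pair by pair, each using whatever cases are available for that pair of variables. A minimal sketch, with simulated data:

```python
import numpy as np
import pandas as pd

# Simulate some data and poke holes in it completely at random
# (the MCAR assumption); the variable names are made up.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=200)
df = pd.DataFrame(X, columns=["x1", "x2", "x3"])
df["y"] = y
df = df.mask(rng.random(df.shape) < 0.2)

predictors = ["x1", "x2", "x3"]

# pandas computes each covariance and mean from whatever cases are
# available for that pair/variable, which is the pairwise treatment.
cov = df.cov()
means = df.mean()

beta = np.linalg.solve(cov.loc[predictors, predictors].to_numpy(),
                       cov.loc[predictors, "y"].to_numpy())
intercept = means["y"] - means[predictors].to_numpy() @ beta
```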
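Multiple imputation under MAR fills in the missing values several times, fits the model to each completed data set, and pools the results. The sketch below shows that general procedure with scikit-learn; it is not Q or Displayr's implementation, and the pooling step is simplified (it averages the estimates and omits Rubin's combined variance formula):

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

def pooled_regression(df, predictors, outcome, n_imputations=10):
    """Fit a linear regression to each of several imputed data sets and
    average the intercept and coefficients across imputations."""
    estimates = []
    for m in range(n_imputations):
        imputer = IterativeImputer(sample_posterior=True, random_state=m)
        completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
        model = LinearRegression().fit(completed[predictors], completed[outcome])
        estimates.append(np.append(model.intercept_, model.coef_))
    return np.mean(estimates, axis=0)

# e.g. pooled_regression(df, ["x1", "x2", "x3"], "y") with the data
# frame from the previous sketch.
```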
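Dummy variable adjustment retains every case by adding, for each predictor with missing values, an indicator of whether the value was missing, and replacing the missing values themselves with a constant. A minimal sketch with a made-up data set:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "x1": [1.0, 2.0, np.nan, 4.0, 5.0, np.nan],
    "x2": [3.0, np.nan, 1.0, 2.0, np.nan, 4.0],
    "y":  [2.1, 3.0, 1.4, 3.8, 4.2, 2.5],
})

X = df[["x1", "x2"]].copy()
for col in ["x1", "x2"]:
    # 1 = the case had no data for this predictor, 0 = observed.
    X[col + "_missing"] = X[col].isna().astype(int)
    # Replace the missing value with a constant so the case is kept;
    # the indicator's coefficient absorbs the adjustment.
    X[col] = X[col].fillna(0)

model = LinearRegression().fit(X, df["y"])
```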
When a technique is needed and no variant of it that handles missing data is available, the process is to:
- Understand the missing data (see Checking and Understanding Missing Data).
- If appropriate, impute the missing data (both steps are sketched below).
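
A minimal sketch of these two steps, with made-up variables (the imputation model and settings are illustrative only):

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.DataFrame({
    "x1": [1.0, np.nan, 3.0, 4.0, 2.0],
    "x2": [np.nan, 2.0, 2.5, np.nan, 1.5],
})

# Step 1: understand the missing data - how much is missing, and in
# which combinations of variables.
print(df.isna().mean())          # proportion missing per variable
print(df.isna().value_counts())  # frequency of each missingness pattern

# Step 2: if appropriate, impute (a model-based single imputation here).
completed = pd.DataFrame(IterativeImputer(random_state=0).fit_transform(df),
                         columns=df.columns)
```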