Friday, 6 October 2023

Pairwise or Listwise

When should you use listwise or pairwise deletion of missing values in regression analysis, or in any other analysis that lets you choose?
This question came to mind when a regression model gave hugely different results after I switched from listwise to pairwise handling of missing values. Faced with the task of deciding which one should be used in the final analysis, I resorted to exploring the data and making a subjective decision about which results made more sense.
Here are some quick decision rules that came to mind:
1. If results are similar, use pairwise, as it uses more information. 
2. If the Missing Completely At Random (MCAR) assumption looks like a good one, use pairwise. But I suggest that you try to check this assumption; to me, it is an assumption that rarely makes sense.
3. If results are meaningfully different, you are probably better off with listwise. But I suggest that you explore your data and see what makes more sense. In my case, I was testing the slope difference between groups, so I created a scatterplot for each group. The super-low pairwise p-value did not make sense: I could not see any slope difference between groups.
4. If pairwise does not work, use listwise. Sometimes the pairwise approach will produce a covariance matrix that is not positive definite, and the model will simply not run.
5. My opinion is that large sample sizes and a low percentage of missing values favor pairwise. However, I suggest that you always compare both.
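
To make the comparison in the rules above concrete, here is a minimal sketch in Python with pandas and NumPy. That is my choice for illustration, not necessarily the software behind the analysis described above. It relies on the fact that `DataFrame.cov()` excludes missing values pair by pair, while calling `dropna()` first gives the listwise version. The function name `compare_missing_handling`, the column names, and the simulated data (about 15% of values removed completely at random) are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def compare_missing_handling(df):
    """Compute listwise and pairwise covariance matrices with basic diagnostics.

    Hypothetical helper for illustration only.
    """
    # Listwise: keep only the rows with no missing value in any column.
    complete = df.dropna()
    listwise_cov = complete.cov()

    # Pairwise: pandas drops missing values separately for each pair of columns.
    pairwise_cov = df.cov()

    # Number of observations actually used for each pair of columns.
    present = df.notna().astype(int)
    pairwise_n = present.T @ present

    # A covariance matrix is positive definite only if all eigenvalues are > 0.
    # (This assumes every pair of columns has enough overlap to give a number.)
    min_eig = float(np.linalg.eigvalsh(pairwise_cov.to_numpy()).min())

    return {
        "n_listwise": len(complete),
        "pairwise_n": pairwise_n,
        "listwise_cov": listwise_cov,
        "pairwise_cov": pairwise_cov,
        "pairwise_is_pd": bool(min_eig > 0),
    }


# Simulated data (an assumption): three variables, ~15% of cells set to missing.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x", "y", "z"])
df = df.mask(rng.random(df.shape) < 0.15)

result = compare_missing_handling(df)
print("Rows used by listwise:", result["n_listwise"])
print("Observations used per pair (pairwise):")
print(result["pairwise_n"])
print("Largest absolute difference between the two covariance matrices:",
      float((result["pairwise_cov"] - result["listwise_cov"]).abs().to_numpy().max()))
print("Pairwise covariance matrix is positive definite:", result["pairwise_is_pd"])
```

Looking at how many observations each approach uses, how far apart the two matrices are, and whether the pairwise matrix is positive definite covers rules 1, 3, 4 and 5 in a quick first pass. It does not replace looking at the data itself.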

Yes, there will be times when you need to use subjective judgment, and in fact there is no way to be 100% objective. But I feel that questions such as whether the sample size is large, whether the results are meaningfully different, or whether the results make sense are all important in statistics, and always subjective to some extent. A good practice is to make clear what you did and why.


