I find it useful to write about texts that I read, since that usually helps me understand the content and commit it to memory (which is sort of a difficult thing, by the way).
This time I found an interesting paper in The American Statistician, which talks about exterior matching. I was familiar with the concept of matching (although I have never really used it) but not with exterior matching.
It is common in causal inference, especially with observational data, to match subjects with others that are similar on some important variables before comparing groups. For example, the paper talks about survival time for black women with breast cancer being lower than for white women. That on its own is quite concerning, but it is important to understand why it happens.
In causal inference we will often have a research question like "does treatment X improve survival time for breast cancer?" and, if all we have is observational data, we would like to match or control for things like race in order to make sure the apparent effect of treatment X is not actually due to confounding.
But in this case we have these two races and we are more interested in why the survival times are different, so that we can possibly do something about it. It could be because black women come to the hospital at more advanced stages, or because black women are older, or because they receive a different treatment, or who knows, it could be many things.
The idea is that we first match women according to age. Every black woman is matched with a white woman who had breast cancer and is the same age. Then we look at survival time. We then match each black woman with a white woman a second time, now on both age and stage of cancer.
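To make the two matches concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy data is invented, the `exact_match` helper is hypothetical, and it pairs women greedily on exact values, which is not necessarily how the paper builds its matches.

```python
# Toy data, invented for illustration: (id, age, stage of cancer).
black = [("b1", 50, 2), ("b2", 60, 3), ("b3", 60, 1)]
white = [("w1", 50, 1), ("w2", 50, 2), ("w3", 60, 3), ("w4", 60, 1), ("w5", 60, 2)]


def exact_match(focal, pool, key):
    """Greedily pair each focal subject with the first unused pool subject
    that has the same value of key (hypothetical helper, not from the paper)."""
    used = set()
    pairs = {}
    for f in focal:
        for p in pool:
            if p[0] not in used and key(p) == key(f):
                pairs[f[0]] = p[0]
                used.add(p[0])
                break
    return pairs


# First match: age only.
match_age = exact_match(black, white, key=lambda s: s[1])

# Second match: age and stage of cancer.
match_age_stage = exact_match(black, white, key=lambda s: (s[1], s[2]))

print(match_age)
print(match_age_stage)
```

In this toy example b1 gets a different white match in each comparison, while b2 and b3 end up with the same white woman both times.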
The survival times may become more similar after matching on age, because now we are not just comparing black with white women, we are also making sure they have the same age. So this is the first comparison, matched on age only. In the second comparison we further make sure they have the same initial stage of cancer. It is therefore natural and expected, although not guaranteed, that in the first comparison the survival times move closer together to the extent that the groups differ in age and age influences survival time.
By testing whether survival times differ significantly between the first and second comparisons, we can gather evidence on whether the initial stage of cancer is a factor, after controlling for age. This can be thought of as a simple comparison of survival times between the white women in the first match and the white women in the second match.
The issue is that these two groups of white women often overlap, that is, the same white woman appears in both matches. The groups are therefore not independent and cannot be compared through the usual statistical tests, which assume independence.
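The overlap is easy to see in code. This is a small sketch with made-up match results (hypothetical IDs, where each dictionary maps a black woman to her matched white woman); the overlap is simply the set of white women who appear in both matches.

```python
# Made-up match results for illustration: black woman -> matched white woman.
match_age = {"b1": "w1", "b2": "w3", "b3": "w4"}
match_age_stage = {"b1": "w2", "b2": "w3", "b3": "w4"}

# White women used as a match in both comparisons.
overlap = set(match_age.values()) & set(match_age_stage.values())

print(sorted(overlap))  # ['w3', 'w4']
```

Here two of the three white women in each group are shared, so the two white groups are far from independent.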
The paper is about solving this difficulty by excluding the overlaps and working only with the white women that appear in just one of the matches, matching them in the best possible way in order to end up with three groups (black, white from match 1 and white from match 2) that are independent.
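A deliberately crude sketch of the exclusion step, again with made-up match results and hypothetical IDs. It only drops every black woman whose white match falls in the overlap and keeps the remaining triples; it does not attempt the "best possible" re-matching of the remaining women that the paper is about, so treat it as an illustration of the idea rather than the paper's method.

```python
# Made-up match results for illustration: black woman -> matched white woman.
match_age = {"b1": "w1", "b2": "w3", "b3": "w4"}
match_age_stage = {"b1": "w2", "b2": "w3", "b3": "w4"}

# White women that appear in both matches.
overlap = set(match_age.values()) & set(match_age_stage.values())

# Keep only triples (black, white from match 1, white from match 2)
# in which neither white woman belongs to the overlap.
triples = [
    (b, match_age[b], match_age_stage[b])
    for b in match_age
    if b in match_age_stage
    and match_age[b] not in overlap
    and match_age_stage[b] not in overlap
]

print(triples)  # [('b1', 'w1', 'w2')]
```

Only one of the three black women survives in this toy example, which shows how much data can be lost when the overlap is large.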
I think it is an interesting paper, but it seems to me that, especially if the overlap is large, we will end up with concerns about whether the final match is good and efficient. But there is no free lunch; we will always have to live with this sort of assumption.