Saturday, 17 September 2011

A few weeks ago I read a quite interesting article in Newsweek about screening tests and other medical procedures. The text can be found here. In summary, it is about medical procedures meant to save the patient that can in fact harm more than cure. This is not the first time I have come across this idea. For a very good CBC radio program about the dangers of the PSA screening test for prostate cancer, see here; and see here for screening tests in general.
People usually think that if a procedure can detect or fix something bad, then it is a good procedure. The problem is that no procedure is perfect. Screening tests can fail and detect the disease in somebody who does not have it. Invasive medical procedures can cause undesired effects and damage healthy parts of the body. With screening tests, it is often the case that only a very small part of the population has the condition the test looks for, so a large number of healthy subjects will be tested unnecessarily, often not without adverse effects. The harm the test does to healthy subjects may not be worth the lives saved through early detection.
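To make the base-rate issue concrete, here is a minimal sketch of the arithmetic with made-up numbers (the prevalence, sensitivity, and specificity below are all hypothetical): even a fairly accurate test produces far more false positives than true positives when the condition is rare.

```python
# A minimal sketch of the base-rate problem with screening tests.
# All numbers are hypothetical, chosen only for illustration.

prevalence = 0.005   # 0.5% of the population has the condition
sensitivity = 0.90   # P(test positive | has condition)
specificity = 0.95   # P(test negative | healthy)

population = 100_000
sick = population * prevalence
healthy = population - sick

true_positives = sick * sensitivity
false_positives = healthy * (1 - specificity)

# Probability that a positive result really means disease
ppv = true_positives / (true_positives + false_positives)

print(f"True positives:  {true_positives:.0f}")    # 450
print(f"False positives: {false_positives:.0f}")   # 4975
print(f"P(disease | positive test) = {ppv:.1%}")   # about 8%
```

With these numbers, out of 100,000 people screened roughly 5,400 test positive but only about 450 actually have the condition, so a positive result means disease only about 8% of the time.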
One of the procedures the Newsweek text recommends against is the electrocardiogram for people with no heart symptoms. Another is the MRI for lower back pain. I have some personal experience with both.
In Brazil, if you are a runner, you will constantly be advised to get a heart check-up: "Everybody should do a heart check-up before practicing physical activities to make sure they can run." Is there any scientific foundation for this recommendation? I think the reason is legal more than grounded in scientific evidence. Race organizers and coaches may be sued if athletes die on their hands. By making the heart check-up mandatory, they make doctors accountable rather than themselves. And if the heart check-up (which of course involves an electrocardiogram) harms runners more than it saves their lives, then the runners are also paying a price for the lack of evidence. We could argue that runners are more at risk of having a heart problem (I strongly disagree!) and that maybe an electrocardiogram would then be justified. But we still need evidence showing that. Also, I see very few runners dying while running (though such deaths become news among runners very fast!), and when someone does die, I rarely see a convincing argument that it happened because the runner had some untreated heart condition. I mean, where is the data showing that fewer runners who go through a check-up die compared to those who don't?
The second story is about the lower back MRI. I went to the doctor and told him I occasionally had back pain. Very occasionally, since I had not had bad back pain for more than two years. It was just a routine exam; I was not complaining about my back, but he asked whether I had felt any pain, so... He said I should get an MRI to see what is happening with my lower back. I refused because I did not have time then, but now I see that perhaps I did the right thing. It seems clear that many doctors are not trained in the importance of data-based evidence. They think the MRI is the right thing to do because it will allow them to see everything, and that can only be a good thing...
Saturday, 16 July 2011
And now, the salt
I have always found it interesting how things that we take as truth are so often debunked by new experiments.
I grew up listening to people and doctors saying salt is bad for our health, especially for our blood pressure. But today I found this article, which shows that the scientific evidence we have against salt is minimal, if it exists at all. I am not defending the consumption of salt but rather commenting on the bad statistical practice that we find so often in the field of medicine. We believe in many things that have no statistical foundation, and perhaps others that come from bad statistical practice, among which I would mention the lack of replication of experiments.
I am always careful when interpreting the published experimental evidence we so often see in the media, but for the majority of the population these things are taken as truth. Statistical analysis has become widespread, which is good, but I think we have reached a point where statisticians should perhaps be held more accountable for their analyses, and every experiment should be followed by a statistician or someone recognizably versed in statistics.
Sunday, 12 June 2011
Prevalence and Incidence
I have seen situations where these two words, very common in epidemiology, are used interchangeably. To tell the truth, I may have done that as well.
The fact is that prevalence measures the cases existing in a given population, whereas incidence measures only the new cases; it is the rate at which new cases occur.
Prevalence is often reported as a percentage or per 100,000. In epidemiological studies, it is usually measured by asking whether or not the respondent has the condition, and the estimate is the number of positive answers divided by the total number of respondents. In this case the prevalence is valid for the target population covered by the survey. It is also valid for that point in time; it is a snapshot of the population at that moment, as some like to say.
Incidence, on the other hand, is a rate, and therefore has a time period as part of the measure. For example, one could say that the incidence of disease X is 1% per year in a given population, meaning that in a year, 1% of the population without the condition will acquire it. In a survey, one would ask respondents whether or not they acquired the condition over the past 12 months, and the positive answers would be the numerator of the incidence estimate. The denominator would be these same respondents plus those who remained disease-free over the past 12 months, that is, everyone who was at risk at the start of the period.
When the number of cases is known for more than one year, it is usual to report incidence in terms of person-years. For example, 20 cases among 1,000 people over two years can be reported as a 2% incidence per two years or a 1% incidence per person-year, dividing the incidence by the number of years. However, if the incidence is not constant across years, this kind of reporting is not a good idea, since it just shows the average over two years whose individual incidences may be quite different.
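As a quick illustration of the definitions above, here is a small sketch with invented survey numbers; nothing here comes from a real study.

```python
# A small sketch of prevalence vs. incidence, using made-up numbers.

# Prevalence: existing cases at one point in time (the "snapshot").
respondents = 10_000
have_condition = 300
prevalence = have_condition / respondents
print(f"Prevalence: {prevalence:.1%}")            # 3.0%

# Incidence: new cases among those at risk over a period.
at_risk = respondents - have_condition            # disease-free at the start
new_cases = 97                                    # acquired it during the year
incidence = new_cases / at_risk
print(f"Incidence: {incidence:.1%} per year")     # about 1.0%

# Person-years: 20 cases among 1,000 people followed for 2 years.
cases, people, years = 20, 1000, 2
rate = cases / (people * years)
print(f"Incidence: {rate:.1%} per person-year")   # 1.0%
```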
Therefore these two measures can be quite different things and should be reported according to their definitions, to avoid misinterpretation of results.
Monday, 25 April 2011
More satisfaction, more suicide
An interesting research finding was published in the Journal of Economic Behavior and Organization. In an analysis covering many geographical units, including countries and states, it was found that the happier (more satisfied) a region is, the higher its suicide rate. That is, there is an association between these two metrics. The main explanation offered for the positive association between happiness and suicide rate was that people tend to compare themselves with others, and if they are surrounded by many happy people, the unhappy ones will likely become even more unhappy.
Notice that the correlation was found at the geographical level, yet the researchers try to establish a causal link. This is the well-known ecological fallacy, about which you can read more here; and if you want to dive deeper into the subject, there is this very good book.
The idea is that regions with higher satisfaction levels (or a higher percentage of satisfied people) will usually also have a higher suicide rate. But that is just an association, not a causal link; it is entirely possible that these things are associated without one causing the other. Moreover, when the association appears at an aggregated level like this one, the number of confounding factors can be enormous and difficult to pinpoint. That is not to say this kind of analysis is useless, but I do think they did not go far enough in testing for confounders. The article mentions demographic variables as controls, but I think region-level variables should be accounted for as well (GDP, temperature, health and development indicators...). It could be, for example (and this is just a made-up example, although I do think the development level of a region might have something to do with this association), that development causes more extreme situations, producing both happiness and depression. That could be why the two are associated: they are both caused by the degree of development of the place, meaning the association is spurious. There is no causal link; a third variable (development) in fact causes both.
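To illustrate how a third variable can produce exactly this kind of spurious regional association, here is a toy simulation (all names and coefficients are invented): "development" drives both satisfaction and the suicide rate, the two never influence each other, and yet they come out correlated across regions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_regions = 500

# Hidden common cause: regional development level
development = rng.normal(0, 1, n_regions)

# Both outcomes depend on development, not on each other
satisfaction = 0.7 * development + rng.normal(0, 1, n_regions)
suicide_rate = 0.7 * development + rng.normal(0, 1, n_regions)

r = np.corrcoef(satisfaction, suicide_rate)[0, 1]
print(f"Raw correlation across regions: {r:.2f}")   # clearly positive

# Adjusting for development (regressing it out of both variables)
# makes the association essentially disappear.
def residuals(y, x):
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y - y.mean() - slope * (x - x.mean())

r_adj = np.corrcoef(residuals(satisfaction, development),
                    residuals(suicide_rate, development))[0, 1]
print(f"Correlation after adjusting: {r_adj:.2f}")  # near zero
```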
I heard on the radio a while ago that allergies are much more common in developed countries, so maybe we could also find an association of allergy with satisfaction and suicide rate. When looking at data at the aggregated (regional) level, we need to be careful about drawing conclusions on things that in fact happen at the individual level. Unfortunately, we can find explanations (which are just hypotheses) for pretty much any correlation, but explanations we come up with after looking at the associations have very weak statistical validity.
Sunday, 23 January 2011
ARM Chapter 4 - Linear Regression: Before and After Fitting the Model
Today I want to comment on the fourth chapter of the book I am calling ARM. This chapter is about regression analysis, and I just want to single out some main points that I thought interesting. Note that this is a basic chapter on regression analysis, and some of the things here I already knew or had a sense of; others are interesting new ideas. I find these things always worth commenting on, since this book is much more about modeling than about math. Things we see here are not found in the usual regression books, and many times the only way to learn them is by experience.
1) Linear transformations - these do not change what the model fits, but they can make interpretation much easier and more meaningful. A simple example: if you are measuring distances between cities, use kilometres rather than metres.
2) Standardizing variables - again, helpful for interpretation. The intercept is now interpreted relative to the mean rather than relative to zero, which in many cases is meaningless. For example, if the independent variable is IQ, the intercept would otherwise have to be interpreted as the model's prediction when IQ is zero, which never happens. Standardization may also help in interpreting the coefficients of main effects when the model has interactions (see the first sketch after this list).
3) Standardizing by dividing by 2 standard deviations - The authors argue that this standardization makes coefficients more meaningful when the model also has binary variables, because the standardized coefficients remain comparable to the effect of changing a binary variable by one unit. While this makes sense in a way, I am not sure I completely got it. This is a paper with more details, which I still need to read.
4) Principal component and regression line - Imagine a scatterplot with X on the horizontal axis and Y on the vertical axis. The regression line is the line that minimizes the sum of the squared vertical distances to the line. The principal component line is the line that minimizes the perpendicular (shortest) distances, not the vertical ones. The regression line is the better choice here because it does what we want: minimize the error in predicting Y, or graphically, the vertical distance (see the second sketch after this list).
5) Logarithmic transformation - The use of the logarithmic (natural, base e) transformation is widespread in regression analysis. We usually justify it by saying that it makes Y normal. The book does not talk about this, maybe because the previous chapter argues that normality is not that important for inference in regression (and what matters is normality of the errors, not of Y, which is another issue altogether). But let's stick to what is in Chapter 4. Logarithmic transformation of predictors can help interpretation, since their coefficients can be read as approximate percent changes when they are small. Another point is that it may improve the model, since some effects are multiplicative rather than additive. For example, suppose income is one of the predictors. In the original scale, changing it from $10K to $20K has the same effect as changing it from $100K to $110K. If it seems reasonable that the former effect should in fact be larger, then a logarithmic or a square-root transformation might work better.
6) Modeling tips - which variables to include in the model and which to leave out. I have always found this a tricky and maybe dangerous subject. The book argues for keeping the significant variables as well as the non-significant ones whose coefficients make sense. We should think hard about significant variables that do not seem to make sense, because maybe they do make sense... Well, I really find this a complicated issue. We are in a better position when dealing with demographics or other variables that are usually not highly associated with each other, but things can get messy when multicollinearity plays a role in the model... We should also try interactions when the main effects are large.
7) To exemplify point 6 above, the authors present a model with several independent variables. They end up somewhat unsure about what to keep in the model. I think this illustrates well the day-to-day experience we have with regression models, and even better, by showing this example the book does not restrict itself to cases with beautiful solutions. However, I am not sure I agree with the way they transform the variables (creating volume, area, and shape variables), because to me the original variables seem clear enough to be kept in the model as they are. It is not necessary to create a "volume" variable from three original variables just to reduce the number of variables in the model. But it is an interesting approach anyway.
8) They also talk briefly about other customized transformations and modeling strategies. For example, if income is your dependent variable and there are zero incomes, you might want to create two models: the first modeling zero versus non-zero income, and the second modeling the size of the income conditional on it being greater than zero.
9) Another interesting reference is this one, on the standardization of variables.
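Since several of these points are about interpretation, here is the first sketch, covering points 2 and 5 on simulated data (all variable names, coefficients, and numbers are invented): centring a predictor changes only the meaning of the intercept, and the coefficient of a logged predictor can be read as an approximate percent effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
iq = rng.normal(100, 15, n)       # a predictor with no meaningful zero
income = rng.lognormal(10, 0.5, n)
y = 2.0 + 0.05 * iq + 0.3 * np.log(income) + rng.normal(0, 1, n)

def fit(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Raw IQ: the intercept is the prediction at IQ = 0, which never occurs
b_raw = fit(np.column_stack([iq, np.log(income)]), y)

# Centred IQ: the intercept is the prediction for a person of average IQ
b_centred = fit(np.column_stack([iq - iq.mean(), np.log(income)]), y)

print(f"Intercept, raw IQ:     {b_raw[0]:.2f}")      # meaningless extrapolation
print(f"Intercept, centred IQ: {b_centred[0]:.2f}")  # prediction at the mean

# Point 5: with log(income), a 1% increase in income shifts y by roughly
# coefficient / 100 units, here about 0.3 / 100 = 0.003
print(f"log(income) coefficient: {b_centred[2]:.2f}")
```

Note that the slopes are identical in the two fits; centring changes only the intercept and what it means.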
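And here is the second sketch, for point 4, also on simulated data: the least-squares line minimizes the squared vertical distances, while the first principal component minimizes the perpendicular ones, so the two lines have visibly different slopes.

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0, 1, 500)
y = 0.5 * x + rng.normal(0, 1, 500)   # noisy linear relationship

# Least-squares slope: cov(x, y) / var(x)
slope_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# First principal component of the centred data
data = np.column_stack([x - x.mean(), y - y.mean()])
eigvals, eigvecs = np.linalg.eigh(np.cov(data.T))
pc1 = eigvecs[:, np.argmax(eigvals)]
slope_pca = pc1[1] / pc1[0]

print(f"OLS slope: {slope_ols:.2f}")   # close to the true 0.5
print(f"PCA slope: {slope_pca:.2f}")   # steeper, since it splits the
                                       # error between both axes
```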
I think that is it. And I make the point again that all this is quite interesting because it is regression from the viewpoint of modeling, not so much of theoretical mathematics. Anyone who works with regression analysis in the social sciences will understand the importance of experience; it is not enough to know the theory behind the statistical model well.
Sunday, 16 January 2011
ESP and Statistics
There has been much talk lately in the statistical community about the paper on extrasensory perception recently published by the well-recognized Journal of Personality and Social Psychology. To tell the truth, I did not read the paper, maybe because I think the articles I read about the subject, like this one and this one, tell me all I need to know.
In short, the paper claims to have found evidence of extrasensory perception, which is to say that what happens in the future can influence things now. The world of causality as we define it today would be completely shaken, because we would no longer be able to say that if X causes Y, then X happens before Y. The paper claims to have found significant effects in experiments like this one, from the linked article:
"In another experiment, Dr. Bem had subjects choose which of two curtains on a computer screen hid a photograph; the other curtain hid nothing but a blank screen.
A software program randomly posted a picture behind one curtain or the other — but only after the participant made a choice. Still, the participants beat chance, by 53 percent to 50 percent, at least when the photos being posted were erotic ones. They did not do better than chance on negative or neutral photos."
It is weird to think people would be able to influence the randomly generated image they will see in the next moment. In fact, I find this nonsensical, because random things generated by a computer are not really random, which is to say there is no way one could influence the random process, since the random sequence is already determined to start with. So to me the experiment does not seem good, but I did not read its details, so I don't know.
The point I want to make is that this is likely another example of the bad use of significance testing, and maybe the paper's biggest contribution will be as an example of how not to do things. The answer to whether or not the effects are real will come only with replication, that is, when others do the same experiment and get the same results.
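As a rough illustration of the point (the setup below is invented and much simpler than the paper's experiments), here is a toy simulation of a guessing task with a true hit rate of exactly 50%, tested over and over. With no real effect anywhere, about 5% of the experiments still come out "significant".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments, n_subjects = 1000, 100

significant = 0
for _ in range(n_experiments):
    # Number of correct guesses when the true hit rate is pure chance (50%)
    hits = rng.binomial(n_subjects, 0.5)
    # One-sided binomial test: is the hit rate above chance?
    p = stats.binomtest(hits, n_subjects, 0.5, alternative="greater").pvalue
    if p < 0.05:
        significant += 1

# Close to the nominal 5% level (a bit below, due to discreteness)
print(f"{significant / n_experiments:.1%} of null experiments were significant")
```

A single significant result in this setting tells us almost nothing; only repeated, independent replications can separate a real effect from this background noise.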
It is also interesting to see such a polemical paper published in such a high-level journal. I think when findings are this polemical, the journal should perhaps ask for more evidence and be more careful in judging the paper.
It will be interesting to watch what comes next on this subject...