Friday, 16 July 2010

I give up

I could not keep reading the book Statistics Applied to Clinical Trials. I had commented here that I hadn't liked the book since its first chapter, but from chapter 8 to chapter 13 the book becomes a joke. I could not read any more.

Even if the statistical concepts were explained correctly, the book is written in a very confusing and repetitive way. Chapters 8 and 9 talk about the same thing, with Chapter 9 repeating lots of things from Chapter 8. The same thing happens again in Chapters 10, 11 and 12.

The authors ignore important concepts in Statistics. For example, they say "The p-value tells us the chance of making a type I error of finding a difference where there is none." The p-value is in fact the probability of finding a difference as big as the one found, or bigger, given H0. That is, the p-value is P(Data | H0 true), not P(H0 true | Data). They talk a little about not relying on conventions like 5% for significance tests, but they go on and on always using the 5% rule. They say that a p-value that is too high is proof that something is wrong with the study, but in hypothesis testing we need to define our hypotheses, how to test them, the power calculation and everything else a priori. If you do not plan for rejection at high p-values before the study, it is not valid to look at the p-value after the fact and say "Well, I was expecting the p-value to be lower than 5% but it turned out to be 99%, so this is too high and I will reject H0 because the chance of a p-value higher than 98% given H0 is too small". Perhaps a high p-value could be used as part of a descriptive analysis, as a red flag saying that maybe we should check whether there is some obvious reason for getting results so in line with H0.
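To make the point about high p-values concrete, here is a minimal simulation sketch. It assumes a simple two-sided z-test with known standard deviation 1 and data generated with H0 true; the function names, sample size and number of trials are my own choices for illustration, not anything taken from the book. Under H0 the p-value is roughly uniform between 0 and 1, so a p-value above 98% is exactly as "rare" as one below 2%, and neither is evidence against H0 unless the rejection region was fixed beforehand.

```python
import math
import random


def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. P(data this extreme | H0)."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(n_trials=20000, n=30, seed=0):
    """Draw n_trials samples with H0 true (mean 0, sd 1) and return their p-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # z statistic with known sd = 1
        p_values.append(two_sided_p(z))
    return p_values


p_values = simulate_p_values()
frac_low = sum(p < 0.05 for p in p_values) / len(p_values)
frac_high = sum(p > 0.98 for p in p_values) / len(p_values)

print(f"share of p-values below 0.05 under H0: {frac_low:.3f}")
print(f"share of p-values above 0.98 under H0: {frac_high:.3f}")
```

The first share should land near 5% and the second near 2%, which is only what the uniform distribution of the p-value under H0 predicts.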

They say that Correlation and Regression do not show causality but only association (they do not use the word association, but correlation). They go on to say that hypothesis tests in clinical trials are the real thing for detecting causality. But what is the difference between the ANOVA in a clinical trial and Regression? This is simply a huge confusion between observational data and experimental data: they think Regression is used only with observational data, without realizing that Regression and ANOVA are the same thing. OK, maybe they don't think Regression is only for observational data, since they go on to use Regression in a cross-over trial. There the dependent variable is the first measure and the independent variable is the second measure, which shows that they do not have a sense of what a dependent variable is and what an independent variable is. If you want the Regression to make sense here, the independent variable has to be the treatment, and this has to be a Hierarchical Regression with within-subject effects. That is, the Regressions they use in their examples are meaningless. Not to mention the fact that they do not interpret the betas.
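As a small illustration of why I say Regression and ANOVA are the same thing, the sketch below handles the simplest case only: two parallel groups and a 0/1 treatment indicator. The numbers are made up for the example (they are not from the book or from any trial), and the function names are hypothetical. The one-way ANOVA F statistic and the F statistic of the regression of the outcome on the treatment indicator come out identical, and the regression slope is just the difference between the group means.

```python
def mean(values):
    return sum(values) / len(values)


def one_way_anova_f(groups):
    """Classical one-way ANOVA F: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with the overall F test."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    fitted = [intercept + slope * xi for xi in x]
    ss_model = sum((fi - my) ** 2 for fi in fitted)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    f_stat = (ss_model / 1) / (ss_error / (n - 2))
    return intercept, slope, f_stat


# Hypothetical outcome values, invented for illustration only.
placebo = [5.1, 4.8, 6.0, 5.5, 5.2]
treatment = [6.3, 7.1, 6.8, 7.4, 6.5]

f_anova = one_way_anova_f([placebo, treatment])

x = [0] * len(placebo) + [1] * len(treatment)  # 0 = placebo, 1 = treatment
y = placebo + treatment
intercept, slope, f_regression = simple_regression(x, y)

print(f"ANOVA F:      {f_anova:.6f}")
print(f"Regression F: {f_regression:.6f}")
print(f"slope: {slope:.3f}, difference in means: {mean(treatment) - mean(placebo):.3f}")
```

With more than two treatments the same equivalence holds using one indicator per extra group, and a cross-over design needs within-subject terms added on top, which this sketch does not attempt.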

Finally, I want to say that the book is confusing. The reader does not learn how to do a statistical test or a regression because the formulas appear out of nowhere. I had trouble understanding how they use the formulas; imagine someone who is not a statistician.

Frustrated with the dissemination of bad use of statistics and wrong concepts, I wrote a negative review of the book on the Amazon website. Hopefully others will choose better than I did.
