R makes it really easy to perform statistical and data analytic tasks. More often than not, the hardest part is not the code itself, but rather figuring out what model to fit and how to interpret the output. This is certainly the case with hypothesis testing. Running a t-test is easy (just type t.test(...)), but interpreting the output can be tricky.

The purpose of these notes is to fill in various common gaps that you may currently have in your understanding of hypothesis testing. You should think of these notes not as everything you need to know, but rather as everything you need to know for the purpose of this class.

Working example: the t-test

Suppose that someone tells you they’ve come up with a miraculous IQ-boosting drug. They even have the data to prove it!

Here’s the data they show you:

aggregate(iq ~ groups,, function(x) round(mean(x), 1))
##      groups    iq
## 1   control 111.3
## 2 treatment 115.6
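
A minimal sketch of how a per-group summary like this can be produced with `aggregate()`, assuming a data frame with an `iq` column and a `groups` column. The name `iq_data` and the simulated values are hypothetical stand-ins, not the data from this example, so the printed means will differ from those above.

```r
# Hypothetical stand-in data: NOT the data used in these notes.
set.seed(1)
iq_data <- data.frame(
  groups = factor(rep(c("control", "treatment"), times = c(23, 19))),
  iq     = round(rnorm(42, mean = 110, sd = 12))
)

# Mean IQ within each group, rounded to one decimal place.
group_means <- aggregate(iq ~ groups, data = iq_data,
                         FUN = function(x) round(mean(x), 1))
print(group_means)
```

The formula `iq ~ groups` reads as "summarize `iq` separately for each level of `groups`".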

Interesting… it looks like the average IQ in the group that took the drug is about 4.3 points higher than in the control (placebo) group. Let’s think about things statistically.

First, how many people were in each group? Sample size matters a lot.

##   control treatment 
##        23        19
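
Counts like these are typically obtained with `table()`. A minimal sketch, assuming the group labels are stored in a factor; the vector `groups` below is built by hand to match the counts shown and is not the original data.

```r
# Hand-built labels matching the group sizes reported above (an assumption
# made for illustration; the original data are not reproduced here).
groups <- factor(rep(c("control", "treatment"), times = c(23, 19)))

# Number of observations in each group.
group_sizes <- table(groups)
print(group_sizes)
```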

That’s not a very large sample. Let’s run a t-test to assess whether the observed difference in average IQ is statistically significant.

<- t.test(iq ~ groups, data =
##  Welch Two Sample t-test
## data:  iq by groups
## t = -1.2601, df = 39.598, p-value = 0.215
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.246283   2.610128
## sample estimates:
##   mean in group control mean in group treatment 
##                111.2609                115.5789

We get a t-statistic of -1.26 and a p-value of 0.215.
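
The numbers in the output are linked: the p-value and the confidence interval can both be recovered from the t-statistic, the degrees of freedom, and the two group means. The rough check below uses only the rounded values printed above, so it should agree with the output only up to rounding. The variable names are illustrative.

```r
# Values copied from the printed t-test output (already rounded).
t_stat   <- -1.2601
dof      <- 39.598
mean_ctl <- 111.2609
mean_trt <- 115.5789

# Two-sided p-value: probability, under the null hypothesis, of a
# t-statistic at least this far from zero in either direction.
p_value <- 2 * pt(-abs(t_stat), dof)

# The t-statistic is the difference in means divided by its standard error,
# so the standard error can be backed out from the output.
diff_means <- mean_ctl - mean_trt
std_error  <- diff_means / t_stat

# 95% confidence interval for the difference in means.
ci <- diff_means + c(-1, 1) * qt(0.975, dof) * std_error

print(round(p_value, 3))
print(round(ci, 2))
```

The results should land close to the reported p-value of 0.215 and the reported interval of roughly -11.25 to 2.61. Because the interval contains 0, it tells the same story as the p-value.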

How should we think about the results of the t-test?

(1) Do we reject the null hypothesis?

(2) What does the p-value actually mean?

(3) Why do we calculate a t-statistic? Why can’t we just look at the difference in means directly?

  • Regarding question (2): the p-value is the probability that we would observe a difference in average IQ between the treatment and control groups at least as large as the one we observed if the drug actually had no effect.

(4) Can we say that the probability the drug had no effect is 0.215?
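
One way to make the meaning of the p-value concrete is a permutation simulation: if the drug had no effect, the group labels would be arbitrary, so shuffling them shows how large a difference in means arises by chance alone. The sketch below uses simulated stand-in data. The data frame `iq_data`, the helper `mean_diff`, the seed, and the number of shuffles are all assumptions for illustration. It approximates the idea behind the p-value rather than reproducing the Welch t-test calculation.

```r
set.seed(42)

# Hypothetical stand-in data with the same group sizes as the example;
# these are NOT the values from the notes.
iq_data <- data.frame(
  groups = rep(c("control", "treatment"), times = c(23, 19)),
  iq     = rnorm(42, mean = 110, sd = 12)
)

# Difference in mean IQ (treatment minus control) for a given labelling.
mean_diff <- function(iq, groups) {
  mean(iq[groups == "treatment"]) - mean(iq[groups == "control"])
}

observed <- mean_diff(iq_data$iq, iq_data$groups)

# Shuffle the labels many times to mimic a world where the drug does nothing.
perm_diffs <- replicate(5000, mean_diff(iq_data$iq, sample(iq_data$groups)))

# Share of shuffled differences at least as large (in absolute value) as the
# observed one: a simulation-based two-sided p-value.
perm_p <- mean(abs(perm_diffs) >= abs(observed))

print(perm_p)
print(t.test(iq ~ groups, data = iq_data)$p.value)  # Welch p-value, for comparison
```

Note what the simulation conditions on: it assumes the drug has no effect and asks how surprising the data would be. That is different from the probability that the drug has no effect, which is what question (4) asks about.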