 # P-Value


Definition

In layman's terms, the p-value measures how compatible the observed test result is with random chance alone: it is the probability of seeing a result at least as extreme as the one observed if only chance, and no special cause, were at work. A large p-value means the result is consistent with chance, so there is no reason to doubt the null hypothesis (although it does not prove the null hypothesis true). A small p-value, on the other hand, means the result would be unlikely to occur by random chance alone, suggesting a special cause instead. The threshold at which a given p-value is small enough to reject the null hypothesis is set by choosing an appropriate significance level, alpha (typically 0.01, 0.05 or 0.10). A p-value smaller than the chosen alpha is said to be ‘statistically significant’, leading to the null hypothesis being rejected in favor of the alternative. The smaller the significance level, the harder it is to reject the null hypothesis (this is a stricter or more ‘conservative’ test).

Formally defined, the p-value of a test is the probability, assuming the null hypothesis is true, of obtaining by random chance a test statistic value as extreme as or more extreme than that calculated from the sample. It should not be confused with the probability of rejecting the null hypothesis given that the null hypothesis is true, which is the significance (alpha) level of the test. The null hypothesis is rejected in favor of the alternative if the p-value of the test is smaller than the chosen significance (alpha) level.
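The formal definition can be made concrete with a small exact calculation. The sketch below is illustrative only: the coin-flip scenario, the data (15 heads in 20 flips) and the function name `binomial_upper_tail_p_value` are assumptions introduced here, not part of the definition above. It computes a one-sided p-value as the probability, under the null hypothesis of a fair coin, of a result as extreme as or more extreme than the one observed.

```python
from math import comb


def binomial_upper_tail_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_null).

    This is the one-sided (upper-tail) p-value: the probability, assuming
    the null hypothesis is true, of a count at least as large as observed.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 15 heads in 20 flips; null hypothesis: the coin is fair.
p_value = binomial_upper_tail_p_value(15, 20)
print(f"p-value = {p_value:.4f}")  # p-value = 0.0207
```

With these assumed numbers, about 2% of experiments with a fair coin would produce 15 or more heads, so the result would be declared significant at an alpha of 0.05 but not at 0.01.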

Examples

A significance test is conducted to test whether a new treatment has a significant effect on lowering blood pressure. The test yields a p-value of 0.02. This means that, assuming the treatment has no effect and for this fixed sample size, an effect at least as large as the observed effect would be seen in only 2% of studies. If the significance level (alpha) is set at 0.05, then this p-value of 0.02 leads us to reject the null hypothesis and declare the result statistically significant.
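The decision rule in this example is simple enough to write down directly. The following minimal sketch (the helper name `is_significant` is hypothetical) applies the p-value of 0.02 to the three commonly used significance levels, showing how a smaller alpha makes the null hypothesis harder to reject.

```python
def is_significant(p_value: float, alpha: float) -> bool:
    """Reject the null hypothesis when the p-value is smaller than alpha."""
    return p_value < alpha


p_value = 0.02  # p-value from the blood pressure example

# Evaluate the same p-value against several significance levels.
decisions = {alpha: is_significant(p_value, alpha) for alpha in (0.01, 0.05, 0.10)}

for alpha, reject in decisions.items():
    outcome = "reject the null hypothesis" if reject else "retain the null hypothesis"
    print(f"alpha = {alpha:.2f}: {outcome}")
```

The same result is significant at 0.05 and 0.10 but not at the stricter 0.01 level, which is why alpha should be chosen before the test is run.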

Application

Consider a hypothesis test of whether the population mean is larger than the value assumed by the null hypothesis (a one-sided, larger-than test). The null hypothesis is assumed to be true unless the data provide sufficient evidence against it. Image 1 alongside shows the distribution of the test statistic under the null hypothesis. The significance level (alpha) is the area of the rejection region, shown in light blue; this value is chosen by the researcher before conducting the test, and it is the amount of risk (of rejecting the null hypothesis when it is in fact true) that the researcher is willing to take on. The p-value is the area with vertical stripes, which here lies within the alpha region: it is the area in the tail beyond the test statistic value, which forms its boundary. When the p-value is smaller than alpha (as in this case), the test is said to be ‘significant’ and the null hypothesis is rejected in favor of the alternative.

Conversely, if the p-value region were not smaller than, and so not contained within, the alpha region (as shown in image 2 below), then the test would be declared ‘not significant’ and the null hypothesis would be retained.
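The two cases described above can be sketched numerically. The example below assumes a one-sided z-test with a known population standard deviation, which is a simplification chosen here for illustration; the function name `one_sided_z_test` and all the numbers are hypothetical. It computes the test statistic, the p-value as the tail area beyond it, and the critical value that bounds the alpha region.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """One-sided (larger-than) z-test, assuming a known population sigma.

    Returns the test statistic, the p-value (tail area beyond the statistic),
    the critical value (start of the alpha region), and the decision.
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    p_value = 1 - std_normal.cdf(z)
    critical_value = std_normal.inv_cdf(1 - alpha)
    return z, p_value, critical_value, p_value < alpha


# Case 1 (hypothetical data): the statistic falls inside the rejection region.
z, p_value, critical_value, reject = one_sided_z_test(105, 100, sigma=15, n=36)
print(f"z = {z:.2f}, critical value = {critical_value:.3f}, "
      f"p-value = {p_value:.4f}, significant = {reject}")

# Case 2 (hypothetical data): the statistic falls short of the critical value.
z2, p_value2, _, reject2 = one_sided_z_test(103, 100, sigma=15, n=36)
print(f"z = {z2:.2f}, p-value = {p_value2:.4f}, significant = {reject2}")
```

In the first case the test statistic exceeds the critical value, so the p-value area is smaller than alpha and the null hypothesis is rejected; in the second the p-value area is larger than alpha and the null hypothesis is retained. Comparing the p-value with alpha and comparing the test statistic with the critical value always give the same decision.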