Sigmapedia | MoreSteam.com

Definition

Multiplicity arises when several tests are performed on the same dataset, for example when analyzing multiple responses or making multiple comparisons among subgroups or parameters. The result is an increased risk of false positives (an inflated Type I error rate). Thus if 10 independent tests are performed, each at the 0.05 level, the overall level of all the tests combined is 1 – (1 – 0.05)¹⁰ = 1 – (0.95)¹⁰ ≈ 0.40, i.e., the probability that at least one of the tests will be falsely significant is about 0.40, not 0.05.
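
The arithmetic above can be checked with a short Python sketch. It assumes the tests are independent, which is what the formula 1 – (1 – α)ᵏ requires; the function name `familywise_error_rate` is illustrative and not taken from any library.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Probability of at least one false positive among k tests.

    Assumes the k tests are independent and each is run at level alpha.
    """
    return 1.0 - (1.0 - alpha) ** k


# Show how the overall error rate grows with the number of tests at the 0.05 level.
for k in (1, 5, 10, 20):
    print(f"{k:>2} tests: overall level = {familywise_error_rate(0.05, k):.3f}")
```

With 10 tests the printed value is about 0.401, matching the figure quoted above, and it keeps rising as more tests are added.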

Application

If the object of the analysis is simply to predict the value of the response variable for given levels of the explanatory variables, then multiplicity is less of a problem: the aim is a precise prediction, and it matters less if the model contains unnecessary variables. But if the object is to describe or explain the behaviour of the response variable, then it is important to have a parsimonious model, i.e., one that includes only the most significant factors, and it is here that multiplicity issues must be kept in mind.

One way to guard against multiplicity is to control the overall error rate (also called the 'experiment-wise' or 'family-wise' error rate) at a specified level, say 0.2, and perform each of the k proposed tests at the 0.2/k level. This adjustment is commonly known as the Bonferroni correction.
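
As a minimal sketch of the adjustment described above, the following Python divides the chosen overall level by the number of tests and compares each p-value with the resulting per-test level. The function names and the p-values are hypothetical, made up purely for illustration.

```python
def per_test_level(overall_level: float, k: int) -> float:
    """Level at which each of k tests is run so the overall level is controlled."""
    return overall_level / k


def significant_after_adjustment(p_values, overall_level=0.2):
    """Flag each p-value as significant or not at the adjusted per-test level."""
    threshold = per_test_level(overall_level, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values from k = 4 tests (illustrative numbers only).
p_values = [0.001, 0.015, 0.03, 0.2]
print(per_test_level(0.2, len(p_values)))      # per-test level, 0.2 / 4
print(significant_after_adjustment(p_values))  # one flag per test
```

Here each of the four tests is judged at the 0.05 level rather than at 0.2, so the last p-value (0.2) is not declared significant even though it would pass an unadjusted comparison with the overall level.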