 # Box-Cox Transformation


Definition

Many data analysis tools rely on certain assumptions about the data, such as: the data follow a normal distribution, the levels of the variable have a constant variance, etc. If any of these assumptions is violated, the conclusions from the analysis may no longer be valid. Sometimes it is possible to transform (change) the data so that the assumptions are more closely approximated, allowing the use of the tools.

Examples

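The family of Box-Cox transformations is defined by the function Y^λ or, in its shifted and scaled form, (Y^λ - 1)/λ, where Y is the variable to be transformed (it must be positive) and λ (Greek letter lambda) is the ‘transformation parameter’. Thus, λ = 2 gives the square (y²) and λ = ½ gives the square root (√y). The formula (Y^λ - 1)/λ is undefined at λ = 0, so that case is defined separately as the natural logarithm, ln(y), which is the limit of (Y^λ - 1)/λ as λ approaches 0.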
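As a minimal sketch of the definition above, the scaled form can be written as a small Python function. The name `box_cox` and the sample values are illustrative assumptions, not part of any particular library.

```python
import math


def box_cox(y, lam):
    """Scaled Box-Cox transform of one positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        # The lam = 0 member of the family is the natural logarithm.
        return math.log(y)
    return (y ** lam - 1.0) / lam


values = [1.0, 4.0, 9.0]  # made-up sample data
print([box_cox(v, 2) for v in values])    # squares, shifted and scaled
print([box_cox(v, 0.5) for v in values])  # square roots, shifted and scaled
print([box_cox(v, 0) for v in values])    # natural logarithms
```

Because the scaled form subtracts 1 and divides by λ, the outputs are linear functions of y² and √y rather than the bare powers; the ordering and shape of the transformed data are the same.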

Application

Depending on the situation, Box-Cox transformations may be applied to the response variable or to one or more independent variables. They are also used to straighten a curved relationship so that a linear model fits the data better.
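The effect on a linear fit can be illustrated with a small sketch. It assumes a made-up response that grows exponentially with x and compares the linear correlation before and after applying the λ = 0 (logarithm) transformation to the response. The helper names `box_cox` and `pearson_r` are hypothetical, and the data are invented for illustration only.

```python
import math


def box_cox(y, lam):
    """Scaled Box-Cox transform of one positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a measure of how linear a relationship is."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5, 6]
y = [math.exp(0.5 * xi) for xi in x]  # invented exponential response

r_raw = pearson_r(x, y)                            # curved relationship
r_log = pearson_r(x, [box_cox(v, 0) for v in y])   # response transformed, lam = 0

print(f"correlation before transform: {r_raw:.4f}")
print(f"correlation after transform:  {r_log:.4f}")
```

In this constructed case the transformed response is an exact straight line in x, so the correlation after the transformation is essentially 1. Real data will rarely behave this cleanly, and a suitable λ usually has to be chosen from the data.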