Definition

A mathematical operation applied to a data set to change it into a form that is more easily described, or to make it conform to certain pre-specified assumptions. Transformations are commonly used to make a data set approximate a normal distribution, so that statistical techniques that assume normality can be validly applied. Data may also be transformed to induce linearity or homoskedasticity (equal variances across groups or populations).

Two of the most commonly used techniques are the Johnson Transformation and the Box-Cox Transformation. Most statistical software packages include both and make them straightforward to apply.

Examples
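
As one illustration, the Box-Cox Transformation named in the definition can be sketched in a few lines of Python. This is a minimal sketch, not a replacement for a statistical package: it uses the standard one-parameter Box-Cox formula with a fixed, user-chosen lambda (real software usually estimates lambda from the data), and the function names and the sample values are hypothetical, chosen only to show a right-skewed data set becoming more symmetric.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform with a fixed lambda.

    lam == 0 is defined as the natural log; otherwise (x**lam - 1) / lam.
    The transform is only defined for strictly positive data.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample (illustrative values, not real data).
sample = [1, 2, 2, 3, 4, 6, 9, 15, 40, 100]

# lambda = 0 is an assumption made for this sketch (the log transform).
transformed = box_cox(sample, 0)

print("skewness before:", round(skewness(sample), 2))
print("skewness after: ", round(skewness(transformed), 2))
```

With lambda fixed at 0 the transform is simply the natural logarithm; other values of lambda give power transforms of varying strength, which is why packages search for the lambda that makes the result closest to normal.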

Application

Transformations essentially change the scale of the data, usually by 'pulling in' large values from the tails. However, because the transforming functions are nonlinear, results on the transformed scale are harder to interpret and do not map back to the original units in a simple way. For this reason, you must exercise care when interpreting results based on transformed data.
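
The interpretation issue can be made concrete with a short Python sketch. It assumes a natural-log transform and a small hypothetical sample (neither comes from the text above) and shows two things: the gap between the largest values shrinks far more than the gap between the smallest ones, and a mean computed on the log scale, once back-transformed, is the geometric mean of the original data rather than its arithmetic mean.

```python
import math

# Hypothetical right-skewed sample (illustrative values, not real data).
values = [1, 2, 2, 3, 4, 6, 9, 15, 40, 100]

# Log transform: an assumed choice of transformation for this sketch.
logs = [math.log(v) for v in values]

# 'Pulling in' the tail: compare the gap at the top of the data
# before and after the transformation.
top_gap_raw = values[-1] - values[-2]   # 100 - 40 on the original scale
top_gap_log = logs[-1] - logs[-2]       # log(100) - log(40)

# Mean on the original scale versus the back-transformed mean of the logs.
arithmetic_mean = sum(values) / len(values)
back_transformed_mean = math.exp(sum(logs) / len(logs))

# The back-transformed value equals the geometric mean of the raw data.
geometric_mean = math.prod(values) ** (1 / len(values))

print("gap between two largest values (raw):", top_gap_raw)
print("gap between two largest values (log):", round(top_gap_log, 3))
print("arithmetic mean:", round(arithmetic_mean, 2))
print("back-transformed mean of logs:", round(back_transformed_mean, 2))
```

A summary reported after back-transformation therefore answers a different question than the same summary computed on the raw data, and should be labelled accordingly.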

