Definition

A mathematical operation applied to a data set to change it to a form that is more easily described, or to make it conform to certain pre-specified assumptions. Transformations are commonly used to make a data set approximate a normal distribution, so that statistical techniques that assume normality can validly be applied. Data may also be transformed to induce linearity or homoskedasticity (equal variances across groups or populations).

The most commonly used techniques are the Johnson Transformation and the Box-Cox Transformation. Most statistical software packages implement both, which makes them straightforward to apply.
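As a rough illustration of the Box-Cox family, the sketch below applies the standard one-parameter form, (x^lambda - 1) / lambda for nonzero lambda and log(x) for lambda = 0, to strictly positive data. The function names box_cox and skewness and the sample values are hypothetical, chosen only for this example. Here lambda is fixed by hand, whereas statistical packages typically estimate it from the data.

```python
import math


def box_cox(values, lam):
    """One-parameter Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # The limit of (x**lam - 1) / lam as lam -> 0 is the natural log.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Illustrative right-skewed sample (made-up numbers, not real data).
sample = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 40.0, 120.0]

raw_skew = skewness(sample)
log_skew = skewness(box_cox(sample, 0.0))   # lam = 0 is the log transform
sqrt_like = box_cox(sample, 0.5)            # lam = 0.5 behaves like a square root

print(f"skewness before: {raw_skew:.2f}, after log transform: {log_skew:.2f}")
```

For this sample the log transform leaves the data noticeably less skewed than the raw values, which is the sense in which a transformation moves data toward normality. It does not guarantee a normal result, so the transformed data should still be checked.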

Examples

Some common transforms are shown in the table.

Application

Transformations essentially change the scale of the data, usually by 'pulling in' large values from the tails. However, because the transforming functions are curvilinear, results obtained on the transformed scale do not map back to the original units in a simple way and are more complex to interpret. For this reason, you must exercise care when interpreting results based on transformed data.
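The interpretation issue can be seen in a small sketch, assuming a base-10 log transform and made-up values chosen so the arithmetic is easy to follow. Averaging on the log scale and then back-transforming gives the geometric mean, not the arithmetic mean of the original data.

```python
import math
import statistics

# Illustrative values spanning several orders of magnitude (made-up data).
data = [1.0, 10.0, 100.0, 1000.0, 10000.0]

# The log transform pulls in the large values: the spread shrinks sharply.
logged = [math.log10(v) for v in data]
raw_range = max(data) - min(data)
logged_range = max(logged) - min(logged)

# Mean computed on the original scale.
arithmetic_mean = statistics.mean(data)

# Mean computed on the log scale, then converted back to original units.
mean_of_logs = statistics.mean(logged)
back_transformed = 10 ** mean_of_logs

print(f"range before: {raw_range:g}, after: {logged_range:g}")
print(f"arithmetic mean: {arithmetic_mean:g}")
print(f"back-transformed mean of logs: {back_transformed:g}")
```

The two summaries differ by more than an order of magnitude here, so a result reported from transformed data should state which scale it refers to.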

See Also

Box-Cox Transformation

External Links

Accounting for Non-Constant Variation Across the Data from NIST: http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd452.htm
Transforming Non-Normal Data from NIST: http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc52.htm