Definition

Measures the proportion of variance in one variable that is explained by another variable. In multiple regression analysis, R^{2} gives the proportion of variance in the response (output) variable that is attributable to the set of explanatory (input) variables in the model.

It is interpreted as a measure of the strength of the linear association between the output and input variables and serves as an indicator of how well the regression model fits the data.

Examples

In a simple regression, if the correlation coefficient (r) between X and Y is 0.8, then the coefficient of determination is R^{2} = 0.64, i.e., 64% of the variation in Y is explained by the linear relationship between X and Y. In effect, the higher the coefficient of determination, the greater the strength of the linear relationship between Y and X.
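The relationship between r and R^{2} in simple regression can be sketched as follows. The data values here are hypothetical, chosen only to illustrate the calculation:

```python
import numpy as np

# Hypothetical sample data (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson correlation coefficient r between X and Y
r = np.corrcoef(x, y)[0, 1]

# In simple (one-predictor) regression, R^2 is just r squared
r_squared = r ** 2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

For this data the points lie close to a straight line, so r is near 1 and R^{2} indicates that almost all of the variation in Y is explained by X.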

In multiple regression analysis, R^{2} is calculated by taking the ratio of the regression sum of squares and the total sum of squares, or 1 minus the ratio of the residual (error) sum of squares and the total sum of squares. R^{2} takes values between 0 (none of the variance in Y is explained by the model) and 1 (all of the variance in Y is explained by the model).
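A minimal sketch of both formulas, using ordinary least squares on synthetic data (the variable names and data are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two explanatory variables, one response
X = rng.normal(size=(50, 2))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=50)

# Ordinary least squares fit with an intercept column
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
sse = np.sum((y - y_hat) ** 2)         # residual (error) sum of squares

# The two definitions agree for least squares with an intercept
r2_from_ssr = ssr / sst
r2_from_sse = 1 - sse / sst
```

Note that the equality of the two formulas relies on SST = SSR + SSE, which holds for least-squares fits that include an intercept.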

Application

One drawback of R^{2} is that as more variables are added to the model, R^{2} never decreases, even when the added variables have no real explanatory power. For this reason the Adjusted R^{2} is generally considered a better measure of the model's ability to explain the variance in Y with the fewest explanatory variables: it corrects for the number of explanatory variables in the model. Beware, however, that Adjusted R^{2} is not a true proportion and in rare cases can even take negative values.