# Model Selection using R-squared (R²) Measure

If you are looking for a widely used measure of how well a regression model fits its data, R-squared will be your cup of tea.

The correlation coefficient R tells you how strongly two things are related. However, we tend to use R² because it’s easier to interpret: R² is the proportion of variation (from 0 to 1, i.e. 0% to 100%) in one variable that is explained by its relationship with the other.
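For a simple linear regression with a single predictor, R² equals the square of the Pearson correlation coefficient R. The following is a minimal sketch using NumPy; the data values are made up purely for illustration, and R² is computed as 1 - RSS/TSS (residual over total sum of squares).

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Pearson correlation coefficient R between x and y
r = np.corrcoef(x, y)[0, 1]

# Least-squares straight line fitted to the same data
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

# R² from the residual and total sums of squares
rss = np.sum((y - y_pred) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r_squared = 1 - rss / tss

print(f"R = {r:.4f}")
print(f"R squared (from correlation) = {r ** 2:.4f}")
print(f"R squared (from the fit)     = {r_squared:.4f}")
```

The two R² values should agree up to floating-point rounding. This equality holds for a least-squares line with one predictor and an intercept.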

In a linear regression model, R-squared is an evaluation metric that measures the scatter of the data points around the fitted regression line.

An R-squared of zero means our regression line explains none of the variability of the data. An R-squared of 1 would mean our model explains the entire variability of the data.
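The two extremes can be checked with a small sketch. It assumes the usual definition R² = 1 - RSS/TSS; the helper name `r_squared` and the observations are hypothetical, chosen only to illustrate the point.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return 1 - RSS/TSS for observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    tss = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1 - rss / tss

y = np.array([3.0, 5.0, 7.0, 9.0])  # hypothetical observations

# Predicting the mean for every point explains none of the variability
r2_mean = r_squared(y, np.full_like(y, y.mean()))

# Predicting every point exactly explains all of the variability
r2_perfect = r_squared(y, y)

print(r2_mean, r2_perfect)  # 0.0 1.0
```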

Formula: Below is the formula for calculating the R-squared value.

$$R^2 = 1 - \frac{RSS}{TSS}$$

where RSS is the Residual Sum of Squares and TSS is the Total Sum of Squares.

The R-squared value is defined in terms of two other error terms.

1. Residual Sum of Squares (RSS): the sum, over all data points, of the squared difference between the actual value $y_i$ and the predicted value $\hat{y}_i$: $RSS = \sum_i (y_i - \hat{y}_i)^2$.

2. Total Sum of Squares (TSS): the sum, over all data points, of the squared difference between the actual value $y_i$ and the average value $\bar{y}$: $TSS = \sum_i (y_i - \bar{y})^2$.

The above is the simplified way of calculating the R-squared value from the residual sum of squares and the total sum of squares. The larger your R², the more of the variation in the observations your regression model accounts for, and the more closely it fits them.
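The calculation can be followed step by step in a short sketch. The actual and predicted values below are hypothetical, picked so that the arithmetic is easy to verify by hand.

```python
import numpy as np

# Hypothetical actual and predicted values, for illustration only
y_actual = np.array([3.0, 5.0, 7.0, 9.0])
y_predicted = np.array([2.5, 5.5, 6.5, 9.5])

# RSS: squared differences between actual and predicted values, summed
rss = np.sum((y_actual - y_predicted) ** 2)      # 4 * 0.5**2 = 1.0

# TSS: squared differences between actual values and their mean (6.0), summed
tss = np.sum((y_actual - y_actual.mean()) ** 2)  # 9 + 1 + 1 + 9 = 20.0

# R-squared: share of the total variation that is not left in the residuals
r2 = 1 - rss / tss

print(f"RSS = {rss:.2f}, TSS = {tss:.2f}, R squared = {r2:.2f}")
# RSS = 1.00, TSS = 20.00, R squared = 0.95
```

Here the predictions leave only 1 of the 20 units of total variation unexplained, so R² comes out at 0.95.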

You can also see this visually by plotting the fitted values against the observed values. Such a plot illustrates how R-squared reflects the scatter around the regression line: the tighter the points sit around the line, the higher the R-squared.
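The link between scatter and R-squared can also be sketched numerically. The example below assumes synthetic data generated around the same hypothetical line (y = 2x + 1) with two noise levels; the seed, noise levels, and helper name `fit_and_score` are arbitrary choices for illustration.

```python
import numpy as np

def fit_and_score(x, y):
    """Fit a least-squares line to (x, y) and return its R-squared."""
    slope, intercept = np.polyfit(x, y, 1)
    y_fitted = slope * x + intercept
    rss = np.sum((y - y_fitted) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - rss / tss

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.linspace(0, 10, 50)

# Same underlying line, different amounts of scatter around it
y_tight = 2 * x + 1 + rng.normal(0, 0.5, size=x.size)  # small scatter
y_loose = 2 * x + 1 + rng.normal(0, 5.0, size=x.size)  # large scatter

r2_tight = fit_and_score(x, y_tight)
r2_loose = fit_and_score(x, y_loose)

print(f"tight scatter: R squared = {r2_tight:.3f}")
print(f"loose scatter: R squared = {r2_loose:.3f}")
```

The tightly scattered data should give the higher R-squared. Passing the fitted and observed values to a plotting library such as matplotlib would show the same contrast graphically.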