

The idea of trying to fit a line as closely as possible to as many points as possible is known as linear regression. The most common technique is to fit the line that minimizes the squared vertical distance to each of those points. This is called OLS, or Ordinary Least Squares, regression. We can find the equation of this line and use it to make predictions.

Regression Equation

Since our regression estimates form a straight line, we can describe them using an equation in slope-intercept form:

y-hat = b0 + b1x

In our regression equation, b0 is the y-intercept and b1 is the slope.

When we have one x-variable (x1) and one y-variable (y-hat), this is called simple linear regression. This means that we are using one independent variable to predict the y-variable. We can also have multiple independent variables predicting the y-variable; this is called multiple regression. For now, we are going to focus on simple linear regression because its results are easy to interpret.

The Slope and Y-Intercept of the Regression Line

Here's how you calculate the slope and y-intercept:

b1 = r × (sy / sx), where r is the correlation between x and y, and sx and sy are their standard deviations.
b0 = ȳ − b1 × x̄, where x̄ and ȳ are the means of x and y.

SLOPE = The average increase in y associated with a 1-unit increase in x.
Y-INTERCEPT = The predicted value of y when x is equal to 0.

In order to make predictions using the equation of the regression line, first find the slope and y-intercept. Next, you can plug in values of x to get predicted values of y.

When making predictions using regression, it's important to be aware of the following: predicting y at values of x beyond the range of x in the data is called extrapolation. This is risky because we have no evidence that the association between x and y remains linear for unseen values of x, so extrapolated predictions can be badly wrong.

Unless there is a perfect correlation, our predictions are not going to be perfect. Graphically, this means that for most of the points in any scatterplot, the actual y-values and the predicted y-values are different. The distance between the actual value and the value predicted by the line is called the residual, or prediction error. The residual is calculated by taking the actual value of y minus the predicted value of y. The residuals are the vertical distances between the points and the line. If a point is above the regression line, the residual is positive; if a point is below the regression line, the residual is negative.
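The slope-and-intercept calculation can be sketched in a few lines of Python. The data values here are made up purely for illustration:

```python
# Sketch of fitting a simple linear regression (OLS) by hand.
# The dataset below is hypothetical, chosen only for illustration.
import statistics

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Slope: b1 = r * (sy / sx), which is equivalent to Sxy / Sxx,
# the sums of cross-deviations and squared deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx          # slope
b0 = y_bar - b1 * x_bar   # y-intercept

def predict(x_new):
    """Predicted y (y-hat) for a given x: plug x into the line."""
    return b0 + b1 * x_new

print(f"y-hat = {b0:.2f} + {b1:.2f}x")
print(f"predicted y at x=3: {predict(3):.2f}")
```

Note that `predict` is only trustworthy for x-values inside the observed range (here, 1 to 5); calling it with, say, x = 50 would be extrapolation.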

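A residual is the actual y minus the predicted y. A minimal sketch, assuming a hypothetical fitted line y-hat = 0.05 + 1.99x and a few made-up points:

```python
# Sketch: residual = actual y minus predicted y.
# The fitted line and points below are hypothetical examples.
b0, b1 = 0.05, 1.99               # assumed intercept and slope

points = [(1, 2.1), (2, 3.9), (4, 7.8)]  # (x, actual y) pairs

for x, y in points:
    y_hat = b0 + b1 * x           # predicted y from the line
    residual = y - y_hat          # actual minus predicted
    side = "above" if residual > 0 else "below"
    print(f"x={x}: residual={residual:+.2f} ({side} the line)")
```

A positive residual means the point sits above the regression line; a negative residual means it sits below.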