Using least squares regression output

Worked example using least squares regression output.

Want to join the conversation?

  • Ryan Jin
    What are SE Coef, T, and P?
    (22 votes)
  • Tyler
    Why is it called "least squares regression output"?
    (9 votes)
    • Constantine
      Regarding regression, the term doesn't have anything to do with what it really does...
      Extracted from this nice article from http://blog.minitab.com/blog/statistics-and-quality-data-analysis/so-why-is-it-called-regression-anyway:

      "... here’s the irony: The term regression, as Galton used it, didn't refer to the statistical procedure he used to determine the fit lines for the plotted data points. (...) For Galton, “regression” referred only to the tendency of extreme data values to "revert" to the overall mean value. (...)
      Later, as he and other statisticians built on the methodology to quantify correlation relationships and to fit lines to data values, the term “regression” became associated with the statistical analysis that we now call regression. But it was just by chance that Galton's original results using a fit line happened to show a regression of heights. If his study had shown increasing deviance of children's heights from the average compared to their parents, perhaps we'd be calling it "progression" instead.

      So, you see, there’s nothing particularly “regressive” about a regression analysis."
      (14 votes)
  • Pally
    Why does slope = fertility coef, and y-intercept = constant coef?
    (7 votes)
    • mopkaloppt
      We're predicting life expectancy (Yhat) from fertility rate (X), so we want to know how much life expectancy changes (delta Y) for every 1-unit change in fertility rate (delta X = 1). That's exactly what the slope (delta Y / delta X) tells you, so the slope is the coefficient on fertility. Multiplying a change in fertility rate by this coefficient tells you how much the predicted life expectancy changes.

      As for the y-intercept being the constant coefficient: it's the term in the equation that isn't multiplied by x, so it stays the same no matter what the fertility rate is, hence the name constant. It's the predicted value of y when x = 0. Note that it is generally not the mean of y. Every least-squares regression line does pass through the point (mean of x, mean of y), but the intercept only equals the mean of y when the mean of x is 0.
      (2 votes)
  • Horacio Colbert
    What software gives this output?
    What do the other entries in the table represent?
    (3 votes)
  • credenzabf
    Loving the course, but I don't think we've had "coefficient" defined/explained anywhere—recalling it dimly from school—just as I think we haven't had an explanation for suddenly swapping in "least-squares regression" for "linear regression". (I believe there was a mention that squares would be explained, but here I am with no idea.) I'm acclimating to the shift in variables—b now is a, and m now is b—must be an effect of the hat—but this vid is a leap for me from the preceding. I like the challenge, but ack.
    (2 votes)
  • Darwin Bagley
    Life expectancy of whom?

    The mothers, the children, or the population as a whole?
    (1 vote)
  • Vicky Lin
    Sorry a bit off the topic, but I am curious how you calculate the standard errors of the coefficients?
    (2 votes)
  • Chris Renis
    Sal said "pause this number" instead of "pause this video", and it sounds kinda silly... was that a mistake, or was it just reworded?
    (1 vote)

Video transcript

- [Instructor] Nkechi took a random sample of 10 countries to study fertility rate and life expectancy. She noticed a strong negative linear relationship between those variables in the sample data. Here is computer output from a least-squares regression analysis for using fertility rate to predict life expectancy. Use this model to predict the life expectancy of a country whose fertility rate is two babies per woman. And you can round your answer to the nearest whole number of years. So pause this number and see if you can do it, you might need to use a calculator.
All right, now let's do this together. So in general, this computer output is actually giving us a lot of data, more than we need actually, to do this prediction. But it's giving us the data we need to know the equation for a regression line. So the general form of a regression line, a linear regression line, would be: our estimate, y-hat, and that little hat means we're estimating our y value, would be equal to our y-intercept, a, plus our slope, b, times our x value.
Now in this situation, we're using fertility to predict life expectancy. So the thing that we're trying to predict, that is y, life expectancy. And fertility is the thing that we're using to predict that. So that is going to be our x, right over there.
Now what are a and b? Well, our computer output gives us that. It's these numbers right over here. Our constant coefficient right over here, 89.70, this is a. And our slope, b, is going to be negative 5.97. You could view it as the coefficient on fertility. Remember, this right over here is fertility. You could even rewrite this as: our estimated life expectancy, and I could put a little hat on it to show this is estimated life expectancy, is going to be equal to 89.70 minus 5.97 times fertility rate. I'll just call it, say, "fert." right over there.
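Written out in symbols, the line described above might look like the sketch below. The letters a and b and the two coefficient values come from the instructor's wording. The exact notation is an assumption, since the transcript does not capture what was written on screen.

```latex
% General form: a is the constant coefficient (y-intercept), b is the slope
\hat{y} = a + b x

% With the values read from the regression output (a = 89.70, b = -5.97)
\widehat{\text{life expectancy}} = 89.70 - 5.97 \cdot (\text{fertility rate})
```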
Notice, this is the coefficient on fertility, and then this is the constant coefficient. And now we can use this to estimate the life expectancy of a country whose fertility rate is two babies per woman. For fertility, you just put a two here, and then you get your estimated life expectancy. So what's that going to be? We can get out a calculator. So 5.97 times two is 11.94, and we want to subtract that from 89.7, which gives 77.76. We want to round to the nearest whole number of years, so that's approximately 78 years. And we're done.
And just to be clear about what even happened here: Nkechi did a regression with fertility on the x-axis and life expectancy, let's call it L.E., on the y-axis. She took 10 data points, one, two, three, four, five, six, seven, eight, nine, 10, and tried to fit a regression line. She saw a negative linear relationship, and then we're using this regression line to estimate: hey, if fertility is, let's say this is two right over here, what is the estimated life expectancy? And we just saw that that would be roughly 78 years.
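The calculator step above can be sketched in a few lines of Python. The two coefficients are the ones read from the regression output in the video. The function and constant names are hypothetical, chosen only for this illustration.

```python
# Coefficients read from the regression output described in the transcript.
CONSTANT_COEF = 89.70    # a: the y-intercept ("Constant" row)
FERTILITY_COEF = -5.97   # b: the slope (coefficient on fertility)


def predict_life_expectancy(fertility_rate: float) -> float:
    """Estimated life expectancy in years, from y_hat = a + b * x.

    Hypothetical helper name; it only evaluates the fitted line.
    """
    return CONSTANT_COEF + FERTILITY_COEF * fertility_rate


estimate = predict_life_expectancy(2)  # two babies per woman
print(f"{estimate:.2f}")               # 77.76
print(round(estimate))                 # 78
```

Plugging in 0 returns the constant coefficient itself, which is one way to see why that row of the output is the y-intercept.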