Chapter 12 Simple Linear Regression (Business Statistics teaching slides)
Chapter 12 Simple Linear Regression

Simple Linear Regression Model
Least Squares Method
Coefficient of Determination
Model Assumptions
Testing for Significance
Using the Estimated Regression Equation for Estimation and Prediction
Computer Solution
Residual Analysis: Validating Model Assumptions

Simple Linear Regression Model

The equation that describes how y is related to x and an error term is called the regression model.
The simple linear regression model is:

$$y = \beta_0 + \beta_1 x + \epsilon$$

where:
$\beta_0$ and $\beta_1$ are called parameters of the model,
$\epsilon$ is a random variable called the error term.

Simple Linear Regression Equation

The simple linear regression equation is:

$$E(y) = \beta_0 + \beta_1 x$$

The graph of the regression equation is a straight line.
$\beta_0$ is the y intercept of the regression line.
$\beta_1$ is the slope of the regression line.
$E(y)$ is the expected value of y for a given x value.

[Figures: three graphs of E(y) against x, each showing the regression line and its intercept $\beta_0$]
Positive Linear Relationship: slope $\beta_1$ is positive.
Negative Linear Relationship: slope $\beta_1$ is negative.
No Relationship: slope $\beta_1$ is 0.

Estimated Simple Linear Regression Equation

The estimated simple linear regression equation is:

$$\hat{y} = b_0 + b_1 x$$

The graph is called the estimated regression line.
$b_0$ is the y intercept of the line.
$b_1$ is the slope of the line.
$\hat{y}$ is the estimated value of y for a given x value.

Estimation Process

Regression model: $y = \beta_0 + \beta_1 x + \epsilon$
Regression equation: $E(y) = \beta_0 + \beta_1 x$
Unknown parameters: $\beta_0$, $\beta_1$
Estimated regression equation: $\hat{y} = b_0 + b_1 x$
Sample statistics: $b_0$, $b_1$
$b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.

Least Squares Method

Least Squares Criterion:

$$\min \sum (y_i - \hat{y}_i)^2$$

where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation

Slope for the Estimated Regression Equation:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

y-Intercept for the Estimated Regression Equation:

$$b_0 = \bar{y} - b_1 \bar{x}$$

where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value of the independent variable
$\bar{y}$ = mean value of the dependent variable
n = total number of observations

Example: Reed Auto Sales

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27

Estimated Regression Equation

Slope for the Estimated Regression Equation:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$$

y-Intercept for the Estimated Regression Equation:

$$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$$

Estimated Regression Equation:

$$\hat{y} = 10 + 5x$$

[Figure: Scatter Diagram and Trend Line, cars sold against TV ads]

Coefficient of Determination

Relationship Among SST, SSR, SSE:

$$\text{SST} = \text{SSR} + \text{SSE}$$

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

The coefficient of determination is:

$$r^2 = \text{SSR}/\text{SST}$$

For Reed Auto:

$$r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$$

The regression relationship is very strong; 88% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.

Sample Correlation Coefficient

$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$

where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".

$$r_{xy} = +\sqrt{.8772} = +.9366$$

Assumptions About the Error Term $\epsilon$

1. The error $\epsilon$ is a random variable with mean of zero.
2. The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\epsilon$ are independent.
4. The error $\epsilon$ is a normally distributed random variable.

Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used: the t test and the F test.
Both the t test and F test require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.

An Estimate of $\sigma^2$

The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$$s^2 = \text{MSE} = \text{SSE}/(n-2)$$

where:

$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$

An Estimate of $\sigma$

To estimate $\sigma$ we take the square root of $s^2$:

$$s = \sqrt{\text{MSE}} = \sqrt{\frac{\text{SSE}}{n-2}}$$

The resulting s is called the standard error of the estimate.

Testing for Significance: t Test

Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$

Test Statistic:

$$t = \frac{b_1}{s_{b_1}}, \qquad s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$

Rejection Rule: Reject $H_0$ if p-value $\le \alpha$, or if $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$,
where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.

For Reed Auto:
1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $t = b_1 / s_{b_1}$
4. State the rejection rule. Reject $H_0$ if p-value $\le .05$ or $|t| > 3.182$ (with 3 degrees of freedom).
5. Compute the value of the test statistic. $t = b_1 / s_{b_1} = 5/1.08 = 4.63$
6. Determine whether to reject $H_0$. t = 4.541 provides an area of .01 in the upper tail. Hence, the p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.

Confidence Interval for $\beta_1$

We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
$H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.
The form of a confidence interval for $\beta_1$ is:

$$b_1 \pm t_{\alpha/2}\, s_{b_1}$$

where $b_1$ is the point estimator, $t_{\alpha/2}\, s_{b_1}$ is the margin of error, and $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom.

Rejection Rule: Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
95% Confidence Interval for $\beta_1$: $b_1 \pm t_{\alpha/2}\, s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$, or 1.56 to 8.44
Conclusion: 0 is not included in the confidence interval. Reject $H_0$.

Testing for Significance: F Test

Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$

Test Statistic:

$$F = \text{MSR}/\text{MSE}$$

Rejection Rule: Reject $H_0$ if p-value $\le \alpha$ or $F \ge F_\alpha$,
where $F_\alpha$ is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator.

For Reed Auto:
1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $F = \text{MSR}/\text{MSE}$
4. State the rejection rule. Reject $H_0$ if p-value $\le .05$ or $F \ge 10.13$ (with 1 d.f. in numerator and 3 d.f. in denominator).
5. Compute the value of the test statistic. $F = \text{MSR}/\text{MSE} = 100/4.667 = 21.43$
6. Determine whether to reject $H_0$. F = 17.44 provides an area of .025 in the upper tail. Because the F test is an upper-tail test, the p-value corresponding to F = 21.43 is less than .025. Hence, we reject $H_0$.

The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.

Some Cautions about the Interpretation of Significance Tests

Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

Using the Estimated Regression Equation for Estimation and Prediction

Confidence Interval Estimate of $E(y_p)$:

$$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$$

Prediction Interval Estimate of $y_p$:

$$\hat{y}_p \pm t_{\alpha/2}\, s_{\text{ind}}$$

where the confidence coefficient is $1 - \alpha$, $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom, $s_{\hat{y}_p}$ is the estimated standard deviation of $\hat{y}_p$, and $s_{\text{ind}}$ is the estimated standard deviation of an individual value of y.

Point Estimation

If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:

$$\hat{y} = 10 + 5(3) = 25 \text{ cars}$$

Confidence Interval for $E(y_p)$

[Figure: Excel's Confidence Interval Output]
The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:

$$25 \pm 4.61 = 20.39 \text{ to } 29.61 \text{ cars}$$

Prediction Interval for $y_p$

[Figure: Excel's Prediction Interval Output]
The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:

$$25 \pm 8.28 = 16.72 \text{ to } 33.28 \text{ cars}$$

Residual Analysis

Residual for Observation i:

$$y_i - \hat{y}_i$$

Much of the residual analysis is based on an examination of graphical plots.
The residuals provide the best information about $\epsilon$.
If the assumptions about the error term $\epsilon$ appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid.

Residual Plot Against x

If the assumption that the variance of $\epsilon$ is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then the residual plot should give an overall impression of a horizontal band of points.

[Figures: three plots of the residual (vertical axis, with a reference line at 0) against x: Good Pattern; Nonconstant Variance; Model Form Not Adequate]

End of Chapter 12
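The Reed Auto figures quoted in this chapter can be checked with a short script. What follows is a minimal sketch in plain Python, not the slides' own computer solution. The function names `fit_simple_regression` and `interval_margins` are illustrative assumptions, and the critical value 3.182 is copied from the chapter's t test (3 degrees of freedom, area .025 in the upper tail) rather than computed.

```python
import math

def fit_simple_regression(x, y):
    """Least squares fit of y_hat = b0 + b1*x with the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = y_bar - b1 * x_bar             # y intercept
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sst - sse                     # uses SST = SSR + SSE
    mse = sse / (n - 2)                 # estimate of sigma^2
    s = math.sqrt(mse)                  # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)           # estimated standard deviation of b1
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
        "sst": sst, "ssr": ssr, "sse": sse,
        "r2": ssr / sst,
        "r": math.copysign(math.sqrt(ssr / sst), b1),  # sign of b1
        "s": s, "s_b1": s_b1,
        "t": b1 / s_b1,
        "F": ssr / mse,                 # MSR = SSR / 1 in simple regression
    }

def interval_margins(fit, x_p, t_crit):
    """Margins of error for the mean response and for one new observation at x_p."""
    h = 1 / fit["n"] + (x_p - fit["x_bar"]) ** 2 / fit["sxx"]
    ci_margin = t_crit * fit["s"] * math.sqrt(h)        # for E(y_p)
    pi_margin = t_crit * fit["s"] * math.sqrt(1 + h)    # for y_p
    return ci_margin, pi_margin

# Reed Auto data from the chapter
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

fit = fit_simple_regression(tv_ads, cars_sold)
T_CRIT = 3.182  # taken from the chapter, not computed here
y_p = fit["b0"] + fit["b1"] * 3
ci_margin, pi_margin = interval_margins(fit, x_p=3, t_crit=T_CRIT)
slope_margin = T_CRIT * fit["s_b1"]

print(f"y_hat = {fit['b0']:.2f} + {fit['b1']:.2f} x")
print(f"r^2 = {fit['r2']:.4f}, r = {fit['r']:+.4f}")
print(f"t = {fit['t']:.2f}, F = {fit['F']:.2f}")
print(f"slope interval: {fit['b1']:.2f} +/- {slope_margin:.2f}")
print(f"mean response at x=3: {y_p:.2f} +/- {ci_margin:.2f}")
print(f"new observation at x=3: {y_p:.2f} +/- {pi_margin:.2f}")
```

Run on the five Reed Auto observations, the script should agree with the chapter's values up to rounding. The chapter rounds the standard deviation of the slope to 1.08 before dividing, so the last digits can differ slightly.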