
    Chapter 10Correlation and Regression.ppt

Resource ID: 379524 | File size: 1.05 MB | Pages: 48 | Format: PPT

Chapter 10 Correlation and Regression

We deal with two variables, x and y.
Main goal: investigate how x and y are related, or correlated; that is, how much they depend on each other.

Example
x is the height of a mother; y is the height of her daughter.
Question: Are the heights of daughters independent of the heights of their mothers? Or is there a correlation between the heights of mothers and those of daughters? If yes, how strong is it?

Example
This table includes a random sample of heights of mothers, fathers, and their daughters. [table not reproduced]
Heights of mothers and their daughters in this sample seem to be strongly correlated, but heights of fathers and their daughters in this sample seem to be weakly correlated (if at all).

Section 10-2 Correlation between two variables (x and y)

Definition
A correlation exists between two variables when the values of one are somehow associated with the values of the other.

Key Concept
The linear correlation coefficient, r, is a numerical measure of the strength of the linear relationship between two variables, x and y, representing quantitative data. We then use that value to conclude that there is (or is not) a linear correlation between the two variables.
Note: r always lies in the closed interval [-1, 1], i.e., -1 ≤ r ≤ 1.

Exploring the Data
We can often see a relationship between two variables by constructing a scatterplot.

Scatterplots of Paired Data
[scatterplot figures not reproduced]

Requirements
1. The sample of paired (x, y) data is a random sample of quantitative data.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line pattern.
3. The outliers must be removed if they are known to be errors. (We will not do that in this course.)

Notation for the Linear Correlation Coefficient
n = number of pairs of sample data.
Σ denotes the addition of the items indicated.
Σx denotes the sum of all x-values.
Σx² indicates that each x-value should be squared and then those squares added.
(Σx)² indicates that the x-values should be added and then the total squared.
Σxy indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
r = linear correlation coefficient for sample data.
ρ = linear correlation coefficient for population data, i.e., the linear correlation between two populations.

Formula
The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample.
r = [n(Σxy) - (Σx)(Σy)] / [sqrt(n(Σx²) - (Σx)²) · sqrt(n(Σy²) - (Σy)²)]
We should use computer software or a calculator to compute r.

Linear correlation by TI-83/84
Enter the x-values into list L1 and the y-values into list L2.
Press STAT and select TESTS.
Scroll down to LinRegTTest and press ENTER.
Make sure that XList: L1 and YList: L2.
Choose β & ρ: ≠ 0.
Press Calculate.
Read r² = and r = .
Also read the P-value p = .

Interpreting r
Using Table A-6: If the absolute value of the computed value of r, denoted |r|, exceeds the value in Table A-6, conclude that there is a linear correlation. Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
Note: In most cases we use the significance level α = 0.05 (the middle column of Table A-6).

Interpreting r
Using the P-value computed by the calculator: If the P-value is ≤ α, conclude that there is a linear correlation. Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
Note: In most cases we use the significance level α = 0.05.

Caution
Know that the methods of this section apply only to a linear correlation. If you conclude that there is no linear correlation, it is possible that there is some other association that is not linear.

Rounding the Linear Correlation Coefficient r
Round r to three decimal places so that it can be compared to critical values in Table A-6. Use a calculator or computer if possible.

Properties of the Linear Correlation Coefficient r
1. -1 ≤ r ≤ 1.
2. If all values of either variable are converted to a different scale, the value of r does not change.
3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the value of r will not change.
4. r measures the strength of a linear relationship.

5. r is very sensitive to outliers; they can dramatically affect its value.

Example
The paired pizza/subway fare costs from Table 10-1 are shown here in a scatterplot. Find the value of the linear correlation coefficient r for the paired sample data. [scatterplot not reproduced]

Example - 1
Using software or a calculator, r is automatically calculated. [calculator output not reproduced]

Interpreting the Linear Correlation Coefficient r
Critical Values from Table A-6 and the Computed Value of r [figure not reproduced]

Example - 2
Using a 0.05 significance level, interpret the value of r = 0.117 found using the 62 pairs of weights of discarded paper and glass listed in Data Set 22 in Appendix B. Is there sufficient evidence to support a claim of a linear correlation between the weights of discarded paper and glass?

Using Table A-6 to interpret r: If we refer to Table A-6 with n = 62 pairs of sample data, we obtain the critical value of 0.254 (approximately) for α = 0.05. Because |0.117| does not exceed the value of 0.254 from Table A-6, we conclude that there is not sufficient evidence to support a claim of a linear correlation between the weights of discarded paper and glass.

Interpreting r: Explained Variation
The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.

Example
Using the pizza/subway fare costs, we have found that the linear correlation coefficient is r = 0.988. What proportion of the variation in the subway fare can be explained by the variation in the cost of a slice of pizza?
With r = 0.988, we get r² = 0.976.
We conclude that 0.976 (or about 98%) of the variation in the cost of subway fares can be explained by the linear relationship between the costs of pizza and subway fares. This implies that about 2% of the variation in the cost of subway fares cannot be explained by the cost of pizza.

Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality.
2. Linearity: There may be some relationship between x and y even when there is no linear correlation.

Caution
Know that correlation does not imply causality. There may be correlation without causality.

Section 10-3 Regression

Definitions
Regression Equation: Given a collection of paired data, the regression equation ŷ = b0 + b1x algebraically describes the relationship between the two variables.
Regression Line: The graph of the regression equation is called the regression line (or line of best fit, or least-squares line).

Notation for Regression Equation
                                      Population Parameter    Sample Statistic
y-intercept of regression equation    β0                      b0
Slope of regression equation          β1                      b1
Equation of the regression line       y = β0 + β1x            ŷ = b0 + b1x

Requirements
1. The sample of paired (x, y) data is a random sample of quantitative data.
2. Visual examination of the scatterplot shows that the points approximate a straight-line pattern.
3. Any outliers must be removed if they are known to be errors. Consider the effects of any outliers that are not known errors.

Rounding the y-intercept b0 and the Slope b1
Round to three significant digits. If you use the formulas from the book, do not round intermediate values.

Example
Refer to the sample data given in Table 10-1 in the Chapter Problem. Use technology to find the equation of the regression line in which the explanatory variable (or x variable) is the cost of a slice of pizza and the response variable (or y variable) is the corresponding cost of a subway fare. (CPI = Consumer Price Index, not used.)
Requirements are satisfied: simple random sample; scatterplot approximates a straight line; no outliers.
Here are results from four different technologies. [technology outputs not reproduced]

Example
Graph the regression equation (from the preceding Example) on the scatterplot of the pizza/subway fare data and examine the graph to subjectively determine how well the regression line fits the data. [graph not reproduced]

Using the Regression Equation for Predictions
1. The predicted value of y is ŷ = b0 + b1x.
2. Use the regression equation for predictions only if the graph of the regression line on the scatterplot confirms that the regression line fits the points reasonably well.
3. Use the regression equation for predictions only if the linear correlation coefficient r indicates that there is a linear correlation between the two variables.
4. Use the regression line for predictions only if the value of x does not go much beyond the scope of the available sample data. (Predicting too far beyond the scope of the available sample data is called extrapolation, and it could result in bad predictions.)
5. If the regression equation does not appear to be useful for making predictions, the best predicted value of a variable is its point estimate, which is its sample mean, ȳ.

Strategy for Predicting Values of Y
If the regression equation is not a good model, the best predicted value of y is simply ȳ, the mean of the y values. Remember, this strategy applies to linear patterns of points in a scatterplot.

Residuals
Definition
For a pair of sample x and y values, the residual is the difference between the observed sample value of y and the y-value that is predicted by using the regression equation. That is,
residual = observed y - predicted y = y - ŷ

Definition
A straight line satisfies the least-squares property if the sum of the squares of the residuals is the smallest sum possible.
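
The summation notation of Section 10-2 translates almost line for line into code. The following is a minimal sketch, assuming the standard computational formula for r built from n, Σx, Σy, Σxy, Σx² and Σy². The function name `linear_correlation` and the five data pairs are made up for illustration; they are not the Table 10-1 data.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r, computed from the sums
    n, Σx, Σy, Σxy, Σx², Σy² (assumed standard computational formula)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    sum_x = sum(xs)                              # Σx
    sum_y = sum(ys)                              # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # Σxy
    sum_x2 = sum(x * x for x in xs)              # Σx² (square first, then add)
    sum_y2 = sum(y * y for y in ys)              # Σy²
    numerator = n * sum_xy - sum_x * sum_y
    # (Σx)² is sum_x ** 2: add first, then square.
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = linear_correlation(x, y)
print(round(r, 3))      # 0.775
print(round(r * r, 3))  # 0.6, the proportion of variation in y explained
```

Swapping the two lists or rescaling one of them (for example, multiplying every x by 10) leaves r unchanged, which matches properties 2 and 3 of the linear correlation coefficient.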

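
Section 10-3 leaves the computation of b0 and b1 to technology. As a rough sketch of what such technology does, the block below assumes the usual least-squares formulas b1 = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b0 = ȳ - b1·x̄, which the slides themselves do not show. The function names and the data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1*x
    (assumed standard least-squares formulas)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n   # y-bar minus b1 times x-bar
    return b0, b1

def residuals(xs, ys, b0, b1):
    """residual = observed y - predicted y, for each sample pair."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

def sum_squared_residuals(xs, ys, b0, b1):
    return sum(e * e for e in residuals(xs, ys, b0, b1))

# Hypothetical illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = regression_line(x, y)
print(round(b0, 3), round(b1, 3))                          # 2.2 0.6
print([round(e, 3) for e in residuals(x, y, b0, b1)])      # [-0.8, 0.6, 1.0, -0.6, -0.2]

# Explained variation: 1 - (unexplained / total) equals r² for this line.
mean_y = sum(y) / len(y)
total_variation = sum((v - mean_y) ** 2 for v in y)
sse = sum_squared_residuals(x, y, b0, b1)
print(round(1 - sse / total_variation, 3))                 # 0.6
```

Nudging b0 or b1 away from the computed values and recomputing `sum_squared_residuals` gives a larger sum, which is the least-squares property in action.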

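
The Table A-6 rule for interpreting r and the strategy for predicting y fit together as one small decision procedure. The sketch below only illustrates that logic: the function names are hypothetical, the critical value must still be looked up in Table A-6 by the reader, and the checks on the scatterplot and on extrapolation are not modeled.

```python
def has_linear_correlation(r, critical_value):
    """Table A-6 rule: conclude a linear correlation only if |r|
    exceeds the critical value for the given n and significance level."""
    return abs(r) > critical_value

def best_predicted_y(x, b0, b1, mean_y, r, critical_value):
    """Use y_hat = b0 + b1*x when r supports a linear correlation;
    otherwise fall back to the sample mean of the y values."""
    if has_linear_correlation(r, critical_value):
        return b0 + b1 * x
    return mean_y

# Paper/glass example from the slides: r = 0.117, critical value 0.254.
print(has_linear_correlation(0.117, 0.254))   # False

# Hypothetical numbers (b0, b1, mean_y and the second r and critical
# value are made up) showing both branches of the strategy.
print(best_predicted_y(4, 2.2, 0.6, 4.0, 0.117, 0.254))          # 4.0
print(round(best_predicted_y(4, 2.2, 0.6, 4.0, 0.9, 0.5), 3))    # 4.6
```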
Note: This document (Chapter 10Correlation and Regression.ppt) was uploaded by site member 孙刚.



