[Computer] SAS Program 14: LOGISTIC Regression Model.doc
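In LOGISTIC regression the response $y_i$ takes only the values 0 or 1. If the probability that $y_i = 1$ is $P_i$, then the probability that $y_i = 0$ is $Q_i = 1 - P_i$. The odds for the $i$-th observation,

$$\text{odds}_i = \frac{P_i}{Q_i} = \frac{P_i}{1 - P_i},$$

is the ratio of the probability that the event occurs to the probability that it does not. Taking the logarithm of the odds is called the LOGIT transformation, and the model is linear on that scale:

$$\operatorname{logit}(P) = \ln\frac{P}{1-P} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k .$$

Interpretation of the regression coefficients: $\beta_i$ is the average change in $\operatorname{logit}(P)$ when $x_i$ changes by one unit.

Relative risk: $RR = P_1 / P_2$
Odds: $\text{Odds} = P/(1-P)$
Odds ratio: $OR = \dfrac{P_1/(1-P_1)}{P_2/(1-P_2)}$

Taking the natural logarithm of the odds ratio gives

$$\ln OR = \operatorname{logit}(P_1) - \operatorname{logit}(P_2) = \beta_i\,(c_1 - c_0),$$

where the left side is the natural logarithm of the odds ratio and the right side is $\beta_i$ times the difference between two levels $c_1$ and $c_0$ of the same factor $x_i$. With the other independent variables held fixed, $\beta_i$ is therefore the change in the log odds caused by a one-unit change in $x_i$. Equivalently, with the other independent variables held fixed, a one-unit increase in $x_i$ multiplies the odds by $e^{\beta_i}$. For each one-unit increase in $x_i$ the odds ratio is $e^{\beta_i}$, which approximates the relative risk when the event is rare.

In essence, the logit regression model estimates a probability by linking that probability to a linear function.

Example: three drugs (drug, coded 0 to 2); severity of illness (degree) in two classes, severe and mild (0 and 1); the dependent variable response is the treatment outcome, effective or ineffective (1 or 0).

    Data ex12_1;
    Input drug degree response count;
    Datalines;
    0 1 1 38
    0 1 0 64
    0 0 1 10
    0 0 0 82
    1 1 1 95
    1 1 0 18
    1 0 1 50
    1 0 0 35
    2 1 1 88
    2 1 0 26
    2 0 1 34
    2 0 0 37
    ;
    Proc logistic data=ex12_1 descending;
    Freq count;
    Class drug/param=ref descending;
    Model response=drug degree/rsq scale=n aggregate;
    Run;

Notes on the options:
RSQ displays R-square.
SCALE= specifies the method used to correct for overdispersion; SCALE=N means that no correction is applied.
AGGREGATE requests the chi-square goodness-of-fit statistics, computed over the unique covariate profiles.
The CLASS statement converts a categorical variable into dummy variables; the three drugs are represented by two dummy variables.

    The LOGISTIC Procedure

    Model Information
    Data Set                       WORK.EX12_1
    Response Variable              response
    Number of Response Levels      2
    Frequency Variable             count
    Model                          binary logit
    Optimization Technique         Fisher's scoring

    Number of Observations Read    12
    Number of Observations Used    12
    Sum of Frequencies Read        577
    Sum of Frequencies Used        577

    Response Profile
    Ordered Value    response    Total Frequency
    1                1           315
    2                0           262

    Probability modeled is response=1.

    Class Level Information
    Class    Value    Design Variables
    drug     2        1   0
             1        0   1
             0        0   0

    Model Convergence Status
    Convergence criterion (GCONV=1E-8) satisfied.

    Deviance and Pearson Goodness-of-Fit Statistics
    Criterion    Value     DF    Value/DF    Pr > ChiSq
    Deviance     0.3749    2     0.1874      0.8291
    Pearson      0.3689    2     0.1844      0.8316

    Number of unique profiles: 6

These are the goodness-of-fit tests for the model. Both p-values are large, so there is no evidence of lack of fit.

    Model Fit Statistics
    Criterion    Intercept Only    Intercept and Covariates
    AIC          797.017           641.326
    SC           801.375           658.757
    -2 Log L     795.017           633.326

    R-Square 0.2444    Max-rescaled R-Square 0.3268

    Testing Global Null Hypothesis: BETA=0
    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio    161.6907      3     <.0001
    Score               148.1598      3     <.0001
    Wald                118.1394      3     <.0001

These test the hypothesis that all model coefficients are 0; rejecting it means the model is meaningful.

    Type 3 Analysis of Effects
    Effect    DF    Wald Chi-Square    Pr > ChiSq
    drug      2     95.0859            <.0001
    degree    1     47.4607            <.0001

    Analysis of Maximum Likelihood Estimates
    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept    1     -1.9594     0.2229            77.2441            <.0001
    drug 2       1      1.8342     0.2406            58.0936            <.0001
    drug 1       1      2.2850     0.2479            84.9472            <.0001
    degree       1      1.3806     0.2004            47.4607            <.0001

Parameter estimates and their tests.

    Odds Ratio Estimates
    Effect         Point Estimate    95% Wald Confidence Limits
    drug 2 vs 0    6.260             3.906    10.033
    drug 1 vs 0    9.826             6.044    15.974
    degree         3.977             2.685     5.891

    Association of Predicted Probabilities and Observed Responses
    Percent Concordant    72.2     Somers' D    0.568
    Percent Discordant    15.4     Gamma        0.649
    Percent Tied          12.4     Tau-a        0.282
    Pairs                 82530    c            0.784

Example: in the smelting of cast iron, the ingots undergo heating (Heat) and soaking (Soak, the soaking time). n is the number of ingots and r is the number of ingots not ready for rolling.

    data ingots;
    input Heat Soak r n ;
    datalines;
    7 1.0 0 10
    14 1.0 0 31
    27 1.0 1 56
    51 1.0 3 13
    7 1.7 0 17
    14 1.7 0 43
    27 1.7 4 44
    51 1.7 0 1
    7 2.2 0 7
    14 2.2 2 33
    27 2.2 0 21
    51 2.2 0 1
    7 2.8 0 12
    14 2.8 0 31
    27 2.8 1 22
    51 4.0 0 1
    7 4.0 0 9
    14 4.0 0 19
    27 4.0 1 16
    ;
    proc logistic data=ingots;
    model r/n=Heat Soak;
    run;

    The LOGISTIC Procedure

    Model Information
    Data Set                      WORK.INGOTS
    Response Variable (Events)    r
    Response Variable (Trials)    n
    Model                         binary logit
    Optimization Technique        Fisher's scoring

n is the number of trials and r is the number of events.

    Number of Observations Read    19
    Number of Observations Used    19
    Sum of Frequencies Read        387
    Sum of Frequencies Used        387

    Response Profile
    Ordered Value    Binary Outcome    Total Frequency
    1                Event             12
    2                Nonevent          375

Response profile: the event occurred 12 times and did not occur 375 times.

    Model Convergence Status
    Convergence criterion (GCONV=1E-8) satisfied.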
    Model Fit Statistics
    Criterion    Intercept Only    Intercept and Covariates
    AIC          108.988           101.346
    SC           112.947           113.221
    -2 Log L     106.988            95.346

These criteria are used to choose the best model; the smaller the value, the better the model.

    Testing Global Null Hypothesis: BETA=0
    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio    11.6428       2     0.0030
    Score               15.1091       2     0.0005
    Wald                13.0315       2     0.0015

The model is tested in three ways: the likelihood ratio test, the score test, and the Wald test.

    Analysis of Maximum Likelihood Estimates
    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept    1     -5.5592     1.1197            24.6503            <.0001
    Heat         1      0.0820     0.0237            11.9454            0.0005
    Soak         1      0.0568     0.3312             0.0294            0.8639

Tests of the individual coefficients.

    Odds Ratio Estimates
    Effect    Point Estimate    95% Wald Confidence Limits
    Heat      1.085             1.036    1.137
    Soak      1.058             0.553    2.026

    Association of Predicted Probabilities and Observed Responses
    Percent Concordant    64.4    Somers' D    0.460
    Percent Discordant    18.4    Gamma        0.555
    Percent Tied          17.2    Tau-a        0.028
    Pairs                 4500    c            0.730

Using the parameter estimates, you can calculate the estimated logit of $p$ as

$$\operatorname{logit}(p) = \log\frac{p}{1-p} = -5.5592 + 0.082 \times \text{Heat} + 0.0568 \times \text{Soak}.$$

If Heat=7 and Soak=1, then $\operatorname{logit}(p) = -5.5592 + 0.574 + 0.0568 = -4.9284$.
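The odds ratios and the logit above follow directly from the parameter estimates. As a minimal sketch (the variable names `b0`, `b_heat`, `b_soak`, `or_heat`, `or_soak` and `logit_hat` are illustrative and not part of the original program), a DATA step can redo the arithmetic using the coefficients copied from the estimates table:

```sas
/* Sketch only: recompute the ingots odds ratios and one logit by hand. */
/* Coefficients are copied from the parameter-estimate table above.     */
data _null_;
  b0     = -5.5592;   /* Intercept */
  b_heat =  0.0820;   /* Heat      */
  b_soak =  0.0568;   /* Soak      */

  /* Odds ratio for a one-unit increase = exp(coefficient) */
  or_heat = exp(b_heat);
  or_soak = exp(b_soak);

  /* Estimated logit at Heat=7, Soak=1 */
  heat = 7;
  soak = 1;
  logit_hat = b0 + b_heat*heat + b_soak*soak;

  /* Write the results to the SAS log */
  put or_heat= or_soak= logit_hat=;
run;
```

The values written to the log should agree, after rounding, with the odds ratios 1.085 and 1.058 in the Odds Ratio Estimates table and with the logit of -4.9284.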
Using this logit estimate, you can calculate $p$ as follows:

$$p = \frac{1}{1 + e^{4.9284}} = 0.0072$$

Example 9-5 (Applied Regression Analysis, p. 256): Y indicates how a person commutes to work (Y=1 by bike, Y=0 by bus); X1 is age, X2 is monthly income, and X3 is sex (1 male, 0 female). The 28 observations are listed in the DATALINES block, in the column order X3, X1, X2, y.

    Data p256;
    Input x3 x1 x2 y;
    Datalines;
    0 18 850 0
    0 21 1200 0
    0 23 850 1
    0 23 950 1
    0 28 1200 1
    0 31 850 0
    0 36 1500 1
    0 42 1000 1
    0 46 950 1
    0 48 1200 0
    0 55 1800 1
    0 56 2100 1
    0 58 1800 1
    1 18 850 0
    1 20 1000 0
    1 25 1200 0
    1 27 1300 0
    1 28 1500 0
    1 30 950 1
    1 32 1000 0
    1 33 1800 0
    1 33 1000 0
    1 38 1200 0
    1 41 1500 0
    1 45 1800 1
    1 48 1000 0
    1 52 1500 1
    1 56 1800 1
    ;
    Proc logistic data=p256 descending;
    Model y=x1-x3;
    output out=pred p=phat lower=lcl upper=ucl
           predprobs=(individual crossvalidate);
    run;
    proc print data=pred;
    run;

Note that the listing that follows reports "Probability modeled is y=0", which is what PROC LOGISTIC produces without the DESCENDING option. Its coefficients therefore describe the probability of taking the bus (y=0); when y=1 is modeled instead, every coefficient changes sign and every odds ratio is inverted.

    The LOGISTIC Procedure

    Model Information
    Data Set                       WORK.P256
    Response Variable              y
    Number of Response Levels      2
    Model                          binary logit
    Optimization Technique         Fisher's scoring

    Number of Observations Read    28
    Number of Observations Used    28

    Response Profile
    Ordered Value    y    Total Frequency
    1                0    15
    2                1    13

    Probability modeled is y=0.

    Model Convergence Status
    Convergence criterion (GCONV=1E-8) satisfied.
    Model Fit Statistics
    Criterion    Intercept Only    Intercept and Covariates
    AIC          40.673            33.971
    SC           42.005            39.299
    -2 Log L     38.673            25.971

    Testing Global Null Hypothesis: BETA=0
    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio    12.7026       3     0.0053
    Score               10.4135       3     0.0154
    Wald                 6.5331       3     0.0884

    Analysis of Maximum Likelihood Estimates
    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept    1      3.6547     2.0911            3.0545             0.0805
    X1           1     -0.0822     0.0521            2.4853             0.1149
    X2           1     -0.00152    0.00187           0.6613             0.4161
    X3           1      2.5016     1.1578            4.6689             0.0307

    Odds Ratio Estimates
    Effect    Point Estimate    95% Wald Confidence Limits
    X1         0.921            0.832      1.020
    X2         0.998            0.995      1.002
    X3        12.203            1.262    118.014

    Association of Predicted Probabilities and Observed Responses
    Percent Concordant    87.2    Somers' D    0.744
    Percent Discordant    12.8    Gamma        0.744
    Percent Tied           0.0    Tau-a        0.384
    Pairs                 195     c            0.872

Example: grouped data on house ownership by income. The columns of the data table are the serial number (no), the sample size W (n), the number in the sample who own a house (n1), and income in thousand yuan (x). The 18 rows are listed in the DATALINES block.

    Data ex1;
    Input no n n1 x;
    Datalines;
    1 10.0 1.5 2.0
    2 20.0 3.2 3.0
    3 25.0 4.0 4.0
    4 30.0 5.0 5.0
    5 40.0 8.0 6.0
    6 50.0 12.0 8.0
    7 60.0 18.0 10.0
    8 80.0 28.0 13.0
    9 100.0 45.0 15.0
    10 70.0 36.0 20.0
    11 65.0 39.0 25.0
    12 50.0 33.0 30.0
    13 40.0 30.0 35.0
    14 25.0 20.0 40.0
    15 30.0 27.0 50.0
    16 40.0 38.0 60.0
    17 50.0 48.0 70.0
    18 60.0 58.0 80.0
    ;
    Proc logistic data=ex1;
    Model n1/n=x;
    Run;
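The original gives no output for this last program. One way to inspect the fit, sketched here under the assumption that the same OUTPUT statement used in the commuting example is acceptable for events/trials data, is to save the fitted ownership probability for each income level and print it beside the observed proportion. The names `pred_ex1`, `phat` and `prop` are illustrative and do not come from the source.

```sas
/* Sketch only: fitted vs. observed ownership proportions by income. */
data ex1;
  input no n n1 x;
  datalines;
1 10.0 1.5 2.0
2 20.0 3.2 3.0
3 25.0 4.0 4.0
4 30.0 5.0 5.0
5 40.0 8.0 6.0
6 50.0 12.0 8.0
7 60.0 18.0 10.0
8 80.0 28.0 13.0
9 100.0 45.0 15.0
10 70.0 36.0 20.0
11 65.0 39.0 25.0
12 50.0 33.0 30.0
13 40.0 30.0 35.0
14 25.0 20.0 40.0
15 30.0 27.0 50.0
16 40.0 38.0 60.0
17 50.0 48.0 70.0
18 60.0 58.0 80.0
;
run;

proc logistic data=ex1;
  model n1/n = x;
  /* Save the predicted event probability for every row */
  output out=pred_ex1 p=phat;
run;

/* Add the observed proportion n1/n for comparison */
data pred_ex1;
  set pred_ex1;
  prop = n1 / n;
run;

proc print data=pred_ex1;
  var x n n1 prop phat;
run;
```

Comparing `prop` with `phat` row by row gives a quick visual check of how well a single linear term in income describes the ownership rate.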