R Language Experiment 11 (R语言实验十一.doc)
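Experiment 11: Factor Analysis, Canonical Correlation Analysis, and Correspondence Analysis

【Experiment Type】Verification
【Class Hours】2 class hours
【Objectives】
1. Master the basic principles of factor analysis and the statistical meaning of the loading matrix;
2. Master the basic principles of canonical correlation analysis and the methods for testing the canonical correlation coefficients;
3. Master the conditions under which correspondence analysis applies and its connection to factor analysis.
【Contents】
1. Computing a factor analysis and interpreting its results;
2. Computing a canonical correlation analysis and interpreting its results;
3. Computing a correspondence analysis and interpreting its results.
【Methods and Steps】

Part 1. Textbook example:

# Example 10.5.1
x.df=data.frame(HighlyFor=c(2, 6, 41, 72, 24),
  For=c(17, 65, 220, 224, 61),
  Against=c(17, 79, 327, 503, 300),
  HighlyAgainst=c(5, 6, 48, 47, 41))
rownames(x.df)<-c("BelowPrimary", "Primary", "Secondary", "HighSchool", "College")
library(MASS)
biplot(corresp(x.df, nf=2))

Note: biplot draws a figure similar to the loading plot in factor analysis, which gives a direct visual display of the relationships among the levels of the two variables.

Interpretation of results:
1) In this plot, look mainly at the distances along the horizontal axis between the two kinds of points (opinion on employment and education level); distances along the vertical axis contribute little to the analysis.
2) The plot shows that those who agree with the statement are the women with below-primary, primary, and secondary education, whereas college-educated women mainly disagree or strongly disagree, and women with a high-school education hold one of two views: strongly disagree or strongly agree.

Part 2. End-of-chapter exercises:

10.3
x1<-c(77,75,63,51,62,52,50,31,44,62,44,12,54,44,46,30,40,36,46,42,23,41,63,55,53,59,64,55,65,60,42,31,49,49,54,18,32,46,31,56,45,40,48,46)
x2<-c(82,73,63,67,60,64,50,55,69,46,61,58,49,56,52,69,27,59,56,60,55,63,78,72,61,70,72,67,63,64,69,49,41,53,53,44,45,49,42,40,42,63,48,52)
x3<-c(67,71,65,65,58,60,64,60,53,61,52,61,56,55,65,50,54,51,57,54,59,49,80,63,72,68,60,59,58,56,61,62,61,49,46,50,49,53,48,56,55,53,49,53)
x4<-c(67,66,70,65,62,63,55,57,53,57,62,63,47,61,50,52,61,45,49,49,53,46,70,70,64,62,62,62,56,54,55,63,49,62,59,57,57,59,54,54,56,54,51,41)
x5<-c(81,81,63,68,70,54,63,73,53,45,46,67,53,36,35,45,61,51,32,33,44,34,81,68,73,56,45,44,37,40,45,62,64,47,44,81,64,37,68,35,40,25,37,40)
x<-data.frame(x1,x2,x3,x4,x5)
fa<-factanal(~., factors=2, data=x); fa   # extract 2 factors

Let X1, X2, X3, X4, X5 denote the variables mechanics, physics, algebra, analysis, and statistics. The relation between the factors f1, f2 and the original variables is given by the Loadings table in the printed output: each standardized variable equals its loading on f1 times f1, plus its loading on f2 times f2, plus a specific factor. Here the first factor has a fairly strong positive correlation mainly with algebra, analysis, and statistics, the three open-book exams, while the second factor has a fairly strong positive correlation mainly with mechanics and physics, the two closed-book exams, and also with algebra (open-book). The first factor can therefore be named the "open-book factor" and the second the "closed-book factor".

10.4
x1<-c(1.611,1.429,1.447,1.572,1.483,1.371,1.665,1.403,2.620,2.033,2.015,1.501,1.578,1.735,1.453,1.765,1.532,1.488,2.586,1.992)
x2<-c(10.59,9.44,5.97,10.72,10.99,6.46,10.51,6.11,21.51,24.15,26.86,9.74,14.52,14.64,12.88,17.94,29.42,9.23,16.07,21.63)
x3<-c(0.69,0.61,0.23,0.75,0.75,0.41,0.53,0.17,1.40,1.80,1.93,0.87,1.12,1.21,0.87,0.89,2.52,0.81,0.82,1.01)
x4<-c(1.57,1.50,1.25,1.71,1.44,1.31,1.52,1.32,2.59,1.89,2.02,1.48,1.47,1.91,1.52,1.40,1.80,1.45,1.83,1.89)
x<-data.frame(x1,x2,x3,x4)
fa<-factanal(~., factors=1, data=x); fa   # extract 1 factor

This yields the common factor of the four variables; the output shows that x2 and x3 have the greatest influence on the common factor.

10.8
x1<-c(0.14,0.20,0.06,0.07,0.12,0.52,0.23,1.19,0.37,0.36,0.42,0.35,0.50,0.56,0.43,0.47,0.49,0.47,0.40,0.66,0.63,0.52,0.44,0.03,0.20,0.04,0.17)
x2<-c(0.30,0.50,0.11,0.11,0.22,0.87,0.47,0.38,0.66,0.60,0.77,0.85,0.87,1.15,0.90,0.97,0.79,0.77,0.88,1.30,1.30,1.43,0.87,0.07,0.28,0.10,0.28)
x3<-c(0.03,0.14,0.03,0.04,0.06,0.19,0.14,0.09,0.14,0.14,0.17,0.30,0.23,0.29,0.13,0.26,0.21,0.51,0.33,0.21,0.45,0.31,0.17,0.05,0.04,0.11,0.15)
x4<-c(0.14,0.22,0.02,0.13,0.12,0.20,0.10,0.11,0.15,0.15,0.10,0.19,0.22,0.28,0.22,0.22,0.20,0.22,0.19,0.30,0.28,0.23,0.25,0.08,0.08,0.07,0.09)
x<-data.frame(x1,x2,x3,x4)
cancor(x[, 1:2], x[, 3:4])   # X = columns 1-2, Y = columns 3-4

Interpretation of results:
1) $cor gives the canonical correlation coefficients; $xcoef contains the coefficients for data X, i.e., the canonical loadings for X; $ycoef contains the canonical loadings for Y; $xcenter and $ycenter are the centers of X and Y, i.e., the sample means.
2) For this problem, the first pair of canonical variates is formed from the first columns of $xcoef and $ycoef applied to the centered data; the canonical correlation of the first pair is 0.895731396.

10.9
x<-read.csv("F:/文档/大学课程/R语言/实验/10.9习题.csv",header = TRUE,sep = ",")
cancor(x[, 1:2], x[, 3:4])   # X = columns 1-2, Y = columns 3-4

Interpretation of results:
1) $cor gives the canonical correlation coefficients; $xcoef contains the coefficients for data X, i.e., the canonical loadings for X; $ycoef contains the canonical loadings for Y; $xcenter and $ycenter are the centers of X and Y, i.e., the sample means.
2) For this problem, the first pair of canonical variates is formed from the first columns of $xcoef and $ycoef applied to the centered data; the canonical correlation of the first pair is 0.7885079.

10.10
x1<-c(104.9,143.3,2491.8,692.1,1239.1,1394.4,1953.4,2651.7,151.4,2942.1,1072.7,2500.3,817.3,1600.0,3720.6,4119.6,2138.5,2700.3,1600.1,1511.4,195.8,1023.5,2926.5,1100.3,1486.3,98.3,976.6,753.2,103.2,274.8,780.0)
x2<-c(4.3,3.9,153.8,18.1,80.6,46.3,34.3,36.3,12.8,232.5,58.2,298.8,26.1,90.5,377.3,362.6,279.4,137.4,80.9,57.2,10.3,30.0,181.0,71.3,27.7,4.4,37.5,38.4,23.0,7.3,42.6)
y1<-c(386.4,544.5,8990.8,3672.3,5707.3,3964.8,4890.1,9989.2,490.9,7777.4,3245.9,8733.1,2713.1,5534.7,11266.1,13127.7,7489.0,7931.7,5193.1,6288.1,871.7,3555.9,9571.5,4650.7,5929.6,230.9,4331.9,3688.9,529.0,1007.6,3404.1)
y2<-c(322.7,354.3,4485.4,1104.3,2472.3,1482.8,1382.6,2090.4,280.6,3900.0,1400.3,3228.7,942.4,1897.5,4836.1,4766.0,2027.9,2676.3,1447.1,1519.6,180.8,631.9,2533.0,659.8,1424.3,154.4,1314.1,982.3,208.3,405.4,3138.1)
y3<-c(15.7,17.3,273.4,84.9,79.3,109.8,114.1,123.2,20.3,338.0,90.3,280.7,117.4,109.7,428.6,441.7,245.3,184.3,195.1,168.1,27.0,72.6,212.0,70.0,120.0,3.0,131.1,66.1,7.2,24.6,83.3)
y4<-c(395.0,603.3,7244.4,1767.5,1423.6,1401.3,1096.5,1648.3,133.9,2957.9,2017.2,3165.0,888.8,1002.0,7689.6,6078.7,1469.2,2358.0,1760.0,1552.4,212.2,628.1,1735.1,647.9,1397.8,123.2,1099.8,1122.0,264.7,407.6,880.9)
#province<-c("北京","天津","河北","山西","内蒙古","辽宁","吉林","黑龙江","上海","江苏","浙江","安徽","福建","江西","山东","河南",
#"湖北","湖南","广东","广西","海南","重庆","四川","贵州","云南","西藏","陕西","甘肃","青海","宁夏","新疆")
x<-data.frame(x1,x2,y1,y2,y3,y4)
cancor(x[, 1:2], x[, 3:6])   # X = columns 1-2, Y = columns 3-6

The first pair of canonical variates is formed from the first columns of $xcoef and $ycoef applied to the centered data; the canonical correlation of the first pair is 0.9850226.

10.11
x1<-c(39.5,42.0,37.1,40.3,37.6,43.4,42.7,40.5,44.1,40.3,47.3,51.4,44.9,37.1,40.8,41.1,40.5,40.6,44.3,51.2,42.3,43.9,42.2,44.4,49.9,37.3,41.4,42.4,38.8,38.6)
x2<-c(9.7,8.5,12.8,13.7,15.1,13.9,13.4,14.7,9,8.5,11,8.1,8.7,13.6,12.3,11.8,10.7,4.7,6.6,4.6,10.8,11.3,11.0,10.9,15.8,9.9,12.8,11.2,13.6,12.9)
x3<-c(10.0,11.9,9.0,8.3,7.3,6.2,5.5,6.1,11.4,10.6,7.0,6.3,6.7,12.2,8.3,6.5,8.4,7.5,7.4,5.0,9.5,7.7,11.6,7.5,3.9,11.3,8.9,6.6,7.7,10.4)
x4<-c(6.8,5.2,7.1,6.0,5.5,7.0,6.0,8.0,4.2,6.7,3.2,3.1,3.1,4.9,6.0,4.6,4.3,4.7,3.4,4.3,4.3,4.5,3.9,5.1,3.9,6.6,6.0,7.8,8.9,5.7)
x5<-c(6.2,4.9,6.8,5.8,7.2,6.0,6.0,6.5,6.0,7.9,6.4,7.7,6.0,6.0,6.2,5.5,6.7,10.8,7.2,8.2,7.4,5.3,6.4,5.9,7.1,5.8,5.6,6.3,7.1,6.0)
x6<-c(15.2,12.6,13.4,11.9,13.3,11.2,12.6,10.8,11.7,12.2,13.2,8.8,11.3,13.3,9.7,14.2,14.5,11.6,13.6,11.9,13.4,12.8,11.2,11.4,7.0,12.4,12.2,12.3,12.0,13.0)
x7<-c(6.4,9.8,9.1,8.1,8.3,8.3,9.8,9.1,8.6,8.8,8.0,10.2,14.6,8.2,12.0,12.1,10.3,14.4,12.8,7.8,8.1,9.6,8.7,8.3,5.1,11.9,6.8,7.4,6.4,8.3)
x8<-c(6.1,5.2,4.7,6.1,5.6,4.1,4.0,4.4,5.0,5.0,3.9,4.4,4.6,4.7,4.7,4.2,4.7,5.6,4.8,6.9,4.1,5.0,4.8,6.7,7.3,4.8,6.2,6.0,5.5,5.1)
x<-data.frame(x1,x2,x3,x4,x5,x6,x7,x8)
# row names are the one-character abbreviations of the 30 regions
rownames(x)<-c("京","津","冀","晋","内","辽","吉","黑","苏","浙","皖","闽","赣","鲁","豫","鄂","湘","粤","桂","琼","渝","川","黔","滇","藏","陕","甘","青","宁","新")
library(MASS)
cra<-corresp(x,nf=2)   # correspondence analysis with 2 factors
cra
biplot(cra)            # draw the correspondence analysis plot
abline(v=0,h=0,lty=3)

Note: biplot draws a figure similar to the loading plot in factor analysis, which gives a direct visual display of the relationships among the levels of the two variables.

Interpretation of results:
1) In this plot, look mainly at the distances along the horizontal axis between the two kinds of points (regions and consumption indicators); distances along the vertical axis contribute little to the analysis.
2) The plot shows that the regions where X1 (share of spending on food) is relatively small include Ningxia (宁) and Guangdong (粤), and the regions where X2 (share of spending on clothing) and X4 (share of spending on health care) are relatively small include Guangdong (粤), Guangxi (桂), Jiangxi (赣), Fujian (闽), and Hainan (琼).

10.12
# columns 数学A, 数学B, 数学C, 数学F: math grades A, B, C, F
# rows: 纯汉字 = Chinese characters only, 半汉字 = partly Chinese characters, 纯英文 = English only
x<-data.frame(数学A=c(47,22,10), 数学B=c(31,32,11), 数学C=c(2,21,25), 数学F=c(1,10,20),
  row.names=c("纯汉字", "半汉字", "纯英文"))
library(MASS)
cra<-corresp(x,nf=2)   # correspondence analysis with 2 factors
cra
biplot(cra)            # draw the correspondence analysis plot
abline(v=0,h=0,lty=3)

The results show that children in the English-only group most often have math grades F and C, children in the partly-Chinese-character group most often have B, and children in the Chinese-characters-only group most often have A. This suggests that the abstract graphic-symbol nature of Chinese characters helps develop children's spatial and abstract thinking.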
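As a cross-check on the reading of the plot in exercise 10.12, the same table can be examined numerically before running the correspondence analysis. The sketch below uses base R only. It computes the row profiles (the grade distribution within each group) and a Pearson chi-square test of independence. The English row and column labels and the names tab, row_profiles, chi, and total_inertia are renamings introduced here for illustration; they are not part of the original report.

```r
# Contingency table from exercise 10.12.
# Assumption: English labels replace the original ones for readability
# (ChineseOnly = 纯汉字, PartlyChinese = 半汉字, EnglishOnly = 纯英文).
tab <- matrix(c(47, 31,  2,  1,
                22, 32, 21, 10,
                10, 11, 25, 20),
              nrow = 3, byrow = TRUE,
              dimnames = list(group = c("ChineseOnly", "PartlyChinese", "EnglishOnly"),
                              grade = c("A", "B", "C", "F")))

# Row profiles: distribution of grades within each group (each row sums to 1).
row_profiles <- prop.table(tab, margin = 1)
print(round(row_profiles, 3))

# Pearson chi-square test of independence between group and grade.
# Degrees of freedom: (3 - 1) * (4 - 1) = 6.
chi <- chisq.test(tab)
print(chi)

# Total inertia of a correspondence analysis is the chi-square statistic
# divided by the table total.
total_inertia <- unname(chi$statistic) / sum(tab)
print(total_inertia)
```

If the row profiles differ clearly between groups and the test rejects independence, the groupings seen in the biplot reflect a real association rather than noise. A 3 x 4 table has at most two nontrivial dimensions, so with nf=2 the squared canonical correlations reported by corresp would be expected to add up to this total inertia.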