
Data Analysis Lab Report (Principal Component Analysis)

Experiment 8: Principal Component Analysis

I. Objectives and requirements

Be able to carry out a principal component analysis from the raw data, from the correlation matrix, and from the covariance matrix; understand how the principal components of the standardized variables relate to, and differ from, those of the raw data; and be able to select from the SAS output the principal components that meet a given requirement.

Requirements: write the program and analyze the results.

Content: textbook exercises 4.5 and 4.6.

Exercise 4.5

    /* Row 29: x1 and x2 (8.28, 64.34) are restored from the column means
       printed in the Simple Statistics output. */
    data examp4_5;
      input id x1-x8;
      cards;
    1   8.35 23.53  7.51  8.62 17.42 10.00 1.04 11.21
    2   9.25 23.75  6.61  9.19 17.77 10.48 1.72 10.51
    3   8.19 30.50  4.72  9.78 16.28  7.60 2.52 10.32
    4   7.73 29.20  5.42  9.43 19.29  8.49 2.52 10.00
    5   9.42 27.93  8.20  8.14 16.17  9.42 1.55  9.76
    6   9.16 27.98  9.01  9.32 15.99  9.10 1.82 11.35
    7  10.06 28.64 10.52 10.05 16.18  8.39 1.96 10.81
    8   9.09 28.12  7.40  9.62 17.26 11.12 2.49 12.65
    9   9.41 28.20  5.77 10.80 16.36 11.56 1.53 12.17
    10  8.70 28.12  7.21 10.53 19.45 13.30 1.66 11.96
    11  6.93 29.85  4.54  9.49 16.62 10.65 1.88 13.61
    12  8.67 36.05  7.31  7.75 16.67 11.68 2.38 12.88
    13  9.98 37.69  7.01  8.94 16.15 11.08 0.83 11.67
    14  6.77 38.69  6.01  8.82 14.79 11.44 1.74 13.23
    15  8.14 37.75  9.61  8.49 13.15  9.76 1.28 11.28
    16  7.67 35.71  8.04  8.31 15.13  7.76 1.41 13.25
    17  7.90 39.77  8.49 12.94 19.27 11.05 2.04 13.29
    18  7.18 40.91  7.32  8.94 17.60 12.75 1.14 14.80
    19  8.82 33.70  7.59 10.98 18.82 14.73 1.78 10.10
    20  6.25 35.02  4.72  6.28 10.03  7.15 1.93 10.39
    21 10.60 52.41  7.70  9.98 12.53 11.70 2.31 14.69
    22  7.27 52.65  3.84  9.16 13.03 15.26 1.98 14.57
    23 13.45 55.85  5.50  7.45  9.55  9.52 2.21 16.30
    24 10.85 44.68  7.32 14.51 17.13 12.08 1.26 11.57
    25  7.21 45.79  7.66 10.36 16.56 12.86 2.25 11.69
    26  7.68 50.37 11.35 13.30 19.25 14.59 2.75 14.87
    27  7.78 48.44  8.00 20.51 22.12 15.73 1.15 16.61
    28  7.94 39.65 20.97 20.82 22.52 12.41 1.75  7.90
    29  8.28 64.34  8.00 22.22 20.06 15.12 0.72 22.89
    30 12.47 76.39  5.52 11.24 14.52 22.00 5.46 25.50
    ;
    run;

    /* Sample correlation and covariance matrices */
    proc corr cov nosimple data=examp4_5;
      var x1-x8;
    run;

    /* PCA from the correlation matrix; scores y1-y8 are written to bb */
    proc princomp data=examp4_5 prefix=y out=bb;
      var x1-x8;
    run;

    /* Scatter plot of the first two scores, points labelled by id */
    proc plot data=bb;
      plot y2*y1 $ id='*';
    run;

    /* Rank the observations by the first principal component */
    proc sort data=bb;
      by descending y1;
    run;

    proc print data=bb;
      var id y1 y2 x1-x8;
    run;

Output:

1. Sample correlation matrix

Correlation Matrix

          x1      x2      x3      x4      x5      x6      x7      x8
    x1  1.0000  0.3336  -.0545  -.0613  -.2894  0.1988  0.3487  0.3187
    x2  0.3336  1.0000  -.0229  0.3989  -.1563  0.7111  0.4136  0.8350
    x3  -.0545  -.0229  1.0000  0.5333  0.4968  0.0328  -.1391  -.2584
    x4  -.0613  0.3989  0.5333  1.0000  0.6984  0.4679  -.1713  0.3128
    x5  -.2894  -.1563  0.4968  0.6984  1.0000  0.2801  -.2083  -.0812
    x6  0.1988  0.7111  0.0328  0.4679  0.2801  1.0000  0.4168  0.7016
    x7  0.3487  0.4136  -.1391  -.1713  -.2083  0.4168  1.0000  0.3989
    x8  0.3187  0.8350  -.2584  0.3128  -.0812  0.7016  0.3989  1.0000

2. Principal component analysis with the princomp procedure, starting from the correlation matrix; output data set bb

The PRINCOMP Procedure

    Observations  30
    Variables      8

Simple Statistics

                  x1           x2           x3           x4
    Mean  8.706666667  39.05600000  7.629000000  10.86566667
    StD   1.614728190  12.43875828  3.052716540   3.89495579

                  x5           x6           x7           x8
    Mean  16.58900000  11.62600000  1.902000000  13.06100000
    StD    2.99785481   3.05810805  0.851576226   3.64707096

1) Eigenvalues of the sample correlation matrix R, with the contribution rate and cumulative contribution rate of each principal component

Eigenvalues of the Correlation Matrix

    i   Eigenvalue   Difference   Proportion   Cumulative
    1   3.09628829   0.72906522     0.3870       0.3870
    2   2.36722307   1.44723572     0.2959       0.6829   (reaches 68.29%)
    3   0.91998735   0.21406199     0.1150       0.7979
    4   0.70592536   0.20748303     0.0882       0.8862
    5   0.49844233   0.26855403     0.0623       0.9485
    6   0.22988831   0.09911254     0.0287       0.9772
    7   0.13077577   0.07930623     0.0163       0.9936
    8   0.05146954                  0.0064       1.0000

The SAS System    14:09 Monday, October 22, 2001

2) Orthonormal eigenvectors of the sample correlation matrix R

The SAS System    17:30 Tuesday, October 26, 2012

The PRINCOMP Procedure

Eigenvectors

            y1        y2        y3        y4        y5        y6        y7        y8
    x1  0.249607  -.241238  0.693918  -.376770  0.502313  -.018418  -.036543  0.045052
    x2  0.519234  -.037607  -.071261  -.224871  -.424453  0.001760  -.282467  0.642950
    x3  -.018480  0.475439  0.577819  0.032379  -.510472  -.173344  0.381416  -.050854
    x4  0.254092  0.538081  -.021777  -.231066  0.010358  0.399113  -.471680  -.458432
    x5  0.021695  0.575449  -.048087  0.285368  0.516270  0.146109  0.159192  0.520977
    x6  0.492663  0.134676  -.145348  0.224222  0.177156  -.754966  -.081452  -.244442
    x7  0.317147  -.260682  0.286391  0.768116  -.090759  0.355165  -.130720  -.089297
    x8  0.509332  -.087081  -.271279  -.176990  0.026015  0.304720  0.708416  -.180821

3) Provinces sorted by the first principal component

    Obs  id     y1        y2       x1     x2     x3     x4     x5     x6    x7     x8
      1  30   6.89591  -2.27833  12.47  76.39   5.52  11.24  14.52  22.00  5.46  25.50
      2  29     n/a       n/a     8.28  64.34   8.00  22.22  20.06  15.12  0.72  22.89
      3  27   1.79214   2.88809   7.78  48.44   8.00  20.51  22.12  15.73  1.15  16.61
      4  26   1.51507   1.37353   7.68  50.37  11.35  13.30  19.25  14.59  2.75  14.87
      5  23   1.40116  -3.17840  13.45  55.85   5.50   7.45   9.55   9.52  2.21  16.30
      6  21   1.15390  -1.37420  10.60  52.41   7.70   9.98  12.53  11.70  2.31  14.69
      7  22   1.05651  -1.23524   7.27  52.65   3.84   9.16  13.03  15.26  1.98  14.57
      8  24   0.43543   0.47409  10.85  44.68   7.32  14.51  17.13  12.08  1.26  11.57
      9  25   0.15329   0.11320   7.21  45.79   7.66  10.36  16.56  12.86  2.25  11.69
     10  17   0.04520   0.98056   7.90  39.77   8.49  12.94  19.27  11.05  2.04  13.29
     11  28  -0.13324   4.90844   7.94  39.65  20.97  20.82  22.52  12.41  1.75   7.90
     12  18  -0.13489   0.34363   7.18  40.91   7.32   8.94  17.60  12.75  1.14  14.80
     13  19     n/a       n/a     8.82  33.70   7.59  10.98  18.82  14.73  1.78  10.10
     14  12  -0.17044  -0.58962   8.67  36.05   7.31   7.75  16.67  11.68  2.38  12.88
     15   8  -0.39220  -0.29562   9.09  28.12   7.40   9.62  17.26  11.12  2.49  12.65
     16  10  -0.43040   0.64570   8.70  28.12   7.21  10.53  19.45  13.30  1.66  11.96
     17  14  -0.51802  -0.55227   6.77  38.69   6.01   8.82  14.79  11.44  1.74  13.23
     18   9  -0.61274  -0.28257   9.41  28.20   5.77  10.80  16.36  11.56  1.53  12.17
     19  13  -0.66670  -0.29548   9.98  37.69   7.01   8.94  16.15  11.08  0.83  11.67
     20  11  -0.81850  -0.42128   6.93  29.85   4.54   9.49  16.62  10.65  1.88  13.61
     21   7  -1.11335  -0.01815  10.06  28.64  10.52  10.05  16.18   8.39  1.96  10.81
     22  15  -1.11496  -0.44043   8.14  37.75   9.61   8.49  13.15   9.76  1.28  11.28
     23   6  -1.18223  -0.19296   9.16  27.98   9.01   9.32  15.99   9.10  1.82  11.35
     24   2     n/a       n/a     9.25  23.75   6.61   9.19  17.77  10.48  1.72  10.51
     25  16  -1.25934  -0.42827   7.67  35.71   8.04   8.31  15.13   7.76  1.41  13.25
     26   3  -1.29370  -0.86033   8.19  30.50   4.72   9.78  16.28   7.60  2.52  10.32
     27   4  -1.32567  -0.10239   7.73  29.20   5.42   9.43  19.29   8.49  2.52  10.00
     28   5  -1.48595  -0.35156   9.42  27.93   8.20   8.14  16.17   9.42  1.55   9.76
     29   1  -1.68448   0.16743   8.35  23.53   7.51   8.62  17.42  10.00  1.04  11.21
     30  20  -1.96091  -2.10827   6.25  35.02   4.72   6.28  10.03   7.15  1.93  10.39

    n/a: the score is missing from the source listing (ids 29, 19 and 2).

The output shows that the first two principal components already have a cumulative contribution rate of 68.29%, since the eight eigenvalues of a correlation matrix sum to 8 and (3.0963 + 2.3672)/8 = 0.6829. The first two principal components are therefore enough for further analysis. The output also gives the orthonormal eigenvectors $\hat{e}_1^*$ and $\hat{e}_2^*$ corresponding to $\hat{\lambda}_1^*$ and $\hat{\lambda}_2^*$, so the first two sample principal components of the standardized indicators are

$$y_1^* = \hat{e}_1^{*T} x^* = 0.2496x_1^* + 0.5192x_2^* - 0.0185x_3^* + 0.2541x_4^* + 0.0217x_5^* + 0.4927x_6^* + 0.3171x_7^* + 0.5093x_8^*$$

$y_1^*$ is essentially a weighted average of the eight indicators and serves as a composite index of each province's overall level of basic living consumption. A large $y_1^*$ indicates a high consumption level: id 30, which has the largest values on most indicators, ranks first.

$$y_2^* = \hat{e}_2^{*T} x^* = -0.2412x_1^* - 0.0376x_2^* + 0.4754x_3^* + 0.5381x_4^* + 0.5754x_5^* + 0.1347x_6^* - 0.2607x_7^* - 0.0871x_8^*$$

$y_2^*$ is a composite index of each province's capacity to consume consumer goods. It loads mainly on $x_3$, $x_4$ and $x_5$, so a large $y_2^*$ indicates high consumption of those items.
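The objectives also mention principal components computed from the covariance matrix, which the program above does not run. As a minimal sketch, assuming the data set examp4_5 from the DATA step above has already been created, the COV option of PROC PRINCOMP switches the analysis from the correlation matrix to the covariance matrix. The prefix z and the output data set cc are illustrative names chosen here, not part of the original report.

```sas
/* Sketch: the same analysis, but starting from the covariance matrix.  */
/* Assumes examp4_5 already exists (created by the DATA step above).    */
/* The prefix z and the output data set cc are illustrative names.      */
proc princomp data=examp4_5 cov prefix=z out=cc;
  var x1-x8;
run;

/* Rank the observations by the first covariance-based component. */
proc sort data=cc;
  by descending z1;
run;

proc print data=cc;
  var id z1 z2;
run;
```

Comparing this ranking with the one based on y1 is one way to see the difference between raw-data and standardized components. Since x2 has by far the largest standard deviation in the Simple Statistics output (12.44), it can be expected to weigh heavily on the first covariance-based component.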
