
Time and Performance Testing of a Decision Tree Classification Algorithm

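Name: ls
Student ID:

Contents
I. Project Requirements
II. Basic Idea
III. Sample Processing
IV. Experiments and Analysis
    1. Total time
    2. Classification accuracy
V. Conclusions and Limitations
Appendix

I. Project Requirements

(1) Design and implement a decision tree classification algorithm (the many decision tree algorithms and code versions available online may be consulted, but the basic idea of the algorithm should be the one given above).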

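(2) Test the implemented decision tree classification algorithm on the UCI benchmark data sets.

Evaluation metrics include total running time, classification accuracy, and so on.

(3) Use the UCI Iris Data Set for testing.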

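II. Basic Idea

A decision tree is a flowchart-like tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class or a class distribution. The topmost node of the tree is the root.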

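To predict the class of an unknown sample, trace down the tree from the root: at each node, compare the sample's value for that node's attribute with the node's threshold, then follow the matching branch. The leaf that is eventually reached gives the class.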
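The traversal just described can be sketched in a few lines. This is only an illustration: the dictionary layout (`attr`, `threshold`, `below`, `at_or_above`), the function name `classify`, and the threshold value 2.5 are assumptions made for the example, not the structure or the tree used in this report.

```python
# Hypothetical node layout: an internal node is a dict holding the tested
# attribute, a threshold, and two subtrees; a leaf is a plain class label.
def classify(node, sample):
    """Walk from the root to a leaf, comparing the sample with each threshold."""
    while isinstance(node, dict):
        if sample[node["attr"]] < node["threshold"]:
            node = node["below"]
        else:
            node = node["at_or_above"]
    return node  # a leaf: the predicted class label


# A one-level toy tree; the threshold 2.5 is an illustrative assumption.
toy_tree = {
    "attr": "petal-length",
    "threshold": 2.5,
    "below": "Iris-setosa",
    "at_or_above": "Iris-versicolor",
}

print(classify(toy_tree, {"petal-length": 1.4}))
print(classify(toy_tree, {"petal-length": 4.7}))
```

Deeper trees work the same way, since a subtree may itself be another dictionary of the same shape.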

A decision tree converts easily into classification rules, which makes it a foundation for business rule induction systems.

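Decision tree learning is a very widely used classification method. It approximates discrete-valued target functions, and the learned function is represented as a decision tree.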

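The basic approach is to repeatedly select the attribute that yields the largest information gain, partition the sample set on that attribute, and so construct the tree.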
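A minimal sketch of this greedy construction is shown below, assuming discrete attribute values and the usual base-2 entropy. The function names (`entropy`, `information_gain`, `build_tree`), the nested-dictionary tree format, and the four-row toy data are illustrative assumptions, not the implementation listed in the appendix of this report.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy of the node minus the weighted entropy of its children."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def build_tree(rows, labels, attrs):
    """Recursively split on the attribute with the largest information gain."""
    if len(set(labels)) == 1:          # pure node: make it a leaf
        return labels[0]
    if not attrs:                      # nothing left to split on: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx],
                                     remaining)
    return {"attr": best, "branches": branches}


# Toy data (illustrative only): petal-length separates the classes perfectly,
# sepal-width carries no information.
rows = [
    {"sepal-width": "d", "petal-length": "e"},
    {"sepal-width": "c", "petal-length": "e"},
    {"sepal-width": "c", "petal-length": "f"},
    {"sepal-width": "d", "petal-length": "f"},
]
labels = ["Iris-setosa", "Iris-setosa", "Iris-versicolor", "Iris-versicolor"]
tree = build_tree(rows, labels, ["sepal-width", "petal-length"])
print(tree)
```

On this toy data the gain of `sepal-width` is 0 and the gain of `petal-length` is 1 bit, so the sketch splits once on `petal-length` and stops.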

Information gain is defined as the difference between the entropy of a node and the entropy of its child nodes.

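Information entropy, introduced by Shannon, describes the impurity (uncertainty) of information. It is computed as

$$Entropy(S) = -\sum_{i} p_i \log_2 p_i$$

where $p_i$ is the proportion of samples in the set $S$ that belong to class $i$ (in binary classification, the positive and the negative samples).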
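As a quick worked check, assuming the usual base-2 Shannon form of the entropy, consider the two-class case. A set with equal numbers of positive and negative samples has $p_1 = p_2 = \tfrac{1}{2}$ and is maximally impure, while a set containing a single class is perfectly pure:

$$
Entropy(S_{\text{balanced}}) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit},
\qquad
Entropy(S_{\text{pure}}) = -1 \cdot \log_2 1 = 0 .
$$

Every other two-class mixture falls between these two values.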

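Information gain can therefore be defined as the expected reduction in entropy when the samples are partitioned by an attribute. It measures how well that attribute separates the positive from the negative training samples, and is computed as

$$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)$$

where $S_v$ is the subset of $S$ in which attribute $A$ takes the value $v$.

III. Sample Processing

The test samples come from the Iris Plants Database provided by UCI. Each Iris plant is described by four attributes: sepal-length, sepal-width, petal-length, and petal-width. Based on these attributes the plants fall into three classes:

-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica

To simplify the implementation, only the Iris Setosa and Iris Versicolour samples are used for testing.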

The sample set used to run the algorithm is as follows:

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor

Based on the summary statistics in the data set description, each of the four attributes is further divided into two ranges, each denoted by a symbol:

attribute       range       symbol    range       symbol
sepal-length    4.3-5.84    a         5.84-7.9    b
sepal-width     2.0-3.05    c         3.05-4.4    d
petal-length    1.0-3.76    e         3.76-6.9    f
petal-width     0.1-1.20    g         1.20-2.5    h

The processed test sample set is then:

test  sepal-length  sepal-width  petal-length  petal-width  class
1     a  d  e  g  Iris-setosa
2     a  c  e  g  Iris-setosa
3     a  d  e  g  Iris-setosa
4     a  d  e  g  Iris-setosa
6     a  d  e  g  Iris-setosa
7     a  d  e  g  Iris-setosa
8     a  d  e  g  Iris-setosa
9     a  c  e  g  Iris-setosa
10    a  d  e  g  Iris-setosa
11    a  d  e  g  Iris-setosa
12    a  d  e  g  Iris-setosa
13    a  c  e  g  Iris-setosa
14    a  c  e  g  Iris-setosa
15    a  d  e  g  Iris-setosa
16    a  d  e  g  Iris-setosa
17    a  d  e  g  Iris-setosa
18    a  d  e  g  Iris-setosa
19    a  d  e  g  Iris-setosa
20    a  d  e  g  Iris-setosa
21    a  d  e  g  Iris-setosa
22    a  d  e  g  Iris-setosa
23    a  d  e  g  Iris-setosa
24    a  d  e  g  Iris-setosa
25    a  d  e  g  Iris-setosa
26    a  c  e  g  Iris-setosa
27    a  d  e  g  Iris-setosa
28    a  d  e  g  Iris-setosa
29    a  d  e  g  Iris-setosa
30    a  d  e  g  Iris-setosa
31    a  d  e  g  Iris-setosa
32    a  d  e  g  Iris-setosa
33    a  d  e  g  Iris-setosa
34    a  d  e  g  Iris-setosa
35    a  d  e  g  Iris-setosa
36    a  d  e  g  Iris-setosa
37    a  d  e  g  Iris-setosa
38    a  d  e  g  Iris-setosa
39    a  c  e  g  Iris-setosa
40    a  d  e  g  Iris-setosa
41    a  d  e  g  Iris-setosa
42    a  c  e  g  Iris-setosa
43    a  d  e  g  Iris-setosa
44    a  d  e  g  Iris-setosa
45    a  d  e  g  Iris-setosa
46    a  c  e  g  Iris-setosa
47    a  d  e  g  Iris-setosa
48    a  d  e  g  Iris-setosa
50    a  d  e  g  Iris-setosa
51    b  d  f  h  Iris-versicolor
52    b  d  f  h  Iris-versicolor
53    b  d  f  h  Iris-versicolor
54    a  c  f  h  Iris-versicolor
55    b  c  f  h  Iris-versicolor
56    a  c  f  h  Iris-versicolor
57    b  d  f  h  Iris-versicolor
58    a  c  e  g  Iris-versicolor
59    b  c  f  h  Iris-versicolor
60    a  c  f  h  Iris-versicolor
61    a  c  e  g  Iris-versicolor
62    b  c  f  h  Iris-versicolor
63    b  c  f  g  Iris-versicolor
64    b  c  f  h  Iris-versicolor
65    a  c  e  h  Iris-versicolor
66    b  d  f  h  Iris-versicolor
67    a  c  f  h  Iris-versicolor
68    a  c  f  g  Iris-versicolor
69    b  c  f  h  Iris-versicolor
70    a  c  f  g  Iris-versicolor
71    b  d  f  h  Iris-versicolor
72    b  c  f  h  Iris-versicolor
73    b  c  f  h  Iris-versicolor
74    b  c  f  g  Iris-versicolor
75    b  c  f  h  Iris-versicolor
76    b  c  f  h  Iris-versicolor
77    b  c  f  h  Iris-versicolor
78    b  c  f  h  Iris-versicolor
79    b  c  f  h  Iris-versicolor
80    a  c  e  g  Iris-versicolor
81    a  c  f  g  Iris-versicolor
82    a  c  e  g  Iris-versicolor
83    a  c  f  g  Iris-versicolor
84    b  c  f  h  Iris-versicolor
85    a  c  f  h  Iris-versicolor
86    b  d  f  h  Iris-versicolor
87    b  d  f  h  Iris-versicolor
88    b  c  f  h  Iris-versicolor
89    a  c  f  h  Iris-versicolor
90    a  c  f  h  Iris-versicolor
91    a  c  f  g  Iris-versicolor
92    b  c  f  h  Iris-versicolor
93    a  c  f  g  Iris-versicolor
94    a  c  e  g  Iris-versicolor
95    a  c  f  h  Iris-versicolor
96    a  c  f  g  Iris-versicolor
97    a  c  f  h  Iris-versicolor
98    b  c  f  h  Iris-versicolor
99    a  c  e  g  Iris-versicolor
100   a  c  f  h  Iris-versicolor
End

IV. Experiments and Analysis

1. Total time

(1) Draw samples of different sizes for testing and compare the decision tree construction times.

10 samples were drawn at random for testing; the result is shown in Figure 1, and the total time was 0.05 s.

Figure 1. Decision tree built from 10 samples

40 samples were drawn at random for testing; the result is shown in Figure 2, and the total time was 0.167 s.

Figure 2. Decision tree built from 40 samples

70 samples were drawn at random for testing; the result is shown in Figure 3, and the total time was 0.369 s.

Figure 3. Decision tree built from 70 samples

100 samples were selected for testing; the result is shown in Figure 4, and the total time was 0.646 s.

Figure 4. Decision tree built from 100 samples

This gives the following table of sample count against running time:

Number of samples    10      40      70      100
Running time (s)     0.05    0.167   0.369   0.646

Table 1. Sample count vs. running time

Plotting these values gives the line chart of running time against sample count:

Figure 5. Line chart of sample count vs. running time

As Figure 5 shows, the running time of the decision tree classification algorithm in this report increases with the number of samples, in roughly direct proportion.
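One simple way to examine the trend in Table 1 is to divide each total time by its sample count: under exact proportionality the per-sample cost would be constant. The sketch below does this using only the four measurements reported above; the variable names are illustrative.

```python
# Measurements copied from Table 1 of this report.
sample_counts = [10, 40, 70, 100]
total_seconds = [0.05, 0.167, 0.369, 0.646]

# Per-sample cost in milliseconds, rounded to one decimal place.
ms_per_sample = [round(1000 * t / n, 1)
                 for n, t in zip(sample_counts, total_seconds)]

for n, ms in zip(sample_counts, ms_per_sample):
    print(f"{n:>3} samples: {ms} ms per sample")
```

The per-sample cost comes out at 5.0, 4.2, 5.3 and 6.5 ms, so it stays in the same range but rises somewhat at 70 and 100 samples. With only four single measurements this is suggestive rather than conclusive: the growth looks close to linear over this range, possibly slightly faster.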
