Tianjin University Syllabus for Pattern Recognition 2
Course Code: 2160265    Course Title: Pattern Recognition 2
Semester Hours: 20    Credits: 1
Hour Allocation:  Lecture: 12   Computer Lab: 8   Experiment:   Practice:   Practice (Week):
Offered by: School of Computer Science and Technology
Intended for: Computer Science and Technology
Prerequisites: Calculus, Linear Algebra, Probability and Statistics

1. Course Nature and Objectives
This course teaches the basic theory and methods of pattern recognition.
It presents the theory and practical use of fundamental pattern recognition algorithms, including problem definition, the Bayesian classifier, error rate estimation, probability density estimation, Parzen window methods, linear discriminant classifiers, multi-class classification, the nearest-neighbor method, support vector machines, artificial neural networks, classification trees, K-means clustering, and hierarchical clustering.
2. Basic Teaching Requirements
Students are expected to understand the basic theory of pattern recognition, master the principles of the fundamental algorithms, and be able to select a suitable algorithm for given data and requirements and use existing software to train, test, and evaluate pattern recognition models.
3. Course Contents

Chapter 1: Problem Definition and Data Collection in Pattern Recognition
This chapter introduces how a pattern recognition problem is defined, the form the data take, and the form the model takes, and guides students through an actual data collection exercise.
Practice: Collect data with three attributes (height, weight, and gender) and organize it into the data format required for pattern recognition.
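Since every later practice item reuses this dataset, one possible way to organize it is sketched below; the file name and column names are illustrative assumptions, not part of the syllabus, and scikit-learn-compatible NumPy arrays are assumed in the later sketches as well.

```python
# A minimal sketch of formatting the collected data as a pattern recognition
# dataset: a numeric feature matrix X and a label vector y.
# The file name and column names are hypothetical examples.
import pandas as pd

df = pd.read_csv("height_weight_gender.csv")   # columns: height_cm, weight_kg, gender
X = df[["height_cm", "weight_kg"]].to_numpy()  # features, shape (n_samples, 2)
y = df["gender"].to_numpy()                    # class labels, e.g. "M" / "F"
print(X.shape, y.shape)
```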
Chapter 2: The Bayesian Classifier and Performance Evaluation
This chapter introduces the Bayesian classifier, the two types of error and their estimation, and a proof that the Bayesian classifier minimizes the error rate. It also covers the basic theory of probability density estimation, the Parzen window estimation method, the performance evaluation framework, and the concepts of cross-validation, over-fitting, and generalization.
Practice: Using the data collected in Chapter 1, build a Bayesian classifier and evaluate its performance.
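One possible realization of this practice is sketched below. It assumes scikit-learn as the "existing software"; note that GaussianNB fits Gaussian class-conditional densities rather than the Parzen window estimate discussed in lecture.

```python
# A sketch of the Chapter 2 practice: a Bayes classifier with Gaussian
# class-conditional densities, evaluated by 5-fold cross-validation.
# X and y are the feature matrix and labels prepared in Chapter 1.
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

clf = GaussianNB()
scores = cross_val_score(clf, X, y, cv=5)
print("mean accuracy: %.3f" % scores.mean())
```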
Chapter 3: Linear Classifiers
This chapter introduces the basic theory of linear classifiers, the Fisher linear discriminant, and the performance evaluation of linear classifiers.
Practice: Using the data from Chapter 1, build a Fisher linear classifier and evaluate its performance.
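A possible sketch of this practice, again assuming scikit-learn; its LinearDiscriminantAnalysis classifier is closely related to the Fisher linear discriminant covered in lecture.

```python
# A sketch of the Chapter 3 practice: a linear discriminant classifier
# evaluated with the same cross-validation protocol as Chapter 2.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

lda = LinearDiscriminantAnalysis()
print("mean accuracy: %.3f" % cross_val_score(lda, X, y, cv=5).mean())
```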
Chapter 4: Introduction to Artificial Neural Networks and Support Vector Machines
This chapter introduces the basic concepts and algorithms of artificial neural networks, the back-propagation (BP) training algorithm, and the basic concepts and algorithms of support vector machines. It also gives a brief introduction to the most basic concepts of statistical learning theory (SLT): the VC dimension, generalization ability, and the model selection theorem.
Practice: Using the data from Chapter 1, build an artificial neural network model and a support vector machine model and evaluate their performance.
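A sketch of this practice under the same scikit-learn assumption; the hidden layer size and kernel are illustrative choices, and the features are standardized because height and weight are measured on different scales.

```python
# A sketch of the Chapter 4 practice: a small back-propagation-trained network
# and an RBF-kernel SVM, each in a pipeline with feature standardization.
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

ann = make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000))
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
for name, clf in [("ANN", ann), ("SVM", svm)]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```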
Chapter 5: The Nearest-Neighbor Method
This chapter introduces the basic theory and methods of the nearest-neighbor rule, the theorem bounding its error rate, implementation techniques, its practical strengths and limitations, and the data sparsity problem.
Practice: Using the data from Chapter 1, build a nearest-neighbor classification model and evaluate its performance.
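A possible sketch of this practice, with k = 3 as an illustrative choice (k = 1 gives the plain nearest-neighbor rule).

```python
# A sketch of the Chapter 5 practice: a k-nearest-neighbor classifier.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

knn = KNeighborsClassifier(n_neighbors=3)
print("mean accuracy: %.3f" % cross_val_score(knn, X, y, cv=5).mean())
```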
Chapter 6: Classification Trees
This chapter introduces the basic theory and methods of classification trees and the theory and implementation of the C4.5 algorithm.
Practice: Using the data from Chapter 1, build a classification tree model and evaluate its performance.
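A sketch of this practice; note that scikit-learn provides a CART-style tree rather than C4.5, so the entropy criterion is only a close analogue of the algorithm taught in lecture, and the depth limit is an illustrative choice.

```python
# A sketch of the Chapter 6 practice: a decision tree classifier with an
# entropy split criterion, evaluated by cross-validation.
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3)
print("mean accuracy: %.3f" % cross_val_score(tree, X, y, cv=5).mean())
```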
Chapter 7: Clustering
This chapter introduces the basic theory and methods of clustering and unsupervised learning, together with the theory and implementation of K-means clustering and hierarchical clustering. The limitations of the K-means algorithm and improved variants are briefly discussed, as are the basic methods of semi-supervised learning.
Practice: Using the data from Chapter 1, build a clustering model and evaluate it.
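A sketch of this practice: K-means and hierarchical (agglomerative) clustering run on the features only, with the known gender labels used afterwards to score the clusterings. The adjusted Rand index is one possible evaluation choice, not one prescribed by the syllabus.

```python
# A sketch of the Chapter 7 practice: two clustering algorithms on the
# unlabeled features, scored against the held-back gender labels.
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

for model in (KMeans(n_clusters=2, n_init=10), AgglomerativeClustering(n_clusters=2)):
    labels = model.fit_predict(X)
    print(type(model).__name__, adjusted_rand_score(y, labels))
```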
Chapter 8: Classifier Ensembles and Online Algorithms
This chapter introduces the basic theory and algorithms of classifier combination, including logistic and semi-logistic regression, AdaBoost, bagging, and the bootstrap. It also introduces the basic concept of online training and the basic techniques for turning a training algorithm into an online one.
Practice: Using the data from Chapter 1, build an ensemble classifier and evaluate it.
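A sketch of this practice with two of the ensemble techniques named above; the number of base learners is an illustrative choice.

```python
# A sketch of the Chapter 8 practice: an AdaBoost ensemble and a bagging
# ensemble of decision trees, evaluated by cross-validation.
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

for name, clf in [("AdaBoost", AdaBoostClassifier(n_estimators=50)),
                  ("Bagging", BaggingClassifier(n_estimators=50))]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```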
4. Semester Hour Allocation
Topic        Lecture   Computer Lab   Experiment   Practice   Practice (Week)
Chapter 1       1            1
Chapter 2       1            1
Chapter 3       1            1
Chapter 4       1            1
Chapter 5       2            1
Chapter 6       2            1
Chapter 7       2            1
Chapter 8       2            1
Total:         12            8

5. Assessment and Examination
Final exam (80%) + regular assignments (20%)

6. Textbook and Main References
Textbook: Pattern Recognition (3rd Edition)
Author: Xuegong Zhang (张学工)
Publisher: Tsinghua University Press
ISBN: 9787302225003
Publication Date: 2010-8-1
Format: 16mo (16开)
Pages: 237

Constitutor:    Reviewer:    Authorizer:    Date of Approval:

TU Syllabus for Pattern Recognition 2
Code: 2160265    Title: Pattern Recognition 2
Semester Hours: 20    Credits: 1
Semester Hour Structure:  Lecture: 12   Computer Lab: 8   Experiment:   Practice:   Practice (Week):
Offered by: School of Computer Science and Technology
For: Computer Science and Technology
Prerequisite: Calculus, Linear Algebra, Statistics, Probability

1. Objective
Students will be required to learn the basic theory of pattern recognition, including its basic principles, concepts, and methods, as well as the implementation of the algorithms. Students need to master the usage and application of existing software for pattern recognition, as well as the performance evaluation of the models and the training procedure. Students need to learn calculus, linear algebra, and statistics before taking this course. Knowledge of stochastic processes could also be helpful, but is not required.

2. Course Description
This course will introduce the basic principles and methods of pattern recognition. This includes problem definition, data format, the Bayesian classifier, error rate estimation, probability density estimation, the Parzen window method, linear classifiers, multi-class classifiers, the nearest-neighbor method, support vector machines, artificial neural networks, classification trees, K-means clustering, hierarchical clustering, and other basic pattern recognition methods.

3. Topics
Chapter 1: Pattern Recognition Problem Definition and Data Collection
This chapter will introduce the definition of pattern recognition and the regular form for presenting a pattern recognition problem.
Practice: Collect a toy-problem dataset including the height, weight, and gender of your classmates, and format these data as a pattern recognition dataset. The practice of each subsequent chapter will apply the corresponding algorithm to this dataset.

Chapter 2: Bayesian Classifier and Performance Evaluation
This chapter will introduce the basic theory of the Bayesian classifier, the two types of error and error rate estimation, and the minimal-error-rate classifier theorem, as well as probability density estimation using the Parzen method and the performance evaluation framework based on cross-validation. The concepts of over-fitting and generalization ability will also be covered.

Chapter 3: Linear Classifiers
This chapter will introduce the basic theory of linear classifiers, the Fisher linear discriminant, and the performance evaluation of linear classifiers.

Chapter 4: Brief Introduction to Artificial Neural Networks and Support Vector Machines
This chapter will introduce the most basic concepts and principles of support vector machines and artificial neural networks, including the SVM training algorithm and the BP algorithm. The most basic concepts of statistical learning theory will also be introduced (VC dimension, generalization, and the model selection theorem).

Chapter 5: Nearest Neighbor
This chapter will introduce the basic theory and methods of the nearest-neighbor method, including the error rate bound theorem, implementation techniques, practical problems with imbalanced datasets, and the sparse dataset problem.

Chapter 6: Classification Trees
Basic theory and methods of classification trees; implementation of the C4.5 algorithm.

Chapter 7: Clustering
This chapter will introduce the principles and methods of clustering and unsupervised learning. The K-means and hierarchical clustering algorithms, together with their problems and improvements, will be introduced. Semi-supervised learning will also be mentioned.

Chapter 8: Classifier Ensembles and Online Algorithms
This chapter will introduce the basic theory and algorithms of logistic regression, semi-logistic regression, AdaBoost, bagging, the bootstrap, and online training algorithms.

4. Semester Hour Structure
Topic        Lecture   Computer Lab   Experiment   Practice   Practice (Week)
Chapter 1       1            1
Chapter 2       1            1
Chapter 3       1            1
Chapter 4       1            1
Chapter 5       2            1
Chapter 6       2            1
Chapter 7       2            1
Chapter 8       2            1
Sum:           12            8

5. Grading
Final Exam (80%) + Homework (20%)

6. Text-Book & Additional Readings
PATTERN RECOGNITION (3rd Ed.), Xuegong Zhang, Tsinghua University Press, ISBN: 9787302225003, 2010-8-1

Constitutor:    Reviewer:    Authorizer:    Date: