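Abstract

With the progress of the Human Genome Project and breakthroughs in molecular biology techniques, tens of thousands of bioinformatics data sets urgently await extraction and analysis.
Meanwhile, computer and automation technologies continue to advance and play an irreplaceable role in data processing across many fields.
Given such a large volume of data, how to make full use of interdisciplinary techniques to analyze gene data automatically is a shared challenge across the life sciences today.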
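The advent of gene microarray chip technology provides a highly integrated experimental tool that can detect and analyze a large number of gene samples in a single run.
Chip preparation and scanning are now largely automated, but the subsequent extraction of data from gene microarray images has remained difficult to automate.
This is mainly because microarray images are large, their spots are dense and irregular, noise is strong, and contrast is low.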
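The goal of this thesis is to extract data from gene microarray images fully automatically while preserving the accuracy of gene spot localization and data extraction.
To this end, the thesis first improves the existing image processing flow, which is poorly suited to automation: gene spot edges are extracted first, the grid is located next, and missing spots are compensated afterwards.
Then, an automatic image enhancement method based on gray-scale morphology and an adaptive binarization method are proposed for automatic preprocessing, and edges are extracted using the morphological features of the gene spots themselves.
Next, on the binary image, the grid is located by a fast tilt correction method based on angle projection to achieve automatic image segmentation, and the problems of missing-spot compensation and the separation of touching spots are solved.
Finally, experiments compare the extracted data with the results of internationally recognized analysis software, and a large number of real images are used to further verify the reliability, validity, and completeness of the microarray image data extraction.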
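The idea of estimating tilt from angle projections on a binary image can be illustrated with a minimal sketch. It rests on an assumption that the abstract does not spell out: candidate angles are tried one by one, and the angle whose row projection of the foreground pixels is sharpest (largest variance) is taken as the tilt. The function names `projection_sharpness`, `estimate_tilt`, and `make_tilted_grid`, the search range, the step size, and the synthetic spot grid are all hypothetical and chosen only for illustration; they are not the thesis's implementation.

```python
import numpy as np


def projection_sharpness(ys, xs, angle_deg):
    """Variance of the row-projection profile after undoing a tilt of angle_deg."""
    t = np.deg2rad(angle_deg)
    # Row coordinate of every foreground pixel after rotating back by angle_deg.
    rows = -xs * np.sin(t) + ys * np.cos(t)
    # One-pixel-wide bins covering the full range of projected rows.
    bins = np.arange(np.floor(rows.min()), np.ceil(rows.max()) + 2)
    profile, _ = np.histogram(rows, bins=bins)
    # Aligned spot rows give a peaky profile, hence a large variance.
    return float(profile.var())


def estimate_tilt(binary, max_angle=5.0, step=0.1):
    """Return the candidate angle (degrees) with the sharpest row projection."""
    ys, xs = np.nonzero(binary)
    ys = ys - ys.mean()  # centre the coordinates so rotation is about the centroid
    xs = xs - xs.mean()
    angles = np.arange(-max_angle, max_angle + step / 2, step)
    scores = [projection_sharpness(ys, xs, a) for a in angles]
    return float(angles[int(np.argmax(scores))])


def make_tilted_grid(tilt_deg, n=12, spacing=16, radius=3, size=240):
    """Synthetic binary image: an n x n grid of discs rotated by tilt_deg (assumed test data)."""
    t = np.deg2rad(tilt_deg)
    offsets = (np.arange(n) - (n - 1) / 2) * spacing
    gx, gy = np.meshgrid(offsets, offsets)
    # Rotate the spot centres about the image centre.
    cx = size / 2 + gx * np.cos(t) - gy * np.sin(t)
    cy = size / 2 + gx * np.sin(t) + gy * np.cos(t)
    yy, xx = np.mgrid[0:size, 0:size]
    img = np.zeros((size, size), dtype=bool)
    for x0, y0 in zip(cx.ravel(), cy.ravel()):
        img |= (xx - x0) ** 2 + (yy - y0) ** 2 <= radius ** 2
    return img


if __name__ == "__main__":
    tilted = make_tilted_grid(tilt_deg=2.0)
    print("estimated tilt (degrees):", estimate_tilt(tilted))
```

Working on foreground pixel coordinates instead of rotating the whole image keeps each candidate angle cheap, which is one plausible reading of "fast"; a real microarray image would also need the preprocessing and binarization steps before this stage.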
Keywords: microarray image, image processing, automatic extraction

Abstract

With the progress of the Human Genome Project and breakthroughs in molecular biology techniques, vast quantities of bioinformatics data urgently await extraction and analysis. At the same time, computer and automation technologies keep advancing and now play an irreplaceable role in data processing in many fields.

Gene microarray chip technology provides a highly integrated experimental tool that can detect and analyze a large number of gene samples at once. Chip preparation and scanning are now largely automated, but extracting the data from gene microarray images has remained difficult to automate fully. The main reasons are the large data volume of microarray images, the high density and irregular shape of the spots, strong noise, and low contrast. The difficulty and goal of this study is therefore to achieve fully automatic extraction from microarray images while keeping gene spot localization and data extraction accurate.

To achieve this goal, this thesis first improves the existing image processing flow, which is poorly suited to automation: gene spot edges are extracted first, the grid is then located, and missing spots are compensated afterwards. Second, automatic image enhancement based on gray-scale morphology and an automatic binarization method are proposed for automatic preprocessing, and edges are extracted using the morphological features of the gene spots themselves. Third, to achieve automatic image segmentation, the grid is located on the binary image by a fast tilt correction method based on angle projection, and the problems of missing-spot compensation and the separation of touching spots are solved.
Finally, the experimental results are compared with those of internationally recognized analysis software, and a large number of real images are used to further confirm the reliability, validity, and completeness of the microarray image data extraction.

Keywords: microarray image; image processing; automatic extraction

Contents

Chapter 1 Introduction
  1.1 Background and Significance of the Topic
  1.2 Current Research and Related Problems
    1.2.1 Overall Processing Flow
    1.2.2 Research on Image Preprocessing Methods
    1.2.3 Research on Image Segmentation Methods
    1.2.4 Current State of Software Development
  1.3 Main Research Content and Innovations
  1.4 Organization of the Thesis
Chapter 2 Overview of Gene Microarray Chips
  2.1 Biochips
    2.1.1 Overview of Biochips
    2.1.2 Classification of Biochips
    2.1.3 Microarray Chips
  2.2 Principles of Microarray Chip Preparation and Analysis
    2.2.1 Material Requirements for Chip Fabrication
    2.2.2 Fabrication Methods and Equipment
    2.2.3 Hybridization Reaction and Fluorescence Detection
    2.2.4 Image Processing and Data Analysis
Chapter 3 Improvement of the Microarray Image Processing Flow
  3.1 Macro-level Features of the Image
  3.2 Micro-level Problems of the Image and Their Causes
  3.3 Basic Image Processing Flow
  3.4 Process Improvement
Chapter 4 Automatic Preprocessing of Microarray Images
  4.1 Noise Removal
    4.1.1 Median Filtering
    4.1.2 Analysis of the Filtering Effect
  4.2 Morphology-based Adaptive Image Enhancement
    4.2.1 Traditional Image Enhancement
    4.2.2 Principles of Mathematical Morphology
    4.2.3 Gray-scale Morphology
    4.2.4 Adaptive Image Enhancement
  4.3 Automatic Extraction of the Binarization Threshold
    4.3.1 Comparison of Binarization Methods
    4.3.2 Problems with Adaptive Binarization
    4.3.3 Automatic Binarization Based on Difference and Standard Deviation
Chapter 5 Automatic Segmentation of Microarray Images
  5.1 Overview of Image Segmentation Methods
  5.2 Fast Tilt Correction
    5.2.1 Comparison of Tilt Correction Methods
    5.2.2 Fast Correction Using Angle Projection
    5.2.3 Image Rotation Correction
  5.3 Microarray Matrix Segmentation
    5.3.1 Block Segmentation
    5.3.2 Grid Localization and Automatic Correction
  5.4 Missing-Spot Compensation and Separation of Touching Spots
  5.5 Segmentation of Gene Spots and Background
    5.5.1 Morphological Edge Segmentation
    5.5.2 Contour Optimization in Bright Noise Regions
  5.6 Calculation of Ratio Data
Chapter 6 Experimental Data Analysis
  6.1 Description of the Experimental Data
  6.2 Introduction to Mainstream Software
  6.3 Comparative Data Analysis
    6.3.1 Independent Scatter Plots and Logarithmic Analysis
    6.3.2 Cross-Error Analysis
    6.3.3 Cross Scatter Plot Comparison
    6.3.4 Summary of Experimental Results
Chapter 7 Summary and Outlook
  7.1 Summary of Work
  7.2 Limitations and Future Work
References
Acknowledgements

Chapter 1 Introduction

1.1 Background and Significance of the Topic

Bioinformatics is a discipline that makes full use of computer and information technology to study the laws governing biological systems. It plays an important role in discovering human disease genes, studying the expression and function of genes and proteins, and rational drug design [1].