Transcriptome Data Analysis: Interpretation and Hands-on Examples
Content of transcriptome
1. Genes: expression, alternative splicing
2. Noncoding RNA: snoRNA, mRNA-like ncRNA, snRNA, some antisense transcripts, pseudogenes, retrotransposons, and other functional RNAs
3. Some repeat elements
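Biological replicates and standards for RNA-seq
1. Use at least two biological replicates, unless sampling is a short time series (overlapping time points with high temporal resolution); technical replicates are not required.
2. For species with well-annotated genes, more than 20M reads are sufficient for purely quantitative comparison; a transcriptome used to annotate a genome needs more than 100M reads.
3. Ideally include absolute-quantification controls (spike-ins) of varying concentration and length to assess mapping quality, sequencing uniformity, and RNA-seq quantification performance.
4. The 3'-end/5'-end ratio is a key indicator of RNA integrity (ideal value: 1). Key sample-assessment metrics and key results such as RPKM values should be complete.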
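The RPKM value and the read-depth thresholds mentioned above can be illustrated with a minimal Python sketch. It assumes the standard RPKM definition (reads per kilobase of transcript per million mapped reads); the function names `rpkm` and `meets_depth_guideline` and the example numbers are hypothetical and are not part of the original slides.

```python
# Minimal sketch: RPKM and a read-depth check.
# Function names and example numbers are illustrative assumptions.

# Depth thresholds taken from the guideline list above (reads per sample).
MIN_READS_QUANTIFICATION = 20_000_000   # well-annotated species, quantification only
MIN_READS_ANNOTATION = 100_000_000      # transcriptome used to annotate a genome


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def meets_depth_guideline(total_reads, for_annotation=False):
    """Return True if the read count exceeds the guideline threshold."""
    threshold = MIN_READS_ANNOTATION if for_annotation else MIN_READS_QUANTIFICATION
    return total_reads > threshold


if __name__ == "__main__":
    # 500 reads on a 2,000 bp gene in a library of 25 million mapped reads.
    print(rpkm(500, 2000, 25_000_000))
    print(meets_depth_guideline(25_000_000))
    print(meets_depth_guideline(25_000_000, for_annotation=True))
```

Because RPKM divides by both gene length and library size, values are comparable across genes of different lengths and across samples of different depths, which is why it is listed among the key results.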
– http://bowtie-/index.shtml/
– http:///
– http:///
– http://cuffl/
– http:///cummeRbund/
* Linux, 64-bit CPU, 16 GB memory
Background
mRNA analysis tools

Mapping and assembly tools
BWA - a fast, lightweight tool that aligns relatively short sequences (queries) to a sequence database (target), such as the human reference genome.
SeqMap - a tool for mapping millions of short sequences to the genome.
MAQ - stands for Mapping and Assembly with Quality; it builds assemblies by mapping short reads to reference sequences.
ERANGE - mapping and quantifying mammalian transcriptomes by RNA-Seq.
Cufflinks - assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
iAssembler - a standalone package to assemble ESTs generated using Sanger and/or Roche-454 pyrosequencing technologies into contigs.
MapPER - an RNA-seq paired-end read (PER) protocol.

Tools supporting spliced mapping and quantification
TopHat - a fast splice junction mapper for RNA-Seq reads.
SpliceMap - a de novo splice junction discovery tool. It offers high sensitivity and support for arbitrarily long RNA-seq read lengths.
MapSplice - a splice junction mapping tool.
Trinity RNA-Seq Assembly - software solutions targeted to the reconstruction of full-length transcripts and alternatively spliced isoforms from Illumina RNA-Seq data.
PALMapper - a combination of the spliced alignment method QPALMA with the short read alignment tool GenomeMapper.
• TopHat software
• Cufflinks software
• CummeRbund software
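How the listed tools might chain together can be sketched in Python. The sketch only builds the command lines and does not run them. It follows the commonly documented TopHat → Cufflinks → Cuffmerge → Cuffdiff order, and assumes a prebuilt Bowtie index, paired-end FASTQ files, and SAM tools available on the PATH. The helper name `build_pipeline`, all file names, and the thread count are hypothetical; the slides do not specify an exact command sequence.

```python
# Minimal sketch: assemble (but do not execute) the command lines for a
# TopHat -> Cufflinks -> Cuffmerge -> Cuffdiff run.
# All file names and the helper name are illustrative assumptions.
import shlex


def build_pipeline(index, gtf, genome_fa, samples, threads=8):
    """Return (commands, assemblies).

    samples maps a condition label to a list of
    (sample_name, fastq_mate1, fastq_mate2) tuples.
    """
    commands = []
    assemblies = []          # per-sample transcripts.gtf paths for cuffmerge
    bams_by_condition = {}   # condition -> list of alignment files

    for condition, replicates in samples.items():
        for name, fq1, fq2 in replicates:
            tophat_out = f"{name}_thout"
            cuff_out = f"{name}_clout"
            bam = f"{tophat_out}/accepted_hits.bam"
            # Spliced alignment of the reads against the indexed genome.
            commands.append(["tophat", "-p", str(threads), "-G", gtf,
                             "-o", tophat_out, index, fq1, fq2])
            # Transcript assembly and abundance estimation per sample.
            commands.append(["cufflinks", "-p", str(threads),
                             "-o", cuff_out, bam])
            assemblies.append(f"{cuff_out}/transcripts.gtf")
            bams_by_condition.setdefault(condition, []).append(bam)

    # cuffmerge reads a text file listing one assembly path per line;
    # the caller is expected to write `assemblies` into assemblies.txt.
    commands.append(["cuffmerge", "-g", gtf, "-s", genome_fa,
                     "-p", str(threads), "assemblies.txt"])
    # Differential expression; replicates of a condition are comma-joined.
    commands.append(["cuffdiff", "-o", "diff_out", "-b", genome_fa,
                     "-p", str(threads),
                     "-L", ",".join(bams_by_condition),
                     "-u", "merged_asm/merged.gtf"]
                    + [",".join(bams) for bams in bams_by_condition.values()])
    return commands, assemblies


if __name__ == "__main__":
    example = {
        "C1": [("C1_R1", "C1_R1_1.fq", "C1_R1_2.fq")],
        "C2": [("C2_R1", "C2_R1_1.fq", "C2_R1_2.fq")],
    }
    cmds, _ = build_pipeline("genome", "genes.gtf", "genome.fa", example)
    for cmd in cmds:
        print(shlex.join(cmd))
```

The `diff_out` directory written by Cuffdiff is the input that CummeRbund loads in R (via `readCufflinks`) for plotting and exploration. Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` once the paths have been checked.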
• RNA-seq is a powerful tool for detecting the whole transcriptome in cells and tissues.
• Previous RNA-seq research focused on mRNA, but recent studies show that some functional noncoding transcripts and protein-coding RNAs lack a poly(A) tail.
Transcriptome Data Analysis: Interpretation and Hands-on Examples
Luo Qibin (罗奇斌), QY NODE (奇云诺德), Technical University of Munich, Germany
Second generation sequencers
Routine analysis
Experimental workflow
Tools required for the analysis
• Bowtie software
• SAM tools
Web-based tools
rQuant.web - a web service that provides convenient access to tools for the quantitative analysis of RNA-Seq data.
Galaxy - mapping pipeline for Illumina, 454, and SOLiD sequencing data.
UCSC Genome Browser - this site contains the reference sequence and working draft assemblies for a large collection of genomes. It also provides portals to the ENCODE and Neandertal projects.
Bioconductor - an open source and open development software project for the analysis and comprehension of genomic data.
ExpEdit - a web application for assessing RNA editing in human at known or user-specified sites, supported by transcript data obtained from RNA-Seq experiments.
Myrna - a cloud computing tool for RNA sequence data.
GenePattern - a powerful genomic analysis platform that provides access to more than 100 tools for gene expression analysis, proteomics, SNP analysis, and common data processing tasks.

Others
Scripture - a method for transcriptome reconstruction that relies solely on RNA-Seq reads and an assembled genome to build a transcriptome ab initio.
CisGenome - an integrated tool for tiling array, ChIP-seq, genome, and cis-regulatory element analysis.