
MACS2 operating manual and introduction


Using MACS - setup
• cd /home/work/public
• mkdir macsout_<user ID>
– <user ID> : e.g. ‘spheikki’ for me
– each student MUST have their own folder, to avoid overlapping MACS outputs
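The setup steps above can be combined into a short shell sketch. It assumes that your login name (as reported by `id -un`) is the same as your course user ID; the variable names `base`, `user_id` and `outdir` are only illustrative.

```shell
# Assumption: the login name equals the course <user ID>.
base=/home/work/public
user_id="$(id -un)"
outdir="$base/macsout_$user_id"

if [ -d "$base" ]; then
    # -p: no error if the folder already exists
    cd "$base" && mkdir -p "macsout_$user_id"
else
    echo "Directory $base not found - are you logged in to the course server?" >&2
fi
echo "MACS output folder: $outdir"
```

Using `mkdir -p` means the sketch can be rerun safely in later sessions.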
ChIP-seq analysis with MACS2
Tips and tricks
Sami Heikkinen, PhD, Docent in Molecular Bioinformatics, Institute of Biomedicine, UEF
ChIP-Seq simplified
Where?
• checks on seq files
– ls -l seq
– head seq/*
• check that macs2 works
– macs2 callpeak
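The checks above could be wrapped in a small sketch that reports problems instead of failing silently. It assumes the sequence files sit in a folder named `seq` under the current directory, as in the bullets above. `macs2 --version` is used for the MACS2 check because it prints the installed version without starting an analysis; the variable `macs2_found` is illustrative.

```shell
# Assumption: sequence files live in ./seq (relative to the current directory).
if [ -d seq ]; then
    ls -l seq                       # names, sizes and dates of the files
    for f in seq/*; do
        [ -f "$f" ] || continue
        echo "== $f"
        head -n 3 "$f"              # first lines only; meaningful for text formats such as BED
    done
else
    echo "No 'seq' folder here; cd to the course directory first." >&2
fi

if command -v macs2 >/dev/null 2>&1; then
    macs2 --version                 # prints the installed MACS2 version
    macs2_found=yes
else
    echo "macs2 is not on the PATH." >&2
    macs2_found=no
fi
```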
callpeak - Options
Various options to indicate/control input, output, peak modelling and peak calling
macs2 callpeak usage:
macs2 callpeak [-h] -t TFILE [TFILE ...] [-c [CFILE [CFILE ...]]]
               [-f {AUTO,BAM,SAM,BED,ELAND,ELANDMULTI,ELANDEXPORT,BOWTIE,BAMPE}]
               [-g GSIZE] [--keep-dup KEEPDUPLICATES] [--buffer-size BUFFER_SIZE]
               [--outdir OUTDIR] [-n NAME] [-B] [--verbose VERBOSE] [--trackline] [--SPMR]
               [-s TSIZE] [--bw BW] [-m MFOLD MFOLD] [--fix-bimodal] [--nomodel]
               [--shift SHIFT] [--extsize EXTSIZE] [-q QVALUE] [-p PVALUE] [--to-large]
               [--ratio RATIO] [--down-sample] [--seed SEED] [--nolambda]
               [--slocal SMALLLOCAL] [--llocal LARGELOCAL] [--broad]
               [--broad-cutoff BROADCUTOFF] [--call-summits]
[Figure labels: "36-50 bp"; "Typically millions of ..."]
MACS2
• Model-based Analysis of ChIP-Seq
• Original version published by Yong Zhang and Tao Liu from the lab of X. Shirley Liu at the Dana-Farber Cancer Institute, Boston
MACS2 subcommands (diagram labels): bdgdiff, diffpeak
callpeak - Options
-t/--treatment FILENAME This is the only REQUIRED parameter for MACS.
Park, Nat Rev Genetics, 2009
Schmidt et al, Methods, 2009
From binding to binding sites
ChIP-seq
~200 bp
Control sample: “Input” or “IgG”
– Input: sonicated chromatin without immunoprecipitation
– IgG: “unspecific” IP
callpeak – Options - Input
Input files arguments:
  -t TFILE [TFILE ...], --treatment TFILE [TFILE ...]
      ChIP-seq treatment file. If multiple files are given as '-t A B C', then they will all be read and combined. REQUIRED.
  -c [CFILE [CFILE ...]], --control [CFILE [CFILE ...]]
      Control file. If multiple files are given as '-c A B C', then they will all be read and combined.
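As an illustration of the input arguments above, the sketch below assembles a `callpeak` call with two treatment files and one control file. The file names (`seq/chip_rep1.bed`, `seq/chip_rep2.bed`, `seq/input.bed`) and the run name `example` are hypothetical. The command is printed first and only executed if `macs2` is installed and the control file exists.

```shell
# Hypothetical file names; replace with your own files.
treat="seq/chip_rep1.bed seq/chip_rep2.bed"   # several -t files are read and combined
ctrl="seq/input.bed"                          # control (Input or IgG)
name="example"                                # hypothetical run name for -n

cmd="macs2 callpeak -t $treat -c $ctrl -n $name"
echo "$cmd"

# $cmd is expanded unquoted so that it splits into separate arguments
# (this requires file names without spaces).
if command -v macs2 >/dev/null 2>&1 && [ -f "$ctrl" ]; then
    $cmd
fi
```

Printing the command before running it makes it easy to copy the exact call into your notes.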
– Genome Biology 2008, 9:R137
– now at version 2.1.0.20140616, developed and maintained by Tao Liu at https:///taoliu/MACS/
– https:///taoliu/MACS/blob/macs_v1/README.rst
• Usage tip: use up/down arrow keys to move in command history
• ls
– LiSt files in directory
– e.g. ‘ls -l’ to show file and folder names AND other info (Long format)
MACS2 subcommands and their output files (diagram):
• filterdup, randsample
• callpeak
– output files: peaks.xls, peaks.narrowPeak, summits.bed, model.r, model.pdf, treat_pileup.bdg, control_lambda.bdg
• predictd, pileup, refinepeak, bdgpeakcall, bdgbroadcall, bdgcmp
– output: pileup.bdg, refinepeak.bed
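callpeak prefixes each of its output files with the run name given with `-n`. The sketch below lists the file names to expect for a hypothetical run name `example` and reports which of them exist in the current directory. The two .bdg files are only written when `-B` is used, and model.pdf is not written by MACS2 itself: it is produced by running the model.r script through R (for instance `Rscript example_model.r`).

```shell
name="example"    # hypothetical run name, as passed to 'macs2 callpeak -n'
outputs=""
for suffix in peaks.xls peaks.narrowPeak summits.bed model.r \
              treat_pileup.bdg control_lambda.bdg; do
    outputs="$outputs ${name}_$suffix"
done

# Report which of the expected files are present in the current directory.
for f in $outputs; do
    if [ -e "$f" ]; then echo "found    $f"; else echo "missing  $f"; fi
done
```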
Using MACS – connect to server
• Open the SSH client
– at Win -> All programs -> SSH Secure shell -> Secure shell client
– “Quick connect”
• connection : intron.uef.fi
• username : <your user ID>
• password: <your password>
  -f {AUTO,BAM,SAM,BED,ELAND,ELANDMULTI,ELANDEXPORT,BOWTIE,BAMPE}, --format {AUTO,BAM,SAM,BED,ELAND,ELANDMULTI,ELANDEXPORT,BOWTIE,BAMPE}
      Format of tag file: "AUTO", "BED", "ELAND", "ELANDMULTI", "ELANDEXPORT", "SAM", "BAM", "BOWTIE" or "BAMPE". The default AUTO option will let MACS decide which format the file is. Please check the definition in the README file if you choose ELAND/ELANDMULTI/ELANDEXPORT/SAM/BAM/BOWTIE. DEFAULT: "AUTO"
  -g GSIZE, --gsize GSIZE
      Effective genome size. It can be 1.0e+9 or 1000000000, or one of the shortcuts: 'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) and 'dm' for fruitfly (1.2e8). Default: hs
  --keep-dup KEEPDUPLICATES
      Controls the MACS behavior towards duplicate tags at the exact same location, i.e. the same coordinates and the same strand. The 'auto' option makes MACS calculate the maximum number of tags at the exact same location based on a binomial distribution using 1e-5 as the p-value cutoff; the 'all' option keeps every tag. If an integer is given, at most this number of tags will be kept at the same location. The default is to keep one tag at the same location. Default: 1
  --buffer-size BUFFER_SIZE
      Buffer size for incrementally increasing the internal array size used to store read alignment information. In most cases, you don't have to change this parameter. However, if there is a large number of chromosomes/contigs/scaffolds in your alignment, it's recommended to specify a smaller buffer size in order to decrease memory usage (but it will take longer to read the alignment files). Minimum memory requested for reading an alignment file is about # of CHROMOSOME * BUFFER_SIZE * 2 Bytes. DEFAULT: 100000
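The memory estimate quoted for --buffer-size is simple arithmetic and can be checked in the shell. The helper name `min_mem_bytes` and the chromosome counts below are illustrative assumptions (25 for a human-like assembly, 100000 for a draft assembly with many scaffolds). The factor of 2 bytes is the one given in the help text above.

```shell
# Minimum memory ~ number of chromosomes * BUFFER_SIZE * 2 bytes (from the help text).
# Usage: min_mem_bytes <number of chromosomes> <buffer size>
min_mem_bytes() {
    echo $(( $1 * $2 * 2 ))
}

min_mem_bytes 25 100000        # 25 chromosomes, default buffer size
min_mem_bytes 100000 100000    # 100000 scaffolds, default buffer size
min_mem_bytes 100000 1000      # same scaffolds with --buffer-size 1000
```

The three calls print 5000000, 20000000000 and 200000000, i.e. roughly 5 MB, 20 GB and 200 MB. This shows why a smaller buffer size is worth considering for assemblies with very many scaffolds.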