七、call peak !(花这么多钱,就为了这个东西!m6a到底富集在哪?我要坐标!)
还记得那三剑客吗?“bam、bigwig、bed” 。豪不夸张的说,拥有这三样东西,NGS-data的下游分析就手到擒来。前面我们获得了bam和bigwig,接下来我们获得m6a的bed,至于其他bed可以来源与UCSC和文献。
1、MACS2(万能钥匙真的万能吗?)
hisat2_dir=$work_dir/1.1_hisat2/mouse
hisat2_bam_dir=$hisat2_dir/position_sorted_bam
macs2_dir=$hisat2_dir/position_sorted_bam/macs2
cd $hisat2_bam_dir
#----------
my_name=SYSY_rep1
mkdir $macs2_dir/${my_name}
t=C57BL6_Brain_Sample_1_MeRIP-SYSY.hisat2.position_sorted.bam
c=C57BL6_Brain_Sample_1_Non-IP_Control.hisat2.position_sorted.bam
nohup macs2 callpeak -t $hisat2_bam_dir/$t -c $hisat2_bam_dir/$c -g mm -f BAM -n ${my_name} --outdir $macs2_dir/${my_name} -q 0.01 > $macs2_dir/${my_name}.macs2.log &
#----------
2、参数小贴士
参数 | 解释 | |||||
---|---|---|---|---|---|---|
-g/–gsize | hs: 2.7e9;mm: 1.87e9;ce: 9e7;dm: 1.2e8 | |||||
-f | BAM;BAMPE 等 | |||||
-q or -p | <0.001;<0.01;<0.05 等 | |||||
-m/–mfold | -m 5,50 | |||||
–keep-dup | Default: 1 | |||||
–fix-bimodal | when MACS failed to build paired model, it will use the nomodel settings | |||||
–nomodel | 配合 –shift –extsize 使用 | |||||
–nolamda | 谨慎使用 - With this flag on, MACS will use the background lambda as local lambda. This means MACS will not consider the local bias at peak candidate regions. | |||||
–slocal; –llocal | 谨慎使用 - MACS considers 1000bp for small local region(–slocal), and 10000bps for large local region(–llocal) |
3、输出文件简介
NAME_peaks.narrowPeak is BED6+4 format file which contains the peak locations together with peak summit, pvalue and qvalue. You can load it to UCSC genome browser. Definition of some specific columns are:
列头 | 解释 | |||||
---|---|---|---|---|---|---|
5th | integer score for display. It’s calculated as int(-10log10pvalue) or int(-10log10qvalue) depending on whether -p (pvalue) or -q (qvalue) is used as score cutoff. Please note that currently this value might be out of the [0-1000] range defined in UCSC Encode narrowPeak format. You can let the value saturated at 1000 (i.e. p/q-value = 10^-100) by using the following 1-liner awk: awk -v OFS="\t" ‘{$5=$5>1000?1000:$5} {print}’ NAME_peaks.narrowPeak | |||||
7th | fold-change at peak summit | |||||
8th | -log10pvalue at peak summit | |||||
9th | -log10qvalue at peak summit | |||||
10th | relative summit position to peak start |
《=== narrowPeak文件示例 ===》 《=== summit文件示例 ===》
《=== 各类peak属性统计(R) ===》 ###请打开 Rcode - part7
SYSY_rep1_peaks_PeakLengthDensity.pdf SYSY_rep1_peaks_PeakEnrichmentDensity.pdf SYSY_rep1_peaks_PeakPvalueDensity.pdf SYSY_rep1_peaks_PeakSummitPosition.pdf
4、deeptools –> peak 中心信号 summit.bed + bw
5、narrowPeak文件初步过滤
awk 'BEGIN{OFS="\t"; FS="\t"} $5 > 50 {print $0}' XXX.narrowPeak > XXX.filter.narrowPeak
Dependencies
peak+deeotools+summit?heatmap?