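To do good work, one must first sharpen one's tools.
pyGenomeTracks is a powerful, flexible tool for visualizing genomic data such as gene annotations, signal intensity, and coverage, which helps researchers better understand genome function and structure. It is widely used in epigenetics, transcriptomics, and genomics research. It is free and open source; the key details are: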
GitHub:
Documentation:
Programming language: Python
Title: High-resolution TADs reveal DNA sequences underlying genome organization in flies
Journal: Nature Communications
Date: 2018-01-15
Authors & affiliation: Fidel Ramírez & Vivek Bhardwaj, Max Planck Institute of Immunobiology and Epigenetics
DOI: https://doi.org/10.1038/s41467-017-02525-w
It can be installed from source or in several other ways; conda is the quickest and most convenient.
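# conda create -n chipseq    # run once if the environment does not exist yet
mamba activate chipseq
# if the package is not found, add the channels: -c conda-forge -c bioconda
mamba install pygenometracks
## Test the installation
pyGenomeTracks -h
## The latest version is installed by default; to pin a specific version, append it to the package name, e.g. mamba install pygenometracks=3.9
$ pyGenomeTracks --version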
3.9
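Before moving on, it can help to confirm that both command-line tools used in this post are actually on the PATH of the active environment. The following is a minimal sketch using only the Python standard library; the helper name `missing_commands` is made up for this illustration.

```python
import shutil


def missing_commands(commands):
    """Return the commands from `commands` that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


# The two executables this tutorial relies on.
required = ["pyGenomeTracks", "make_tracks_file"]
missing = missing_commands(required)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required commands are available.")
```

If either command is reported as missing, the most likely cause is that the chipseq environment is not activated.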
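As the figure shows, the workflow is simple and consists of two main parts.
[Figure: pyGenomeTracks workflow]
Plotting genome tracks for a specified region
Use make_tracks_file to create a configuration file, then plot: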
make_tracks_file --trackFiles ./data/bigwig2_X_2.5e6_3.5e6.bw -o basic_bw_track.ini
pyGenomeTracks --tracks basic_bw_track.ini --region X:2,500,000-3,000,000 -o ./plot_out/basic_bw.png
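The --region value above uses thousands separators (X:2,500,000-3,000,000), while later commands in this post write the coordinates without them. When driving pyGenomeTracks from a script it can be convenient to normalize the region first. The sketch below is only an illustration: `parse_region` and `build_plot_command` are hypothetical helper names, and the command is assembled but not executed.

```python
import shlex


def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end), ignoring commas."""
    chrom, _, span = region.rpartition(":")
    start_text, _, end_text = span.partition("-")
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if not chrom or start >= end:
        raise ValueError(f"invalid region: {region!r}")
    return chrom, start, end


def build_plot_command(tracks_ini, region, out_file):
    """Assemble the pyGenomeTracks call used in this post as an argument list."""
    chrom, start, end = parse_region(region)
    return [
        "pyGenomeTracks",
        "--tracks", tracks_ini,
        "--region", f"{chrom}:{start}-{end}",
        "-o", out_file,
    ]


command = build_plot_command(
    "basic_bw_track.ini", "X:2,500,000-3,000,000", "./plot_out/basic_bw.png"
)
# Print the command; pass `command` to subprocess.run(...) to actually execute it.
print(shlex.join(command))
```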
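If the figure is not to your liking, you can edit the generated configuration file basic_bw_track.ini and adjust the relevant settings yourself.
Without make_tracks_file: write the configuration file by hand (saved here as 1_bw_track.ini)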
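# configuration section for a single track
[bigwig file test]
# path to the bigWig file to plot
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
# track height, in centimeters
height = 4
# track title; by default it is shown to the right of the track
title = bigwig
# minimum value of the track (lower limit of the y-axis)
min_value = 0
# maximum value of the track (upper limit of the y-axis)
max_value = 30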
pyGenomeTracks --tracks 1_bw_track.ini --region X:2,500,000-3,000,000 -o ./plot_out/1_bw.png
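When many similar tracks are needed, the same kind of section can be generated instead of typed. The sketch below uses Python's standard configparser to write a section with the options shown above. The function name `bigwig_track_ini` is invented for this example, and the output is ordinary INI text; it has not been checked against every pyGenomeTracks version.

```python
import configparser
import io


def bigwig_track_ini(section, file, height, title, min_value, max_value):
    """Return INI text for one bigWig track section."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names exactly as written
    config[section] = {
        "file": file,
        "height": str(height),
        "title": title,
        "min_value": str(min_value),
        "max_value": str(max_value),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


ini_text = bigwig_track_ini(
    "bigwig file test", "./data/bigwig2_X_2.5e6_3.5e6.bw", 4, "bigwig", 0, 30
)
print(ini_text)  # redirect or write this text to 1_bw_track.ini
```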
Adding gene annotations (saved here as 2_bw_genes.ini)
[bigwig file test]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
height = 4
title = bigwig
min_value = 0
max_value = 30
[spacer]
# this simply adds a small space between the two tracks
# gene annotation track, drawn from a BED-format annotation file
[genes]
# BED file to plot
file = ./data/dm3_genes.bed6.gz
height = 7
title = genes
# font size of the labels in the track
fontsize = 10
# file type
file_type = bed
# number of rows used to display genes in the track
gene_rows = 10
# x-axis section
[x-axis]
fontsize=10
pyGenomeTracks --tracks 2_bw_genes.ini --region X:2,800,000-3,100,000 -o ./plot_out/2_bw_genes.png
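To get a feel for why a gene track needs several rows, consider how overlapping genes must be stacked so that they do not cover each other. The sketch below is a generic greedy interval-stacking routine, not pyGenomeTracks' own layout code, and the coordinates are made up; it only shows how the number of rows needed depends on how many intervals overlap.

```python
def stack_intervals(intervals):
    """Assign each (start, end) interval to the first row where it fits.

    Returns a list of row indices aligned with sorted(intervals).
    """
    row_ends = []  # end coordinate of the last interval placed in each row
    rows = []
    for start, end in sorted(intervals):
        for row, last_end in enumerate(row_ends):
            if start >= last_end:  # no overlap with this row's last interval
                row_ends[row] = end
                rows.append(row)
                break
        else:
            # every existing row is occupied at this position: open a new row
            row_ends.append(end)
            rows.append(len(row_ends) - 1)
    return rows


# Made-up gene coordinates, for illustration only.
genes = [(0, 10), (5, 15), (10, 20), (12, 30)]
rows = stack_intervals(genes)
print("row per gene:", rows)
print("rows needed:", max(rows) + 1)
```

If a region needs more rows than the track allows, labels and gene bodies get crowded, which is a hint to raise gene_rows or the track height.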
Adding vertical lines (saved here as 3_bw_genes_vlines.ini)
[bigwig file test]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
height = 4
title = bigwig
min_value = 0
max_value = 30
[spacer]
[genes]
file = ./data/dm3_genes.bed6.gz
height = 7
title = genes
fontsize = 10
file_type = bed
gene_rows = 10
[x-axis]
fontsize=10
[vlines]
file = ./data/tad_classification.bed
type = vlines
pyGenomeTracks --tracks 3_bw_genes_vlines.ini --region X:2,800,000-3,100,000 -o ./plot_out/3_bw_genes_vlines.png
Check where the lines fall by inspecting the BED file
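One way to inspect the BED file is to list the intervals that overlap the plotted region. The sketch below is a small standard-library reader; the helper name `bed_intervals_in_region` is hypothetical, and it makes no claim about which interval boundary pyGenomeTracks uses for each line, it simply prints both start and end so they can be compared with the figure.

```python
import os


def bed_intervals_in_region(path, chrom, start, end):
    """Return sorted (start, end) pairs of BED records overlapping chrom:start-end."""
    intervals = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # skip blank lines, comments, and track/browser header lines
            if len(fields) < 3 or fields[0].startswith("#") or fields[0] in ("track", "browser"):
                continue
            rec_start, rec_end = int(fields[1]), int(fields[2])
            if fields[0] == chrom and rec_start < end and rec_end > start:
                intervals.append((rec_start, rec_end))
    return sorted(intervals)


bed_path = "./data/tad_classification.bed"
if os.path.exists(bed_path):
    # same region as the plotting command above
    for rec_start, rec_end in bed_intervals_in_region(bed_path, "X", 2_800_000, 3_100_000):
        print(f"X\t{rec_start}\t{rec_end}")
```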
Overlaying multiple tracks (saved here as 4_bw_alpha.ini)
[test bigwig]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
# track color
color = blue
height = 7
title = No alpha: (bigwig color=blue 2000 bins) overlaid with (bigwig color = (0.6, 0, 0) max over 300 bins) overlaid with (bigwig mean color = green 200 bins)
# split the region into 2000 bins; the mean of each bin is computed and plotted
number_of_bins = 2000
min_value = 0
max_value = 30
[test bigwig max]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
# track color as an RGB tuple
color = (0.6, 0, 0)
# summarize each bin by its maximum value
summary_method = max
# split the region into 300 bins
number_of_bins = 300
# overlay on the previous track and share its y-axis
overlay_previous = share-y
[test bigwig mean]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = green
# draw the track as a filled area
type = fill
number_of_bins = 200
overlay_previous = share-y
[spacer]
[test bigwig]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = blue
height = 7
title = alpha (bigwig color = blue 2000 bins) overlaid with (bigwig color = (0.6, 0, 0) alpha = 0.5 max over 300 bins) overlaid with (bigwig mean color = green alpha = 0.5 200 bins)
number_of_bins = 2000
min_value = 0
max_value = 30
[test bigwig max]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = (0.6, 0, 0)
# track transparency
alpha = 0.5
summary_method = max
number_of_bins = 300
overlay_previous = share-y
[test bigwig mean]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = green
alpha = 0.5
type = fill
number_of_bins = 200
overlay_previous = share-y
[spacer]
[test bigwig]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
height = 7
title = alpha for lines/points: (bigwig color=(0.6, 0, 0) alpha = 0.5 max) overlaid with (bigwig mean color = green alpha = 0.5 line:2) overlaid with (bigwig min color = blue alpha = 0.5 points:3)
color = (0.6, 0, 0)
alpha = 0.5
summary_method = max
number_of_bins = 300
min_value = 0
max_value = 30
[test bigwig mean]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = green
# draw the track as a line of width 2
type = line:2
alpha = 0.5
summary_method = mean
number_of_bins = 300
overlay_previous = share-y
[test bigwig min]
file = ./data/bigwig2_X_2.5e6_3.5e6.bw
color = blue
summary_method = min
number_of_bins = 1000
# draw the track as points of size 3
type = points:3
alpha = 0.5
overlay_previous = share-y
[x-axis]
pyGenomeTracks --tracks 4_bw_alpha.ini --region X:2700000-3100000 --trackLabelFraction 0.2 --dpi 130 -o ./plot_out/4_master_alpha.png
--trackLabelFraction ## fraction of the image width reserved for track labels; the default is 0.05
--dpi ## resolution of the output image in DPI; the default is 72
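The overlay example depends on number_of_bins and summary_method, so it is worth seeing what they mean numerically. The sketch below is a conceptual illustration in plain Python with made-up signal values; it is not how pyGenomeTracks reads bigWig files internally. It splits a list of per-position values into equal bins and reduces each bin with mean, max, or min.

```python
def summarize_bins(values, number_of_bins, summary_method="mean"):
    """Split `values` into `number_of_bins` chunks and summarize each chunk."""
    methods = {
        "mean": lambda chunk: sum(chunk) / len(chunk),
        "max": max,
        "min": min,
    }
    summarize = methods[summary_method]
    total = len(values)
    summary = []
    for index in range(number_of_bins):
        low = index * total // number_of_bins
        high = (index + 1) * total // number_of_bins
        chunk = values[low:high]
        # a bin can be empty when there are more bins than values
        summary.append(summarize(chunk) if chunk else None)
    return summary


signal = [1, 5, 2, 8, 3, 3]  # made-up coverage values
for method in ("mean", "max", "min"):
    print(method, summarize_bins(signal, 3, method))
```

With few bins and summary_method = max, narrow spikes survive the smoothing, while mean flattens them; this is why the three overlaid tracks in the example look different even though they read the same file.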
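With the basic configuration rules covered, let's work through a real example: visualizing ChIP-seq results.
## Create a file named 8_narrowpeaks.ini with the following content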
[bigwig file test]
file = /home/data/t020559/chip_seq/GSE205035_PRJNA843319/e_mkdup/test_H3K27ac_mkdup.bw
height = 4
title = test_bw
min_value = 0
max_value = 40
color = red
[spacer]
[narrow]
file = /home/data/t020559/chip_seq/GSE205035_PRJNA843319/f_narrowPeak/test_H3K27ac_peaks.narrowPeak
height = 4
max_value = 40
color = blue
use_summit = false
title = test_narrowpeak
show_labels = false
[spacer]
[genes]
file = /home/data/t020559/ref/homo/gencode/gencode.v45.annotation.gtf
height = 3
title = genes
fontsize = 10
file_type = gtf
[x-axis]
fontsize = 10
pyGenomeTracks --tracks 8_narrowpeaks.ini --region chr22:23700000-24300000 --trackLabelFraction 0.2 --dpi 130 -o ./plot_out/8_narrowpeaks.png
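The use_summit option in the [narrow] section refers to the summit column of the narrowPeak file, so it helps to know how that file is laid out. The sketch below parses one line of the standard 10-column narrowPeak format and derives the absolute summit position. The function name `parse_narrowpeak_line` is hypothetical and the sample line is invented for illustration; it is not taken from the dataset used above.

```python
NARROWPEAK_COLUMNS = (
    "chrom", "start", "end", "name", "score",
    "strand", "signalValue", "pValue", "qValue", "peak",
)


def parse_narrowpeak_line(line):
    """Parse one tab-separated narrowPeak line into a dict with typed values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(NARROWPEAK_COLUMNS):
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    record = dict(zip(NARROWPEAK_COLUMNS, fields))
    for key in ("start", "end", "score", "peak"):
        record[key] = int(record[key])
    for key in ("signalValue", "pValue", "qValue"):
        record[key] = float(record[key])
    # "peak" is the summit offset from "start"; -1 means no summit was called
    record["summit"] = record["start"] + record["peak"] if record["peak"] >= 0 else None
    return record


# Invented example line, for illustration only.
sample = "chr22\t23750000\t23750400\texample_peak_1\t250\t.\t12.5\t30.1\t27.9\t180"
peak = parse_narrowpeak_line(sample)
print(peak["chrom"], peak["start"], peak["end"], "summit at", peak["summit"])
```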
For more usage examples, see: https://pygenometracks.readthedocs.io/en/latest/content/examples.html#examples-with-peaks