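Today I will walk through how to use annotations in PyComplexHeatmap (https://github.com/DingWB/PyComplexHeatmap), that is, how to add row and column annotations to a heatmap in Python. Typical annotations describe the samples, for example disease status (tumor or normal), age, sex, or subtype.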
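# Import the example data
import os, pickle
import matplotlib.pyplot as plt
import PyComplexHeatmap
from PyComplexHeatmap import HeatmapAnnotation, ClusterMapPlotter, anno_label, anno_simple
with open(os.path.join(os.path.dirname(PyComplexHeatmap.__file__), "../data/mammal_array.pkl"), 'rb') as f:
    data = pickle.load(f)
df, df_rows, df_cols, col_colors_dict = data  # matrix, row metadata, column metadata, colors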
| GSM4412025 | GSM4412026 | GSM4412027 | GSM4412028 | GSM4412029 | GSM4412030 | GSM4412031 | GSM4412032 | GSM4412033 | GSM4412034 | ... | GSM4997945 | GSM4997946 | GSM4997947 | GSM4997948 | GSM4997949 | GSM4997950 | GSM4997951 | GSM4997952 | GSM4997953 | GSM4997954 |
sheep | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | ... | 0.435033 | 0.432900 | 0.446626 | 0.449123 | 0.497180 | 0.515918 | 0.483706 | 0.504681 | 0.529076 | 0.446443 |
beluga whale | 0.687488 | 0.694207 | 0.706525 | 0.702734 | 0.687014 | 0.704003 | 0.705887 | 0.693806 | 0.719417 | 0.712677 | ... | 0.560381 | 0.571552 | 0.610392 | 0.613619 | 0.675832 | 0.668502 | 0.624820 | 0.658377 | 0.702334 | 0.575034 |
house mouse | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
vaquita | 0.693523 | 0.702525 | 0.716792 | 0.725095 | 0.711261 | 0.708651 | 0.717952 | 0.705486 | 0.720915 | 0.724171 | ... | 0.581862 | 0.594443 | 0.628908 | 0.639457 | 0.680801 | 0.707493 | 0.662338 | 0.665142 | 0.751859 | 0.584952 |
large flying fox | 0.286822 | 0.269406 | 0.296796 | 0.314719 | 0.305074 | 0.308419 | 0.268421 | 0.297236 | 0.295019 | 0.297973 | ... | 0.281684 | 0.363926 | 0.367216 | 0.359392 | 0.289821 | 0.257351 | 0.234301 | 0.295840 | 0.249336 | 0.344848 |
greater horseshoe bat | 0.530606 | 0.517989 | 0.520719 | 0.525855 | 0.507583 | 0.511993 | 0.528107 | 0.521563 | 0.542159 | 0.509526 | ... | 0.735151 | 0.714805 | 0.708868 | 0.738207 | 0.823604 | 0.828040 | 0.823382 | 0.796882 | 0.878783 | 0.693625 |
little brown bat | 0.202525 | 0.215990 | 0.238977 | 0.241872 | 0.267750 | 0.236729 | 0.202384 | 0.240519 | 0.251148 | 0.267257 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
7 rows × 883 columns
| PredictedTaxid | PredictedSpecies | common_names | Family |
sheep | 9940.0 | ovis_aries_rambouillet | sheep | Bovidae |
beluga whale | 9749.0 | delphinapterus_leucas | beluga whale | Monodontidae |
house mouse | 10090.0 | mus_musculus | house mouse | Muridae |
vaquita | 42100.0 | phocoena_sinus | vaquita | Phocoenidae |
large flying fox | 132908.0 | pteropus_vampyrus | large flying fox | Pteropodidae |
greater horseshoe bat | 59479.0 | rhinolophus_ferrumequinum | greater horseshoe bat | Rhinolophidae |
little brown bat | 59463.0 | myotis_lucifugus | little brown bat | Vespertilionidae |
| GSE | Basename | NCBI_scientific_name | taxid | Tissue | Sex | Family | Order | Species | SuccessRate | common_names |
GSM4412025 | GSE147003 | GSM4412025 | Ovis aries | 9940 | Blood | Female | Bovidae | Artiodactyla | Ovis aries | 0.765818 | sheep |
GSM4412026 | GSE147003 | GSM4412026 | Ovis aries | 9940 | Blood | Female | Bovidae | Artiodactyla | Ovis aries | 0.797669 | sheep |
GSM4412027 | GSE147003 | GSM4412027 | Ovis aries | 9940 | Blood | Female | Bovidae | Artiodactyla | Ovis aries | 0.759256 | sheep |
GSM4412028 | GSE147003 | GSM4412028 | Ovis aries | 9940 | Blood | Male | Bovidae | Artiodactyla | Ovis aries | 0.749813 | sheep |
GSM4412029 | GSE147003 | GSM4412029 | Ovis aries | 9940 | Blood | Female | Bovidae | Artiodactyla | Ovis aries | 0.770433 | sheep |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
GSM4997950 | GSE164127 | GSM4997950 | Eptesicus fuscus | 29078 | Skin | Male | Vespertilionidae | Chiroptera | Eptesicus fuscus | 0.660425 | big brown bat |
GSM4997951 | GSE164127 | GSM4997951 | Eptesicus fuscus | 29078 | Skin | Male | Vespertilionidae | Chiroptera | Eptesicus fuscus | 0.652822 | big brown bat |
GSM4997952 | GSE164127 | GSM4997952 | Eptesicus fuscus | 29078 | Skin | Female | Vespertilionidae | Chiroptera | Eptesicus fuscus | 0.664746 | big brown bat |
GSM4997953 | GSE164127 | GSM4997953 | Eptesicus fuscus | 29078 | Skin | Male | Vespertilionidae | Chiroptera | Eptesicus fuscus | 0.650848 | big brown bat |
GSM4997954 | GSE164127 | GSM4997954 | Eptesicus fuscus | 29078 | Skin | Male | Vespertilionidae | Chiroptera | Eptesicus fuscus | 0.657170 | big brown bat |
883 rows × 11 columns
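Because the annotations are built from df_cols while the heatmap is drawn from df, it can be worth confirming that both tables refer to the same sample IDs before plotting. The following is a minimal sketch of such a check. check_alignment is a hypothetical helper, not part of PyComplexHeatmap, and the toy lists reuse a few sample IDs from the tables above.

```python
def check_alignment(data_ids, anno_ids):
    """Return (missing, extra).

    missing: IDs in the data that have no annotation entry
    extra:   IDs in the annotation table that are not in the data
    """
    data_ids, anno_ids = list(data_ids), list(anno_ids)
    data_set, anno_set = set(data_ids), set(anno_ids)
    missing = [i for i in data_ids if i not in anno_set]
    extra = [i for i in anno_ids if i not in data_set]
    return missing, extra


# Toy example (hypothetical mismatch) using sample IDs from the tables above.
data_columns = ["GSM4412025", "GSM4412026", "GSM4412027"]
annotation_index = ["GSM4412025", "GSM4412027", "GSM4997954"]
missing, extra = check_alignment(data_columns, annotation_index)
print(missing)  # ['GSM4412026']
print(extra)    # ['GSM4997954']
```

With the real data the call would be check_alignment(df.columns, df_cols.index), and both lists should come back empty.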
# Put annotations on the top
# The label track is listed first so that it sits farthest from the heatmap.
col_ha = HeatmapAnnotation(label=anno_label(df_cols.Family, merge=True, rotation=45),
                           Family=anno_simple(df_cols.Family, legend=True),
                           Tissue=df_cols.Tissue, label_side='right', axis=1)
plt.figure(figsize=(7, 4))
cm = ClusterMapPlotter(data=df, top_annotation=col_ha,
                       show_rownames=True, show_colnames=False, row_names_side='left',
                       col_split=df_cols.Family, cmap='exp1', label='AUC',
                       rasterized=True, legend=True)
# plt.savefig("clustermap.pdf", bbox_inches='tight')
plt.show()
Starting calculating row orders..
Starting calculating col orders..
Starting plotting HeatmapAnnotations
Collecting annotation legends..
Incresing ncol
Incresing ncol
More than 3 cols is not supported
Legend too long, generating a new column..
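When there are many legends, PyComplexHeatmap arranges them in order automatically. If they do not fit in one column, it adds another column automatically and draws the figure legends in two columns, as in the figure above. In addition, the curves that connect each annotation label (for example Bovidae) to the heatmap change shape and color along with the label's rotation angle and color, so their angle is adjusted automatically to match the angle of the text.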
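To make the "add another column when one is full" behavior concrete, here is a toy sketch of greedy column packing. It is not PyComplexHeatmap's actual layout code: the function name pack_legends, the legend heights, and the available height are all hypothetical values chosen for illustration.

```python
def pack_legends(legend_heights, max_height):
    """Place legends top to bottom, opening a new column whenever the
    current column cannot fit the next legend.

    legend_heights: list of (name, height) pairs, in drawing order
    max_height:     vertical space available in one column
    Returns a list of columns, each a list of legend names.
    """
    columns = [[]]
    used = 0.0
    for name, height in legend_heights:
        # Only open a new column if the current one already holds something,
        # so an oversized legend still gets a column of its own.
        if columns[-1] and used + height > max_height:
            columns.append([])
            used = 0.0
        columns[-1].append(name)
        used += height
    return columns


# Hypothetical legend heights in arbitrary units.
legends = [("Family", 60), ("Tissue", 50), ("AUC", 30)]
print(pack_legends(legends, max_height=100))
# [['Family'], ['Tissue', 'AUC']]
```

The input order is preserved, which mirrors the "arranged in order" behavior described above; only the column breaks depend on the available height.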
# Put annotations on the bottom
# The label track is listed last so that it sits farthest from the heatmap.
col_ha = HeatmapAnnotation(Tissue=anno_simple(df_cols.Tissue, height=5),
                           Family=anno_simple(df_cols.Family, legend=False, height=6),
                           label=anno_label(df_cols.Family, merge=True, rotation=-45),
                           axis=1)
plt.figure(figsize=(7, 4))
cm = ClusterMapPlotter(data=df, bottom_annotation=col_ha,
                       show_rownames=True, show_colnames=False, row_names_side='right',
                       col_split=df_cols.Family, cmap='jet', label='AUC',
                       rasterized=True, legend=True)
plt.show()
Starting calculating row orders..
Starting calculating col orders..
Starting plotting HeatmapAnnotations
Collecting annotation legends..
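To place the column annotations below the heatmap, change the order of the tracks in HeatmapAnnotation so that anno_label comes last, farthest from the heatmap. Row labels can be moved to the right with row_names_side, and the height of each annotation bar can be set with the height argument.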
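In the examples in this post, the anno_label track is listed first for top and left annotations and last for bottom and right ones, so that it ends up farthest from the heatmap. A small helper can capture that rule. This is a sketch only: order_tracks is a hypothetical function, and it assumes that tracks are drawn in the order in which they are passed (top to bottom for column annotations, left to right for row annotations).

```python
def order_tracks(tracks, side, label_key="label"):
    """Return a new dict with the label track placed farthest from the heatmap.

    tracks: dict mapping track name -> annotation object (any value works here)
    side:   'top', 'bottom', 'left' or 'right'
    """
    if side not in ("top", "bottom", "left", "right"):
        raise ValueError(f"unknown side: {side!r}")
    others = {k: v for k, v in tracks.items() if k != label_key}
    if label_key not in tracks:
        return others
    label = {label_key: tracks[label_key]}
    # For top/left the heatmap follows the last track, so the label goes first;
    # for bottom/right the heatmap precedes the first track, so it goes last.
    if side in ("top", "left"):
        return {**label, **others}
    return {**others, **label}


# Strings stand in for anno_label(...) / anno_simple(...) objects.
tracks = {"label": "anno_label", "Family": "anno_simple", "Tissue": "anno_simple"}
print(list(order_tracks(tracks, "top")))     # ['label', 'Family', 'Tissue']
print(list(order_tracks(tracks, "bottom")))  # ['Family', 'Tissue', 'label']
```

The ordered dict could then be unpacked into the constructor, for example HeatmapAnnotation(**order_tracks(tracks, "bottom"), axis=1). The helper only moves the label track and leaves the other tracks in the order given.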
# Put annotations on the left
# axis=0 marks this as a row annotation; the data is transposed so samples are rows.
row_ha = HeatmapAnnotation(label=anno_label(df_cols.Family, merge=True, rotation=45),
                           Family=anno_simple(df_cols.Family, legend=False, height=5),
                           axis=0)
plt.figure(figsize=(4, 7))
cm = ClusterMapPlotter(data=df.T, left_annotation=row_ha,
                       show_rownames=False, show_colnames=True, col_names_side='top',
                       row_split=df_cols.Family, cmap='exp1', label='AUC',
                       rasterized=True, legend=True)
plt.show()
Starting calculating row orders..
Starting calculating col orders..
Starting plotting HeatmapAnnotations
Collecting annotation legends..
# Put annotations on the right
# The label track is listed last so that it sits farthest from the heatmap.
row_ha = HeatmapAnnotation(Tissue=df_cols.Tissue,
                           Family=anno_simple(df_cols.Family, legend=False, height=5),
                           label=anno_label(df_cols.Family, merge=True, rotation=45),
                           axis=0)
plt.figure(figsize=(4, 7))
cm = ClusterMapPlotter(data=df.T, right_annotation=row_ha,
                       show_rownames=False, show_colnames=True, col_names_side='bottom',
                       row_split=df_cols.Family, cmap='jet', label='AUC',
                       rasterized=True, legend=True)
# plt.savefig("annotation.pdf", bbox_inches='tight')
plt.show()
Starting calculating row orders..
Starting calculating col orders..
Starting plotting HeatmapAnnotations
Collecting annotation legends..
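The four placements differ mainly in which ClusterMapPlotter keyword receives the annotation, which split keyword groups the samples, and whether df has to be transposed so that the annotated samples lie along the rows. The lookup below summarizes this. placement is a hypothetical helper rather than a library function, and the use of axis=0 for row annotations is an assumption that mirrors the axis=1 used for the column annotation in the first example.

```python
PLACEMENTS = {
    # side: (ClusterMapPlotter keyword, HeatmapAnnotation axis, split keyword)
    "top": ("top_annotation", 1, "col_split"),
    "bottom": ("bottom_annotation", 1, "col_split"),
    "left": ("left_annotation", 0, "row_split"),
    "right": ("right_annotation", 0, "row_split"),
}


def placement(side):
    """Describe how to wire up a sample annotation on the given side."""
    keyword, axis, split = PLACEMENTS[side]
    return {
        "plotter_keyword": keyword,
        "annotation_axis": axis,
        "split_keyword": split,
        # Samples are the columns of df, so row-side annotations need df.T.
        "transpose_data": axis == 0,
    }


print(placement("left"))
# {'plotter_keyword': 'left_annotation', 'annotation_axis': 0, 'split_keyword': 'row_split', 'transpose_data': True}
```

An unknown side raises KeyError, which is usually the desired behavior for a typo such as "botom".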