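Sub-clustering single-cell data is not fundamentally different from the first round of clustering: use subset(seu.obj, idents = "DC") to extract the Seurat object for one cell population, then rerun the standard dimensionality reduction and clustering workflow on it.
In Seurat's FindClusters function, a larger resolution value gives more clusters. With very large numbers of cells, however, you can reach a point where raising resolution further no longer increases the number of clusters. When a single round of clustering cannot separate the populations, a second round (sub-clustering) is needed.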
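To see how resolution affects the number of clusters on a given object, one option is to sweep a few values and count the clusters each time. The following is a minimal sketch, not part of the original workflow: count_clusters is a hypothetical helper name, the resolution values are arbitrary, and it assumes the object already has a neighbor graph from FindNeighbors().

```r
library(Seurat)

# Hypothetical helper (not from the original post): rerun FindClusters at several
# resolutions and report how many clusters each one yields.
# Assumes `obj` already has a neighbor graph from FindNeighbors().
count_clusters <- function(obj, resolutions = c(0.2, 0.5, 1, 2)) {
  n_clusters <- sapply(resolutions, function(r) {
    clustered <- FindClusters(obj, resolution = r, verbose = FALSE)
    length(unique(clustered$seurat_clusters))
  })
  data.frame(resolution = resolutions, n_clusters = n_clusters)
}

# Usage, assuming seu.obj has been loaded and FindNeighbors() was run on it:
# count_clusters(seu.obj)
```

If the cluster count stops growing as resolution increases, that is the situation described above where sub-clustering a single population becomes the more practical route.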
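rm(list = ls())
library(Seurat)
library(dplyr)
# Load the previously saved Seurat object (seu.obj)
load("../24.5.23scGSVA/seu.obj.Rdata")
# UMAP of the full object, labelled by the current identities
p1 = DimPlot(seu.obj, reduction = "umap", label = TRUE) + NoLegend()
p1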
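Here we use dendritic cells (DC) as the example for sub-clustering.
The key step is extracting the cells of the population of interest; everything after that is the standard workflow and visualization, with no differences.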
# Keep only the cells whose current identity is "DC"
sub.cells <- subset(seu.obj, idents = "DC")
# Cache the result: rerun the pipeline only if the file does not exist yet
f = "obj.Rdata"
if(!file.exists(f)){
  sub.cells = sub.cells %>%
    NormalizeData() %>%
    FindVariableFeatures() %>%
    ScaleData(features = rownames(.)) %>%
    RunPCA(features = VariableFeatures(.)) %>%
    FindNeighbors(dims = 1:15) %>%
    FindClusters(resolution = 0.5) %>%
    RunUMAP(dims = 1:15)
  save(sub.cells,file = f)
}
load(f)
DimPlot(sub.cells, reduction = 'umap', label = TRUE) + NoLegend()
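# Find positive marker genes for each DC sub-cluster
sub.cells.markers <- FindAllMarkers(sub.cells, only.pos = TRUE,
                                    min.pct = 0.25, logfc.threshold = 0.25)
# Top 10 genes per cluster by avg_log2FC, so top10 holds 10 genes for each cluster
top10 <- sub.cells.markers %>%
  group_by(cluster) %>%
  top_n(n = 10, wt = avg_log2FC) %>%
  pull(gene);top10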
## [1] "JCHAIN" "IGKC" "MZB1" "PACSIN1" "WNT10A" "MAP1A" "VASH2"
## [8] "NIBAN3" "SMPD3" "TNFRSF4" "LYZ" "TIMP1" "GPAT3" "ITGAX"
## [15] "SAMSN1" "OLR1" "FPR3" "EREG" "FCN1" "AKAP12"
VlnPlot(sub.cells, features = top10)
RidgePlot(sub.cells, features = top10)
FeaturePlot(sub.cells, features = top10)
DotPlot(sub.cells,features = top10)+ RotatedAxis()
DoHeatmap(sub.cells, features = top10) + NoLegend()
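The UMAP above shows the population of interest on its own; the sub-cluster labels can also be written back into the original Seurat object.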
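# Name the DC sub-clusters M0, M1, ... from their cluster numbers
sub.cells@meta.data$celltype = paste0("M",sub.cells$seurat_clusters)
# Start from the current identities of the full object
seu.obj$celltype = as.character(Idents(seu.obj))
# For DC cells, look up the sub-cluster label by cell name; keep all other labels unchanged
seu.obj$celltype = ifelse(seu.obj$celltype=="DC",
                          sub.cells$celltype[match(colnames(seu.obj),colnames(sub.cells))],
                          seu.obj$celltype)
Idents(seu.obj) = seu.obj$celltype
# Side by side: original annotation (p1) and annotation with DC sub-clusters (p2)
p2 = DimPlot(seu.obj, label = TRUE) + NoLegend()
p1+p2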
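The write-back step above hinges on match() lining up cells by name, since the subset object does not necessarily keep the same cell order as the full object. Below is a small base-R sketch of that lookup on made-up data; the vectors all_cells, all_labels, sub_cells and sub_labels are hypothetical stand-ins for colnames(seu.obj), Idents(seu.obj), colnames(sub.cells) and sub.cells$celltype.

```r
# Toy stand-ins for the full object: cell names and their current labels (made-up values)
all_cells  <- c("c1", "c2", "c3", "c4", "c5")
all_labels <- c("T", "DC", "B", "DC", "T")

# Toy stand-ins for the sub-clustered DC object; note the different cell order
sub_cells  <- c("c4", "c2")
sub_labels <- c("M1", "M0")

# For each cell in the full object, its position in the subset (NA if not a DC cell)
idx <- match(all_cells, sub_cells)

# Use the sub-cluster label for DC cells, keep the original label otherwise
merged <- ifelse(all_labels == "DC", sub_labels[idx], all_labels)
merged
```

Cells outside the subset get NA from match(), but ifelse() only uses the looked-up value where the label is "DC", so those NAs never reach the result.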