首页
学习
活动
专区
圈层
工具
发布
社区首页 >专栏 >R tips:使用最近邻算法进行空间浸润带的计算

R tips:使用最近邻算法进行空间浸润带的计算

作者头像
生信菜鸟团
发布2025-02-18 14:46:51
发布2025-02-18 14:46:51
3970
举报
文章被收录于专栏:生信菜鸟团生信菜鸟团

本文使用最近邻算法进行浸润带的计算。

空间组学中,有的时候需要对免疫浸润带进行特定距离的划分,形成一层一层的浸润区域。

以10X的官方xenium示例数据为例https://www.10xgenomics.com/datasets/ffpe-human-breast-with-pre-designed-panel-1-standard。

圈选ROI并计算浸润边界

下载的数据使用Xenium explorer打开,然后找到需要进行计算浸润带的位置,并根据方向将相应的全部选中。

如下图所示,假设中间的位置是需要进行浸润带计算的位置,而需要计算浸润带的方向是向下,则在Xenium explorer中选择套索工具仔细的圈画浸润边界,并将浸润带计算方向上的所有细胞选中。

然后在Xenium explorer中将图示的ROI区域的边界坐标下载下来。

由于Xenium explorer无法圈画单条线,至少也是一个闭合区域,因此需要先计算好浸润边界的位置:

  1. library(tidyverse)
  2. library(sp)
  3. # 读取ROI的边界坐标及细胞的质心坐标
  4. tumor_boundary <- read_csv("data/coordinates.csv", skip = 2)
  5. cells_position <- arrow::read_parquet("Xenium_V1_FFPE_Human_Breast_IDC/cells.parquet")
  6. # 然后根据ROI边界坐标,将细胞分配到ROI中
  7. cell_idx <-
  8. sp::point.in.polygon(
  9. cells_position$x_centroid, # point
  10. cells_position$y_centroid, # point
  11. tumor_boundary$X, # polygon
  12. tumor_boundary$Y
  13. ) %>%
  14. as.logical # only 0 is FALSE, all others is TRUE
  15. # 于是tumor_area_1即是ROI中的细胞,图示中的所有细胞
  16. # tumor_area_2是剩余细胞,也就是图示中的上方未被框选中的细胞
  17. tumor_area_1 <- cells_position[cell_idx, ] %>% mutate(x = x_centroid, y = y_centroid )
  18. tumor_area_2 <- cells_position[!cell_idx, ] %>% mutate(x = x_centroid, y = y_centroid )

获得了浸润边界的两组细胞之后,就可以进行浸润边界的计算:

  1. # 根据tumor_area_1和tumor_area_2找到绘制的tumor boundary
  2. # 寻找1和2中的最近邻点,以20um(大概一个细胞,根据效果调整)为限度,
  3. # 如果没有nn点则不是边界点。
  4. # 找到1和2中的边界点,取均值即是最终的边界点
  5. # dat: tumor_area_1
  6. # query: tumor_area_2
  7. nn_left_right <-
  8. RANN::nn2(
  9. tumor_area_1[2:3],
  10. tumor_area_2[2:3],
  11. k = 10,
  12. searchtype = 'radius',
  13. radius = 20 # 若边界未能找到,则调整radius的值
  14. )
  15. non_boundary_idx <- apply(nn_left_right$nn.idx == 0, 1, all)
  16. if(sum(non_boundary_idx) == nrow(tumor_area_2)){
  17. print("radius is small, no boundary point.")
  18. }
  19. boundary_idx_1 <- nn_left_right$nn.idx[! non_boundary_idx, 1]
  20. boundary_1 <- tumor_area_1[boundary_idx_1, ]
  21. boundary_idx_2 <- ! non_boundary_idx
  22. boundary_2 <- tumor_area_2[boundary_idx_2, ]
  23. # comuted boudary coords
  24. boundary_coords_df <-
  25. data.frame(
  26. x = (boundary_1x_centroid + boundary_2x_centroid) / 2,
  27. y = (boundary_1y_centroid + boundary_2y_centroid) / 2
  28. )

绘图展示,计算的圈画的浸润边界位置:

  1. # 只展示浸润边界附近区域的细胞
  2. boundary_df_for_plot <-
  3. boundary_coords_df %>%
  4. .[chull(.), ] %>%
  5. map(range) %>%
  6. map(~ .x + c(-500, 500))
  7. # 定义用于筛选细胞的函数
  8. filter_points <- function(dat, boundary){
  9. x_idx <- datx_centroid < boundary
  10. y_idx <- daty_centroid < boundary
  11. dat[x_idx & y_idx, ]
  12. }
  13. # 绘图
  14. ggplot() +
  15. geom_point(
  16. # left area
  17. data = tumor_area_1 %>% dplyr::slice_sample(prop = 0.1) %>% filter_points(boundary_df_for_plot), # 仅绘制10%的点,加快绘制速度
  18. aes(x = x_centroid, y = -y_centroid),
  19. color = "gray"
  20. ) +
  21. geom_point(
  22. # right area
  23. data = tumor_area_2 %>% dplyr::slice_sample(prop = 0.1) %>% filter_points(boundary_df_for_plot), # 仅绘制10%的点,加快绘制速度
  24. aes(x = x_centroid, y = -y_centroid),
  25. color = "gray50"
  26. ) +
  27. geom_point(
  28. # comute left_boundary
  29. data = boundary_1,
  30. aes(x = x, y = -y),
  31. color = "red",
  32. size = 0.1
  33. ) +
  34. geom_point(
  35. # compute right_boundary
  36. data = boundary_2,
  37. aes(x = x, y = -y),
  38. color = "blue",
  39. size = 0.1
  40. ) +
  41. geom_point(
  42. # the final tumor boundary
  43. data = boundary_coords_df,
  44. aes(x = x, y = -y),
  45. color = "green",
  46. size = 0.1
  47. ) +
  48. theme_void()

如下图所示,计算的浸润边界是绿色点,用于计算浸润边界的上下边界配对点是红蓝色点。

使用最近邻算法往下寻找浸润区域

假设需要以250um为单位,分别找到250um 500um及750um的浸润区域,则可如下操作:

先定义一个最近邻的工具函数:

  1. # reduceFindNN find all (k) nn points in a certain radius,
  2. # k would be increased when k is not enough for a certain radius
  3. reduceFindNN <- function(dat, k = 10, query = NULL, searchtype = "radius", radius = 100, mpp = 1, max_k = 2000){
  4. if(is.null(query)){
  5. query <- dat
  6. }
  7. stopifnot(all(c('x', 'y') %in% colnames(dat)))
  8. stopifnot(all(c('x', 'y') %in% colnames(query)))
  9. dat <- dat[c('x', 'y')]
  10. query <- query[c('x', 'y')]
  11. nn_dat <- RANN::nn2(
  12. dat,
  13. query = query,
  14. k = k,
  15. radius = radius / mpp # radius, um -> pixel
  16. )
  17. k_enough <- sum(apply(nn_dat
  18. cat(str_glue("k is {k} and is enough for {k_enough * 100}% cells.\n\n"))
  19. while(k < max_k && k_enough < 0.99){
  20. if(k < 700){
  21. k <- k + 100
  22. }else if(k < 2000){
  23. k <- k + 200
  24. }else if(k < 5000){
  25. k <- k + 500
  26. }else{
  27. k <- k + 1000
  28. }
  29. nn_dat <- RANN::nn2(
  30. dat,
  31. query = query,
  32. searchtype = searchtype,
  33. k = k,
  34. radius = radius / mpp # radius, um -> pixel
  35. )
  36. k_enough <- sum(apply(nn_dat
  37. cat(str_glue("k is {k} and is enough for {k_enough * 100}% cells.\n\n"))
  38. }
  39. return(nn_dat)
  40. }

浸润带的计算:

  1. radius_ls <- c(250, 500, 750) %>% set_names(., .)
  2. coords_tumor_infiltration_ls <-
  3. radius_ls %>%
  4. map(function(radius){
  5. dat_nn <-
  6. reduceFindNN(
  7. dat = tumor_area_1, # %>% rbind(tumor_boundary_coords) %>% distinct(),
  8. query = boundary_coords_df, # tumor_boundary_coords,
  9. radius = radius,
  10. mpp = 1,
  11. k = 100,
  12. max_k = nrow(tumor_area_1) # 2000
  13. )
  14. # all nn points is infiltration points
  15. points_idx <- dat_nn$nn.idx %>% c() %>% .[. != 0] %>% unique
  16. # infiltration area
  17. coords <- tumor_area_1[points_idx, ]
  18. coords
  19. })

750um浸润带减去500um浸润带就是500-750um浸润带,以此类推:

  1. # 当前浸润带减去上一层的浸润带
  2. coords_tumor_infiltration_ls2 <- list()
  3. for(i in rev(seq_along(coords_tumor_infiltration_ls))){
  4. name <- names(coords_tumor_infiltration_ls)[i]
  5. if(i == 1){
  6. coords_tumor_infiltration_ls2[[name]] <- coords_tumor_infiltration_ls[[1]]
  7. return()
  8. }
  9. present_df <- coords_tumor_infiltration_ls[[i]]
  10. next_df <- coords_tumor_infiltration_ls[[i-1]]
  11. coords_tumor_infiltration_ls2[[name]] <- present_df[!present_df
  12. }
  13. # 获得浸润带 ----------
  14. coords_tumor_infiltration_df <-
  15. coords_tumor_infiltration_ls2 %>%
  16. enframe(name = "radius") %>%
  17. unnest(value)

获得浸润区域如下:

  1. p_infile_full <-
  2. cells_position %>%
  3. ggplot(aes(x = x_centroid , y = -y_centroid)) +
  4. geom_point() +
  5. geom_point(
  6. data = boundary_coords_df,
  7. aes(x = x, y = -y),
  8. color = "red",
  9. size = 0.5
  10. ) +
  11. geom_point(
  12. data = coords_tumor_infiltration_df,
  13. aes(x = x, y = -y, color = radius),
  14. # color = "blue",
  15. alpha = 0.5,
  16. size = 0.5
  17. ) +
  18. theme_void()
  19. p_infile_subset <-
  20. cells_position %>%
  21. dplyr::slice_sample(prop = 0.1) %>%
  22. filter_points(boundary_df_for_plot) %>%
  23. ggplot(aes(x = x_centroid , y = -y_centroid)) +
  24. geom_point() +
  25. geom_point(
  26. data = boundary_coords_df,
  27. aes(x = x, y = -y),
  28. color = "red",
  29. size = 0.5
  30. ) +
  31. geom_point(
  32. data = coords_tumor_infiltration_df,
  33. aes(x = x, y = -y, color = radius),
  34. # color = "blue",
  35. alpha = 0.5,
  36. size = 0.5
  37. ) +
  38. theme_void()

如下图所示,浸润边界的浸润带展示:

全图展示的浸润带:

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2025-02-15,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 生信菜鸟团 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 圈选ROI并计算浸润边界
  • 使用最近邻算法往下寻找浸润区域
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档