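I am trying to create a new column that shows whether the strings in two columns of my dataframe match, that is, whether the string in one column occurs inside the string in the other. This question is almost what I am asking, but rather than filtering I want to create a new column indicating whether there is a match (TRUE or FALSE).
Here is an example dataframe: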
transcript        target
he saw the dog    saw
she gave them it  gave
watch out for     danger
real bravery      brave
I would like to create a new column that shows whether there is any match between the two:
transcript        target  match
he saw the dog    saw     T
she gave them it  gave    T
watch out for     danger  F
real bravery      brave   T
I would prefer to use dplyr, but I am open to other suggestions!
Posted on 2018-08-13 11:50:24
Using stringr::str_detect, we can check whether transcript contains target:
library(stringr)
library(dplyr)
df %>%
  # Convert factor columns to character; skip this step if transcript and target are already character
  mutate_if(is.factor, as.character) %>%
  mutate(match = str_detect(transcript, target))
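As an aside on the pipeline above: str_detect interprets each target value as a regular expression, so a target containing characters such as `$` or `.` may not behave like plain text. The following is only a minimal sketch, assuming the targets are meant to be matched literally (the question does not say so). The data frame `literal_demo` and its contents are hypothetical and are not part of the original data.

```r
library(stringr)
library(dplyr)

# Hypothetical demo data: the second target contains a regex metacharacter ("$")
literal_demo <- data.frame(
  transcript = c("he saw the dog", "price is $5"),
  target     = c("saw", "$5"),
  stringsAsFactors = FALSE
)

literal_demo <- literal_demo %>%
  mutate(
    # target interpreted as a regular expression (as in the answer above)
    match_regex = str_detect(transcript, target),
    # target interpreted as a literal string via stringr's fixed() modifier
    match_fixed = str_detect(transcript, fixed(target))
  )

print(literal_demo)
```

With ordinary word targets such as those in the question, both columns should agree; the difference only shows up when a target contains regex metacharacters.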
        transcript target match
1   he saw the dog    saw  TRUE
2 she gave them it   gave  TRUE
3    watch out for danger FALSE
4     real bravery  brave  TRUE
Posted on 2018-08-13 11:59:24
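You asked for a dplyr approach, but here is a base R approach using grepl as well:
# df1 holds the example data (named df in the other answers)
df1$match <- mapply(grepl, df1$target, df1$transcript)
df1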
        transcript target match
1   he saw the dog    saw  TRUE
2 she gave them it   gave  TRUE
3    watch out for danger FALSE
4     real bravery  brave  TRUE
Using grepl inside a dplyr pipeline:
df1 %>%
  mutate(match = mapply(grepl, target, transcript))
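A note on why mapply appears in the calls above: grepl works with a single pattern per call, so mapply is used to pair each target with its own transcript. By default mapply also names its result after the first character argument (here the targets). The sketch below is one possible way to get a plain, unnamed logical vector; the helper name `has_target` and the data frame `pair_demo` are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: TRUE where pattern[i] occurs somewhere in text[i]
has_target <- function(text, pattern) {
  # mapply calls grepl(pattern[i], text[i]) for each row;
  # USE.NAMES = FALSE stops the result from being named after the patterns
  mapply(grepl, pattern, text, USE.NAMES = FALSE)
}

# Same values as the example data in the question
pair_demo <- data.frame(
  transcript = c("he saw the dog", "she gave them it",
                 "watch out for", "real bravery"),
  target     = c("saw", "gave", "danger", "brave"),
  stringsAsFactors = FALSE
)

pair_demo$match <- has_target(pair_demo$transcript, pair_demo$target)
print(pair_demo)
```

The same helper could be called inside mutate(), since it takes two equal-length character vectors and returns one logical value per row.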
        transcript target match
1   he saw the dog    saw  TRUE
2 she gave them it   gave  TRUE
3    watch out for danger FALSE
4     real bravery  brave  TRUE
Posted on 2018-08-13 12:15:29
You can create the match column using dplyr::rowwise() and grepl, as follows:
library(dplyr)
df %>%
  rowwise() %>%
  mutate(match = grepl(target, transcript)) %>%
  as.data.frame()
#         transcript target match
# 1   he saw the dog    saw  TRUE
# 2 she gave them it   gave  TRUE
# 3    watch out for danger FALSE
# 4     real bravery  brave  TRUE
Data:
df <- read.table(text =
"transcript target
'he saw the dog' saw
'she gave them it' gave
'watch out for' danger
'real bravery' brave",
header = TRUE, stringsAsFactors = FALSE)
https://stackoverflow.com/questions/51821601