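I have data from an activity monitor, and I need to calculate how long each stretch of sleep and wakefulness lasts. For every 5-minute time point, I have labelled whether the subject was asleep or awake during that interval. I don't know how to calculate how long the animal stayed in one state before the "state" column switches to the other (for example, if there are two consecutive "awake" rows and the state then switches back to "sleep", I want R to tell me that the subject was awake during 8:00-8:05 and 8:05-8:10).
Ultimately, I want to be able to say what the average duration of wakefulness and of sleep was over a whole day.
Thanks very much! Here is an example of my df.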
timeofday<-c("8:00","8:05","8:10","8:20","8:25")
activity<-c(1250,1650,200,100,40)
state<-c("awake","awake","sleep","sleep","sleep")
df <- data.frame(timeofday, state, activity)
Posted on 2021-10-15 04:56:13
I'm not sure what kind of output you are looking for, but you could try this approach:
library(dplyr)
library(lubridate)
df %>%
mutate(timeperiod = hm(timeofday),
group = data.table::rleid(state)) %>%
group_by(group) %>%
summarise(timeofday = paste(first(timeofday), last(timeofday), sep = '-'),
state = first(state),
timeperiod = last(timeperiod) - first(timeperiod))
# group timeofday state timeperiod
# <int> <chr> <chr> <Period>
#1 1 8:00-8:05 awake 5M 0S
#2 2 8:10-8:25 sleep 15M 0S
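The question also asks for the average length of wake and sleep bouts, which the summary above does not compute. The following is a minimal base R sketch of one way to get it, assuming that every row stands for one full 5-minute epoch (the variable names `epoch_mins`, `bouts` and `mean_bout` are made up for this illustration). Note that counting epochs gives 10 minutes for the awake bout, whereas `last(timeperiod) - first(timeperiod)` above gives 5 minutes, because the latter measures only the distance between the first and last start times.

```r
# Sample data from the question
df <- data.frame(
  timeofday = c("8:00", "8:05", "8:10", "8:20", "8:25"),
  state     = c("awake", "awake", "sleep", "sleep", "sleep"),
  activity  = c(1250, 1650, 200, 100, 40),
  stringsAsFactors = FALSE
)

# Assumption: every row covers exactly one 5-minute epoch
epoch_mins <- 5

# Run-length encode the state column: one entry per uninterrupted bout
runs <- rle(df$state)

# One row per bout, with its duration in minutes
bouts <- data.frame(
  state    = runs$values,
  duration = runs$lengths * epoch_mins,
  stringsAsFactors = FALSE
)

# Mean bout duration (minutes) for each state
mean_bout <- tapply(bouts$duration, bouts$state, mean)
mean_bout
```

With a full day of data the same three steps apply unchanged; only `epoch_mins` would need adjusting if the recording interval differs.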
Posted on 2021-10-15 06:52:33
Here is a data.table approach:
library(data.table)
mydata <- as.data.table(df)  # df is the data frame from the question
mydata[, timeofday := as.ITime(timeofday)]
# create end times, or else you will get gaps
mydata[, timeofday_2 := shift(timeofday, type = "lead")]
mydata[is.na(timeofday_2), timeofday_2 := timeofday]
# summarise
mydata[, .(state = state[1],
from = min(timeofday),
to = max(timeofday_2)),
by = .(period = rleid(state))]
# period state from to
# 1: 1 awake 08:00:00 08:10:00
# 2: 2 sleep 08:10:00 08:25:00
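The same idea (treat the next row's start time as the current row's end time, then collapse runs of identical states) can be sketched without data.table. This is only an illustration under the same convention as above, where the last row is closed at its own start time; the names `start_min`, `end_min`, `run_id` and `bouts` are hypothetical and not part of the original answer.

```r
# Sample data from the question
df <- data.frame(
  timeofday = c("8:00", "8:05", "8:10", "8:20", "8:25"),
  state     = c("awake", "awake", "sleep", "sleep", "sleep"),
  stringsAsFactors = FALSE
)

# Convert "H:MM" strings to minutes since midnight
parts <- strsplit(df$timeofday, ":", fixed = TRUE)
start_min <- vapply(
  parts,
  function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
  numeric(1)
)

# End of each row = start of the next row; the last row has no successor,
# so it is closed at its own start time
end_min <- c(start_min[-1], start_min[length(start_min)])

# Run id: increases by one whenever the state changes
run_id <- cumsum(c(TRUE, df$state[-1] != df$state[-nrow(df)]))

# One row per bout: state, start, end and duration in minutes
bouts <- data.frame(
  state    = as.vector(tapply(df$state, run_id, function(s) s[1])),
  from_min = as.vector(tapply(start_min, run_id, min)),
  to_min   = as.vector(tapply(end_min, run_id, max)),
  stringsAsFactors = FALSE
)
bouts$duration <- bouts$to_min - bouts$from_min

# Mean bout duration (minutes) for each state
mean_by_state <- tapply(bouts$duration, bouts$state, mean)
mean_by_state
```

Because durations come from the timestamps rather than from a row count, gaps in the recording (such as 8:10 to 8:20 in the sample) are absorbed into the bout that precedes them.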
Posted on 2021-10-15 07:27:29
Base R solution:
# Coerce the time vector to appropriate type: timeofday => POSIXlt vector
df$timeofday <- strptime(
df$timeofday,
"%H:%M"
)
# Numericise the state of the animal: state_no => integer vector
df$state_no <- as.integer(
factor(
df$state,
levels = c("awake", "rest", "sleep"),
ordered = TRUE
)
)
# Group the patient state: grp => integer vector
df$grp <- with(
df[order(df$timeofday), ],
cumsum(c(0, abs(diff(state_no)) > 0))
)
# Split-apply-combine to calculate the start and end times of each
# event: res => data.frame
res <- data.frame(do.call(rbind, lapply(with(df, split(df, grp)), function(x){
data.frame(
event = unique(x$grp),
state = unique(x$state),
start_time = format(min(x$timeofday, na.rm = TRUE), "%H:%M"),
end_time = format(max(x$timeofday, na.rm = TRUE), "%H:%M"),
duration_in_approx_5_mins = difftime(
max(x$timeofday, na.rm = TRUE),
min(x$timeofday, na.rm = TRUE),
units = "mins"
)
)
}
)
),
stringsAsFactors = FALSE,
row.names = NULL
)
# Print the result: data.frame => stdout(console)
res
https://stackoverflow.com/questions/69580104