I want to draw Y for each row within group g of a data.table. To illustrate, take this simple simulation:
library(data.table)
set.seed(123)
N = 50
g = 3
DT = data.table(id = rep(1:N,g),
group_id = sort(rep(1:g, N)),
p = runif(150, min = 0, max = 1)
)
DT[, Y := rbinom(n = .N, size = 1, prob = p), by = group_id]
DTs = split(DT, by = "group_id")
DTs = rbindlist(DTs[1], idcol=T)
DTs[, Y := rbinom(n = .N, size = 1, prob = p)]
head(DT)
head(DTs)
Why does Y differ between DT and DTs? I thought the two were equivalent.
> head(DT)
id group_id p Y
1: 1 1 0.2875775 1
2: 2 1 0.7883051 1
3: 3 1 0.4089769 0
4: 4 1 0.8830174 1
5: 5 1 0.9404673 1
6: 6 1 0.0455565 0
> head(DTs)
.id id group_id p Y
1: 1 1 1 0.2875775 1
2: 1 2 1 0.7883051 1
3: 1 3 1 0.4089769 1
4: 1 4 1 0.8830174 1
5: 1 5 1 0.9404673 1
6: 1 6 1 0.0455565 0
Posted on 2020-09-05 20:23:02
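Every random draw advances the state of the random number generator, so by the time the second binomial draw runs, the generator is in a different state than it was for the first. If you want identical draws, you need to set the seed immediately before each one: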
library(data.table)
set.seed(123)
N = 50
g = 3
DT = data.table(id = rep(1:N,g),
group_id = sort(rep(1:g, N)),
p = runif(150, min = 0, max = 1) # RNG changes state
)
# Specify seed for binomial draw
set.seed(456)
DT[, Y := rbinom(n = .N, size = 1, prob = p), by = group_id] # RNG changes state
DTs = split(DT, by = "group_id")
DTs = rbindlist(DTs[1], idcol=T)
# If we want the same draw, we need to set the seed to the same state
set.seed(456)
DTs[, Y := rbinom(n = .N, size = 1, prob = p)]
which gives you
head(DT)
#> id group_id p Y
#> 1: 1 1 0.2875775 0
#> 2: 2 1 0.7883051 1
#> 3: 3 1 0.4089769 1
#> 4: 4 1 0.8830174 1
#> 5: 5 1 0.9404673 1
#> 6: 6 1 0.0455565 0
head(DTs)
#> .id id group_id p Y
#> 1: 1 1 1 0.2875775 0
#> 2: 1 2 1 0.7883051 1
#> 3: 1 3 1 0.4089769 1
#> 4: 1 4 1 0.8830174 1
#> 5: 1 5 1 0.9404673 1
#> 6: 1 6 1 0.0455565 0
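
The same behaviour can be checked without data.table. The following is a minimal base R sketch, not part of the original answer: the vectors `p` and `grp` and their sizes are made up for illustration, and the `lapply(split(...))` loop only stands in for a grouped `by =` call under the assumption that groups are evaluated one after another in order of appearance.

```r
# Illustrative probabilities and group labels (hypothetical, not the question's data)
set.seed(123)
p   <- runif(6)
grp <- rep(1:2, each = 3)

# Part 1: reseeding immediately before a draw reproduces it
set.seed(456)
first  <- rbinom(n = length(p), size = 1, prob = p)
second <- rbinom(n = length(p), size = 1, prob = p)  # RNG state has advanced, so this is a new draw

set.seed(456)
again <- rbinom(n = length(p), size = 1, prob = p)   # same state as for `first`
identical(first, again)

# Part 2: a group-by-group draw consumes the RNG stream in group order,
# so the first group's values match a standalone draw on that group alone
set.seed(456)
y_grouped <- unlist(
  lapply(split(p, grp), function(pg) rbinom(n = length(pg), size = 1, prob = pg)),
  use.names = FALSE
)

set.seed(456)
y_first <- rbinom(n = 3, size = 1, prob = p[grp == 1])
identical(y_grouped[grp == 1], y_first)
```

Both `identical()` calls return `TRUE`. Only the first group is guaranteed to match in the second part: later groups are drawn from a state that already reflects the earlier groups' draws, which is why reseeding before the standalone draw on group 1 is enough to reproduce the grouped result for that group.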
https://stackoverflow.com/questions/63753773