Essentially, I have a data frame with the columns a, b, c, d, e, f, g, h.
I want to run the regressions
gls(a~h)
gls(b~h)
.
.
.
gls(g~h)
and save them in a list that holds the coefficients and residuals of each regression. I plan to extract those with the following commands:
sapply(list, coef,)
sapply(list, residuals,)
However, my dataset has thousands of columns. How can I do this with a loop in R?
Answered 2020-09-29 07:32:25
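Try this approach with a list and a loop (I have used dummy data). The gls function needs a formula, so you can build one with as.formula() from the corresponding column names. The code: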
library(nlme)
# Dummy data
df <- data.frame(a = rnorm(10, 1, 2),
                 b = rnorm(10, 1, 5),
                 c = rnorm(10, 1, 10),
                 d = rnorm(10, 1, 1),
                 e = rnorm(10, 1, 8),
                 f = rnorm(10, 2, 2),
                 g = rnorm(10, 1, 9),
                 h = rpois(10, 0.8))
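# Positions of the response columns (every column except h)
index <- which(names(df) != 'h')
# List that will hold the fitted models
List <- list()
# Loop over the response columns
for (i in index)
{
  # Response name, predictor name, and the formula response ~ h
  v1 <- names(df)[i]
  v2 <- 'h'
  v3 <- as.formula(paste0(v1, '~', v2))
  # Store the fit under the response name, so the list has no gaps
  # even when h is not the last column
  List[[v1]] <- gls(v3, data = df)
}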
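The question also asks how to pull the coefficients and residuals out of the list afterwards. The following is a minimal sketch of that step, not part of the original answer. It assumes a small made-up data frame with fixed h values, and the names `fits`, `coefs` and `resids` are hypothetical. It builds each formula with as.formula() as above, then uses sapply() to collect the results into matrices.

```r
library(nlme)

# Made-up data for illustration only; h is fixed so the fit is deterministic
set.seed(1)
df <- data.frame(a = rnorm(10, 1, 2),
                 b = rnorm(10, 1, 5),
                 c = rnorm(10, 1, 10),
                 h = c(0, 1, 0, 2, 1, 0, 3, 1, 0, 1))

# Every column except h is a response
responses <- setdiff(names(df), "h")

# Fit response ~ h for each response column and name the list elements
fits <- lapply(responses, function(v) {
  gls(as.formula(paste0(v, " ~ h")), data = df)
})
names(fits) <- responses

# One column per model: rows are (Intercept) and h
coefs <- sapply(fits, coef)
# One column per model: rows are the observations
resids <- sapply(fits, residuals)

print(coefs)
print(dim(resids))
```

Because every model is fitted to the same rows, sapply() can simplify both results to matrices with one column per response. With thousands of columns, keeping only these two matrices rather than the full gls objects should also use far less memory.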
List
Output:
$a
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -19.74037
Coefficients:
(Intercept) h
1.7981345 0.4395727
Degrees of freedom: 10 total; 8 residual
Residual standard error: 2.252625
$b
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -22.96146
Coefficients:
(Intercept) h
1.562261 -3.610542
Degrees of freedom: 10 total; 8 residual
Residual standard error: 3.36939
$c
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -27.98972
Coefficients:
(Intercept) h
-2.839797 3.759619
Degrees of freedom: 10 total; 8 residual
Residual standard error: 6.31713
$d
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -12.75595
Coefficients:
(Intercept) h
1.0048859 0.7997716
Degrees of freedom: 10 total; 8 residual
Residual standard error: 0.9408642
$e
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -30.02588
Coefficients:
(Intercept) h
-0.08323116 -0.11157127
Degrees of freedom: 10 total; 8 residual
Residual standard error: 8.148097
$f
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -19.21926
Coefficients:
(Intercept) h
2.7446886 -0.1064889
Degrees of freedom: 10 total; 8 residual
Residual standard error: 2.110568
$g
Generalized least squares fit by REML
Model: v3
Data: df
Log-restricted-likelihood: -29.19166
Coefficients:
(Intercept) h
8.483339 -4.746078
Degrees of freedom: 10 total; 8 residual
Residual standard error: 7.341237
Answered 2020-09-29 14:27:27
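Use reformulate. Note that reformulate(x, "h") builds the formula h ~ x, so the code below regresses h on each column, using lm rather than gls. To regress each column on h, as the question asks, use reformulate("h", response = x) instead.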
rg <- names(d)[!names(d) %in% "h"]
setNames(lapply(rg, function(x)
lm(reformulate(x, "h"), d)[c("coefficients", "residuals")]), rg)
# $a
# $a$coefficients
# (Intercept) a
# -0.1397861 1.5754488
#
# $a$residuals
# 1 2 3 4 5 6 7
# 0.39570210 -1.23447278 1.18733412 1.54241624 -1.48755817 2.11774373 -1.67997168
# 8 9 10 11 12 13 14
# 0.73094508 0.86101269 1.73417911 0.51527327 -1.82142311 4.82370202 0.77177476
# 15 16 17 18 19 20
# -0.84239625 -1.94642481 -0.16537106 -0.08808104 -4.66894171 -0.74544252
#
#
# $b
# $b$coefficients
# (Intercept) b
# 0.5718898 1.5104357
#
# $b$residuals
# 1 2 3 4 5 6
# 2.307058939 -0.145249746 1.307418592 -0.006905091 -4.424897936 1.889070097
# 7 8 9 10 11 12
# 0.378266810 2.533283303 2.634312498 1.890371544 1.171424560 0.004782197
# 13 14 15 16 17 18
# 0.360489930 0.540625651 -2.526815303 0.937237885 -0.139997905 -3.699625069
# 19 20
# -5.578942652 0.568091696
#
#
# $c
# $c$coefficients
# (Intercept) c
# 0.163988 1.552544
#
# $c$residuals
# 1 2 3 4 5 6 7
# 1.9319809 -1.8673426 0.2785686 3.3639259 0.9698881 0.9748069 1.6573032
# 8 9 10 11 12 13 14
# -1.9639900 4.4070009 0.3136801 1.7674514 2.6942398 -0.1145370 -0.9693461
# 15 16 17 18 19 20
# -1.4955686 -1.6776488 -1.9715968 -4.7164340 -4.1706428 0.5882611
#
#
# $d
# $d$coefficients
# (Intercept) d
# -0.007487103 1.058906754
#
# $d$residuals
# 1 2 3 4 5 6 7
# 2.8121452 -2.4525667 1.0110283 0.9249691 -0.2128186 0.4389798 0.2134230
# 8 9 10 11 12 13 14
# -0.6501655 2.9336712 0.7397345 3.5432953 1.7442696 1.8430766 1.2099507
# 15 16 17 18 19 20
# -0.6099311 -1.6920376 -1.5589256 -4.8965761 -7.7081168 2.3665949
Data:
set.seed(42)
d <- setNames(data.frame(matrix(rnorm(20*4), 20, 4)), letters[1:4])
d$h <- with(d, a + b + c + d + rnorm(20))
https://stackoverflow.com/questions/64110754
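For the direction the question asks about (each column regressed on h, fitted with gls), the reformulate approach can be adapted as in the sketch below. This is an illustrative variant rather than the answerer's code: the name `res` is hypothetical, and the data are the same simulated data frame d as above. In reformulate() the first argument gives the right-hand-side terms and `response` gives the left-hand side.

```r
library(nlme)

# Same simulated data as in the answer above
set.seed(42)
d <- setNames(data.frame(matrix(rnorm(20 * 4), 20, 4)), letters[1:4])
d$h <- with(d, a + b + c + d + rnorm(20))

# Every column except h is a response
rg <- setdiff(names(d), "h")

# reformulate("h", response = x) builds x ~ h
res <- setNames(lapply(rg, function(x) {
  fit <- gls(reformulate("h", response = x), data = d)
  # Keep only the pieces the question asks for
  list(coefficients = coef(fit), residuals = residuals(fit))
}), rg)

# Coefficients and residuals for the regression a ~ h
res$a$coefficients
head(res$a$residuals)
```

Returning a small list from each iteration, instead of the whole fitted object, keeps the result compact when there are thousands of response columns.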