嗨,我现在有70个工作表在同一个excel工作簿下,我试图根据数据找到每日报表,然后将所有70个工作表合并到一个数据框架中。所有70张纸上都有日期和公开价格。到目前为止,这是软件R的代码
require(XLConnect)
wb <- loadWorkbook(system.file("crypto.xlsx", package = "XLConnect"))
crypto = readWorksheet(wb, sheet = getSheets(wb),startRow=1,endRow = 20, endCol=2)
我希望找到每个工作表的返回,然后将所有70页合并到一个数据帧中。然而,时间周期是不同的,在所有这些床单,我找不到合并他们的方法。在最后的数据文件中,我希望变量Date, Price1, Dailyreturn1, Price2, DailyReturn2 ......... Price70, DailyReturn70
能够帮助返回和合并。
下面是我的数据与更多行的样子,所有70张表的数据格式是相同的。
Date Open High Low Close Market Cap
28-Apr-13 135.3 135.98 132.1 134.21 1,500,520,000
29-Apr-13 134.44 147.49 134 144.54 1,491,160,000
30-Apr-13 144 146.93 134.05 139 1,597,780,000
1-May-13 139 139.89 107.72 116.99 1,542,820,000
我希望最终的结果看起来有点像这样,所有70张纸加在一起。医生负责每日回程
Date Open1 DR1 Open2 DR2 Open3 DR3............Open70 DR70
发布于 2017-11-27 20:57:20
假设链合并与Reduce
保持所有x行(即左联接)。但是,第一次更新列表可以重命名列,这可以在lapply
中用setNames
完成。
# ASSUME ALL DFs ARE ONLY Date and Open
crypto2 <- lapply(seq_along(crypto), function(i)
setNames(crypto[[i]], c("Date", paste0("DR", i))))
finaldf <- Reduce(function(x, y) merge(x, y, by="Date", all.x=TRUE), crypto2)
注意:如果您的数据具有不同的开放值的重复日期,则要非常小心这种Reduce
方法,因为merge
将快速增长,因为每个联接组合都将被返回。
带有随机数据的示例(聚合用于只返回上述警告的唯一日期)
set.seed(123)
crypto <- lapply(seq(10), function(i)
aggregate(Open~Date, data.frame(
Date = Sys.Date()- sapply(seq(30), function(i) sample(15,1,replace=TRUE)),
Open = abs(rnorm(30)*100)
), FUN=max))
crypto2 <- lapply(seq_along(crypto), function(i)
setNames(crypto[[i]], c("Date", paste0("DR", i))))
finaldf <- Reduce(function(x, y) merge(x, y, by="Date", all.x=TRUE), crypto2)
输出
finaldf
# Date DR1 DR2 DR3 DR4 DR5 DR6 DR7 DR8 DR9 DR10
# 1 2017-11-12 70.92008 56.29895 37.56029 136.403745 100.56363 NA 138.80490 28.73408 64.75134 195.07210
# 2 2017-11-13 102.55714 212.84519 190.23618 199.274849 127.22668 NA 151.60676 NA 188.63252 90.35164
# 3 2017-11-15 44.82098 105.27115 135.98407 101.755864 123.56935 123.66758 107.49451 NA 91.35660 45.73573
# 4 2017-11-16 22.04866 36.89645 63.07541 151.921771 43.88187 103.96804 168.75895 144.14618 90.44355 NA
# 5 2017-11-17 122.07177 37.45809 229.30790 175.323736 88.33028 35.88557 84.90611 72.82191 23.11149 70.10036
# 6 2017-11-18 109.68390 NA 159.85088 62.418747 163.35684 116.52534 154.66092 90.42150 33.33700 79.76007
# 7 2017-11-19 205.00847 NA 78.86220 246.589819 NA 60.85572 NA 245.80605 85.52022 184.32663
# 8 2017-11-20 114.88076 37.24388 154.75811 72.160444 71.27033 60.53707 107.40123 11.88494 131.82207 76.46060
# 9 2017-11-21 33.32074 126.01552 50.21987 257.145815 222.49877 50.63335 208.67174 106.10953 115.29362 148.37795
# 10 2017-11-22 230.91689 21.53805 11.36399 97.400958 239.74525 96.72673 NA 58.99827 216.14158 122.09637
# 11 2017-11-24 33.17820 104.91770 66.47694 NA 45.03386 142.05655 260.16997 154.04424 NA 103.81310
# 12 2017-11-25 37.96395 97.69734 52.11173 57.185006 163.63793 18.43089 141.52819 154.77765 65.80102 109.95509
# 13 2017-11-26 37.06600 76.90422 175.65274 3.303616 177.95029 194.58512 87.52555 118.71868 39.43948 111.64164
https://stackoverflow.com/questions/47501307
复制相似问题