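I am trying to create tables from survey data, but the solution I have come up with is unmanageable for the number of tables I need to create.
I have a survey of different populations, their political parties, and their opinions on certain issues. Below are sample data and my (almost) working but cumbersome solution. I have included the result I am looking for in the "ideal.table" data.frame (shown below).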
pop <- c("elite", "elite", "public", "public", "public", "public")
party <- c("D", "R", "R", "D", "D", "R")
opinion <- c("pro", "con", "pro", "con", "pro", "pro")
df <- data.frame(pop, party, opinion)
party.table <- prop.table(table(df[df$pop=="public",][["party"]], df[df$pop=="public",][["opinion"]]),2)
elite.table <- prop.table(table(df[df$pop=="elite",][["opinion"]]))
public.table <- prop.table(table(df[df$pop=="public",][["opinion"]]))
group <- c("R", "D", "elite", "public")
percent.pro <- c(0.3, 0.6, 0.5, 0.75)
percent.con <- c(0.7, 0.4, 0.5, 0.25)
ideal.table <- data.frame(group, percent.pro, percent.con)
library(dplyr)
library(tidyr)
# create data frames from tables
x = data.frame(elite.table)
names(x) = c("elite","value")
y = data.frame(party.table) %>% spread(Var2,Freq)
names(y)[1] = "group"
z = data.frame(public.table)
names(z)[1] = "group"
# join data frames
x %>% inner_join(y, by="group") %>% inner_join(z, by="group")
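I have not found a solution for this yet, and even if I find one for this particular dataset, I sometimes need to combine several one-way tables with two-way tables, with more groups than shown here. Is there a better way to get crosstab proportions for different subsets of the data?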
group percent.pro percent.con
1 R 0.30 0.70
2 D 0.60 0.40
3 elite 0.50 0.50
4 public 0.75 0.25
Thanks for your help!
Posted on 2015-10-04 19:00:14
library(dplyr)
library(tidyr)
# Stack pop and party into one "group" column, then compute proportions per group
df %>%
  gather(variable, group, -opinion) %>%
  group_by(variable, group) %>%
  summarize(percent.pro = sum(opinion == "pro") / n()) %>%
  mutate(percent.con = 1 - percent.pro)
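For comparison, here is a minimal base R sketch of the same idea that needs no packages. It is not the method from the answer above, only an alternative under stated assumptions: the grouping columns are stacked by hand, `prop.table()` is called with `margin = 1` so each group's row sums to 1, and the names `stacked`, `props` and `result` are made up for illustration. Like the answer, it counts every respondent when computing the party rows, whereas the question's `party.table` was restricted to the public.

```r
# Same sample data as in the question, kept as character vectors
df <- data.frame(
  pop     = c("elite", "elite", "public", "public", "public", "public"),
  party   = c("D", "R", "R", "D", "D", "R"),
  opinion = c("pro", "con", "pro", "con", "pro", "pro"),
  stringsAsFactors = FALSE
)

# Stack the grouping columns: each respondent appears once per grouping variable
stacked <- data.frame(
  group   = c(df$pop, df$party),
  opinion = rep(df$opinion, times = 2),
  stringsAsFactors = FALSE
)

# Row-wise proportions: margin = 1 makes each group's row sum to 1
props <- prop.table(table(stacked$group, stacked$opinion), margin = 1)

# One row per group, in the layout of ideal.table
result <- data.frame(
  group       = rownames(props),
  percent.pro = as.vector(props[, "pro"]),
  percent.con = as.vector(props[, "con"])
)
print(result)
```

On the sample data this gives 0.50 pro for elite and 0.75 pro for public, matching those rows of ideal.table. Adding another grouping column should only require extending the `c(...)` call and the `times` argument.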
https://stackoverflow.com/questions/32939317