I have a wide dataset that looks like this:
dataset <- data.frame(id = c(1, 2, 3, 4, 5),
                      basketball.time1 = c(2, 5, 4, 3, 3),
                      basketball.time2 = c(3, 4, 5, 3, 2),
                      basketball.time3 = c(1, 8, 4, 3, 1),
                      volleyball.time1 = c(2, 3, 4, 0, 1),
                      volleyball.time2 = c(3, 4, 3, 1, 3),
                      volleyball.time3 = c(1, 8, 12, 2, 3))
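What I want is a dataset in long format, with id, time, basketball, and volleyball as separate variables. The time column should be built from the string after the "." at the end of the basketball and volleyball column names, giving three factor levels (time1, time2, and time3).
Thanks a lot!
Edit: fixed typo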
Answered 2021-12-30 08:31:59
We can use pivot_longer %>% pivot_wider. If we pass the appropriate arguments to pivot_longer, separate is not needed.
library(tidyr)
dataset %>%
  pivot_longer(cols = matches('time\\d+$'),
               names_to = c('sport', 'time'),
               names_pattern = '(.*)\\.(.*)') %>%
  pivot_wider(names_from = sport, values_from = value)
# A tibble: 15 × 4
      id time  basketball volleyball
   <dbl> <chr>      <dbl>      <dbl>
 1     1 time1          2          2
 2     1 time2          3          3
 3     1 time3          1          1
 4     2 time1          5          3
 5     2 time2          4          4
 6     2 time3          8          8
 7     3 time1          4          4
 8     3 time2          5          3
 9     3 time3          4         12
10     4 time1          3          0
11     4 time2          3          1
12     4 time3          3          2
13     5 time1          3          1
14     5 time2          2          3
15     5 time3          1          3
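The point that separate is not needed can be taken one step further. As a minimal sketch, and not part of the answer itself: tidyr's special ".value" name in names_to tells pivot_longer to use the first part of each column name as the output column, so a single call may be enough and pivot_wider can be skipped too. The variable name long is only illustrative.

```r
library(tidyr)

dataset <- data.frame(id = c(1, 2, 3, 4, 5),
                      basketball.time1 = c(2, 5, 4, 3, 3),
                      basketball.time2 = c(3, 4, 5, 3, 2),
                      basketball.time3 = c(1, 8, 4, 3, 1),
                      volleyball.time1 = c(2, 3, 4, 0, 1),
                      volleyball.time2 = c(3, 4, 3, 1, 3),
                      volleyball.time3 = c(1, 8, 12, 2, 3))

# Split each column name at the literal "." (names_sep is a regular expression).
# ".value" keeps the part before the dot (basketball / volleyball) as a column,
# and the part after the dot goes into the new `time` column.
long <- pivot_longer(dataset,
                     cols = -id,
                     names_to = c(".value", "time"),
                     names_sep = "\\.")

long
```

This should give one row per id and time, with basketball and volleyball side by side, assuming every column other than id follows the sport.time naming scheme.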
Answered 2021-12-30 08:30:45
One possible solution:
library(tidyverse)
dataset <- data.frame(id = c(1, 2, 3, 4, 5),
                      basketball.time1 = c(2, 5, 4, 3, 3),
                      basketball.time2 = c(3, 4, 5, 3, 2),
                      basketball.time3 = c(1, 8, 4, 3, 1),
                      volleyball.time1 = c(2, 3, 4, 0, 1),
                      volleyball.time2 = c(3, 4, 3, 1, 3),
                      volleyball.time3 = c(1, 8, 12, 2, 3))
dataset %>%
  pivot_longer(cols = -id) %>%
  separate(name, into = c("name", "time")) %>%
  # id_cols must not include `name`, which pivot_wider uses as names_from
  pivot_wider(id_cols = c(id, time))
#> # A tibble: 15 × 4
#>       id time  basketball volleyball
#>    <dbl> <chr>      <dbl>      <dbl>
#>  1     1 time1          2          2
#>  2     1 time2          3          3
#>  3     1 time3          1          1
#>  4     2 time1          5          3
#>  5     2 time2          4          4
#>  6     2 time3          8          8
#>  7     3 time1          4          4
#>  8     3 time2          5          3
#>  9     3 time3          4         12
#> 10     4 time1          3          0
#> 11     4 time2          3          1
#> 12     4 time3          3          2
#> 13     5 time1          3          1
#> 14     5 time2          2          3
#> 15     5 time3          1          3
Answered 2021-12-30 08:31:35
1. pivot_longer
2. separate into a sport and a time column
3. pivot_wider on the sport column
library(dplyr)
library(tidyr)
dataset %>%
  pivot_longer(
    -id
  ) %>%
  separate(name, c("sport", "time")) %>%
  pivot_wider(
    names_from = sport
  )
      id time  basketball volleyball
   <dbl> <chr>      <dbl>      <dbl>
 1     1 time1          2          2
 2     1 time2          3          3
 3     1 time3          1          1
 4     2 time1          5          3
 5     2 time2          4          4
 6     2 time3          8          8
 7     3 time1          4          4
 8     3 time2          5          3
 9     3 time3          4         12
10     4 time1          3          0
11     4 time2          3          1
12     4 time3          3          2
13     5 time1          3          1
14     5 time2          2          3
15     5 time3          1          3
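The question asks for time as a factor with three levels, while the reshaped result holds it as a character column (<chr>). A minimal sketch of one way to convert it afterwards, using base R's factor() with the levels spelled out; the intermediate names step1, step2, and long are only illustrative and do not come from the answers.

```r
library(tidyr)

dataset <- data.frame(id = c(1, 2, 3, 4, 5),
                      basketball.time1 = c(2, 5, 4, 3, 3),
                      basketball.time2 = c(3, 4, 5, 3, 2),
                      basketball.time3 = c(1, 8, 4, 3, 1),
                      volleyball.time1 = c(2, 3, 4, 0, 1),
                      volleyball.time2 = c(3, 4, 3, 1, 3),
                      volleyball.time3 = c(1, 8, 12, 2, 3))

# Same three steps as above, written without a pipe
step1 <- pivot_longer(dataset, cols = -id)                  # columns: id, name, value
step2 <- separate(step1, name, into = c("sport", "time"))   # default sep splits at "."
long  <- pivot_wider(step2, names_from = sport, values_from = value)

# Convert the character column to a factor with an explicit level order
long$time <- factor(long$time, levels = c("time1", "time2", "time3"))

levels(long$time)
```

Listing the levels explicitly keeps their order under control; plain as.factor() would sort them alphabetically, which gives the same order here but would not for labels such as time10.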
https://stackoverflow.com/questions/70534149