英国广播公司(British Broadcasting Corporation;BBC
)是全球最大的新闻媒体,其中各类新闻稿件采用的统计图表能很好地传达信息。为了方便清洗可重复数据和绘制图表,BBC数据团队用R对数据进行处理和可视化,经年累月下于去年整理绘图经验并开发了R包-bbplot
,帮助我们画出和BBC新闻中一样好看的图形。
使用pacman[1]
软件包中的p_load
函数通过以下代码一次性加载。
#安装pcaman软件包并对其他R包进行加载
if(!require(pacman))install.packages("pacman")
pacman::p_load('dplyr', 'tidyr', 'gapminder',
'ggplot2', 'ggalt',
'forcats', 'R.utils', 'png',
'grid', 'ggpubr', 'scales',
'bbplot')
bbplot不在CRAN
上,因此必须使用devtools
直接从Github
安装它(编程模板-R语言脚本写作:最简单的统计与绘图,包安装、命令行参数解析、文件读取、表格和矢量图输出)。
# install.packages('devtools')
devtools::install_github('bbc/bbplot')
下载软件包并成功安装后,就可以创建图表了( Science组合图表解读)。
该软件包具有两个函数功能,bbc_style()
和finalise_plot()
。
bbc_style():没有参数,通常是将文本大小、字体和颜色,轴线,轴线文本,边距和许多其他标准图表组件转换为BBC样式。
对于折线图而言,折线的颜色或条形图的颜色,并不是从bbc_style()
函数中直接实现的,而是需要在其他标准ggplot(ggplot2高效实用指南 (可视化脚本、工具、套路、配色))图表函数中明确设置。
下面的代码显示了如何在标准图表制作工作流程中使用bbc_style()
。这是一个非常简单的折线图的示例,使用了gapminder
程序包中的数据。
#Data for chart from gapminder package
line_df <- gapminder %>%
filter(country == "Malawi")
#Make plot
line <- ggplot(line_df, aes(x = year, y = lifeExp)) +
geom_line(colour = "#1380A1", size = 1) +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
labs(title="Living longer",
subtitle = "Life expectancy in Malawi 1952-2007")
这是bbc_style()
函数在后台实际执行的操作。它实质上修改了ggplot2
主题功能(ggplot2学习笔记之图形排列)中的某些参数。
例如,第一个参数是设置图标题元素的字体、大小、和字体颜色。
## function ()
## {
## font <- "Helvetica"
## ggplot2::theme(plot.title = ggplot2::element_text(family = font,
## size = 28, face = "bold", color = "#222222"), plot.subtitle = ggplot2::element_text(family = font,
## size = 22, margin = ggplot2::margin(9, 0, 9, 0)), plot.caption = ggplot2::element_blank(),
## legend.position = "top", legend.text.align = 0, legend.background = ggplot2::element_blank(),
## legend.title = ggplot2::element_blank(), legend.key = ggplot2::element_blank(),
## legend.text = ggplot2::element_text(family = font, size = 18,
## color = "#222222"), axis.title = ggplot2::element_blank(),
## axis.text = ggplot2::element_text(family = font, size = 18,
## color = "#222222"), axis.text.x = ggplot2::element_text(margin = ggplot2::margin(5,
## b = 10)), axis.ticks = ggplot2::element_blank(),
## axis.line = ggplot2::element_blank(), panel.grid.minor = ggplot2::element_blank(),
## panel.grid.major.y = ggplot2::element_line(color = "#cbcbcb"),
## panel.grid.major.x = ggplot2::element_blank(), panel.background = ggplot2::element_blank(),
## strip.background = ggplot2::element_rect(fill = "white"),
## strip.text = ggplot2::element_text(size = 22, hjust = 0))
## }
## <environment: namespace:bbplot>
通过向bbc_style()
函数中包含的主题添加额外的主题参数,例如添加一些网格线。
theme(panel.grid.major.x = element_line(color="#cbcbcb"),
panel.grid.major.y=element_blank())
finalise_plot()
是bbplot程序包的第二个函数。它能按照BBC图形的标准将标题和副标题左对齐,在绘图的右下角添加页脚,也可以在左下角添加来源。它还可以将图表保存到指定的位置。该函数有五个参数:
plot_name
would be "line"
source text
that you want to appear at the bottom left corner of your plot. You will need to type the word "Source:"
before it, so for example source = "Source: ONS"
would be the right way to do that.the precise filepath
that you want your graphic to save to, including the .png
extension at the end. This does depend on your working directory and if you are in a specific R project. An example filepath would be: Desktop/R_projects/charts/line_chart.png
.640px by default
, so only call this argument if you want the chart to have a different width, and specify what you want it to be.450px by default
, so only call this argument if you want the chart to have a different height, and specify what you want it to be.a placeholder PNG file
with a background that matches the background colour of the plot, so do not specify the argument if you want it to appear without a logo. If you want to add your own logo
, just specify the path to your PNG file. The package has been prepared with a wide and thin image in mind.在标准工作流程中使用finalise_plot()
的示例:
finalise_plot(plot_name = my_line_plot,
source = "Source: Gapminder",
save_filepath = "filename_that_my_plot_should_be_saved_to.png",
width_pixels = 640,
height_pixels = 450,
logo_image_path = "placeholder.png")
那么如何保存上面创建的示例图?
finalise_plot(plot_name = line,
source = "Source: Gapminder",
save_filepath = "images/line_plot_finalised_test.png",
width_pixels = 640,
height_pixels = 550)
!!前方高能!!一大波图即将“来袭”……
#准备数据
line_df <- gapminder %>%
filter(country == "China")
#作图
line <- ggplot(line_df, aes(x = year, y = lifeExp)) +
geom_line(colour = "#1380A1", size = 1) +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
labs(title="Living longer",
subtitle = "Life expectancy in China 1952-2007")
#准备数据
multiple_line_df <- gapminder %>%
filter(country == "China" | country == "United States")
#作图
multiple_line <- ggplot(multiple_line_df, aes(x = year, y = lifeExp, colour = country)) +
geom_line(size = 1) +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
scale_colour_manual(values = c("#FAAB18", "#1380A1")) +
bbc_style() +
labs(title="Living longer",
subtitle = "Life expectancy in China and the US")
#准备数据
bar_df <- gapminder %>%
filter(year == 2007 & continent == "Africa") %>%
arrange(desc(lifeExp)) %>%
head(5)
#作图
bars <- ggplot(bar_df, aes(x = country, y = lifeExp)) +
geom_bar(stat="identity",
position="identity",
fill="#1380A1") +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
labs(title="Reunion is highest",
subtitle = "Highest African life expectancy, 2007")
#准备数据
stacked_df <- gapminder %>%
filter(year == 2007) %>%
mutate(lifeExpGrouped = cut(lifeExp,
breaks = c(0, 50, 65, 80, 90),
labels = c("Under 50", "50-65", "65-80", "80+"))) %>%
group_by(continent, lifeExpGrouped) %>%
summarise(continentPop = sum(as.numeric(pop)))
#set order of stacks by changing factor levels
stacked_df$lifeExpGrouped = factor(stacked_df$lifeExpGrouped, levels = rev(levels(stacked_df$lifeExpGrouped)))
#作图
stacked_bars <- ggplot(data = stacked_df,
aes(x = continent,
y = continentPop,
fill = lifeExpGrouped)) +
geom_bar(stat = "identity",
position = "fill") +
bbc_style() +
scale_y_continuous(labels = scales::percent) +
scale_fill_viridis_d(direction = -1) +
geom_hline(yintercept = 0, size = 1, colour = "#333333") +
labs(title = "How life expectancy varies",
subtitle = "% of population by life expectancy band, 2007") +
theme(legend.position = "top",
legend.justification = "left") +
guides(fill = guide_legend(reverse = TRUE))
只需要将position =“identity”更改为position =“dodge”:
#准备数据
grouped_bar_df <- gapminder %>%
filter(year == 1967 | year == 2007) %>%
select(country, year, lifeExp) %>%
spread(year, lifeExp) %>%
mutate(gap = `2007` - `1967`) %>%
arrange(desc(gap)) %>%
head(5) %>%
gather(key = year,
value = lifeExp,
-country,
-gap)
#画图
grouped_bars <- ggplot(grouped_bar_df,
aes(x = country,
y = lifeExp,
fill = as.factor(year))) +
geom_bar(stat="identity", position="dodge") +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
scale_fill_manual(values = c("#1380A1", "#FAAB18")) +
labs(title="We're living longer",
subtitle = "Biggest life expectancy rise, 1967-2007")
library("ggalt")
library("tidyr")
#准备数据
dumbbell_df <- gapminder %>%
filter(year == 1967 | year == 2007) %>%
select(country, year, lifeExp) %>%
spread(year, lifeExp) %>%
mutate(gap = `2007` - `1967`) %>%
arrange(desc(gap)) %>%
head(10)
#作图
ggplot(dumbbell_df, aes(x = `1967`, xend = `2007`, y = reorder(country, gap), group = country)) +
geom_dumbbell(colour = "#dddddd",
size = 3,
colour_x = "#FAAB18",
colour_xend = "#1380A1") +
bbc_style() +
labs(title="We're living longer",
subtitle="Biggest life expectancy rise, 1967-2007")
hist_df <- gapminder %>%
filter(year == 2007)
ggplot(hist_df, aes(lifeExp)) +
geom_histogram(binwidth = 5, colour = "white", fill = "#1380A1") +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
scale_x_continuous(limits = c(35, 95),
breaks = seq(40, 90, by = 10),
labels = c("40", "50", "60", "70", "80", "90 years")) +
labs(title = "How life expectancy varies",
subtitle = "Distribution of life expectancy in 2007")
去掉图例:
multiple_line + guides(colour=FALSE)
#or
multiple_line + theme(legend.position = "none")
改变图例位置:
multiple_line + theme(legend.position = "right")
翻转坐标轴:
bars <- bars + coord_flip()#垂直变成水平
添加/删除网格线:
bars <- bars + coord_flip() +
theme(panel.grid.major.x = element_line(color="#cbcbcb"),
panel.grid.major.y=element_blank())
#默认主题只有y轴的网格线。使用panel.grid.major.x = element_line添加x轴上的网格线。(使用panel.grid.major.y = element_blank()删除y轴上的网格线)
人工更改轴间距:
使用scale_y_continuous
或scale_x_continuous
更改轴文本标签:
bars <- bars + scale_y_continuous(limits=c(0,85),
breaks = seq(0, 80, by = 20),
labels = c("0","20", "40", "60", "80 years"))
bars
在轴标签上添加千位分隔符
+ scale_y_continuous(labels = function(x) format(x, big.mark = ",",
scientific = FALSE))
将百分比符号添加到轴标签:
+ scale_y_continuous(labels = function(x) paste0(x, "%"))
ggplot可以轻松创建多个小图表,这被称为构面。如果将需要可视化的数据按某个变量划分,则需要使用函数facet_wrap
或facet_grid
。
#准备数据
facet <- gapminder %>%
filter(continent != "Americas") %>%
group_by(continent, year) %>%
summarise(pop = sum(as.numeric(pop)))
#作图
facet_plot <- ggplot() +
geom_area(data = facet, aes(x = year, y = pop, fill = continent)) +
scale_fill_manual(values = c("#FAAB18", "#1380A1","#990000", "#588300")) +
facet_wrap( ~ continent, ncol = 5) +
scale_y_continuous(breaks = c(0, 2000000000, 4000000000),
labels = c(0, "2bn", "4bn")) +
bbc_style() +
geom_hline(yintercept = 0, size = 1, colour = "#333333") +
theme(legend.position = "none",
axis.text.x = element_blank()) +
labs(title = "Asia's rapid growth",
subtitle = "Population growth by continent, 1952-2007")
可以尝试的参数实在是太多啦!大家可以试一试呀!
[1]:https://bbc.github.io/rcookbook/#how_to_create_bbc_style_graphics
来源:https://bbc.github.io/rcookbook/
撰文:May
编辑:生信宝典