Plot the relationship between flipper length and body mass for the different penguin species, reproducing the figure below.
A variable is a quantity, quality, or property that you can measure.
A value is the state of a variable when you measure it. The value of a variable may change from measurement to measurement.
An observation is a set of measurements made under similar conditions (you usually make all of the measurements in an observation at the same time and on the same object). An observation will contain several values, each associated with a different variable. We’ll sometimes refer to an observation as a data point.
Tabular data is a set of values, each associated with a variable and an observation. Tabular data is tidy if each value is placed in its own “cell”, each variable in its own column, and each observation in its own row.
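As a small illustration of these definitions, the sketch below takes the first three rows and three columns of the penguins data frame (assuming the palmerpenguins package is installed) and picks out variables, observations, and a single value. The object name `small` is only illustrative.

```r
library(palmerpenguins)

# A small slice of penguins: 3 observations (rows) of 3 variables (columns)
small <- penguins[1:3, c("species", "flipper_length_mm", "body_mass_g")]

names(small)          # the variables, one per column
nrow(small)           # the number of observations, one per row
small$body_mass_g[2]  # a single value: body mass of the second observation
```

Because every variable has its own column and every observation its own row, this slice is tidy in the sense described above.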
install.packages("tidyverse")
library(tidyverse)
library(palmerpenguins) # provides the penguins data frame
library(ggthemes) # provides a colorblind-safe color palette, used later with ggplot2
> penguins # print the penguins data; alternatively run view(penguins) or glimpse(penguins)
## A tibble: 344 × 8
# species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
# <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
# 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
# 2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
# 3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
# 4 Adelie Torgersen NA NA NA NA NA 2007
# 5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
# 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
# 7 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
# 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007
# 9 Adelie Torgersen 34.1 18.1 193 3475 NA 2007
#10 Adelie Torgersen 42 20.2 190 4250 NA 2007
## ℹ 334 more rows
## ℹ Use `print(n = ...)` to see more rows
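The variables used for this plot are flipper_length_mm, body_mass_g, and species.
ggplot(data = penguins) # an empty canvas at this point; nothing has been mapped yet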
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) # the x and y axes are now defined; in ggplot2, the mapping argument is always defined in the aes() function
ggplot2 provides geom functions such as geom_bar(), geom_line(), geom_point(), and geom_boxplot().
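To give a feel for how the choice of geom changes the plot, here is a minimal sketch that draws the same penguins data with three different geoms. The variable choices (species on the x axis) and the object names are assumptions made for illustration, not part of the figure being reproduced.

```r
library(ggplot2)
library(palmerpenguins)

# Shared base: species on x, body mass on y
base <- ggplot(penguins, aes(x = species, y = body_mass_g))

p_box    <- base + geom_boxplot()  # distribution of body mass per species
p_points <- base + geom_point()    # the same data drawn as raw points

# geom_bar() counts rows itself, so it only needs an x aesthetic
p_bar <- ggplot(penguins, aes(x = species)) + geom_bar()

# layer_data() returns what a layer computed; the bar layer has one row per species
bar_data <- layer_data(p_bar, 1)
bar_data$count
```

Printing `p_box`, `p_points`, or `p_bar` draws the corresponding plot.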
> ggplot(
+ data = penguins,
+ mapping = aes(x = flipper_length_mm, y = body_mass_g)
+ ) +
+ geom_point()
#Warning message:
#Removed 2 rows containing missing values or values outside the scale range
#(`geom_point()`).
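The warning refers to rows where one of the plotted variables is missing. A quick way to check this, sketched here with base R only, is to count the rows where either flipper_length_mm or body_mass_g is NA. The name `missing_xy` is just illustrative.

```r
library(palmerpenguins)

# TRUE for rows that cannot be drawn because x or y is missing
missing_xy <- is.na(penguins$flipper_length_mm) | is.na(penguins$body_mass_g)

sum(missing_xy)  # number of rows geom_point() has to drop

# Look at the affected rows
penguins[missing_xy, c("species", "island", "flipper_length_mm", "body_mass_g")]
```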
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
geom_point()
geom_smooth()
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
geom_point() +
geom_smooth(method = "lm") # draw the line of best fit based on a linear model with method = "lm"
Understand the difference between global (plot-level) and local (layer-level) mappings in ggplot2.
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_point(mapping = aes(color = species)) +
geom_smooth(method = "lm")
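One way to see the difference between a global and a local color mapping without looking at the picture is to ask how many groups the smooth layer ends up with. The sketch below uses ggplot2's layer_data() for this; the helper `count_smooth_groups()` is a hypothetical name introduced only for this example.

```r
library(ggplot2)
library(palmerpenguins)

# Color mapped globally: every layer inherits it
p_global <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  geom_smooth(method = "lm")

# Color mapped locally: only the point layer uses it
p_local <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species)) +
  geom_smooth(method = "lm")

# Illustrative helper: number of groups in the smooth layer (layer 2)
count_smooth_groups <- function(p) {
  smooth_data <- suppressWarnings(suppressMessages(layer_data(p, 2)))
  length(unique(smooth_data$group))
}

count_smooth_groups(p_global)  # one fitted line per species
count_smooth_groups(p_local)   # a single fitted line for all penguins
```

With the global mapping the smooth layer is split by species, while with the local mapping it sees the data as one group, which is why the second plot shows a single line.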
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_point(mapping = aes(color = species, shape = species)) +
geom_smooth(method = "lm")
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_point(aes(color = species, shape = species)) +
geom_smooth(method = "lm") +
labs(
title = "Body mass and flipper length",
subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
x = "Flipper length (mm)", y = "Body mass (g)",
color = "Species", shape = "Species"
) +
scale_color_colorblind()
scale_color_colorblind(): improves the color palette to be colorblind safe.
labs(): sets the title, subtitle, axis labels, and legend titles.
The plot is now reproduced.
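To see which colors the colorblind-safe scale draws from, the palette can be inspected directly. This is a small sketch assuming the ggthemes package is installed; colorblind_pal() returns a function that yields the requested number of colors, and three are requested here because there are three species.

```r
library(ggthemes)

# First three colors of the colorblind-safe palette, as hex strings
species_colors <- colorblind_pal()(3)
species_colors
```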
> glimpse(penguins)
#Rows: 344
#Columns: 8
?penguins
to find out.
ggplot(
data = penguins,
mapping = aes(x = bill_length_mm,y = bill_depth_mm,color = species,shape = species)
)+
geom_point()+
geom_smooth(method = "lm")
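Within each of the three species the trend is the same: the larger bill_length_mm is, the larger bill_depth_mm is. But if all the penguins are treated as a single group when drawing the trend line, the result is misleading.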
ggplot(
data = penguins,
mapping = aes(x = bill_length_mm,y = bill_depth_mm)
)+
geom_point(aes(color = species,shape = species))+
geom_smooth(method = "lm") # the wrong approach: one line fitted across all species
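The same contrast can be checked numerically. The sketch below fits a linear model of bill_depth_mm on bill_length_mm once for all penguins and once per species, using base R's lm(). The object names are illustrative, and lm() drops rows with missing values by default.

```r
library(palmerpenguins)

# Slope of bill_depth_mm on bill_length_mm, ignoring species
pooled_slope <- coef(
  lm(bill_depth_mm ~ bill_length_mm, data = penguins)
)[["bill_length_mm"]]

# The same slope fitted separately within each species
species_slopes <- sapply(
  split(penguins, penguins$species),
  function(d) coef(lm(bill_depth_mm ~ bill_length_mm, data = d))[["bill_length_mm"]]
)

pooled_slope    # negative: the pooled line slopes downward
species_slopes  # positive within each species
```

The sign flip between the pooled fit and the per-species fits is the reason the single trend line should not be read as the relationship within any one species.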
> ggplot(data = penguins) +
+ geom_point()
#Error in `geom_point()`:
#! Problem while setting up geom.
#ℹ Error occurred in the 1st layer.
#Caused by error in `compute_geom_1()`:
#! `geom_point()` requires the following missing aesthetics: x and y.
#Run `rlang::last_trace()` to see where the error occurred.
Run ?geom_point()
Understand the difference between the following two code chunks:
> ggplot(
+ data = penguins,
+ mapping = aes(x = bill_length_mm,y = bill_depth_mm,color = species,shape = species)
+ )+
+ geom_point()+
+ geom_smooth(method = "lm")
#`geom_smooth()` using formula = 'y ~ x'
#Warning messages:
#1: Removed 2 rows containing non-finite outside the scale range (`stat_smooth()`).
#2: Removed 2 rows containing missing values or values outside the scale range
ggplot(
data = penguins,
mapping = aes(x = bill_length_mm,y = bill_depth_mm,color = species,shape = species)
)+
geom_point(na.rm = TRUE)+
geom_smooth(method = "lm")
#`geom_smooth()` using formula = 'y ~ x'
#Warning message:
#Removed 2 rows containing non-finite outside the scale range (`stat_smooth()`).
ggplot(
data = penguins,
mapping = aes(x = bill_length_mm,y = bill_depth_mm,color = species,shape = species)
)+
geom_point(na.rm = TRUE)+
geom_smooth(method = "lm")+
labs(
caption = "Data come from the palmerpenguins package"
)
> ggplot(
+ data = penguins,
+ mapping = aes(x = flipper_length_mm,y = body_mass_g)
+ )+
+ geom_point(aes(color = bill_depth_mm))+
+ geom_smooth()
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g, color = island)
) +
geom_point() +
geom_smooth(se = FALSE)
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_point() +
geom_smooth()
ggplot() +
geom_point(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_smooth(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
)
The two code chunks above produce identical plots.
#6 Ways to write the code
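As a rough check that does not rely on comparing the pictures by eye, the data computed for each layer can be compared with layer_data(). This is only a sketch: the helper `same_layer()` is a hypothetical name, and equal layer data is taken here as a reasonable proxy for identical plots.

```r
library(ggplot2)
library(palmerpenguins)

# Data and mapping supplied once, at the plot level
p_global <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point() +
  geom_smooth()

# Data and mapping repeated inside each layer
p_layered <- ggplot() +
  geom_point(
    data = penguins,
    mapping = aes(x = flipper_length_mm, y = body_mass_g)
  ) +
  geom_smooth(
    data = penguins,
    mapping = aes(x = flipper_length_mm, y = body_mass_g)
  )

# Illustrative helper: TRUE when layer i holds the same computed data in both plots
same_layer <- function(i) {
  a <- suppressWarnings(suppressMessages(layer_data(p_global, i)))
  b <- suppressWarnings(suppressMessages(layer_data(p_layered, i)))
  isTRUE(all.equal(a, b))
}

same_layer(1)  # point layer
same_layer(2)  # smooth layer
```

The plot-level form is shorter and is usually preferable when every layer uses the same data and mapping.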
ggplot(
data = penguins,
mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
geom_point()
ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
geom_point()
penguins |>
ggplot(aes(x = flipper_length_mm, y = body_mass_g)) +
geom_point()
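Original content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to have it removed.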