Joining means on a boxplot with a line

Posted on 2024-09-28 04:33:04

I have a boxplot showing multiple boxes. I want to connect the means of the boxes with a line. The boxplot does not display the mean by default; the middle line indicates only the median. I tried

ggplot(data, aes(x=xData, y=yData, group=g)) +
    geom_boxplot() +
    stat_summary(fun.y=mean, geom="line")

This does not work.

Interestingly enough, doing

stat_summary(fun.y=mean, geom="point") 

draws the mean point in each box. Why would "line" not work?

Something like this but using ggplot2,
https://aliquote.org/pub/RMB/c4_sols/RMB_c4_sols.html#Fig.%203

Comments (2)

又爬满兰若 2024-10-05 04:33:04

Is that what you are looking for?

library(ggplot2)

x <- factor(rep(1:10, 100))
y <- rnorm(1000)
df <- data.frame(x=x, y=y)

ggplot(df, aes(x=x, y=y)) +
  geom_boxplot() +
  # group=1 puts all levels of x in one group, so the means are joined by a single line
  stat_summary(fun=mean, geom="line", aes(group=1)) +
  stat_summary(fun=mean, geom="point")

Update:

Some clarification about setting group=1: I think I found an explanation in Hadley Wickham's book "ggplot2: Elegant Graphics for Data Analysis". On page 51 he writes:

"Different groups on different layers.

Sometimes we want to plot summaries based on different levels of aggregation. Different layers might have different group aesthetics, so that some display individual level data while others display summaries of larger groups.

Building on the previous example, suppose we want to add a single smooth line to the plot just created, based on the ages and heights of all the boys. If we use the same grouping for the smooth that we used for the line, we get the first plot in Figure 4.4.

p + geom_smooth(aes(group = Subject), method="lm", se = F)

This is not what we wanted; we have inadvertently added a smoothed line for each boy. This new layer needs a different group aesthetic, group = 1, so that the new line will be based on all the data, as shown in the second plot in the figure. The modified layer looks like this:

p + geom_smooth(aes(group = 1), method="lm", size = 2, se = F)

[...] Using aes(group = 1) in the smooth layer fits a single line of best fit across all boys."
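
To see the grouping effect on the boxplot data itself, it may help to look at the groups ggplot2 assigns to the line layer. This is only a minimal sketch, not part of the book or the answer above: it reuses the simulated data, adds a `set.seed` call (my assumption, for reproducibility), and uses `layer_data()` to read the computed `group` column with and without `aes(group = 1)`. The names `p_default`, `p_single`, `groups_default` and `groups_single` are made up for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(x = factor(rep(1:10, 100)), y = rnorm(1000))

# No explicit group: the discrete x variable defines the groups
p_default <- ggplot(df, aes(x = x, y = y)) +
  stat_summary(fun = mean, geom = "line")

# Explicit group = 1: every mean is assigned to the same group
p_single <- ggplot(df, aes(x = x, y = y)) +
  stat_summary(fun = mean, geom = "line", aes(group = 1))

# layer_data() returns the data computed for a layer, including its group column
groups_default <- layer_data(p_default, 1)$group
groups_single <- layer_data(p_single, 1)$group

length(unique(groups_default))  # number of groups without group = 1
length(unique(groups_single))   # number of groups with group = 1
```

With the default grouping, each level of the factor x becomes its own group, so every group holds a single mean and there is nothing to join. With group = 1 all the means belong to one group and can be connected.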

浅忆 2024-10-05 04:33:04

Another, longer approach (in case the data is in two different data frames) is:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

x <- factor(rep(1:10, 100))
y <- rnorm(1000)
df <- data.frame(x = x, y = y)

# Compute the mean of y for each level of x in a separate data frame
df_for_line <- df %>%
  group_by(x) %>%
  summarise(mean_y = mean(y))

# Draw the boxplots from df, then connect the per-level means from df_for_line
ggplot(df, aes(x = x, y = y)) +
  geom_boxplot() +
  geom_path(data = df_for_line, aes(x = x, y = mean_y, group = 1))

Created on 2021-04-15 by the reprex package (v1.0.0)


Again, `group = 1` is the key.
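
As a quick sanity check that the two approaches draw the same line, the means can be compared directly. This is a hedged sketch rather than part of either answer: it computes the per-level means with base R's `tapply` (swapped in for the dplyr pipeline so the example has no extra dependency) and compares them with what `stat_summary` computes inside the plot, read back with `layer_data()`. The names `manual_means`, `summary_layer` and `summary_means`, and the `set.seed` call, are assumptions for illustration.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the example is reproducible
df <- data.frame(x = factor(rep(1:10, 100)), y = rnorm(1000))

# Per-level means computed by hand, in the order of the factor levels
manual_means <- as.numeric(tapply(df$y, df$x, mean))

# The same plot as in the first answer: boxplots plus a line through the means
p <- ggplot(df, aes(x = x, y = y)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "line", aes(group = 1))

# Layer 2 is the stat_summary layer; sort by x to match the factor level order
summary_layer <- layer_data(p, 2)
summary_layer <- summary_layer[order(summary_layer$x), ]
summary_means <- as.numeric(summary_layer$y)

# TRUE when both approaches produce the same means
isTRUE(all.equal(manual_means, summary_means))
```

If the comparison holds, the choice between `stat_summary` and a precomputed data frame is mostly one of convenience: the summary layer is shorter, while the separate data frame is handy when the means are needed elsewhere or already live in their own table.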