Producing summary statistics from ggplot in R

Posted on 2025-01-15 06:18:27

I'm an R novice working on a project with a script provided by my professor, and I'm having trouble getting a mean for my data that matches the box plot I created. The mean in this plot is below 300 kg per stem, but when I use

ggsummarystats( DBHdata, x = "location", y = "biomassKeith_and_Camphor", ggfunc = ggboxplot, add = "jitter" )

or

tapply(DBHdata$biomassBrown_and_Camphor, DBHdata$location, mean)

I end up with means over 600 kg/stem. Is there a way to produce summary statistics in the code for my box plot?

Box and Whisker plot of kg per stem

Comments (4)

夜吻♂芭芘 2025-01-22 06:18:27

Box plots show the median, not the mean. That would explain the difference you are seeing in your calculations.

盗琴音 2025-01-22 06:18:27

Additionally, the data appear to be strongly skewed towards large values, so a mean of over 600 despite medians of about 200 is not surprising.
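To make the skew point concrete, here is a small sketch with made-up numbers (the vector below is purely illustrative and is not taken from DBHdata). A single very large stem is enough to pull the mean far above the median:

```r
# Hypothetical biomass values in kg per stem; one large tree dominates the total.
biomass <- c(100, 150, 180, 200, 220, 250, 3100)

med <- median(biomass)  # middle value of the sorted data: 200
avg <- mean(biomass)    # 4200 / 7 = 600

cat("median:", med, " mean:", avg, "\n")
```

The line inside a box plot corresponds to `med`, while `tapply(..., mean)` reports `avg`, so the two can differ by a wide margin on right-skewed data.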

晨敛清荷 2025-01-22 06:18:27

As others have pointed out, a box plot shows the median by default.
If you want the mean from ggsummarystats() (part of the ggpubr package), you can change which statistics it reports with the summaries argument:

ggsummarystats(DBHdata, x = "location", y = "biomassKeith_and_Camphor",
ggfunc = ggboxplot, add = "jitter", summaries = c("n", "median", "iqr", "mean"))

This adds the mean alongside the default output of n, median, and interquartile range (iqr).
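If you want to cross-check those numbers without any plotting package, the same four statistics can be computed per group in base R. This is only a sketch: DBHdata is not available here, so the built-in PlantGrowth data set stands in for it, with `group` playing the role of `location` and `weight` the role of the biomass column.

```r
# PlantGrowth ships with R and is used as a stand-in for DBHdata (an assumption).
by_group <- split(PlantGrowth$weight, PlantGrowth$group)

# One row per group with n, median, IQR and mean.
stats <- do.call(rbind, lapply(by_group, function(w) {
  data.frame(n = length(w), median = median(w), iqr = IQR(w), mean = mean(w))
}))

print(stats)
```

Comparing the `median` and `mean` columns side by side shows at a glance which groups are skewed.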

风启觞 2025-01-22 06:18:27

I'm not sure I understand your question correctly, but first try calculating the group means with aggregate() and then adding them to the plot as text labels.

Sample code:

library(ggplot2)

# Group means, used for the text labels
means <- aggregate(weight ~ group, PlantGrowth, mean)

ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  # Mark each group's mean with a dark red diamond
  stat_summary(fun = mean, colour = "darkred", geom = "point",
               shape = 18, size = 3, show.legend = FALSE) +
  # Print the mean just above its marker
  geom_text(data = means, aes(label = weight, y = weight + 0.08))

Sample data:

data(PlantGrowth)