Adding summary statistics (or even raw data points) to dodged-position boxplots

Posted 2024-08-06 03:49:32

Say you have the following dataset:

trt <- ifelse(runif(100)<0.5,"drug","placebo")
inj.site <- ifelse(runif(100)<0.5,"ankle","wrist")
relief <- 20 + 0.5*(inj.site=="ankle") + 0.5*(trt=="drug") + rnorm(100)
to.analyze <- data.frame(trt,inj.site,relief)

Now, the idea is to make a boxplot with injury site on the x-axis and boxes by treatment side-by-side:

library(ggplot2)
bplot <- ggplot(to.analyze,aes(inj.site,relief,fill=trt)) + geom_boxplot(position="dodge")

Easy enough. But now I want to add raw data points on top of the boxes. If I didn't have boxes with position="dodge", this would be easy:

bplot + geom_point(aes(colour=trt))

However, this draws the points in between the boxes, and adding position="dodge" to this geom does not seem to work. How do I adjust this so that the points are drawn over the boxes?

Bonus: the same issue comes up when using stat_summary(blah,fun.y=mean,shape="+") to overplot the means.

Answer from 不必了, posted 2024-08-13 03:49:32

Hadley will doubtless correct me if I'm wrong here...

Here's the natural syntax:

bplot + geom_point(aes(colour=trt), position=position_dodge(width=.5))

(position="dodge" will do the same thing, without the parameter.)

When I plot it, I get something that looks like a position_jitter(), which is presumably what you get too.

Curious, I went to look in the source, where I found the pos_dodge() function. (Type pos_dodge at an R prompt to see it...) Here's the end of it:

within(df, {
  xmin <- xmin + width / n * (seq_len(n) - 1) - diff * (n - 1) / (2 * n)
  xmax <- xmin + d_width / n
  x <- (xmin + xmax) / 2
})

n is the number of rows of the data frame. So it looks like it's dodging the individual points by a fraction indexed by the row! Going by the (seq_len(n) - 1) term, the first point is shifted by 0, the second by width/n, and the last by (n - 1) * width/n (all relative to xmin, ignoring the constant diff term).
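
To see the effect numerically, here is a minimal base-R sketch that applies the quoted lines to a toy data frame. The wrapper dodge_rows() is hypothetical, and the definitions of d_width and diff are assumptions about the part of pos_dodge() that is not quoted above; this is not ggplot2's actual interface.

```r
# Hypothetical wrapper around the quoted tail of pos_dodge().
# d_width and diff are ASSUMED definitions; the quoted source does not show them.
dodge_rows <- function(df, width) {
  n <- nrow(df)                        # number of rows, as described above
  d_width <- max(df$xmax - df$xmin)    # assumption: widest element at this x
  diff <- width - d_width              # assumption: leftover space
  within(df, {
    xmin <- xmin + width / n * (seq_len(n) - 1) - diff * (n - 1) / (2 * n)
    xmax <- xmin + d_width / n
    x <- (xmin + xmax) / 2
  })
}

# Five points that all sit at x = 1, given a dodge width of 0.5
pts <- data.frame(xmin = rep(0.75, 5), xmax = rep(1.25, 5))
dodged <- dodge_rows(pts, width = 0.5)
print(dodged$x)
```

Under these assumptions the five points end up at 0.8, 0.9, 1.0, 1.1 and 1.2, that is, spread evenly across the dodge width one row at a time, which is why the plot resembles jitter rather than two aligned columns.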

This is obviously not what you meant, although it is what you said. You may be stuck recreating the dodged boxplot manually, or using a different visualization, like faceting maybe?

ggplot(to.analyze,aes(inj.site,relief)) + geom_boxplot() + facet_wrap(~ trt)
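
For what it's worth, more recent ggplot2 releases dodge by group rather than by row, so a sketch along the following lines should place both the raw points and the means over their boxes. It assumes ggplot2 3.3.0 or later (for the fun argument of stat_summary()). The set.seed() call, the shared dodge object, width = 0.75 (chosen to match the default box width), outlier.shape = NA and shape = 3 are illustrative choices, not part of the original question.

```r
library(ggplot2)

set.seed(1)  # added for reproducibility; not in the original question
trt <- ifelse(runif(100) < 0.5, "drug", "placebo")
inj.site <- ifelse(runif(100) < 0.5, "ankle", "wrist")
relief <- 20 + 0.5 * (inj.site == "ankle") + 0.5 * (trt == "drug") + rnorm(100)
to.analyze <- data.frame(trt, inj.site, relief)

# One position object shared by all layers so they use the same offsets
dodge <- position_dodge(width = 0.75)

p <- ggplot(to.analyze, aes(inj.site, relief, fill = trt)) +
  # hide boxplot outliers so they are not drawn twice
  geom_boxplot(position = dodge, outlier.shape = NA) +
  # raw points, dodged per treatment group
  geom_point(aes(group = trt), position = dodge, alpha = 0.5) +
  # group means drawn as a plus sign (shape 3)
  stat_summary(aes(group = trt), fun = mean, geom = "point",
               shape = 3, size = 4, position = dodge)
print(p)
```

Sharing one position_dodge() object keeps the boxes, points and means on the same horizontal offsets; on older ggplot2 versions the behaviour described above may still apply.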
