Adding summary statistics (or even raw data points) to dodged position boxplots
Say you have the following dataset:
library(ggplot2)
trt <- ifelse(runif(100) < 0.5, "drug", "placebo")
inj.site <- ifelse(runif(100) < 0.5, "ankle", "wrist")
relief <- 20 + 0.5 * (inj.site == "ankle") + 0.5 * (trt == "drug") + rnorm(100)
to.analyze <- data.frame(trt, inj.site, relief)
Now, the idea is to make a boxplot with injury site on the x-axis and boxes by treatment side-by-side:
bplot <- ggplot(to.analyze, aes(inj.site, relief, fill = trt)) + geom_boxplot(position = "dodge")
Easy enough. But now I want to add raw data points on top of the boxes. If I didn't have boxes with position="dodge", this would be easy:
bplot + geom_point(aes(colour=trt))
However, this draws points in between the boxes, and adding position="dodge" to this geometry does not seem to work. How do I adjust this so that points are drawn over the boxes?
Bonus: same situation with using stat_summary(blah,y.fun=mean,shape="+") to overplot the means, which has the same issue.
Comments (1)
Hadley will doubtless correct me if I'm wrong here...
Here's the natural syntax:
(position="dodge" will do the same thing, without the parameter.)
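The snippet this answer refers to is not shown here. As a rough sketch only, a call of the following shape is presumably what is meant: the point layer is given an explicit position object with a width parameter. The `width = 0.75` value and the `set.seed()` call are assumptions added for illustration, not something taken from the original answer.

```r
library(ggplot2)

# Rebuild the question's example data (seed added only for reproducibility).
set.seed(1)
trt <- ifelse(runif(100) < 0.5, "drug", "placebo")
inj.site <- ifelse(runif(100) < 0.5, "ankle", "wrist")
relief <- 20 + 0.5 * (inj.site == "ankle") + 0.5 * (trt == "drug") + rnorm(100)
to.analyze <- data.frame(trt, inj.site, relief)

# The dodged boxplot from the question.
bplot <- ggplot(to.analyze, aes(inj.site, relief, fill = trt)) +
  geom_boxplot(position = "dodge")

# Assumed reconstruction: dodge the points with an explicit width parameter.
pts <- bplot +
  geom_point(aes(colour = trt), position = position_dodge(width = 0.75))
```

How the points end up placed relative to the boxes may depend on the ggplot2 version in use, so it is worth printing `pts` and checking the result rather than relying on this sketch.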
When I plot it, I get something that looks like a position_jitter(), which is presumably what you get too.
Curious, I went to look in the source, where I found the pos_dodge() function. (Type pos_dodge at an R prompt to see it...) Here's the end of it:
n is the number of rows of the data frame. So it looks like it's dodging the individual points by a fraction indexed by the row! So the first point is dodged width/n, the second is dodged 2 * width/n, and the last is dodged n * width/n.
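To make that arithmetic concrete, here is a small sketch that computes the per-row offsets exactly as described in the paragraph above. It only illustrates the description and is not the ggplot2 source; `n` and `width` are made-up example values.

```r
# Hypothetical example values, not taken from ggplot2 itself.
n <- 5        # number of rows (points) being dodged
width <- 0.9  # dodge width

# Offset for row i as described above: i * width / n.
offsets <- seq_len(n) * width / n
print(offsets)
```

Because every row gets its own offset, rather than every treatment group, the points spread evenly across the dodge width, which would explain why the result resembles jitter.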
This is obviously not what you meant, although it is what you said. You may be stuck recreating the dodged boxplot manually, or using a different visualization, like faceting maybe?
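As a sketch of the faceting alternative: putting injury site in the facets and treatment on the x-axis means no dodging is needed, so the raw points and a mean marker sit directly on their boxes. The `fun = mean` argument assumes ggplot2 3.3.0 or later (older releases used `fun.y`), and the `size` value and `set.seed()` call are arbitrary choices for illustration.

```r
library(ggplot2)

# Rebuild the question's example data (seed added only for reproducibility).
set.seed(1)
trt <- ifelse(runif(100) < 0.5, "drug", "placebo")
inj.site <- ifelse(runif(100) < 0.5, "ankle", "wrist")
relief <- 20 + 0.5 * (inj.site == "ankle") + 0.5 * (trt == "drug") + rnorm(100)
to.analyze <- data.frame(trt, inj.site, relief)

# One panel per injury site; treatment on the x-axis, so nothing is dodged.
fplot <- ggplot(to.analyze, aes(trt, relief, fill = trt)) +
  geom_boxplot() +
  geom_point(aes(colour = trt)) +                                    # raw data points
  stat_summary(fun = mean, geom = "point", shape = "+", size = 5) +  # group means
  facet_wrap(~ inj.site)
```

The trade-off is that the two treatments for a site are compared within a panel, while comparing sites means reading across panels.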