Adding lines crossing factors to a dotplot (ggplot)
I'm trying to replicate, in ggplot, a dotplot of batches/cases/treatments with lines crossing the factors. It is something like this plot from Douglas Bates' linear models course, which shows 6 groups on the y axis and a continuous response on the x axis, with the mean for each group joined by a line:
[Image: Dotplot from Bates]
Using the sleepstudy dataset bundled with the lme4 package as an example, I have:
library(ggplot2)
library(lme4)  # provides the sleepstudy dataset
p <- ggplot(sleepstudy, aes(x=Reaction, y=reorder(Subject, Reaction)))
p <- p + geom_point()
print(p)
This gives the basic dotplot, with subjects on the y axis in order of increasing mean reaction time.
I then create a data frame with mean reaction times for each subject:
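library(plyr)  # provides ddply() and .()
mean_rxn <- function(df) mean(df$Reaction, na.rm=TRUE)
# the unnamed summary column is called V1 in the result
sleepsummary <- ddply(sleepstudy, .(Subject), mean_rxn)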
I am able to plot points at the mean for each subject:
# size is a fixed value, so it is set outside aes() to avoid a spurious legend
p.points <- p + geom_point(data=sleepsummary, aes(x=V1, y=reorder(Subject, V1)), size=10)
print(p.points)
But I can't get lines to cross the factors. That is, changing from geom_point to geom_line displays nothing:
# draws no line
p.line <- p + geom_line(data=sleepsummary, aes(x=V1, y=reorder(Subject, V1)))
print(p.line)
Anyone have any ideas? Ultimately, my goal is to plot some model results on top of the raw data in this fashion, so methods that calculate means "on the fly" in the plotting of the original data frame are less useful because I need to get my data points from a more complex model fit.
Thanks for any help!
Ryan
Comments (2)
Edited
My first proposal was to convert values to numeric before plotting.
But Hadley points out it is preferable to use group=1 in the solution, rather than as.numeric():
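The code for this answer did not survive on this page, so what follows is only a sketch of the group=1 idea, not the original snippet. With a factor on the y axis, ggplot2 puts each subject in its own group by default, so every line would have a single point and nothing is drawn; mapping group to a constant joins all the means into one line. The use of aggregate() in place of ddply(), the shared factor levels, and the red colour are assumptions made to keep the example self-contained.

```r
library(ggplot2)
data(sleepstudy, package = "lme4")

# Per-subject means (base R aggregate() used here instead of plyr::ddply)
sleepsummary <- aggregate(Reaction ~ Subject, data = sleepstudy, FUN = mean)

# Give both data frames the same Subject ordering (by increasing mean)
subject_order <- as.character(sleepsummary$Subject[order(sleepsummary$Reaction)])
sleepstudy$Subject   <- factor(sleepstudy$Subject,   levels = subject_order)
sleepsummary$Subject <- factor(sleepsummary$Subject, levels = subject_order)

p.line <- ggplot(sleepstudy, aes(x = Reaction, y = Subject)) +
  geom_point() +
  # group = 1 puts all subjects in one group so a single line is drawn
  geom_line(data = sleepsummary,
            aes(x = Reaction, y = Subject, group = 1),
            colour = "red")
print(p.line)
```

Because the summary layer takes its own data frame, the same pattern should work for values taken from a model fit rather than raw means. Note that geom_line connects points in order of x; if the line should follow the row order of the data instead, geom_path is the alternative.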
You can also use stat_summary like this:
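The original snippet is missing here as well; one way this could look is sketched below, assuming ggplot2 3.3.0 or later (for the fun and orientation arguments). The mean of Reaction is computed per subject inside the plot, and group = 1 again joins the results into a single line.

```r
library(ggplot2)
data(sleepstudy, package = "lme4")

p.summary <- ggplot(sleepstudy,
                    aes(x = Reaction, y = reorder(Subject, Reaction))) +
  geom_point() +
  # orientation = "y": summarise x (Reaction) within each y value (Subject)
  stat_summary(aes(group = 1), fun = mean, geom = "line",
               orientation = "y", colour = "red")
print(p.summary)
```

This computes the means on the fly from the raw data, so it suits quick summaries more than the model-based values the question ultimately asks about.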