将文本添加到与条件匹配的ggplot geom_jitter点
如何向使用 geom_jittered 渲染的点添加文本来标记它们? geom_text 将不起作用,因为我不知道抖动点的坐标。您能否捕获抖动点的位置,以便我可以传递给 geom_text?
我的实际用途是绘制一个带有 geom_jitter 的箱线图以显示数据分布,我想标记离群点或符合特定条件的点(例如用于为图着色的值的较低 10%) )。
一种解决方案是捕获抖动图的 xy 位置并稍后在另一层中使用它,这可能吗?
[更新]
根据 Joran 的回答,解决方案是使用基础包中的抖动函数计算抖动值,将它们添加到数据框中并将它们与 geom_point 一起使用。为了进行过滤,他使用 ddply 来创建一个过滤列(一个逻辑向量),并使用它来对 geom_text 中的数据进行子集化。
他要求提供最小的数据集。我刚刚修改了他的示例(标签列中的唯一标识符)
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
lab=paste('id_',1:300,sep=''))
这是 joran 示例使用我的数据并将 ids 的显示降低到最低 1% 的结果
这是对代码的修改,以便通过另一个变量获得颜色,显示此变量的一些值(每组最低的 1%):
library("ggplot2")
#Create some example data
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
lab=paste('id_',1:300,sep=''),quality= rnorm(300))
#Create a copy of the data and a jittered version of the x variable
datJit <- dat
datJit$xj <- jitter(as.numeric(factor(dat$x)))
#Create an indicator variable that picks out those
# obs that are in lowest 1% by x
datJit <- ddply(datJit,.(x),.fun=function(g){
g$grp <- g$y <= quantile(g$y,0.01);
g$top_q <- g$qual <= quantile(g$qual,0.01);
g})
#Create a boxplot, overlay the jittered points and
# label the bottom 1% points
ggplot(dat,aes(x=x,y=y)) +
geom_boxplot() +
geom_point(data=datJit,aes(x=xj,colour=quality)) +
geom_text(data=subset(datJit,grp),aes(x=xj,label=lab)) +
geom_text(data=subset(datJit,top_q),aes(x=xj,label=sprintf("%0.2f",quality)))
How can I add text to points rendered with geom_jittered to label them? geom_text will not work because I don't know the coordinates of the jittered dots. Could you capture the position of the jittered points so I can pass to geom_text?
My practical usage would be to plot a boxplot with the geom_jitter over it to show the data distribution and I would like to label the outliers dots or the ones that match certain condition (for example the lower 10% for the values used for color the plots).
One solution would be to capture the xy positions of the jittered plots and use it later in another layer, is that possible?
[update]
From Joran answer, a solution would be to calculate the jittered values with the jitter function from the base package, add them to a data frame and use them with geom_point. For filtering he used ddply to have a filter column (a logic vector) and use it for subsetting the data in geom_text.
He asked for a minimal dataset. I just modified his example (a unique identifier in the label colum)
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
lab=paste('id_',1:300,sep=''))
This is the result of joran example with my data and lowering the display of ids to the lowest 1%
And this is a modification of the code to have colors by another variable and displaying some values of this variable (the lowest 1% for each group):
library("ggplot2")
#Create some example data
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
lab=paste('id_',1:300,sep=''),quality= rnorm(300))
#Create a copy of the data and a jittered version of the x variable
datJit <- dat
datJit$xj <- jitter(as.numeric(factor(dat$x)))
#Create an indicator variable that picks out those
# obs that are in lowest 1% by x
datJit <- ddply(datJit,.(x),.fun=function(g){
g$grp <- g$y <= quantile(g$y,0.01);
g$top_q <- g$qual <= quantile(g$qual,0.01);
g})
#Create a boxplot, overlay the jittered points and
# label the bottom 1% points
ggplot(dat,aes(x=x,y=y)) +
geom_boxplot() +
geom_point(data=datJit,aes(x=xj,colour=quality)) +
geom_text(data=subset(datJit,grp),aes(x=xj,label=lab)) +
geom_text(data=subset(datJit,top_q),aes(x=xj,label=sprintf("%0.2f",quality)))
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
你的问题不完全清楚;例如,您在某一时刻提到了标签点,但也提到了着色点,所以我不确定您真正的意思是哪一个,或者两者都有。一个可重现的例子会非常有帮助。但我进行了一些猜测,以下代码完成了我认为您所描述的操作:
Your question isn't completely clear; for example, you mention labeling points at one point but also mention coloring points, so I'm not sure which you really mean, or perhaps both. A reproducible example would be very helpful. But using a little guesswork on my part, the following code does what I think you're describing:
只是对 Joran 精彩解决方案的补充:
当我尝试使用facet_wrap()在多面图中使用时,我遇到了x轴定位的问题。问题是,ggplot2 使用 1 作为每个方面的 x 值。解决方案是创建一个抖动 1 的向量:
Just an addition to Joran's wonderful solution:
I ran into trouble with the x-axis positioning when I tried to use in a facetted plot using facet_wrap(). The problem is, that ggplot2 uses 1 as the x-value on every facet. The solution is to create a vector of jittered 1s: