Incorrect distribution curves in ggplot2 raincloud plots
I noticed strange behaviour of the raincloud plot package in R. Specifically, density curves are sensitive to data values in some (but not all) cases. It seems that the shape of the distribution curve made by geom_flat_violin() is somehow linked to data values (where it shouldn't be), and I can't find how to restore their independence. The only clue I managed to find: the curves are shrunk based on the lowest values in the data, although shrinkage affects the whole panel where those values occur, not just the sub-group containing them.
Below is a reproducible example, and a link to its image output to show what I mean.
Just a note in advance: the raincloud package (presented in this paper) is not on CRAN afaik, so I lifted it directly from the authors' GitHub repo. I also tried an alternative source file, which reproduces the problem. Other implementations such as ggridges::geom_density_ridges() or {ggdist} didn't offer the same level of control over graphic parameters (e.g. smoothing), unless I'm missing something.
Example code:
library(reshape2)
library(ggplot2)
source("https://gist.githubusercontent.com/benmarwick/2a1bb0133ff568cbe28d/raw/fb53bd97121f7f9ce947837ef1a4c65a73bffb3f/geom_flat_violin.R")
# load data and melt into longform
data(iris)
miris <- melt(iris, id.vars = "Species", measure.vars = colnames(iris)[1:4], variable.name = "measurement")
## 1- plotting as is gives horizontally "squashed" curves in two of four panels
ggplot(miris, aes(x = Species, y = value, fill = Species)) +
  geom_flat_violin(position = position_nudge(x = .15, y = 0)) +
  facet_wrap(~measurement)
## 2- manipulating the group of smallest values seems to fix the relevant panel (but fixing other groups doesn't fix the problem - I tried that)
airis <- miris
# get indices of data to manipulate
inds <- intersect(which(airis$Species == "setosa"), which(airis$measurement == "Petal.Width"))
# assign larger values (seed fixed so the random draw is the same on every run)
set.seed(1)
airis$value[inds] <- rnorm(length(inds), 3, 0.5)
ggplot(airis, aes(x = Species, y = value, fill = Species)) +
  geom_flat_violin(position = position_nudge(x = .15, y = 0)) +
  facet_wrap(~measurement)
## this second plot shows larger distribution curves for all species in the "Petal.Width" panel, although values were only changed for "setosa"
Does anyone know where the problem might be, or what can be done to fix it?
Many thanks!
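One possible explanation, offered as a sketch rather than a confirmed diagnosis: ggplot2's violin density stat defaults to scale = "area", which divides every group's density by the largest density found in the whole panel. A tightly clustered group (such as setosa in the Petal.Width panel) has a very tall density peak, so all other curves in that panel are drawn narrower relative to it. The base-R snippet below only illustrates that arithmetic using density() with its default settings. The names peak_density, relative_width_area and relative_width_width are made up for this illustration and are not part of ggplot2 or the raincloud code.

```r
# Sketch: how panel-wide "area"-style scaling can shrink every curve in a panel.
data(iris)

# Peak of the kernel density estimate per species for one measurement
# (base R density() defaults; the plotting code may use other settings).
peak_density <- sapply(
  split(iris$Petal.Width, iris$Species),
  function(v) max(density(v)$y)
)

# "area"-style scaling: each group's peak relative to the panel-wide maximum.
relative_width_area <- peak_density / max(peak_density)

# "width"-style scaling: each group's peak relative to its own maximum,
# so every curve reaches the same maximum width.
relative_width_width <- peak_density / peak_density

print(round(relative_width_area, 2))
print(relative_width_width)
```

If geom_flat_violin() forwards a scale argument to the density stat, as the function signature in the sourced gist appears to do, then passing scale = "width" to geom_flat_violin() may be worth trying. That is an assumption to check against the sourced file, not something verified here.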