How can I avoid overplotting (for points) using base graphics?

Posted 2024-09-16 06:15:11

I am finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow) that, to convey as much information as possible, I would create the following graph, which shows the means in the foreground and the raw data in the background:
[image]

However, one problem remains: overplotting. For example, the marked point looks as if it represents a single data point, but in fact 5 data points with the same value sit at that position.
Therefore, I would like to know whether there is a way to deal with overplotting in base graphics when using points as the plotting function.
Ideally, the affected points would get darker, or thicker, or something similar.

Doing it manually is not an option (there are too many graphs and points like this). Furthermore, I do not want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).


Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!

This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).


Comments (4)

多孤肩上扛 2024-09-23 06:15:11

The standard approach is to add some noise to the data before plotting. R has a function, jitter(), that does exactly that. You can use it to add the necessary noise to the coordinates in your plot, e.g.:

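# 100 observations: y values 1..10, each assigned a random category a..j
X <- rep(1:10, 10)
Z <- factor(sample(letters[1:10], 100, replace = TRUE), levels = letters[1:10])

# Jitter the numeric factor codes horizontally so coincident points separate
plot(jitter(as.numeric(Z), factor = 0.2), X, xaxt = "n")
# Put the category labels back on the x axis
axis(1, at = 1:10, labels = levels(Z))

嗳卜坏 2024-09-23 06:15:11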

Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices that support it) through the fourth color parameter. I provided an example of 'overplotting' two histograms in this SO question.
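
As a rough illustration of the alpha-blending idea, the sketch below draws coincident points in a semi-transparent black so that stacked points look darker. The data and the name semi_black are made up for this example, and the 20% opacity is an arbitrary choice, not something taken from the question.

```r
# Hypothetical data: three positions holding 1, 5 and 20 coincident points
x <- rep(1:3, times = c(1, 5, 20))
y <- rep(1, length(x))

# Black at 20% opacity; the fourth argument of rgb() is the alpha channel
semi_black <- rgb(0, 0, 0, alpha = 0.2)

# Overlapping semi-transparent points accumulate, so dense spots appear darker
plot(x, y, pch = 16, cex = 3, col = semi_black, xlim = c(0, 4))
```

The same color can be passed to points() when adding raw data to an existing plot. adjustcolor() from grDevices is another way to add transparency to a named color.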

回忆躺在深渊里 2024-09-23 06:15:11

One additional idea for the general problem of showing the number of points is a rug plot (the rug function). It places small tick marks along the margin that show how many points contribute (still use jittering or alpha blending for ties). This lets the actual points show their true values rather than jittered ones, while the rug indicates which parts of the plot contain more values.

For the example plot, direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
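
A minimal sketch of the rug idea, using made-up data rounded to one decimal so that ties occur. The points stay at their true positions and only the rug ticks are jittered; the seed and sample size are arbitrary assumptions.

```r
set.seed(42)

# Hypothetical data, rounded so that many values coincide
x <- round(rnorm(200), 1)
y <- round(rnorm(200), 1)

# Points are drawn at their true (unjittered) coordinates
plot(x, y, pch = 16)

# Jittered ticks along the bottom and left margins show where values pile up
rug(jitter(x), side = 1)
rug(jitter(y), side = 2)
```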

海之角 2024-09-23 06:15:11

You may also use sunflowerplot, although it would be hard to apply here. I would use alpha blending, as Dirk suggested.
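
For reference, a small sketch of what sunflowerplot does on its own, with made-up data: locations holding several identical points are drawn with one "petal" per point. xyTable() is used here only to show the counts per location; the variable names are illustrative.

```r
# Hypothetical data with repeated (x, y) pairs
x <- c(1, 2, 2, 2, 2, 2, 3, 3)
y <- c(1, 2, 2, 2, 2, 2, 1, 1)

# Number of points at each distinct location
counts <- xyTable(x, y)

# Locations with more than one point get one petal per coincident point
sunflowerplot(x, y)
```

Because sunflowerplot creates its own plot, combining it with an existing figure of means and raw data takes extra work, which fits the answer's remark that it would be hard to apply in this case.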
