Dot plot "binning/grouping" in R

Posted 2025-01-10 05:29:09

I'm trying to create a dot plot in R similar to the one in the following guide, where each group is clearly separated from the others: http://www.sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization
[image: ideal plot]

My data has one column with the value to plot and a group column that should split the data into distinct groups (1-5), similar to the 'dose' column of the ToothGrowth dataset used in the guide linked above:
[image: my data]

This is the plotting code I'm currently using:

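library(ggplot2)

# First attempt: stored in p so the median marker can be added below
p <- ggplot(new_df, aes(x = group, y = ploidy)) +
  geom_dotplot(binaxis = "y", stackdir = "centerwhole", binpositions = "bygroup",
               binwidth = 0.5, position = "dodge", dotsize = 0.2)

# Second attempt: printed directly, not stored
ggplot(new_df, aes(x = group, y = ploidy)) +
  geom_dotplot(binaxis = "y", stackdir = "centerwhole",
               stackratio = 0, dotsize = 0.2, stackgroups = TRUE)

# Add a red diamond at the median to the first plot
p + stat_summary(fun = median, geom = "point", shape = 18,
                 size = 3, color = "red")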

It returns the following plot:
[image: current plot]
I suspect the issue is that most of the values sit in the 2-3 range, so the dots overflow into the neighbouring bins/groups.

I tried to reproduce the problem with simple datasets such as ToothGrowth, but the issue doesn't appear with those smaller datasets. For that reason, here is a link to the full dataset: http://sendanywhe.re/Y5O133EM

Any help would be appreciated.


Comments (1)

白衬杉格子梦 2025-01-17 05:29:09

I think you are overflowing the space allocated to each group
in the chart by giving each individual observation a fixed
location (sometimes called 'stacking'). Instead, you should
'jitter' the positions of the individual observations within
the region allocated to each group. Jittering means adding a
small amount of randomness to the position of each point to
avoid (most) overplotting.
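
As a small numeric illustration of what jittering does, here is a minimal sketch using base R's jitter(). The variable names x_pos and x_jit and the amount of 0.2 are arbitrary choices for illustration, not taken from the question.

```r
# Ten observations that all share the same group position (x = 1).
set.seed(2022)
x_pos <- rep(1, 10)

# With a positive `amount`, jitter() adds uniform noise drawn from
# [-amount, amount] to each value.
x_jit <- jitter(x_pos, amount = 0.2)

# Every jittered position stays within 0.2 of the original position,
# so the points spread out without leaving their group's region.
range(x_jit)
```

Keeping the amount well below half the distance between neighbouring groups is what prevents points from spilling into the next group.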

I will illustrate this with base R graphics and the following
fictitious data. This focuses attention on what is wrong rather
than on the specific programming solution in ggplot, which I
will let you work out.

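set.seed(2022)
# Four fictitious groups of different sizes, rounded to whole numbers
a = round(rnorm(30, 50, 5))
b = round(rnorm(70, 55, 4))
c = round(rnorm(55, 40, 6))
d = round(rnorm(80, 45, 5))
x = c(a, b, c, d)
g = rep(1:4, c(30, 70, 55, 80))

# Jittered strip chart with one vertical strip per group
stripchart(x ~ g, method = "jitter", vertical = TRUE, pch = 20)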

Sorry, images cannot be posted on this site. I hope you
get the idea.
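
For readers who want a starting point in ggplot2, the following is a rough sketch of the same idea, not the answerer's own solution. It reuses the fictitious data from the base R example; the data frame name df, the jitter width of 0.2, and the choice to convert the group to a factor are assumptions made for illustration. Only standard ggplot2 functions (geom_jitter, stat_summary) are used.

```r
library(ggplot2)

set.seed(2022)
# Same fictitious data as in the base R example, collected into a data frame.
# `df` is a hypothetical name; the group is stored as a factor so that
# ggplot2 treats it as a discrete axis with one position per group.
df <- data.frame(
  x = round(c(rnorm(30, 50, 5), rnorm(70, 55, 4),
              rnorm(55, 40, 6), rnorm(80, 45, 5))),
  g = factor(rep(1:4, times = c(30, 70, 55, 80)))
)

# Jitter horizontally only (height = 0), so the plotted values stay exact.
p <- ggplot(df, aes(x = g, y = x)) +
  geom_jitter(width = 0.2, height = 0) +
  stat_summary(fun = median, geom = "point",
               shape = 18, size = 3, colour = "red")
p
```

If the group column in the real data is numeric, converting it with factor() before plotting may also be worth trying with geom_dotplot, since ggplot2 only forms groups automatically from discrete variables.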
