Dotplot "binning/grouping" in R
I'm trying to create a dotplot in R, similar to the following plot, where each group is distinctly separated from the rest: http://www.sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization
The data I have looks as follows: there is a value to plot and a group column that should split the data into distinct groups (1-5), similar to the 'dose' column of the ToothGrowth dataset used in the previous link:
This is the plotting code I'm currently using:
p <- ggplot(new_df, aes(x = group, y = ploidy)) +
  geom_dotplot(binaxis = 'y', stackdir = 'centerwhole', binpositions = 'bygroup',
               binwidth = 0.5, position = "dodge", dotsize = 0.2)
ggplot(new_df, aes(x = group, y = ploidy)) +
  geom_dotplot(binaxis = 'y', stackdir = 'centerwhole',
               stackratio = 0, dotsize = 0.2, stackgroups = TRUE)
p + stat_summary(fun = median, geom = "point", shape = 18,
                 size = 3, color = "red")
and it returns the following plot:
I suspect the issue here is that the majority of the values sit at the 2-3 range, and thus they're overflowing to the other bins/groups.
I tried to reproduce the problem with simple datasets such as ToothGrowth, but the issue does not appear with those smaller datasets. Since a small sample does not reproduce it, here is a link to the full dataset: http://sendanywhe.re/Y5O133EM
Any help would be appreciated.
Comments (1)
I think you are overflowing the space allocated to each group in the chart by using a specified location for each individual observation (sometimes called 'stacking'). Instead, you should 'jitter' the positions of the individual observations inside a specific allocated region.
Jittering means introducing a small amount of randomness to the position of a point to avoid (mostly, anyhow) overplotting.
I will illustrate this using graphics from the core of R for the following fictitious data. This focuses attention on what is wrong, more than on the specific programming solution in ggplot, which I will let you work out.
Sorry, not allowed to post images on this site. Hope you get the idea.
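The code and data for that illustration did not survive on this page, so what follows is only a minimal sketch of the jittering idea in base R, not the original example. The data frame demo_df, its values, and the jitter amount of 0.3 are all assumptions made for illustration; only the column names group and ploidy are taken from the question.

```r
# Fictitious data (an assumption, not the asker's dataset):
# five groups of 200 values, most of them falling between 2 and 3.
set.seed(42)
n_per_group <- 200
demo_df <- data.frame(
  group  = factor(rep(1:5, each = n_per_group)),
  ploidy = rnorm(5 * n_per_group,
                 mean = rep(c(2.2, 2.5, 2.8, 2.4, 3.0), each = n_per_group),
                 sd   = 0.3)
)

# Each group is centred on its integer index. jitter() with a positive
# `amount` adds uniform noise in [-amount, amount], so with 0.3 a point
# can never reach the band of a neighbouring group.
x_jittered <- jitter(as.integer(demo_df$group), amount = 0.3)

# Draw the jittered points; semi-transparent colour makes dense regions visible.
plot(x_jittered, demo_df$ploidy,
     pch = 16, cex = 0.5, col = rgb(0, 0, 0, 0.4),
     xaxt = "n", xlab = "group", ylab = "ploidy")
axis(1, at = 1:5, labels = levels(demo_df$group))

# Overlay the per-group medians as red diamonds, as in the question's plot.
group_medians <- tapply(demo_df$ploidy, demo_df$group, median)
points(1:5, group_medians, pch = 18, cex = 2, col = "red")
```

Because the horizontal noise is bounded, the width of each group's band is fixed no matter how many observations share similar values, which is the property the stacked dot plot loses. In ggplot2 the closest counterpart is probably geom_jitter(width = 0.3, height = 0) with group converted to a factor, though that has not been tried against the asker's data.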