Subsetting a data.frame for a ggplot2 bar chart
I have the following data:
Splice.Pair proportion
1 AA-AG 0.010909091
2 AA-GC 0.003636364
3 AA-TG 0.003636364
4 AA-TT 0.007272727
5 AC-AC 0.003636364
6 AC-AG 0.003636364
7 AC-GA 0.003636364
8 AC-GG 0.003636364
9 AC-TC 0.003636364
10 AC-TG 0.003636364
11 AC-TT 0.003636364
12 AG-AA 0.010909091
13 AG-AC 0.007272727
14 AG-AG 0.003636364
15 AG-AT 0.003636364
16 AG-CC 0.003636364
17 AG-CT 0.007272727
... ... ...
I want to get a barchart visualising the proportion of each splice pair but only for splice pairs that have a proportion over, say, 0.004. I tried the following:
nc.subset <- subset(nc.dat, proportion > 0.004)
qplot(Splice.Pair, proportion, data=nc.subset, geom="bar", xlab="Splice Pair", ylab="Proportion of total non-canonical splice sites") + coord_flip()
But this just gives me a bar chart with all splice pairs on the Y-axis, except that the splice pairs that were filtered out are missing bars.
I have no idea what is happening to allow all categories to still be present :s
Comments (2)
What's happening is that Splice.Pair is a factor. When you subset your data frame, the factor retains its levels attribute, which still has all of the original levels. You can avoid this kind of problem by simply wrapping your subsetting in droplevels:
nc.subset <- droplevels(subset(nc.dat, proportion > 0.004))
More generally, if you dislike this kind of automatic retention of levels with factors, you can set R to store strings as character vectors rather than factors by default by setting:
options(stringsAsFactors = FALSE)
at the beginning of your R session (this can also be passed as an option to data.frame).
EDIT
Regarding the issue of running older versions of R that may lack droplevels, @rcs points out in a comment that the method for a single factor is very simple to implement on your own. The method for data frames is only slightly more complicated. But of course, the best solution is still to upgrade to the latest version of R.
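A minimal sketch of the behaviour described above, using only the first few rows from the question. The name nc.dat comes from the question; stringsAsFactors = TRUE is passed explicitly because R 4.0.0 and later no longer convert strings to factors by default. The helper drop_unused_levels is a hypothetical stand-in for R versions without droplevels, written for illustration, and is not the base R implementation.

```r
# First rows of the question's data. Splice.Pair is forced to be a factor
# (stringsAsFactors = TRUE), which was the default before R 4.0.0.
nc.dat <- data.frame(
  Splice.Pair = c("AA-AG", "AA-GC", "AA-TG", "AA-TT", "AC-AC"),
  proportion  = c(0.010909091, 0.003636364, 0.003636364,
                  0.007272727, 0.003636364),
  stringsAsFactors = TRUE
)

# Subsetting drops rows, but the factor keeps all five original levels.
nc.subset <- subset(nc.dat, proportion > 0.004)
nrow(nc.subset)                  # 2 rows remain
nlevels(nc.subset$Splice.Pair)   # still 5 levels

# droplevels() discards the levels that no longer occur in the data.
nc.clean <- droplevels(nc.subset)
levels(nc.clean$Splice.Pair)     # "AA-AG" "AA-TT"

# Hypothetical fallback for R versions without droplevels():
# rebuild every factor column so that only the used levels survive.
drop_unused_levels <- function(df) {
  is.fac <- vapply(df, is.factor, logical(1))
  df[is.fac] <- lapply(df[is.fac], factor)
  df
}
nc.fallback <- drop_unused_levels(nc.subset)
```

Calling factor() on an existing factor rebuilds it from the values actually present, which is why the fallback gives the same levels as droplevels here.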
Check whether Splice.Pair is a factor. If that's the case, use droplevels() to remove the levels that are no longer used; that should resolve your problem. You may be able to incorporate droplevels into the qplot call, but that's for you to find out :-)
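One possible way to fold droplevels into the plotting step, sketched with ggplot() and geom_col() instead of the question's qplot(), since qplot() is deprecated in recent ggplot2 releases. This assumes the ggplot2 package is installed; the sample rows are the first few from the question, and plot.dat is a name chosen here for illustration.

```r
library(ggplot2)

# First rows of the question's data, with Splice.Pair stored as a factor.
nc.dat <- data.frame(
  Splice.Pair = c("AA-AG", "AA-GC", "AA-TG", "AA-TT", "AC-AC"),
  proportion  = c(0.010909091, 0.003636364, 0.003636364,
                  0.007272727, 0.003636364),
  stringsAsFactors = TRUE
)

# Filter and drop the unused factor levels in one step.
plot.dat <- droplevels(subset(nc.dat, proportion > 0.004))

# geom_col() draws bars whose height is the y value itself;
# coord_flip() puts the splice pairs on the vertical axis.
p <- ggplot(plot.dat, aes(x = Splice.Pair, y = proportion)) +
  geom_col() +
  coord_flip() +
  labs(x = "Splice Pair",
       y = "Proportion of total non-canonical splice sites")
print(p)
```

Because the unused levels are dropped before plotting, only the splice pairs that pass the filter should appear on the axis.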